Introduction to Agent Frameworks


LangChain

Documentation

Introduction to the LangChain framework (Chinese docs) | 🦜️🔗 Langchain


Installation

How to install LangChain packages | 🦜️🔗 LangChain

  • Install the main package: the entry point to LangChain.
pip install langchain
  • Install Core: provides the shared abstractions and orchestration used by the rest of the LangChain ecosystem. It is installed automatically with langchain, but can also be installed on its own.
pip install langchain-core
  • Integrations such as Ollama, OpenAI, and Anthropic ship as standalone integration packages.
# install package
pip install -U langchain-ollama
  • Integrations without a standalone package live in the unified langchain-community package.
pip install langchain-community

Defining a Chain

Abstract base class for creating structured sequences of calls to components.

Chains should be used to encode a sequence of calls to components like models, document retrievers, other chains, etc., and provide a simple interface to this sequence.

Build a simple LLM application with chat models and prompt templates

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Define prompt
prompt = ChatPromptTemplate.from_messages(
    [("system", "Write a concise summary of the following:\n\n{context}")]
)

# Instantiate chain (a model and docs are assumed here so the snippet runs standalone)
model = ChatOpenAI()
chain = create_stuff_documents_chain(model, prompt)

# Invoke chain
docs = [Document(page_content="LangChain is a framework for building LLM applications.")]
result = chain.invoke({"context": docs})
print(result)

LCEL

The LangChain Expression Language (LCEL) takes a declarative approach to building new Runnables from existing Runnables.

This means that you describe what should happen, rather than how it should happen, allowing LangChain to optimize the run-time execution of the chains.

We often refer to a Runnable created using LCEL as a "chain". It's important to remember that a "chain" is a Runnable, and it implements the full Runnable interface.


from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages(
    [("user", "Tell me a {adjective} joke")],
)

chain = prompt | ChatOpenAI() | StrOutputParser()

chain.invoke({"adjective": "funny"})

Legacy chains

LCEL aims to provide consistency around behavior and customization over legacy subclassed chains such as LLMChain and ConversationalRetrievalChain. Many of these legacy chains hide important details like prompts, and as a wider variety of viable models emerge, customization has become more and more important.

If you are currently using one of these legacy chains, please see this guide for guidance on how to migrate.

For guides on how to do specific tasks with LCEL, check out the relevant how-to guides.

from langchain.chains import LLMChain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages(
    [("user", "Tell me a {adjective} joke")],
)

legacy_chain = LLMChain(llm=ChatOpenAI(), prompt=prompt)

legacy_result = legacy_chain({"adjective": "funny"})
print(legacy_result)

Defining a Tool

Custom tools: How to create tools | 🦜️🔗 LangChain

To use function calling, you define some functions, not by actually writing code, but by describing each one in text: its name, a description of what it does, its parameter names, and descriptions of the parameters. These definitions are passed to the LLM. When the user asks a question, the LLM analyzes the text to decide whether one of the functions should be called; if so, it returns a JSON payload containing the name of the function to call along with the parameter names and values. In short, function calling does two things for us: 1. it decides whether a predefined function should be called; 2. if it should, it extracts the argument values the function needs from the user's input text.
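LangChain packages exactly this "description plus callable" pattern as a Tool. A minimal sketch using the @tool decorator (get_weather is a made-up stub for illustration, not a real API):

from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a given city."""
    # A real tool would call a weather API here; this stub just returns text.
    return f"It is sunny in {city}."

# The decorator derives the schema the LLM sees from the signature and docstring
print(get_weather.name, get_weather.args)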

Defining an Agent

In LangChain, agents use an LLM's reasoning ability to decide which action to execute and with what input parameters. After an action runs, its result is fed back to the LLM, which then decides on the next action, or ends the task and returns the result. The actions themselves are executed via tool calling.

In LangChain, an agent is a proxy: it receives user input, takes the appropriate actions, and returns the results of those actions. Its role is to complete tasks on behalf of a user or another system, such as data collection, data processing, or decision support. Agents can be autonomous, with enough intelligence and adaptivity to carry out tasks in different contexts.

Building an Agent

How to migrate from legacy LangChain agents to LangGraph | 🦜️🔗 LangChain

v0.3

python.langchain.com/docs/tutori…

# 1. Build a web-search tool (BaiduSearchTool is a custom tool from this project)
from tool.baidu_search_tool import BaiduSearchTool
search_tool = BaiduSearchTool()

# 2. Build the LLM
from langchain_ollama import ChatOllama
model = ChatOllama(
    model="qwen2.5:7b",
    temperature=0,
)

# 3. Build a ReAct-style agent, passing in the LLM and the tool list
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage
agent_executor = create_react_agent(model, [search_tool], debug=True)

# 4. Run the agent
response = agent_executor.invoke({"messages": [HumanMessage(content="What are today's date and weather in Shenzhen?")]})

messages = response["messages"]

print(f"response.messages: {messages}")
print(f"response: {response}")

v0.2

1. initialize_agent

Use the initialize_agent function from langchain.agents to build an AgentExecutor. It takes two required arguments, the llm language model and the tools list, plus an optional agent argument of type AgentType that selects the agent type.

Example usage, followed by the declaration of initialize_agent:

from langchain.agents import AgentType, initialize_agent

# tools and llm are assumed to be defined as above
agent = initialize_agent(tools, llm, agent=AgentType.REACT_DOCSTORE)
def initialize_agent(
    tools: Sequence[BaseTool],
    llm: BaseLanguageModel,
    agent: Optional[AgentType] = None,
    callback_manager: Optional[BaseCallbackManager] = None,
    agent_path: Optional[str] = None,
    agent_kwargs: Optional[dict] = None,
    *,
    tags: Optional[Sequence[str]] = None,
    **kwargs: Any,
) -> AgentExecutor:
    """Load an agent executor given tools and LLM.

    Args:
        tools: List of tools this agent has access to.
        llm: Language model to use as the agent.
        agent: Agent type to use. If None and agent_path is also None, will default to
            AgentType.ZERO_SHOT_REACT_DESCRIPTION.
        callback_manager: CallbackManager to use. Global callback manager is used if
            not provided. Defaults to None.
        agent_path: Path to serialized agent to use.
        agent_kwargs: Additional keyword arguments to pass to the underlying agent
        tags: Tags to apply to the traced runs.
        **kwargs: Additional keyword arguments passed to the agent executor

    Returns:
        An agent executor
    """

Common AgentType values are described below (reference: Agent Types – LangChain中文网):

  • AgentType.ZERO_SHOT_REACT_DESCRIPTION: uses the ReAct framework and decides which tool to use based solely on the tools' descriptions. Any number of tools can be provided, but every tool must have a description.
  • AgentType.REACT_DOCSTORE: uses the ReAct framework to interact with a docstore. Exactly two tools must be provided, and they must be named Search and Lookup: the Search tool searches for a document, and the Lookup tool looks up a term in the most recently found document.
  • AgentType.SELF_ASK_WITH_SEARCH: uses a single tool, which must be named Intermediate Answer and should be able to look up factual answers to questions. This agent corresponds to the original self-ask-with-search paper, which provided a Google search API as the tool.
  • AgentType.CONVERSATIONAL_REACT_DESCRIPTION: designed for conversational settings. The prompt is written to make the agent helpful in conversation; it uses the ReAct framework to decide which tool to use, and memory to remember previous conversation turns.
2. The create_*_agent constructor functions

Agent Types | 🦜️🔗 Langchain

  1. create_openai_functions_agent(llm, tools, prompt)
  2. create_openai_tools_agent(llm, tools, prompt)
  3. create_xml_agent(llm, tools, prompt)
  4. create_json_chat_agent(llm, tools, prompt)
  5. create_structured_chat_agent(llm, tools, prompt)
  6. create_react_agent(llm, tools, prompt)
  7. create_self_ask_with_search_agent(llm, tools, prompt)

On top of the LLM and tools list, these functions take an additional prompt template. The prompt must follow a specific format, and a ready-made one can be fetched via hub.pull.

from langchain import hub

# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
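Unlike initialize_agent, these constructors return a bare agent, which must be wrapped in an AgentExecutor to actually run the loop. A minimal sketch, assuming a tools list such as the @tool-decorated get_weather function defined earlier:

from langchain import hub
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
tools = [get_weather]  # assumed: the tool defined in the section above
prompt = hub.pull("hwchase17/openai-functions-agent")

# Build the agent, then wrap it in an AgentExecutor that runs the tool-calling loop
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "What is the weather in Shenzhen?"})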

Multi-Agent

LangGraph: Multi-Agent Workflows

LlamaIndex

Documentation

Agents - LlamaIndex

Installation

Installation and Setup - LlamaIndex

pip install llama-index

Defining a Workflow

Workflow in LlamaIndex is an event-driven abstraction used to chain together several events. Workflows are made up of steps, with each step responsible for handling certain event types and emitting new events.

Workflows in LlamaIndex work by decorating functions with a @step decorator. This is used to infer the input and output types of each step for validation, and it ensures each step only runs when an accepted event is ready.

You can create a Workflow to do anything! Build an agent, a RAG flow, an extraction flow, or anything else you want.

Hello World

import asyncio

from llama_index.core.workflow import (
    StartEvent,
    StopEvent,
    Workflow,
    step,
)


class MyWorkflow(Workflow):
    @step
    async def my_step(self, ev: StartEvent) -> StopEvent:
        # do something here
        return StopEvent(result="Hello, world!")


async def main():
    w = MyWorkflow(timeout=10, verbose=False)
    result = await w.run()
    print(result)

if __name__ == "__main__":
    asyncio.run(main())

Basic workflow patterns

OllamaWorkflow

import asyncio
from llama_index.core.workflow import (
    StartEvent,
    StopEvent,
    Workflow,
    step,
)
from llama_index.llms.ollama import Ollama

class OllamaGenerator(Workflow):
    @step
    async def generate(self, ev: StartEvent) -> StopEvent:
        llm = Ollama(model="deepseek-r1:8b")
        response = await llm.acomplete(ev.query)
        return StopEvent(result=str(response))

async def ollama():
    w = OllamaGenerator(timeout=100, verbose=False)
    result = await w.run(query="Who are you?")
    print(result)

if __name__ == "__main__":
    asyncio.run(ollama())

Why event-driven?

Other frameworks and LlamaIndex itself have attempted to solve this problem previously with directed acyclic graphs (DAGs) but these have a number of limitations that workflows do not:

  • Logic like loops and branches needed to be encoded into the edges of graphs, which made them hard to read and understand.
  • Passing data between nodes in a DAG created complexity around optional and default values and which parameters should be passed.
  • DAGs did not feel natural to developers trying to develop complex, looping, branching AI applications.

The event-based pattern and vanilla python approach of Workflows resolves these problems.

For simple RAG pipelines and linear demos we do not expect you will need Workflows, but as your application grows in complexity, we hope you will reach for them.

A key feature of Workflows is that they enable branching and looping logic more simply and flexibly than graph-based approaches. docs.llamaindex.ai/en/stable/u…
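A minimal sketch of branching under this model (the event classes and the coin-flip condition are illustrative): a step's return type annotation declares which branches exist, and each branch is handled by whichever step accepts that event type.

import asyncio
import random

from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step

class FailedEvent(Event):
    error: str

class QueryEvent(Event):
    query: str

class BranchingWorkflow(Workflow):
    @step
    async def route(self, ev: StartEvent) -> FailedEvent | QueryEvent:
        # Branch on some condition; a coin flip stands in for real logic here.
        if random.random() < 0.5:
            return FailedEvent(error="something went wrong")
        return QueryEvent(query=ev.topic)

    @step
    async def on_failure(self, ev: FailedEvent) -> StopEvent:
        return StopEvent(result=f"Failed: {ev.error}")

    @step
    async def on_query(self, ev: QueryEvent) -> StopEvent:
        return StopEvent(result=f"Answering: {ev.query}")

async def main():
    w = BranchingWorkflow(timeout=10)
    print(await w.run(topic="llamas"))

if __name__ == "__main__":
    asyncio.run(main())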

Examples

定义 Agent

Agents - LlamaIndex

An "agent" is an automated reasoning and decision engine. It takes in a user input/query and can make internal decisions for executing that query in order to return the correct result. The key agent components can include, but are not limited to:

  • Breaking down a complex question into smaller ones
  • Choosing an external Tool to use + coming up with parameters for calling the Tool
  • Planning out a set of tasks
  • Storing previously completed tasks in a memory module

Agents - LlamaIndex

Building a data agent requires the following core components:

  • A reasoning loop
  • Tool abstractions

The reasoning loop depends on the type of agent; LlamaIndex ships several, including function-calling agents and ReAct agents.

Building an Agent

Building a basic agent - LlamaIndex
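Along the lines of that tutorial, a minimal ReAct agent over a function tool (the model name and the multiply tool are illustrative, not prescribed):

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama

def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b

# FunctionTool derives the tool schema from the signature and docstring
multiply_tool = FunctionTool.from_defaults(fn=multiply)

llm = Ollama(model="qwen2.5:7b")
agent = ReActAgent.from_tools([multiply_tool], llm=llm, verbose=True)

response = agent.chat("What is 20.5 multiplied by 3? Use a tool.")
print(response)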

Examples

docs.llamaindex.ai/en/stable/e…

This notebook walks through setting up a Workflow to construct a ReAct agent from (mostly) scratch.

ReAct agents work by prompting an LLM to either invoke tools/functions or return a final response.

Our workflow will be stateful with memory, and will be able to call the LLM to select tools and process incoming user messages.

Agents - LlamaIndex

RAG

docs.llamaindex.ai/en/stable/u…

Stages within RAG

There are five key stages within RAG, which in turn will be a part of most larger applications you build. These are:

  • Loading: this refers to getting your data from where it lives -- whether it's text files, PDFs, another website, a database, or an API -- into your workflow. LlamaHub provides hundreds of connectors to choose from.
  • Indexing: this means creating a data structure that allows for querying the data. For LLMs this nearly always means creating vector embeddings, numerical representations of the meaning of your data, as well as numerous other metadata strategies to make it easy to accurately find contextually relevant data.
  • Storing: once your data is indexed you will almost always want to store your index, as well as other metadata, to avoid having to re-index it.
  • Querying: for any given indexing strategy there are many ways you can utilize LLMs and LlamaIndex data structures to query, including sub-queries, multi-step queries and hybrid strategies.
  • Evaluation: a critical step in any flow is checking how effective it is relative to other strategies, or when you make changes. Evaluation provides objective measures of how accurate, faithful and fast your responses to queries are.
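The first four stages map directly onto LlamaIndex's starter APIs. A minimal sketch, assuming your source files live in a local ./data directory:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Loading: read files from a local directory into Documents
documents = SimpleDirectoryReader("data").load_data()

# Indexing: chunk the documents into Nodes and embed them into a vector index
index = VectorStoreIndex.from_documents(documents)

# Storing: persist the index so it does not have to be rebuilt next time
index.storage_context.persist(persist_dir="./storage")

# Querying: retrieve relevant chunks and synthesize an answer
query_engine = index.as_query_engine()
print(query_engine.query("What does the document say about dogs?"))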

Important concepts within RAG

There are also some terms you'll encounter that refer to steps within each of these stages.

Loading stage

Nodes and Documents: A Document is a container around any data source - for instance, a PDF, an API output, or data retrieved from a database. A Node is the atomic unit of data in LlamaIndex and represents a "chunk" of a source Document. Nodes have metadata that relates them to the document they are in and to other nodes.

Connectors: A data connector (often called a Reader) ingests data from different data sources and data formats into Documents and Nodes.

Indexing Stage

Indexes: Once you've ingested your data, LlamaIndex will help you index the data into a structure that's easy to retrieve. This usually involves generating vector embeddings which are stored in a specialized database called a vector store. Indexes can also store a variety of metadata about your data.

Embeddings: LLMs generate numerical representations of data called embeddings. When filtering your data for relevance, LlamaIndex will convert queries into embeddings, and your vector store will find data that is numerically similar to the embedding of your query.

Querying Stage

Retrievers: A retriever defines how to efficiently retrieve relevant context from an index when given a query. Your retrieval strategy is key to the relevancy of the data retrieved and the efficiency with which it's done.

Routers: A router determines which retriever will be used to retrieve relevant context from the knowledge base. More specifically, the RouterRetriever class is responsible for selecting one or more candidate retrievers to execute a query. It uses a selector to choose the best option based on each candidate's metadata and the query.

Node Postprocessors: A node postprocessor takes in a set of retrieved nodes and applies transformations, filtering, or re-ranking logic to them.

Response Synthesizers: A response synthesizer generates a response from an LLM, using a user query and a given set of retrieved text chunks.
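These querying-stage components can be assembled explicitly instead of via the one-line as_query_engine(). A sketch reusing the index built in the snippet above (the top-k and similarity-cutoff values are arbitrary):

from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core.query_engine import RetrieverQueryEngine

# Retriever: fetch the 5 most similar nodes for a query
retriever = index.as_retriever(similarity_top_k=5)

# Node postprocessor: drop weakly related nodes before synthesis
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.7)

# from_args wires in a default response synthesizer that turns nodes into an answer
query_engine = RetrieverQueryEngine.from_args(
    retriever,
    node_postprocessors=[postprocessor],
)
print(query_engine.query("What does the document say about dogs?"))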

Embeddings

docs.llamaindex.ai/en/stable/m…

Embeddings are used in LlamaIndex to represent your documents using a sophisticated numerical representation. Embedding models take text as input, and return a long list of numbers used to capture the semantics of the text. These embedding models have been trained to represent text this way, and help enable many applications, including search!

At a high level, if a user asks a question about dogs, then the embedding for that question will be highly similar to text that talks about dogs.

When calculating the similarity between embeddings, there are many methods to use (dot product, cosine similarity, etc.). By default, LlamaIndex uses cosine similarity when comparing embeddings.

There are many embedding models to pick from. By default, LlamaIndex uses text-embedding-ada-002 from OpenAI. We also support any embedding model offered by LangChain, as well as an easy-to-extend base class for implementing your own embeddings.
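A short sketch of swapping the default embedding model and inspecting an embedding directly (the model name is one example, not a requirement):

from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Override the global default embedding model
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Embed a piece of text and inspect the resulting vector
vector = Settings.embed_model.get_text_embedding("A dog is playing fetch.")
print(len(vector))  # embedding dimensionality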

docs.llamaindex.ai/en/stable/m…

Graph RAG

docs.llamaindex.ai/en/stable/e…

docs.llamaindex.ai/en/stable/m…

GraphRAG (Graphs + Retrieval Augmented Generation) combines the strengths of Retrieval Augmented Generation (RAG) and Query-Focused Summarization (QFS) to effectively handle complex queries over large text datasets. While RAG excels in fetching precise information, it struggles with broader queries that require thematic understanding, a challenge that QFS addresses but cannot scale well. GraphRAG integrates these approaches to offer responsive and thorough querying capabilities across extensive, diverse text corpora.

This notebook provides guidance on constructing the GraphRAG pipeline using the LlamaIndex PropertyGraph abstractions.

NOTE:  This is an approximate implementation of GraphRAG. We are currently developing a series of cookbooks that will detail the exact implementation of GraphRAG.

Examples

docs.llamaindex.ai/en/stable/e…

CrewAI

Documentation

Open source

Introduction - CrewAI


GitHub - joaomdmoura/crewAI: Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.


The power of AI collaboration has too much to offer. CrewAI is designed to enable AI agents to assume roles, share goals, and operate in a cohesive unit - much like a well-oiled crew. Whether you're building a smart assistant platform, an automated customer service ensemble, or a multi-agent research team, CrewAI provides the backbone for sophisticated multi-agent interactions.

Installation

github.com/joaomdmoura…

pip install crewai

If you also want crewai-tools, a package of extra tools that agents can use (at the cost of additional dependencies), install it as follows:

pip install 'crewai[tools]'

Workflow

Tools

  • Concepts: docs.crewai.com/concepts/tools

Defining an Agent
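A minimal sketch of defining a CrewAI agent and running it in a crew (the role, goal, and task strings are illustrative):

from crewai import Agent, Crew, Task

# An agent is defined by a role, a goal, and a backstory
researcher = Agent(
    role="Researcher",
    goal="Find and summarize concise facts about a topic",
    backstory="An analyst who explains findings clearly.",
)

# A task pairs a description and expected output with an agent
task = Task(
    description="Summarize the key ideas behind multi-agent frameworks.",
    expected_output="A short bullet-point summary.",
    agent=researcher,
)

# A crew orchestrates its agents working through the tasks
crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
print(result)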

RAG

AutoGen

Agent runtime is a key concept of this framework. Besides delivering messages, it also manages agents' lifecycle, so the creation of agents is handled by the runtime.
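A minimal sketch following the autogen-core quickstart pattern (the message and agent classes are illustrative): the agent class is registered with a factory, and the runtime instantiates the agent on demand rather than the application constructing it directly.

import asyncio
from dataclasses import dataclass

from autogen_core import (
    AgentId,
    MessageContext,
    RoutedAgent,
    SingleThreadedAgentRuntime,
    message_handler,
)

@dataclass
class Greeting:
    content: str

class GreeterAgent(RoutedAgent):
    @message_handler
    async def on_greeting(self, message: Greeting, ctx: MessageContext) -> None:
        print(f"Received: {message.content}")

async def main():
    runtime = SingleThreadedAgentRuntime()
    # The runtime owns the agent lifecycle: we register a factory,
    # and the runtime creates the agent when a message arrives for it.
    await GreeterAgent.register(runtime, "greeter", lambda: GreeterAgent("A greeting agent"))
    runtime.start()
    await runtime.send_message(Greeting("Hello, AutoGen!"), AgentId("greeter", "default"))
    await runtime.stop_when_idle()

if __name__ == "__main__":
    asyncio.run(main())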

AutoGen also supports a distributed agent runtime, which can host agents running on different processes or machines, with different identities, languages and dependencies.

To learn how to use agent runtime, communication, message handling, and subscription, please continue reading the sections following this quick start.