LangGraph Complete Mastery Guide


Contents

  1. Part 1: LangGraph Fundamentals
  2. Part 2: StateGraph
  3. Part 3: Nodes and Edges
  4. Part 4: Conditional Edges and Routing
  5. Part 5: Message Handling
  6. Part 6: Tool Integration
  7. Part 7: Persistence and Checkpoints
  8. Part 8: Multi-Agent Systems
  9. Part 9: LangGraph Swarm
  10. Part 10: Advanced Features and Optimization
  11. Part 11: A Complete Application Example

Part 1: LangGraph Fundamentals

1.1 What is LangGraph?

LangGraph is the framework in the LangChain ecosystem dedicated to building stateful, controllable multi-agent applications. Unlike LangChain's chain-style calls, LangGraph represents complex AI workflows as graph structures, giving you fine-grained control over the execution flow.

LangGraph vs LangChain

| Dimension | LangChain | LangGraph |
| --- | --- | --- |
| Abstraction level | High-level chain API | Low-level graph structure |
| Flow control | Linear or conditional branches | Arbitrary graph-shaped flows |
| State management | Implicit (via messages) | Explicit (custom State) |
| Loops | Limited | Full cycle support |
| Multi-agent | Basic support | Native support |
| Persistence | Requires extra work | Built-in Checkpointer |
| Time travel | Not supported | Fully supported |
| Human review | Manual implementation | Native interrupts |

1.2 Installation and Setup

# Base installation
pip install -U langgraph

# LangChain integration
pip install langchain langchain-openai langchain-community

# Database backends (optional)
pip install langgraph-checkpoint-postgres  # PostgreSQL
pip install langgraph-checkpoint-sqlite    # SQLite

# Swarm (multi-agent)
pip install langgraph-swarm

# Development tooling
pip install langgraph-cli  # command-line tool

1.3 LangGraph's Core Architecture

┌─────────────────────────────────────────────────┐
         Graph Execution Engine
  ┌──────────────────────────────────────────┐
        StateGraph

    ┌─────┐      ┌─────┐      ┌─────┐
    │Node1│──────│Node2│──────│Node3│
    └─────┘      └─────┘      └─────┘

  └──────────────────────────────────────────┘

  ┌──────────────────────────────────────────┐
    State (central state object)
    - messages: list
    - custom_fields: ...
  └──────────────────────────────────────────┘

  ┌──────────────────────────────────────────┐
    Checkpointer (persistence layer)
    - SQLite / PostgreSQL
    - Thread management
  └──────────────────────────────────────────┘
└─────────────────────────────────────────────────┘

1.4 Quick Start Example

from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict
from typing import Annotated
from langgraph.graph.message import add_messages

# 1. Define the State
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
    counter: int

# 2. Define node functions
def node_1(state):
    print("Running node 1")
    return {"counter": state["counter"] + 1}

def node_2(state):
    print("Running node 2")
    return {"counter": state["counter"] + 1}

# 3. Create the graph
graph = StateGraph(AgentState)

# 4. Add nodes
graph.add_node("node1", node_1)
graph.add_node("node2", node_2)

# 5. Add edges
graph.add_edge(START, "node1")
graph.add_edge("node1", "node2")
graph.add_edge("node2", END)

# 6. Compile
app = graph.compile()

# 7. Run
result = app.invoke({
    "messages": [],
    "counter": 0
})

print(f"Final count: {result['counter']}")  # prints: 2

Part 2: StateGraph

2.1 Defining State

State is the heart of LangGraph: the central data object that flows through a graph run.

from typing_extensions import TypedDict
from typing import Annotated, List, Optional
from langgraph.graph.message import add_messages

# Approach 1: a basic TypedDict
class SimpleState(TypedDict):
    """Simple state"""
    query: str
    result: str

# Approach 2: a State that carries messages
class MessageState(TypedDict):
    """State containing a message list"""
    messages: Annotated[list, add_messages]  # add_messages handles merging automatically
    user_id: str

# Approach 3: a complex, enterprise-style State
class ComplexAgentState(TypedDict):
    """Complex enterprise-grade state"""
    # Basics
    messages: Annotated[list, add_messages]
    user_id: str
    session_id: str

    # Execution metadata
    current_step: str
    execution_time: float
    tool_calls_count: int

    # Data fields
    search_results: Optional[list]
    analysis_data: Optional[dict]
    final_answer: Optional[str]

    # Error handling
    error_message: Optional[str]
    retry_count: int

    # Audit trail
    actions_log: Annotated[list, lambda x, y: x + y]

# Approach 4: dynamically generated state
def create_custom_state(fields: dict):
    """Generate a State class dynamically"""
    return TypedDict('DynamicState', fields)

custom_state = create_custom_state({
    'messages': list,
    'custom_field_1': str,
    'custom_field_2': int
})

2.2 Annotated Reducers

Annotated declares how updates to a state field are merged:

from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages

# Custom reducer: keep the larger value, ignore None updates.
# Defined outside the class: a TypedDict body may only contain annotations.
def custom_reducer(prev, new):
    if new is None:
        return prev
    return max(prev, new) if prev else new

# Common reducers
class State(TypedDict):
    # message merging (recommended for chat history)
    messages: Annotated[list, add_messages]

    # overwrite (the latest value replaces the previous one)
    current_node: Annotated[str, lambda _, new: new]

    # list concatenation
    results: Annotated[list, lambda prev, new: prev + new]

    # dict merging
    metadata: Annotated[dict, lambda prev, new: {**prev, **new}]

    score: Annotated[int, custom_reducer]

# How add_messages works:
# - new messages are appended to the list
# - a new message whose `id` matches an existing message's `id` replaces
#   that message (this is what makes streamed updates and edits work)
messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there"},
]
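The reducer mechanics above can be sketched without the library. This is an illustrative toy, not LangGraph's internal code: a merge step that applies each field's registered reducer and falls back to last-write-wins.

```python
# Illustrative sketch of reducer-based state merging (not LangGraph internals).
def merge_state(state, update, reducers):
    """Apply a node's partial update to the state, field by field.

    Fields with a registered reducer are combined; all others are overwritten.
    """
    merged = dict(state)
    for key, new_value in update.items():
        if key in reducers:
            merged[key] = reducers[key](state.get(key), new_value)
        else:
            merged[key] = new_value  # default: last write wins
    return merged

reducers = {
    "messages": lambda prev, new: (prev or []) + new,        # append
    "metadata": lambda prev, new: {**(prev or {}), **new},   # dict merge
}

state = {"messages": [{"role": "user", "content": "hi"}],
         "metadata": {"a": 1}, "step": 0}
update = {"messages": [{"role": "assistant", "content": "hello"}],
          "metadata": {"b": 2}, "step": 1}
state = merge_state(state, update, reducers)
# "messages" was appended to, "metadata" was merged, "step" was overwritten
```

Each node returns only a partial update; the engine is responsible for applying it, which is why the reducer choice per field matters.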

2.3 Compiling and Configuring a Graph

from langgraph.graph import StateGraph
from langgraph.checkpoint.sqlite import SqliteSaver

# Basic compilation
graph = StateGraph(AgentState)
# ... add nodes and edges
app = graph.compile()

# Compile with a checkpointer (persistence)
# Note: in recent langgraph-checkpoint-sqlite releases, from_conn_string
# returns a context manager; use `with SqliteSaver.from_conn_string(...) as cp:`
checkpointer = SqliteSaver.from_conn_string(":memory:")
app = graph.compile(
    checkpointer=checkpointer
)

# Debugging options
app = graph.compile(
    checkpointer=checkpointer,
    interrupt_before=["node_name"],  # pause before the node runs
    interrupt_after=["node_name"],   # pause after the node runs
)

# Runtime configuration
config = {
    "configurable": {
        "thread_id": "user_123"  # used for persistence
    }
}

result = app.invoke(initial_state, config=config)

Part 3: Nodes and Edges

3.1 Defining Nodes

A node is a unit of computation in the graph: it receives the State and returns an update.

# Approach 1: a plain function node
def process_node(state):
    """A node that processes data"""
    # read state
    messages = state["messages"]
    counter = state["counter"]

    # do the work
    new_counter = counter + 1

    # return a state update
    return {"counter": new_counter}

# Approach 2: a more involved node (with error handling)
def robust_node(state):
    """A node with error handling"""
    try:
        data = state.get("data", [])
        if not data:
            return {"error_message": "No data"}

        # process the data
        result = sum(data)
        return {
            "result": result,
            "error_message": None
        }
    except Exception as e:
        return {
            "error_message": str(e),
            "retry_count": state.get("retry_count", 0) + 1
        }

# Approach 3: an async node
import asyncio

async def async_node(state):
    """An asynchronous node"""
    # simulate async work
    await asyncio.sleep(1)

    return {
        "result": "Async processing complete"
    }

# Approach 4: a node class (object-oriented)
class ProcessorNode:
    """Node implemented as a class"""
    def __init__(self, config):
        self.config = config

    def __call__(self, state):
        """Invoke the node"""
        # process using the configuration
        return {"processed": True}

# Approach 5: a node that calls an LLM
from langchain.chat_models import init_chat_model
from langchain_core.messages import AIMessage

def llm_node(state, llm=None):
    """A node integrated with an LLM"""
    if llm is None:
        llm = init_chat_model("gpt-4o-mini")

    # pull the messages out of the state
    messages = state["messages"]

    # call the LLM
    response = llm.invoke(messages)

    # append to the message list
    return {
        "messages": [AIMessage(content=response.content)]
    }

# Add the nodes to the graph
graph.add_node("process", process_node)
graph.add_node("robust", robust_node)
graph.add_node("async", async_node)
graph.add_node("processor", ProcessorNode(config={}))
graph.add_node("llm", llm_node)

3.2 Defining Edges

Edges define how nodes connect.

from langgraph.graph import START, END

# Approach 1: plain edges
graph.add_edge(START, "node_a")      # from start to node_a
graph.add_edge("node_a", "node_b")   # from node_a to node_b
graph.add_edge("node_b", END)        # from node_b to the end

# Approach 2: conditional edges (covered in detail later)
graph.add_conditional_edges(
    "node_a",
    decide_next_node,
    {
        "path_1": "node_b",
        "path_2": "node_c"
    }
)

# Approach 3: many-to-many edges
graph.add_edge("node_a", "node_b")
graph.add_edge("node_a", "node_c")  # node_a fans out to several nodes
graph.add_edge("node_b", "node_d")
graph.add_edge("node_c", "node_d")  # several nodes converge on node_d

# Approach 4: looping edges (retries or iteration)
def should_retry(state):
    """Decide whether to retry"""
    if state.get("retry_count", 0) < 3 and state.get("error_message"):
        return "retry"
    return "end"

graph.add_conditional_edges(
    "process_node",
    should_retry,
    {
        "retry": "process_node",  # loop back to the same node
        "end": END
    }
)

3.3 Node Inputs and Outputs

# Nodes can receive extra dependencies
def node_with_dependencies(state, config):
    """A node that also receives the run config"""
    # read configuration values
    thread_id = config.get("configurable", {}).get("thread_id")
    user_id = config.get("configurable", {}).get("user_id")

    return {
        "processed": True,
        "thread_id": thread_id
    }

# Nodes can return a partial State update
def partial_update_node(state):
    """Update only some fields"""
    # this node only touches the counter field;
    # every other field is left unchanged
    return {
        "counter": state["counter"] + 1
        # no need to return the whole state
    }

# Nodes can return None (no state update)
def conditional_update_node(state):
    """Conditionally update"""
    if state.get("should_update"):
        return {"result": "Updated"}
    return None  # leave the state untouched

# Nodes can update messages incrementally
def smart_message_node(state):
    """Smart message handling"""
    messages = state["messages"]

    # only append a new message; existing messages are untouched
    new_message = {"role": "assistant", "content": "Done"}

    return {
        "messages": [new_message]  # add_messages merges this in automatically
    }

Part 4: Conditional Edges and Routing

4.1 Basic Conditional Edges

from langgraph.graph import StateGraph, END

# Routing function
def route_to_next_node(state):
    """Pick the next node based on the state"""
    if state.get("needs_processing"):
        return "process"
    elif state.get("needs_analysis"):
        return "analysis"
    else:
        return "default"

# Add the conditional edges
graph.add_conditional_edges(
    "start_node",
    route_to_next_node,
    {
        "process": "process_node",
        "analysis": "analysis_node",
        "default": "default_node"
    }
)

# Mapping to END
graph.add_conditional_edges(
    "decision_node",
    lambda state: "continue" if state.get("flag") else "stop",
    {
        "continue": "next_node",
        "stop": END
    }
)

4.2 Advanced Routing Patterns

The Router Node pattern

from langchain.chat_models import init_chat_model
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

class RouteDecision(BaseModel):
    """Routing decision"""
    next_node: str = Field(description="The next node")
    reason: str = Field(description="Why it was chosen")

def router_node(state):
    """Use an LLM to route intelligently"""
    llm = init_chat_model("gpt-4o-mini")

    # assumes dict-style messages; use `.content` for message objects
    prompt = f"""
    Based on the current message, decide the next step:

    Message: {state['messages'][-1]['content']}

    Available options:
    - search: information needs to be looked up
    - analyze: data needs to be analyzed
    - generate: content needs to be generated
    - end: the task is finished

    Return the decision as JSON.
    """

    response = llm.invoke(prompt)
    parser = JsonOutputParser(pydantic_object=RouteDecision)
    decision = parser.parse(response.content)  # returns a dict

    return {
        "next_action": decision["next_node"],
        "reason": decision["reason"]
    }

graph.add_node("router", router_node)

def route_based_on_llm(state):
    """Route based on the router node's output"""
    action = state.get("next_action")
    return action or "search"

graph.add_conditional_edges(
    "router",
    route_based_on_llm,
    {
        "search": "search_node",
        "analyze": "analyze_node",
        "generate": "generate_node",
        "end": END
    }
)

The Fan-out/Fan-in pattern

# One node scatters work to several nodes, whose results are then gathered

def fan_out_decision(state):
    """Decide whether to process in parallel"""
    if state.get("parallel_process"):
        return "parallel"
    return "sequential"

# Parallel workers
def parallel_node_a(state):
    return {"result_a": "Result A"}

def parallel_node_b(state):
    return {"result_b": "Result B"}

def parallel_node_c(state):
    return {"result_c": "Result C"}

def merge_results(state):
    """Merge the parallel results"""
    return {
        "merged_result": {
            "a": state.get("result_a"),
            "b": state.get("result_b"),
            "c": state.get("result_c")
        }
    }

# Build the graph
graph.add_conditional_edges(
    "start",
    fan_out_decision,
    {
        "parallel": "dispatch",          # pass-through node that fans out
        "sequential": "sequential_node"
    }
)

# Parallel branch: multiple edges out of one node run their targets
# concurrently within the same superstep
graph.add_node("dispatch", lambda state: {})
graph.add_edge("dispatch", "parallel_a")
graph.add_edge("dispatch", "parallel_b")
graph.add_edge("dispatch", "parallel_c")
graph.add_edge("parallel_a", "merge")
graph.add_edge("parallel_b", "merge")
graph.add_edge("parallel_c", "merge")

# Sequential branch
graph.add_edge("sequential_node", "merge")

# Fan in
graph.add_node("merge", merge_results)
graph.add_edge("merge", END)
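Outside any framework, the fan-out/fan-in idea reduces to "run independent steps concurrently, then merge their partial updates". A minimal stdlib sketch (illustrative only, not LangGraph's scheduler):

```python
from concurrent.futures import ThreadPoolExecutor

def worker_a(state):
    return {"result_a": "A"}

def worker_b(state):
    return {"result_b": "B"}

def worker_c(state):
    return {"result_c": "C"}

def fan_out_fan_in(state, workers):
    """Run each worker on the same input state, then merge their partial updates."""
    with ThreadPoolExecutor() as pool:
        updates = list(pool.map(lambda w: w(state), workers))
    merged = dict(state)
    for update in updates:
        merged.update(update)  # each worker writes disjoint keys
    return merged

final = fan_out_fan_in({}, [worker_a, worker_b, worker_c])
```

The disjoint-keys assumption matters: if two parallel workers wrote the same key, you would need a reducer to decide how their values combine, which is exactly what Annotated reducers do in LangGraph state.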

4.3 Complex Decision Logic

def complex_routing(state):
    """Multi-condition routing"""
    counter = state.get("counter", 0)
    has_error = state.get("error_message") is not None
    priority = state.get("priority", "normal")

    # error handling takes precedence
    if has_error:
        if state.get("retry_count", 0) < 3:
            return "retry"
        else:
            return "error_handler"

    # counter-based logic
    if counter >= 10:
        return "finalize"

    # priority logic
    if priority == "high":
        return "fast_track"

    return "normal"

graph.add_conditional_edges(
    "process",
    complex_routing,
    {
        "retry": "process",
        "error_handler": "error_node",
        "finalize": "finalize_node",
        "fast_track": "fast_track_node",
        "normal": "normal_node"
    }
)
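Because a routing function is a plain function of the state, it can be unit-tested in isolation, with no graph or LLM involved. A self-contained copy of the logic above with example calls:

```python
def complex_routing(state):
    """Multi-condition routing (same logic as above)."""
    counter = state.get("counter", 0)
    has_error = state.get("error_message") is not None
    priority = state.get("priority", "normal")

    if has_error:
        if state.get("retry_count", 0) < 3:
            return "retry"
        return "error_handler"
    if counter >= 10:
        return "finalize"
    if priority == "high":
        return "fast_track"
    return "normal"

# errors win over everything else
assert complex_routing({"error_message": "boom"}) == "retry"
assert complex_routing({"error_message": "boom", "retry_count": 3}) == "error_handler"
# then the counter, then priority, then the default
assert complex_routing({"counter": 10}) == "finalize"
assert complex_routing({"priority": "high"}) == "fast_track"
assert complex_routing({}) == "normal"
```

Testing the router directly like this catches precedence mistakes long before they show up as a graph taking the wrong edge.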

Part 5: Message Handling

5.1 The Message System in Detail

from langchain_core.messages import (
    HumanMessage,
    AIMessage,
    SystemMessage,
    ToolMessage,
    BaseMessage
)
from typing import Annotated
from langgraph.graph.message import add_messages

# 5.1.1 Standard message types
messages = [
    SystemMessage(content="You are a helpful assistant"),
    HumanMessage(content="Please explain what AI is"),
    AIMessage(content="AI is artificial intelligence..."),
    ToolMessage(
        content="Search results: ...",
        tool_call_id="tool_123"
    )
]

# 5.1.2 Custom message classes
# Note: MessageLikeRepresentation is a type alias, not a base class;
# subclass BaseMessage to define a custom message type.
class CustomMessage(BaseMessage):
    """Custom message class"""
    type: str = "custom"

# 5.1.3 Message conversion
def convert_to_dict(messages):
    """Convert messages to dict form"""
    return [
        {
            "role": m.type if hasattr(m, 'type') else m.get('role'),
            "content": m.content if hasattr(m, 'content') else m.get('content')
        }
        for m in messages
    ]

# 5.1.4 Filtering and selecting messages
def get_last_n_messages(messages, n=5):
    """Get the last N messages"""
    return messages[-n:]

def get_messages_by_role(messages, role):
    """Filter messages by role (dict-style messages)"""
    return [m for m in messages if m.get('role') == role]

def summarize_messages(messages):
    """Condense the history (for long conversations)"""
    # keep system messages plus the most recent messages
    system_msgs = [m for m in messages if m.get('role') == 'system']
    recent_msgs = messages[-10:]  # last 10

    # combine (a system message inside the recent window is duplicated;
    # deduplicate if that matters for your use case)
    return system_msgs + recent_msgs
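A window-trimming helper like the one above is easy to verify directly. This sketch (the name `trim_history` is illustrative, not a library function) also handles the duplication case mentioned in the comment:

```python
def trim_history(messages, keep_recent=10):
    """Keep all system messages plus the most recent `keep_recent` messages."""
    system = [m for m in messages if m.get("role") == "system"]
    recent = messages[-keep_recent:]
    # avoid duplicating a system message that already sits in the recent window
    return system + [m for m in recent if m not in system]

history = [{"role": "system", "content": "You are helpful"}]
history += [{"role": "user", "content": f"q{i}"} for i in range(30)]
trimmed = trim_history(history, keep_recent=5)
# 1 system message + the 5 most recent user turns survive
```

Trimming like this keeps prompt sizes bounded while preserving the instructions that shape the model's behavior.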

5.2 Message-Processing Nodes

def message_processor_node(state):
    """A node that processes messages"""
    messages = state["messages"]

    # grab the last message
    last_message = messages[-1] if messages else None

    # handle both dict-style and object-style messages
    if isinstance(last_message, dict):
        role = last_message.get("role")
        content = last_message.get("content")
    else:
        role = last_message.type
        content = last_message.content

    # append the result
    new_message = AIMessage(content=f"Processed a message from {role}")

    return {
        "messages": [new_message]
    }

def message_filter_node(state):
    """A node that filters messages"""
    messages = state["messages"]

    # drop duplicate messages
    filtered = []
    seen_content = set()

    for msg in messages:
        content = msg.content if hasattr(msg, 'content') else msg.get('content', '')

        if content not in seen_content:
            filtered.append(msg)
            seen_content.add(content)

    # if the history is too long, condense it
    if len(filtered) > 20:
        # keep system messages and the last 10
        system_msgs = [m for m in filtered if m.type == 'system']
        recent_msgs = filtered[-10:]
        filtered = system_msgs + recent_msgs

    # note: if this field uses the add_messages reducer, returning a list
    # appends/updates rather than replaces; to actually shrink the history,
    # use RemoveMessage or a plain-list field with an overwrite reducer
    return {
        "messages": filtered
    }

def message_enrichment_node(state):
    """A node that enriches messages"""
    from datetime import datetime

    messages = state["messages"]

    if not messages:
        return {}

    last_message = messages[-1]

    # attach metadata (assumes dict-style messages)
    enriched_message = {
        **last_message,
        "timestamp": datetime.now().isoformat(),
        "processed": True
    }

    return {
        "messages": [enriched_message]
    }

5.3 Message Flow with an LLM

from langchain.chat_models import init_chat_model
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages

class ConversationState(TypedDict):
    messages: Annotated[list, add_messages]

def llm_conversation_node(state, llm=None):
    """LLM conversation node"""
    if llm is None:
        llm = init_chat_model("gpt-4o")

    messages = state["messages"]

    # call the LLM
    response = llm.invoke(messages)

    # return the new message
    return {
        "messages": [response]
    }

# Use it in a graph
graph = StateGraph(ConversationState)
graph.add_node("assistant", llm_conversation_node)

# Nodes can pick the LLM from the run config
def configurable_llm_node(state, config=None):
    """A configurable LLM node"""
    llm_name = config.get("configurable", {}).get("model", "gpt-4o-mini")
    llm = init_chat_model(llm_name)

    return {
        "messages": [llm.invoke(state["messages"])]
    }

Part 6: Tool Integration

6.1 Prebuilt Tool Nodes

LangGraph ships prebuilt tool-handling nodes that simplify tool integration.

from langgraph.prebuilt import ToolNode, tools_condition
from langchain_core.tools import tool

# Define tools
@tool
def search_web(query: str) -> str:
    """Search the web"""
    return f"Search results about '{query}'..."

@tool
def calculate(expression: str) -> float:
    """Evaluate an arithmetic expression"""
    # warning: eval on untrusted input is dangerous; use a real expression
    # parser or a math library in production
    return eval(expression)

tools = [search_web, calculate]

# Create the tool node
tool_node = ToolNode(tools)

# Add it to the graph
graph.add_node("tools", tool_node)

def llm_node(state):
    """The node where the LLM may call tools"""
    llm = init_chat_model("gpt-4o").bind_tools(tools)
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

graph.add_node("llm", llm_node)

# Conditional routing: go to the tool node only when there are tool calls
graph.add_conditional_edges(
    "llm",
    tools_condition,
    {
        "tools": "tools",
        "__end__": END
    }
)

graph.add_edge("tools", "llm")

6.2 Custom Tool Handling

from langchain_core.messages import ToolMessage

def custom_tool_executor(state):
    """A hand-rolled tool-execution node"""
    messages = state["messages"]
    last_message = messages[-1]

    # is there anything to execute?
    if not hasattr(last_message, 'tool_calls') or not last_message.tool_calls:
        return {}

    tool_map = {
        "search_web": search_web,
        "calculate": calculate
    }

    results = []
    for tool_call in last_message.tool_calls:
        tool_name = tool_call["name"]
        tool_args = tool_call["args"]

        if tool_name in tool_map:
            try:
                result = tool_map[tool_name].invoke(tool_args)
                results.append(
                    ToolMessage(
                        content=str(result),
                        tool_call_id=tool_call["id"],
                        name=tool_name
                    )
                )
            except Exception as e:
                results.append(
                    ToolMessage(
                        content=f"Error: {str(e)}",
                        tool_call_id=tool_call["id"],
                        status="error"
                    )
                )

    return {"messages": results}

# Add it to the graph
graph.add_node("tool_executor", custom_tool_executor)

6.3 Tool Error Handling and Retries

import time

def robust_tool_execution(state):
    """Tool execution with retries"""
    messages = state["messages"]
    last_message = messages[-1]

    if not hasattr(last_message, 'tool_calls'):
        return {}

    results = []
    for tool_call in last_message.tool_calls:
        max_retries = 3
        for attempt in range(max_retries):
            try:
                tool_name = tool_call["name"]
                tool_args = tool_call["args"]

                # run the tool (execute_tool is an application-specific helper)
                result = execute_tool(tool_name, tool_args)

                results.append(
                    ToolMessage(
                        content=str(result),
                        tool_call_id=tool_call["id"]
                    )
                )
                break
            except Exception as e:
                if attempt == max_retries - 1:
                    results.append(
                        ToolMessage(
                            content=f"Tool failed after {max_retries} attempts: {str(e)}",
                            tool_call_id=tool_call["id"],
                            status="error"
                        )
                    )
                else:
                    # retry with exponential backoff
                    time.sleep(2 ** attempt)

    return {"messages": results}
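The retry-with-backoff loop above can be factored into a reusable helper. This sketch takes an injectable `sleep` so the backoff schedule can be verified without waiting (the helper name is illustrative, not a LangGraph API):

```python
def retry_with_backoff(fn, max_retries=3, base_delay=1.0, sleep=None):
    """Call `fn`, retrying on failure with exponential backoff (1s, 2s, 4s, ...)."""
    import time
    sleep = sleep or time.sleep
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * (2 ** attempt))

calls = {"n": 0}
def flaky():
    """Fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

delays = []
result = retry_with_backoff(flaky, sleep=delays.append)
# succeeds on the third call; the recorded delays are [1.0, 2.0]
```

Injecting the sleep function keeps the backoff policy testable, which is worth doing before wiring retries into a long-running graph.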

Part 7: Persistence and Checkpoints

7.1 The Checkpoint System

LangGraph's checkpoint system underpins persistence, human review, and time travel.

from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.checkpoint.postgres import PostgresSaver

# SQLite checkpointer (local development)
# Note: in recent releases from_conn_string returns a context manager;
# use `with SqliteSaver.from_conn_string("langgraph.db") as cp: ...`
sqlite_checkpointer = SqliteSaver.from_conn_string("langgraph.db")

# PostgreSQL checkpointer (production)
postgres_checkpointer = PostgresSaver.from_conn_string(
    "postgresql://user:password@localhost/langgraph_db"
)

# Compile the graph with a checkpointer
app = graph.compile(checkpointer=sqlite_checkpointer)

# Runs must supply a thread_id
config = {
    "configurable": {
        "thread_id": "user_123"
    }
}

# First run
result = app.invoke(initial_state, config=config)

# Inspect the checkpoints
checkpointer = sqlite_checkpointer

# list() yields CheckpointTuple objects for the thread
for ckpt in checkpointer.list(config):
    print(f"Checkpoint ID: {ckpt.config['configurable']['checkpoint_id']}")
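The contract a checkpointer fulfills is simple: per-thread, append-only snapshots of state that can be listed and read back by ID. A toy in-memory version (illustrative only; the real interface is LangGraph's BaseCheckpointSaver):

```python
from collections import defaultdict
import copy

class InMemoryCheckpointer:
    """Toy checkpoint store: one append-only list of state snapshots per thread."""
    def __init__(self):
        self._threads = defaultdict(list)

    def put(self, thread_id, state):
        """Save a deep-copied snapshot; returns its checkpoint id."""
        snapshot = {"id": len(self._threads[thread_id]),
                    "state": copy.deepcopy(state)}
        self._threads[thread_id].append(snapshot)
        return snapshot["id"]

    def list(self, thread_id):
        """All snapshots for a thread, oldest first."""
        return list(self._threads[thread_id])

    def get(self, thread_id, checkpoint_id=None):
        """Latest state, or the state at a specific checkpoint."""
        snapshots = self._threads[thread_id]
        if checkpoint_id is None:
            return snapshots[-1]["state"]
        return snapshots[checkpoint_id]["state"]

cp = InMemoryCheckpointer()
cp.put("user_123", {"counter": 0})
cp.put("user_123", {"counter": 1})
latest = cp.get("user_123")                    # most recent snapshot
first = cp.get("user_123", checkpoint_id=0)    # time travel to the start
```

The deep copy is the important detail: snapshots must be immutable with respect to later state mutations, or "time travel" would silently read the present.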

7.2 Thread Management

# All of a user's turns live in the same thread
user_thread_id = "user_123"

# First interaction
config = {"configurable": {"thread_id": user_thread_id}}
result1 = app.invoke(
    {"messages": [HumanMessage(content="What is AI?")]},
    config=config
)

# Read the current thread state
thread_state = app.get_state(config)
print(f"Message count: {len(thread_state.values['messages'])}")

# Second interaction (continues from the previous state)
result2 = app.invoke(
    {"messages": [HumanMessage(content="What are its applications?")]},
    config=config
)

# Walk the thread's history
history = list(checkpointer.list(config))
print(f"Checkpoints so far: {len(history)}")

7.3 Time Travel

# Rewind to an earlier checkpoint
config = {"configurable": {"thread_id": "user_123"}}

# List the checkpoints (yielded newest-first)
checkpoints = list(checkpointer.list(config))

# Pick a checkpoint to return to, e.g. the oldest one
historical = checkpoints[-1]
historical_checkpoint_id = historical.config["configurable"]["checkpoint_id"]

# Read the state at that checkpoint
config["configurable"]["checkpoint_id"] = historical_checkpoint_id
historical_state = app.get_state(config)

print(f"Restored state: {historical_state.values}")

# Continue from the historical point (forking a new branch)
new_config = {
    "configurable": {
        "thread_id": "user_123_branch",  # new thread ID
        "checkpoint_id": historical_checkpoint_id
    }
}

# This starts a new branch from the historical checkpoint
result = app.invoke(
    {"messages": [HumanMessage(content="A different question")]},
    config=new_config
)

7.4 Human-in-the-Loop

# Pause at critical nodes and wait for human review

app = graph.compile(
    checkpointer=sqlite_checkpointer,
    interrupt_before=["critical_decision_node"],  # pause before this node
    interrupt_after=["data_modification_node"]    # pause after this node
)

# Run up to the breakpoint; with a static breakpoint, invoke does not raise —
# it simply stops and returns the state produced so far
config = {"configurable": {"thread_id": "user_123"}}
result = app.invoke(initial_state, config=config)

# Inspect the paused state
current_state = app.get_state(config)
print(f"Current state: {current_state.values}")
print(f"Next node(s) to run: {current_state.next}")

# Human review: patch the state
app.update_state(config, {
    "approved": True,
    "reviewer_comment": "Approved"
})

# Resume execution (passing None continues from the checkpoint)
result = app.invoke(None, config=config)

7.5 Failure Recovery

def potentially_failing_node(state):
    """A node that may fail"""
    # simulate a possible failure
    if state.get("trigger_error"):
        raise ValueError("Expected error occurred")

    return {"processed": True}

graph.add_node("potentially_failing", potentially_failing_node)

# Run the graph that may fail
config = {"configurable": {"thread_id": "user_123"}}

try:
    result = app.invoke(
        {"messages": [], "trigger_error": True},
        config=config
    )
except ValueError as e:
    print(f"Error occurred: {e}")

    # repair the state (remove the trigger)
    app.update_state(config, {"trigger_error": False})

    # re-run (resumes from the failed node's checkpoint)
    result = app.invoke(None, config=config)
    print(f"Result after recovery: {result}")

Part 8: Multi-Agent Systems

8.1 A Multi-Agent Architecture

from typing_extensions import TypedDict
from typing import Annotated
from langgraph.graph.message import add_messages

# Multi-agent state
class MultiAgentState(TypedDict):
    messages: Annotated[list, add_messages]
    current_agent: str
    completed_agents: Annotated[list, lambda x, y: list(set(x + y))]
    final_result: str

# Define the agents
def researcher_agent(state):
    """Researcher agent"""
    llm = init_chat_model("gpt-4o")
    prompt = "As a researcher, analyze the information provided..."

    response = llm.invoke(state["messages"] + [HumanMessage(content=prompt)])

    return {
        "messages": [
            AIMessage(content=f"[Researcher] {response.content}")
        ],
        "current_agent": "writer",
        "completed_agents": ["researcher"]
    }

def writer_agent(state):
    """Writer agent"""
    llm = init_chat_model("gpt-4o")
    prompt = "Based on the research, write a professional summary..."

    response = llm.invoke(state["messages"] + [HumanMessage(content=prompt)])

    return {
        "messages": [
            AIMessage(content=f"[Writer] {response.content}")
        ],
        "current_agent": "editor",
        "completed_agents": ["researcher", "writer"]
    }

def editor_agent(state):
    """Editor agent"""
    llm = init_chat_model("gpt-4o")
    prompt = "Review the text for quality and suggest improvements..."

    response = llm.invoke(state["messages"] + [HumanMessage(content=prompt)])

    return {
        "messages": [
            AIMessage(content=f"[Editor] {response.content}")
        ],
        "current_agent": "final",
        "completed_agents": ["researcher", "writer", "editor"],
        "final_result": response.content
    }

# Build the multi-agent graph
graph = StateGraph(MultiAgentState)

graph.add_node("researcher", researcher_agent)
graph.add_node("writer", writer_agent)
graph.add_node("editor", editor_agent)

graph.add_edge(START, "researcher")
graph.add_edge("researcher", "writer")
graph.add_edge("writer", "editor")
graph.add_edge("editor", END)

app = graph.compile()

8.2 Inter-Agent Communication

# Agents communicate through shared state

class CommunicationState(TypedDict):
    messages: Annotated[list, add_messages]
    data_store: Annotated[dict, lambda x, y: {**x, **y}]    # shared data
    agent_status: Annotated[dict, lambda x, y: {**x, **y}]  # shared status

def agent_a(state):
    """Agent A - produces data"""
    # A does its work and produces data
    data = {"result": "data from A"}

    return {
        "data_store": data,
        "agent_status": {"A": "completed"}
    }

def agent_b(state):
    """Agent B - consumes agent A's data"""
    # B reads A's output
    data_from_a = state["data_store"].get("result")

    # work based on A's data
    processed = f"Processed data: {data_from_a}"

    return {
        "data_store": {"processed": processed},
        "agent_status": {"B": "completed"}
    }

def agent_c(state):
    """Agent C - aggregates"""
    # C sees everything produced so far
    final = {
        "all_data": state["data_store"],
        "status": state["agent_status"]
    }

    return {
        "data_store": final,
        "agent_status": {"C": "completed"}
    }
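The communication pattern above boils down to: each agent returns a partial update, and dict-valued fields are merged rather than overwritten. A self-contained sketch of that pipeline (illustrative, not LangGraph's executor):

```python
def run_pipeline(agents, initial_state):
    """Run agents in order, merging dict fields with a dict-merge reducer."""
    state = dict(initial_state)
    for agent in agents:
        update = agent(state)
        for key, value in update.items():
            if isinstance(state.get(key), dict) and isinstance(value, dict):
                state[key] = {**state[key], **value}  # shared-dict reducer
            else:
                state[key] = value
    return state

def agent_a(state):
    return {"data_store": {"result": "data from A"},
            "agent_status": {"A": "completed"}}

def agent_b(state):
    # B reads A's output through the shared data_store
    upstream = state["data_store"]["result"]
    return {"data_store": {"processed": f"processed: {upstream}"},
            "agent_status": {"B": "completed"}}

final = run_pipeline([agent_a, agent_b],
                     {"data_store": {}, "agent_status": {}})
# data_store accumulates both agents' keys; agent_status shows both completed
```

Because the reducer merges rather than replaces, agent B's write does not erase agent A's `result` key, which is exactly the property the Annotated dict reducers give the real graph.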

8.3 Agent Coordination

# A coordinator node manages which agents run

def coordinator(state):
    """Coordinate the agents"""
    # inspect the message to decide which agents are needed
    last_message = state["messages"][-1]
    content = last_message.content

    agents_needed = []
    if "analyze" in content:
        agents_needed.append("analyzer")
    if "generate" in content:
        agents_needed.append("generator")
    if "validate" in content:
        agents_needed.append("validator")

    return {
        "agents_needed": agents_needed,
        "current_agent_index": 0
    }

def execute_agents(state):
    """Run the required agents in order"""
    agents_needed = state.get("agents_needed", [])
    current_index = state.get("current_agent_index", 0)

    if current_index >= len(agents_needed):
        return {"execution_complete": True}

    # run the current agent
    agent_name = agents_needed[current_index]
    # ... execution logic

    return {"current_agent_index": current_index + 1}

Part 9: LangGraph Swarm

9.1 What is Swarm?

LangGraph Swarm (`langgraph-swarm`) is an extension library built on top of LangGraph for swarm-style multi-agent systems: agents hand control off to one another dynamically, and the system remembers which agent was last active so the conversation resumes with it.

# Install Swarm
pip install langgraph-swarm

9.2 Basic Swarm Usage

As of recent releases, the public API centers on `create_react_agent` from langgraph.prebuilt, plus `create_handoff_tool` and `create_swarm` from langgraph_swarm:

from langgraph.prebuilt import create_react_agent
from langgraph_swarm import create_swarm, create_handoff_tool
from langgraph.checkpoint.memory import InMemorySaver

# A handoff tool lets one agent transfer control to another
transfer_to_support = create_handoff_tool(
    agent_name="support",
    description="Transfer to the support team when the user has a problem or complaint",
)
transfer_to_booking = create_handoff_tool(
    agent_name="booking",
    description="Transfer to the booking team when the user wants to make a booking",
)

booking_agent = create_react_agent(
    "openai:gpt-4o-mini",
    tools=[transfer_to_support],
    prompt="You are a booking agent who helps users book hotels and flights.",
    name="booking",
)

support_agent = create_react_agent(
    "openai:gpt-4o-mini",
    tools=[transfer_to_booking],
    prompt="You are a customer support representative handling questions and complaints.",
    name="support",
)

# Build the swarm; default_active_agent answers first
checkpointer = InMemorySaver()
app = create_swarm(
    [booking_agent, support_agent],
    default_active_agent="booking",
).compile(checkpointer=checkpointer)

# Run
result = app.invoke(
    {"messages": [{"role": "user", "content": "I'd like to book a hotel"}]},
    config={"configurable": {"thread_id": "1"}},
)

9.3 Controlling Handoffs

In Swarm, handoffs are ordinary tools, so the active agent's LLM decides when to transfer, guided by each handoff tool's description. To shape that behavior you can:

- write precise handoff descriptions ("Transfer to support whenever the user reports a problem or complaint");
- give each agent only the handoff tools it is allowed to use;
- wrap the handoff in your own tool that applies deterministic logic (for example, keyword checks) before transferring.

A keyword heuristic used inside such a custom handoff might look like:

def should_transfer_to_support(text: str) -> bool:
    """Heuristic: route to support if the message mentions a problem."""
    problematic_keywords = ["problem", "complaint", "cannot", "error"]
    return any(keyword in text for keyword in problematic_keywords)

Part 10: Advanced Features and Optimization

10.1 Streaming

# LangGraph has first-class streaming support

app = graph.compile()

# Stream the run
for event in app.stream({"messages": []}):
    print(f"Event: {event}")

# Choose a stream mode
for event in app.stream(
    {"messages": []},
    stream_mode="values"  # emit the full state after each step
):
    print(f"State: {event}")

for event in app.stream(
    {"messages": []},
    stream_mode="updates"  # emit only the updates
):
    print(f"Update: {event}")

# Async streaming
async def async_stream_example():
    async for event in app.astream({"messages": []}):
        print(f"Async event: {event}")

import asyncio
asyncio.run(async_stream_example())

10.2 调试与监控

import logging
from langsmith import Client

# 启用LangSmith追踪
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = "your-key"
os.environ["LANGSMITH_PROJECT"] = "langgraph-debug"

# 设置日志
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

# 自定义调试节点
def debug_node(state):
    """调试节点"""
    logger.debug(f"当前状态:{state}")
    logger.debug(f"消息数:{len(state.get('messages', []))}")
    
    return {}

# 添加到图
graph.add_node("debug", debug_node)

# 获取执行统计
def get_execution_stats(app, config):
    """获取执行统计"""
    thread = app.get_state(config)
    
    stats = {
        "nodes_executed": len(thread.values.get("messages", [])),
        "current_state": thread.values,
        "checkpoint_id": thread.config.get("checkpoint_id")
    }
    
    return stats

10.3 Performance Optimization

# 1. State pruning - reduce the data that gets saved

def prune_state(state):
    """Strip unnecessary state"""
    # keep only the messages you need
    if len(state["messages"]) > 100:
        # keep system messages and the last 20
        system = [m for m in state["messages"] if m.type == "system"]
        recent = state["messages"][-20:]
        state["messages"] = system + recent

    return state

# 2. Batch processing

def batch_process_items(items, batch_size=10):
    """Process items in batches"""
    for i in range(0, len(items), batch_size):
        batch = items[i:i+batch_size]
        # handle the batch
        yield batch

# 3. Caching

from functools import lru_cache

@lru_cache(maxsize=128)
def cached_expensive_operation(key):
    """Cache an expensive operation
    (expensive_computation is an application-specific placeholder)"""
    return expensive_computation(key)

# 4. Async concurrency

import asyncio

async def concurrent_execution():
    """Run several operations concurrently
    (async_operation_1/2/3 are application-specific placeholders)"""
    tasks = [
        async_operation_1(),
        async_operation_2(),
        async_operation_3()
    ]

    results = await asyncio.gather(*tasks)
    return results
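The batching helper above is easy to verify in isolation. A self-contained copy with a usage example (renamed `batch_items` for illustration; same logic):

```python
def batch_items(items, batch_size=10):
    """Yield successive fixed-size batches from a list; the last may be short."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

batches = list(batch_items(list(range(25)), batch_size=10))
# 25 items with batch_size=10 -> three batches of 10, 10, and 5 items
```

Slicing past the end of a list is safe in Python, which is why the final, shorter batch needs no special-casing.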

10.4 错误处理与恢复

from langgraph.errors import GraphInterrupt, GraphRecursionError

def node_with_error_handling(state):
    """Node with structured error handling."""
    try:
        # Run an operation that may fail
        result = risky_operation(state)
        return {"result": result}
    
    except ValueError as e:
        # Handle a specific, recoverable error
        logger.error(f"Value error: {e}")
        return {"error": str(e), "should_retry": True}
    
    except Exception as e:
        # Catch-all for unexpected errors
        logger.exception(f"Unexpected error: {e}")
        return {"error": str(e), "should_retry": False}

# Global error handling
def handle_graph_error(error, state):
    """Handle errors raised during graph execution."""
    if isinstance(error, GraphInterrupt):
        logger.info("Graph was interrupted")
    elif isinstance(error, GraphRecursionError):
        logger.error(f"Recursion limit reached: {error}")
    else:
        logger.exception(f"Unknown error: {error}")

# Retry logic
def retry_node(state):
    """Retry an operation up to a fixed number of times."""
    max_retries = 3
    retry_count = state.get("retry_count", 0)
    
    if retry_count >= max_retries:
        return {"error": "Maximum retries exceeded"}
    
    try:
        result = operation(state)
        return {"result": result, "retry_count": 0}
    except Exception as e:
        return {
            "error": str(e),
            "retry_count": retry_count + 1
        }
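In a graph, this retry loop is usually wired as a conditional edge that routes back to the node while `retry_count` is below the limit. A plain-Python sketch of the same state transitions, with a hypothetical `flaky_operation` that fails twice and then succeeds:

```python
# Hypothetical driver: the operation fails twice, then succeeds on attempt 3.
failures = {"left": 2}

def flaky_operation(state):
    if failures["left"] > 0:
        failures["left"] -= 1
        raise RuntimeError("transient failure")
    return "ok"

def retry_node(state, max_retries=3):
    retry_count = state.get("retry_count", 0)
    if retry_count >= max_retries:
        return {"error": "Maximum retries exceeded"}
    try:
        return {"result": flaky_operation(state), "retry_count": 0}
    except Exception as e:
        return {"error": str(e), "retry_count": retry_count + 1}

state = {}
for _ in range(4):  # in a graph, this loop is the conditional edge
    state.update(retry_node(state))
    if "result" in state:
        break
print(state["result"], state["retry_count"])  # ok 0
```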

Part 11: Complete Application Examples

Case 1: Automated Research-Paper System

from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode, tools_condition
from langchain_core.tools import tool
from langchain_core.messages import AIMessage
from langchain.chat_models import init_chat_model
from typing_extensions import TypedDict
from typing import Annotated
from langgraph.graph.message import add_messages
from langgraph.checkpoint.sqlite import SqliteSaver

# Define tools
@tool
def search_papers(query: str) -> str:
    """Search for academic papers."""
    return f"Papers found for '{query}'..."

@tool
def analyze_paper(paper_id: str) -> str:
    """Analyze the content of a paper."""
    return f"Analysis of paper {paper_id}..."

@tool
def generate_summary(content: str) -> str:
    """Generate a summary of the content."""
    return f"Summary: {content[:100]}..."

tools = [search_papers, analyze_paper, generate_summary]

# Define the state
class ResearchState(TypedDict):
    messages: Annotated[list, add_messages]
    papers_found: list
    analysis_results: dict
    final_summary: str

# Define nodes
def research_llm(state):
    """LLM node that drives the research."""
    llm = init_chat_model("gpt-4o").bind_tools(tools)
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

def process_results(state):
    """Collect tool output from the message history."""
    # Extract tool results
    papers = []
    for msg in state["messages"]:
        if hasattr(msg, 'content'):
            papers.append(msg.content)
    
    return {"papers_found": papers}

# Build the graph
graph = StateGraph(ResearchState)

graph.add_node("llm", research_llm)
graph.add_node("tools", ToolNode(tools))
graph.add_node("process", process_results)

graph.add_edge(START, "llm")
graph.add_conditional_edges(
    "llm",
    tools_condition,
    {"tools": "tools", "__end__": "process"}
)
graph.add_edge("tools", "llm")
graph.add_edge("process", END)

# Compile (SqliteSaver wraps a plain sqlite3 connection; note that
# SqliteSaver.from_conn_string is a context manager in recent releases)
import sqlite3
checkpointer = SqliteSaver(sqlite3.connect("research.db", check_same_thread=False))
app = graph.compile(checkpointer=checkpointer)

# Run
config = {"configurable": {"thread_id": "research_session_1"}}
result = app.invoke(
    {
        "messages": [
            {"role": "user", "content": "Please research the latest advances in deep learning"}
        ],
        "papers_found": [],
        "analysis_results": {},
        "final_summary": ""
    },
    config=config
)

Case 2: Customer-Service Workflow

# Define the state
class CustomerServiceState(TypedDict):
    messages: Annotated[list, add_messages]
    customer_id: str
    issue_category: str
    severity: str
    resolution_attempts: int
    resolved: bool

# Define nodes
def classify_issue(state):
    """Classify the incoming issue."""
    llm = init_chat_model("gpt-4o-mini")
    
    classification_prompt = f"""
    Classify the following customer issue:
    {state['messages'][-1].content}
    
    Category: account, billing, technical support, returns, other
    Severity: low, medium, high, urgent
    """
    
    response = llm.invoke(classification_prompt)
    
    return {
        "messages": [AIMessage(content=response.content)],
        "issue_category": "classification",  # extract from the response in practice
        "severity": "medium"
    }

def route_to_handler(state):
    """Route to the matching handler."""
    category = state["issue_category"]
    
    if category == "account":
        return "account_handler"
    elif category == "billing":
        return "billing_handler"
    elif category == "technical support":
        return "technical_handler"
    else:
        return "general_handler"

def escalate_if_needed(state):
    """Escalate the issue when warranted."""
    if state["severity"] == "urgent" and not state["resolved"]:
        return "escalate"
    elif state["resolution_attempts"] >= 3:
        return "escalate"
    return "end"

# Build the graph (handler node implementations omitted for brevity)
graph = StateGraph(CustomerServiceState)

graph.add_node("classify", classify_issue)
graph.add_node("account_handler", account_handler)
graph.add_node("billing_handler", billing_handler)
graph.add_node("technical_handler", technical_handler)
graph.add_node("general_handler", general_handler)
graph.add_node("escalate", escalation_node)

graph.add_edge(START, "classify")
graph.add_conditional_edges(
    "classify",
    route_to_handler,
    {
        "account_handler": "account_handler",
        "billing_handler": "billing_handler",
        "technical_handler": "technical_handler",
        "general_handler": "general_handler"
    }
)

# Follow-up routing for each handler
for handler in ["account_handler", "billing_handler", "technical_handler", "general_handler"]:
    graph.add_conditional_edges(
        handler,
        escalate_if_needed,
        {
            "escalate": "escalate",
            "end": END
        }
    )

graph.add_edge("escalate", END)

app = graph.compile()

Case 3: Data-Processing Pipeline

# Define the state
class DataPipelineState(TypedDict):
    messages: Annotated[list, add_messages]
    raw_data: list
    cleaned_data: list
    processed_data: dict
    analysis_result: str
    pipeline_status: str

# Define nodes
def data_ingestion(state):
    """Ingest data."""
    # Load data from the source (placeholder)
    data = load_data_from_source()
    return {"raw_data": data}

def data_cleaning(state):
    """Clean the data."""
    cleaned = []
    for item in state["raw_data"]:
        # Cleaning logic (placeholder validator)
        if validate(item):
            cleaned.append(item)
    
    return {"cleaned_data": cleaned}

def data_processing(state):
    """Process the data."""
    processed = {}
    
    for item in state["cleaned_data"]:
        # Processing logic (placeholder transform)
        key = item["id"]
        processed[key] = transform(item)
    
    return {"processed_data": processed}

def analysis(state):
    """Analyze the processed data with an LLM."""
    llm = init_chat_model("gpt-4o")
    
    analysis_prompt = f"""
    Analyze the following data:
    {state['processed_data']}
    
    Provide the key insights.
    """
    
    response = llm.invoke(analysis_prompt)
    
    return {
        "messages": [AIMessage(content=response.content)],
        "analysis_result": response.content,
        "pipeline_status": "completed"
    }

# Build the graph
graph = StateGraph(DataPipelineState)

graph.add_node("ingest", data_ingestion)
graph.add_node("clean", data_cleaning)
graph.add_node("process", data_processing)
graph.add_node("analyze", analysis)

graph.add_edge(START, "ingest")
graph.add_edge("ingest", "clean")
graph.add_edge("clean", "process")
graph.add_edge("process", "analyze")
graph.add_edge("analyze", END)

app = graph.compile()

# Run the pipeline
result = app.invoke({
    "messages": [],
    "raw_data": [],
    "cleaned_data": [],
    "processed_data": {},
    "analysis_result": "",
    "pipeline_status": "running"
})

Case 4: Multi-Agent Code-Review System

# Define the state
class CodeReviewState(TypedDict):
    messages: Annotated[list, add_messages]
    code_to_review: str
    style_review: str
    logic_review: str
    security_review: str
    final_report: str
    current_reviewer: str

# Define the review agents
def style_reviewer(state):
    """Review code style."""
    llm = init_chat_model("gpt-4o")
    
    prompt = f"""
    Review the style and readability of the following code:
    {state['code_to_review']}
    
    Provide detailed improvement suggestions.
    """
    
    response = llm.invoke(prompt)
    
    # Note: the reviewers run in parallel, so they must not all write the
    # same un-reduced key (e.g. current_reviewer); only keys with a reducer,
    # like messages, or keys each node owns exclusively.
    return {
        "messages": [AIMessage(content=f"[Style review] {response.content}")],
        "style_review": response.content
    }

def logic_reviewer(state):
    """Review logic and algorithmic correctness."""
    llm = init_chat_model("gpt-4o")
    
    prompt = f"""
    Review the logic and algorithmic correctness of the following code:
    {state['code_to_review']}
    
    Identify potential logic errors.
    """
    
    response = llm.invoke(prompt)
    
    # No current_reviewer update: parallel branches must not write the same key
    return {
        "messages": [AIMessage(content=f"[Logic review] {response.content}")],
        "logic_review": response.content
    }

def security_reviewer(state):
    """Review security."""
    llm = init_chat_model("gpt-4o")
    
    prompt = f"""
    Review the following code for security issues:
    {state['code_to_review']}
    
    Identify potential security vulnerabilities.
    """
    
    response = llm.invoke(prompt)
    
    # No current_reviewer update: parallel branches must not write the same key
    return {
        "messages": [AIMessage(content=f"[Security review] {response.content}")],
        "security_review": response.content
    }

def generate_final_report(state):
    """Produce the final report."""
    report = f"""
    # Code Review Report
    
    ## Style Review
    {state['style_review']}
    
    ## Logic Review
    {state['logic_review']}
    
    ## Security Review
    {state['security_review']}
    
    ## Recommendations
    {generate_recommendations(state)}
    """
    
    return {
        "messages": [AIMessage(content=report)],
        "final_report": report
    }

# Build the parallel review graph
graph = StateGraph(CodeReviewState)

graph.add_node("style", style_reviewer)
graph.add_node("logic", logic_reviewer)
graph.add_node("security", security_reviewer)
graph.add_node("report", generate_final_report)

# Fan out: start all three reviewers in parallel
graph.add_edge(START, "style")
graph.add_edge(START, "logic")
graph.add_edge(START, "security")

# Join: the report node runs once all three reviews are done
graph.add_edge("style", "report")
graph.add_edge("logic", "report")
graph.add_edge("security", "report")

graph.add_edge("report", END)

app = graph.compile()

# Run the code review
result = app.invoke({
    "messages": [],
    "code_to_review": "...",
    "style_review": "",
    "logic_review": "",
    "security_review": "",
    "final_report": "",
    "current_reviewer": "style"
})

Summary and Best Practices

Core Strengths of LangGraph

  1. Explicit control - full control over the execution flow
  2. Stateful - centralized state management
  3. Persistable - built-in checkpointing system
  4. Time travel - restore any historical state
  5. Multi-agent - native support for agent collaboration
  6. Observable - integrates with LangSmith

Best Practices

  • Define State with Annotated and choose appropriate reducers
  • Keep each node to a single responsibility
  • Use conditional edges for complex routing
  • Enable persistence for critical nodes
  • Use the PostgreSQL checkpointer in production
  • Trace and debug with LangSmith
  • Add human-in-the-loop review for long-running tasks
  • Design clear error-handling and recovery mechanisms
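The first best practice above can be sketched without LangGraph itself: a reducer is just a function that merges a state key's old value with a node's update, declared via Annotated (the names here are illustrative):

```python
from typing import Annotated, TypedDict

def merge_counts(old: dict, update: dict) -> dict:
    """Example reducer: sum per-key counters instead of overwriting them."""
    merged = dict(old)
    for k, v in update.items():
        merged[k] = merged.get(k, 0) + v
    return merged

class PipelineState(TypedDict):
    # LangGraph applies the reducer whenever a node returns this key
    tool_calls: Annotated[dict, merge_counts]

print(merge_counts({"search": 2}, {"search": 1, "analyze": 1}))
# {'search': 3, 'analyze': 1}
```

Without a reducer, each node's return value overwrites the key; with one, concurrent branches can safely contribute updates, which is exactly how add_messages works for message lists.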

This LangGraph tutorial has covered everything from the basics through advanced topics, with both the depth and the breadth needed to put the framework to work.