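Building an Email Agent with LangGraph 1.0

LangGraph can change how you think about building agents. When you build an agent with LangGraph, you first break it down into discrete steps called nodes. Next, you describe the decisions and transitions for each node. Finally, you connect the nodes through a shared state that every node can read from and write to. In this tutorial, we walk through the thought process of building a customer support email agent with LangGraph.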

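Start with the process you want to automate

Suppose you need to build an AI agent that handles customer support emails. Your product team has given you the following requirements:

The agent should:

Read incoming customer emails
Classify them by urgency and topic
Search relevant documentation to answer questions
Draft appropriate responses
Escalate complex issues to human agents
Schedule follow-ups when needed

Example scenarios to handle:

Simple product question: "How do I reset my password?"
Bug report: "The export feature crashes when I select PDF format."
Urgent billing issue: "I was charged twice for my subscription!"
Feature request: "Can you add dark mode to the mobile app?"
Complex technical issue: "Our API integration keeps failing intermittently with 504 errors, so it cannot run properly."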

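To implement an agent in LangGraph, you generally follow the same five steps:

Break the workflow into discrete steps
Identify what each step needs to do
Design your shared state
Build your nodes
Wire them together into a graph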

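Step 1: Break the workflow into discrete steps

Start by identifying the distinct steps in your process. Each step becomes a node (a function that performs one specific task). Then sketch how these steps connect to one another.

In that sketch, arrows show the possible paths, but the choice of which path to take is made inside each node.

Now that we have identified the components of the workflow, let's look at what each node needs to do:

Read Email: Extract and parse the email content
Classify Intent: Use an LLM to classify urgency and topic, then route to the appropriate action
Doc Search: Query the knowledge base for relevant information
Bug Track: Create or update an issue in the tracking system
Draft Reply: Generate an appropriate response
Human Review: Escalate to a human agent for approval or handling
Send Reply: Send the email response

Note that some nodes decide where the flow goes next (Classify Intent, Draft Reply, Human Review), while others always proceed to the same next step (Read Email always goes to Classify Intent, and Doc Search always goes to Draft Reply).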
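
The distinction between deciding nodes and fixed-path nodes can be made concrete with a small plain-Python sketch. This is not the LangGraph API: the dictionary name POSSIBLE_NEXT, the snake_case node names, and the helper functions are hypothetical, and the successor lists are an assumption based on the routing used in the complete code example later in this article.

```python
# Hypothetical map of each node to the nodes it may hand off to.
# Plain Python for illustration only; not a LangGraph data structure.
POSSIBLE_NEXT = {
    "read_email": ["classify_intent"],
    "classify_intent": ["doc_search", "bug_track", "draft_reply", "human_review"],
    "doc_search": ["draft_reply"],
    "bug_track": ["draft_reply"],
    "draft_reply": ["human_review", "send_reply"],
    "human_review": ["send_reply", "END"],
    "send_reply": ["END"],
}


def decision_nodes(graph: dict) -> list:
    """Nodes with more than one possible successor must choose a route."""
    return sorted(name for name, targets in graph.items() if len(targets) > 1)


def fixed_nodes(graph: dict) -> list:
    """Nodes with exactly one successor always proceed the same way."""
    return sorted(name for name, targets in graph.items() if len(targets) == 1)
```

Listing the successors this way before writing any node makes it easy to check which functions will need routing logic and which will not.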

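Step 2: Identify what each step needs to do

For each node in the graph, determine what type of operation it represents and what context it needs to work properly.

LLM steps

When a step needs to understand, analyze, or generate text, or make a reasoning decision:

Classify Intent node
- Static context (prompt): classification categories, urgency definitions, response format
- Dynamic context (from state): email content, sender information
- Desired outcome: a structured classification that determines routing
Draft Reply node
- Static context (prompt): tone guidelines, company policies, response templates
- Dynamic context (from state): classification results, search results, customer history
- Desired outcome: a professional email response ready for review

Data steps

When a step needs to retrieve information from external sources:

Document Search node
- Parameters: a query built from the intent and topic
- Retry strategy: exponential backoff for transient failures
- Caching: common queries can be cached to reduce API calls
Customer History Lookup
- Parameters: customer email or ID from state
- Retry strategy: yes, with a fallback to basic information if the lookup fails
- Caching: yes, with a time-to-live (TTL) to balance freshness and performance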
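
As a minimal sketch of the customer history lookup described above, the following plain-Python code combines a TTL cache with a fallback to basic information. Everything here is an assumption for illustration: TTLCache, lookup_customer, the fetch callable, and the fallback record with a "standard" tier are hypothetical names, and caching is something you implement yourself inside a node rather than a LangGraph feature.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at >= self.ttl:
            del self._store[key]  # entry is stale, drop it
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock())


def lookup_customer(customer_id, cache, fetch):
    """Return cached data if fresh, otherwise fetch; fall back to basic info on failure."""
    cached = cache.get(customer_id)
    if cached is not None:
        return cached
    try:
        data = fetch(customer_id)  # hypothetical CRM call supplied by the caller
    except ConnectionError:
        # Fallback: basic information only, and do not cache it
        return {"customer_id": customer_id, "tier": "standard"}
    cache.set(customer_id, data)
    return data
```

The fallback result is deliberately not cached, so the next call tries the real lookup again.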

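Action steps

When a step needs to perform an external action:

Send Reply node
- When to execute: after approval (human or automated)
- Retry strategy: exponential backoff for network issues (the wait between retries grows exponentially)
- Not cacheable: each send is a unique action
Bug Track node
- When to execute: whenever the intent is "bug"
- Retry strategy: retry, because bug reports must never be lost
- Returns: a ticket ID to include in the response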
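
To illustrate what exponential backoff means in practice, here is a minimal plain-Python sketch. It is not how LangGraph implements retries internally; backoff_delays and call_with_retry are hypothetical helpers, and treating ConnectionError as the only transient error is an assumption made for the example.

```python
import time


def backoff_delays(max_attempts, initial_interval=1.0, factor=2.0):
    """Waits between attempts: initial, initial*factor, initial*factor**2, ..."""
    return [initial_interval * factor ** i for i in range(max_attempts - 1)]


def call_with_retry(func, max_attempts=3, initial_interval=1.0, factor=2.0,
                    sleep=time.sleep):
    """Call func, retrying transient failures with exponentially growing waits."""
    delays = backoff_delays(max_attempts, initial_interval, factor)
    for attempt in range(max_attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            sleep(delays[attempt])
```

With four attempts and the defaults above, the waits are 1, 2, and 4 seconds. The sleep parameter is injectable so the behavior can be checked without actually waiting.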

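User input steps

When a step needs human intervention:

Human Review node
- Context for the decision: original email, draft response, urgency, classification
- Expected input format: a boolean approval flag, plus an optional edited response
- Triggered when: urgency is high, the issue is complex, or there are quality concerns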
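
The expected input format for human review can be sketched as a small typed structure plus a function that applies it. The names ReviewDecision and apply_review are hypothetical, and returning the string "END" for a rejected draft is an assumption for this standalone example.

```python
from typing import Optional, TypedDict


class ReviewDecision(TypedDict, total=False):
    """Shape of the reviewer's input: approval flag plus optional edited text."""
    approved: bool
    edited_response: Optional[str]


def apply_review(draft: str, decision: ReviewDecision):
    """Return (next_step, final_text) for a reviewer's decision."""
    if not decision.get("approved", False):
        # Rejected: a human takes over, nothing is sent automatically
        return "END", None
    # Approved: prefer the reviewer's edit, otherwise keep the draft
    return "send_reply", decision.get("edited_response") or draft
```

Treating a missing approval flag as a rejection is the conservative choice here: nothing is sent unless a reviewer explicitly approves.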

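Step 3: Design your shared state

State is the shared memory accessible to every node in the agent. Think of it as the notebook in which the agent records everything it learns and decides as it works through the process.

What belongs in state?

Ask yourself these questions about each piece of data:

Include in state
- Does the data need to persist across steps? If so, it belongs in state.
Don't store
- Can the data be derived from other data? If so, compute it when needed instead of storing it in state.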
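
As an example of derived data, whether an email needs human review can be computed from the stored classification instead of being saved as its own field. The helper name needs_human_review is hypothetical; the rule itself mirrors the check used in the draft response node of the complete code example.

```python
def needs_human_review(classification: dict) -> bool:
    """Derived on demand from the stored classification; never stored in state."""
    return (
        classification.get("urgency") in ("high", "critical")
        or classification.get("intent") == "complex"
    )
```

Because the value is recomputed each time, it can never drift out of sync with the classification it depends on.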

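For our email agent, we need to track:

The original email and sender information (these cannot be regenerated)
Classification results (used by multiple downstream nodes)
Search results and customer data (expensive to re-fetch)
The draft response (must persist through review)
Execution metadata (for debugging and recovery)

Keep state raw, format prompts on demand

A key principle: state should store raw data, not formatted text. Format prompts inside the nodes, at the point where they are needed.

This separation means:

Different nodes can format the same data differently for their own needs
You can change prompt templates without modifying the state schema
Debugging is clearer: you see exactly what data each node received
Your agent can evolve without breaking existing state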
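
A short sketch of this principle: the same raw search results are formatted in two different ways by two consumers, while the stored data stays unchanged. The function names format_for_prompt and format_for_review, and the sample data, are hypothetical.

```python
# Raw data as it would be stored in state: a plain list of document chunks
raw_results = [
    "Reset password via Settings > Security > Change Password",
    "Password must be at least 12 characters",
]


def format_for_prompt(results: list) -> str:
    """Bullet list suitable for embedding in an LLM prompt."""
    return "\n".join(f"- {doc}" for doc in results)


def format_for_review(results: list) -> str:
    """One-line summary suitable for a human reviewer's screen."""
    if not results:
        return "No documents found"
    return f"{len(results)} document(s) found; first: {results[0]}"
```

Changing either formatter later requires no change to the state schema, because only the raw list is stored.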

Let's define the state:

from typing import TypedDict, Literal

# Define the structure for email classification
class EmailClassification(TypedDict):
    intent: Literal["question", "bug", "billing", "feature", "complex"]
    urgency: Literal["low", "medium", "high", "critical"]
    topic: str
    summary: str

class EmailAgentState(TypedDict):
    # Raw email data
    email_content: str
    sender_email: str
    email_id: str

    # Classification result
    classification: EmailClassification | None

    # Raw search/API results
    search_results: list[str] | None  # List of raw document chunks
    customer_history: dict | None  # Raw customer data from CRM

    # Generated content
    draft_response: str | None
    messages: list[str] | None

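Note that the state contains only raw data: no prompt templates, no formatted strings, no instructions. The classification output is stored as a single dictionary, straight from the LLM.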

Step 4: Build your nodes

Now we implement each step as a function. In LangGraph, a node is just a Python function that takes the current state and returns updates to it.
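
The idea of "take the state, return an update" can be shown without the framework at all. In this minimal sketch, normalize_sender and apply_update are hypothetical names, and the merge shown here (the last write wins for each key) is an assumption that stands in for how a framework would apply the update.

```python
def normalize_sender(state: dict) -> dict:
    """A node: read the current state, return only the keys that change."""
    return {"sender_email": state["sender_email"].strip().lower()}


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state (last write wins per key)."""
    return {**state, **update}


state = {"email_content": "Hi", "sender_email": "  Customer@Example.com "}
state = apply_update(state, normalize_sender(state))
```

The node never mutates the state it receives, which keeps each step easy to test in isolation.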

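Handle errors appropriately

Different types of errors call for different handling strategies:

Error type	Who fixes it	Strategy	When to use
Transient errors (network issues, rate limits)	System (automatic)	Retry policy	Temporary failures that usually resolve on retry
LLM-recoverable errors (tool failures, parsing issues)	LLM	Store the error in state and loop back	The LLM can see the error and adjust its approach
User-fixable errors (missing information, unclear instructions)	Human	Pause with interrupt()	User input is needed to proceed
Unexpected errors	Developer	Let them bubble up	Unknown issues that need debugging

Transient errors: add a retry policy to automatically retry network issues and rate limits: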

from langgraph.types import RetryPolicy

workflow.add_node(
    "search_documentation",
    search_documentation,
    retry_policy=RetryPolicy(max_attempts=3, initial_interval=1.0)
)

LLM-recoverable errors: store the error in state and loop back so the LLM can see what went wrong and try again:

from langgraph.types import Command


def execute_tool(state: State) -> Command[Literal["agent", "execute_tool"]]:
    try:
        result = run_tool(state['tool_call'])
        return Command(update={"tool_result": result}, goto="agent")
    except ToolError as e:
        # Let the LLM see what went wrong and try again
        return Command(
            update={"tool_result": f"Tool error: {str(e)}"},
            goto="agent"
        )

User-fixable errors: pause when needed and collect information from the user (such as an account ID, order number, or clarification):

from langgraph.types import Command, interrupt


def lookup_customer_history(state: State) -> Command[Literal["lookup_customer_history", "draft_response"]]:
    if not state.get('customer_id'):
        user_input = interrupt({
            "message": "Customer ID needed",
            "request": "Please provide the customer's account ID to look up their subscription history"
        })
        return Command(
            update={"customer_id": user_input['customer_id']},
            goto="lookup_customer_history"
        )
    # Now proceed with the lookup
    customer_data = fetch_customer_history(state['customer_id'])
    return Command(update={"customer_history": customer_data}, goto="draft_response")

Unexpected errors: let them bubble up for debugging. Don't catch what you can't handle:

def send_reply(state: EmailAgentState):
    try:
        email_service.send(state["draft_response"])
    except Exception:
        raise  # Surface unexpected errors

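Implementing the email agent's nodes

The complete code example appears later in this article.

Step 5: Wire them together into a graph

Now we connect the nodes into a working graph. Because each node handles its own routing decisions, we only need a few essential edges.

To enable human-in-the-loop with interrupt(), we need to compile the graph with a checkpointer so that state is saved between runs: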

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.types import RetryPolicy

# Create the graph
workflow = StateGraph(EmailAgentState)

# Add nodes with appropriate error handling
workflow.add_node("read_email", read_email)
workflow.add_node("classify_intent", classify_intent)

# Add retry policy for nodes that might have transient failures
workflow.add_node(
    "search_documentation",
    search_documentation,
    retry_policy=RetryPolicy(max_attempts=3)
)
workflow.add_node("bug_tracking", bug_tracking)
workflow.add_node("draft_response", draft_response)
workflow.add_node("human_review", human_review)
workflow.add_node("send_reply", send_reply)

# Add only the essential edges
workflow.add_edge(START, "read_email")
workflow.add_edge("read_email", "classify_intent")
workflow.add_edge("send_reply", END)

# Compile with a checkpointer for persistence.
# If you run the graph with Local_Server, compile without a checkpointer instead.
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

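The graph structure is minimal because routing happens inside the nodes through Command objects. Each node declares where it can go with a type annotation such as Command[Literal["node1", "node2"]], which makes the flow explicit and traceable.

Test the agent

Let's run the agent on an urgent billing issue that requires human review: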

# Test with an urgent billing issue
initial_state = {
    "email_content": "I was charged twice for my subscription! This is urgent!",
    "sender_email": "customer@example.com",
    "email_id": "email_123",
    "messages": []
}

# Run with a thread_id for persistence
config = {"configurable": {"thread_id": "customer_123"}}
result = app.invoke(initial_state, config)
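# The graph will pause at human_review
# Billing emails route straight to human_review, so a draft may not exist yet
draft = result.get("draft_response") or ""
print(f"Draft ready for review: {draft[:100]}...")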

# When ready, provide human input to resume
from langgraph.types import Command

human_response = Command(
    resume={
        "approved": True,
        "edited_response": "We sincerely apologize for the double charge. I've initiated an immediate refund..."
    }
)

# Resume execution
final_result = app.invoke(human_response, config)
print(f"Email sent successfully!")

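When interrupt() is called, execution pauses, everything is saved to the checkpointer, and the graph waits. It can resume days later and pick up where it left off. The thread_id ensures that all state for this conversation is preserved together.

Summary and next steps

Key takeaways

Building this email agent has shown us the LangGraph way of thinking:

Break into discrete steps
- Each node does one thing well. This decomposition enables streaming progress updates, durable execution that can pause and resume, and clear debugging, because you can inspect state between steps.
State is shared memory
- Store raw data, not formatted text. This lets different nodes use the same information in different ways.
Nodes are functions
- They take state, do work, and return updates. When they need to make routing decisions, they specify both the state updates and the next destination.
Errors are part of the flow
- Transient failures are retried, LLM-recoverable errors loop back with context, user-fixable problems pause for input, and unexpected errors bubble up for debugging.
Human input is first-class
- interrupt() pauses execution indefinitely, saves all state, and resumes from the point of interruption when you provide input. When combined with other operations in a node, interrupt() must come first.
Graph structure emerges naturally
- You define only the essential connections, and the nodes handle their own routing logic. This keeps control flow explicit and traceable: you can always tell what the agent will do next by looking at the current node.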
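
Why interrupt() should come first can be shown with a plain-Python simulation. This is not LangGraph code: Pause, make_interrupt, review_node, and run_with_resume are hypothetical names, and the simulation only models the behavior noted in the human review node's code comment, namely that code before interrupt() runs again when the node resumes.

```python
class Pause(Exception):
    """Raised by the simulated interrupt when no resume value is available yet."""


def make_interrupt(resume_value=None):
    """Build a simulated interrupt: pause on first run, return input on resume."""
    def interrupt(payload):
        if resume_value is None:
            raise Pause(payload)
        return resume_value
    return interrupt


side_effects = []


def review_node(state, interrupt):
    # This line sits BEFORE the interrupt, so it runs on every execution of the node
    side_effects.append("ticket created")
    decision = interrupt({"draft": state["draft"]})
    return {"approved": decision["approved"]}


def run_with_resume(node, state, resume_value):
    """First run pauses at the interrupt; the second run restarts the node from the top."""
    try:
        return node(state, make_interrupt())
    except Pause:
        pass
    return node(state, make_interrupt(resume_value))
```

Running review_node through run_with_resume records the side effect twice, once before the pause and once on resume. The same re-execution explains why "6. human_review" is printed twice in the execution output of the complete example.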

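Advanced considerations

Node granularity trade-offs

This section explores the trade-offs in node granularity design. Most applications can skip it and use the patterns shown above. You might wonder: why not combine Read Email and Classify Intent into one node?

Or why keep Doc Search separate from Draft Reply? The answer involves a trade-off between resilience and observability.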

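Resilience considerations: LangGraph uses durable execution and creates checkpoints at node boundaries. When a workflow resumes after an interruption or failure, it restarts from the beginning of the node where execution stopped. Smaller nodes mean more frequent checkpoints, so less work has to be repeated if something goes wrong. If you combine several operations into one large node, a failure near the end means re-executing everything from the start of that node.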
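
The cost of a coarse node can be estimated with a small simulation. This is a plain-Python model, not LangGraph: operations_attempted is a hypothetical helper, each inner list stands for one node, and the model assumes a single failure followed by a restart from the beginning of the failing node.

```python
def operations_attempted(nodes, failing_op):
    """Count operations attempted when failing_op fails once and its node restarts."""
    attempted = 0
    failed_once = False
    for ops in nodes:  # a checkpoint exists at the start of every node
        i = 0
        while i < len(ops):
            attempted += 1
            if ops[i] == failing_op and not failed_once:
                failed_once = True
                i = 0  # resume from the start of this node (the last checkpoint)
                continue
            i += 1
    return attempted


fine_grained = [["read"], ["classify"], ["search"], ["draft"]]
coarse = [["read", "classify", "search", "draft"]]
```

With a single failure in the draft operation, the fine-grained layout attempts 5 operations in total, while the single large node attempts 8, because the three earlier operations are repeated after the restart.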

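Why we chose this breakdown for the email agent:

Isolating external services: Doc Search and Bug Track are separate nodes because they call external APIs. If the search service is slow or fails, we want that isolated from the LLM calls. We can then add retry policies to these specific nodes without affecting the others.
Intermediate visibility: making Classify Intent its own node lets us inspect what the LLM decided before any action is taken. This is valuable for debugging and monitoring, because you can see exactly when and why the agent routes to human review.
Different failure modes: LLM calls, database lookups, and email sending each have different retry strategies. Separate nodes let you configure them independently.
Reusability and testing: smaller nodes are easier to test in isolation and to reuse in other workflows.

A different valid approach: you could combine Read Email and Classify Intent into a single node. But you would lose the ability to inspect the raw email before classification, and both operations would be repeated on any failure in that node. For most applications, the observability and debugging benefits of separate nodes are worth the trade-off.

Application-level concerns: the caching discussion in Step 2 (whether to cache search results) is an application-level decision, not a LangGraph framework feature. You implement caching inside your node functions according to your own requirements; LangGraph does not prescribe it.

Performance considerations: more nodes do not mean slower execution. By default LangGraph uses async durability mode, writing checkpoints in the background so the graph keeps running without waiting for checkpoint writes to finish. You therefore get frequent checkpoints with minimal performance impact. You can adjust this behavior if needed: use exit mode to checkpoint only on completion, or sync mode to block execution until each checkpoint is written.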

Complete code example

Define the state

from typing import TypedDict, Literal

# Define the structure for email classification
class EmailClassification(TypedDict):
    intent: Literal["question", "bug", "billing", "feature", "complex"]
    urgency: Literal["low", "medium", "high", "critical"]
    topic: str
    summary: str

class EmailAgentState(TypedDict):
    # Raw email data
    email_content: str
    sender_email: str
    email_id: str

    # Classification result
    classification: EmailClassification | None

    # Raw search/API results
    search_results: list[str] | None  # List of raw document chunks
    customer_history: dict | None  # Raw customer data from CRM

    # Generated content
    draft_response: str | None
    messages: list[str] | None

Define the nodes

from langgraph.types import interrupt, Command, RetryPolicy
from langchain_community.chat_models import ChatTongyi
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END

# Placeholder value: replace with your own API key
DASHSCOPE_API_KEY="xxxxcc"
llm = ChatTongyi(model='qwen-plus', api_key=DASHSCOPE_API_KEY)

Define the read email node

def read_email(state: EmailAgentState) -> Command[Literal["classify_intent"]]:
    """Extract and parse email content"""
    print("1. read_email")
    # In production, this would connect to your email service
    return Command(
        update={"messages": [HumanMessage(content=f"Processing email: {state['email_content']}")]},
        goto="classify_intent"
    )

Define the intent classification node

def classify_intent(state: EmailAgentState) -> Command[Literal["search_documentation", "human_review", "draft_response", "bug_tracking"]]:
    """Use LLM to classify email intent and urgency, then route accordingly"""
    print("2. classify_intent")
    # Create structured LLM that returns EmailClassification dict
    structured_llm = llm.with_structured_output(EmailClassification)

    # Format the prompt on-demand, not stored in state
    classification_prompt = f"""
    Analyze this customer email and classify it:

    Email: {state['email_content']}
    From: {state['sender_email']}

    Provide classification including intent, urgency, topic, and summary.
    """

    # Get structured response directly as dict
    classification = structured_llm.invoke(classification_prompt)

    # Determine next node based on classification
    if classification['intent'] == 'billing' or classification['urgency'] == 'critical':
        goto = "human_review"
    elif classification['intent'] in ['question', 'feature']:
        goto = "search_documentation"
    elif classification['intent'] == 'bug':
        goto = "bug_tracking"
    else:
        goto = "draft_response"

    # Store classification as a single dict in state
    return Command(
        update={"classification": classification},
        goto=goto
    )

Define the documentation search node

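class SearchAPIError(Exception):
    """Placeholder for the exception type raised by your search client."""


def search_documentation(state: EmailAgentState) -> Command[Literal["draft_response"]]: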
    """Search knowledge base for relevant information"""
    print("3. search_documentation")
    # Build search query from classification
    classification = state.get('classification', {})
    query = f"{classification.get('intent', '')} {classification.get('topic', '')}"

    try:
        # Implement your search logic here
        # Store raw search results, not formatted text
        search_results = [
            "Reset password via Settings > Security > Change Password",
            "Password must be at least 12 characters",
            "Include uppercase, lowercase, numbers, and symbols"
        ]
    except SearchAPIError as e:
        # For recoverable search errors, store error and continue
        search_results = [f"Search temporarily unavailable: {str(e)}"]

    return Command(
        update={"search_results": search_results},  # Store raw results or error
        goto="draft_response"
    )

Define the bug tracking node

def bug_tracking(state: EmailAgentState) -> Command[Literal["draft_response"]]:
    """Create or update bug tracking ticket"""
    print("4. bug_tracking")
    # Create ticket in your bug tracking system
    ticket_id = "BUG-12345"  # Would be created via API

    # Update only keys that are declared in EmailAgentState
    return Command(
        update={"search_results": [f"Bug ticket {ticket_id} created"]},
        goto="draft_response"
    )

Define the draft response node

def draft_response(state: EmailAgentState) -> Command[Literal["human_review", "send_reply"]]:
    """Generate response using context and route based on quality"""
    print("5. draft_response")
    classification = state.get('classification', {})

    # Format context from raw state data on-demand
    context_sections = []

    if state.get('search_results'):
        # Format search results for the prompt
        formatted_docs = "\n".join([f"- {doc}" for doc in state['search_results']])
        context_sections.append(f"Relevant documentation:\n{formatted_docs}")

    if state.get('customer_history'):
        # Format customer data for the prompt
        context_sections.append(f"Customer tier: {state['customer_history'].get('tier', 'standard')}")

    # Build the prompt with formatted context
    draft_prompt = f"""
    Draft a response to this customer email:
    {state['email_content']}

    Email intent: {classification.get('intent', 'unknown')}
    Urgency level: {classification.get('urgency', 'medium')}

    {chr(10).join(context_sections)}

    Guidelines:
    - Be professional and helpful
    - Address their specific concern
    - Use the provided documentation when relevant
    """

    response = llm.invoke(draft_prompt)

    # Determine if human review needed based on urgency and intent
    needs_review = (
        classification.get('urgency') in ['high', 'critical'] or
        classification.get('intent') == 'complex'
    )

    # Route to appropriate next node
    goto = "human_review" if needs_review else "send_reply"

    return Command(
        update={"draft_response": response.content},  # Store only the raw response
        goto=goto
    )

Define the human review node

def human_review(state: EmailAgentState) -> Command[Literal["send_reply", END]]:
    """Pause for human review using interrupt and route based on decision"""
    print("6. human_review")
    classification = state.get('classification', {})

    # interrupt() must come first - any code before it will re-run on resume
    human_decision = interrupt({
        "email_id": state.get('email_id',''),
        "original_email": state.get('email_content',''),
        "draft_response": state.get('draft_response',''),
        "urgency": classification.get('urgency'),
        "intent": classification.get('intent'),
        "action": "Please review and approve/edit this response"
    })

    # Now process the human's decision
    if human_decision.get("approved"):
        return Command(
            update={"draft_response": human_decision.get("edited_response", state.get('draft_response',''))},
            goto="send_reply"
        )
    else:
        # Rejection means human will handle directly
        return Command(update={}, goto=END)

Define the send reply node

def send_reply(state: EmailAgentState) -> Command[Literal[END]]:
    """Send the email response"""
    print("7. send_reply")
    # Integrate with your email service here (wrap the call in try/except as needed)
    print(f"Sending reply: {state['draft_response'][:100]}...")
    return Command(update={}, goto=END)

Create the graph

from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import RetryPolicy

# Create the graph
workflow = StateGraph(EmailAgentState)

# Add nodes with appropriate error handling
workflow.add_node("read_email", read_email)
workflow.add_node("classify_intent", classify_intent)
workflow.add_node("human_review", human_review)

# Add retry policy for nodes that might have transient failures
workflow.add_node(
    "search_documentation",
    search_documentation,
    retry_policy=RetryPolicy(max_attempts=3, initial_interval=1.0)
)
workflow.add_node("bug_tracking", bug_tracking)
workflow.add_node("draft_response", draft_response)

workflow.add_node("send_reply", send_reply)
# Add only the essential edges
workflow.add_edge(START, "read_email")
workflow.add_edge("read_email", "classify_intent")
workflow.add_edge("send_reply", END)

# Compile with a checkpointer for persistence.
# If you run the graph with Local_Server, compile without a checkpointer instead.
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

Display the graph

from IPython.display import Image, display
display(Image(app.get_graph(xray=True).draw_mermaid_png()))

Graph structure

Test the agent

# Test with an urgent billing issue
initial_state = {
    "email_content": "I was charged twice for my subscription! This is urgent!",
    "sender_email": "customer@example.com",
    "email_id": "email_123",
    "messages": []
}

# Run with a thread_id for persistence
config = {"configurable": {"thread_id": "customer_123"}}
result = app.invoke(initial_state, config)
# The graph will pause at human_review
# print(f"Draft ready for review: {result['draft_response'][:100]}...")

# When ready, provide human input to resume
from langgraph.types import Command

human_response = Command(
    resume={
        "approved": True,
        "edited_response": "We sincerely apologize for the double charge. I've initiated an immediate refund..."
    }
)

# Resume execution
final_result = app.invoke(human_response, config)
print(f"Email sent successfully!")

The output is:

1. read_email
2. classify_intent
6. human_review
6. human_review
7. send_reply
Sending reply: We sincerely apologize for the double charge. I've initiated an immediate refund......
Email sent successfully!

References

Thinking in LangGraph (official English documentation)