As a technologist curious about AI, reading this technical document on the Reflection pattern felt like opening a door to a new world. It made me realize that truly intelligent AI does not just mechanically answer questions; it can "self-check and self-improve" the way a human does. Below are some of my takeaways.
📌 1. The biggest takeaway: AI output is no longer "one-shot" but an iterative optimization process
Until now, I assumed AI worked like this: you input a question, and it directly outputs a final answer. This document changed that view completely. It points out that an AI's initial output is often imperfect: it may be inaccurate, incomplete, or suboptimal.
The core of the Reflection pattern is to give the AI a "check your homework" procedure, which can be summarized as a clear four-step loop:
- Execution: the AI first generates a draft for the task (for example, a piece of code or an article).
- Evaluation/Critique: the AI switches roles and becomes a "strict teacher," reviewing the draft it just produced for logical errors, factual inaccuracies, stylistic mismatches, and other problems.
- Reflection/Refinement: guided by the critique, the AI decides how to improve and generates an optimized new version.
- Iteration: this generate-critique-improve loop can repeat until the result is satisfactory or a preset stopping condition is reached.
This reminds me of writing essays in school: write a draft, check it yourself for typos and awkward sentences, then revise (or even rewrite) based on what you found. It turns out AI can also make its output better and better through this kind of repeated deliberation.
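The four-step cycle described above can be sketched in a few lines of Python. The function names here (`generate`, `critique`, `refine`) are illustrative stand-ins for real LLM calls, not any library's API:

```python
# Hypothetical sketch of the four-step reflection loop.
# generate/critique/refine are stand-ins for real LLM calls.

def generate(task: str) -> str:
    """Produce a first draft (stand-in for an LLM call)."""
    return f"Draft answer for: {task}"

def critique(draft: str) -> str:
    """Review the draft; return 'OK' when no issues remain."""
    return "OK" if "revised" in draft else "Too vague; add detail."

def refine(draft: str, feedback: str) -> str:
    """Rewrite the draft using the critique (stand-in for an LLM call)."""
    return draft + " (revised per feedback: " + feedback + ")"

def reflection_loop(task: str, max_iterations: int = 3) -> str:
    draft = generate(task)                 # 1. Execution
    for _ in range(max_iterations):
        feedback = critique(draft)         # 2. Evaluation/Critique
        if feedback == "OK":               # 4. Iteration stops when satisfied
            break
        draft = refine(draft, feedback)    # 3. Reflection/Refinement
    return draft
```

A real implementation would replace the three stub functions with model calls, but the control flow is exactly this loop.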
Some research reports that on a code-generation task, adopting the Reflection pattern raised accuracy from 48.1% to 95.1%, a strong sign of its effectiveness.
📌 2. The cleverest design: a "Producer-Critic" division of labor avoids grading your own exam
A particularly elegant design the document describes is the "Producer-Critic" model, which assigns the task to two AI roles with different responsibilities:
▪️ 🧑‍💻 Producer: focuses on creation and generates the initial solution.
▪️ 🧑‍🔬 Critic: hunts for flaws and offers suggestions for improvement.
The benefit of this division is obvious. If one AI acts as both player and referee, it may struggle to notice its own blind spots. Letting a dedicated "critic" examine the work from an objective, critical perspective uncovers problems far more effectively.
It is much like the relationship between programmers and test engineers in software development: the division of labor safeguards the quality of the final product.
📌 3. The most tangible impression: the Reflection pattern has broad, practical applications
The many application scenarios the document lists make this technology feel close at hand:
▪️ Code generation and debugging: the AI not only writes code but can also run tests, find bugs, and fix them itself; "AI programmers" such as Devin use exactly this pattern.
▪️ Creative writing: the AI can keep reflecting on and refining the articles, stories, or marketing copy it writes until they meet the requirements.
▪️ Customer-service chatbots: a bot can review the entire conversation history to keep its answers coherent and accurate and to avoid misreading the user's intent.
▪️ Complex problem solving: when solving logic puzzles or drafting complex plans, the AI can assess whether its intermediate steps are sound and adjust its strategy in time.
This made me realize that the Reflection pattern is key to building high-quality AI applications. Whether it is GitHub Copilot adjusting code suggestions based on your feedback or Claude self-checking while analyzing a long document, the Reflection pattern is very likely at work behind the scenes.
📌 4. My reflection: every powerful capability comes at a price
The document is candid about the pattern's limitations, which helps me see the technology in a more balanced way.
⚠️ The speed-and-cost trade-off: thinking things over repeatedly means multiple calls to the large model, which raises both compute cost and latency. The pattern is therefore a poor fit for applications with strict real-time requirements (such as live voice conversation).
⚠️ Technical complexity: implementing a complete, multi-round reflection loop requires sophisticated flow control and state management, which sets a high technical bar for developers.
This taught me that technology choices are always trade-offs. The Reflection pattern trades "slower and more expensive" for "better and more accurate," and we have to choose appropriately for each scenario.
✅ Closing thoughts: what I took away
After reading this document, the "Reflection pattern" is no longer a cold technical term to me but an important step in AI's progress toward genuine intelligence. It gives AI the ability to **"examine itself daily,"** as the Confucian saying goes, so that it can learn from mistakes and keep improving itself.
For technical practitioners like us, the greatest value in understanding a frontier pattern like this lies in:
- Sharper judgment: when choosing or using AI products in the future, I will likely favor those with self-reflection and continuous-optimization capabilities.
- A broader view: I now understand the current boundaries of AI capability and where it may head next; these systems can not only execute tasks but also optimize them.
- Renewed curiosity: such clever design has deepened my interest in the principles behind AI and may well push me toward further study.
In short, this reading left me with a deep impression: AI's evolution is not only about ever-larger models, but also about ever-smarter ways of working and "thinking." The Reflection pattern is, without doubt, a shining milestone on that road.
💬 Discussion
What do you think of AI's "Reflection pattern"? How do you think this technique will change the way we use AI? Share your thoughts in the comments!
Chapter 4: Reflection
Reflection Pattern Overview
In the preceding chapters, we've explored fundamental agentic patterns: Chaining for sequential execution, Routing for dynamic path selection, and Parallelization for concurrent task execution. These patterns enable agents to perform complex tasks more efficiently and flexibly. However, even with sophisticated workflows, an agent's initial output or plan might not be optimal, accurate, or complete. This is where the Reflection pattern comes into play.
The Reflection pattern involves an agent evaluating its own work, output, or internal state and using that evaluation to improve its performance or refine its response. It's a form of self-correction or self-improvement, allowing the agent to iteratively refine its output or adjust its approach based on feedback, internal critique, or comparison against desired criteria. Reflection can occasionally be facilitated by a separate agent whose specific role is to analyze the output of an initial agent.
Unlike a simple sequential chain where output is passed directly to the next step, or routing which chooses a path, reflection introduces a feedback loop. The agent doesn't just produce an output; it then examines that output (or the process that generated it), identifies potential issues or areas for improvement, and uses those insights to generate a better version or modify its future actions.
The process typically involves:
- Execution: The agent performs a task or generates an initial output.
- Evaluation/Critique: The agent (often using another LLM call or a set of rules) analyzes the result from the previous step. This evaluation might check for factual accuracy, coherence, style, completeness, adherence to instructions, or other relevant criteria.
- Reflection/Refinement: Based on the critique, the agent determines how to improve. This might involve generating a refined output, adjusting parameters for a subsequent step, or even modifying the overall plan.
- Iteration (optional but common): The refined output or adjusted approach can then be executed, and the reflection process can repeat until a satisfactory result is achieved or a stopping condition is met.
A key and highly effective implementation of the Reflection pattern separates the process into two distinct logical roles: a Producer and a Critic. This is often called the "Generator-Critic" or "Producer-Reviewer" model. While a single agent can perform self-reflection, using two specialized agents (or two separate LLM calls with distinct system prompts) often yields more robust and unbiased results.
- The Producer Agent: This agent's primary responsibility is to perform the initial execution of the task. It focuses entirely on generating the content, whether it's writing code, drafting a blog post, or creating a plan. It takes the initial prompt and produces the first version of the output.
- The Critic Agent: This agent's sole purpose is to evaluate the output generated by the Producer. It is given a different set of instructions, often a distinct persona (e.g., "You are a senior software engineer," "You are a meticulous fact-checker"). The Critic's instructions guide it to analyze the Producer's work against specific criteria, such as factual accuracy, code quality, stylistic requirements, or completeness. It is designed to find flaws, suggest improvements, and provide structured feedback.
This separation of concerns is powerful because it prevents the "cognitive bias" of an agent reviewing its own work. The Critic agent approaches the output with a fresh perspective, dedicated entirely to finding errors and areas for improvement. The feedback from the Critic is then passed back to the Producer agent, which uses it as a guide to generate a new, refined version of the output. The provided LangChain and ADK code examples both implement this two-agent model: the LangChain example uses a specific "reflector_prompt" to create a critic persona, while the ADK example explicitly defines a producer and a reviewer agent.
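As a minimal illustration of this separation, the same model can be called twice with different system prompts. Everything below is a sketch: `call_llm` is a placeholder stub standing in for a real client, and both prompts are invented for the example:

```python
# Hypothetical Producer-Critic sketch: two calls with distinct system
# prompts. call_llm is a placeholder, not a real API client.

def call_llm(system_prompt: str, user_message: str) -> str:
    """Placeholder for an actual LLM API call; canned replies for the demo."""
    if "critic" in system_prompt.lower():
        return "- The summary omits the publication year."
    return "Initial draft summary of the document."

PRODUCER_SYSTEM = "You are a technical writer. Draft the requested content."
CRITIC_SYSTEM = (
    "You are a meticulous critic. List flaws in the draft as bullet points, "
    "or reply 'NO_ISSUES' if none remain."
)

def produce_and_review(task: str) -> tuple:
    """One producer call, then one critic call on the producer's output."""
    draft = call_llm(PRODUCER_SYSTEM, task)
    feedback = call_llm(CRITIC_SYSTEM, f"Task: {task}\nDraft: {draft}")
    return draft, feedback
```

The critic's feedback would then be appended to the producer's context for the next refinement pass, as the surrounding text describes.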
Implementing reflection often requires structuring the agent's workflow to include these feedback loops. This can be achieved through iterative loops in code, or by using frameworks that support state management and conditional transitions based on evaluation results. While a single step of evaluation and refinement can be implemented within a LangChain/LangGraph, ADK, or Crew.AI chain, true iterative reflection typically involves more complex orchestration.
The Reflection pattern is crucial for building agents that can produce high-quality outputs, handle nuanced tasks, and exhibit a degree of self-awareness and adaptability. It moves agents beyond simply executing instructions towards a more sophisticated form of problem-solving and content generation.
The intersection of reflection with goal setting and monitoring (see Chapter 11) is worth noting. A goal provides the ultimate benchmark for the agent's self-evaluation, while monitoring tracks its progress. In many practical cases, Reflection then acts as the corrective engine, using monitored feedback to analyze deviations and adjust its strategy. This synergy transforms the agent from a passive executor into a purposeful system that adaptively works to achieve its objectives.
Furthermore, the effectiveness of the Reflection pattern is significantly enhanced when the LLM keeps a memory of the conversation (see Chapter 8). This conversational history provides crucial context for the evaluation phase, allowing the agent to assess its output not just in isolation, but against the backdrop of previous interactions, user feedback, and evolving goals. It enables the agent to learn from past critiques and avoid repeating errors. Without memory, each reflection is a self-contained event; with memory, reflection becomes a cumulative process where each cycle builds upon the last, leading to more intelligent and context-aware refinement.
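The cumulative character of reflection-with-memory can be made concrete in a schematic loop where every cycle appends to a shared history that later calls can see. The role labels and the `NO_ISSUES` sentinel below are illustrative conventions, and `generate`/`critique` are supplied as stand-in callables:

```python
# Schematic cumulative reflection: the shared message history grows each
# cycle, so every later critique and refinement sees all earlier drafts
# and feedback rather than operating in isolation.

def reflect_with_memory(task, generate, critique, max_iterations=3):
    history = [("user", task)]
    draft = generate(history)
    for _ in range(max_iterations):
        history.append(("assistant", draft))
        feedback = critique(history)              # sees the full history
        if feedback == "NO_ISSUES":
            break
        history.append(("user", f"Critique: {feedback}"))
        draft = generate(history)                 # refines with full context
    return draft, history
```

Without the `history` list, each call would be a self-contained event; with it, each cycle builds on the last, which is exactly the distinction drawn above.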
Practical Applications & Use Cases
The Reflection pattern is valuable in scenarios where output quality, accuracy, or adherence to complex constraints is critical:
- Creative Writing and Content Generation: Refining generated text, stories, poems, or marketing copy.
  - Use Case: An agent writing a blog post.
  - Reflection: Generate a draft, critique it for flow, tone, and clarity, then rewrite based on the critique. Repeat until the post meets quality standards.
  - Benefit: Produces more polished and effective content.
- Code Generation and Debugging: Writing code, identifying errors, and fixing them.
  - Use Case: An agent writing a Python function.
  - Reflection: Write initial code, run tests or static analysis, identify errors or inefficiencies, then modify the code based on the findings.
  - Benefit: Generates more robust and functional code.
- Complex Problem Solving: Evaluating intermediate steps or proposed solutions in multi-step reasoning tasks.
  - Use Case: An agent solving a logic puzzle.
  - Reflection: Propose a step, evaluate whether it leads closer to the solution or introduces contradictions, and backtrack or choose a different step if needed.
  - Benefit: Improves the agent's ability to navigate complex problem spaces.
- Summarization and Information Synthesis: Refining summaries for accuracy, completeness, and conciseness.
  - Use Case: An agent summarizing a long document.
  - Reflection: Generate an initial summary, compare it against key points in the original document, then refine the summary to include missing information or improve accuracy.
  - Benefit: Creates more accurate and comprehensive summaries.
- Planning and Strategy: Evaluating a proposed plan and identifying potential flaws or improvements.
  - Use Case: An agent planning a series of actions to achieve a goal.
  - Reflection: Generate a plan, simulate its execution or evaluate its feasibility against constraints, then revise the plan based on the evaluation.
  - Benefit: Develops more effective and realistic plans.
- Conversational Agents: Reviewing previous turns in a conversation to maintain context, correct misunderstandings, or improve response quality.
  - Use Case: A customer support chatbot.
  - Reflection: After a user response, review the conversation history and the last generated message to ensure coherence and address the user's latest input accurately.
  - Benefit: Leads to more natural and effective conversations.
Reflection adds a layer of meta-cognition to agentic systems, enabling them to learn from their own outputs and processes, leading to more intelligent, reliable, and high-quality results.
Hands-On Code Example (LangChain)
The implementation of a complete, iterative reflection process necessitates mechanisms for state management and cyclical execution. While these are handled natively in graph-based frameworks like LangGraph or through custom procedural code, the fundamental principle of a single reflection cycle can be demonstrated effectively using the compositional syntax of LCEL (LangChain Expression Language).
This example implements a reflection loop using the LangChain library and OpenAI's GPT-4o model to iteratively generate and refine a Python function that calculates the factorial of a number. The process starts with a task prompt, generates initial code, and then repeatedly reflects on the code based on critiques from a simulated senior software engineer role, refining the code in each iteration until the critique stage determines the code is perfect or a maximum number of iterations is reached. Finally, it prints the resulting refined code.
First, ensure you have the necessary libraries installed:
pip install langchain langchain-community langchain-openai
You will also need to set up your environment with your API key for the language model you choose (e.g., OpenAI, Google Gemini, Anthropic).
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

# --- Configuration ---
# Load environment variables from .env file (for OPENAI_API_KEY)
load_dotenv()

# Check if the API key is set
if not os.getenv("OPENAI_API_KEY"):
    raise ValueError("OPENAI_API_KEY not found in .env file. Please add it.")

# Initialize the Chat LLM. We use gpt-4o for better reasoning.
# A lower temperature is used for more deterministic outputs.
llm = ChatOpenAI(model="gpt-4o", temperature=0.1)

def run_reflection_loop():
    """
    Demonstrates a multi-step AI reflection loop to progressively
    improve a Python function.
    """
    # --- The Core Task ---
    task_prompt = """
    Your task is to create a Python function named `calculate_factorial`.
    This function should do the following:
    1. Accept a single integer `n` as input.
    2. Calculate its factorial (n!).
    3. Include a clear docstring explaining what the function does.
    4. Handle edge cases: The factorial of 0 is 1.
    5. Handle invalid input: Raise a ValueError if the input is a negative number.
    """

    # --- The Reflection Loop ---
    max_iterations = 3
    current_code = ""
    # We will build a conversation history to provide context in each step.
    message_history = [HumanMessage(content=task_prompt)]

    for i in range(max_iterations):
        print("\n" + "=" * 25 + f" REFLECTION LOOP: ITERATION {i + 1} " + "=" * 25)

        # --- 1. GENERATE / REFINE STAGE ---
        # In the first iteration it generates; in later iterations it refines.
        if i == 0:
            print("\n>>> STAGE 1: GENERATING initial code...")
            # The first message is just the task prompt.
            response = llm.invoke(message_history)
            current_code = response.content
        else:
            print("\n>>> STAGE 1: REFINING code based on previous critique...")
            # The message history now contains the task, the last code,
            # and the last critique. We instruct the model to apply the critiques.
            message_history.append(HumanMessage(
                content="Please refine the code using the critiques provided."))
            response = llm.invoke(message_history)
            current_code = response.content

        print("\n--- Generated Code (v" + str(i + 1) + ") ---\n" + current_code)
        message_history.append(response)  # Add the generated code to history

        # --- 2. REFLECT STAGE ---
        print("\n>>> STAGE 2: REFLECTING on the generated code...")
        # Create a specific prompt for the reflector agent,
        # asking the model to act as a senior code reviewer.
        reflector_prompt = [
            SystemMessage(content="""You are a senior software engineer and an expert in Python.
Your role is to perform a meticulous code review.
Critically evaluate the provided Python code based on the original task requirements.
Look for bugs, style issues, missing edge cases, and areas for improvement.
If the code is perfect and meets all requirements,
respond with the single phrase 'CODE_IS_PERFECT'.
Otherwise, provide a bulleted list of your critiques."""),
            HumanMessage(content=f"Original Task:\n{task_prompt}\n\nCode to Review:\n{current_code}")
        ]
        critique_response = llm.invoke(reflector_prompt)
        critique = critique_response.content

        # --- 3. STOPPING CONDITION ---
        if "CODE_IS_PERFECT" in critique:
            print("\n--- Critique ---\nNo further critiques found. The code is satisfactory.")
            break

        print("\n--- Critique ---\n" + critique)
        # Add the critique to the history for the next refinement loop.
        message_history.append(HumanMessage(
            content=f"Critique of the previous code:\n{critique}"))

    print("\n" + "=" * 30 + " FINAL RESULT " + "=" * 30)
    print("\nFinal refined code after the reflection process:\n")
    print(current_code)

if __name__ == "__main__":
    run_reflection_loop()
The code begins by setting up the environment, loading API keys, and initializing a powerful language model like GPT-4o with a low temperature for focused outputs. The core task is defined by a prompt asking for a Python function to calculate the factorial of a number, including specific requirements for docstrings, edge cases (factorial of 0), and error handling for negative input. The run_reflection_loop function orchestrates the iterative refinement process. Within the loop, in the first iteration, the language model generates initial code based on the task prompt. In subsequent iterations, it refines the code based on critiques from the previous step. A separate "reflector" role, also played by the language model but with a different system prompt, acts as a senior software engineer to critique the generated code against the original task requirements. This critique is provided as a bulleted list of issues or the phrase 'CODE_IS_PERFECT' if no issues are found. The loop continues until the critique indicates the code is perfect or a maximum number of iterations is reached. The conversation history is maintained and passed to the language model in each step to provide context for both generation/refinement and reflection stages. Finally, the script prints the last generated code version after the loop concludes.
Hands-On Code Example (ADK)
Let's now look at a conceptual code example implemented using the Google ADK. Specifically, the code showcases this by employing a Generator-Critic structure, where one component (the Generator) produces an initial result or plan, and another component (the Critic) provides critical feedback or a critique, guiding the Generator towards a more refined or accurate final output.
from google.adk.agents import SequentialAgent, LlmAgent

# The first agent generates the initial draft.
generator = LlmAgent(
    name="DraftWriter",
    description="Generates initial draft content on a given subject.",
    instruction="Write a short, informative paragraph about the user's subject.",
    output_key="draft_text"  # The output is saved to this state key.
)

# The second agent critiques the draft from the first agent.
reviewer = LlmAgent(
    name="FactChecker",
    description="Reviews a given text for factual accuracy and provides a structured critique.",
    instruction="""
    You are a meticulous fact-checker.
    1. Read the text provided in the state key 'draft_text'.
    2. Carefully verify the factual accuracy of all claims.
    3. Your final output must be a dictionary containing two keys:
       - "status": A string, either "ACCURATE" or "INACCURATE".
       - "reasoning": A string providing a clear explanation for your status,
         citing specific issues if any are found.
    """,
    output_key="review_output"  # The structured dictionary is saved here.
)

# The SequentialAgent ensures the generator runs before the reviewer.
review_pipeline = SequentialAgent(
    name="WriteAndReview_Pipeline",
    sub_agents=[generator, reviewer]
)

# Execution Flow:
# 1. generator runs -> saves its paragraph to state['draft_text'].
# 2. reviewer runs -> reads state['draft_text'] and saves its dictionary
#    output to state['review_output'].
This code demonstrates the use of a sequential agent pipeline in Google ADK for generating and reviewing text. It defines two LlmAgent instances: generator and reviewer. The generator agent is designed to create an initial draft paragraph on a given subject. It is instructed to write a short and informative piece and saves its output to the state key draft_text. The reviewer agent acts as a fact-checker for the text produced by the generator. It is instructed to read the text from draft_text and verify its factual accuracy. The reviewer's output is a structured dictionary with two keys: status and reasoning. status indicates if the text is "ACCURATE" or "INACCURATE", while reasoning provides an explanation for the status. This dictionary is saved to the state key review_output. A SequentialAgent named review_pipeline is created to manage the execution order of the two agents. It ensures that the generator runs first, followed by the reviewer. The overall execution flow is that the generator produces text, which is then saved to the state. Subsequently, the reviewer reads this text from the state, performs its fact-checking, and saves its findings (the status and reasoning) back to the state. This pipeline allows for a structured process of content creation and review using separate agents. Note: An alternative implementation utilizing ADK's LoopAgent is also available for those interested.
Before concluding, it's important to consider that while the Reflection pattern significantly enhances output quality, it comes with important trade-offs. The iterative process, though powerful, can lead to higher costs and latency, since every refinement loop may require a new LLM call, making it suboptimal for time-sensitive applications. Furthermore, the pattern is memory-intensive; with each iteration, the conversational history expands, including the initial output, critique, and subsequent refinements.
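A rough way to see this cost growth: each cycle adds one producer call and one critic call, and each call's prompt includes the accumulated history. The token counts and per-1k-token price below are arbitrary illustrative assumptions, not real model rates:

```python
# Back-of-the-envelope token cost of a reflection loop. Each iteration
# makes a producer call and a critic call, and both the draft and the
# critique are appended to the history, so prompts keep growing.

def reflection_token_cost(base_prompt_tokens=500, output_tokens=400,
                          iterations=3, price_per_1k=0.005):
    total = 0
    prompt = base_prompt_tokens
    for _ in range(iterations):
        total += prompt + output_tokens   # producer call
        prompt += output_tokens           # draft joins the history
        total += prompt + output_tokens   # critic call
        prompt += output_tokens           # critique joins the history
    return total, total / 1000 * price_per_1k
```

With these assumed numbers, three iterations already consume well over ten times the tokens of a single one-shot call, which is the latency/cost trade-off and context-window pressure discussed above.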
At a Glance
What: An agent's initial output is often suboptimal, suffering from inaccuracies, incompleteness, or a failure to meet complex requirements. Basic agentic workflows lack a built-in process for the agent to recognize and fix its own errors. This is solved by having the agent evaluate its own work or, more robustly, by introducing a separate logical agent to act as a critic, preventing the initial response from being treated as final regardless of quality.
Why: The Reflection pattern offers a solution by introducing a mechanism for self-correction and refinement. It establishes a feedback loop in which a "producer" agent generates an output and a "critic" agent (or the producer itself) evaluates it against predefined criteria. This critique is then used to generate an improved version. This iterative process of generation, evaluation, and refinement progressively enhances the quality of the final result, leading to more accurate, coherent, and reliable outcomes.
Rule of thumb: Use the Reflection pattern when the quality, accuracy, and detail of the final output matter more than speed and cost. It is particularly effective for tasks like generating polished long-form content, writing and debugging code, and creating detailed plans. Employ a separate critic agent when tasks require high objectivity or specialized evaluation that a generalist producer agent might miss.
Visual summary
Fig. 1: Reflection design pattern, self-reflection
Fig. 2: Reflection design pattern, producer and critic agents
Key Takeaways
- The primary advantage of the Reflection pattern is its ability to iteratively self-correct and refine outputs, leading to significantly higher quality, accuracy, and adherence to complex instructions.
- It involves a feedback loop of execution, evaluation/critique, and refinement. Reflection is essential for tasks requiring high-quality, accurate, or nuanced outputs.
- A powerful implementation is the Producer-Critic model, where a separate agent (or prompted role) evaluates the initial output. This separation of concerns enhances objectivity and allows for more specialized, structured feedback.
- However, these benefits come at the cost of increased latency and computational expense, along with a higher risk of exceeding the model's context window or being throttled by API services.
- While full iterative reflection often requires stateful workflows (like LangGraph), a single reflection step can be implemented in LangChain using LCEL to pass output for critique and subsequent refinement.
- Google ADK can facilitate reflection through sequential workflows where one agent's output is critiqued by another agent, allowing for subsequent refinement steps.
- This pattern enables agents to perform self-correction and enhance their performance over time.
Conclusion
The reflection pattern provides a crucial mechanism for self-correction within an agent's workflow, enabling iterative improvement beyond a single-pass execution. This is achieved by creating a loop where the system generates an output, evaluates it against specific criteria, and then uses that evaluation to produce a refined result. This evaluation can be performed by the agent itself (self-reflection) or, often more effectively, by a distinct critic agent, which represents a key architectural choice within the pattern.
While a fully autonomous, multi-step reflection process requires a robust architecture for state management, its core principle is effectively demonstrated in a single generate-critique-refine cycle. As a control structure, reflection can be integrated with other foundational patterns to construct more robust and functionally complex agentic systems.
References
Here are some resources for further reading on the Reflection pattern and related concepts:
- Training Language Models to Self-Correct via Reinforcement Learning: arxiv.org/abs/2409.12…
- LangChain Expression Language (LCEL) Documentation: python.langchain.com/docs/introd…
- LangGraph Documentation: www.langchain.com/langgraph
- Google Agent Developer Kit (ADK) Documentation (Multi-Agent Systems): google.github.io/adk-docs/ag…