Building a Simple Coding Agent from Scratch

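As LLMs grow more capable, coding agents have become a fixture of day-to-day programming. Most of us have used mainstream editors such as Trae and Cursor, which integrate an AI sidebar, or standalone coding agents such as Claude Code, OpenCode, and Qwen-Code. These tools substantially raise productivity and extend what one person can build. Having used them all along, have you ever thought about implementing a simple coding agent yourself to understand the basic principles behind it?

This article takes a beginner's point of view and walks step by step through building a simple coding agent. Working through the core logic hands-on should make the basic implementation principles easy to follow. We skip the complex underlying architecture and focus on the most basic core flow, learning by doing.

A complete coding agent is in fact extremely complex to implement, and certainly not a matter of a few lines of code. This article deliberately leaves out sophisticated low-level encapsulation and advanced features, and distills only the most essential implementation logic into a simple coding agent.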

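The Core of a Coding Agent

A coding agent is essentially a ReAct-style agent with planning, perception, and execution capabilities. It relies on three kinds of tools to close the basic loop:

File system tools: obtain the full structure of the project repository and gather the necessary context, with basic operations such as viewing, creating, and modifying files. They act as the agent's "project navigator".
Text editor tools: work at the code level, performing fine-grained changes such as inserting, replacing, deleting, and modifying code. They act as the agent's "code editor".
To-do list (TO-DO): tracks task progress throughout, breaking a complex programming task into small steps and advancing them one by one, which avoids muddled logic and skipped steps.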

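Core Architecture: The ReAct Pattern

Thought → Action → Observation → Thought → Action → ... → Final Answer

ReAct is short for Reasoning and Acting. It is a prompting paradigm that lets an LLM "think while acting": the model alternates between reasoning and acting, following a Thought -> Action -> Observation loop while it carries out a task.

Thought (Reason): analyze the user's request and decide what to do next.
Action (Act): call a tool (for example, read a file or modify code).
Observation (Observe): look at the tool's output.
Loop: keep reasoning based on the result until the task is complete.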
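The cycle above can be sketched without any LLM at all. In this minimal sketch the model is replaced by a hypothetical scripted function (`fake_model`), and the single tool `add_one` is invented purely for illustration; only the control flow is the point.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> plain Python function
TOOLS: dict[str, Callable[[str], str]] = {
    "add_one": lambda text: str(int(text) + 1),
}


def fake_model(history: list[str]) -> tuple[str, str, str]:
    """Scripted stand-in for an LLM: decide the next step from the history."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        # Nothing observed yet, so ask for a tool call
        return ("action", "add_one", "41")
    latest = observations[-1].split(": ", 1)[1]
    return ("final", f"The result is {latest}", "")


def react_loop(question: str, max_steps: int = 5) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        kind, name, arg = fake_model(history)           # Thought
        if kind == "final":
            return name                                 # Final Answer
        observation = TOOLS[name](arg)                  # Action
        history.append(f"Observation: {observation}")   # Observation
    return "max steps reached"


print(react_loop("What is 41 + 1?"))
```

The `max_steps` cap plays the same role as in any real agent loop: it bounds the number of model calls when the model never produces a final answer.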

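How Does the ReAct Loop Run?

Concepts alone can leave ReAct feeling abstract. To make it concrete, I prepared a hand-written implementation of the ReAct pattern so you can see in code how it actually runs. It shows the complete core loop of ReAct (Reasoning + Acting) with no black-box encapsulation, and you can see clearly how the three key elements are wired together by hand.

For the ReAct prompt, you can refer to the one linked here, or look at the prompt used by create_react_agent in LangChain v0.

This is a Search Agent. In essence it is just an AI with web search capability; personally I don't think it really qualifies as an agent, since it is still only a large model calling a tool. The model is a local qwen3:14b served by Ollama, and the prompt strictly constrains the output format so that every step of the LLM's output can be parsed by the script.

It integrates two lightweight tools:

get_current_time: returns the current system time;
search_duckduckgo: a real-time web search tool based on DuckDuckGo.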

import re

from datetime import datetime
from langchain_community.tools import DuckDuckGoSearchResults
from langchain_core.tools import tool
from langchain_core.prompts import PromptTemplate
from langchain_ollama import ChatOllama

@tool
def search_duckduckgo(query: str) -> str:
    """
    用于在 DuckDuckGo 搜索引擎上执行查询的工具。

    此工具在以下情况下非常有用：
    1. 当需要查找当前事件或最新信息时
    2. 当需要搜索特定事实、数据或详细信息时
    3. 当问题涉及需要实时信息的内容时
    注意：对于简单的事实性问题，请优先使用此工具而不是依赖模型的知识。

    参数：
    query (str): 要在 DuckDuckGo 上搜索的查询字符串。

    返回：
    str: 搜索结果的字符串表示。
    """
    return DuckDuckGoSearchResults().invoke(query)


@tool
def get_current_time() -> str:
    """
    获取当前实时时间的工具。

    此工具用于获取当前的时间，适用于需要知道当前时间的场景，包括不限于年份、月份、日期、小时、分钟、秒等。
    此外如果用户提到去年、今年、上个月、本月等时间范围，也需要使用此工具获取相关时间。

    返回：
    str: 当前时间的字符串表示。
    """
    return str(datetime.now())

OLLAMA_CHAT_MODEL = "qwen3:14b"

class SearchAgent:
    def __init__(self, model_name: str = OLLAMA_CHAT_MODEL, max_steps: int = 5):
        """初始化搜索代理"""
        self.max_steps = max_steps

        
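        # Key point: generation stops as soon as the model is about to emit
        # "\nObservation:". We then call the tool ourselves and append the
        # real result at that position.
        self.llm = ChatOllama(model=model_name, reasoning=False).bind(
            stop=["\nObservation:"]
        )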

        self.search = DuckDuckGoSearchResults()

        self.tools = [
            get_current_time,
            search_duckduckgo,
        ]

        self.prompt = PromptTemplate.from_template(
            """
            尽你所能回答以下问题。你可以使用以下工具：

            {tools}

            请严格遵循以下格式：

            Question: 你必须回答的输入问题
            Thought: 你应该总是思考该做什么
            Action: 要采取的行动，必须是 [{tool_names}] 之一
            Action Input: 行动的输入内容
            Observation: 行动的结果（由系统返回，不要自己生成）
            ... (这个 Thought/Action/Action Input/Observation 可以重复 N 次)
            Thought: 我现在知道最终答案了
            Final Answer: 针对原始问题的最终答案

            开始！

            Question: {question}
            Agent Scratchpad: {agent_scratchpad}
            """,
            partial_variables={
                "tool_names": ", ".join([tool.name for tool in self.tools]),
                "tools": "\n".join(
                    [f"{tool.name}: {tool.description}" for tool in self.tools]
                ),
            },
        )

    def _run_tool(self, tool_name: str, tool_input: str) -> str:
        """运行指定工具"""
        print(f"  [系统] 🛠️ 正在调用工具: {tool_name} | 输入: {tool_input}")
        for tool in self.tools:
            if tool.name == tool_name:
                result = tool.invoke(tool_input)
                print(
                    f"  [系统] ✅ 工具 {tool_name} 返回结果: {result[:100]}..."
                    if len(str(result)) > 100
                    else f"  [系统] ✅ 工具 {tool_name} 返回结果: {result}"
                )
                return result
        raise ValueError(f"Unknown tool: {tool_name}")

    def query(self, question: str) -> str:
        """主入口：执行 Agent 循环"""

        print(f"🚀 Agent 启动 | 问题: {question}\n")
        step_count = 0
        agent_scratchpad = ""

        while step_count < self.max_steps:
            step_count += 1
            print(f"\n🔄 第 {step_count} 轮思考...")

            # 格式化提示模板
            prompt = self.prompt.format(
                question=question, agent_scratchpad=agent_scratchpad
            )

            # 调用 LLM 生成响应
            print("  [系统] 🧠 LLM 正在思考...")
            response = self.llm.invoke(prompt)
            llm_output = str(response.content)

            print(f"  [系统] 🤖 LLM 输出:\n{llm_output}")

            if "Final Answer:" in llm_output:
                # 提取最终答案
                final_answer = llm_output.split("Final Answer:")[-1].strip()
                print(f"\n🎉 任务完成 | 最终答案: {final_answer}")
                return final_answer

            action_match = re.search(r"Action:\s*(.*?)\n", llm_output)
            action_input_match = re.search(r"Action Input:\s*(.*)", llm_output)

            if action_match and action_input_match:
                action = action_match.group(1).strip()
                action_input = action_input_match.group(1).strip()
                # 5. 执行工具 (Action)
                observation = self._run_tool(action, action_input)
                # 6. 更新历史 (Observation)
                # 将 LLM 的思考 + 真实的工具结果追加到历史记录中
                agent_scratchpad += f"{llm_output}\nObservation: {observation}\n"
            else:
                # 如果格式不对，给 LLM 一个反馈让它重试
                print("  [警告] ⚠️ 解析失败，格式不正确，要求模型重试...")
                agent_scratchpad += f"{llm_output}\nObservation: 格式错误，请严格按照 Action: [工具名] 和 Action Input: [内容] 的格式输出。\n"  # noqa: E501

        print("\n❌ 任务失败：达到最大循环次数")
        return "❌ 任务失败：达到最大循环次数，未能找到答案。"


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="AI 搜索助手")
    parser.add_argument(
        "question",
        nargs="?",
        help="你想问的问题",
    )
    args = parser.parse_args()

    searchAgent = SearchAgent(max_steps=10)
    searchAgent.query(args.question)

Let's run two examples and look at the output.

Question: 去年的诺贝尔和平奖的获得者是谁？

🚀 Agent 启动 | 问题: 去年的诺贝尔和平奖的获得者是谁

🔄 第 1 轮思考...
  [系统] 🧠 LLM 正在思考...
  [系统] 🤖 LLM 输出:
Thought: 我需要先获取去年的年份，然后搜索去年的诺贝尔和平奖获得者。
Action: get_current_time
Action Input: 
  [系统] 🛠️ 正在调用工具: get_current_time | 输入: 
  [系统] ✅ 工具 get_current_time 返回结果: 2026-03-19 17:49:11.969420

🔄 第 2 轮思考...
  [系统] 🧠 LLM 正在思考...
  [系统] 🤖 LLM 输出:
Thought: 现在我知道去年是2025年，接下来我需要搜索2025年的诺贝尔和平奖获得者。
Action: search_duckduckgo
Action Input: 2025年诺贝尔和平奖获得者是谁
  [系统] 🛠️ 正在调用工具: search_duckduckgo | 输入: 2025年诺贝尔和平奖获得者是谁
  [系统] ✅ 工具 search_duckduckgo 返回结果: snippet: 互联网 获 诺 贝 尔 和 平 奖 提名. 互联网. 图像加注文字，互联网 获 得 诺 贝 尔 和 平 奖 提名.希望推动互联网 获 得 2010 年 诺 贝 尔 和 平 奖 的提名...

🔄 第 3 轮思考...
  [系统] 🧠 LLM 正在思考...
  [系统] 🤖 LLM 输出:
Thought: 根据搜索结果，2025年的诺贝尔和平奖获得者是“互联网”，但这一信息似乎与实际不符，因为诺贝尔奖通常授予个人或组织。可能搜索结果中存在误导信息。我需要进一步确认2025年诺贝尔和平奖的真正获得者。
Action: search_duckduckgo
Action Input: 2025年诺贝尔和平奖正式获得者是谁
  [系统] 🛠️ 正在调用工具: search_duckduckgo | 输入: 2025年诺贝尔和平奖正式获得者是谁
  [系统] ✅ 工具 search_duckduckgo 返回结果: snippet: 2025 年 诺贝尔 和平 奖 （英语： 2025 Nobel Peace Prize）由 挪威诺贝尔委员会 于2025年10月10日宣布，授予 委内瑞拉 反对派领袖 玛丽亚·科丽娜...

🔄 第 4 轮思考...
  [系统] 🧠 LLM 正在思考...
  [系统] 🤖 LLM 输出:
Thought: 根据维基百科和新闻报道，2025年诺贝尔和平奖的获得者是委内瑞拉反对派领袖玛丽亚·科丽娜·马查多（María Corina Machado）。
Final Answer: 2025年诺贝尔和平奖的获得者是委内瑞拉反对派领袖玛丽亚·科丽娜·马查多（María Corina Machado）。

🎉 任务完成 | 最终答案: 2025年诺贝尔和平奖的获得者是委内瑞拉反对派领袖玛丽亚·科丽娜·马查多（María Corina Machado）。


Question: 你好呀

🚀 Agent 启动 | 问题: 你好


🔄 第 1 轮思考...
  [系统] 🧠 LLM 正在思考...
  [系统] 🤖 LLM 输出:
Thought: 我需要回应用户的问候。
Final Answer: 你好！有什么可以帮助你的吗？

🎉 任务完成 | 最终答案: 你好！有什么可以帮助你的吗？

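Question: "去年的诺贝尔和平奖获得者是谁？" (Who won last year's Nobel Peace Prize?)

This question looks simple, but to answer it the model has to know exactly which year "last year" is, and the search results may contain noise or misleading information that needs a second check. The agent's run shows the advantage of ReAct: without it, the model would answer directly, possibly with outdated data or a plain "I don't know".

Round 1: It first reasons that it needs to know which year last year was, then calls a tool to get the exact current time and derives the year from it.
Round 2: It builds the search query "2025年诺贝尔和平奖获得者是谁" and calls the search engine, but the results are vague and even mention outdated or wrong information such as "the Internet being nominated".
Round 3: The agent notices that the first search was inaccurate and adjusts the query on its own to "2025年诺贝尔和平奖正式获得者是谁" to improve accuracy.
Round 4: With clear information in hand, it combines the context and outputs the final answer.

Question: "你好呀" (Hello)

Faced with a social greeting like this, the agent shows another kind of intelligence: judging when no action is needed.

In its first round of thinking it recognizes that it only needs to respond to a greeting, so no tool call is required.
It therefore skips the Action and Observation stages and outputs the Final Answer directly: "你好！有什么可以帮助你的吗？"

This shows that ReAct is not a blind loop but is aware of the task: for questions it already has enough knowledge to answer, it ends the loop immediately, avoiding unnecessary calls and latency. This judgment of acting only when action is needed is an important part of the ReAct agent design.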
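The decision between "finish now" and "call a tool" comes down to how one chunk of model output is classified. As a minimal sketch, the checks used in `query()` can be pulled out into a pure function; the `Step` type and the name `parse_step` are hypothetical helpers introduced here for illustration.

```python
import re
from typing import NamedTuple, Optional


class Step(NamedTuple):
    kind: str                   # "final", "action", or "invalid"
    tool: Optional[str] = None  # tool name when kind == "action"
    text: str = ""              # final answer or tool input


def parse_step(llm_output: str) -> Step:
    """Classify one chunk of model output with the same checks as query()."""
    if "Final Answer:" in llm_output:
        answer = llm_output.split("Final Answer:")[-1].strip()
        return Step("final", text=answer)

    action = re.search(r"Action:\s*(.*?)\n", llm_output)
    action_input = re.search(r"Action Input:\s*(.*)", llm_output)
    if action and action_input:
        return Step("action", action.group(1).strip(), action_input.group(1).strip())

    # Neither a final answer nor a well-formed action: caller should ask for a retry
    return Step("invalid")


print(parse_step("Thought: greet the user\nFinal Answer: hi"))
print(parse_step("Thought: need the date\nAction: get_current_time\nAction Input: "))
```

Because generation stops before "Observation:", the action branch only ever sees the model's own Thought, Action, and Action Input lines.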

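Why Does ReAct Work So Well?

Compared with a plain LLM call, ReAct has clear advantages and addresses several pain points of complex programming tasks:

Handles complex problems: a traditional LLM call produces a single one-shot output and is error-prone on complex coding tasks. ReAct supports multi-step decomposition instead of a one-question-one-answer mode; it can call tools to sense changes in the environment, operate on project files, and keep iterating until the problem is solved, which suits complex development scenarios.
Explainable and easy to debug: the Thought, Action, and Observation of every step are visible rather than hidden in a black box, so the cause of a failure can be located quickly.
Fits tool-calling scenarios: it works with external tool calls, whether file reads and writes, information search, or database queries, which makes it very practical.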

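Implementing the Coding Agent

Development Framework

In this project we use the LangChain framework (Python version) to build the coding agent. The TypeScript version of LangChain works too; only the syntax differs, so pick whichever language you are more familiar with.

Underlying Model

For the underlying model we use MiniMax M2.5, which is designed natively for agent scenarios. Its coding and agentic performance is positioned directly against Claude Opus 4.6, and it supports thinking mode and function calling by default.

Here I use the model service provided by Qiniu Cloud, which supports many models. Registration comes with 10 million free tokens, which is more than enough for learning.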

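Tools

The power of an agent lies in its tools. In this project we give the agent three kinds of core tools.

📂 File Browsing Tools (File System)

These tools let the agent quickly understand the project's directory structure and file contents, much like the file tree and search features a programmer uses in an IDE.

Tool	Main function	Typical use
`ls`	Lists files and subdirectories in a given directory	Quickly see which files a directory contains; supports filtering by glob pattern (e.g. `*.py`).
`grep`	High-performance content search based on ripgrep	Find code definitions, references, or specific strings. Safer than invoking the `grep` command directly, with friendlier output.
`tree`	Shows the directory hierarchy as a tree	Quickly build a picture of the overall project structure. Automatically filters out noise such as `.git` and `node_modules`.

📝 Code Editing Tools (Editor)

These tools give the agent the ability to modify code. The design follows an "atomic operation" principle: each tool does exactly one thing, which keeps edits accurate.

Tool	Main function	Typical use
`create`	Creates a new UTF-8 text file	Create new code or configuration files. Missing parent directories are created automatically.
`insert`	Inserts text before a given line	Insert a new function, import statement, or logic block into existing code.
`str_replace`	Replaces an exact string in a file	Modify existing code. More robust than line-number-based editing because it is not affected by shifting line numbers.
`view`	Reads file contents (with line numbers)	Read code to understand context, or verify the code after a change.

💻 System Tools (System)

This is the tool through which the agent interacts with the operating system, so it can run commands, create directories, and execute programs.

Tool	Main function	Typical use
`bash`	Executes shell commands	Run tests (`pytest`), install dependencies (`pip install`), Git operations (`git status`), and so on. Keeps track of the current working directory (so `cd` takes effect).

Together, these are enough to build a simple coding agent.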

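Hands-On

Creating the Project Directory

First, set up the project skeleton. The directory structure is simple and split by functional module: model holds the model calls, tools holds the tool implementations, and logger handles structured terminal output (thinking, tool calls, and other steps).

code_agent/
├── main.py              # Entry point: defines the agent loop and the system prompt
├── model/               # Model layer: wraps LLM calls
│   └── chat_model.py
├── tools/               # Tool layer: the agent's hands and feet
│   ├── editor/          # Editor tools (create, insert, str_replace, view)
│   ├── fs/              # File system tools (ls, grep, tree)
│   ├── terminal/        # Terminal tool (bash)
│   └── todo/            # Task planning tool
└── logger/              # Logging module: structured terminal output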

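Installing Dependencies

Next, install the required libraries. We use uv (much faster than pip), mainly for the LangChain packages and ripgrep. ripgrep deserves a special mention: our search tool calls it under the hood, and its speed matters a great deal when the agent has to find code in a large project. The grep tool below locates the `rg` executable with `shutil.which`, so `rg` must be available on PATH.

# Core framework
uv add langchain langchain-openai

# MCP protocol support (optional, for future extension)
uv add langchain-mcp-adapters

# Python bindings for the high-performance search tool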
uv add ripgrep
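Since the search tool in this article depends on an `rg` executable being discoverable, a quick pre-flight check can save some debugging later. This is a small sketch; the function name `ripgrep_status` is a hypothetical helper, not part of any library.

```python
import shutil


def ripgrep_status() -> str:
    """Report whether the `rg` executable can be found on PATH."""
    rg_path = shutil.which("rg")
    if rg_path is None:
        return "rg not found on PATH; install ripgrep before using the grep tool"
    return f"rg found at {rg_path}"


print(ripgrep_status())
```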

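Bringing in the Model

Register with Qiniu Cloud and apply for an API key, then bring in the minimax-m2.5 model. Models on Qiniu Cloud are exposed through an OpenAI-compatible API by default, so we use langchain_openai to connect to the provider.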

# chat_model.py
from typing import Literal

from langchain_openai import ChatOpenAI

ModelName = Literal["minimax"]


def init_chat_model(model_name: ModelName) -> ChatOpenAI:
    """Return the chat model; `model_name` is currently unused because only one model is wired up."""
    return ChatOpenAI(
        model="minimax/minimax-m2.5",
        temperature=0.7,
        streaming=True,
        api_key=lambda: "***********************",
        base_url="https://api.qnaigc.com/v1",
    )


if __name__ == "__main__":
    chat_model = init_chat_model("minimax")
    print(chat_model.invoke("What is the capital of France?"))

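Implementing the Tool Set (Core)

In practice you do not need to write the tools below by hand. You can write a prompt and let an AI generate them (that is what the author did), because most models have a "memory" of how such tools are implemented; the results are all fairly similar and quite accurate. For example, you can copy the descriptions inside the functions below and ask an AI to generate another set of tools for you. Copying the code directly also works.

When writing a tool, be sure to pass the parse_docstring=True parameter, so that LangChain can automatically extract the tool's documentation and a complete JSON Schema.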
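To get a feel for what a Google-style docstring carries, here is a rough stdlib-only illustration that splits a docstring into a summary and per-argument descriptions. It is not LangChain's parser and makes simplifying assumptions (four-space indentation, an `Args:` section); `split_google_docstring` and `view` are hypothetical names used only for this sketch.

```python
import inspect
import re


def split_google_docstring(func) -> tuple[str, dict[str, str]]:
    """Return (summary, {arg_name: description}) from a Google-style docstring."""
    doc = inspect.getdoc(func) or ""
    summary, _, rest = doc.partition("Args:")
    args: dict[str, str] = {}
    current = None
    for line in rest.splitlines():
        if line and not line.startswith(" "):
            break  # an unindented line (e.g. "Returns:") ends the Args section
        entry = re.match(r"^ {4}(\w+):\s*(.*)$", line)
        if entry:
            current = entry.group(1)
            args[current] = entry.group(2)
        elif current and line.strip():
            # Deeper-indented lines continue the previous argument's description
            args[current] += " " + line.strip()
    return summary.strip(), args


def view(file_path: str, start_line: int = 1) -> str:
    """Read a file.

    Args:
        file_path: Path to read.
        start_line: First line to show,
            counted from 1.

    Returns:
        The file content.
    """
    return ""


print(split_google_docstring(view))
```

The per-argument descriptions are what end up helping the model choose argument values, which is why the tools in this article document every parameter.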

Note: LangChain appears to already ship some built-in middleware that provides this kind of tool capability directly:

ShellToolMiddleware

FilesystemFileSearchMiddleware

FilesystemMiddleware

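File Browsing Tools

First come the file system tools: ls, tree, and grep. They give the model IDE-like project awareness and search. tree and ls help the agent quickly build a mental map of the project. Just as the first thing we do when taking over a new project is to look at the directory structure, the agent also needs to know where the code is, where the configuration files are, and how the modules are organized, so after startup the agent calls these two tools first. grep (based on ripgrep) provides something like an IDE's global search.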

import fnmatch
import os
from typing import Optional

from langchain.tools import tool


@tool("ls", parse_docstring=True)
def ls_tool(
    path: str, match: Optional[list[str]] = None, ignore: Optional[list[str]] = None
) -> str:
    """
    Description:
        Lists files and directories in a given path. Optionally provide an array of glob patterns to match and ignore.
        列出给定路径下的文件和目录。可选地提供要匹配和忽略的 glob 模式数组。

    Args:
        path: The absolute path to list files and directories from. Relative paths are **not** allowed.
              要列出文件和目录的绝对路径。不允许使用相对路径。
        match: An optional array of glob patterns to match. Only files matching these patterns will be returned.
               可选的要匹配的 glob 模式数组。只有匹配这些模式的文件才会被返回。
        ignore: An optional array of glob patterns to ignore. Files matching these patterns will be excluded.
                可选的要忽略的 glob 模式数组。匹配这些模式的文件将被排除。
    """
    if not os.path.isabs(path):
        return f"Error: Path '{path}' is not an absolute path. (错误：路径 '{path}' 不是绝对路径)"

    if not os.path.exists(path):
        return f"Error: Path '{path}' does not exist. (错误：路径 '{path}' 不存在)"

    if not os.path.isdir(path):
        return (
            f"Error: Path '{path}' is not a directory. (错误：路径 '{path}' 不是目录)"
        )

    try:
        entries = os.listdir(path)
    except Exception as e:
        return f"Error listing directory '{path}': {str(e)}"

    # Filter by match patterns (inclusion)
    # If match is provided, we only keep files that match at least one pattern
    if match:
        matched_entries = set()
        for pattern in match:
            matched_entries.update(fnmatch.filter(entries, pattern))
        entries = list(matched_entries)

    # Filter by ignore patterns (exclusion)
    # If ignore is provided, we remove files that match any of the patterns
    if ignore:
        for pattern in ignore:
            ignored = set(fnmatch.filter(entries, pattern))
            entries = [e for e in entries if e not in ignored]

    entries.sort()

    if not entries:
        return "(empty directory)"

    # Decorate output: append '/' to directories
    result = []
    for entry in entries:
        full_path = os.path.join(path, entry)
        if os.path.isdir(full_path):
            result.append(f"{entry}/")
        else:
            result.append(entry)

    return "\n".join(result)
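The match-then-ignore filtering used by the `ls` tool can be tried in isolation with nothing but `fnmatch`. This is a minimal sketch of the same idea; `filter_entries` and the sample file names are made up for illustration.

```python
import fnmatch
from typing import Optional


def filter_entries(
    entries: list[str],
    match: Optional[list[str]] = None,
    ignore: Optional[list[str]] = None,
) -> list[str]:
    """Keep names matching any `match` pattern, then drop names matching any `ignore` pattern."""
    if match:
        kept: set[str] = set()
        for pattern in match:
            kept.update(fnmatch.filter(entries, pattern))
        entries = list(kept)
    if ignore:
        entries = [
            name for name in entries
            if not any(fnmatch.fnmatch(name, pattern) for pattern in ignore)
        ]
    return sorted(entries)


names = ["main.py", "test_main.py", "README.md", "utils.py"]
print(filter_entries(names, match=["*.py"], ignore=["test_*"]))
```

Inclusion runs before exclusion, so an `ignore` pattern always wins when a name matches both lists.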

import os
import shutil
import subprocess
from typing import Literal, Optional

from langchain.tools import tool


@tool("grep", parse_docstring=True)
def grep_tool(
    pattern: str,
    path: Optional[str] = None,
    glob: Optional[str] = None,
    output_mode: Literal[
        "content", "files_with_matches", "count"
    ] = "files_with_matches",
    B: Optional[int] = None,
    A: Optional[int] = None,
    C: Optional[int] = None,
    n: Optional[bool] = None,
    i: Optional[bool] = None,
    type: Optional[str] = None,
    head_limit: Optional[int] = None,
    multiline: Optional[bool] = False,
) -> str:
    """A powerful search tool built on ripgrep for searching file contents with regex patterns.
    基于 ripgrep 构建的强大搜索工具，用于使用正则表达式模式搜索文件内容。

    ALWAYS use this tool for search tasks. NEVER invoke `grep` or `rg` as a Bash command.
    Supports full regex syntax, file filtering, and various output modes.
    始终使用此工具进行搜索任务。切勿将 `grep` 或 `rg` 作为 Bash 命令调用。
    支持完整的正则表达式语法、文件过滤和各种输出模式。

    Args:
        pattern: The regular expression pattern to search for in file contents.
                 Uses ripgrep syntax - literal braces need escaping (e.g., `interface\\{\\}` for `interface{}`).
                 在文件内容中搜索的正则表达式模式。使用 ripgrep 语法 - 字面大括号需要转义。
        path: File or directory to search in. Defaults to current working directory if not specified.
              要搜索的文件或目录。如果未指定，默认为当前工作目录。
        glob: Glob pattern to filter files (e.g., "*.js", "*.{ts,tsx}").
              用于过滤文件的 Glob 模式（例如 "*.js", "*.{ts,tsx}"）。
        output_mode: Output mode - "content" shows matching lines with optional context,
                    "files_with_matches" shows only file paths (default),
                    "count" shows match counts per file.
                    输出模式 - "content" 显示匹配行和可选上下文，"files_with_matches" 仅显示文件路径（默认），"count" 显示每个文件的匹配计数。
        B: Number of lines to show before each match. Only works with output_mode="content".
           在每个匹配项之前显示的行数。仅在 output_mode="content" 时有效。
        A: Number of lines to show after each match. Only works with output_mode="content".
           在每个匹配项之后显示的行数。仅在 output_mode="content" 时有效。
        C: Number of lines to show before and after each match. Only works with output_mode="content".
           在每个匹配项前后显示的行数。仅在 output_mode="content" 时有效。
        n: Show line numbers in output. Only works with output_mode="content".
           在输出中显示行号。仅在 output_mode="content" 时有效。
        i: Enable case insensitive search.
           启用不区分大小写搜索。
        type: File type to search (e.g., "js", "py", "rust", "go", "java").
             More efficient than glob for standard file types.
             要搜索的文件类型（例如 "js", "py", "rust", "go", "java"）。对于标准文件类型比 glob 更高效。
        head_limit: Limit output to first N lines/entries. Works across all output modes.
                    将输出限制为前 N 行/条目。适用于所有输出模式。
        multiline: Enable multiline mode where patterns can span lines and . matches newlines.
                  Default is False (single-line matching only).
                  启用多行模式，其中模式可以跨越行，并且 . 匹配换行符。默认为 False（仅单行匹配）。

    Returns:
        Search results as a string, formatted according to the output_mode.
        搜索结果字符串，根据 output_mode 格式化。
    """
    # 1. Check if ripgrep (rg) is installed
    rg_path = shutil.which("rg")
    if not rg_path:
        return "Error: 'rg' (ripgrep) executable not found. Please install ripgrep first. (错误：未找到 'rg' (ripgrep) 可执行文件。请先安装 ripgrep。)"

    # 2. Build command arguments
    cmd = [rg_path]

    # Pattern handling
    # Note: We put the pattern later as a positional argument, usually after options

    # Options mapping
    if i:
        cmd.append("-i")

    if multiline:
        cmd.append("--multiline")
        cmd.append("--multiline-dotall")

    if type:
        cmd.extend(["--type", type])

    if glob:
        # Handle glob patterns safely
        cmd.extend(["--glob", glob])

    # Context options (only for content mode usually, but rg allows them generally)
    # Priority: C > (A or B)
    if C is not None:
        cmd.extend(["-C", str(C)])
    else:
        if B is not None:
            cmd.extend(["-B", str(B)])
        if A is not None:
            cmd.extend(["-A", str(A)])

    # Output mode handling
    if output_mode == "files_with_matches":
        cmd.append("--files-with-matches")
        # In this mode, context/line numbers don't make much sense, but rg handles it gracefully (ignores them)
    elif output_mode == "count":
        cmd.append("--count")
    elif output_mode == "content":
        # Default rg behavior is to show content
        # Force line numbers if requested
        if n:
            cmd.append("-n")
        # Force no heading to make parsing easier/consistent if needed,
        # but standard output is usually fine for LLM reading.
        # cmd.append("--no-heading")

    # Path handling
    if path:
        if not os.path.exists(path):
            return f"Error: Path '{path}' does not exist. (错误：路径 '{path}' 不存在)"
        search_path = path
    else:
        search_path = "."

    # 3. Execution with subprocess
    # We append pattern and path at the end
    cmd.append("--")  # End of options
    cmd.append(pattern)
    cmd.append(search_path)

    try:
        # Capture output
        process = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            check=False,  # Don't raise error on non-zero exit code (rg returns 1 if no match)
        )
    except Exception as e:
        return f"Error executing grep: {str(e)} (执行 grep 时出错)"

    # 4. Handle results
    if process.returncode == 1:
        # Exit code 1 means no matches found
        return "No matches found. (未找到匹配项)"
    elif process.returncode > 1:
        # Exit code > 1 means error
        return f"Grep error (Exit code {process.returncode}):\n{process.stderr} (Grep 错误)"

    output = process.stdout

    # 5. Apply head_limit if specified
    if head_limit and head_limit > 0:
        lines = output.splitlines()
        if len(lines) > head_limit:
            preview = lines[:head_limit]
            return (
                "\n".join(preview)
                + f"\n... ({len(lines) - head_limit} more lines truncated) (... 还有 {len(lines) - head_limit} 行被截断)"
            )

    # If output is too large even without limit (safety net)
    MAX_CHARS = 100000
    if len(output) > MAX_CHARS:
        return (
            output[:MAX_CHARS]
            + f"\n... (Output truncated, too long: {len(output)} chars) (... 输出过长被截断)"
        )

    return output if output else "No matches found. (未找到匹配项)"
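To see what the tool actually hands to ripgrep, it can help to build the argument list without executing anything. The sketch below mirrors a subset of the options above; `build_rg_args` is a hypothetical helper and assumes `rg` is simply invoked by name.

```python
from typing import Optional


def build_rg_args(
    pattern: str,
    path: str = ".",
    glob: Optional[str] = None,
    output_mode: str = "files_with_matches",
    C: Optional[int] = None,
    i: bool = False,
    n: bool = False,
) -> list[str]:
    """Translate tool arguments into a ripgrep argument list (nothing is executed)."""
    cmd = ["rg"]
    if i:
        cmd.append("-i")
    if glob:
        cmd.extend(["--glob", glob])
    if C is not None:
        cmd.extend(["-C", str(C)])
    if output_mode == "files_with_matches":
        cmd.append("--files-with-matches")
    elif output_mode == "count":
        cmd.append("--count")
    elif output_mode == "content" and n:
        cmd.append("-n")
    # "--" ends option parsing, so a pattern starting with "-" is not read as a flag
    cmd.extend(["--", pattern, path])
    return cmd


print(build_rg_args("def main", path="src", glob="*.py", output_mode="content", n=True, C=2))
```

Passing a list to `subprocess.run` instead of a shell string means the pattern is never interpreted by a shell, which is one reason a dedicated tool is safer than letting the model type `grep` into bash.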

import fnmatch
import os
from typing import Optional

from langchain.tools import tool


@tool("tree", parse_docstring=True)
def tree_tool(
    path: Optional[str] = None,
    max_depth: Optional[int] = 3,
) -> str:
    """Display directory structure in a tree format, similar to the 'tree' command.
    以树状格式显示目录结构，类似于 'tree' 命令。

    Shows files and directories in a hierarchical tree structure.
    Automatically excludes common ignore patterns (version control, dependencies, build artifacts, etc.).
    以分层树状结构显示文件和目录。自动排除常见的忽略模式（版本控制、依赖项、构建产物等）。

    Args:
        path: Directory path to display. Defaults to current working directory if not specified.
              要显示的目录路径。如果未指定，默认为当前工作目录。
        max_depth: Maximum depth to traverse. The max_depth should be less than or equal to 3. Defaults to 3.
                   遍历的最大深度。最大深度应小于或等于 3。默认为 3。

    Returns:
        A tree-structured view of the directory as a string.
        目录的树状结构视图字符串。
    """
    if path is None:
        path = os.getcwd()
    else:
        if not os.path.isabs(path):
            path = os.path.abspath(path)

    if not os.path.exists(path):
        return f"Error: Path '{path}' does not exist. (错误：路径 '{path}' 不存在)"

    if not os.path.isdir(path):
        return (
            f"Error: Path '{path}' is not a directory. (错误：路径 '{path}' 不是目录)"
        )

    # Hardcoded ignore patterns for noise reduction
    IGNORE_PATTERNS = {
        ".git",
        ".svn",
        ".hg",
        "__pycache__",
        "node_modules",
        "venv",
        ".venv",
        "dist",
        "build",
        ".idea",
        ".vscode",
        ".DS_Store",
        "*.pyc",
        "*.pyo",
        "coverage",
        ".pytest_cache",
        ".mypy_cache",
    }

    def should_ignore(name: str) -> bool:
        for pattern in IGNORE_PATTERNS:
            if fnmatch.fnmatch(name, pattern):
                return True
        return False

    tree_lines = []

    def build_tree(current_path: str, prefix: str = "", depth: int = 0):
        # max_depth is Optional, so guard against None before comparing
        if max_depth is not None and depth >= max_depth:
            return

        try:
            entries = sorted(os.listdir(current_path))
        except PermissionError:
            tree_lines.append(f"{prefix}[Permission Denied]")
            return
        except Exception as e:
            tree_lines.append(f"{prefix}[Error: {str(e)}]")
            return

        # Filter out ignored files/directories
        entries = [e for e in entries if not should_ignore(e)]
        count = len(entries)

        for i, entry in enumerate(entries):
            is_last = i == count - 1
            connector = "└── " if is_last else "├── "

            full_path = os.path.join(current_path, entry)
            is_dir = os.path.isdir(full_path)

            display_name = f"{entry}/" if is_dir else entry
            tree_lines.append(f"{prefix}{connector}{display_name}")

            if is_dir:
                extension = "    " if is_last else "│   "
                build_tree(full_path, prefix + extension, depth + 1)

    tree_lines.append(f"{os.path.basename(path) or path}/")
    build_tree(path, "", 0)

    return "\n".join(tree_lines)

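Command-Line Tool

The command-line tool is the agent's "limbs": it gives the agent direct control over the operating system. With the bash tool, the agent is no longer limited to passively reading and writing files and can take the initiative like a human developer: it can run pip install to install missing dependencies, run pytest to verify that a code change is correct, and even use git to commit code. To keep the interaction continuous, the tool deliberately maintains the current working directory (CWD) as state, so after the agent runs a cd command, subsequent operations happen in the new directory, much like a real terminal.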

import os
import subprocess
from typing import Optional

from langchain.tools import tool

# Global state to maintain current working directory across calls
# 维护跨调用的当前工作目录的全局状态
_CURRENT_CWD = os.getcwd()


@tool("bash", parse_docstring=True)
def bash_tool(command: str, reset_cwd: Optional[bool] = False) -> str:
    """Execute a standard bash command in a keep-alive shell, and return the output if successful or error message if failed.
    在保持活动的 shell 中执行标准 bash 命令，如果成功则返回输出，如果失败则返回错误消息。

    Use this tool to perform:
    - Create directories
    - Install dependencies
    - Start development server
    - Run tests and linting
    - Git operations
    使用此工具执行：创建目录、安装依赖、启动开发服务器、运行测试和 Lint、Git 操作等。

    Never use this tool to perform any harmful or dangerous operations.
    Use `ls`, `grep` and `tree` tools for file system operations instead of this tool.
    切勿使用此工具执行任何有害或危险的操作。
    请使用 `ls`、`grep` 和 `tree` 工具进行文件系统操作，而不是此工具。

    Args:
        command: The command to execute.
                 要执行的命令。
        reset_cwd: Whether to reset the current working directory to the project root directory.
                   是否将当前工作目录重置为项目根目录。
    """
    global _CURRENT_CWD

    if reset_cwd:
        _CURRENT_CWD = os.getcwd()
        return (
            f"Current working directory reset to {_CURRENT_CWD}. (当前工作目录已重置)"
        )

    # Handle directory changes manually since subprocess doesn't persist cwd
    if command.strip().startswith("cd "):
        try:
            target_dir = command.strip()[3:].strip()
            # Handle user home directory ~
            target_dir = os.path.expanduser(target_dir)

            # Resolve path relative to current maintained CWD
            new_path = os.path.abspath(os.path.join(_CURRENT_CWD, target_dir))

            if os.path.isdir(new_path):
                _CURRENT_CWD = new_path
                return f"Changed directory to {_CURRENT_CWD} (已切换目录)"
            else:
                return f"Error: Directory '{target_dir}' does not exist. (错误：目录不存在)"
        except Exception as e:
            return f"Error changing directory: {str(e)} (切换目录出错)"

    try:
        # Execute command
        # We use shell=True to support pipes, redirects, and environment variables
        process = subprocess.run(
            command,
            shell=True,
            cwd=_CURRENT_CWD,  # Use the maintained CWD
            capture_output=True,
            text=True,
            executable="/bin/bash",  # Explicitly use bash
        )

        stdout = process.stdout.strip()
        stderr = process.stderr.strip()

        output = []
        if stdout:
            output.append(stdout)
        if stderr:
            # Check if stderr is actually an error or just info (some tools print to stderr)
            if process.returncode != 0:
                output.append(f"Error (stderr): {stderr}")
            else:
                output.append(f"Info (stderr): {stderr}")

        result = "\n".join(output)

        if process.returncode != 0 and not result:
            result = f"Command failed with exit code {process.returncode} (no output). (命令失败，退出码 {process.returncode})"

        if not result:
            result = (
                "(Command executed successfully with no output) (命令执行成功，无输出)"
            )

        return result

    except Exception as e:
        return f"Error executing command: {str(e)} (执行命令出错)"
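The reason the tool tracks the working directory itself is that every `subprocess.run` call starts a fresh shell. The following sketch demonstrates this on a POSIX system (it assumes a shell with `cd` and `pwd -P` is available); the helper `run` and the directory name `sub` are illustrative only.

```python
import os
import subprocess
import tempfile


def run(command: str, cwd: str) -> str:
    """Run one shell command in `cwd` and return its stripped stdout."""
    result = subprocess.run(
        command, shell=True, cwd=cwd, capture_output=True, text=True
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)  # resolve symlinks so paths compare cleanly
    os.mkdir(os.path.join(root, "sub"))

    # Each call starts a new shell, so a `cd` in one call is forgotten by the next
    run("cd sub", cwd=root)
    after_plain_cd = run("pwd -P", cwd=root)

    # Tracking the directory in Python and passing it as cwd makes it stick
    tracked_cwd = os.path.join(root, "sub")
    after_tracked_cd = run("pwd -P", cwd=tracked_cwd)

print("plain cd persisted:", after_plain_cd != root)
print("tracked cwd used:", after_tracked_cd == tracked_cwd)
```

One consequence of intercepting `cd` by string prefix, as the tool does, is that a compound command such as `cd src && ls` is treated as a directory name rather than executed, so the agent should issue `cd` as a separate call.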

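Text Editor Tools

The code editing tools are what allow a coding agent to actually "write code". Since 2024, Anthropic has offered a standardized Text Editor tool definition that abstracts a text editor into four core commands (view, create, insert, str_replace), giving AI agents a text editing tool that is both simple and powerful; it has gradually become a de facto industry standard. These tools provide a set of atomic file I/O operations: create makes a new file, insert adds text at a given position, str_replace performs an exact replacement, and view reads content for checking. With these basic CRUD operations, the coding agent gains full control over the project's code and can carry out anything from creating a file to a complex refactoring.

Note that the view tool does not return the file verbatim: it prefixes each returned line with its line number (starting from 1). This compensates for an LLM's weakness in locating positions in text and lets it refer to lines of code as precisely as one would in an IDE.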

from pathlib import Path
from typing import Optional

from langchain.tools import tool


@tool("view", parse_docstring=True)
def view_tool(
    file_path: str, start_line: Optional[int] = None, end_line: Optional[int] = None
) -> str:
    """Read file content as text, with 1-based line numbers prefixed to each line.

    以文本方式查看文件内容，并在每行前添加从 1 开始的行号。

    Conventions:
        - Line numbers start from 1, corresponding to common IDE display.
          行号从 1 开始统计，对应 IDE 中常见显示。
        - [start_line, end_line] is a closed interval; out-of-bounds ranges are automatically clipped to file boundaries.
          [start_line, end_line] 为闭区间，超出范围会自动裁剪到文件边界。
        - Returns a readable error string instead of raising an exception if the file does not exist or cannot be read.
          文件不存在或读取异常时返回可读的错误字符串，而不是抛出异常。

    Args:
        file_path: Relative or absolute file path, read as UTF-8.
                   相对或绝对文件路径，按 UTF-8 读取。
        start_line: Optional, 1-based start line number (inclusive).
                    可选，1 基准、包含该行的起始行号。
        end_line: Optional, 1-based end line number (inclusive).
                  可选，1 基准、包含该行的结束行号。

    Returns:
        File content with line numbers, or error message.
        带行号的文件内容，或错误信息。
    """
    path = Path(file_path)
    if not path.is_file():
        return f"Error: file not found: {file_path} (错误：文件未找到)"

    try:
        text = path.read_text(encoding="utf-8")
    except Exception as e:
        return f"Error: failed to read file {file_path}: {e} (错误：读取文件失败)"

    lines = text.splitlines()
    total_lines = len(lines)

    if total_lines == 0:
        return "(empty file) (空文件)"

    # Default to viewing the entire file; start_line/end_line are 1-based closed intervals
    start = 1 if start_line is None or start_line < 1 else start_line
    end = total_lines if end_line is None or end_line > total_lines else end_line

    if start > end or start > total_lines:
        return (
            f"Error: invalid range start_line={start_line} end_line={end_line}; "
            f"file has {total_lines} line(s). (错误：无效范围)"
        )

    numbered_lines = []
    for lineno in range(start, end + 1):
        # Use original line content, excluding trailing newline characters
        content = lines[lineno - 1]
        numbered_lines.append(f"{lineno}: {content}")

    return "\n".join(numbered_lines)

import re
from pathlib import Path
from typing import Optional

from langchain.tools import tool


@tool("str_replace", parse_docstring=True)
def str_replace_tool(
    file_path: str,
    target: str,
    replacement: str,
    count: Optional[int] = None,
    case_sensitive: bool = True,
) -> str:
    """Perform literal string replacement (non-regex) on the entire file.

    对整个文件执行普通字符串替换（非正则语义）。

    Conventions:
        - Returns error if target is empty string.
          target 为空字符串时返回错误。
        - Returns replacement count and total file lines.
          返回替换次数与文件总行数。
        - Returns readable error strings for file not found or IO exceptions instead of raising exceptions.
          文件不存在或读写异常时返回可读错误字符串，而不是抛出异常。

    Args:
        file_path: Target file path, read/written as UTF-8.
                   目标文件路径，按 UTF-8 读写。
        target: The original substring to find and replace, must not be empty.
                要查找并替换的原始子串，不能为空。
        replacement: The substring to replace with.
                     替换后的子串。
        count: Optional, positive integer. If set, only replaces the first `count` matches; if omitted or None, replaces all.
               可选，正整数。设置时仅替换前 count 个匹配；省略或为 None 时替换全部。
        case_sensitive: Whether to be case sensitive, default True. If False, matches target case-insensitively.
                        是否区分大小写，默认 True。False 时以不区分大小写方式匹配 target。

    Returns:
        A brief description of the operation result or error message.
        操作结果简述或错误信息。
    """
    if target == "":
        return "Error: target string must not be empty. (错误：目标字符串不能为空)"

    if count is not None and count < 0:
        return f"Error: count must be non-negative, got {count}. (错误：count 必须非负)"

    path = Path(file_path)
    if not path.is_file():
        return f"Error: file not found: {file_path} (错误：文件未找到)"

    try:
        text = path.read_text(encoding="utf-8")
    except Exception as e:
        return f"Error: failed to read file {file_path}: {e} (错误：读取文件失败)"

    if not case_sensitive:
        # Use re.escape to ensure literal matching of target, not regex pattern
        # 使用 re.escape 保证按字面含义匹配 target，而不是正则模式
        pattern = re.compile(re.escape(target), flags=re.IGNORECASE)
        # Pass the replacement through a function so backslashes in it stay
        # literal instead of being parsed as regex escapes such as \1 or \n.
        if count is None or count == 0:
            new_text, replaced_count = pattern.subn(lambda _m: replacement, text)
        else:
            new_text, replaced_count = pattern.subn(
                lambda _m: replacement, text, count=count
            )
    else:
        occurrences = text.count(target)
        if count is None or count == 0:
            replaced_count = occurrences
            new_text = text.replace(target, replacement)
        else:
            replaced_count = min(occurrences, count)
            new_text = text.replace(target, replacement, count)

    if replaced_count > 0:
        try:
            path.write_text(new_text, encoding="utf-8")
        except Exception as e:
            return f"Error: failed to write file {file_path}: {e} (错误：写入文件失败)"

        return f"Replaced {replaced_count} occurrence(s) of '{target}'. (已替换 {replaced_count} 处 '{target}')"
    else:
        return f"No occurrences of '{target}' found. (未找到 '{target}')"
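
The literal-replacement rule the tool relies on can be tried in isolation. The following is a minimal standard-library sketch, not the tool itself: `literal_replace` is a hypothetical helper name, and it raises on an empty target instead of returning an error string. It shows why the target goes through `re.escape` and why the replacement is passed as a function.

```python
import re
from typing import Optional, Tuple


def literal_replace(
    text: str,
    target: str,
    replacement: str,
    count: Optional[int] = None,
    case_sensitive: bool = True,
) -> Tuple[str, int]:
    """Replace `target` literally and return (new_text, replaced_count)."""
    if target == "":
        raise ValueError("target must not be empty")
    limit = 0 if count is None else count  # 0 means "no limit" for re.subn
    flags = 0 if case_sensitive else re.IGNORECASE
    # re.escape makes the pattern literal ("." matches only a dot); the
    # function replacement keeps backslashes in `replacement` literal too.
    pattern = re.compile(re.escape(target), flags)
    return pattern.subn(lambda _match: replacement, text, count=limit)


new_text, replaced = literal_replace(
    "a.b A.B axb", "a.b", r"\1x", case_sensitive=False
)
```

Here `axb` is left alone because the dot is matched literally, and the replacement `\1x` is written verbatim rather than being treated as a group reference.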

from pathlib import Path

from langchain.tools import tool


@tool("insert", parse_docstring=True)
def insert_tool(file_path: str, line_index: int, content: str) -> str:
    """Insert text into a specific file at a 0-based line index.

    在指定文件中按 0 基准行索引插入文本。

    Conventions:
        - Returns error string if line_index < 0.
          line_index < 0 时返回错误字符串。
        - If line_index is greater than current line count, appends content to the end of file.
          当 line_index 大于当前行数时，等价于将内容追加到文件末尾。
        - Returns a brief description (lines inserted, total lines).
          返回简短说明（插入行数、最终总行数）。
        - Returns readable error strings for file not found or IO exceptions instead of raising exceptions.
          文件不存在或读写异常时返回可读错误字符串，而不是抛出异常。

    Args:
        file_path: Target file path, read/written as UTF-8.
                   目标文件路径，按 UTF-8 读写。
        line_index: 0-based line index, indicating insertion "before" this line.
                    0 起始的行索引，表示在该行"前面"插入内容。
        content: Text to insert, can contain multiple lines, will be split by "\\n" and inserted line by line.
                 要插入的文本，可以包含多行，将按 "\\n" 拆分逐行插入。

    Returns:
        A brief description of the operation result or error message.
        操作结果简述或错误信息。
    """
    if line_index < 0:
        return f"Error: line_index must be non-negative, got {line_index}. (错误：行索引必须非负)"

    path = Path(file_path)
    if not path.is_file():
        return f"Error: file not found: {file_path} (错误：文件未找到)"

    try:
        # Keep newlines of existing lines
        existing_lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    except Exception as e:
        return f"Error: failed to read file {file_path}: {e} (错误：读取文件失败)"

    # Append if insertion position is greater than line count
    insert_pos = min(line_index, len(existing_lines))

    # content itself may contain newlines, insert line by line
    if content:
        new_lines = content.splitlines(keepends=True)
        # Make sure the inserted block ends with a newline so it does not merge
        # with the line that follows it.
        if new_lines and not new_lines[-1].endswith("\n"):
            new_lines[-1] += "\n"
        # When appending after a last line that has no trailing newline, terminate
        # that line first so the inserted text does not get glued onto it.
        if (
            insert_pos == len(existing_lines)
            and existing_lines
            and not existing_lines[-1].endswith("\n")
        ):
            existing_lines[-1] += "\n"
    else:
        new_lines = []

    try:
        updated_lines = (
            existing_lines[:insert_pos] + new_lines + existing_lines[insert_pos:]
        )
        path.write_text("".join(updated_lines), encoding="utf-8")
    except Exception as e:
        return f"Error: failed to write file {file_path}: {e} (错误：写入文件失败)"

    inserted_count = len(new_lines)
    total_lines = len(updated_lines)
    return (
        f"Inserted {inserted_count} line(s) at index {insert_pos}. "
        f"Total lines: {total_lines}. (在索引 {insert_pos} 处插入了 {inserted_count} 行。总行数：{total_lines})"
    )
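
The splice at the heart of the insert tool is easier to reason about as a pure function on strings. This is a simplified sketch under my own assumptions (`insert_lines` is a hypothetical name, and it raises instead of returning error strings); it also covers the case where the file's last line has no trailing newline.

```python
def insert_lines(text: str, line_index: int, content: str) -> str:
    """Return `text` with `content` inserted before the 0-based `line_index`."""
    if line_index < 0:
        raise ValueError("line_index must be non-negative")
    lines = text.splitlines(keepends=True)
    pos = min(line_index, len(lines))  # an index past the end means "append"
    new_lines = content.splitlines(keepends=True)
    if not new_lines:
        return text
    # The inserted block always ends with a newline.
    if not new_lines[-1].endswith("\n"):
        new_lines[-1] += "\n"
    # Appending after an unterminated last line would glue the two together.
    if pos == len(lines) and lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"
    return "".join(lines[:pos] + new_lines + lines[pos:])


middle = insert_lines("a\nb\n", 1, "x")
appended = insert_lines("a\nb", 99, "c\nd")
```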

from pathlib import Path

from langchain.tools import tool


@tool("create", parse_docstring=True)
def create_tool(file_path: str, content: str = "", exist_ok: bool = False) -> str:
    """Create a new UTF-8 text file.

    创建新的 UTF-8 文本文件。

    Conventions:
        - Automatically creates parent directories.
          自动创建父目录。
        - Returns a brief description including the number of lines written and the file path.
          返回简短说明，包含写入的行数和文件路径。
        - All exceptions are converted to readable error strings.
          所有异常转换为可读的错误字符串。

    Args:
        file_path: Target file path, written as UTF-8.
                   目标文件路径，按 UTF-8 写入。
        content: Initial file content, can be an empty string, can contain multiple lines.
                 初始文件内容，可为空字符串，可包含多行。
        exist_ok: Default False. If False, raises error if file exists; if True, allows overwriting.
                  默认 False。False 时若文件已存在则报错；True 时允许覆盖。

    Returns:
        A brief description of the operation result or error message.
        操作结果简述或错误信息。
    """
    path = Path(file_path)

    try:
        if path.exists() and not exist_ok:
            return f"Error: file already exists: {file_path}. Set exist_ok=True to overwrite. (错误：文件已存在，请设置 exist_ok=True 以覆盖)"

        if path.parent and not path.parent.exists():
            path.parent.mkdir(parents=True, exist_ok=True)

        path.write_text(content, encoding="utf-8")
    except Exception as e:
        return f"Error: failed to create file {file_path}: {e} (错误：创建文件失败)"

    line_count = len(content.splitlines()) if content else 0
    return f"Created file {file_path} with {line_count} line(s). (已创建文件，共 {line_count} 行)"

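Creating the Coding Agent

In LangChain v1.0, creating a ReAct-style agent is simpler and more direct: you supply three core components, the model, the tools, and the system prompt. The framework combines them into an agent that runs the reason-act loop, so you no longer assemble prompt templates or parse output formats by hand. This lowers the barrier to entry considerably while keeping the core ReAct logic.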

import argparse
import os
import re

from langchain.agents import create_agent
from langchain.agents.middleware import TodoListMiddleware
from langchain_core.messages import HumanMessage, SystemMessage
from logger import AgentLogger
from model import init_chat_model
from tools.editor import create_tool, insert_tool, str_replace_tool, view_tool
from tools.fs import grep_tool, ls_tool, tree_tool
from tools.terminal import bash_tool


def build_agent():
    llm = init_chat_model("minimax")
    tools = [
        ls_tool,
        grep_tool,
        tree_tool,
        bash_tool,
        create_tool,
        insert_tool,
        str_replace_tool,
        view_tool,
    ]

    agent = create_agent(
        llm,
        tools=tools,
        system_prompt=SystemMessage(
            content=(
                f"""
                ---
                PROJECT_ROOT: {os.getcwd()} # 系统提示词这里可以加更多项目相关信息比如git等等
                ---

                作为ReAct模式的编程助手，请按以下准则执行用户的指令：

                ## 重要工作原则

                1. **务必检查修改结果**：每次修改文件后，都要用`view`工具查看确认，确保修改准确无误。

                2. **循序渐进地思考**：在得出结论前，多确认一步——任务是否真的完成了？有没有遗漏的细节？

                3. **如实反馈工作情况**：如果完成了修改，就说清楚改了哪里；要是任务没办法完成，也坦诚说明原因。

                4. **相信工具反馈**：工具给出的结果就是事实。比如工具说某个内容已经替换了，那它肯定就替换好了。
                """
            )
        ),
    )
    return agent, tools


async def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "question",
        nargs="?",
        default="请列出 ai-learn 目录下的文件",
    )
    args = parser.parse_args()

    logger = AgentLogger()
    agent, _ = build_agent()
    payload = {
        "messages": [
            HumanMessage(content=args.question),
        ]
    }

    logger.header("🤖 Code Agent 启动")
    logger.user_input(args.question)

    pending_tool_name = None
    pending_tool_input = None
    current_step_thought_printed = False
    final_response = ""

    try:
        async for event in agent.astream_events(
            payload, version="v2", config={"recursion_limit": 100}
        ):
            # Everything below parses the agent's event stream for terminal output
            ev = event.get("event")
            name = event.get("name", "")
            data = event.get("data", {}) or {}

            if ev == "on_chain_start" and name == "model":
                logger.increment_step()
                current_step_thought_printed = False

            elif ev == "on_chain_stream" and name == "model":
                chunk = data.get("chunk")
                if chunk and hasattr(chunk, "content") and chunk.content:
                    if not current_step_thought_printed:
                        logger.print_thought_header()
                        current_step_thought_printed = True
                    logger.stream_thought(chunk.content)

            elif ev == "on_chain_end" and name == "model":
                if current_step_thought_printed:
                    logger.end_thought()
                else:
                    output = data.get("output", {})
                    content_to_print = None

                    if "generations" in output:
                        generations = output.get("generations", [[]])
                        if generations and generations[0]:
                            msg = generations[0][0]
                            if hasattr(msg, "text") and msg.text:
                                content_to_print = msg.text
                            elif hasattr(msg, "message"):
                                content_to_print = str(msg.message.content)

                    elif "messages" in output:  # LangGraph model node output
                        messages = output["messages"]
                        if messages:
                            msg = messages[-1]
                            content_to_print = getattr(msg, "content", str(msg))

                    if content_to_print:
                        think_match = re.search(
                            r"<think>(.*?)</think>", content_to_print, re.DOTALL
                        )
                        if think_match:
                            thought = think_match.group(1).strip()
                            remaining = re.sub(
                                r"<think>.*?</think>",
                                "",
                                content_to_print,
                                flags=re.DOTALL,
                            ).strip()

                            logger.print_thought_header()
                            logger.stream_thought(thought)
                            logger.end_thought()

                            if remaining:
                                logger.llm_response(remaining)
                        else:
                            logger.llm_response(content_to_print)

            elif ev == "on_tool_start":
                pending_tool_name = name
                pending_tool_input = data.get("input", {})

                if name == "write_todos":
                    todos = pending_tool_input.get("todos", [])
                    logger.print_todo(todos)
                else:
                    logger.tool_call_start(name, pending_tool_input)

            elif ev == "on_tool_end":
                output = data.get("output", "")
                if hasattr(output, "content"):
                    output_str = output.content  # type: ignore
                else:
                    output_str = str(output)

                if name != "write_todos" and pending_tool_name != "write_todos":
                    logger.tool_call_end(pending_tool_name or name, output_str)

                pending_tool_name = None
                pending_tool_input = None

            elif ev == "on_chain_end":
                output = data.get("output") or {}
                if isinstance(output, dict):
                    messages = output.get("messages") or []
                    if messages:
                        last = messages[-1]
                        final_response = getattr(last, "content", str(last))

        if final_response:
            final_response = re.sub(
                r"<think>.*?</think>", "", final_response, flags=re.DOTALL
            ).strip()

        logger.final_result(final_response)
        logger.summary()

    except Exception as e:
        logger.error(str(e))


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

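agent_logger.py (gives the agent nicely formatted terminal output; main.py imports it with `from logger import AgentLogger`, so the file name and the import must match in your project)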

import json


class AgentLogger:
    COLORS = {
        "reset": "\033[0m",
        "bold": "\033[1m",
        "dim": "\033[2m",
        "red": "\033[31m",  # used by error(); without it errors print uncolored
        "green": "\033[32m",
        "yellow": "\033[33m",
        "blue": "\033[34m",
        "magenta": "\033[35m",
        "cyan": "\033[36m",
        "white": "\033[37m",
        "bg_blue": "\033[44m",
        "bg_magenta": "\033[45m",
    }

    def __init__(self, max_output_length: int = 500):
        self.max_output_length = max_output_length
        self.step_count = 0
        self.tool_call_count = 0

    def _color(self, text: str, color: str) -> str:
        return f"{self.COLORS.get(color, '')}{text}{self.COLORS['reset']}"

    def _truncate(self, text: str) -> str:
        text = str(text)
        if len(text) > self.max_output_length:
            return text[: self.max_output_length] + f"\n... (截断，共 {len(text)} 字符)"
        return text

    def _format_json(self, data: dict) -> str:
        try:
            return json.dumps(data, ensure_ascii=False, indent=2)
        except Exception:
            return str(data)

    def header(self, title: str):
        line = "═" * 60
        print(f"\n{self._color(line, 'cyan')}")
        print(self._color(f"  {title}", "bold"))
        print(f"{self._color(line, 'cyan')}\n")

    def user_input(self, question: str):
        print(self._color("┌─────────────────────────────────────────┐", "blue"))
        print(
            self._color("│  📥 用户输入", "bold")
            + self._color("                            │", "blue")
        )
        print(self._color("├─────────────────────────────────────────┤", "blue"))
        print(
            self._color("│  ", "blue")
            + self._truncate(question)[:39].ljust(39)
            + self._color("│", "blue")
        )
        print(self._color("└─────────────────────────────────────────┘", "blue") + "\n")

    def llm_thinking(self):
        self.step_count += 1
        print(self._color(f"\n{'─' * 50}", "dim"))
        print(self._color(f"  🧠 步骤 {self.step_count}: 模型思考中...", "magenta"))
        print(self._color(f"{'─' * 50}", "dim"))

    def increment_step(self):
        self.step_count += 1

    def print_thought_header(self):
        print(self._color(f"\n{'─' * 50}", "dim"))
        print(self._color(f"  🧠 步骤 {self.step_count}: 模型思考中...", "magenta"))
        print(self._color(f"{'─' * 50}", "dim"))
        print(self._color("  Thinking Process:", "dim"))

    def stream_thought(self, content: str):
        print(self._color(content, "white"), end="", flush=True)

    def end_thought(self):
        print("\n")

    def llm_response(self, content: str):
        print(self._color("\n  💭 模型响应:", "yellow"))
        print(self._color("  ┌" + "─" * 48 + "┐", "dim"))
        for line in self._truncate(content).split("\n"):
            print(self._color("  │ ", "dim") + line[:46])
        print(self._color("  └" + "─" * 48 + "┘", "dim"))

    def tool_call_start(self, tool_name: str, tool_input: dict):
        self.tool_call_count += 1
        print(self._color(f"\n  🔧 工具调用 #{self.tool_call_count}", "bold"))
        print(self._color("  ├─ 工具名称: ", "cyan") + self._color(tool_name, "yellow"))
        print(self._color("  ├─ 输入参数:", "cyan"))
        input_str = (
            self._format_json(tool_input)
            if isinstance(tool_input, dict)
            else str(tool_input)
        )
        for line in self._truncate(input_str).split("\n"):
            print(self._color("  │   ", "dim") + line)

    def tool_call_end(self, tool_name: str, tool_output: str):
        print(self._color("  └─ 输出结果:", "cyan"))
        for line in self._truncate(tool_output).split("\n"):
            print(self._color("      ", "dim") + line)

    def print_todo(self, todos: list):
        print(self._color("\n  📝 规划待办事项 (To-Do List):", "blue"))
        print(self._color("  ┌" + "─" * 48 + "┐", "dim"))
        for item in todos:
            content = item.get("content", "")
            status = item.get("status", "pending")

            # 选择图标
            if status == "completed":
                icon = "✅"
                color = "green"
            elif status == "in_progress":
                icon = "⏳"
                color = "yellow"
            else:
                icon = "⭕"
                color = "white"

            # Print the entry as-is; long entries are not truncated.
            print(self._color(f"  │ {icon} {content}", color))

        print(self._color("  └" + "─" * 48 + "┘", "dim"))

    def final_result(self, content: str):
        print("\n" + self._color("═" * 60, "green"))
        print(self._color("  ✅ 最终结果", "bold"))
        print(self._color("═" * 60, "green"))
        print(self._truncate(content))
        print(self._color("─" * 60, "dim"))

    def error(self, message: str):
        print(self._color(f"\n❌ 错误: {message}", "red"))

    def summary(self):
        print(self._color("\n📊 执行统计:", "cyan"))
        print(self._color(f"   • 总步骤数: {self.step_count}", "white"))
        print(self._color(f"   • 工具调用次数: {self.tool_call_count}", "white"))

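Walking Through the Build

Import dependencies
Bring in command-line parsing, regular expressions, the LangChain core components (agent, messages, tools), the custom tools (file operations, code editing, terminal execution), and the logger.
Build the agent (build_agent function)
- Initialize the LLM (MiniMax here);
- Register a set of tools for programming tasks: listing directories, searching text, running commands, editing files, and so on;
- Create the ReAct-style agent with create_agent and pass in a system prompt (SystemMessage) containing the project path and the behavioral guidelines, which require the agent to verify its changes, think step by step, and report honestly.
Start the main flow (main function)
- Parse the user's question (the default lists a directory);
- Initialize the logger that formats the output of the run;
- Wrap the question in a HumanMessage as the agent's initial input.
Listen to the agent's event stream
agent.astream_events streams the whole reasoning-and-acting process asynchronously, and each event type is handled separately:
- Model starts thinking: mark the start of a new step;
- Model streams output: print the LLM's "thinking" as it arrives (with extraction of <think>...</think> tags);
- Tool call: record the tool about to run and its arguments;
- Tool result: print the tool's actual output, which feeds the next round of reasoning;
- Final response: capture the agent's final answer.
Post-process and display
Strip the thinking tags from the final answer so the user sees only the clean result, then have the logger print the final answer and the execution summary (total steps and number of tool calls).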
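
The dispatch-and-cleanup logic described above can be exercised without a model. The sketch below is a simplification under stated assumptions: the events are hand-written dicts that only mimic the shape of `astream_events` output, plain strings stand in for message objects, and `summarize_events` and `strip_think` are hypothetical helper names.

```python
import re
from typing import Any, Dict, Iterable

THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks and surrounding whitespace."""
    return THINK_RE.sub("", text).strip()


def summarize_events(events: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Count model steps and tool calls, and keep the last message as the answer."""
    steps = 0
    tool_calls = 0
    final = ""
    for event in events:
        kind = event.get("event")
        name = event.get("name", "")
        data = event.get("data") or {}
        if kind == "on_chain_start" and name == "model":
            steps += 1  # a new reasoning step begins
        elif kind == "on_tool_start":
            tool_calls += 1
        elif kind == "on_chain_end":
            output = data.get("output") or {}
            messages = output.get("messages") if isinstance(output, dict) else None
            if messages:
                final = str(messages[-1])
    return {"steps": steps, "tool_calls": tool_calls, "final": strip_think(final)}


# Hand-written stand-ins for the events a real run would emit.
fake_events = [
    {"event": "on_chain_start", "name": "model"},
    {"event": "on_tool_start", "name": "view", "data": {"input": {"file_path": "main.py"}}},
    {"event": "on_tool_end", "name": "view", "data": {"output": "1: import os"}},
    {"event": "on_chain_start", "name": "model"},
    {
        "event": "on_chain_end",
        "name": "model",
        "data": {"output": {"messages": ["<think>check the file</think>Done."]}},
    },
]
summary = summarize_events(fake_events)
```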

Let's Try a Few Demos

Now we can run a few simple demos to see how the Coding Agent we built behaves.

python agent/code_agent/main.py '你是谁 你能做什么'

════════════════════════════════════════════════════════════
  🤖 Code Agent 启动
════════════════════════════════════════════════════════════

┌─────────────────────────────────────────┐
│  📥 用户输入                            │
├─────────────────────────────────────────┤
│  你是谁 你能做什么                              │
└─────────────────────────────────────────┘


════════════════════════════════════════════════════════════
  ✅ 最终结果
════════════════════════════════════════════════════════════
你好！我是 **ReAct Coding Agent**，一个基于 ReAct（Reasoning + Acting）推理模式的 AI 编程助手。

## 我能做什么

### 🛠️ 文件操作
- **查看文件** - 读取文件内容，支持指定行范围
- **创建文件** - 创建新的文本文件（自动创建父目录）
- **编辑文件** - 进行字符串替换、插入内容
- **搜索文件** - 使用正则表达式搜索文件内容

### 📁 目录操作
- **列出文件** - 查看目录结构，支持 glob 模式过滤
- **树形显示** - 以树状结构展示目录层级

### 💻 命令执行
- **执行 Bash 命令** - 运行各种命令行操作（如安装依赖、启动服务、Git 操作等）

## 使用方式

你可以用自然语言告诉我你想要做什么，例如：

- "帮我查看 src 目录下的所有 TypeScript 文件"
- "搜索包含 'TODO' 的代码"
- "创建一个新的 React 组件文件"
- "启动开发服务器"

我会理解你的意图，选择合适的工具来完成任务。有什么我可以帮你的吗？
────────────────────────────────────────────────────────────

📊 执行统计:
   • 总步骤数: 1
   • 工具调用次数: 0
   
python agent/code_agent/main.py '给main.py 72行代码加一行注释'


════════════════════════════════════════════════════════════
  🤖 Code Agent 启动
════════════════════════════════════════════════════════════

┌─────────────────────────────────────────┐
│  📥 用户输入                            │
├─────────────────────────────────────────┤
│  给main.py 72行代码加一行注释                    │
└─────────────────────────────────────────┘


  🔧 工具调用 #1
  ├─ 工具名称: view
  ├─ 输入参数:
  │   {
  │     "file_path": "/Users/Documents/ai-learn/main.py"
  │   }
  └─ 输出结果:
      1: import asyncio
      2: import os
      3: import sys
      4: 
      5: print(
      6:     sys.version, sys.platform, sys.executable, os.path.join(os.path.dirname(__file__))
      7: )
      8: 
      9: # from langchain_core.callbacks import BaseCallbackHandler
      10: # from langchain_classic.chains.summarize import load_summarize_chain
      11: # from langchain_classic.memory import ConversationSummaryBufferMemory, VectorStoreRetrieverMemory
      12: # from langgraph.checkpoint.memory import MemorySaver
      13: # from langchain.chat_models import init_
      ... (截断，共 2190 字符)

  🔧 工具调用 #2
  ├─ 工具名称: str_replace
  ├─ 输入参数:
  │   {
  │     "file_path": "/Users/Documents/ai-learn/main.py",
  │     "target": "if __name__ == \"__main__\":\n    asyncio.run(main())",
  │     "replacement": "if __name__ == \"__main__\":\n    # 程序入口：启动异步事件循环并运行 main() 函数\n    asyncio.run(main())"
  │   }
  └─ 输出结果:
      Replaced 1 occurrence(s) of 'if __name__ == "__main__":
          asyncio.run(main())'. (已替换 1 处 'if __name__ == "__main__":
          asyncio.run(main())')

  🔧 工具调用 #3
  ├─ 工具名称: view
  ├─ 输入参数:
  │   {
  │     "file_path": "/Users/Documents/ai-learn/main.py",
  │     "start_line": 70,
  │     "end_line": 75
  │   }
  └─ 输出结果:
      70: 
      71: if __name__ == "__main__":
      72:     # 程序入口：启动异步事件循环并运行 main() 函数
      73:     asyncio.run(main())

════════════════════════════════════════════════════════════
  ✅ 最终结果
════════════════════════════════════════════════════════════
已完成！在第 72 行添加了注释：

if __name__ == "__main__":
    # 程序入口：启动异步事件循环并运行 main() 函数
    asyncio.run(main())

注释说明了这是程序入口点，用于启动异步事件循环并执行 `main()` 函数。
────────────────────────────────────────────────────────────

📊 执行统计:
   • 总步骤数: 4
   • 工具调用次数: 3

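Why Is It So Simple?

What create_agent builds in LangChain v1 is no longer a simple chain but a state machine (StateGraph) driven by langgraph under the hood. The graph defines the agent's full lifecycle: think, act, think again, until it reaches the end.

Tool calling and stopping:
The LLM no longer receives natural-language descriptions of the tools. It receives a structured JSON Schema (tool name, parameters, types, and so on) and uses its native function-calling ability to emit tool_calls objects directly. The framework executes the tools and decides whether to keep reasoning, which removes the dependence on parsing ReAct text formats (such as "Action:") and on stop tokens (such as "\nObservation:").
Robustness and composability:
Intermediate state lives in a structured AgentState, and tool logic is wrapped in a separate ToolNode. This replaces the error-prone string concatenation (agent_scratchpad) of the manual example at the top of this article, making the flow more reliable, easier to debug, and easier to combine with other agents or embed in a larger workflow.
Flexible orchestration:
The core of LangGraph is its orchestration design: the LLM (the system's "brain") and the tools (its "hands and feet") are connected as a flowchart-like state machine, producing an agent that drives itself through a task. The design keeps the system easy to follow and makes it more flexible and extensible.

Below is a minimal demo of an agent driven by langgraph. For a complete open-source project to study, see react-agent on GitHub.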
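
To make the "structured schema in, structured tool call out" idea concrete, here is a simplified standard-library illustration. It is not how LangChain builds its schemas: `tool_schema`, `run_tool_call`, `REGISTRY`, and the toy `view` function are all hypothetical names, and only a few primitive types are mapped.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a JSON-Schema-style description from a function's signature."""
    hints = get_type_hints(func)
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the model must supply it
    return {
        "name": func.__name__,
        "description": (func.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def view(file_path: str, start_line: int = 1) -> str:
    """Show a file starting at the given line."""
    return f"viewing {file_path} from line {start_line}"


REGISTRY = {"view": view}


def run_tool_call(call: Dict[str, Any]) -> str:
    """Execute one structured tool call of the form {"name": ..., "args": {...}}."""
    return REGISTRY[call["name"]](**call["args"])


schema = tool_schema(view)
result = run_tool_call({"name": "view", "args": {"file_path": "main.py", "start_line": 70}})
```

Because the model's output is already a name plus an argument dict, executing it is a dictionary lookup and a function call; no text such as "Action:" has to be parsed.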

from datetime import datetime
from typing import Annotated, Sequence, TypedDict

from langchain_core.messages import AIMessage, BaseMessage, HumanMessage
from langchain_core.tools import tool
from langchain_ollama import ChatOllama
from langgraph.graph import END, START, StateGraph
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode


# ==========================================
# 1. 定义工具 (Tools)
# ==========================================
@tool
def get_current_time() -> str:
    """
    获取当前实时,历史,未来时间的工具。

    此工具用于获取当前的时间，适用于需要知道当前时间的场景，包括不限于年份、月份、日期、小时、分钟、秒等。
    此外如果用户提到去年、今年、上个月、本月等时间范围，也需要使用此工具获取相关时间。

    返回：
    str: 当前时间的字符串表示。
    """
    return str(datetime.now())


# Tool list, passed both to the LLM (bind_tools) and to ToolNode
tools = [get_current_time]

# ==========================================
# 2. 初始化 LLM 并绑定工具
# ==========================================
# 使用 Ollama 作为本地模型
llm = ChatOllama(model="qwen3:14b", temperature=0)
llm_with_tools = llm.bind_tools(tools)


# ==========================================
# 3. 定义 Agent 的状态
# ==========================================
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]


# ==========================================
# 4. 定义核心节点
# ==========================================
def agent_node(state: AgentState):
    # 真实调用 LLM（携带了已绑定的工具信息）
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}


# 真实项目中，我们不再需要手写 tool_node 函数去解析和执行工具
# LangGraph 提供了一个内置的 ToolNode 专门做这件事
tool_node = ToolNode(tools)


# ==========================================
# 5. 定义条件路由
# ==========================================
def should_continue(state: AgentState):
    last_message = state["messages"][-1]

    # 如果最后一条消息没有工具调用，就结束
    if not isinstance(last_message, AIMessage) or not last_message.tool_calls:
        return "end"

    # 否则，继续执行工具
    return "continue"


# ==========================================
# 6. 构建图并编译
# ==========================================
workflow = StateGraph(AgentState)

workflow.add_node("agent", agent_node)  # 决策节点
workflow.add_node("action", tool_node)  # 行动节点

workflow.add_edge(START, "agent")

workflow.add_conditional_edges(  # 决策节点根据 should_continue 函数判断是否继续执行工具
    "agent",
    should_continue,
    {
        "continue": "action",
        "end": END,
    },
)

workflow.add_edge("action", "agent")  # 从工具节点返回到决策节点, 形成循环

# 编译成可执行的 Runnable
agent_runnable = workflow.compile()

# ==========================================
# 7. 测试运行入口
# ==========================================
if __name__ == "__main__":
    print("================= 开始测试 Agent =================")
    # 1. 模拟用户输入
    user_input = "明年是哪一年？"
    print(f"User: {user_input}\n")

    # 2. 构造初始状态
    inputs = {"messages": [HumanMessage(content=user_input)]}

    # 3. 运行 Agent 并流式打印中间过程
    for event in agent_runnable.stream(inputs, stream_mode="values"):  # type: ignore
        # stream_mode="values" 会在图的每一次状态更新时触发
        last_msg = event["messages"][-1]
        last_msg.pretty_print()

    print("================= Agent 运行结束 =================")
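
Stripped of the framework, the cycle this graph encodes is a plain loop. The sketch below is a stand-in rather than LangGraph itself: `scripted_model` replaces the LLM with a fixed script, messages are plain dicts, the tool returns a fixed timestamp so the run is deterministic, and all names are hypothetical.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def get_current_time() -> str:
    """Fixed timestamp so the example is deterministic."""
    return "2025-01-01 00:00:00"


TOOLS: Dict[str, Callable[[], str]] = {"get_current_time": get_current_time}


def scripted_model(messages: List[Message]) -> Message:
    """Stand-in for the LLM: request the tool once, then answer from its result."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "ai", "content": f"Current time: {last['content']}", "tool_calls": []}
    return {"role": "ai", "content": "", "tool_calls": [{"name": "get_current_time"}]}


def run_loop(messages: List[Message], max_steps: int = 10) -> List[Message]:
    """agent -> (tool calls? -> action -> agent) -> end."""
    for _ in range(max_steps):
        reply = scripted_model(messages)  # the "agent" node
        messages = messages + [reply]
        if not reply["tool_calls"]:  # the conditional edge chooses "end"
            return messages
        for call in reply["tool_calls"]:  # the "action" node
            output = TOOLS[call["name"]]()
            messages = messages + [{"role": "tool", "content": output}]
    raise RuntimeError("step limit reached")


history = run_loop([{"role": "human", "content": "What time is it?"}])
```

The `max_steps` guard plays the role that `recursion_limit` plays in a real graph run: it stops a model that never stops asking for tools.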

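MCP (Model Context Protocol) Support

LangChain provides a library for MCP support, langchain-mcp-adapters, which works with the stdio, sse, and streamable_http transports. It wraps every tool an MCP server exposes as a standard executable Tool, so plugging an MCP server into your own agent takes little effort.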

from langchain_core.tools import BaseTool
from langchain_mcp_adapters.client import (
    MultiServerMCPClient,
    StreamableHttpConnection,
)


async def load_mcp_tools() -> list[BaseTool]:
    client = MultiServerMCPClient(
        {
            # https://modelscope.cn/mcp/servers/harykali/7daysfoodHelperV3.0/tools
            "7daysfoodHelperV3.0": StreamableHttpConnection(
                transport="streamable_http",
                url="https://mcp.api-inference.modelscope.net/xxxxxxxxxx/mcp",
            )
        }
    )
    tools = await client.get_tools()
    return tools


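if __name__ == "__main__":
    import asyncio

    tools = asyncio.run(load_mcp_tools())
    print(tools)

Then merge the MCP tools into build_agent in main.py. The function becomes async, so main() has to call it with await:

async def build_agent():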
    llm = init_chat_model("minimax")

    mcp_tools = await load_mcp_tools()

    tools = [
        ls_tool,
        grep_tool,
        tree_tool,
        bash_tool,
        create_tool,
        insert_tool,
        str_replace_tool,
        view_tool,
        *mcp_tools,
    ]

    agent = create_agent(
        llm,
        tools=tools,
        middleware=[TodoListMiddleware()],
        # system_prompt: same as in main.py above
    )
    return agent, tools

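The code above integrates an MCP server called 七日餐饮助手3.0 (MCP&Agent挑战赛), which provides recipe lookup. The run below shows the agent calling a tool provided by that MCP server.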


════════════════════════════════════════════════════════════
  🤖 Code Agent 启动
════════════════════════════════════════════════════════════

┌─────────────────────────────────────────┐
│  📥 用户输入                            │
├─────────────────────────────────────────┤
│  洋葱炒鸡蛋的做法                               │
└─────────────────────────────────────────┘


  🔧 工具调用 #1
  ├─ 工具名称: search_recipe
  ├─ 输入参数:
  │   {'recipe_name': '洋葱炒鸡蛋', 'runtime': ToolRuntime(state={'messages': [HumanMessage(content='洋葱炒鸡蛋的做法', additional_kwargs={}, response_metadata={}, id='a90cb23d-b654-4392-a287-dfabc9cee58c'), AIMessage(content='', additional_kwargs={}, response_metadata={'finish_reason': 'tool_calls', 'model_name': 'minimax/minimax-m2.5', 'model_provider': 'openai'}, id='lc_run--019d4e2f-ecb5-7122-83e6-fab98da9cbd8', tool_calls=[{'name': 'search_recipe', 'args': {'recipe_name': '洋葱炒鸡蛋'}, 'id': 'call_1180b5fe70404c3
  │   ... (截断，共 4056 字符)
  └─ 输出结果:
      [{'type': 'text', 'text': '🍽️ 洋葱炒鸡蛋 + 米饭 详细做法\n══════════════════════════════════════════════════\n\n洋葱炒鸡蛋的做法\n洋葱炒鸡蛋，是中国的一道日常生活中所熟知的菜品\n预估烹饪难度：★★\n必备原料和工具\n* 鸡蛋\n* 洋葱\n* 食用油\n* 盐\n* 葱\n* 料酒\n计算\n每份：\n* 鸡蛋 2 个\n* 洋葱 50 g\n* 食用油 50 ml\n* 盐 2 g\n* 葱 半 根\n* 料酒 2 ml\n操作\n* 鸡蛋打入大碗中，加入洋葱片、盐后搅拌 60 S\n* 起锅烧油，倒入鸡蛋，一面煎炸 30-45 S ，翻面继续翻炒，反复 2-3 分钟 后散上料酒出锅\n* 鸡蛋装盘，散上葱花\n!示例菜成品\n附加内容\n如果您遵循本指南的制作流程而发现有问题或可以改进的流程，请提出 Issue 或 Pull request 。', 'id': 'lc_9d00e506-667c-47ee-95e4-9d13e6d3176b'}]

  💭 模型响应:
  ┌────────────────────────────────────────────────┐
  │ # 🍳 洋葱炒鸡蛋的做法
  │ 
  │ 这是一道简单美味的家常菜，预估烹饪难度：★★
  │ 
  │ ## 必备原料
  │ 
  │ | 食材 | 用量 |
  │ |------|------|
  │ | 鸡蛋 | 2个 |
  │ | 洋葱 | 50g |
  │ | 食用油 | 50ml |
  │ | 盐 | 2g |
  │ | 葱 | 半根 |
  │ | 料酒 | 2ml |
  │ 
  │ ## 烹饪步骤
  │ 
  │ 1. **准备蛋液**：鸡蛋打入大碗中，加入洋葱片、盐后搅拌60秒
  │ 
  │ 2. **煎炒**：起锅烧油，倒入鸡蛋，一面煎炸30-45秒，翻面继续翻炒，反复2-3分钟后
  │ 
  │ 3. **装盘**：鸡蛋装盘，撒上葱花即可
  │ 
  │ ## 小贴士
  │ 
  │ - 洋葱可以切丝或切块，根据个人喜好调整
  │ - 炒鸡蛋时火不要太大，以免鸡蛋炒老
  │ - 料酒可以去除蛋腥味，提升香味
  │ 
  │ 祝您烹饪愉快！🥢
  └────────────────────────────────────────────────┘

════════════════════════════════════════════════════════════
  ✅ 最终结果
════════════════════════════════════════════════════════════
# 🍳 洋葱炒鸡蛋的做法

这是一道简单美味的家常菜，预估烹饪难度：★★

## 必备原料

| 食材 | 用量 |
|------|------|
| 鸡蛋 | 2个 |
| 洋葱 | 50g |
| 食用油 | 50ml |
| 盐 | 2g |
| 葱 | 半根 |
| 料酒 | 2ml |

## 烹饪步骤

1. **准备蛋液**：鸡蛋打入大碗中，加入洋葱片、盐后搅拌60秒

2. **煎炒**：起锅烧油，倒入鸡蛋，一面煎炸30-45秒，翻面继续翻炒，反复2-3分钟后撒上料酒出锅

3. **装盘**：鸡蛋装盘，撒上葱花即可

## 小贴士

- 洋葱可以切丝或切块，根据个人喜好调整
- 炒鸡蛋时火不要太大，以免鸡蛋炒老
- 料酒可以去除蛋腥味，提升香味

祝您烹饪愉快！🥢
────────────────────────────────────────────────────────────

📊 执行统计:
   • 总步骤数: 2
   • 工具调用次数: 1

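To-do List Support (Planning Complex Tasks)

The figure below shows the To-do planning list in Trae.

In the definition of an agent, planning is the core capability that makes it "intelligent" and the key to completing complex tasks efficiently. Planning means that, given a goal, the agent breaks the task down, orders the steps, anticipates risks, and lays out an executable path. It does not simply follow instructions; like a person, it looks ahead, works methodically, and adjusts its strategy as the situation changes.

Implementing planning in LangChain is straightforward: the built-in TodoListMiddleware adds a to-do list with a single line of code. If you read its source, you will see that it essentially implements a tool named write_todos.

Using TodoListMiddleware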

from langchain.agents import create_agent
from langchain.agents.middleware import TodoListMiddleware

async def build_agent():
    llm = init_chat_model("minimax")

    mcp_tools = await load_mcp_tools()

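    agent = create_agent(
        llm,
        tools=tools,  # the same tool list as before, including *mcp_tools
        middleware=[TodoListMiddleware()],  # registers the write_todos tool
        # system_prompt: same as in main.py above
    )
    return agent, tools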

Excerpts from the TodoListMiddleware source

WRITE_TODOS_SYSTEM_PROMPT = """## `write_todos`

You have access to the `write_todos` tool to help you manage and plan complex objectives.
Use this tool for complex objectives to ensure that you are tracking each necessary step and giving the user visibility into your progress.
This tool is very helpful for planning complex objectives, and for breaking down these larger complex objectives into smaller steps.

It is critical that you mark todos as completed as soon as you are done with a step. Do not batch up multiple steps before marking them as completed.
For simple objectives that only require a few steps, it is better to just complete the objective directly and NOT use this tool.
Writing todos takes time and tokens, use it when it is helpful for managing complex many-step problems! But not for simple few-step requests.

## Important To-Do List Usage Notes to Remember
- The `write_todos` tool should never be called multiple times in parallel.
- Don't be afraid to revise the To-Do list as you go. New information may reveal new tasks that need to be done, or old tasks that are irrelevant."""  # noqa: E501

@tool(description=WRITE_TODOS_TOOL_DESCRIPTION)
def write_todos(
    todos: list[Todo], tool_call_id: Annotated[str, InjectedToolCallId]
) -> Command[Any]:
    """Create and manage a structured task list for your current work session."""
    return Command(
        update={
            "todos": todos,
            "messages": [ToolMessage(f"Updated todo list to {todos}", tool_call_id=tool_call_id)],
        }
    )
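
To make the mechanism concrete, here is a minimal, framework-free sketch of what a write_todos-style tool does: it overwrites the stored list with whatever the model sends and records a confirmation message, much like the Command(update=...) above. The `Todo` shape (a `content` string plus a `status` of `pending`, `in_progress`, or `completed`), the `AgentState` dict, and the `apply_write_todos` helper are assumptions made for illustration, not LangChain's actual classes.

```python
from typing import Literal, TypedDict


class Todo(TypedDict):
    """One to-do item; this shape is assumed for illustration."""
    content: str
    status: Literal["pending", "in_progress", "completed"]


class AgentState(TypedDict):
    """Hypothetical agent state: the current plan plus a message log."""
    todos: list[Todo]
    messages: list[str]


def apply_write_todos(state: AgentState, todos: list[Todo]) -> AgentState:
    """Replace the whole to-do list and append a confirmation message.

    The tool does not patch individual items; it stores the full list
    that the model sent in this call.
    """
    return {
        "todos": list(todos),
        "messages": state["messages"] + [f"Updated todo list to {todos}"],
    }


state: AgentState = {"todos": [], "messages": []}

# First call: the model writes its initial plan.
state = apply_write_todos(state, [
    {"content": "Read the system prompt", "status": "in_progress"},
    {"content": "Translate it into Japanese", "status": "pending"},
])

# Later call: the model resends the full list with updated statuses.
state = apply_write_todos(state, [
    {"content": "Read the system prompt", "status": "completed"},
    {"content": "Translate it into Japanese", "status": "in_progress"},
])

print([todo["status"] for todo in state["todos"]])
```

Because every call carries the complete list, the model can add, drop, or reorder tasks in one step, which matches the prompt's advice to revise the list as new information arrives.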

Verifying the Result

Give the agent a complex request; the model then decides, based on the complexity, whether to use task planning.

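python agent/code_agent/main.py 'Create a new folder in the directory, then read the system_prompt prompt in the project, translate it into Japanese, Chinese, Arabic, and Korean, and finally generate one md file for each of the four languages'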

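📝 Planned to-dos (To-Do List):
  ┌────────────────────────────────────────────────┐
  │ ⏳ Inspect the project structure and locate agent/code_agent/main.py
  │ ⭕ Create a new folder in the directory to hold the translation files
  │ ⭕ Read the system_prompt prompt in main.py
  │ ⭕ Translate the prompt into Japanese and generate an md file
  │ ⭕ Translate the prompt into Chinese and generate an md file
  │ ⭕ Translate the prompt into Arabic and generate an md file
  │ ⭕ Translate the prompt into Korean and generate an md file
  └────────────────────────────────────────────────┘
  
  📝 Planned to-dos (To-Do List):
  ┌────────────────────────────────────────────────┐
  │ ✅ Inspect the project structure and locate agent/code_agent/main.py
  │ ⏳ Create a new folder in the directory to hold the translation files
  │ ✅ Read the system_prompt prompt in main.py
  │ ⭕ Translate the prompt into Japanese and generate an md file
  │ ⭕ Translate the prompt into Chinese and generate an md file
  │ ⭕ Translate the prompt into Arabic and generate an md file
  │ ⭕ Translate the prompt into Korean and generate an md file
  └────────────────────────────────────────────────┘
  
  📝 Planned to-dos (To-Do List):
  ┌────────────────────────────────────────────────┐
  │ ✅ Inspect the project structure and locate agent/code_agent/main.py
  │ ✅ Create a new folder in the directory to hold the translation files
  │ ✅ Read the system_prompt prompt in main.py
  │ ⏳ Translate the prompt into Japanese and generate an md file
  │ ⭕ Translate the prompt into Chinese and generate an md file
  │ ⭕ Translate the prompt into Arabic and generate an md file
  │ ⭕ Translate the prompt into Korean and generate an md file
  └────────────────────────────────────────────────┘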
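
A boxed list like the one above can be produced with a few lines of plain Python. This is a sketch rather than the author's actual printing code: the `render_todos` helper, the `STATUS_ICONS` table, and the mapping from status to icon (⭕ pending, ⏳ in progress, ✅ completed) are assumptions inferred from the output.

```python
# Assumed mapping from to-do status to the icon shown in the terminal.
STATUS_ICONS = {
    "pending": "⭕",
    "in_progress": "⏳",
    "completed": "✅",
}


def render_todos(todos: list[dict], width: int = 48) -> str:
    """Render a to-do list as a boxed block, one line per item."""
    lines = ["📝 To-Do List:", "  ┌" + "─" * width + "┐"]
    for todo in todos:
        # Fall back to "?" for any status the table does not know about.
        icon = STATUS_ICONS.get(todo["status"], "?")
        lines.append(f"  │ {icon} {todo['content']}")
    lines.append("  └" + "─" * width + "┘")
    return "\n".join(lines)


todos = [
    {"content": "Inspect the project structure", "status": "completed"},
    {"content": "Create a folder for the translations", "status": "in_progress"},
    {"content": "Translate the prompt into Japanese", "status": "pending"},
]
print(render_todos(todos))
```

Calling such a renderer each time the list is rewritten gives the user the step-by-step progress view shown in the run above.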

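Planning frees an Agent from merely "passively executing instructions" and gives it the ability to "think ahead and coordinate the work". Without it, an Agent facing a complex task tends to flounder: it either skips key steps or introduces logical gaps, and cannot reach the goal efficiently or accurately.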

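Supporting Agent Skills

See my other article: juejin.cn/post/762329…

Final Thoughts

This article is written from a learner's perspective and focuses on the basic implementation logic of a Coding Agent, so that beginners (the author is one too) can understand how an agent works through hands-on practice, pick up the basics of building one, and lay some groundwork for studying more complex agent development later. As large models keep iterating and the Agent ecosystem matures, actively learning AI and mastering the basics of agent development is becoming increasingly necessary.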