Learning the OpenManus Framework with Cursor

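Version: 0.3.0 Created: January 20, 2026

📚 Table of Contents

Framework Overview
Core Architecture
Core Components in Detail
Workflow
Tool System
Configuration System
Practical Examples
Advanced Features
Best Practices
FAQ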

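1. Framework Overview

1.1 What Is OpenManus?

OpenManus is an open-source, general-purpose AI agent framework designed to handle a wide range of complex tasks. It is built on the ReAct (Reasoning and Acting) paradigm and combines tool calling, browser automation, code execution, and other capabilities.

Key features:

✅ No invite code required: fully open source and available to anyone
✅ General-purpose: supports programming, information retrieval, file processing, web browsing, and other tasks
✅ Clear architecture: a layered design built on the ReAct paradigm
✅ Rich tool set: ships with several built-in tools and supports extension
✅ Sandbox support: an optional Docker sandbox isolates code execution from the host

1.2 Technology Stack

Core technologies:
├── Python 3.12+           # Primary development language
├── Pydantic 2.x           # Data validation and configuration management
├── OpenAI API             # LLM interface (supports multiple models)
├── AsyncIO                # Asynchronous programming
├── Browser-use            # Browser automation
├── Docker                 # Sandbox environment (optional)
└── MCP (Model Context Protocol)  # Protocol for external tool servers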

1.3 Project Structure

OpenManus-0.3.0/
├── app/                    # Core application code
│   ├── agent/             # Agent implementations
│   │   ├── base.py        # Agent base class
│   │   ├── react.py       # ReAct agent
│   │   ├── toolcall.py    # Tool-calling agent
│   │   ├── manus.py       # Main Manus agent
│   │   ├── browser.py     # Browser helper
│   │   └── mcp.py         # MCP agent
│   ├── tool/              # Tool collection
│   │   ├── base.py        # Tool base class
│   │   ├── python_execute.py      # Python execution
│   │   ├── str_replace_editor.py  # File editing
│   │   ├── browser_use_tool.py    # Browser tool
│   │   ├── web_search.py          # Web search
│   │   └── tool_collection.py     # Tool collection management
│   ├── sandbox/           # Sandbox environment
│   │   ├── client.py      # Sandbox client
│   │   └── core/          # Core sandbox implementation
│   ├── prompt/            # Prompt templates
│   ├── flow/              # Multi-agent flow orchestration
│   ├── llm.py             # LLM interface wrapper
│   ├── config.py          # Configuration management
│   ├── schema.py          # Data model definitions
│   └── logger.py          # Logging
├── config/                # Configuration files
│   └── config.toml        # Main configuration file
├── workspace/             # Workspace directory
├── main.py                # Main entry point
├── run_mcp.py             # MCP mode entry point
├── run_flow.py            # Multi-agent flow entry point
└── requirements.txt       # Dependency list

2. Core Architecture

2.1 Architecture Overview

OpenManus uses a layered architecture:

┌─────────────────────────────────────────────────┐
│              User Interaction Layer             │
│        (CLI / API / Web Interface)              │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│                   Agent Layer                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │  Manus   │  │   MCP    │  │  Other   │       │
│  │  Agent   │  │  Agent   │  │  Agents  │       │
│  └──────────┘  └──────────┘  └──────────┘       │
│         Based on the ReAct paradigm             │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│                   Tool Layer                    │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │  Python  │  │   File   │  │ Browser  │       │
│  │   Exec   │  │  Editor  │  │Automation│       │
│  └──────────┘  └──────────┘  └──────────┘       │
│  ┌──────────┐  ┌──────────┐                     │
│  │  Search  │  │Terminate │  ...                │
│  └──────────┘  └──────────┘                     │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│              Infrastructure Layer               │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │   LLM    │  │ Sandbox  │  │  Config  │       │
│  │Interface │  │   Env    │  │   Mgmt   │       │
│  └──────────┘  └──────────┘  └──────────┘       │
└─────────────────────────────────────────────────┘

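2.2 Design Patterns

2.2.1 The ReAct (Reasoning and Acting) Paradigm

ReAct is the core of OpenManus. Each step has two phases:

class ReActAgent(BaseAgent):
    async def step(self) -> str:
        """Run one complete ReAct step."""
        # 1. Think: reason about the current state and decide what to do
        should_act = await self.think()

        if not should_act:
            return "Thinking complete - no action needed"

        # 2. Act: carry out the chosen action
        return await self.act()

Execution flow:

User input → Think (analyze the task) → Act (run tools) →
Observe the result → Think (analyze the result) → Act (next operation) →
... repeat until the task is done or the maximum step count is reached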
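
The loop above can be sketched as a small runnable toy. This is only an illustration under stated assumptions: MiniReActAgent and CountdownAgent are hypothetical names invented here, no LLM is involved, and the "decision" is a simple counter check rather than anything OpenManus actually does.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List


class MiniReActAgent(ABC):
    """Toy stand-in for a ReAct agent (hypothetical, not the OpenManus class)."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self.finished = False

    @abstractmethod
    async def think(self) -> bool:
        """Return True when an action should follow."""

    @abstractmethod
    async def act(self) -> str:
        """Perform the action and return an observation."""

    async def step(self) -> str:
        # One ReAct step: think first, act only if thinking asks for it.
        if not await self.think():
            return "Thinking complete - no action needed"
        return await self.act()

    async def run(self) -> List[str]:
        # Repeat steps until the agent finishes or the step budget runs out.
        results = []
        for _ in range(self.max_steps):
            results.append(await self.step())
            if self.finished:
                break
        return results


class CountdownAgent(MiniReActAgent):
    """Hypothetical agent that 'acts' until its counter reaches zero."""

    def __init__(self, start: int) -> None:
        super().__init__(max_steps=10)
        self.remaining = start

    async def think(self) -> bool:
        if self.remaining == 0:
            self.finished = True
            return False
        return True

    async def act(self) -> str:
        self.remaining -= 1
        return f"acted, {self.remaining} left"


def demo() -> List[str]:
    return asyncio.run(CountdownAgent(start=2).run())
```

Calling demo() runs two acting steps followed by one thinking-only step, after which the loop exits because the agent marked itself finished.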

2.2.2 Tool-Calling Pattern

class ToolCallAgent(ReActAgent):
    async def think(self) -> bool:
        """Ask the LLM which tools to call."""
        response = await self.llm.ask_tool(
            messages=self.messages,
            tools=self.available_tools.to_params(),
            tool_choice=self.tool_choices,
        )
        # Normalize a missing tool_calls field to an empty list
        self.tool_calls = response.tool_calls or []
        return bool(self.tool_calls)

    async def act(self) -> str:
        """Run the tool calls and collect their results."""
        results = []
        for tool_call in self.tool_calls:
            result = await self.execute_tool(tool_call)
            results.append(result)
        return "\n\n".join(results)

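3. Core Components in Detail

3.1 Agent Components

3.1.1 BaseAgent (Base Agent)

Responsibility: provides the basic agent functionality and lifecycle management

Core attributes:

class BaseAgent(BaseModel, ABC):
    name: str                    # Agent name
    description: Optional[str]   # Agent description
    system_prompt: str           # System prompt
    next_step_prompt: str        # Next-step prompt
    llm: LLM                     # Language model instance
    memory: Memory               # Conversation memory
    state: AgentState            # Current state (IDLE/RUNNING/FINISHED/ERROR)
    max_steps: int = 10          # Maximum number of steps
    current_step: int = 0        # Current step counter

Core methods:

async def run(self, request: str) -> str:
    """Run the agent's main loop."""
    # 1. Record the user request
    # 2. Call step() repeatedly until finished or max_steps is exceeded
    # 3. Return the execution result

@abstractmethod
async def step(self) -> str:
    """Run a single step (implemented by subclasses)."""

State management:

# Context manager for state transitions
async with self.state_context(AgentState.RUNNING):
    # Run the task
    ...
# On exit, the previous state is restored or the error is handled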
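
A state-transition context manager of this kind can be sketched with contextlib.asynccontextmanager. The sketch is not the OpenManus implementation: StatefulAgent is a hypothetical class, and keeping the ERROR state visible after a failure (rather than restoring the previous state) is an assumption made here for illustration.

```python
import asyncio
from contextlib import asynccontextmanager
from enum import Enum
from typing import List


class AgentState(str, Enum):
    IDLE = "IDLE"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    ERROR = "ERROR"


class StatefulAgent:
    """Hypothetical agent that only tracks its state."""

    def __init__(self) -> None:
        self.state = AgentState.IDLE

    @asynccontextmanager
    async def state_context(self, new_state: AgentState):
        previous = self.state
        self.state = new_state
        try:
            yield
        except Exception:
            # Assumption: leave the agent in ERROR so the caller can see it.
            self.state = AgentState.ERROR
            raise
        else:
            # Normal exit: go back to whatever state we had before.
            self.state = previous


async def _demo() -> List[AgentState]:
    agent = StatefulAgent()
    seen = []
    async with agent.state_context(AgentState.RUNNING):
        seen.append(agent.state)       # inside the block
    seen.append(agent.state)           # after a clean exit
    try:
        async with agent.state_context(AgentState.RUNNING):
            raise RuntimeError("boom")
    except RuntimeError:
        seen.append(agent.state)       # after a failure
    return seen


def demo() -> List[AgentState]:
    return asyncio.run(_demo())
```

The design choice worth noting is that the state change and its cleanup live in one place, so the main loop cannot forget to reset the state on an early return or an exception.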

3.1.2 ReActAgent (ReAct Agent)

Responsibility: implements the think-act loop of the ReAct paradigm

Core methods:

@abstractmethod
async def think(self) -> bool:
    """Think: analyze the current state and decide whether to act."""

@abstractmethod
async def act(self) -> str:
    """Act: carry out the concrete operation."""

async def step(self) -> str:
    """Combine think and act into one step."""
    should_act = await self.think()
    if not should_act:
        return "Thinking complete - no action needed"
    return await self.act()

3.1.3 ToolCallAgent (Tool-Calling Agent)

Responsibility: manages tool selection and execution

Core attributes:

class ToolCallAgent(ReActAgent):
    available_tools: ToolCollection      # Available tools
    tool_choices: TOOL_CHOICE_TYPE       # Tool choice strategy (auto/required/none)
    special_tool_names: List[str]        # Special tool names (e.g. terminate)
    max_observe: Optional[int]           # Maximum length of an observation

Think phase:

async def think(self) -> bool:
    # 1. Build the messages (system prompt plus conversation history)
    system_msg = Message.system_message(self.system_prompt)
    messages = [system_msg] + self.messages

    # 2. Ask the LLM which tools to call
    response = await self.llm.ask_tool(
        messages=messages,
        tools=self.available_tools.to_params(),
        tool_choice=self.tool_choices,
    )

    # 3. Extract the tool calls (an absent field becomes an empty list)
    self.tool_calls = response.tool_calls or []

    # 4. Add the LLM response to memory
    self.memory.add_message(...)

    return bool(self.tool_calls)

Act phase:

async def act(self) -> str:
    results = []
    for tool_call in self.tool_calls:
        # 1. Run the tool
        result = await self.execute_tool(tool_call)

        # 2. Truncate overly long results
        if self.max_observe:
            result = result[:self.max_observe]

        # 3. Add the tool response to memory
        self.memory.add_message(
            Message.tool_message(
                content=result,
                tool_call_id=tool_call.id,
                name=tool_call.function.name
            )
        )
        results.append(result)

    return "\n\n".join(results)

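3.1.4 Manus (General-Purpose Agent)

Responsibility: the main OpenManus agent, providing general task-solving ability

Configuration:

class Manus(ToolCallAgent):
    name: str = "Manus"
    system_prompt: str = SYSTEM_PROMPT  # General-purpose system prompt
    max_steps: int = 20                  # Maximum number of steps
    max_observe: int = 10000            # Maximum observation length

    # Default tool set
    available_tools: ToolCollection = ToolCollection(
        PythonExecute(),        # Python code execution
        BrowserUseTool(),       # Browser automation
        StrReplaceEditor(),     # File editing
        Terminate()             # Termination tool
    )

Special feature - browser context enhancement:

async def think(self) -> bool:
    # Remember the original prompt so it can be restored afterwards
    original_prompt = self.next_step_prompt

    # Detect whether the browser is in use by checking recent tool calls
    # (the window of 3 messages is illustrative)
    recent_messages = self.memory.messages[-3:]
    browser_in_use = any(
        tc.function.name == BrowserUseTool().name
        for msg in recent_messages
        if msg.tool_calls
        for tc in msg.tool_calls
    )

    if browser_in_use:
        # Add browser state information to the prompt
        self.next_step_prompt = await self.browser_context_helper.format_next_step_prompt()

    result = await super().think()

    # Restore the original prompt
    self.next_step_prompt = original_prompt
    return result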

3.2 LLM Component

3.2.1 LLM Class Design

Responsibility: wraps interaction with large language models and supports multiple models and API types

Key features:

✅ Supports OpenAI, Azure OpenAI, AWS Bedrock, and Ollama
✅ Token counting and limits
✅ Automatic retries
✅ Streaming and non-streaming output
✅ Multimodal input (text + images)
✅ Tool calling

Initialization:

class LLM:
    def __init__(self, config_name: str = "default"):
        # Load the LLM settings from the configuration
        llm_config = config.llm[config_name]

        self.model = llm_config.model
        self.max_tokens = llm_config.max_tokens
        self.temperature = llm_config.temperature
        self.api_type = llm_config.api_type

        # Token tracking
        self.total_input_tokens = 0
        self.total_completion_tokens = 0
        self.max_input_tokens = llm_config.max_input_tokens

        # Initialize the tokenizer
        try:
            self.tokenizer = tiktoken.encoding_for_model(self.model)
        except KeyError:
            # Model name unknown to tiktoken: fall back to a general encoding
            self.tokenizer = tiktoken.get_encoding("cl100k_base")
        self.token_counter = TokenCounter(self.tokenizer)

        # Initialize the client
        if self.api_type == "azure":
            self.client = AsyncAzureOpenAI(...)
        elif self.api_type == "aws":
            self.client = BedrockClient()
        else:
            self.client = AsyncOpenAI(...)
Core methods:

Basic chat (ask):

@retry(...)
async def ask(
    self,
    messages: List[Message],
    system_msgs: Optional[List[Message]] = None,
    stream: bool = True,
    temperature: Optional[float] = None,
) -> str:
    # 1. Format the messages (handles images and similar content)
    formatted_messages = self.format_messages(messages, supports_images=True)

    # 2. Count tokens and check the limit
    input_tokens = self.count_message_tokens(formatted_messages)
    if not self.check_token_limit(input_tokens):
        raise TokenLimitExceeded(...)

    # 3. Call the LLM API (only the non-streaming path is shown)
    response = await self.client.chat.completions.create(...)

    # 4. Update the token statistics
    completion_tokens = response.usage.completion_tokens
    self.update_token_count(input_tokens, completion_tokens)

    # The method is declared to return str, so return the message text
    return response.choices[0].message.content

Tool calling (ask_tool):

@retry(...)
async def ask_tool(
    self,
    messages: List[Message],
    tools: List[dict],
    tool_choice: str = "auto",
    **kwargs
) -> ChatCompletionMessage:
    # 1. Validate tool_choice
    # 2. Format the messages and tools
    # 3. Count tokens (including the tool descriptions)
    # 4. Call the API
    # 5. Return the response, which contains the tool calls
    ...

Chat with images (ask_with_images):

@retry(...)
async def ask_with_images(
    self,
    messages: List[Message],
    images: List[str],  # base64 data or URLs
    **kwargs
) -> str:
    # Handle multimodal input
    ...

3.2.2 Token Management

Token counter:

class TokenCounter:
    def count_message_tokens(self, messages: List[dict]) -> int:
        total = FORMAT_TOKENS  # Fixed formatting overhead

        for message in messages:
            tokens = BASE_MESSAGE_TOKENS  # Per-message overhead
            tokens += self.count_text(message.get("role", ""))
            tokens += self.count_content(message["content"])

            if "tool_calls" in message:
                tokens += self.count_tool_calls(message["tool_calls"])

            total += tokens

        return total

Token limit check:

def check_token_limit(self, input_tokens: int) -> bool:
    if self.max_input_tokens is None:
        return True  # No limit configured

    # The limit applies to the cumulative input tokens across all calls
    return (self.total_input_tokens + input_tokens) <= self.max_input_tokens
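
Because the check adds the running total to the new request, the limit behaves as a budget for the whole session rather than a per-call cap. The following minimal sketch makes that visible; TokenBudget is a hypothetical helper written for this guide, and only the comparison itself mirrors the snippet above.

```python
from typing import Optional


class TokenBudget:
    """Hypothetical helper that mirrors the cumulative limit check."""

    def __init__(self, max_input_tokens: Optional[int]) -> None:
        self.max_input_tokens = max_input_tokens
        self.total_input_tokens = 0

    def check(self, input_tokens: int) -> bool:
        # None means no limit was configured.
        if self.max_input_tokens is None:
            return True
        return self.total_input_tokens + input_tokens <= self.max_input_tokens

    def record(self, input_tokens: int) -> None:
        # Called after a successful request to update the running total.
        self.total_input_tokens += input_tokens


budget = TokenBudget(max_input_tokens=100)
first_ok = budget.check(60)      # 0 + 60 <= 100
budget.record(60)
second_ok = budget.check(60)     # 60 + 60 > 100, rejected
smaller_ok = budget.check(40)    # 60 + 40 <= 100, still fits
```

A practical consequence is that a long-running agent can hit the limit even when every individual request is small, so the configured value should reflect the expected total for a task.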

3.3 Memory Component

Responsibility: manages the conversation history and context

Data structures:

class Message(BaseModel):
    role: str                           # "system" | "user" | "assistant" | "tool"
    content: Optional[str]              # Message content
    tool_calls: Optional[List[ToolCall]]  # Tool calls
    tool_call_id: Optional[str]         # Tool call ID
    name: Optional[str]                 # Tool name
    base64_image: Optional[str]         # Image data

class Memory(BaseModel):
    messages: List[Message] = Field(default_factory=list)
    max_messages: int = 100

Core methods:

def add_message(self, message: Message) -> None:
    """Add a message and trim the history automatically."""
    self.messages.append(message)
    if len(self.messages) > self.max_messages:
        self.messages = self.messages[-self.max_messages:]

def get_recent_messages(self, n: int) -> List[Message]:
    """Return the most recent n messages."""
    return self.messages[-n:]

def to_dict_list(self) -> List[dict]:
    """Convert to a list of dicts (for the LLM API)."""
    return [msg.to_dict() for msg in self.messages]
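
The trimming behavior is easy to check with a small standalone version. This sketch uses dataclasses instead of Pydantic so that it runs without dependencies; Msg and BoundedMemory are hypothetical names, and the guard for n <= 0 is an addition made here because the slice [-0:] would otherwise return the whole list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Msg:
    role: str
    content: str


@dataclass
class BoundedMemory:
    """Hypothetical memory that keeps only the newest max_messages entries."""

    max_messages: int = 100
    messages: List[Msg] = field(default_factory=list)

    def add_message(self, message: Msg) -> None:
        self.messages.append(message)
        # Drop the oldest entries once the cap is exceeded.
        if len(self.messages) > self.max_messages:
            self.messages = self.messages[-self.max_messages:]

    def get_recent_messages(self, n: int) -> List[Msg]:
        # Guard added in this sketch: messages[-0:] would return everything.
        return self.messages[-n:] if n > 0 else []


memory = BoundedMemory(max_messages=3)
for i in range(5):
    memory.add_message(Msg(role="user", content=f"m{i}"))

kept = [m.content for m in memory.messages]
recent = [m.content for m in memory.get_recent_messages(2)]
```

Note that this kind of trimming discards the oldest messages first, which in a long run can include the original user request.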

4. Workflow

4.1 End-to-End Execution Flow

┌─────────────────────────────────────────────────┐
│ 1. Initialization                               │
│    - Load configuration                         │
│    - Create the agent (Manus)                   │
│    - Initialize the LLM client                  │
└──────────────────┬──────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────────┐
│ 2. Receive user input                           │
│    prompt = input("Enter your prompt: ")        │
└──────────────────┬──────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────────┐
│ 3. Start the agent                              │
│    await agent.run(prompt)                      │
└──────────────────┬──────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────────┐
│ 4. Main loop (ReAct)                            │
│    while current_step < max_steps:              │
│        ├─ Think: LLM analysis and decision      │
│        │   - Build messages (system + history)  │
│        │   - Call the LLM API                   │
│        │   - Parse tool calls                   │
│        │                                        │
│        ├─ Act: execute tools                    │
│        │   - Iterate over the tool calls        │
│        │   - Execute each tool                  │
│        │   - Collect results                    │
│        │                                        │
│        └─ Update memory and state               │
│            - Save the LLM response              │
│            - Save tool results                  │
│            - Check for completion               │
└──────────────────┬──────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────────┐
│ 5. Clean up and return                          │
│    - Release resources (browser, etc.)          │
│    - Return the result                          │
└─────────────────────────────────────────────────┘

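4.2 Step-by-Step Breakdown

Step 1: Initialization

# main.py
async def main():
    # 1.1 Create the Manus agent
    agent = Manus()
    # What happens internally:
    # - Load the configuration (config.toml)
    # - Initialize the LLM (based on the configuration)
    # - Create the Memory
    # - Initialize the tool collection

    # 1.2 Read the user input
    prompt = input("Enter your prompt: ")

    # 1.3 Start execution
    await agent.run(prompt)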

Step 2: Agent Main Loop

# app/agent/base.py
async def run(self, request: str) -> str:
    # 2.1 Record the user message
    self.update_memory("user", request)

    # 2.2 Enter the RUNNING state
    async with self.state_context(AgentState.RUNNING):
        # 2.3 Main loop
        while self.current_step < self.max_steps:
            self.current_step += 1

            # 2.4 Run one step (Think + Act)
            step_result = await self.step()

            # 2.5 Detect repeated, stuck behavior
            if self.is_stuck():
                self.handle_stuck_state()

            # 2.6 Check whether the task is finished
            if self.state == AgentState.FINISHED:
                break

Step 3: Think Phase

# app/agent/toolcall.py
async def think(self) -> bool:
    # 3.1 Prepare the prompt
    if self.next_step_prompt:
        self.messages.append(Message.user_message(self.next_step_prompt))

    # 3.2 Call the LLM
    response = await self.llm.ask_tool(
        messages=self.messages,
        system_msgs=[Message.system_message(self.system_prompt)],
        tools=self.available_tools.to_params(),
        tool_choice=self.tool_choices,
    )

    # 3.3 Extract the tool calls (an absent field becomes an empty list,
    #     so len() below cannot fail on None)
    self.tool_calls = response.tool_calls or []
    content = response.content

    # 3.4 Log the decision
    logger.info(f"✨ Thoughts: {content}")
    logger.info(f"🛠️ Selected {len(self.tool_calls)} tools")

    # 3.5 Save to memory
    if self.tool_calls:
        msg = Message.from_tool_calls(content, self.tool_calls)
    else:
        msg = Message.assistant_message(content)
    self.memory.add_message(msg)

    return bool(self.tool_calls)

Step 4: Act Phase

# app/agent/toolcall.py
async def act(self) -> str:
    results = []

    # 4.1 Iterate over the tool calls
    for tool_call in self.tool_calls:
        # 4.2 Parse the arguments (empty arguments become an empty dict)
        args = json.loads(tool_call.function.arguments or "{}")

        # 4.3 Run the tool and convert the ToolResult to text
        result = str(await self.available_tools.execute(
            name=tool_call.function.name,
            tool_input=args
        ))

        # 4.4 Handle special tools (e.g. terminate)
        if tool_call.function.name in self.special_tool_names:
            self.state = AgentState.FINISHED

        # 4.5 Truncate overly long results
        if self.max_observe:
            result = result[:self.max_observe]

        # 4.6 Save to memory
        self.memory.add_message(
            Message.tool_message(
                content=result,
                tool_call_id=tool_call.id,
                name=tool_call.function.name
            )
        )

        results.append(result)

    return "\n\n".join(results)

4.3 Example Execution Trace

User input: "Create a Python script that computes the Fibonacci sequence"

Step 1 - Think:
  LLM analysis: a file must be created; use the str_replace_editor tool
  Decision: call str_replace_editor with the create command

Step 1 - Act:
  Execute: str_replace_editor(command="create", path="/workspace/fibonacci.py", file_text="...")
  Result: File created successfully at /workspace/fibonacci.py

Step 2 - Think:
  LLM analysis: the file exists and now needs to be tested
  Decision: call python_execute

Step 2 - Act:
  Execute: python_execute(code="exec(open('/workspace/fibonacci.py').read())")
  Result: [Fibonacci sequence output]

Step 3 - Think:
  LLM analysis: the task is complete
  Decision: call terminate

Step 3 - Act:
  Execute: terminate()
  State: FINISHED

5. Tool System

5.1 Tool Base Class

BaseTool design:

class BaseTool(ABC, BaseModel):
    name: str               # Tool name
    description: str        # Tool description (shown to the LLM)
    parameters: dict        # Parameter schema (JSON Schema)

    @abstractmethod
    async def execute(self, **kwargs) -> Any:
        """Core logic of the tool."""

    def to_param(self) -> Dict:
        """Convert to the OpenAI function-call format."""
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            }
        }

ToolResult data model:

class ToolResult(BaseModel):
    output: Any = None              # Normal output
    error: Optional[str] = None     # Error message
    base64_image: Optional[str] = None  # Image data
    system: Optional[str] = None    # System message

    def __str__(self):
        # __str__ must return a string, so convert non-string output
        return f"Error: {self.error}" if self.error else str(self.output)

5.2 Built-in Tools in Detail

5.2.1 PythonExecute (Python Execution)

Purpose: runs Python code in a separate process, isolated from the agent's own process

Parameters:

{
  "code": "print('Hello, World!')"
}

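How it works:

class PythonExecute(BaseTool):
    async def execute(self, code: str, timeout: int = 5) -> Dict:
        # Globals handed to the executed code (built-ins only)
        safe_globals = {"__builtins__": __builtins__}

        # 1. Use multiprocessing to create a child process
        with multiprocessing.Manager() as manager:
            result = manager.dict()

            # 2. Run the code in the child process
            proc = multiprocessing.Process(
                target=self._run_code,
                args=(code, result, safe_globals)
            )
            proc.start()
            proc.join(timeout)

            # 3. Handle a timeout
            if proc.is_alive():
                proc.terminate()
                proc.join(1)  # Reap the terminated process
                return {"observation": "Timeout", "success": False}

            return dict(result)

Notes:

✅ Only print() output is captured
❌ Function return values are not visible
⚠️ A timeout applies (5 seconds by default)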
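
The first two notes follow from how output is collected: only what the code writes to standard output is recorded. A minimal sketch of that capture mechanism is shown below. It is an assumption-laden illustration, not the tool's actual code: run_and_capture is a hypothetical function, it runs in the current process, and it has neither the subprocess isolation nor the timeout described above.

```python
import contextlib
import io


def run_and_capture(code: str) -> dict:
    """Execute code and return whatever it printed (hypothetical helper)."""
    buffer = io.StringIO()
    try:
        # Everything written to stdout while exec() runs lands in the buffer.
        with contextlib.redirect_stdout(buffer):
            exec(code, {"__name__": "__sandboxed__"})
    except Exception as exc:
        return {"observation": f"{type(exc).__name__}: {exc}", "success": False}
    return {"observation": buffer.getvalue(), "success": True}


silent = run_and_capture("1 + 1")            # expression value is discarded
printed = run_and_capture("print(1 + 1)")    # printed text is captured
failed = run_and_capture("1 / 0")            # exceptions become an error result
```

This is why code sent to the tool should print the values the agent needs to see: an expression evaluated without print() leaves an empty observation.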

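5.2.2 StrReplaceEditor (File Editing)

Purpose: view, create, and edit files and directories

Commands:

view: view a file or directory
create: create a file
str_replace: replace a string
insert: insert text
undo_edit: undo the last edit

Example 1 - view a file: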

{
  "command": "view",
  "path": "/workspace/example.py"
}

Example 2 - create a file:

{
  "command": "create",
  "path": "/workspace/new_file.py",
  "file_text": "print('Hello')"
}

Example 3 - replace content:

{
  "command": "str_replace",
  "path": "/workspace/example.py",
  "old_str": "def old_function():\n    pass",
  "new_str": "def new_function():\n    return True"
}

Implementation highlights:

# 1. Supports both sandbox and local modes
operator = self._get_operator()  # Chosen according to the configuration

# 2. History tracking
self._file_history[path].append(old_content)  # Enables undo

# 3. Truncation of long content
def maybe_truncate(content: str, max_len: int = 16000) -> str:
    if len(content) > max_len:
        return content[:max_len] + "<response clipped>"
    return content

# 4. Uniqueness check (str_replace)
occurrences = file_content.count(old_str)
if occurrences == 0:
    raise ToolError("No occurrence of old_str found")
if occurrences > 1:
    raise ToolError("Multiple occurrences - not unique")
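
The uniqueness check can be tried in isolation. The sketch below is a simplified, hypothetical replace_unique function that works on an in-memory string and raises ValueError instead of the tool's own error type; the rejection of a missing match is an assumption added here.

```python
def replace_unique(text: str, old: str, new: str) -> str:
    """Replace old with new only when old occurs exactly once."""
    occurrences = text.count(old)
    if occurrences == 0:
        raise ValueError("old string not found")
    if occurrences > 1:
        raise ValueError(f"old string occurs {occurrences} times; add more context")
    return text.replace(old, new)


source = "a = 1\nb = 2\n"
updated = replace_unique(source, "b = 2", "b = 3")


def rejects(text: str, old: str) -> bool:
    # Helper for the demonstration: True when the edit is refused.
    try:
        replace_unique(text, old, "y")
    except ValueError:
        return True
    return False
```

Refusing ambiguous matches forces the model to quote enough surrounding context to identify one location, which avoids silently editing the wrong occurrence.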

5.2.3 BrowserUseTool (Browser Automation)

Purpose: full browser control

Supported actions:

actions = [
    "go_to_url",              # Navigate to a URL
    "click_element",          # Click an element
    "input_text",             # Type text
    "scroll_down",            # Scroll down
    "scroll_up",              # Scroll up
    "scroll_to_text",         # Scroll to a piece of text
    "send_keys",              # Send key presses
    "get_dropdown_options",   # List dropdown options
    "select_dropdown_option", # Select a dropdown option
    "go_back",                # Go back
    "web_search",             # Web search
    "wait",                   # Wait
    "extract_content",        # Extract content
    "switch_tab",             # Switch tab
    "open_tab",               # Open a new tab
    "close_tab",              # Close a tab
]

Example 1 - navigate and click:

// Visit a website
{
  "action": "go_to_url",
  "url": "https://www.example.com"
}

// Click the interactive element with index 5
{
  "action": "click_element",
  "index": 5
}

Example 2 - extract content:

{
  "action": "extract_content",
  "goal": "Extract the name and price of every product on the page"
}

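Internal implementation (simplified):

async def execute(
    self,
    action: str,
    url: Optional[str] = None,
    goal: Optional[str] = None,
    **kwargs,
) -> ToolResult:
    # 1. Make sure the browser is initialized
    context = await self._ensure_browser_initialized()
    page = await context.get_current_page()

    # 2. Dispatch on the action
    if action == "go_to_url":
        await page.goto(url)
        return ToolResult(output=f"Navigated to {url}")

    elif action == "extract_content":
        # Use the LLM to extract content
        content = markdownify.markdownify(await page.content())
        prompt = f"Extract: {goal}\n\nPage: {content}"

        # extraction_function is the JSON schema of the expected output
        # (its definition is omitted here)
        response = await self.llm.ask_tool(
            messages=[{"role": "system", "content": prompt}],
            tools=[extraction_function],
            tool_choice="required"
        )

        # The extracted data arrives as the arguments of the tool call
        extracted_content = response.tool_calls[0].function.arguments
        return ToolResult(output=extracted_content)

    return ToolResult(error=f"Unknown action: {action}")

Reading the browser state:

async def get_current_state(self) -> ToolResult:
    """Return the current browser state (including a screenshot)."""
    context = await self._ensure_browser_initialized()
    page = await context.get_current_page()
    state = await context.get_state()
    screenshot = await page.screenshot(full_page=True)

    return ToolResult(
        output=json.dumps({
            "url": state.url,
            "title": state.title,
            "interactive_elements": state.element_tree.clickable_elements_to_string(),
            "scroll_info": {...}
        }),
        base64_image=base64.b64encode(screenshot).decode()
    )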

5.2.4 WebSearch (Web Search)

Purpose: web search across multiple engines with automatic fallback

Supported search engines:

Google
DuckDuckGo
Baidu
Bing

Configuration:

[search]
engine = "Google"                      # Primary search engine
fallback_engines = ["DuckDuckGo", "Baidu", "Bing"]  # Fallback engines
retry_delay = 60                       # Retry delay (seconds)
max_retries = 3                        # Maximum number of retries
lang = "en"                            # Language
country = "us"                         # Country

Automatic fallback:

async def execute(self, query: str, num_results: int = 5) -> SearchResponse:
    # 1. Try the primary engine
    try:
        return await self._search_with_engine(self.engine, query)
    except Exception as e:
        logger.warning(f"{self.engine} failed: {e}")

    # 2. Try the fallback engines in order
    for fallback in self.fallback_engines:
        try:
            return await self._search_with_engine(fallback, query)
        except Exception as e:
            # Avoid a bare except, which would also swallow cancellation
            logger.warning(f"{fallback} failed: {e}")
            continue

    # 3. Every engine failed
    raise SearchError("All search engines failed")
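
The fallback pattern can be exercised without any network access by plugging in fake engines. In this sketch every name is hypothetical (search_with_fallback, flaky_engine, working_engine), and the engines are plain async functions rather than real search clients.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

SearchFn = Callable[[str], Awaitable[List[str]]]


async def search_with_fallback(
    query: str, engines: Dict[str, SearchFn], order: List[str]
) -> Tuple[str, List[str]]:
    """Try each engine in order and return the first successful result."""
    errors = {}
    for name in order:
        try:
            return name, await engines[name](query)
        except Exception as exc:
            # Remember why this engine failed, then move on to the next one.
            errors[name] = repr(exc)
    raise RuntimeError(f"All search engines failed: {errors}")


async def flaky_engine(query: str) -> List[str]:
    raise ConnectionError("rate limited")


async def working_engine(query: str) -> List[str]:
    return [f"result for {query}"]


def demo() -> Tuple[str, List[str]]:
    engines = {"Google": flaky_engine, "DuckDuckGo": working_engine}
    return asyncio.run(
        search_with_fallback("openmanus", engines, ["Google", "DuckDuckGo"])
    )


def demo_all_fail() -> bool:
    try:
        asyncio.run(search_with_fallback("q", {"Google": flaky_engine}, ["Google"]))
    except RuntimeError:
        return True
    return False
```

Collecting the per-engine errors before raising keeps the final failure message useful for diagnosing whether the cause was rate limiting, networking, or configuration.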

5.2.5 Terminate (Termination)

Purpose: marks the task as complete and ends execution

Implementation:

class Terminate(BaseTool):
    name: str = "terminate"
    description: str = "Use this when the task is complete"
    parameters: dict = {"type": "object", "properties": {}}

    async def execute(self, **kwargs) -> ToolResult:
        return ToolResult(
            output="Task completed successfully. Agent will now terminate."
        )

5.3 Tool Collection Management

ToolCollection:

class ToolCollection:
    def __init__(self, *tools: BaseTool):
        self.tools = tools
        self.tool_map = {tool.name: tool for tool in tools}

    def to_params(self) -> List[Dict]:
        """Convert to the LLM API format."""
        return [tool.to_param() for tool in self.tools]

    async def execute(self, name: str, tool_input: Dict) -> ToolResult:
        """Run the named tool."""
        tool = self.tool_map.get(name)
        if not tool:
            return ToolFailure(error=f"Tool {name} not found")

        try:
            return await tool(**tool_input)
        except ToolError as e:
            return ToolFailure(error=e.message)
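
To see how a custom tool plugs into such a collection, the following sketch reimplements the idea with plain Python classes so it runs without Pydantic or OpenManus installed. SimpleTool, SimpleToolCollection, and WordCount are hypothetical names, and errors are returned as strings rather than ToolFailure objects.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SimpleTool(ABC):
    """Hypothetical, dependency-free stand-in for a tool base class."""

    name: str
    description: str
    parameters: dict

    @abstractmethod
    async def execute(self, **kwargs: Any) -> str:
        """Run the tool."""

    def to_param(self) -> Dict:
        # Same shape as the function-call format shown earlier.
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }


class WordCount(SimpleTool):
    name = "word_count"
    description = "Count the whitespace-separated words in a text"
    parameters = {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    }

    async def execute(self, text: str) -> str:
        return str(len(text.split()))


class SimpleToolCollection:
    def __init__(self, *tools: SimpleTool) -> None:
        self.tool_map = {tool.name: tool for tool in tools}

    def to_params(self) -> List[Dict]:
        return [tool.to_param() for tool in self.tool_map.values()]

    async def execute(self, name: str, tool_input: Dict) -> str:
        tool = self.tool_map.get(name)
        if tool is None:
            return f"Error: Tool {name} not found"
        try:
            return await tool.execute(**tool_input)
        except Exception as exc:
            # Report failures as text so the LLM can react to them.
            return f"Error: {exc}"


tools = SimpleToolCollection(WordCount())
count = asyncio.run(tools.execute("word_count", {"text": "a b c"}))
missing = asyncio.run(tools.execute("missing", {}))
```

Returning errors as observations instead of raising lets the agent loop continue and gives the model a chance to choose a different tool or fix its arguments.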

6. Configuration System

6.1 Configuration File Structure

config/config.toml:

# ============= LLM settings =============
[llm]
model = "gpt-4o"
base_url = "https://api.openai.com/v1"
api_key = "sk-..."
max_tokens = 4096
max_input_tokens = 100000    # Optional: cap on total input tokens
temperature = 0.0
api_type = "openai"           # openai | azure | aws | ollama
api_version = ""              # Required for Azure

# Optional: LLM settings for a specific purpose
[llm.vision]
model = "gpt-4o"
api_key = "sk-..."

# ============= Sandbox settings =============
[sandbox]
use_sandbox = false
image = "python:3.12-slim"
work_dir = "/workspace"
memory_limit = "512m"
cpu_limit = 1.0
timeout = 300
network_enabled = false

# ============= Browser settings =============
[browser]
headless = false
disable_security = true
extra_chromium_args = []
max_content_length = 2000

# Browser proxy (optional)
[browser.proxy]
server = "http://proxy.example.com:8080"
username = "user"
password = "pass"

# ============= Search settings =============
[search]
engine = "Google"
fallback_engines = ["DuckDuckGo", "Baidu", "Bing"]
retry_delay = 60
max_retries = 3
lang = "en"
country = "us"

# ============= MCP settings =============
[mcp]
server_reference = "app.mcp.server"

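6.2 Configuration Loading

Config class (singleton pattern):

class Config:
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # __init__ runs on every Config() call, so load the file only once
        if getattr(self, "_initialized", False):
            return

        # 1. Locate the configuration file
        config_path = self._get_config_path()

        # 2. Load the TOML
        with config_path.open("rb") as f:
            raw_config = tomllib.load(f)

        # 3. Parse the LLM settings: plain values are defaults,
        #    nested tables are named overrides
        base_llm = raw_config["llm"]
        default_llm = {k: v for k, v in base_llm.items() if not isinstance(v, dict)}
        llm_overrides = {k: v for k, v in base_llm.items() if isinstance(v, dict)}

        # 4. Build the configuration object
        self._config = AppConfig(
            llm={
                "default": default_llm,
                # vision, planning, etc. inherit the defaults
                **{name: {**default_llm, **o} for name, o in llm_overrides.items()},
            },
            sandbox=SandboxSettings(**raw_config.get("sandbox", {})),
            browser_config=BrowserSettings(**raw_config.get("browser", {})),
            search_config=SearchSettings(**raw_config.get("search", {})),
        )
        self._initialized = True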
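
One way to turn the parsed [llm] table into per-name settings is to treat plain values as defaults and nested tables as overrides that inherit from them. The sketch below shows that merge on an ordinary dictionary shaped like the parsed TOML. The function name split_llm_config is hypothetical, and whether OpenManus merges overrides in exactly this way is an assumption.

```python
from typing import Any, Dict


def split_llm_config(raw_llm: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Split an [llm] table into a default profile plus named overrides."""
    # Plain values (strings, numbers) form the default profile.
    defaults = {k: v for k, v in raw_llm.items() if not isinstance(v, dict)}
    # Nested tables such as [llm.vision] are named overrides.
    overrides = {k: v for k, v in raw_llm.items() if isinstance(v, dict)}
    return {
        "default": defaults,
        # Each override starts from the defaults and replaces what it sets.
        **{name: {**defaults, **values} for name, values in overrides.items()},
    }


# Shape of the data a TOML parser would produce for the [llm] section.
raw_llm = {
    "model": "gpt-4o-mini",
    "api_key": "sk-test",
    "temperature": 0.0,
    "vision": {"model": "gpt-4o"},
}

profiles = split_llm_config(raw_llm)
```

With this approach an override table only needs to list the keys that differ, so [llm.vision] can change the model while reusing the default API key and temperature.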

6.3 Multiple LLM Configurations

Scenario: different tasks use different models

[llm]
model = "gpt-4o-mini"     # Default: fast and inexpensive
api_key = "sk-..."

[llm.vision]
model = "gpt-4o"          # Vision tasks: higher accuracy
api_key = "sk-..."

[llm.planning]
model = "o1-preview"      # Planning tasks: deeper reasoning
api_key = "sk-..."

Usage:

# Default LLM
default_llm = LLM(config_name="default")

# Vision LLM
vision_llm = LLM(config_name="vision")

# Planning LLM
planning_llm = LLM(config_name="planning")

7. Practical Examples

7.1 Example 1: Data Analysis Task

User input:

Download https://example.com/sales.csv, analyze the sales data,
and generate a monthly sales report.

Execution trace:

Step 1 [Think]:
  Analysis: the file must be downloaded and the data analyzed
  Decision: use the browser_use tool to visit the URL

Step 1 [Act]:
  Tool: browser_use(action="go_to_url", url="https://example.com/sales.csv")
  Result: Downloaded file to /downloads/sales.csv

Step 2 [Think]:
  Analysis: the file is downloaded and needs to be read and analyzed
  Decision: use python_execute to read the CSV

Step 2 [Act]:
  Tool: python_execute(code="""
import pandas as pd
df = pd.read_csv('/downloads/sales.csv')
monthly = df.groupby(df['date'].str[:7])['amount'].sum()
print(monthly)
""")
  Result: [monthly sales figures]

Step 3 [Think]:
  Analysis: the analysis is done and a report is needed
  Decision: use str_replace_editor to create the report

Step 3 [Act]:
  Tool: str_replace_editor(command="create", path="/workspace/report.md", ...)
  Result: Report created

Step 4 [Think]:
  Analysis: the task is complete
  Decision: terminate

Step 4 [Act]:
  Tool: terminate()
  State: FINISHED

7.2 Example 2: Web Scraping Task

User input:

Visit the Hacker News front page and extract the titles and links of the top 10 stories.

Execution trace:

Step 1:
  Tool: browser_use(action="go_to_url", url="https://news.ycombinator.com")

Step 2:
  Tool: browser_use(action="extract_content", goal="Extract the titles and links of the top 10 stories")
  Result: [extracted story list]

Step 3:
  Tool: str_replace_editor(command="create", path="/workspace/hn_news.json", ...)

Step 4:
  Tool: terminate()

7.3 Example 3: Code Refactoring Task

User input:

Refactor /workspace/legacy_code.py
so that every function uses type hints.

Execution trace:

Step 1:
  Tool: str_replace_editor(command="view", path="/workspace/legacy_code.py")
  Result: [original code displayed]

Step 2:
  Tool: str_replace_editor(
    command="str_replace",
    path="/workspace/legacy_code.py",
    old_str="def add(a, b):\n    return a + b",
    new_str="def add(a: int, b: int) -> int:\n    return a + b"
  )
  Result: [first function replaced]

Step 3-N:
  [continue replacing the remaining functions]

Step N+1:
  Tool: python_execute(code="import py_compile; py_compile.compile('/workspace/legacy_code.py', doraise=True); print('OK')")  # Verify the syntax

Step N+2:
  Tool: terminate()
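
A syntax check like the one in the verification step can be done with the standard library's py_compile module, which compiles a file without running it. The helper below is a hypothetical sketch: the compiles function and the temporary file name are invented for this example, and it checks a source string by writing it to a temporary directory first.

```python
import os
import py_compile
import tempfile


def compiles(source: str) -> bool:
    """Return True when the source text is syntactically valid Python."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "candidate.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(source)
        try:
            # doraise=True turns syntax errors into PyCompileError.
            py_compile.compile(
                path, cfile=os.path.join(tmp, "candidate.pyc"), doraise=True
            )
        except py_compile.PyCompileError:
            return False
        return True


good = compiles("def add(a: int, b: int) -> int:\n    return a + b\n")
bad = compiles("def broken(:\n    pass\n")
```

Compiling only validates syntax. It does not confirm that the added type hints are correct, which would need a type checker or tests.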

8. Advanced Features

8.1 Sandbox Environment

Purpose: run code in a Docker container, isolated from the host

Enabling the sandbox:

[sandbox]
use_sandbox = true
image = "python:3.12-slim"
work_dir = "/workspace"
memory_limit = "512m"
cpu_limit = 1.0
timeout = 300
network_enabled = false

Architecture:

┌──────────────────────────────────┐
│       Host System                │
│                                  │
│  ┌───────────────────────────┐  │
│  │   OpenManus Agent         │  │
│  │   ┌─────────────────┐     │  │
│  │   │ SandboxClient   │     │  │
│  │   └────────┬────────┘     │  │
│  └────────────┼──────────────┘  │
│               │                  │
│  ┌────────────▼──────────────┐  │
│  │   Docker Container        │  │
│  │   ┌─────────────────┐     │  │
│  │   │ Python Env      │     │  │
│  │   │ /workspace      │     │  │
│  │   └─────────────────┘     │  │
│  └───────────────────────────┘  │
└──────────────────────────────────┘

Implementation:

class DockerSandbox:
    async def create(self):
        """Create the Docker container."""
        self.container = self.docker_client.containers.run(
            image=self.config.image,
            command="tail -f /dev/null",
            detach=True,
            working_dir=self.config.work_dir,
            mem_limit=self.config.memory_limit,
            cpu_quota=int(self.config.cpu_limit * 100000),
            network_disabled=not self.config.network_enabled,
            volumes=self.volume_bindings,
        )

    async def run_command(self, command: str, timeout: Optional[int] = None) -> str:
        """Run a command inside the container."""
        # Note: timeout is accepted but not enforced in this simplified version
        exec_result = self.container.exec_run(
            cmd=command,
            workdir=self.config.work_dir,
        )
        return exec_result.output.decode()

File operations:

class SandboxFileOperator(FileOperator):
    async def read_file(self, path: str) -> str:
        """Read a file from the container."""
        return await SANDBOX_CLIENT.read_file(path)

    async def write_file(self, path: str, content: str) -> None:
        """Write a file into the container."""
        await SANDBOX_CLIENT.write_file(path, content)

8.2 MCP (Model Context Protocol)

Purpose: connect to external tool servers over the MCP protocol

Architecture:

┌──────────────┐         ┌──────────────┐
│  MCP Agent   │ ◄─────► │  MCP Server  │
│              │  stdio  │              │
│              │   or    │  ┌─────────┐ │
│              │   sse   │  │  Tool 1 │ │
│              │         │  ├─────────┤ │
│              │         │  │  Tool 2 │ │
│              │         │  └─────────┘ │
└──────────────┘         └──────────────┘

Connecting over stdio:

python run_mcp.py --connection stdio

Connecting over SSE:

# Start the MCP server
python run_mcp_server.py

# Connect to the server
python run_mcp.py --connection sse --server-url http://127.0.0.1:8000/sse

MCPAgent implementation:

class MCPAgent(ToolCallAgent):
    async def initialize(
        self,
        connection_type: str,
        command: Optional[str] = None,
        args: Optional[List[str]] = None,
        server_url: Optional[str] = None,
    ):
        """Initialize the MCP connection."""
        if connection_type == "stdio":
            self.session = await self.client.connect_stdio(command, args)
        else:  # sse
            self.session = await self.client.connect_sse(server_url)

        # Fetch the server's tool list
        tools_result = await self.session.list_tools()

        # Add the MCP tools to the available tools
        for tool in tools_result.tools:
            mcp_tool = MCPTool.from_mcp_tool(tool, self.session)
            self.available_tools.add_tool(mcp_tool)

8.3 多 Agent 协作 (Flow)

功能: 复杂任务的分解和多 Agent 协作

流程类型:

class FlowType(Enum):
    PLANNING = "planning"    # 规划型流程
    SEQUENTIAL = "sequential"  # 顺序执行
    PARALLEL = "parallel"    # 并行执行

Planning Flow example:

class PlanningFlow(BaseFlow):
    async def execute(self, user_request: str) -> str:
        # 1. Generate a plan
        plan = await self._generate_plan(user_request)

        # 2. Execute the plan
        for step in plan.steps:
            # 2.1 Pick a suitable agent
            agent = self._select_agent(step)

            # 2.2 Run the step
            result = await agent.run(step.description)

            # 2.3 Evaluate the result
            if not await self._evaluate_result(result, step):
                # Replan when the step did not succeed
                plan = await self._replan(plan, step, result)

        # 3. Return the final result
        return await self._summarize_results()
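
The agent-selection and step-execution part of the loop above can be sketched in a self-contained form. `Step`, `EchoAgent`, and `run_plan` are hypothetical stand-ins invented for this illustration, not OpenManus classes; plan generation, evaluation, and replanning are left out because they would need an LLM.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Step:
    description: str
    agent_key: str  # hypothetical field: which agent should handle the step


class EchoAgent:
    """Stand-in agent that reports the step instead of calling an LLM."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def run(self, request: str) -> str:
        return f"[{self.name}] done: {request}"


async def run_plan(steps: list[Step], agents: dict[str, EchoAgent]) -> list[str]:
    """Run the steps in order, routing each one to the agent it names."""
    results = []
    for step in steps:
        # Fall back to the first registered agent when the key is unknown
        agent = agents.get(step.agent_key) or next(iter(agents.values()))
        results.append(await agent.run(step.description))
    return results


agents = {
    "manus": EchoAgent("manus"),
    "researcher": EchoAgent("researcher"),
    "coder": EchoAgent("coder"),
}
plan = [
    Step("survey recommendation algorithms", "researcher"),
    Step("implement a baseline model", "coder"),
    Step("write a summary", "unknown"),
]
results = asyncio.run(run_plan(plan, agents))
```

Routing by a key with a default agent keeps the flow usable even when the planner names an agent that was never registered.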

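Usage example:

import asyncio

async def main() -> None:
    # Create the agents
    agents = {
        "manus": Manus(),
        "researcher": ResearchAgent(),
        "coder": CodeAgent(),
    }

    # Create the flow
    flow = FlowFactory.create_flow(
        flow_type=FlowType.PLANNING,
        agents=agents,
    )

    # Run a complex task (await only works inside a coroutine)
    result = await flow.execute("Research and implement a recommender system")
    print(result)

asyncio.run(main())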

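8.4 Stuck-Loop Prevention

Problem: an agent can get stuck repeating the same action.

Detection:

def is_stuck(self) -> bool:
    """Return True when the agent keeps producing the same response."""
    if len(self.memory.messages) < 2:
        return False

    last_message = self.memory.messages[-1]
    # Empty content (for example a tool-call-only message) is not a repeat
    if not last_message.content:
        return False

    # Count earlier assistant messages with identical content
    duplicate_count = sum(
        1 for msg in reversed(self.memory.messages[:-1])
        if msg.role == "assistant" and msg.content == last_message.content
    )

    return duplicate_count >= self.duplicate_threshold  # default: 2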

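Handling:

def handle_stuck_state(self):
    """Nudge the agent out of a stuck loop."""
    stuck_prompt = (
        "Observed duplicate responses. Consider new strategies and "
        "avoid repeating ineffective paths already attempted."
    )
    # Prepend the hint to the next-step prompt
    self.next_step_prompt = f"{stuck_prompt}\n{self.next_step_prompt}"
    logger.warning("Stuck loop detected; added a recovery prompt")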
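
The duplicate-counting check can be tried in isolation. The sketch below re-implements it against minimal `Message` and `Memory` stand-ins (both hypothetical, defined only for this example) and skips messages with empty content, so the threshold behavior can be observed without a running agent.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str
    content: str


@dataclass
class Memory:
    messages: list[Message] = field(default_factory=list)


def is_stuck(memory: Memory, duplicate_threshold: int = 2) -> bool:
    """Return True when the last message repeats earlier assistant output."""
    if len(memory.messages) < 2:
        return False
    last = memory.messages[-1]
    if not last.content:
        return False
    duplicates = sum(
        1
        for msg in memory.messages[:-1]
        if msg.role == "assistant" and msg.content == last.content
    )
    return duplicates >= duplicate_threshold


memory = Memory()
memory.messages.append(Message("assistant", "Let me search again."))
memory.messages.append(Message("assistant", "Let me search again."))
stuck_after_two = is_stuck(memory)    # one earlier duplicate: below threshold

memory.messages.append(Message("assistant", "Let me search again."))
stuck_after_three = is_stuck(memory)  # two earlier duplicates: threshold reached
```

With the default threshold of 2, the third identical response is the first one flagged.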

9. Best Practices

9.1 Prompt Engineering

System prompt design

A good system prompt:

SYSTEM_PROMPT = """
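You are OpenManus, an all-capable AI assistant that aims to solve any task the user presents.

Tool usage principles:
1. Proactively choose the most suitable tool or combination of tools for the user's needs
2. For complex tasks, break the problem down and solve it step by step with different tools
3. After each tool call, clearly explain the result and suggest the next step

Available tools:
{tool_descriptions}

Working directory: {directory}

Important:
- Always verify tool execution results
- Try an alternative approach when an error occurs
- Explicitly call the terminate tool once the task is complete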
"""

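Next-step prompt:

NEXT_STEP_PROMPT = """
Based on the user's needs and the current progress:
1. Analyze the current state
2. Decide on the next action
3. Choose a suitable tool to carry it out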
"""

9.2 Error Handling

1. Tool execution failures:

try:
    result = await tool.execute(**args)
except ToolError as e:
    # Return a readable error message to the LLM
    return ToolFailure(error=f"Tool execution failed: {e.message}")
except Exception as e:
    logger.exception("Unexpected error")
    return ToolFailure(error=f"Unexpected error: {str(e)}")

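2. LLM API failures:

from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_random_exponential

@retry(
    wait=wait_random_exponential(min=1, max=60),
    stop=stop_after_attempt(6),
    # Exception already covers OpenAIError, so every error is retried
    retry=retry_if_exception_type((OpenAIError, Exception)),
)
async def ask(self, messages, **kwargs):
    ...  # the request goes here; failures are retried automatically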
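
For readers who want to see what such a retry policy does, here is a hand-rolled approximation that uses only the standard library. It is not tenacity's implementation; `retry_async` and `flaky_request` are hypothetical names, and the wait is simply a random delay under an exponentially growing cap.

```python
import asyncio
import random


async def retry_async(func, *, attempts: int = 6, base_delay: float = 1.0, max_delay: float = 60.0):
    """Await `func()` until it succeeds or `attempts` calls have failed."""
    for attempt in range(1, attempts + 1):
        try:
            return await func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Random wait below a cap that doubles with each failed attempt
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(random.uniform(0, cap))


calls = {"count": 0}


async def flaky_request() -> str:
    """Hypothetical API call that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = asyncio.run(retry_async(flaky_request, base_delay=0.01))
```

Randomizing the wait spreads out retries from many clients so they do not hit the API at the same instant.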

3. Token limits:

if not self.check_token_limit(input_tokens):
    raise TokenLimitExceeded(
        f"Token limit exceeded: {self.total_input_tokens + input_tokens} > {self.max_input_tokens}"
    )

9.3 Performance Optimization

1. Cap the observation length:

class Manus(ToolCallAgent):
    max_observe: int = 10000  # keeps the context from growing too long

2. Streaming output:

# Stream the response for a better user experience
response = await llm.ask(messages, stream=True)

3. Asynchronous execution:

# Use async/await for all I/O; gather runs independent tasks concurrently
async def execute(self, **kwargs):
    await asyncio.gather(
        task1(),
        task2(),
        task3(),
    )
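
A runnable illustration of why `asyncio.gather` helps: three simulated I/O calls of 0.2 seconds each finish in roughly the time of one. `fake_io` is a hypothetical placeholder for a real network or file operation.

```python
import asyncio
import time


async def fake_io(name: str, delay: float) -> str:
    """Simulate an I/O-bound call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return name


async def main() -> tuple[list[str], float]:
    start = time.perf_counter()
    # The three coroutines wait concurrently, not one after another
    results = await asyncio.gather(
        fake_io("task1", 0.2),
        fake_io("task2", 0.2),
        fake_io("task3", 0.2),
    )
    return list(results), time.perf_counter() - start


results, elapsed = asyncio.run(main())
```

`gather` returns results in the order the awaitables were passed, regardless of which one finishes first.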

9.4 Security Recommendations

1. Use the sandbox:

[sandbox]
use_sandbox = true
network_enabled = false  # disable network access

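2. File path validation:

# Make sure the path stays inside the workspace
if not path.is_absolute():
    raise ToolError("Only absolute paths are accepted")

# Resolve ".." and symlinks first; a plain string-prefix check would also
# accept sibling directories whose names merely start with the workspace path
if not path.resolve().is_relative_to(WORKSPACE_ROOT.resolve()):
    raise ToolError("Path must be inside the workspace")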
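
The sketch below shows a component-wise containment check built on `pathlib`, which rejects `..` escapes and look-alike sibling directories that a string-prefix comparison can miss. `validate_path` and `is_allowed` are hypothetical helpers, and `ToolError` is redefined locally so the example runs on its own.

```python
import tempfile
from pathlib import Path


class ToolError(Exception):
    """Local stand-in for the framework's tool error type."""


def validate_path(path: Path, workspace_root: Path) -> Path:
    """Return the resolved path, or raise if it leaves the workspace."""
    if not path.is_absolute():
        raise ToolError("Only absolute paths are accepted")
    resolved = path.resolve()  # collapses ".." and follows symlinks
    if not resolved.is_relative_to(workspace_root.resolve()):
        raise ToolError("Path must be inside the workspace")
    return resolved


def is_allowed(path: Path, workspace_root: Path) -> bool:
    try:
        validate_path(path, workspace_root)
        return True
    except ToolError:
        return False


base = Path(tempfile.mkdtemp())
workspace = base / "workspace"
workspace.mkdir()

inside = is_allowed(workspace / "notes.txt", workspace)
escaped = is_allowed(workspace / ".." / "secret.txt", workspace)
sibling = is_allowed(base / "workspace_evil" / "x.txt", workspace)
relative = is_allowed(Path("notes.txt"), workspace)
```

`Path.is_relative_to` compares whole path components, so `workspace_evil` is not mistaken for a child of `workspace`.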

3. Code execution timeout:

proc.join(timeout)  # guards against infinite loops
if proc.is_alive():
    proc.terminate()
    proc.join()  # reap the terminated process
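
The same time limit can also be enforced with `subprocess.run` and its `timeout` argument. This is a different technique from the `proc.join(timeout)` pattern above, shown here because it is easy to run as a standalone script; `run_with_timeout` is a hypothetical helper.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout: float) -> dict:
    """Run `code` in a child interpreter and stop it after `timeout` seconds."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return {"success": False, "output": f"timed out after {timeout} seconds"}
    return {"success": completed.returncode == 0, "output": completed.stdout}


fast = run_with_timeout("print(1 + 1)", timeout=10)
slow = run_with_timeout("while True: pass", timeout=0.5)
```

Running untrusted code in a separate process is what makes a hard stop possible; a thread cannot be forcibly terminated in the same way.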

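10. FAQ

10.1 Installation and Configuration

Q: Which Python version should I use?

A: OpenManus requires Python 3.12 or later:

python --version  # should report 3.12 or later

Q: What if dependency installation fails?

A: Use uv (recommended) or the Tsinghua PyPI mirror:

# Option 1: uv (fast)
uv pip install -r requirements.txt

# Option 2: pip with a mirror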
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Q: How do I configure multiple API keys?

A: Set them in config.toml:

[llm]
api_key = "sk-default-key"

[llm.vision]
api_key = "sk-vision-key"

[llm.planning]
api_key = "sk-planning-key"

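10.2 Usage

Q: The agent takes too long to run?

A: Lower max_steps:

class Manus(ToolCallAgent):
    max_steps: int = 10  # lower the maximum number of steps

Q: How do I disable a tool?

A: Leave it out when creating the agent:

# Default configuration
available_tools = ToolCollection(
    PythonExecute(),
    # BrowserUseTool(),  # comment out tools you do not need
    StrReplaceEditor(),
    Terminate()
)

Q: The browser tool does not work?

A: Install the Playwright browsers: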

playwright install

10.3 Development

Q: How do I create a custom tool?

A: Subclass BaseTool:

class CustomTool(BaseTool):
    name: str = "custom_tool"
    description: str = "Description of the custom tool"
    parameters: dict = {
        "type": "object",
        "properties": {
            "param1": {"type": "string", "description": "Parameter 1"}
        },
        "required": ["param1"]
    }

    async def execute(self, param1: str, **kwargs) -> ToolResult:
        # Implement the tool logic; do_something is a placeholder
        result = do_something(param1)
        return ToolResult(output=result)

# Usage
agent = Manus()
agent.available_tools.add_tool(CustomTool())

Q: How do I debug agent behavior?

A: Check the logs:

# app/logger.py already configures loguru
# Log location: logs/{timestamp}.log

# Follow the logs in real time
tail -f logs/*.log

Q: How do I integrate it into my own project?

A: Use it as a library:

from app.agent.manus import Manus

async def my_function():
    agent = Manus()
    result = await agent.run("your task")
    return result

Appendix

A. Architecture Diagram

Full OpenManus architecture

┌─────────────────────────────────────────────────────────┐
│                  User Interface Layer                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐              │
│  │   CLI    │  │   API    │  │   Web    │              │
│  └─────┬────┘  └─────┬────┘  └─────┬────┘              │
└────────┼─────────────┼─────────────┼────────────────────┘
         └─────────────┴─────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                       Agent Layer                       │
│  ┌─────────────────────────────────────────────────┐    │
│  │              BaseAgent                          │    │
│  │  - state management                             │    │
│  │  - memory management                            │    │
│  │  - step execution loop                          │    │
│  └──────────────────┬──────────────────────────────┘    │
│                     │                                    │
│  ┌──────────────────▼──────────────────────────────┐    │
│  │            ReActAgent                           │    │
│  │  - think() [abstract]                           │    │
│  │  - act() [abstract]                             │    │
│  └──────────────────┬──────────────────────────────┘    │
│                     │                                    │
│  ┌──────────────────▼──────────────────────────────┐    │
│  │          ToolCallAgent                          │    │
│  │  - tool selection via LLM                       │    │
│  │  - tool execution                               │    │
│  │  - result observation                           │    │
│  └──────────────────┬──────────────────────────────┘    │
│                     │                                    │
│  ┌──────────────────▼──────────────────────────────┐    │
│  │   Manus / MCPAgent / CustomAgents               │    │
│  │  - specific implementations                     │    │
│  │  - custom tools & behaviors                     │    │
│  └─────────────────────────────────────────────────┘    │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                       Tool Layer                        │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐    │
│  │   Python     │ │   FileEdit   │ │   Browser    │    │
│  │   Execute    │ │              │ │   Automation │    │
│  └──────────────┘ └──────────────┘ └──────────────┘    │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐    │
│  │   Web        │ │   Terminate  │ │   Custom     │    │
│  │   Search     │ │              │ │   Tools      │    │
│  └──────────────┘ └──────────────┘ └──────────────┘    │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                  Infrastructure Layer                   │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐    │
│  │     LLM      │ │   Sandbox    │ │   Config     │    │
│  │   - OpenAI   │ │   - Docker   │ │   - TOML     │    │
│  │   - Azure    │ │   - Local    │ │   - Env Vars │    │
│  │   - AWS      │ │              │ │              │    │
│  └──────────────┘ └──────────────┘ └──────────────┘    │
│  ┌──────────────┐ ┌──────────────┐                     │
│  │   Memory     │ │   Logger     │                     │
│  │   (Messages) │ │   (Loguru)   │                     │
│  └──────────────┘ └──────────────┘                     │
└─────────────────────────────────────────────────────────┘

B. Data Flow Diagram

User request → agent processing flow

1. Initialization
   User Input
      │
      ▼
   Agent.run(request)
      │
      ▼
   Memory.add_message(user_message)

2. Main loop
   ┌─────────────────────────────────┐
   │  while step < max_steps:        │
   │                                 │
   │  ┌───────────────────────────┐ │
   │  │  THINK Phase              │ │
   │  │  ┌─────────────────────┐  │ │
   │  │  │ Build Messages      │  │ │
   │  │  │ - System Prompt     │  │ │
   │  │  │ - Conversation      │  │ │
   │  │  │ - Next Step Prompt  │  │ │
   │  │  └──────────┬──────────┘  │ │
   │  │             ▼              │ │
   │  │  ┌─────────────────────┐  │ │
   │  │  │ LLM.ask_tool()      │  │ │
   │  │  │ - Send to API       │  │ │
   │  │  │ - Parse Response    │  │ │
   │  │  └──────────┬──────────┘  │ │
   │  │             ▼              │ │
   │  │  ┌─────────────────────┐  │ │
   │  │  │ Extract Tool Calls  │  │ │
   │  │  └──────────┬──────────┘  │ │
   │  │             ▼              │ │
   │  │  Memory.add_message(       │ │
   │  │    assistant_message)      │ │
   │  └───────────────────────────┘ │
   │             │                   │
   │             ▼                   │
   │  ┌───────────────────────────┐ │
   │  │  ACT Phase                │ │
   │  │  for tool_call in calls:  │ │
   │  │  ┌─────────────────────┐  │ │
   │  │  │ Parse Arguments     │  │ │
   │  │  └──────────┬──────────┘  │ │
   │  │             ▼              │ │
   │  │  ┌─────────────────────┐  │ │
   │  │  │ Execute Tool        │  │ │
   │  │  │ - PythonExecute     │  │ │
   │  │  │ - FileEdit          │  │ │
   │  │  │ - Browser           │  │ │
   │  │  │ - etc.              │  │ │
   │  │  └──────────┬──────────┘  │ │
   │  │             ▼              │ │
   │  │  ┌─────────────────────┐  │ │
   │  │  │ Collect Results     │  │ │
   │  │  └──────────┬──────────┘  │ │
   │  │             ▼              │ │
   │  │  Memory.add_message(       │ │
   │  │    tool_message)           │ │
   │  └───────────────────────────┘ │
   │             │                   │
   │             ▼                   │
   │  ┌───────────────────────────┐ │
   │  │  Check State              │ │
   │  │  - FINISHED?              │ │
   │  │  - Stuck Loop?            │ │
   │  └───────────────────────────┘ │
   └─────────────────────────────────┘
                 │
                 ▼
3. Cleanup and return
   Cleanup Resources
      │
      ▼
   Return Results

C. Quick Reference

Common commands:

# Install
uv pip install -r requirements.txt
playwright install

# Run
python main.py                    # standard mode
python run_mcp.py                 # MCP mode
python run_flow.py                # multi-agent mode

# Configure
cp config/config.example.toml config/config.toml
vim config/config.toml

# Develop
pytest                            # run the tests
pre-commit run --all-files       # run code checks

Key file paths:

app/agent/manus.py          # main agent
app/llm.py                  # LLM interface
app/schema.py               # data models
app/tool/                   # tools directory
config/config.toml          # configuration file

Important classes and methods:

# Agent
Manus()                     # create the main agent
agent.run(prompt)           # run a task

# LLM
LLM(config_name="default")  # create an LLM client
llm.ask(messages)           # chat completion
llm.ask_tool(messages, tools)  # tool calling

# Tools
ToolCollection(*tools)      # tool collection
tool.execute(**kwargs)      # execute a tool

# Configuration
from app.config import config
config.llm                  # LLM settings
config.sandbox              # sandbox settings
config.workspace_root       # working directory

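Summary

OpenManus is a flexible AI agent framework. Its layered architecture and built-in tool set let it handle a wide range of complex tasks.

Key strengths:

✅ Clear architecture: ReAct paradigm plus layered design
✅ Easy to extend: pluggable tool system
✅ Production-ready: error handling, retries, and logging are in place
✅ Safe and reliable: sandbox environment and token limits
✅ Well documented: code comments plus plenty of examples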

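Suggested learning path:

1. Start with main.py to understand the basic flow
2. Read BaseAgent → ReActAgent → ToolCallAgent → Manus
3. Study the LLM interface and token management in depth
4. Learn how each built-in tool is implemented
5. Try creating custom tools and agents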

Further exploration:

Implement a custom agent
Integrate new tools
Refine prompt engineering
Explore multi-agent collaboration