从0开发大模型之实现Agent（Bash 到 SKILL）从《learn-claude-code》受到启发，于是参考源代

从《learn-claude-code》受到启发，于是参考源代码写了这篇文章：如何实现从0开始构建Agent？

1. 前提

1.1 `AI Agent` 构成

模型：为智能体的推理和决策提供动力的LLM，决定了智能体的下限。
工具：智能体可用于采取行动的外部函数或API。
指令：定义智能体行为的明确指导方针和安全策略。

三者构成了 AI Agent 的下限和上限，模型越强，对于指令和工具的调度能力更准确，但是从工程学的角度来考虑，较弱的模型通过对工具和指令的结合，一样能制造功能强大的 Agent。

1.2 `Tool Calling`

由于 Anthropic Tool Calling 协议现在大部分模型已经支持，并且非常成熟，所以本文默认都通过传入 TOOLS 来让大模型调用的工具，样例如下：

{
    "name": "my_function_name", # The name of the function
    "description": "The description of my function", # Describe the function so Claude knows when and how to use it.
    "input_schema": { # input schema describes the format and the type of parameters Claude needs to generate to use the function
        "type": "object", # format of the generated Claude response
        "properties": { # properties defines the input parameters of the function
            "query": { # the function expects a query parameter
                "description": "The search query to perform.", # describes the parameter to Claude
            },
        },
        "required": ["query"], # define which parameters are required
    },
}

给模型传入 TOOLS 参数，LLM API 返回如下：

{
    "type": "tool_use",
    "id": "toolu_01A09q90qw90lq917835123",
    "name": "my_function_name",
    "input": {
        "query": "Latest developments in quantum computing"
    }
}

1.3 简单的 v0:Shell 到复杂 v4:Skills

前面提到 AI Agent 构成包括工具和指令，那么从最简的工具开始，到最后复杂的指令，Agent 的核心原理：感知 -> 认知 -> 执行 -> 反馈 -> 感知 ...，然后循环执行。

基于这个核心原理，我们从最简的 Shell 构建，规划如下：

v0: Shell 是基础的工具
v1: 模型即代理
v2: 结构化规划与 Todo
v3: 子代理
v4: Skills

2. v0: Shell 是基础的工具

Shell 在计算机是最常用的，所有的操作系统中都有类似 Bash 脚本的功能，那么基于脚本定义一系列工具如下：

工具	对应 Bash 命令示例
读文件	`cat file.txt`, `head -n 20 file.py`, `grep "TODO" -n -r .`
写文件	`echo '...' > file`, `cat << 'EOF' > main.py`
搜索/导航	`find . -name "*.py"`, `ls -R`, `rg "keyword" .`

2.1 架构

逻辑核心只有一个循环：模型 → 工具调用 → 工具结果 → 模型
不额外引入 Task/Plan/Registry 等抽象，全部由「自然语言 + Bash + 递归」实现

不足：

工程化的安全边界（如路径沙盒、命令白名单）
更丰富的可观测性（结构化日志、任务树可视化）
人类语义层面的「角色」区分（计划者/执行者/审阅者）

在这个极简版本中，我们遵循的架构设计，只保留最小完成闭环所需的要素。

一个工具就够了
Bash 本身就是工具的「元工具」：其他一切工具都可以通过 Bash 间接调用（curl、git、python、docker 等）。

递归 = 层级结构
无需实现复杂的 Task/Plan 抽象；只要允许「调用自己」，层级结构自然涌现。

进程 = 上下文隔离
不需要额外的「会话 ID」或「上下文容器」：操作系统的进程模型天然提供隔离。

提示词 = 行为约束
系统提示词定义了当前 Agent 的「角色 + 责任边界」，决定了它如何使用 Bash 能力。

核心循环

while True:
    response = model(messages, tools)
    if response.stop_reason != "tool_use":
        return response.text
    results = execute(response.tool_calls)
    messages.append(results)

2.2 完整代码

import sys
import os
import traceback
from llm_factory import LLMFactory, LLMChatAdapter
from util.mylog import logger
from utils import run_bash, BASH_TOOLS

# 初始化 API 客户端
# 使用 LLMFactory 创建 LLM 实例
llm = LLMFactory.create(
    model_type="openai",
    model_name="deepseek-v3.2", # 使用支持的模型
    temperature=0.0,
    max_tokens=8192
)
client = LLMChatAdapter(llm)

# 系统提示词
SYSTEM = f"""你是一个位于 {os.getcwd()} 的 CLI 代理，系统为 {sys.platform}。使用 bash 命令解决问题。

## 规则：
- 优先使用工具而不是文字描述。先行动，后简要解释。
- 读取文件：cat, grep, find, rg, ls, head, tail
- 写入文件：echo '...' > file, sed -i, 或 cat << 'EOF' > file
- 避免危险操作，如 rm -rf等删除或者清理文件, 或格式化挂载点，或对系统文件进行写操作

## 要求
- 不使用其他工具，仅使用 bash 命令或者 shell 脚本
- 子代理可以通过生成 shell 代码执行
- 如果当前任务超过 bash 的处理范围，则终止不处理
"""

def extract_bash_commands(text):
    """从 LLM 响应中提取 bash 命令"""
    import re
    pattern = r'```bash\n(.*?)\n```'
    matches = re.findall(pattern, text, re.DOTALL)
    return [cmd.strip() for cmd in matches if cmd.strip()]

def chat(prompt, history=None, max_steps=10):
    if history isNone:
        history = []
    
    # 检查历史记录中是否已有系统提示词（作为系统消息）
    has_system = any(msg.get("role") == "system"for msg in history)
    ifnot has_system:
         # 在开头添加系统提示词作为系统消息
         history.insert(0, {"role": "system", "content": SYSTEM})

    history.append({"role": "user", "content": prompt})

    step = 0
    while step < max_steps:
        step += 1
        # 1. 调用模型（传递 tools 参数）
        # 使用 chat_with_tools 接口，支持 function calling
        response = client.chat_with_tools(
            prompt=prompt,
            messages=history,
            tools=BASH_TOOLS
        )
        if step == 1:
            prompt = '继续'

        # 2. 解析响应内容
        assistant_text = []
        tool_calls = []
        
        logger.info(f"第 {step} 步响应: {response}")

        # chat_with_tools 返回的是 Response 对象，包含 content 列表
        for block in response.content:
            if getattr(block, "type", "") == "text":
                assistant_text.append(block.text)
            elif getattr(block, "type", "") == "tool_use":
                tool_calls.append(block)

        # 记录助手文本回复
        full_text = "\n".join(assistant_text)
        if full_text:
            logger.info(f"助手: {full_text}")
            history.append({"role": "assistant", "content": full_text})
        elif tool_calls:
            # 如果只有工具调用没有文本，添加一个占位文本到历史，保持对话连贯
            history.append({"role": "assistant", "content": "(Executing tools...)"})
        
        # 3. 如果没有工具调用，直接返回内容
        ifnot tool_calls:
            logger.info(f"第 {step} 步结束，无工具调用")
            if response.stop_reason == "end_turn":
                return full_text
            # 如果异常结束，也返回
            return full_text or"(No response)"

        # 4. 执行工具
        logger.info(f"第 {step} 步工具调用: {tool_calls}")
        all_outputs = []
        for tc in tool_calls:
            if tc.name == "bash":
                cmd = tc.input.get("command")
                if cmd:
                    logger.info(f"[使用工具] {cmd}")  # 黄色显示命令
                    output = run_bash(cmd)
                    all_outputs.append(f"$ {cmd}\n{output}")
                    # 如果输出太长则截断打印
                    if len(output) > 200:
                        logger.info(f"输出: {output[:200]}... (已截断)")
                    else:
                        logger.info(f"输出: {output}")
            else:
                logger.warning(f"Unknown tool: {tc.name}")
        
        # 5. 将命令执行结果添加到历史记录中
        if all_outputs:
            combined_output = "\n".join(all_outputs)
            history.append({"role": "user", "content": f"执行结果：\n{combined_output}\n\n请继续处理。"})
        else:
            # 有工具调用但没产生输出（可能是解析失败或空命令）
            history.append({"role": "user", "content": "Error: Tool call failed or produced no output."})

    return"达到最大执行步数限制，停止执行。"

if __name__ == "__main__":
    if len(sys.argv) > 1:
        logger.info(chat(sys.argv[1]))
    else:
        # 交互模式
        logger.info("Bash 代理已启动。输入 'exit' 退出。")
        history = []
        whileTrue:
            try:
                user_input = input("> ")
                if user_input.lower() in ['exit', 'quit']:
                    break
                chat(user_input, history)
            except KeyboardInterrupt:
                logger.info("\n正在退出...")
                break
            except Exception as e:
                logger.info(f"\n错误: {e}")
                traceback.print_exc()

3.1 v1: 模型即代理

Agent 的核心就是自主决策，只需要人工给定目标+约束规则，让大模型自主决策怎么调用工具，怎么遵循规则，这样才能做到通用性。

传统助手模式：

用户 -> 模型 -> 文本回复

Agent 系统模式：

用户 -> 模型 -> [工具 -> 结果]* -> 回复
                     ^_________|

* 模型可以反复调用工具，直到它认为任务完成为止，把「聊天机器人」升级为「自主代理」。

Claude Code 之类系统可能挂了 10～20 个工具，但对于一个「本地代码助手」，4 个就够覆盖 80–90% 的场景：

工具	用途	示例能力
`bash`	运行命令	`npm install`, `git status`, `pytest`, `ls`
`read_file`	读取文件内容	查看 `src/index.ts` 的具体实现
`write_file`	创建/覆盖文件	创建 `README.md`、生成新模块
`edit_file`	精确修改片段	在一个函数内插入日志、重构一个方法

有了这 4 个工具，模型就是可以完成如下事情：

探索代码库：bash: find, ls, tree, rg/grep
理解代码：read_file 查看具体文件内容
做出修改：
- 新建/完全重写：write_file
- 小范围精修：edit_file
运行和验证：bash: python, npm test, make, pytest, go test ...

3.2 架构

模型是决策者：何时调用工具、调用哪些工具、以什么顺序、何时停止，都由模型决定。
代码只做两件事：
1. 提供一组工具（带清晰的输入 schema 和语义）
2. 驱动「模型 → 工具 → 结果 → 模型」的循环

agent_loop 正是 v1 的核心：

while True:
    response = model(messages, tools)

    # 打印文本输出
    if no tool_use:
        return

    results = execute(response.tool_calls)
    messages.append(response)
    messages.append(results)

模型控制循环
只要 stop_reason == "tool_use"，说明模型还在「思考 + 操作」，没准备给出最终答案。
一旦 stop_reason != "tool_use"，模型就认为任务完成，返回最终文本。

工具结果成为上下文
工具执行结果以 "user" 消息的形式追加回对话，模型下次调用时就能「看到」自己刚刚操作的结果。

记忆自动累积
所有对话内容、工具调用、结果都放在 messages 里，模型自然拥有整个任务的上下文。

代码逻辑极薄
你几乎看不到复杂状态机、计划器、子任务调度器——这些都交给模型的「思考 + 工具调用策略」来涌现。

3.3 为什么这样设计?

1. 简单

没有显式状态机
没有 planner/parser 模块
没有自定义框架

只有：messages、tools、while True。

2. 模型负责思考

哪个工具：bash / read / write / edit？
什么顺序：先看文件？先找入口？再修改？再跑测试？
何时停止：任务是否已经完成？

全都交给模型，用自然语言和工具 schema 引导，而不是手动写死流程。

3. 透明可观测

每个工具调用都显式出现在 messages 中 (tool_use + tool_result)
你可以把 messages 持久化，恢复整个过程、调试行为

4. 可扩展性强

添加新工具的成本非常低：

实现一个 Python 函数（例如 tool_http_request）。
在 TOOLS 列表中增加一个带 JSON schema 的条目。
在 execute_tool 中分发到对应函数。

无需改 Agent 循环，循环 Agent 核心调度方式。

3.4 完整代码

from pathlib import Path
import sys
import traceback
from llm_factory import LLMFactory, LLMChatAdapter
from util.mylog import logger
from utils import execute_base_tools, BASIC_TOOLS

# 初始化 API 客户端
# 使用 LLMFactory 创建 LLM 实例
llm = LLMFactory.create(
    model_type="openai",
    model_name="deepseek-v3.2", # 使用支持的模型
    temperature=0.0,
    max_tokens=8192
)
client = LLMChatAdapter(llm)
WORKDIR = Path.cwd()

SYSTEM = f"""你是一个位于 {WORKDIR} 的编码代理，系统为 {sys.platform}。

## 执行流程
简要思考 -> 使用工具（使用 TOOLS） -> 报告结果。

## 规则
- 优先使用工具而不是文字描述。先行动，不要只是解释。  
- 永远不要臆造文件路径。如果不确定，先使用 bash ls/find 确认。  
- 做最小的修改。不要过度设计。  
- 完成后，总结变更内容。  

## 要求：
- 循环尽量简单，不要复杂。  
"""

def execute_tool(name: str, args: dict) -> str:
    result = execute_base_tools(name, args)
    if result isnotNone:
        return result
    returnf"Unknown tool: {name}"

def agent_loop(prompt, history=None, max_steps=10) -> list:
    if history isNone:
        history = []
    
    # 检查历史记录中是否已有系统提示词（作为系统消息）
    has_system = any(msg.get("role") == "system"for msg in history)
    ifnot has_system:
        # 在开头添加系统提示词作为系统消息
        history.insert(0, {"role": "system", "content": SYSTEM})

    step = 0
    while step < max_steps:
        step += 1
        response = client.chat_with_tools(
            prompt=prompt,
            messages=history,
            tools=BASIC_TOOLS,
        )

        assistant_text = []
        tool_calls = []
        
        for block in response.content:
            if getattr(block, "type", "") == "text":
                assistant_text.append(block.text)
            elif getattr(block, "type", "") == "tool_use":
                tool_calls.append(block)
        
        full_text = "\n".join(assistant_text)
        ifnot tool_calls:
            history.append({"role": "assistant", "content": full_text})
            logger.info(f"第 {step} 步结束，无工具调用")
            return history

        results = []
        for tc in tool_calls:
            logger.info(f"\n> [使用工具] {tc.name} 第 {step} 步调用工具: {tc.input}")
            output = execute_tool(tc.name, tc.input)
            preview = output[:200] + "..."if len(output) > 200else output
            logger.info(f"  [使用工具] {tc.name}, 输入: {tc.input}, 返回: {preview}")
            results.append(f"工具 {tc.name}, 输入: {tc.input}, 返回: {output}")

        history.append({"role": "assistant", "content": full_text})
        combined_output = "\n".join(results)
        history.append({"role": "user", "content": f"执行结果：\n{combined_output}\n\n请继续处理"})

    logger.info(f"第 {step} 步达到最大执行步数限制，停止执行。")

def main():
    logger.info(f"Mini Claude Code v1 - {WORKDIR}")
    logger.info("Type 'exit' to quit.\n")

    history = []
    whileTrue:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break

        ifnot user_input or user_input.lower() in ("exit", "quit", "q"):
            break

        history.append({"role": "user", "content": user_input})
        try:
            agent_loop('', history, max_steps=10)
        except Exception as e:
            logger.error(f"Error: {e}")
            traceback.print_exc()

if __name__ == "__main__":
    main()

4 v2: 结构化规划与 Todo

v1 已经能正常工作，但在复杂任务上，它容易失去方向：

在不同子任务之间来回跳转
很难记住已经做了什么、未做什么
难以向用户呈现清晰的进度

v2 我们只需要做一个 Plan：Todo 工具（增加少量状态管理与提示逻辑），就能实现 "计划（使用 TodoWrite） -> 使用工具行动（使用 TOOLS）"。

在 v1 的工具列表基础上，v2 新增 TodoManager 管理和 TodoWrite 工具。

4.1 TodoManager：带约束的任务列表

核心是一个带约束的列表管理器：

class TodoManager:
    """
    管理具有强制约束的结构化任务列表。

    关键设计决策：
    --------------------
    1. 最多 20 项：防止模型创建无尽的列表
    2. 一个进行中：强制专注 - 一次只能做一件事
    3. 必填字段：每个项目需要 content, status 和 activeForm

    activeForm 字段值得解释：
    - 它是正在发生的事情的现在时形式
    - 当 status 为 "in_progress" 时显示
    - 示例：content="Add tests", activeForm="Adding unit tests..."

    这提供了代理正在做什么的实时可见性。
    """

    def __init__(self):
        self.items = []

    def update(self, items: list) -> str:
        """
        验证并更新任务列表。

        模型每次发送一个完整的列表。我们验证它，
        存储它，并返回一个模型将看到的渲染视图。

        验证规则：
        - 每个项目必须有：content, status, activeForm
        - Status 必须是：pending | in_progress | completed
        - 一次只能有 ONE 个项目处于 in_progress 状态
        - 最多允许 20 个项目

        Returns:
            任务列表的渲染文本视图
        """
        validated = []
        in_progress_count = 0

        for i, item in enumerate(items):
            # Extract and validate fields
            content = str(item.get("content", "")).strip()
            status = str(item.get("status", "pending")).lower()
            active_form = str(item.get("activeForm", "")).strip()

            # Validation checks
            ifnot content:
                raise ValueError(f"Item {i}: content required")
            if status notin ("pending", "in_progress", "completed"):
                raise ValueError(f"Item {i}: invalid status '{status}'")
            ifnot active_form:
                raise ValueError(f"Item {i}: activeForm required")

            if status == "in_progress":
                in_progress_count += 1

            validated.append({
                "content": content,
                "status": status,
                "activeForm": active_form
            })

        # Enforce constraints
        if len(validated) > 20:
            raise ValueError("Max 20 todos allowed")
        if in_progress_count > 1:
            raise ValueError("Only one task can be in_progress at a time")

        self.items = validated
        return self.render()

    def render(self) -> str:
        """
        将任务列表渲染为人类可读的文本。

        格式：
            [x] 已完成任务
            [>] 进行中任务 <- 正在做某事...
            [ ] 待办任务

            (2/3 completed)

        这个渲染后的文本是模型作为工具结果看到的内容。
        然后它可以根据当前状态更新列表。
        """
        ifnot self.items:
            return"No todos."

        lines = []
        for item in self.items:
            if item["status"] == "completed":
                lines.append(f"[x] {item['content']}")
            elif item["status"] == "in_progress":
                lines.append(f"[>] {item['content']} <- {item['activeForm']}")
            else:
                lines.append(f"[ ] {item['content']}")

        completed = sum(1for t in self.items if t["status"] == "completed")
        lines.append(f"\n({completed}/{len(self.items)} completed)")

        return"\n".join(lines)

这里的约束是有意设计的「规则」：

规则	原因
最多 20 条	防止模型把 todo 当成无限备忘录
必须有 `content`	保证每条任务有明确描述
必须有 `status`	让任务进度可追踪
必须有 `activeForm`	帮助描述“当前正在做的具体动作”
只能一个 `in_progress`	强制模型一次只专注一个任务
内容不能重复	防止模型无意义地复制条目

这些约束同时限制了行为空间，又增强了可控性和可观察性。

4.2 Todo 工具的执行与反馈

在 Agent 端的实现中：

todo_manager = TodoManager()

def execute_tool(name, args):
    if name == "TodoWrite":
        try:
            todo_manager.update(args["items"])
            # 返回给模型看的文本形式，模型可以据此继续思考和更新
            return todo_manager.summary_text()
        except Exception as e:
            return f"[TodoWrite error] {e}"

    # 其他工具: bash / read_file / write_file / edit_file...
    ...

以一次调用为例：

模型调用 TodoWrite 输入：

{
    "items": [
        {
            "content": "重构认证模块",
            "status": "completed",
            "activeForm": "重构已完成"
        },
        {
            "content": "添加单元测试",
            "status": "in_progress",
            "activeForm": "正在为认证模块编写单元测试"
        },
        {
            "content": "更新文档",
            "status": "pending",
            "activeForm": "准备在完成测试后更新文档"
        }
    ]
}

工具返回给模型的文本（作为 tool_result）：

[x] 重构认证模块  (重构已完成)
[>] 添加单元测试  (正在为认证模块编写单元测试)
[ ] 更新文档      (准备在完成测试后更新文档)
(1/3 已完成)

模型在下一轮调用时，就能「看到自己刚刚整理的计划」，并基于这个结构化状态决定接下来要做什么。

4.3 系统提示词：软约束鼓励使用 Todo

Todo 的使用不是强制性的，而是通过 软提醒 进行引导：

INITIAL_REMINDER = "<reminder>对于多步骤任务，请使用 TodoWrite 工具创建和维护一个清晰的 todo 列表。</reminder>"
NAG_REMINDER = "<reminder>已经超过 10 轮未更新 todo，请检查是否需要补充或更新 TodoWrite。</reminder>"

在对话的合适位置，注入一段额外文本（通常作为 system 或额外的 user 内容），让模型意识到应该使用 Todo 工具。
这些提醒不是一个单独工具调用，也不需要模型回复，只是一个“背景提示”。

示例：

while True:
    ...

    if first_message:
        content.append(INITIAL_REMINDER)
        first_message = False
    elif rounds_without_todo > max_steps:
        content.append(NAG_REMINDER)

    content.append(f"输入：{user_input}")
    history.append({"role": "user", "content": "\n".join(content)})

    try:
        agent_loop('', history, max_steps=max_steps)
    except Exception as e:
        logger.error(f"Error: {e}")
        traceback.print_exc()

效果：

模型被“温柔地提醒”：对于多步骤任务，Todo 是推荐做法。
如果模型长期不更新 Todo（比如执行了很多工具调用），会收到 “该更新计划了” 的提示。

4.4 架构

显式规划让 Agent 更可靠，而 Todo 是实现显式规划的最小结构单元。
结构既是约束，也是能力放大的脚手架。

约束：
- 限制条目数、状态字段、唯一进行中
- 要求完整列表，而非局部 patch
赋能：
- 提供了可见的计划（用户、模型都能看到）
- 提供了进度追踪和当前焦点的清晰标记
- 为后续的总结、回顾提供结构

类似的模式在 Agent 设计中普遍存在：

max_tokens 约束 → 赋能了响应可控与流式体验
工具 JSON Schema 约束 → 赋能了结构化调用和验证
Todo 约束 → 赋能了复杂任务的可靠完成和可解释性

好的约束不是阻碍，而是让能力更稳定、更可控的支架。

4.5 完整代码

v2 是在 v1 基础上的增量扩展，不改变核心 Agent 循环：

import traceback
import sys
from pathlib import Path
from llm_factory import LLMFactory, LLMChatAdapter
from util.mylog import logger
from utils import execute_base_tools, TodoManager, BASE_TOOLS

# 初始化 API 客户端
# 使用 LLMFactory 创建 LLM 实例
llm = LLMFactory.create(
    model_type="openai",
    model_name="deepseek-v3.2", # 使用支持的模型
    temperature=0.0,
    max_tokens=8192
)
client = LLMChatAdapter(llm)
WORKDIR = Path.cwd()
TODO = TodoManager()

SYSTEM = f"""你是一个位于 {WORKDIR} 的编码代理，系统为 {sys.platform}。

## 执行流程
计划（使用 TodoWrite） -> 使用工具行动（使用 TOOLS） -> 更新任务列表 -> 报告。

## 规则
- 使用 TodoWrite 跟踪多步骤任务
- 开始前将任务标记为 in_progress，完成后标记为 completed
- 优先使用工具而不是文字描述。先行动，不要只是解释。
- 完成后，总结变更内容。"""

# 在对话开始时显示
INITIAL_REMINDER = "<reminder>使用 TodoWrite 处理多步骤任务。</reminder>"

# 如果模型在一段时间内没有更新任务列表，则显示此提醒
NAG_REMINDER = "<reminder>超过 10 轮未更新任务列表。请更新任务列表。</reminder>"

max_steps = 20
rounds_without_todo = 0

def run_todo(items: list) -> str:
    try:
        return TODO.update(items)
    except Exception as e:
        returnf"Error: {e}"

def execute_tool(name: str, args: dict) -> str:
    """Dispatch tool call to implementation."""
    result = execute_base_tools(name, args)
    if result isnotNone:
        return result

    if name == "TodoWrite":
        return run_todo(args["items"])

    returnf"Unknown tool: {name}"

def agent_loop(prompt: str, history: list = [], max_steps: int = max_steps) -> list:
    global rounds_without_todo

    # 检查历史记录中是否已有系统提示词（作为系统消息）
    has_system = any(msg.get("role") == "system"for msg in history)
    ifnot has_system:
        # 在开头添加系统提示词作为系统消息
        history.insert(0, {"role": "system", "content": SYSTEM})

    step = 0
    while step < max_steps:
        step += 1
        response = client.chat_with_tools(
            prompt=prompt,
            messages=history,
            tools=BASE_TOOLS,
        )

        assistant_text = []
        tool_calls = []

        for block in response.content:
            if hasattr(block, "text"):
                assistant_text.append(block.text)
                logger.info(block.text)
            if block.type == "tool_use":
                tool_calls.append(block)

        full_text = "\n".join(assistant_text)
        ifnot tool_calls:
            history.append({"role": "assistant", "content": full_text})
            logger.info(f"第 {step} 步结束，无工具调用")
            return history

        results = []
        used_todo = False

        for tc in tool_calls:
            logger.info(f"\n> [使用工具] {tc.name} 第 {step} 步调用: {tc.input}")
            output = execute_tool(tc.name, tc.input)
            preview = output[:200] + "..."if len(output) > 200else output
            logger.info(f" [使用工具] {tc.name}, 输入: {tc.input}, 返回: {preview}")
            results.append(f"工具 {tc.name}, 输入: {tc.input}, 返回: {output}")
            if tc.name == "TodoWrite":
                used_todo = True

        if used_todo:
            rounds_without_todo = 0
        else:
            rounds_without_todo += 1

        history.append({"role": "assistant", "content": full_text})
        combined_output = "\n".join(results)
        history.append({"role": "user", "content": f"执行结果：\n{combined_output}\n\n请继续处理"})

def main():
    global rounds_without_todo

    logger.info(f"Mini Claude Code v2 (with Todos) - {WORKDIR}")
    logger.info("Type 'exit' to quit.\n")

    history = []
    first_message = True

    whileTrue:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break

        ifnot user_input or user_input.lower() in ("exit", "quit", "q"):
            break

        content = []

        if first_message:
            content.append(INITIAL_REMINDER)
            first_message = False
        elif rounds_without_todo > max_steps:
            content.append(NAG_REMINDER)

        content.append(f"输入：{user_input}")
        history.append({"role": "user", "content": "\n".join(content)})

        try:
            agent_loop('', history, max_steps=max_steps)
        except Exception as e:
            logger.error(f"Error: {e}")
            traceback.print_exc()

if __name__ == "__main__":
    main()

5 v3: 子代理

v2 已经有了 Todo 规划，但对于更大的任务，比如：“先探索代码库，再重构认证，然后补测试和文档”

单一 Agent 很容易撞上上下文污染与角色混乱：

探索阶段读了 20 个文件，把大量细节塞进上下文
重构时，又在同一个上下文里继续对话
模型很难在「海量历史」中保持聚焦和角色清晰

v3 添加了一个新工具：SubTask，它可以生成带有隔离上下文的「子代理」，每个子代理专注完成一个子任务。

5.1 问题：单 Agent 的上下文污染

在 v2 中，大型任务的历史会变成这样：

主 Agent 历史：
  [探索中...] cat src/auth/login.py -> 500 行
  [探索中...] cat src/auth/session.py -> 300 行
  [探索中...] cat src/models/user.py -> 400 行
  ...
  （15+ 个文件内容）
  [现在重构...] "等等，login.py 里具体是什么来着？"

模型需要在「聊天记录 + N 个文件内容 + Todo 列表」的混合上下文里继续工作，容易：

失焦：在旧文件和新任务之间来回跳跃
浪费上下文：很多信息是阶段性的，只用于探索
难以区分角色：探索 vs 规划 vs 实现

解决方案：把不同阶段委托给不同的子代理。

5.2 思路：用子代理隔离阶段

把「探索 → 规划 → 实现」拆成三个子代理，每个子代理在一个干净的上下文里工作：

主 Agent 历史：
  [Task: explore] 探索代码库
    -> 子代理( explore )：读取 20 个文件
    -> 返回摘要: "认证在 src/auth/，数据库在 src/models/..."
  [Task: plan] 设计 JWT 迁移方案
    -> 子代理( plan )：分析结构，输出步骤
    -> 返回总结: "1. 添加 jwt 库 2. 创建 token 工具..."
  [Task: code] 实现 JWT 重构
    -> 子代理( code )：编辑文件、运行测试
    -> 返回结果: "创建 jwt_utils.py，修改 login.py ..."
  [主 Agent] 汇总更改并向用户汇报

对于主 Agent 来说：

存的是子代理摘要，而不是全部细节
每个子代理内部使用和 v1/v2 相同的工具循环，但上下文是隔离的

5.3 代理类型注册表：给不同子代理不同「角色 + 权限」

通过一个简单的注册表定义不同类型代理：

AGENT_TYPES = {
    "explore": {
        "description": "只读，用于搜索和分析代码结构",
        "tools": ["bash", "read_file"],  # 不能写
        "prompt": "你是一个探索子代理，只负责搜索和分析项目代码。不要修改任何文件。返回简洁、结构化的摘要。"
    },
    "code": {
        "description": "完整读写能力，用于实现变更",
        "tools": "*",  # 所有工具（但通常不含 Task，避免递归）
        "prompt": "你是一个代码实现子代理，负责根据要求修改代码并跑测试。要高效、谨慎，做完后总结改动。"
    },
    "plan": {
        "description": "规划与分析，不做修改",
        "tools": ["bash", "read_file"],  # 只读
        "prompt": "你是一个规划子代理，负责分析现有代码并输出编号计划。不要编辑文件，只提出可执行方案。"
    }
}

这样可以明确区分：

explore：只看不写
plan：只看不写，专注输出计划
code：可以改动代码并跑测试

模型在主 Agent 中，通过 Task 工具选择 agent_type，从而决定派出什么「橘色」的子代理。

5.4 Task: 定义 schema + 控制每种子代理的能力边界

5.4.1 定义 schema

TASK_TOOL = {
    "name": "Task",
    "description": "创建一个聚焦的子任务，并用指定类型的子代理在隔离上下文中执行它。",
    "input_schema": {
        "type": "object",
        "properties": {
            "description": {
                "type": "string",
                "description": "子任务的短名称（3-5 个词），用来在日志/进度条中显示。"
            },
            "prompt": {
                "type": "string",
                "description": "详细的自然语言指令，子代理看到的用户任务描述。"
            },
            "agent_type": {
                "type": "string",
                "enum": ["explore", "code", "plan"],
                "description": "子代理的类型（决定工具与系统提示）。"
            }
        },
        "required": ["description", "prompt", "agent_type"],
    },
}

主 Agent 调用 Task 时，大致会传：

{
    "description": "探索认证代码",
    "prompt": "找到所有与用户登录和认证相关的文件，阅读后给出结构化摘要。",
    "agent_type": "explore"
}

Task 工具会启动一个子代理，子代理跑完整个 v1/v2 的工具循环，最后返回一段总结文本给主 Agent。

5.4.2 控制每种子代理的能力边界

get_tools_for_agent 的实现决定了每种子代理可以用哪些工具：

def get_tools_for_agent(agent_type):
    allowed = AGENT_TYPES[agent_type]["tools"]
    if allowed == "*":
        # code 子代理：拥有所有基础工具（不包含 Task 避免递归）
        return [t for t in BASE_TOOLS if t["name"] != "Task"]
    else:
        # explore / plan 子代理：只给指定子集
        return [
            t for t in BASE_TOOLS
            if t["name"] in allowed and t["name"] != "Task"
        ]

默认策略（可根据需要微调）：

explore：只读 -> bash + read_file
plan：只读 -> bash + read_file
code：读写/执行 -> bash + read_file + write_file + edit_file + TodoWrite（但不含 Task）

5.5 对比 v2 与 v3

探索/规划子代理不可能意外修改文件
code 子代理可以对当前子任务做完整修改与验证
主 Agent 负责「宏观协调」，子代理负责「局部执行」

功能	v2	v3
上下文形态	单一上下文，不断增长	主 Agent + 多个子代理，各自隔离
探索行为	直接在主 Agent 中执行	通过 explore 子代理执行并总结
规划行为	使用 Todo + 主 Agent 自己规划	可选 plan 子代理生成更结构化的计划
代码修改	主 Agent 直接改	code 子代理在专用上下文中改
并行潜力	理论可以，但逻辑复杂	子代理天然支持并行（演示版未实现并发调度）

核心循环仍然是：

while True:
    resp = model(messages, tools)
    if resp.stop_reason != "tool_use":
        return resp
    results = execute(resp.tool_calls)
    messages.append(resp)
    messages.append(results)

v3 只是让这个循环在多个上下文中同时存在（主 Agent + 子代理），并通过 Task 工具协调这些循环。

5.6 模式：分而治之 + 上下文隔离

抽象出来就是：

复杂任务
  └─ 主 Agent（协调者）
       ├─ 子代理 A (explore) -> 返回「结构摘要」
       ├─ 子代理 B (plan)    -> 返回「任务计划」
       └─ 子代理 C (code)    -> 返回「实现结果」

所有 Agent（主 + 子）的内部结构都是同一个：

有工具
有系统提示词
有 while True 工具循环

只是上下文不同、工具集不同、角色Prompt不同。

5.7 详细代码

import sys
import time
import traceback
from pathlib import Path
from llm_factory import LLMFactory, LLMChatAdapter
from util.mylog import logger
from utils import execute_base_tools, TodoManager, get_agent_descriptions, BASE_TOOLS, SUBAGENT_ALL_TOOLS, AGENT_TYPES

# 初始化 API 客户端
# 使用 LLMFactory 创建 LLM 实例
llm = LLMFactory.create(
    model_type="openai",
    model_name="deepseek-v3.2", # 使用支持的模型
    temperature=0.0,
    max_tokens=8192
)
client = LLMChatAdapter(llm)
WORKDIR = Path.cwd()

TODO = TodoManager()

SYSTEM = f"""你是一个位于 {WORKDIR} 的编码代理，系统为 {sys.platform}。

## 执行流程
计划（使用 TodoWrite）-> 使用工具行动 -> 执行子代理工具 -> 报告。

你可以为复杂的子任务生成子代理：
{get_agent_descriptions()}

## 规则
- 对需要集中探索或实现的子任务使用 Task 工具
- 使用 TodoWrite 跟踪多步骤工作
- 优先使用工具而不是文字描述。先行动，不要只是解释。
- 完成后，总结变更内容。"""

max_steps = 20

def get_tools_for_agent(agent_type: str) -> list:
    allowed = AGENT_TYPES.get(agent_type, {}).get("tools", "*")

    if allowed == "*":
        return BASE_TOOLS  # All base tools, but NOT Task (no recursion in demo)

    return [t for t in BASE_TOOLS if t["name"] in allowed]

def run_task(description: str, prompt: str, agent_type: str, max_steps: int = max_steps) -> str:
    """
    在隔离的上下文中执行子代理任务。

    这是子代理机制的核心：

    1. 创建隔离的消息历史（关键：没有父级上下文！）
    2. 使用特定于代理的系统提示词
    3. 根据代理类型过滤可用工具
    4. 运行与主代理相同的查询循环
    5. 仅返回最终文本（不是中间细节）

    父代理只看到总结，保持其上下文干净。

    进度显示：
    ----------------
    运行时，我们会显示：
      [explore] find auth files ... 5 tools, 3.2s

    这在不污染主对话的情况下提供了可见性。
    """
    if agent_type notin AGENT_TYPES:
        returnf"Error: Unknown agent type '{agent_type}'"

    config = AGENT_TYPES[agent_type]

    sub_system = f"""你是一个位于 {WORKDIR} 的 {agent_type} 子代理，系统为 {sys.platform}。

{config["prompt"]}

完成任务并返回清晰、简洁的总结。"""

    sub_tools = get_tools_for_agent(agent_type)
    sub_messages = [{"role": "system", "content": sub_system}, {"role": "user", "content": prompt}]

    logger.info(f"      [子代理][{agent_type}] {description}")
    start = time.time()
    tool_count = 0

    step = 0
    while step < max_steps:
        step += 1
        response = client.chat_with_tools(
            prompt='',
            messages=sub_messages,
            tools=sub_tools,
        )

        assistant_text = []
        tool_calls = []

        for block in response.content:
            if hasattr(block, "text"):
                assistant_text.append(block.text)
                logger.info(f"      [子代理][{agent_type}] {block.text}")
            if block.type == "tool_use":
                tool_calls.append(block)

        full_text = "\n".join(assistant_text)
        ifnot tool_calls:
            logger.info(f"      [子代理][{agent_type}] 第 {step} 步结束，无工具调用")
            break

        results = []

        for tc in tool_calls:
            tool_count += 1
            output = execute_tool(tc.name, tc.input)
            results.append(f"      [子代理][{agent_type}] 工具 {tc.name}, 输入: {tc.input}, 返回: {output}")
            elapsed = time.time() - start
            sys.stdout.write(
                f"\r      [子代理][{agent_type}] {description} ... {tool_count} tools, {elapsed:.1f}s\n"
            )
            sys.stdout.flush()

        sub_messages.append({"role": "assistant", "content": full_text})
        combined_output = "\n".join(results)
        sub_messages.append({"role": "user", "content": f"子代理执行结果：\n{combined_output}\n\n请继续处理"})

    elapsed = time.time() - start
    sys.stdout.write(
        f"\r      [子代理][{agent_type}] {description} - done ({tool_count} tools, {elapsed:.1f}s)\n"
    )

    for block in response.content:
        if hasattr(block, "text"):
            return full_text

    return"(subagent returned no text)"

def execute_tool(name: str, args: dict) -> str:
    result = execute_base_tools(name, args)
    if result isnotNone:
        return result

    if name == "TodoWrite":
        try:
            return TODO.update(args["items"])
        except Exception as e:
            returnf"Error: {e}"

    if name == "Task":
        return run_task(args["description"], args["prompt"], args["agent_type"])

    returnf"Unknown tool: {name}"

def agent_loop(prompt: str, history: list, max_steps: int = max_steps) -> list:
    # 检查历史记录中是否已有系统提示词（作为系统消息）
    has_system = any(msg.get("role") == "system"for msg in history)
    ifnot has_system:
        # 在开头添加系统提示词作为系统消息
        history.insert(0, {"role": "system", "content": SYSTEM})

    step = 0
    while step < max_steps:
        step += 1
        response = client.chat_with_tools(
            prompt=prompt,
            messages=history,
            tools=SUBAGENT_ALL_TOOLS,
        )

        assistant_text = []
        tool_calls = []

        for block in response.content:
            if hasattr(block, "text"):
                assistant_text.append(block.text)
                logger.info(block.text)
            if block.type == "tool_use":
                tool_calls.append(block)

        full_text = "\n".join(assistant_text)
        ifnot tool_calls:
            history.append({"role": "assistant", "content": full_text})
            logger.info(f"第 {step} 步结束，无工具调用")
            return history

        results = []
        for tc in tool_calls:
            if tc.name == "Task":
                logger.info(f"\n> [使用工具] Task 第 {step} 步调用: {tc.input.get('description', 'subtask')}")
            else:
                logger.info(f"\n> [使用工具] {tc.name} 第 {step} 步调用: {tc.input}")

            logger.info(f"  输入: {tc.input}")
            output = execute_tool(tc.name, tc.input)
            if tc.name != "Task":
                preview = output[:200] + "..."if len(output) > 200else output
                logger.info(f"  [使用工具] {tc.name}, 返回: {preview}")
                
            results.append(f"工具 {tc.name}, 输入: {tc.input}, 返回: {output}")

        history.append({"role": "assistant", "content": full_text})
        combined_output = "\n".join(results)
        history.append({"role": "user", "content": f"执行结果：\n{combined_output}\n\n请继续处理"})

def main():
    logger.info(f"Mini Claude Code v3 (with Subagents) - {WORKDIR}")
    logger.info(f"Agent types: {', '.join(AGENT_TYPES.keys())}")
    logger.info("Type 'exit' to quit.\n")

    history = []

    whileTrue:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break

        ifnot user_input or user_input.lower() in ("exit", "quit", "q"):
            break

        history.append({"role": "user", "content": user_input})

        try:
            agent_loop('', history, max_steps=max_steps)
        except Exception as e:
            logger.error(f"Error: {e}")
            traceback.print_exc()

if __name__ == "__main__":
    main()

6 v4: Skills

Skills 是知识包，不是工具本身，Skills 机制体现了一个深刻的范式转变：知识外化 (Knowledge Externalization)。

传统方式：知识内化于参数

传统 AI 系统中，大部分知识被封装在模型参数里：

你看不到里面有什么
你不能精确编辑其中某一部分
很难在不同系统之间精确复用

想让模型学会新技能，一般流程是：

收集大量相关训练数据
准备/租用分布式训练集群
运行复杂的微调流程（LoRA / 全量微调等）
部署一个新模型版本，并维护兼容性

知识被锁死在神经网络的权重矩阵中，对用户来说基本是不可见、不可编辑、不可精确复用的。

新范式：知识外化为文档

有了「代码执行 + 文件系统」的范式后，知识可以逐层外化：

┌─────────────────────────────────────────────────────────────────┐
│                         知识存储层级                              │
│                                                                 │
│  Model Parameters → Context Window → File System → Skill Library│
│       (内化)            (运行时)        (持久化)       (结构化)     │
│                                                                 │
│  ←───────── 训练修改 ──────────→  ←──── 自然语言/文本编辑 ────→     │
│     需要集群、数据、专业知识              任何人都能参与               │
└─────────────────────────────────────────────────────────────────┘

关键变化：

过去：修改行为 = 修改参数 → 需要训练 → 需要 GPU 集群 + 数据 + ML 技能
现在：修改行为 = 修改 SKILL.md → 就像写文档 → 任何人都可以做

这类似于给 base model 外挂一个「可热插拔的知识模块」，但你不需要对模型做任何参数级别的改动。

6.1 为什么 `SKill` 这种工程范式很重要

民主化
不再需要 ML 专业知识就能「定制模型行为」。写 Markdown 文档即可。
透明性
知识存在于人类可读的 SKILL.md 中，可审计、可理解、可讨论。
复用性
一个 Skill 编写一次，可以在任何兼容的 Agent 框架中加载使用。
版本控制
用 Git 管理 Skill 变更：支持协作、Code Review 和回滚。
在线学习
模型在更大的上下文窗口中**即时「学习」**技能内容，无需离线训练。

Skills 正好处在一个「可以工程化的范式」中：

持久化存储（文件系统）
按需加载（只在需要时注入）
人类可编辑（Markdown 文档）

6.2 v3 做了结构与上下文，v4 解决「知识从哪来」的问题

v3 引入了子代理机制，让 Agent 可以：

分阶段处理任务（explore / plan / code）
每个阶段在各自的上下文中运行

但还有一个更深的问题：模型怎么知道「做这件事的正确方法」？
这些不是「工具」，而是领域知识 / 专业技能，Tools 决定模型「能做什么」，Skills 决定模型「知道怎么做」。

工具 vs 技能

概念	定义	例子
Tool	模型能做什么	bash, read_file, http, etc.
Skill	模型知道怎么做	PDF 处理、MCP 构建、代码审查

Tool 是动作层面的能力（能执行什么操作）。
Skill 是策略/知识层面的积累（做这件事的正确方法、最佳实践）。

控制上下文
为了控制上下文开销，Skill 被分成三层：

Layer 1: 元数据 (始终加载，极小)    ~100 tokens/skill
         └─ name + description （用于检索/展示）

Layer 2: SKILL.md 正文 (触发时加载) ~2k tokens 级别
         └─ 详细指南、分步骤说明、示例等

Layer 3: 资源文件 (必要时再查)      无硬限制
         └─ scripts/, references/, assets/ 等

这样可以做到：

平时只把轻量的「技能列表/描述」放进 prompt（或系统提示）。
当模型「决定使用某个 Skill」时，再通过工具调用真正加载 SKILL.md 正文。
如果正文还引用了脚本/示例，可以按需再用普通文件工具读取。

6.3 Skill 目录结构与 SKILL.md 标准

Skill 目录结构

skills/
├── pdf/
│   └── SKILL.md          # 必需
├── mcp-builder/
│   ├── SKILL.md
│   └── references/       # 可选（额外资料）
└── code-review/
    ├── SKILL.md
    └── scripts/          # 可选（示例脚本/模板）

SKILL.md 采用「YAML 前置 + Markdown 正文」格式：

---
name: pdf
description: 处理 PDF 文件。用于读取、创建或合并 PDF。
---

**Skill 工具定义**

```python
SKILL_TOOL = {
    "name": "Skill",
    "description": "加载一个技能的文档，以获得领域知识和最佳实践。",
    "input_schema": {
        "type": "object",
        "properties": {
            "skill": {
                "type": "string",
                "description": "要加载的 skill 名称，例如 'pdf'、'mcp-builder'。"
            }
        },
        "required": ["skill"],
    },
}

6.4 Skill 工具执行逻辑：缓存友好式注入

关键点：Skill 内容作为 tool_result 追加到 messages 末尾，而不是修改 system prompt 或历史前缀。

def run_skill(skill_name: str) -> str:
    try:
        content = skill_loader.get_skill_content(skill_name)
    except KeyError:
        return f"[Skill error] Skill '{skill_name}' not found."

    # 完整 Skill 内容作为 tool_result 返回
    # 包在一个标签里，方便模型识别
    return f"""<skill-loaded name="{skill_name}">
{content}
</skill-loaded>

Now follow the instructions and best practices described in this skill."""

在统一的 execute_tool 中：

def execute_tool(name, args):
    if name == "Skill":
        return run_skill(args["skill"])

    # 其他工具: bash / read_file / write_file / edit_file / TodoWrite / Task...
    ...

在 agent_loop 中不需要任何结构性变化，只要像处理其他工具那样处理 Skill 即可：

def agent_loop(prompt: str, history: list, max_steps: int = max_steps) -> list:
    # 检查历史记录中是否已有系统提示词（作为系统消息）
    has_system = any(msg.get("role") == "system"for msg in history)
    ifnot has_system:
        # 在开头添加系统提示词作为系统消息
        history.insert(0, {"role": "system", "content": SYSTEM})

    step = 0
    while step < max_steps:
        step += 1
        response = client.chat_with_tools(
            prompt=prompt,
            messages=history,
            tools=ALL_TOOLS,
        )

        assistant_text = []
        tool_calls = []

        for block in response.content:
            if hasattr(block, "text"):
                assistant_text.append(block.text)
                logger.info(block.text)
            if block.type == "tool_use":
                tool_calls.append(block)

        full_text = "\n".join(assistant_text)
        ifnot tool_calls:
            history.append({"role": "assistant", "content": full_text})
            logger.info(f"第 {step} 步结束，无工具调用")
            return history

        results = []
        for tc in tool_calls:
            if tc.name == "Task":
                logger.info(f"\n> [使用工具] Task 第 {step} 步调用: {tc.input.get('description', 'subtask')}")
            elif tc.name == "Skill":
                logger.info(f"\n> [使用工具] 第 {step} 步调用 Loading skill: {tc.input.get('skill', '?')}")
            else:
                logger.info(f"\n> [使用工具] {tc.name} 第 {step} 步调用: {tc.input}")

            output = execute_tool(tc.name, tc.input)

            if tc.name == "Skill":
                logger.info(f"  Skill loaded ({len(output)} chars)")
            elif tc.name != "Task":
                preview = output[:200] + "..."if len(output) > 200else output
                logger.info(f"  [使用工具] {tc.name}, 返回: {preview}")

            results.append(f"工具 {tc.name}, 输入: {tc.input}, 返回: {output}")

        history.append({"role": "assistant", "content": full_text})
        combined_output = "\n".join(results)
        history.append({"role": "user", "content": f"执行结果：\n{combined_output}\n\n请继续处理"})

这样：

Skill 内容出现在对话末尾的 tool_result 文本里
前缀（system + 之前 history）完全不变 → prompt cache 可以完全复用
下次请求时，只需对新增的 Skill 内容和新问题部分做计算

6.5 设计哲学：从「训练 AI」到「教育 AI」

知识被提升为一等公民资源。

传统观点把 Agent 看作「调用工具的模型」——模型负责决策，工具负责执行，但这隐含一个前提：模型已经知道「如何使用这些工具解决问题」。

Skills 机制把领域知识从模型参数中剥离，做了一件事：

过去：要教模型新技能 → 收集数据 + 训练
现在：要教模型新技能 → 写/编辑 SKILL.md 文档

这是一种从「训练 AI」到「教育 AI」的转变：

技术上：从参数微调 → 上下文注入
组织上：从 ML 团队独占 → 任何工程师/领域专家都能参与
工程上：从黑盒权重 → 白盒文档（可审计、可 review）

6.6 架构图

6.7 详细代码

import re
import sys
import time
import traceback
from pathlib import Path
from llm_factory import LLMFactory, LLMChatAdapter
from util.mylog import logger
from utils import execute_base_tools, TodoManager, AGENT_TYPES, get_agent_descriptions, SUBAGENT_ALL_TOOLS, BASE_TOOLS

# 初始化 API 客户端
# 使用 LLMFactory 创建 LLM 实例
llm = LLMFactory.create(
    model_type="openai",
    model_name="deepseek-v3.2", # 使用支持的模型
    temperature=0.0,
    max_tokens=8192
)
client = LLMChatAdapter(llm)
WORKDIR = Path.cwd()
SKILLS_DIR = WORKDIR / "skills"

class SkillLoader:
    """
    从 SKILL.md 文件加载和管理技能。

    技能是一个包含以下内容的文件夹：
    - SKILL.md (必须): YAML frontmatter + markdown 说明
    - scripts/ (可选): 模型可以运行的辅助脚本
    - references/ (可选): 额外的文档
    - assets/ (可选): 模板，输出文件
    """

    def __init__(self, skills_dir: Path):
        self.skills_dir = skills_dir
        self.skills = {}
        self.load_skills()

    def parse_skill_md(self, path: Path) -> dict:
        content = path.read_text()

        # Match YAML frontmatter between --- markers
        match = re.match(r"^---\s*\n(.*?)\n---\s*\n(.*)$", content, re.DOTALL)
        ifnot match:
            returnNone

        frontmatter, body = match.groups()

        # Parse YAML-like frontmatter (simple key: value)
        metadata = {}
        for line in frontmatter.strip().split("\n"):
            if":"in line:
                key, value = line.split(":", 1)
                metadata[key.strip()] = value.strip().strip(""'")

        # Require name and description
        if"name"notin metadata or"description"notin metadata:
            returnNone

        return {
            "name": metadata["name"],
            "description": metadata["description"],
            "body": body.strip(),
            "path": path,
            "dir": path.parent,
        }

    def load_skills(self):
        ifnot self.skills_dir.exists():
            return

        for skill_dir in self.skills_dir.iterdir():
            ifnot skill_dir.is_dir():
                continue

            skill_md = skill_dir / "SKILL.md"
            ifnot skill_md.exists():
                continue

            skill = self.parse_skill_md(skill_md)
            if skill:
                self.skills[skill["name"]] = skill

    def get_descriptions(self) -> str:
        ifnot self.skills:
            return"(no skills available)"

        return"\n".join(
            f"- {name}: {skill['description']}"
            for name, skill in self.skills.items()
        )

    def get_skill_content(self, name: str) -> str:
        if name notin self.skills:
            returnNone

        skill = self.skills[name]
        content = f"# Skill: {skill['name']}\n\n{skill['body']}"

        resources = []
        for folder, label in [
            ("scripts", "Scripts"),
            ("references", "References"),
            ("assets", "Assets")
        ]:
            folder_path = skill["dir"] / folder
            if folder_path.exists():
                files = list(folder_path.glob("*"))
                if files:
                    resources.append(f"{label}: {', '.join(f.name for f in files)}")

        if resources:
            content += f"\n\n**Available resources in {skill['dir']}:**\n"
            content += "\n".join(f"- {r}"for r in resources)

        return content

    def list_skills(self) -> list:
        return list(self.skills.keys())


SKILLS = SkillLoader(SKILLS_DIR)
TODO = TodoManager()

SYSTEM = f"""你是一个位于 {WORKDIR} 的编码代理，系统为 {sys.platform}。

## 执行流程
计划（使用 TodoWrite） -> 使用工具行动 -> 报告。

**可用技能**（当任务匹配时使用 Skill 工具调用）：
{SKILLS.get_descriptions()}

**可用子代理**（对于需要集中注意力的子任务，使用 Task 工具调用）：
{get_agent_descriptions()}

规则：
- 当任务匹配技能描述时，**立即**使用 Skill 工具
- 对需要集中探索或实现的子任务使用 Task 工具
- 使用 TodoWrite 跟踪多步骤工作
- 优先使用工具而不是文字描述。先行动，不要只是解释。
- 完成后，总结变更内容。"""

# NEW in v4: Skill tool
SKILL_TOOL = {
    "name": "Skill",
    "description": f"""Load a skill to gain specialized knowledge for a task.

Available skills:
{SKILLS.get_descriptions()}

When to use:
- IMMEDIATELY when user task matches a skill description
- Before attempting domain-specific work (PDF, MCP, etc.)

The skill content will be injected into the conversation, giving you
detailed instructions and access to resources.""",
    "input_schema": {
        "type": "object",
        "properties": {
            "skill": {
                "type": "string",
                "description": "Name of the skill to load"
            }
        },
        "required": ["skill"],
    },
}

ALL_TOOLS = SUBAGENT_ALL_TOOLS + [SKILL_TOOL]
max_steps = 50

def get_tools_for_agent(agent_type: str) -> list:
    allowed = AGENT_TYPES.get(agent_type, {}).get("tools", "*")

    if allowed == "*":
        return BASE_TOOLS  # All base tools, but NOT Task (no recursion in demo)

    return [t for t in BASE_TOOLS if t["name"] in allowed]

def run_subagent_task(description: str, prompt: str, agent_type: str, max_steps: int = max_steps) -> str:
    if agent_type notin AGENT_TYPES:
        returnf"Error: Unknown agent type '{agent_type}'"

    config = AGENT_TYPES[agent_type]

    sub_system = f"""你是一个位于 {WORKDIR} 的 {agent_type} 子代理，系统为 {sys.platform}。

{config["prompt"]}

完成任务并返回清晰、简洁的总结。"""

    sub_tools = get_tools_for_agent(agent_type)
    sub_messages = [{"role": "system", "content": sub_system}, {"role": "user", "content": prompt}]

    logger.info(f"      [子代理][{agent_type}] {description}")
    start = time.time()
    tool_count = 0

    step = 0
    while step < max_steps:
        step += 1
        response = client.chat_with_tools(
            prompt='',
            messages=sub_messages,
            tools=sub_tools,
        )

        assistant_text = []
        tool_calls = []

        for block in response.content:
            if hasattr(block, "text"):
                assistant_text.append(block.text)
                logger.info(f"      [子代理][{agent_type}] {block.text}")
            if block.type == "tool_use":
                tool_calls.append(block)

        full_text = "\n".join(assistant_text)
        ifnot tool_calls:
            logger.info(f"      [子代理][{agent_type}] 第 {step} 步结束，无工具调用")
            break

        results = []

        for tc in tool_calls:
            tool_count += 1
            output = execute_tool(tc.name, tc.input)
            results.append(f"      [子代理][{agent_type}] 工具 {tc.name}, 输入: {tc.input}, 返回: {output}")
            elapsed = time.time() - start
            sys.stdout.write(
                f"\r      [子代理][{agent_type}] {description} ... {tool_count} tools, {elapsed:.1f}s\n"
            )
            sys.stdout.flush()

        sub_messages.append({"role": "assistant", "content": full_text})
        combined_output = "\n".join(results)
        sub_messages.append({"role": "user", "content": f"子代理执行结果：\n{combined_output}\n\n请继续处理"})

    elapsed = time.time() - start
    sys.stdout.write(
        f"\r      [子代理][{agent_type}] {description} - done ({tool_count} tools, {elapsed:.1f}s)\n"
    )

    for block in response.content:
        if hasattr(block, "text"):
            return full_text

    return"(subagent returned no text)"

def run_skill(skill_name: str) -> str:
    """
    加载一项技能并将其注入到对话中。

    这是关键机制：
    1. 获取技能内容（SKILL.md 正文 + 资源提示）
    2. 将其包装在 <skill-loaded> 标签中返回
    3. 模型作为 tool_result（用户消息）接收此内容
    4. 模型现在"知道"如何执行任务

    为什么使用 tool_result 而不是系统提示词？
    - 系统提示词更改会使缓存失效（成本增加 20-50 倍）
    - 工具结果追加到末尾（前缀不变，缓存命中）

    这就是生产系统保持成本效益的方式。
    """
    content = SKILLS.get_skill_content(skill_name)

    if content isNone:
        available = ", ".join(SKILLS.list_skills()) or"none"
        returnf"Error: Unknown skill '{skill_name}'. Available: {available}"

    # Wrap in tags so model knows it's skill content
    returnf"""<skill-loaded name="{skill_name}">
{content}
</skill-loaded>

Follow the instructions in the skill above to complete the user's task."""

def execute_tool(name: str, args: dict) -> str:
    result = execute_base_tools(name, args)
    if result isnotNone:
        return result

    if name == "TodoWrite":
        try:
            return TODO.update(args["items"])
        except Exception as e:
            returnf"Error: {e}"

    if name == "Task":
        try:
            return run_subagent_task(args["description"], args["prompt"], args["agent_type"])
        except Exception as e:
            returnf"Error: {e}"

    if name == "Skill":
        try:
            return run_skill(args["skill"])
        except Exception as e:
            returnf"Error: {e}"

    returnf"Unknown tool: {name}"

def agent_loop(prompt: str, history: list, max_steps: int = max_steps) -> list:
    # 检查历史记录中是否已有系统提示词（作为系统消息）
    has_system = any(msg.get("role") == "system"for msg in history)
    ifnot has_system:
        # 在开头添加系统提示词作为系统消息
        history.insert(0, {"role": "system", "content": SYSTEM})

    step = 0
    while step < max_steps:
        step += 1
        response = client.chat_with_tools(
            prompt=prompt,
            messages=history,
            tools=ALL_TOOLS,
        )

        assistant_text = []
        tool_calls = []

        for block in response.content:
            if hasattr(block, "text"):
                assistant_text.append(block.text)
                logger.info(block.text)
            if block.type == "tool_use":
                tool_calls.append(block)

        full_text = "\n".join(assistant_text)
        ifnot tool_calls:
            history.append({"role": "assistant", "content": full_text})
            logger.info(f"第 {step} 步结束，无工具调用")
            return history

        results = []
        for tc in tool_calls:
            if tc.name == "Task":
                logger.info(f"\n> [使用工具] Task 第 {step} 步调用: {tc.input.get('description', 'subtask')}")
            elif tc.name == "Skill":
                logger.info(f"\n> [使用工具] 第 {step} 步调用 Loading skill: {tc.input.get('skill', '?')}")
            else:
                logger.info(f"\n> [使用工具] {tc.name} 第 {step} 步调用: {tc.input}")

            output = execute_tool(tc.name, tc.input)

            if tc.name == "Skill":
                logger.info(f"  Skill loaded ({len(output)} chars)")
            elif tc.name != "Task":
                preview = output[:200] + "..."if len(output) > 200else output
                logger.info(f"  [使用工具] {tc.name}, 返回: {preview}")

            results.append(f"工具 {tc.name}, 输入: {tc.input}, 返回: {output}")

        history.append({"role": "assistant", "content": full_text})
        combined_output = "\n".join(results)
        history.append({"role": "user", "content": f"执行结果：\n{combined_output}\n\n请继续处理"})

def main():
    logger.info(f"Mini Claude Code v4 (with Skills) - {WORKDIR}")
    logger.info(f"Skills: {', '.join(SKILLS.list_skills()) or 'none'}")
    logger.info(f"Agent types: {', '.join(AGENT_TYPES.keys())}")
    logger.info("Type 'exit' to quit.\n")

    history = []

    whileTrue:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break

        ifnot user_input or user_input.lower() in ("exit", "quit", "q"):
            break

        history.append({"role": "user", "content": user_input})

        try:
            agent_loop('', history, max_steps=max_steps)
        except Exception as e:
            logger.error(f"Error: {e}")
            traceback.print_exc()

if __name__ == "__main__":
    main()

7 总结

本文通过使用 deepseek-v3.2 最常用的模型，通过不断规则+工具，实现最小版本的 Claude Code（后续功能继续完善中），代码开源地址：

https://github.com/linkxzhou/mylib/tree/master/llm/llmapi/miniagent

参考

（1）github.com/shareAI-lab…
（2）github.com/linkxzhou/S…

从0开发大模型之实现Agent（Bash 到 SKILL）