learn-claude-code-s03_todo_write

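The core of s03_todo_write.py is that it builds on the earlier "AI Coding Agent + Tools" setup and adds a TodoWrite / TodoManager mechanism: while working through a multi-step task, the model maintains its own task list, and the program reminds it when it forgets to update the todos. The comment at the top of the file says as much: the model tracks progress through TodoManager, and a reminder is injected if it forgets to update; the key insight is that "the agent can track its own progress, and a human can see it too." (GitHub)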

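1. What this file does overall

It implements a simple command-line AI coding agent:

After the user enters a request, the agent calls the Claude model, which can use the following tools: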

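| Tool | Purpose |
| --- | --- |
| bash | Run shell commands |
| read_file | Read a file |
| write_file | Write a file |
| edit_file | Replace a piece of text in a file |
| todo | Update the task list and record current progress |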

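Compared with s02_tool_use.py, the focus here is not whether the agent can call tools, but:

Getting the agent to plan first, then execute, then update its progress on complex tasks.

That is closer to how a real Claude Code / AI coding agent works.

2. Initialization

The code first loads environment variables, then initializes the Anthropic client and the model: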

load_dotenv(override=True)

if os.getenv("ANTHROPIC_BASE_URL"):
    os.environ.pop("ANTHROPIC_AUTH_TOKEN", None)

WORKDIR = Path.cwd()
client = Anthropic()
MODEL = os.environ["MODEL_ID"]

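This means:

Load environment variables from the .env file;
If ANTHROPIC_BASE_URL is set, requests are probably going through a proxy or a compatible service, so ANTHROPIC_AUTH_TOKEN is removed;
WORKDIR = Path.cwd() uses the current directory as the agent's working directory;
client = Anthropic() creates the Anthropic API client;
MODEL = os.environ["MODEL_ID"] reads the model ID from the environment. (GitHub)

Note: because this uses os.environ["MODEL_ID"], the program fails immediately with a KeyError if MODEL_ID is not configured.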
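One way to make that failure easier to diagnose is to look the variable up explicitly and exit with a hint. The helper below is a hypothetical sketch and not part of s03_todo_write.py; the name require_env and the wording of the message are assumptions.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or exit with a readable hint.

    Hypothetical helper: the original file indexes os.environ directly.
    """
    value = os.environ.get(name)
    if not value:
        # SystemExit prints the message and stops the program without a traceback.
        raise SystemExit(f"Missing required environment variable: {name}")
    return value
```

With such a helper, MODEL = require_env("MODEL_ID") would stand in for the direct os.environ lookup.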

3. What the SYSTEM prompt does

SYSTEM = f"""You are a coding agent at {WORKDIR}.

Use the todo tool to plan multi-step tasks. Mark in_progress before starting, completed when done.

Prefer tools over prose."""

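This system prompt tells the model four things:

You are a coding agent working in the current directory;
Use the todo tool to plan multi-step tasks;
Mark a task in_progress before starting it and completed when it is done;
Prefer tools over prose. (GitHub)

This is what makes TodoWrite work: the Python code does not force the model to plan; the system prompt and the tool schema steer it to plan on its own.

4. TodoManager: the core addition in this file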

class TodoManager:
    def __init__(self):
        self.items = []

TodoManager is a class dedicated to managing the task list; it keeps an items list. Each todo looks roughly like this:

{
    "id": "1",
    "text": "Read the project structure",
    "status": "in_progress"
}

The update() method

def update(self, items: list) -> str:

This method receives the todo list sent by the model and validates it.

It does the following:

1. At most 20 tasks

if len(items) > 20:
    raise ValueError("Max 20 todos allowed")

This keeps the model from generating a pile of meaningless tasks.

2. Every task must have text

text = str(item.get("text", "")).strip()

if not text:
    raise ValueError(f"Item {item_id}: text required")

A task must have a non-empty description.

3. status must be one of three values

status = str(item.get("status", "pending")).lower()

if status not in ("pending", "in_progress", "completed"):
    raise ValueError(...)

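The status must be one of:

| Status | Meaning |
| --- | --- |
| pending | Not started yet |
| in_progress | Being worked on |
| completed | Done |

4. Only one task can be in_progress at a time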

if in_progress_count > 1:
    raise ValueError("Only one task can be in_progress at a time")

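This matters.

An agent works best when it focuses on one thing at a time. If several tasks could be in_progress at once, the task state would become hard to interpret. (GitHub)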
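The four checks can be put together into one function. This is a minimal sketch assembled from the fragments quoted above, not a copy of the file: the name validate_todos, the fallback to the item's position when id is missing, and the wording of the invalid-status message are assumptions, since the excerpts elide those details.

```python
VALID_STATUSES = ("pending", "in_progress", "completed")


def validate_todos(items: list) -> list:
    """Validate a todo list and return a normalized copy."""
    if len(items) > 20:
        raise ValueError("Max 20 todos allowed")

    validated = []
    in_progress_count = 0
    for index, item in enumerate(items):
        # Assumption: fall back to the 1-based position when no id is given.
        item_id = str(item.get("id", index + 1))
        text = str(item.get("text", "")).strip()
        status = str(item.get("status", "pending")).lower()

        if not text:
            raise ValueError(f"Item {item_id}: text required")
        if status not in VALID_STATUSES:
            # Assumption: the exact message is elided in the excerpt above.
            raise ValueError(f"Item {item_id}: invalid status '{status}'")
        if status == "in_progress":
            in_progress_count += 1
        validated.append({"id": item_id, "text": text, "status": status})

    if in_progress_count > 1:
        raise ValueError("Only one task can be in_progress at a time")
    return validated
```

Raising ValueError rather than silently repairing the list means the model can be told exactly what was wrong and resend a corrected list.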

5. render(): turning the todos into a human-readable format

def render(self) -> str:

This method converts the internal todo state into text:

[ ] #1: Analyze the project structure
[>] #2: Modify main.py
[x] #3: Run the tests

(1/3 completed)

The corresponding marker lookup in the code is:

marker = {
    "pending": "[ ]",
    "in_progress": "[>]",
    "completed": "[x]"
}[item["status"]]

It then counts the completed items:

done = sum(1 for t in self.items if t["status"] == "completed")

Finally it returns the complete string. (GitHub)

So the todo tool is not only for the model; it is also for the user to read.
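As a rough illustration of how the marker lookup and the completed count combine into the text shown earlier, here is a standalone sketch. The function name render_todos and the "No todos." fallback for an empty list are assumptions; only the marker table and the done count come from the excerpts.

```python
MARKERS = {"pending": "[ ]", "in_progress": "[>]", "completed": "[x]"}


def render_todos(items: list) -> str:
    """Render todo items as one line each, followed by a progress summary."""
    if not items:
        return "No todos."  # assumption: the excerpt does not show the empty case
    lines = [f"{MARKERS[t['status']]} #{t['id']}: {t['text']}" for t in items]
    done = sum(1 for t in items if t["status"] == "completed")
    # The leading newline produces the blank line before the summary.
    lines.append(f"\n({done}/{len(items)} completed)")
    return "\n".join(lines)


print(render_todos([
    {"id": "1", "text": "Analyze the project structure", "status": "pending"},
    {"id": "2", "text": "Modify main.py", "status": "in_progress"},
    {"id": "3", "text": "Run the tests", "status": "completed"},
]))
```

Because the same string is returned to the model as the tool result, the model and the user see the same view of the plan.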

6. File safety restriction: safe_path()

def safe_path(p: str) -> Path:
    path = (WORKDIR / p).resolve()
    if not path.is_relative_to(WORKDIR):
        raise ValueError(f"Path escapes workspace: {p}")
    return path

This function keeps the agent from reading or writing files outside the working directory.

For example, if the current directory is:

/Users/me/project

and the model tries to read:

../../.ssh/id_rsa

safe_path() detects that the path has escaped WORKDIR and raises an error.

This is an important safety boundary for an AI coding agent. (GitHub)
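The check can be tried in isolation. The sketch below takes the working directory as a parameter (an assumption made so the example is self-contained; the original reads the module-level WORKDIR) and uses a temporary directory as a stand-in workspace. Path.is_relative_to requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def safe_path(workdir: Path, p: str) -> Path:
    """Resolve p against workdir and refuse anything that lands outside it."""
    path = (workdir / p).resolve()
    if not path.is_relative_to(workdir):
        raise ValueError(f"Path escapes workspace: {p}")
    return path


# A resolved temporary directory plays the role of the workspace.
workdir = Path(tempfile.mkdtemp()).resolve()

print(safe_path(workdir, "src/main.py"))  # stays inside the workspace

for attempt in ("../../.ssh/id_rsa", "/etc/passwd"):
    try:
        safe_path(workdir, attempt)
    except ValueError as err:
        print(err)
```

Relative escapes and absolute paths are both rejected, because joining an absolute path onto workdir discards workdir and resolve() normalizes any .. segments before the comparison.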

7. The four basic tools

7.1 run_bash()

def run_bash(command: str) -> str:

Purpose: run a shell command.

It first blocks a few dangerous commands:

dangerous = ["rm -rf /", "sudo", "shutdown", "reboot", "> /dev/"]

If the command matches any of these, it returns:

Error: Dangerous command blocked

It then calls:

subprocess.run(
    command,
    shell=True,
    cwd=WORKDIR,
    capture_output=True,
    text=True,
    timeout=120
)

This executes the command and returns stdout + stderr to the model. The output is capped at 50000 characters to keep the result from getting too large. (GitHub)
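Put together, the pieces above might look like the following sketch. Several details are assumptions rather than facts about the file: the blocklist is treated as a plain substring check, a timeout is reported as an error string, and empty output is replaced by a placeholder.

```python
import subprocess
from pathlib import Path
from typing import Optional

DANGEROUS = ["rm -rf /", "sudo", "shutdown", "reboot", "> /dev/"]


def run_bash(command: str, cwd: Optional[Path] = None, timeout: int = 120) -> str:
    """Run a shell command and return its combined, truncated output."""
    # Assumption: a simple substring match against the blocklist.
    if any(pattern in command for pattern in DANGEROUS):
        return "Error: Dangerous command blocked"
    try:
        result = subprocess.run(
            command, shell=True, cwd=cwd,
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # Assumption: report the timeout to the model instead of raising.
        return f"Error: Timeout ({timeout}s)"
    output = (result.stdout + result.stderr).strip()
    return output[:50000] if output else "(no output)"
```

If the check really is a substring match, it is a speed bump rather than a sandbox: a harmless command such as echo sudoku would be blocked, while a destructive command phrased differently would pass.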

7.2 run_read()

def run_read(path: str, limit: int = None) -> str:

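Purpose: read a file's contents.

Characteristics:

The path is first checked with safe_path();
limit is supported, to read only the first N lines;
The output is capped at 50000 characters. (GitHub)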
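The limit and truncation behavior can be sketched as below. The name read_lines is hypothetical, the path check is left out to keep the example short, and the trailing note about omitted lines is an assumption; the text above only says that the first N lines are read.

```python
from pathlib import Path
from typing import Optional


def read_lines(path: Path, limit: Optional[int] = None) -> str:
    """Return the file's text, optionally only its first `limit` lines."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if limit is not None and limit < len(lines):
        # Assumption: tell the model how much was left out.
        lines = lines[:limit] + [f"... ({len(lines) - limit} more lines)"]
    # Cap the result so one large file cannot flood the conversation.
    return "\n".join(lines)[:50000]
```

A model can call this with a small limit to peek at a file first, then ask for more only if it needs it.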

7.3 run_write()

def run_write(path: str, content: str) -> str:

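Purpose: write a file.

Logic:

Check that the path is inside the working directory;
Create the parent directory automatically if it does not exist;
Write the content;
Return the number of bytes written. (GitHub)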
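A minimal sketch of those steps, with the path check omitted for brevity. The name write_text_file and the return message are assumptions. Note that a byte count and a character count differ for non-ASCII text, so this version encodes the content before measuring it.

```python
from pathlib import Path


def write_text_file(path: Path, content: str) -> str:
    """Write content to path, creating missing parent directories first."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    # Assumption: report the size in UTF-8 bytes, not in characters.
    size = len(content.encode("utf-8"))
    return f"Wrote {size} bytes to {path.name}"
```

Creating parent directories on demand lets the model write to a new nested path in one call instead of issuing a separate mkdir first.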

7.4 run_edit()

def run_edit(path: str, old_text: str, new_text: str) -> str:

Purpose: edit a file.

It is not a sophisticated diff, just a simple exact-text replacement:

content.replace(old_text, new_text, 1)

That is, only the first occurrence of old_text is replaced.

If the original text is not found, it returns:

Error: Text not found in xxx

The design is simple, but it suits teaching well because it shows how an AI agent modifies code through a tool. (GitHub)
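The first-occurrence behavior is easy to see on a plain string. The helper name replace_first is hypothetical, and returning None for a missing match is an assumption made for this sketch; the file itself returns an error string.

```python
from typing import Optional


def replace_first(content: str, old_text: str, new_text: str) -> Optional[str]:
    """Replace only the first occurrence of old_text, or return None if absent."""
    if old_text not in content:
        return None
    # The third argument limits str.replace to a single substitution.
    return content.replace(old_text, new_text, 1)


source = "x = 1\nprint(x)\nx = 1\n"
print(replace_first(source, "x = 1", "x = 2"))
```

Since only the first match changes, the model has to include enough surrounding context in old_text to pin down the spot it actually means.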

8. TOOL_HANDLERS: mapping tool names to Python functions

TOOL_HANDLERS = {
    "bash": lambda **kw: run_bash(kw["command"]),
    "read_file": lambda **kw: run_read(kw["path"], kw.get("limit")),
    "write_file": lambda **kw: run_write(kw["path"], kw["content"]),
    "edit_file": lambda **kw: run_edit(kw["path"], kw["old_text"], kw["new_text"]),
    "todo": lambda **kw: TODO.update(kw["items"]),
}

This acts as a routing table.

When the model says:

{
  "name": "read_file",
  "input": {
    "path": "main.py"
  }
}

the program uses "read_file" to look up:

run_read(...)

It then runs that function.

The key addition is:

"todo": lambda **kw: TODO.update(kw["items"])

This lets the model call the todo tool just like any other tool. (GitHub)
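The routing idea can be shown with stub handlers. Everything in this sketch is illustrative: the stubs stand in for the real run_read and TODO.update, and the "Unknown tool" message for a name missing from the table is an assumption.

```python
def dispatch(handlers: dict, name: str, tool_input: dict) -> str:
    """Look up a handler by tool name and call it with the model's input."""
    handler = handlers.get(name)
    if handler is None:
        # Assumption: report unknown names instead of crashing.
        return f"Unknown tool: {name}"
    return handler(**tool_input)


# Stub handlers standing in for the real tool functions.
handlers = {
    "read_file": lambda **kw: f"(contents of {kw['path']})",
    "todo": lambda **kw: f"{len(kw['items'])} todo(s) recorded",
}

print(dispatch(handlers, "read_file", {"path": "main.py"}))
print(dispatch(handlers, "todo", {"items": [{"id": "1"}]}))
print(dispatch(handlers, "delete_everything", {}))
```

Because each entry takes **kw, adding a tool is a one-line change to the table and the dispatch code never has to know a tool's parameters.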

9. TOOLS: telling Claude which tools are available

TOOLS is the list of tool definitions passed to the Anthropic API.

For example, the bash tool:

{
    "name": "bash",
    "description": "Run a shell command.",
    "input_schema": {
        "type": "object",
        "properties": {
            "command": {"type": "string"}
        },
        "required": ["command"]
    }
}

This tells the model:

You can call a tool named bash, and it takes a command string.

The schema for the todo tool is more involved:

{
    "name": "todo",
    "description": "Update task list. Track progress on multi-step tasks.",
    "input_schema": {
        "type": "object",
        "properties": {
            "items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "id": {"type": "string"},
                        "text": {"type": "string"},
                        "status": {
                            "type": "string",
                            "enum": ["pending", "in_progress", "completed"]
                        }
                    },
                    "required": ["id", "text", "status"]
                }
            }
        },
        "required": ["items"]
    }
}

In other words, when the model calls todo it must pass an items array, and every task must have id, text, and status. (GitHub)

10. agent_loop(): the agent's main loop

def agent_loop(messages: list):
    rounds_since_todo = 0
    while True:
        response = client.messages.create(...)

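The main loop works like this:

Send the conversation history to Claude;
Send the tool list TOOLS along with it;
Claude either answers directly or requests a tool call;
If Claude requests a tool, Python executes it;
The tool result is appended to the conversation as a user message;
The next round begins, and this repeats until Claude stops calling tools. (GitHub)

The core API call: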

response = client.messages.create(
    model=MODEL,
    system=SYSTEM,
    messages=messages,
    tools=TOOLS,
    max_tokens=8000,
)

Passing tools=TOOLS is what gives the model the ability to call tools.
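The shape of the loop can be exercised without network access by swapping the API call for a scripted stand-in. In this sketch, create is a hypothetical callable playing the role of client.messages.create, the echo tool is invented for the demo, and SimpleNamespace objects imitate response blocks; none of that comes from the file.

```python
from types import SimpleNamespace


def agent_loop(messages: list, create, handlers: dict) -> None:
    """Keep calling the model until it replies without requesting a tool."""
    while True:
        response = create(messages)
        messages.append({"role": "assistant", "content": response.content})
        tool_uses = [b for b in response.content if b.type == "tool_use"]
        if not tool_uses:
            return  # plain answer: the turn is over
        results = []
        for block in tool_uses:
            output = handlers[block.name](**block.input)
            results.append({"type": "tool_result",
                            "tool_use_id": block.id,
                            "content": str(output)})
        # Tool results go back to the model as a user message.
        messages.append({"role": "user", "content": results})


def make_fake_create():
    """Scripted stand-in for the API: one tool call, then a text reply."""
    turns = iter([
        [SimpleNamespace(type="tool_use", id="t1", name="echo", input={"text": "hi"})],
        [SimpleNamespace(type="text", text="done")],
    ])
    return lambda messages: SimpleNamespace(content=next(turns))


messages = [{"role": "user", "content": "say hi"}]
agent_loop(messages, make_fake_create(), {"echo": lambda **kw: kw["text"]})
print([m["role"] for m in messages])
```

The roles alternate user, assistant, user, assistant: the second user message carries the tool result rather than anything the human typed.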

11. Handling tool calls

for block in response.content:
    if block.type == "tool_use":
        handler = TOOL_HANDLERS.get(block.name)
        output = handler(**block.input)

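Claude's response content may contain several blocks; a block of type tool_use means the model wants to call a tool.

For example, the model might return:

{
  "type": "tool_use",
  "name": "todo",
  "input": {
    "items": [
      {
        "id": "1",
        "text": "Read the file",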
        "status": "in_progress"
      }
    ]
  }
}

When the program sees block.name == "todo", it fetches the matching function from TOOL_HANDLERS and runs it.

After execution, the result is wrapped as:

{
    "type": "tool_result",
    "tool_use_id": block.id,
    "content": str(output)
}

It is then appended to messages so the model can see the result of the tool call. (GitHub)
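Since TodoManager.update() raises ValueError on invalid input, it helps if a failing tool turns into a result the model can read instead of a crash. The sketch below shows one way to do that; the helper name to_tool_result and the "Error: ..." format are assumptions, and the rejecting handler is a stub that imitates a failed validation.

```python
from types import SimpleNamespace


def to_tool_result(block, handlers: dict) -> dict:
    """Run the handler for one tool_use block and wrap the outcome."""
    try:
        output = handlers[block.name](**block.input)
    except Exception as err:
        # Assumption: failures are reported back to the model as text.
        output = f"Error: {err}"
    return {"type": "tool_result", "tool_use_id": block.id, "content": str(output)}


def reject(**kw):
    """Stub that imitates a todo update failing validation."""
    raise ValueError("Only one task can be in_progress at a time")


block = SimpleNamespace(type="tool_use", id="toolu_demo", name="todo",
                        input={"items": []})
print(to_tool_result(block, {"todo": reject}))
```

The tool_use_id ties each result back to the request that produced it, which matters when one response contains several tool calls.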

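12. rounds_since_todo: reminding the model to update its todos

This is the most interesting part of the file:

rounds_since_todo = 0

It records how many consecutive rounds the model has gone without using the todo tool.

After each round of tool calls: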

if block.name == "todo":
    used_todo = True

rounds_since_todo = 0 if used_todo else rounds_since_todo + 1

If the model used todo, the counter resets to zero.

If it did not, the counter goes up by 1.

Once it reaches 3 rounds:

if rounds_since_todo >= 3:
    results.append({
        "type": "text",
        "text": "<reminder>Update your todos.</reminder>"
    })

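In other words, if the model spends 3 consecutive rounds just reading files, editing files, and running commands without updating its task status, the program adds an extra reminder: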

<reminder>Update your todos.</reminder>

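This reminder does not come from the user; the agent framework adds it automatically to keep the model managing its tasks. (GitHub)

This is what the title means by: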

keeping the model on course without scripting the route

That is:

The program does not dictate each step the model takes, but uses the todo list and the reminder to keep it from drifting off course.
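The counter and the reminder can be isolated into one small function to see how they interact over several rounds. The function name track_todo_usage and the threshold parameter are assumptions for this sketch; the reset rule, the limit of 3, and the reminder text follow the excerpts above.

```python
REMINDER = {"type": "text", "text": "<reminder>Update your todos.</reminder>"}


def track_todo_usage(rounds_since_todo: int, tool_names: list,
                     results: list, threshold: int = 3) -> int:
    """Update the counter for one round and append a reminder when due."""
    used_todo = "todo" in tool_names
    rounds_since_todo = 0 if used_todo else rounds_since_todo + 1
    if rounds_since_todo >= threshold:
        results.append(dict(REMINDER))
    return rounds_since_todo


# Simulate four rounds of tool calls; only the last one touches todo.
counter = 0
history = []
for names in (["read_file"], ["bash"], ["edit_file"], ["todo"]):
    results = []
    counter = track_todo_usage(counter, names, results)
    history.append((counter, len(results)))
print(history)
```

In this sketch the counter is not reset when the reminder fires, so the reminder would repeat every round until the model actually calls todo.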

13. Command-line entry point

if __name__ == "__main__":
    history = []
    while True:
        query = input("\033[36ms03 >> \033[0m")

When you run this Python file directly, it enters an interactive command-line mode.

After the user types something:

history.append({"role": "user", "content": query})
agent_loop(history)

The program appends the user's input to the message history, then starts the agent loop.

If you enter:

q
exit

or simply press Enter with no input, the program exits. (GitHub)
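The exit rule fits in a one-line predicate. The name should_exit is hypothetical, and trimming whitespace and ignoring case are assumptions about how the comparison is done.

```python
def should_exit(query: str) -> bool:
    """Return True for the inputs that end the interactive session."""
    # Assumption: surrounding whitespace and letter case are ignored.
    return query.strip().lower() in ("q", "exit", "")


for text in ("q", "exit", "", "  EXIT  ", "fix the tests"):
    print(repr(text), should_exit(text))
```

Treating an empty line as exit keeps an accidental Enter from sending an empty message to the model.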

The final part:

response_content = history[-1]["content"]
if isinstance(response_content, list):
    for block in response_content:
        if hasattr(block, "text"):
            print(block.text)

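Its job is to print the model's final natural-language text.

Anthropic's returned content may be a list of blocks containing both text blocks and tool_use blocks, so the code checks whether each block has a .text attribute.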
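The attribute check can be demonstrated with stand-in objects. In this sketch the helper name extract_text is hypothetical and SimpleNamespace instances imitate response blocks; handling a plain string is an added assumption, since a user message in history holds a string rather than a list.

```python
from types import SimpleNamespace


def extract_text(content) -> list:
    """Collect the text of every block that has one."""
    if isinstance(content, str):
        return [content]  # assumption: plain-string content is passed through
    # tool_use blocks carry no .text attribute, so hasattr filters them out.
    return [block.text for block in content if hasattr(block, "text")]


blocks = [
    SimpleNamespace(type="text", text="Created hello.py."),
    SimpleNamespace(type="tool_use", id="t1", name="bash", input={"command": "ls"}),
    SimpleNamespace(type="text", text="All tasks completed."),
]
for line in extract_text(blocks):
    print(line)
```

Checking for the attribute instead of the block type keeps the printing code independent of which block types exist.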

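14. A typical run

Suppose you enter:

Create a hello.py for me and write a function that prints hello world

Ideally, the agent's flow might look like this:

Round 1: the model plans first

It calls todo with:

[
  {"id": "1", "text": "Create the hello.py file", "status": "in_progress"},
  {"id": "2", "text": "Write the hello world function", "status": "pending"},
  {"id": "3", "text": "Run the file to verify the result", "status": "pending"}
]

The tool returns the rendered list:

[>] #1: Create the hello.py file
[ ] #2: Write the hello world function
[ ] #3: Run the file to verify the result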

(0/3 completed)

Round 2: call write_file

It writes:

def hello():
    print("hello world")

hello()

Round 3: update todo

Mark task 1 as completed and task 2 as in_progress.

Round 4: call bash

It runs:

python hello.py

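Round 5: update todo

Mark all tasks as completed.

Round 6: final reply to the user

Tell the user the file has been created and verified.

15. What this file is really teaching

It is not just teaching Python syntax; it covers a key capability in agent architecture:

On complex tasks, an agent cannot just think and call tools as it goes; it also needs to maintain task state explicitly.

One way to picture it:

LLM = brain
Tools = hands
TodoManager = task board
Agent Loop = execution loop
Reminder = guard against drifting

TodoManager does not make decisions for the model; it has the model record its decisions in a structured form.

This closely resembles a real AI coding agent:
on a complex task, the model should list a plan first and update the status after each step, rather than changing files haphazardly in one go.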

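16. How it relates to the previous two files

Roughly:

| File | Focus |
| --- | --- |
| s01_agent_loop.py | Minimal agent loop: user input → model reply |
| s02_tool_use.py | Adds tools to the agent: it can read files, write files, and run commands |
| s03_todo_write.py | Adds task management to the agent: it can plan, track progress, and avoid drifting |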

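So s03_todo_write.py takes s02 one step closer to the real Claude Code:
the agent does not just call tools, it manages the execution of a complex task.

17. One-sentence summary

s03_todo_write.py implements a simple AI coding agent with TodoWrite task management: it can call file and command-line tools to complete development tasks, maintains a task list through the todo tool, and automatically injects a reminder when the model forgets to update its progress, which makes the agent more controllable and more transparent on multi-step tasks.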