[Hand-Writing Claude Code from Scratch: learn-claude-code Project Notes] (3) TodoWrite


Chapter 3: TodoWrite

s01 > s02 > [ s03 ] s04 > s05 > s06 | s07 > s08 > s09 > s10 > s11 > s12

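"This column is based on the official documentation of the open-source project learn-claude-code. The original docs are quite hardcore, so to make them easier for beginners like me, I read them line by line and added plenty of annotations, plain-language explanations, and notes on the pitfalls I hit. I hope this 'pre-chewed' tutorial helps you open the door to AI Agent development."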

Project repository: shareAI-lab/learn-claude-code: Bash is all you need - A nano Claude Code–like agent, built from 0 to 1

"An agent without a plan drifts" -- list the steps first, then act, and the completion rate doubles.

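I. The Problem: Models Forget

A critical weakness of large models on long-horizon tasks: in multi-step work the model loses track of its progress. It repeats work already done, skips steps, or drifts off course. The longer the conversation, the worse it gets: tool results keep filling the context, and the system prompt's influence is gradually diluted. A 10-step refactor may get through steps 1-3 and then start improvising, because steps 4-10 have been pushed out of the model's attention.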

II. The Solution

+--------+      +-------+      +---------+
|  User  | ---> |  LLM  | ---> | Tools   |
| prompt |      |       |      | + todo  |
+--------+      +---+---+      +----+----+
                    ^                |
                    |   tool_result  |
                    +----------------+
                          |
              +-----------+-----------+
              | TodoManager state     |
              | [ ] task A            |
              | [>] task B  <- doing  |
              | [x] task C            |
              +-----------------------+
                          |
              if rounds_since_todo >= 3:
                inject <reminder> into tool_result

1. Explicit State with the todo Tool

This does more than let the LLM think; it forces the LLM to write its plan down.

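  • Introduce a TodoManager class: a simple Python class that stores the task list.

  • Add a todo tool: the model calls it to create and update the task list.

    • The model must explicitly mark each task as pending, in_progress, or completed.
    • Effect: even when the conversation history is long, the model's current task state stays clear. It lives in the TODO variable, and the rendered list is returned to the model each time it calls the todo tool.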
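To make the "write it down" idea concrete, here is a rough sketch of what two successive todo calls might look like. The payload shape (an items array whose entries carry id, text, and status) follows the tool schema in the complete code at the end of this chapter; the task texts and the status_changes helper are made up for illustration. Note that the model resends the whole list on every call, since update() replaces the stored items rather than patching them.

```python
# Hypothetical payloads for two successive todo calls (task texts are invented).
first_call = {"items": [
    {"id": "1", "text": "Add type hints", "status": "in_progress"},
    {"id": "2", "text": "Add docstrings", "status": "pending"},
    {"id": "3", "text": "Add a main guard", "status": "pending"},
]}
second_call = {"items": [
    {"id": "1", "text": "Add type hints", "status": "completed"},
    {"id": "2", "text": "Add docstrings", "status": "in_progress"},
    {"id": "3", "text": "Add a main guard", "status": "pending"},
]}


def status_changes(before: dict, after: dict) -> dict:
    """Map item id -> (old_status, new_status) for items whose status changed.

    Illustrative helper only; it is not part of the project.
    """
    old = {item["id"]: item["status"] for item in before["items"]}
    return {
        item["id"]: (old.get(item["id"]), item["status"])
        for item in after["items"]
        if old.get(item["id"]) != item["status"]
    }


changes = status_changes(first_call, second_call)
print(changes)
```

Between the two calls exactly one task finishes and the next one starts, which is the step-by-step progression the tool is meant to encourage.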

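2. The "Nag" Reminder

Having the tool is not enough: the model may never call it, or may stop calling it partway through. So s03 adds a supervision step: the agent loop counts the rounds since the last todo call and injects a reminder once that count reaches 3.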

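III. How It Works

  1. TodoManager stores items with a status. Only one item may be in_progress at a time. This is a behavioral constraint on the model. Large models sometimes think divergently and try to work "dual-threaded": planning a change to file A while also fixing file B along the way. That usually ends in confusion, with neither task done properly. The rule here is strict: at any given moment, do ONE thing.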
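class TodoManager:
    def __init__(self):
        self.items = []

    def update(self, items: list) -> str:
        validated, in_progress_count = [], 0
        for item in items:
            status = item.get("status", "pending")
            # Status must be one of pending, in_progress, completed
            if status not in ("pending", "in_progress", "completed"):
                raise ValueError(f"Invalid status '{status}'")
            if status == "in_progress":
                in_progress_count += 1
            validated.append({"id": item["id"], "text": item["text"],
                              "status": status})
        # Enforce the single-focus constraint
        if in_progress_count > 1:
            raise ValueError("Only one task can be in_progress")
        self.items = validated
        return self.render()  # render() is defined in the complete code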
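As a small usage sketch of the single-focus rule, the snippet below isolates just that check in a standalone function and runs a plan through it twice. The names validate_single_focus and try_plan are hypothetical helpers, not part of the project, and the task texts are invented.

```python
def validate_single_focus(items: list) -> list:
    """Raise ValueError if more than one item is in_progress (illustrative only)."""
    active = [item["id"] for item in items
              if item.get("status", "pending") == "in_progress"]
    if len(active) > 1:
        raise ValueError(f"Only one task can be in_progress (got ids {active})")
    return items


def try_plan(items: list) -> str:
    """Report whether a plan passes the check, instead of crashing."""
    try:
        validate_single_focus(items)
        return "accepted"
    except ValueError as exc:
        return f"rejected: {exc}"


plan = [
    {"id": "1", "text": "Fix file A", "status": "in_progress"},
    {"id": "2", "text": "Fix file B", "status": "in_progress"},
]
first = try_plan(plan)          # two tasks active at once
plan[1]["status"] = "pending"   # put the second task back in the queue
second = try_plan(plan)
print(first)
print(second)
```

The "dual-threaded" plan is refused, and the same plan goes through once only one task is marked as active.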
  2. The todo tool goes into the dispatch map like any other tool.
TOOL_HANDLERS = {
    # ...base tools...
    "todo": lambda **kw: TODO.update(kw["items"]),
}
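One consequence worth spelling out: in the complete code, the agent loop looks the handler up by name and wraps the call in try/except, so a failed validation does not crash the agent. It comes back to the model as an "Error: ..." string that the model can react to. Below is a minimal sketch of that path. save_todos is a hypothetical stand-in for TODO.update that keeps only the 20-item cap, and dispatch is an illustrative wrapper, not a function from the project.

```python
def save_todos(items: list) -> str:
    # Hypothetical stand-in for TODO.update: keeps only the 20-item cap.
    if len(items) > 20:
        raise ValueError("Max 20 todos allowed")
    return f"{len(items)} todos saved"


HANDLERS = {
    "todo": lambda **kw: save_todos(kw["items"]),
}


def dispatch(name: str, tool_input: dict) -> str:
    """Look up a handler by tool name and turn failures into text."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"Unknown tool: {name}"
    try:
        return str(handler(**tool_input))
    except Exception as exc:  # the error text becomes the tool result
        return f"Error: {exc}"


one_item = [{"id": "1", "text": "Add type hints", "status": "pending"}]
many_items = [{"id": str(i), "text": "x", "status": "pending"} for i in range(21)]

ok = dispatch("todo", {"items": one_item})
too_many = dispatch("todo", {"items": many_items})
unknown = dispatch("grep", {"pattern": "TODO"})
print(ok, too_many, unknown, sep="\n")
```

Returning the error as text keeps the loop running and gives the model a chance to resubmit a corrected list on the next round.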
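  3. Explicit state and the counter: the code uses a used_todo flag to detect whether the todo tool was called in a round.
  • If todo was called, the rounds_since_todo counter resets to zero.
  • If it was not (for example, only bash ran), the counter increases by 1.
  • Once the counter reaches 3, the nag reminder is inserted into the user message, pushing the model to review its progress.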

used_todo = False
for block in response.content:
    # ...
    if block.name == "todo":
        used_todo = True
rounds_since_todo = 0 if used_todo else rounds_since_todo + 1
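To see how the counter behaves over time, here is a small offline simulation. It assumes each round is described simply by the list of tool names the model called. nag_trace and the sample rounds are illustrative, not code from the project.

```python
def nag_trace(tool_rounds: list, threshold: int = 3) -> list:
    """Return (rounds_since_todo, nag_fires) after each round."""
    rounds_since_todo = 0
    trace = []
    for tools in tool_rounds:
        used_todo = "todo" in tools
        rounds_since_todo = 0 if used_todo else rounds_since_todo + 1
        trace.append((rounds_since_todo, rounds_since_todo >= threshold))
    return trace


rounds = [
    ["todo"],               # plan written: counter resets
    ["bash"],
    ["read_file"],
    ["bash", "edit_file"],  # third round without todo: nag fires
    ["bash"],               # still no todo: nag fires again
    ["todo", "bash"],       # todo used alongside bash: counter resets
    ["bash"],
]
trace = nag_trace(rounds)
for step in trace:
    print(step)
```

Because the condition is `>= 3` and the counter is only reset by a todo call, the reminder repeats on every round until the model actually updates its list.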
  4. Nag reminder: when the model goes 3 or more consecutive rounds without calling todo, a reminder is injected.
if rounds_since_todo >= 3 and messages:
    last = messages[-1]
    if last["role"] == "user" and isinstance(last.get("content"), list):
        last["content"].insert(0, {
            "type": "text",
            "text": "<reminder>Update your todos.</reminder>",
        })
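The guard conditions in that snippet are easy to skim past, so here is a hedged sketch that wraps the same logic in a function and runs it on hand-built messages. inject_reminder, the message contents, and the "toolu_demo" id are all made up for illustration. No API call is involved.

```python
REMINDER = {"type": "text", "text": "<reminder>Update your todos.</reminder>"}


def inject_reminder(messages: list, rounds_since_todo: int, threshold: int = 3) -> bool:
    """Prepend the reminder to the last user message if it holds a block list."""
    if rounds_since_todo < threshold or not messages:
        return False
    last = messages[-1]
    if last["role"] == "user" and isinstance(last.get("content"), list):
        last["content"].insert(0, dict(REMINDER))
        return True
    return False


# A turn that ends with tool results (content is a list of blocks).
tool_turn = [
    {"role": "user", "content": "Refactor hello.py"},
    {"role": "assistant", "content": []},  # placeholder for the model's reply
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_demo", "content": "(no output)"},
    ]},
]
# A turn that ends with a plain typed prompt (content is a string).
plain_turn = [{"role": "user", "content": "Refactor hello.py"}]

skipped_early = inject_reminder(tool_turn, rounds_since_todo=2)
injected = inject_reminder(tool_turn, rounds_since_todo=3)
skipped_plain = inject_reminder(plain_turn, rounds_since_todo=5)
print(skipped_early, injected, skipped_plain)
print(tool_turn[-1]["content"][0]["text"])
```

The isinstance check means the reminder only ever rides along with tool results. A prompt typed by the human is stored as a plain string and is left untouched.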

"Only one in_progress at a time" forces sequential focus. The nag reminder creates accountability: if you do not update the plan, the system keeps asking.

IV. Changes Relative to s02

Component       Before (s02)      After (s03)
Tools           4                 5 (+todo)
Planning        None              Stateful TodoManager
Nag injection   None              <reminder> injected after 3 rounds
Agent loop      Simple dispatch   + rounds_since_todo counter

V. Try It

cd learn-claude-code
python agents/s03_todo_write.py

Try these prompts (English prompts tend to work better with LLMs, but Chinese works too):

  1. Refactor the file hello.py: add type hints, docstrings, and a main guard
  2. Create a Python package with __init__.py, utils.py, and tests/test_utils.py
  3. Review all Python files and fix any style issues

Complete Code

#!/usr/bin/env python3
import os
import subprocess
from pathlib import Path

from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv(override=True)

if os.getenv("ANTHROPIC_BASE_URL"):
    os.environ.pop("ANTHROPIC_AUTH_TOKEN", None)

WORKDIR = Path.cwd()
client = Anthropic(base_url=os.getenv("ANTHROPIC_BASE_URL"))
MODEL = os.environ["MODEL_ID"]

SYSTEM = f"""You are a coding agent at {WORKDIR}.
Use the todo tool to plan multi-step tasks. Mark in_progress before starting, completed when done.
Prefer tools over prose."""


# -- TodoManager: structured state the LLM writes to --
class TodoManager:
    def __init__(self):
        self.items = []

    def update(self, items: list) -> str:
        if len(items) > 20:
            raise ValueError("Max 20 todos allowed")
        validated = []
        in_progress_count = 0
        for i, item in enumerate(items):
            text = str(item.get("text", "")).strip()
            status = str(item.get("status", "pending")).lower()
            item_id = str(item.get("id", str(i + 1)))
            if not text:
                raise ValueError(f"Item {item_id}: text required")
            if status not in ("pending", "in_progress", "completed"):
                raise ValueError(f"Item {item_id}: invalid status '{status}'")
            if status == "in_progress":
                in_progress_count += 1
            validated.append({
                "id": item_id,
                "text": text,
                "status": status
            })
        if in_progress_count > 1:
            raise ValueError("Only one task can be in_progress at a time")
        self.items = validated
        return self.render()

    def render(self) -> str:
        if not self.items:
            return "No todos."
        lines = []
        for item in self.items:
            marker = {"pending": "[ ]", "in_progress": "[>]", "completed": "[x]"}[item["status"]]
            lines.append(f"{marker} #{item['id']}: {item['text']}")
        done = sum(1 for t in self.items if t["status"] == "completed")
        lines.append(f"\n({done}/{len(self.items)} completed)")
        return "\n".join(lines)


TODO = TodoManager()


# -- Tool implementations --
def safe_path(p: str) -> Path:
    path = (WORKDIR / p).resolve()
    if not path.is_relative_to(WORKDIR):
        raise ValueError(f"Path escapes workspace: {p}")
    return path

def run_bash(command: str) -> str:
    dangerous = ["rm -rf /", "sudo", "shutdown", "reboot", "> /dev/"]
    if any(d in command for d in dangerous):
        return "Error: Dangerous command blocked"
    try:
        r = subprocess.run(command, shell=True, cwd=WORKDIR,
                           capture_output=True, text=True, timeout=120)
        out = (r.stdout + r.stderr).strip()
        return out[:50000] if out else "(no output)"
    except subprocess.TimeoutExpired:
        return "Error: Timeout (120s)"

def run_read(path: str, limit: int = None) -> str:
    try:
        lines = safe_path(path).read_text().splitlines()
        if limit and limit < len(lines):
            lines = lines[:limit] + [f"... ({len(lines) - limit} more)"]
        return "\n".join(lines)[:50000]
    except Exception as e:
        return f"Error: {e}"

def run_write(path: str, content: str) -> str:
    try:
        fp = safe_path(path)
        fp.parent.mkdir(parents=True, exist_ok=True)
        fp.write_text(content)
        return f"Wrote {len(content)} bytes"
    except Exception as e:
        return f"Error: {e}"

def run_edit(path: str, old_text: str, new_text: str) -> str:
    try:
        fp = safe_path(path)
        content = fp.read_text()
        if old_text not in content:
            return f"Error: Text not found in {path}"
        fp.write_text(content.replace(old_text, new_text, 1))
        return f"Edited {path}"
    except Exception as e:
        return f"Error: {e}"


TOOL_HANDLERS = {
    "bash":       lambda **kw: run_bash(kw["command"]),
    "read_file":  lambda **kw: run_read(kw["path"], kw.get("limit")),
    "write_file": lambda **kw: run_write(kw["path"], kw["content"]),
    "edit_file":  lambda **kw: run_edit(kw["path"], kw["old_text"], kw["new_text"]),
    "todo":       lambda **kw: TODO.update(kw["items"]),
}

TOOLS = [
    {"name": "bash", "description": "Run a shell command.",
     "input_schema": {"type": "object", "properties": {"command": {"type": "string"}}, "required": ["command"]}},
    {"name": "read_file", "description": "Read file contents.",
     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "limit": {"type": "integer"}}, "required": ["path"]}},
    {"name": "write_file", "description": "Write content to file.",
     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}, "required": ["path", "content"]}},
    {"name": "edit_file", "description": "Replace exact text in file.",
     "input_schema": {"type": "object", "properties": {"path": {"type": "string"}, "old_text": {"type": "string"}, "new_text": {"type": "string"}}, "required": ["path", "old_text", "new_text"]}},
    {"name": "todo", "description": "Update task list. Track progress on multi-step tasks.",
     "input_schema": {"type": "object", "properties": {"items": {"type": "array", "items": {"type": "object", "properties": {"id": {"type": "string"}, "text": {"type": "string"}, "status": {"type": "string", "enum": ["pending", "in_progress", "completed"]}}, "required": ["id", "text", "status"]}}}, "required": ["items"]}},
]


# -- Agent loop with nag reminder injection --
def agent_loop(messages: list):
    rounds_since_todo = 0
    while True:
        # Nag reminder is injected below, alongside tool results
        response = client.messages.create(
            model=MODEL, system=SYSTEM, messages=messages,
            tools=TOOLS, max_tokens=8000,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return
        results = []
        used_todo = False
        for block in response.content:
            if block.type == "tool_use":
                handler = TOOL_HANDLERS.get(block.name)
                try:
                    output = handler(**block.input) if handler else f"Unknown tool: {block.name}"
                except Exception as e:
                    output = f"Error: {e}"
                print(f"> {block.name}: {str(output)[:200]}")
                results.append({"type": "tool_result", "tool_use_id": block.id, "content": str(output)})
                if block.name == "todo":
                    used_todo = True
        rounds_since_todo = 0 if used_todo else rounds_since_todo + 1
        if rounds_since_todo >= 3:
            results.insert(0, {"type": "text", "text": "<reminder>Update your todos.</reminder>"})
        messages.append({"role": "user", "content": results})


if __name__ == "__main__":
    history = []
    while True:
        try:
            query = input("\033[36ms03 >> \033[0m")
        except (EOFError, KeyboardInterrupt):
            break
        if query.strip().lower() in ("q", "exit", ""):
            break
        history.append({"role": "user", "content": query})
        agent_loop(history)
        response_content = history[-1]["content"]
        if isinstance(response_content, list):
            for block in response_content:
                if hasattr(block, "text"):
                    print(block.text)
        print()