🏢 Daily Open-Source Study #003: MetaGPT, When AI Forms a Software Company

Project info

Project: MetaGPT (FoundationAgents/MetaGPT)
GitHub: github.com/geekan/Meta…
Stars: 67,502 ⭐
License: MIT
Language: Python

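Why look at this project?

I have been studying multi-agent system design lately, and MetaGPT's slogan caught my attention: "Enable GPT to work in a software company". The goal is to have GPT collaborate the way a real software company does.

The core idea, "Code = SOP(Team)", is just as thought-provoking: take the standard operating procedures (SOPs) of a real software company, codify them, and apply them to a team of LLM agents.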

Core idea: Code = SOP(Team)

Real software company           MetaGPT
─────────────────────────────────────────
Boss (one-line requirement)  →  User input
Product manager              →  ProductManager
Architect                    →  Architect
Engineer                     →  Engineer
QA engineer                  →  QaEngineer
SOP process                  →  SOP-driven workflow

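MetaGPT abstracts the whole software development process: the input is a one-line requirement, and the output is complete code, specification documents, test cases, and so on. Internally, it simulates the SOP of a software company.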
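To make "Code = SOP(Team)" concrete, here is a minimal sketch that treats an SOP as an ordered list of stages, each owned by a role. It is not MetaGPT code: `Stage`, `run_sop`, and the lambda "actions" are hypothetical stand-ins for LLM-backed actions, and only the role and action names echo the project.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    role: str                    # who performs the step
    action: str                  # what the step is called
    work: Callable[[str], str]   # stand-in for an LLM-backed action


def run_sop(requirement: str, sop: list[Stage]) -> dict[str, str]:
    """Feed each stage the previous stage's output and collect every artifact."""
    artifacts: dict[str, str] = {}
    current = requirement
    for stage in sop:
        current = stage.work(current)
        artifacts[stage.action] = current
    return artifacts


# Hypothetical stages: a real system would call an LLM inside each one.
SOP = [
    Stage("ProductManager", "WritePRD", lambda req: f"PRD for: {req}"),
    Stage("Architect", "WriteDesign", lambda prd: f"Design based on [{prd}]"),
    Stage("Engineer", "WriteCode", lambda design: f"Code implementing [{design}]"),
]

artifacts = run_sop("a 2048 game", SOP)
for action, artifact in artifacts.items():
    print(f"{action}: {artifact}")
```

The fixed stage order is the "SOP" part: the team cannot skip the PRD or write code before a design exists, which is what makes the process predictable.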

Overall architecture

┌────────────────────────────────────────────────────────────────────┐
│                           Team (company)                           │
│                                                                    │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │
│  │ Product manager  │  │ Architect        │  │ Engineer         │  │
│  │ ProductManager   │  │ Architect        │  │ Engineer         │  │
│  │ Alice            │  │ Brian            │  │ Cathy            │  │
│  └────────┬─────────┘  └────────┬─────────┘  └────────┬─────────┘  │
│           │                     │                     │            │
│           └─────────────────────┼─────────────────────┘            │
│                                 ▼                                  │
│                  ┌─────────────────────────────┐                   │
│                  │         Environment         │                   │
│                  │ ─────────────────────────── │                   │
│                  │ • Message bus (broadcast)   │                   │
│                  │ • Role registry             │                   │
│                  │ • Message history           │                   │
│                  └─────────────────────────────┘                   │
│                                 │                                  │
│                  ┌──────────────┴──────────────┐                   │
│                  │         CostManager         │                   │
│                  │  Budget control & billing   │                   │
│                  └─────────────────────────────┘                   │
└────────────────────────────────────────────────────────────────────┘

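Core file structure

File	Purpose
`software_company.py`	Company entry point: `hire()` recruits roles, `invest()` sets the budget, `run()` starts the run
`team.py`	The Team class, which manages multiple Roles and their shared Environment
`environment/base_env.py`	The environment class: hosts roles, broadcasts messages, runs roles asynchronously
`roles/role.py`	The Role base class, home of the core think/act/observe loop
`roles/product_manager.py`	Product manager: writes the PRD and the competitive analysis
`roles/architect.py`	Architect: system design
`roles/engineer.py`	Engineer: writes code
`actions/`	The executable actions
`utils/cost_manager.py`	Budget control and cost calculation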

The Think / Act / Observe loop in detail

This is MetaGPT's central runtime mechanism: every Role operates by running this loop.

┌────────────────────────────────────────────┐
│               Role run loop                │
│                                            │
│    ┌────────── observe ───────────┐        │
│    │ 1. Read new messages from    │        │
│    │    the message buffer        │        │
│    │ 2. Filter by watch, keeping  │        │
│    │    only relevant messages    │        │
│    │ 3. Store them in memory      │        │
│    └───┬──────────────────────────┘        │
│        │ new messages                      │
│        ▼                                   │
│    ┌─────────── think ────────────┐        │
│    │ 1. Decide the next state     │        │
│    │    according to react_mode   │        │
│    │ 2. Select the Action to run  │        │
│    │ 3. Set state                 │        │
│    └───┬──────────────────────────┘        │
│        │ action chosen                     │
│        ▼                                   │
│    ┌──────────── act ─────────────┐        │
│    │ 1. Call Action.run()         │        │
│    │ 2. Build the reply message   │        │
│    │ 3. Publish to Environment    │        │
│    │ 4. Store it in memory        │        │
│    └───┬──────────────────────────┘        │
│        │                                   │
│        └── loop until state = -1           │
└────────────────────────────────────────────┘

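1. Observe: perceiving the environment

async def _observe(self) -> int:
    """Read new messages from the message buffer and filter them."""
    # 1. Pop every pending message from the buffer
    news = self.rc.msg_buffer.pop_all()

    # Messages already stored in memory, used below to skip duplicates
    old_messages = self.rc.memory.get()

    # 2. Filter against the watch list
    # watch records which Actions' messages this Role cares about
    self.rc.news = [
        n for n in news
        if (n.cause_by in self.rc.watch or self.name in n.send_to)
        and n not in old_messages
    ]

    # 3. Store the result in the role's own memory
    self.rc.memory.add_batch(self.rc.news)

    return len(self.rc.news)  # number of new messages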

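Key concept, watch: every Role has a watch set listing the message types it listens for. The product manager, for example, listens for messages produced by two Actions: UserRequirement (the user's requirement) and PrepareDocuments (document preparation).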
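The filtering rule can be tried in isolation. The sketch below is a simplified model, not MetaGPT's classes: `Msg` and `ToyRole` are hypothetical, and Actions are represented by plain strings. A message is kept when its `cause_by` is watched or the role is named in `send_to`, and it has not been seen before.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Msg:
    content: str
    cause_by: str                      # name of the Action that produced it
    send_to: frozenset = frozenset()   # explicit recipients, if any


@dataclass
class ToyRole:
    name: str
    watch: set                                   # Action names this role cares about
    buffer: list = field(default_factory=list)   # private inbox
    memory: list = field(default_factory=list)   # everything already observed

    def observe(self) -> list:
        incoming, self.buffer = self.buffer, []  # pop everything from the inbox
        news = [
            m for m in incoming
            if (m.cause_by in self.watch or self.name in m.send_to)
            and m not in self.memory
        ]
        self.memory.extend(news)
        return news


pm = ToyRole(name="Alice", watch={"UserRequirement", "PrepareDocuments"})
pm.buffer += [
    Msg("build a 2048 game", cause_by="UserRequirement"),   # watched: kept
    Msg("design finished", cause_by="WriteDesign"),          # not watched: dropped
    Msg("please revise the PRD", cause_by="Review", send_to=frozenset({"Alice"})),  # addressed: kept
]
news = pm.observe()
print([m.content for m in news])
```

Because watched Actions and explicit addressing are combined with `or`, a role can be reached either by subscribing to a message type or by being named directly.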

2. Think: deciding what to do

async def _think(self) -> bool:
    """Decide which Action to run next."""

    # Case 1: only one Action, so run it directly
    if len(self.actions) == 1:
        self._set_state(0)
        return True

    # Case 2: BY_ORDER mode, run the actions in sequence
    if self.rc.react_mode == RoleReactMode.BY_ORDER:
        self._set_state(self.rc.state + 1)
        return self.rc.state < len(self.actions)

    # Case 3: REACT mode, let the LLM decide
    prompt = self._get_prefix()  # the role's persona
    prompt += STATE_TEMPLATE.format(
        history=self.rc.history,       # conversation history
        states="\n".join(self.states), # list of selectable states
        previous_state=self.rc.state,  # the previous state
    )

    # The LLM replies with the number of the next state
    reply = await self.llm.aask(prompt)
    # The reply is text, so convert it to an int before comparing
    next_state = int(extract_state_value_from_output(reply))

    # -1 means stop
    if next_state == -1:
        logger.info("End actions")

    self._set_state(next_state)
    return True

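The three react_mode values:

Mode	Behavior	Suitable for
`REACT`	The LLM decides the next step dynamically	Flexible tasks that need reasoning
`BY_ORDER`	Runs the actions list in order	Fixed, SOP-driven workflows
`PLAN_AND_ACT`	Plans first, then executes the plan	Complex tasks that need planning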
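The difference between the first two modes comes down to who picks the next state. The sketch below assumes a simplified model: `ReactMode`, `next_state`, and `trace` are hypothetical names, the `choose` callback stands in for the LLM call, and `PLAN_AND_ACT` is left out because it needs a planner.

```python
from enum import Enum
from typing import Callable, Optional


class ReactMode(Enum):
    REACT = "react"
    BY_ORDER = "by_order"


def next_state(mode: ReactMode, current: int, n_actions: int,
               choose: Optional[Callable[[int], int]] = None) -> int:
    """Return the index of the next action, or -1 when the role should stop."""
    if mode is ReactMode.BY_ORDER:
        candidate = current + 1
        return candidate if candidate < n_actions else -1
    if choose is None:
        raise ValueError("REACT mode needs a chooser (an LLM call in practice)")
    candidate = choose(current)
    # Anything outside the valid range is treated as "stop",
    # so a bad answer cannot index past the actions list.
    return candidate if 0 <= candidate < n_actions else -1


def trace(mode: ReactMode, n_actions: int, choose=None, limit: int = 10) -> list:
    """Collect the visited states until -1 (or a safety limit) is reached."""
    state, visited = -1, []
    for _ in range(limit):
        state = next_state(mode, state, n_actions, choose)
        if state == -1:
            break
        visited.append(state)
    return visited


by_order = trace(ReactMode.BY_ORDER, n_actions=3)
# Hypothetical chooser: jump straight to the last action, then stop.
react = trace(ReactMode.REACT, n_actions=3, choose=lambda s: 2 if s == -1 else -1)
print(by_order, react)
```

`BY_ORDER` always visits every action once, while `REACT` can skip, repeat, or stop early depending on what the chooser returns. That is the trade-off between predictability and flexibility the table describes.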

3. Act: carrying out the action

async def _act(self) -> Message:
    """Run the Action that corresponds to the current state."""
    # 1. Log what is about to happen
    logger.info(f"{self._setting}: to do {self.rc.todo}")

    # 2. Call Action.run() to do the work
    response = await self.rc.todo.run(self.rc.history)

    # 3. Wrap the result in an AIMessage
    msg = AIMessage(
        content=response.content,
        cause_by=self.rc.todo,  # which Action produced it
        sent_from=self,         # who sent it
    )

    # 4. Store it in memory
    self.rc.memory.add(msg)

    return msg
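Putting the three steps together, the loop can be sketched as a small asyncio program. This is an illustrative toy under simplifying assumptions: `EchoRole` is hypothetical, its actions are plain string functions instead of LLM calls, and it always runs its actions in order.

```python
import asyncio


class EchoRole:
    """Toy role: runs its actions in order, once per batch of new messages."""

    def __init__(self, name, actions):
        self.name = name
        self.actions = actions   # list of callables: str -> str
        self.inbox = []          # stand-in for the message buffer
        self.memory = []         # stand-in for the role's memory
        self.state = -1          # -1 means idle / finished

    async def observe(self) -> int:
        news, self.inbox = self.inbox, []
        self.memory.extend(news)
        return len(news)

    async def think(self) -> bool:
        self.state += 1
        if self.state >= len(self.actions):
            self.state = -1      # nothing left to do
            return False
        return True

    async def act(self) -> str:
        # Each action works on the most recent item in memory.
        result = self.actions[self.state](self.memory[-1])
        self.memory.append(result)
        return result

    async def run(self):
        if not await self.observe():
            return None          # no news: stay idle
        reply = None
        while await self.think():
            reply = await self.act()
        return reply


role = EchoRole("Cathy", [str.upper, lambda s: s + "!"])
role.inbox.append("write code")
reply = asyncio.run(role.run())
print(reply)
```

A second call to `run()` with an empty inbox returns `None` immediately, which mirrors the rule that a role only thinks and acts when it has observed something new.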

How roles communicate

Roles in MetaGPT communicate through a publish-subscribe pattern, with message routing at its core:

Message publishing flow

Role A                              Environment                           Role B
  │                                       │                                  │
  │  publish_message(msg)                 │                                  │
  │  ────────────────────────────────────►│                                  │
  │                                       │                                  │
  │                     ┌─────────────────────────────────────────┐          │
  │                     │ Walk the address table (member_addrs);  │          │
  │                     │ deliver where msg.send_to matches       │          │
  │                     └─────────────────────────────────────────┘          │
  │                                       │                                  │
  │                                       │ put_message(msg)                 │
  │                                       │─────────────────────────────────►│
  │                                       │                                  │

The code

# A Role publishes a message
def publish_message(self, msg):
    # Without an environment there is nowhere to send the message
    if not self.rc.env:
        return

    # Otherwise delegate publishing to the environment
    self.rc.env.publish_message(msg)

# The Environment broadcasts the message
def publish_message(self, message: Message, peekable: bool = True):
    for role, addrs in self.member_addrs.items():
        # Check whether the message's send_to covers this role
        if is_send_to(message, addrs):
            role.put_message(message)  # goes into the role's private message buffer
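The routing rule is simple enough to model on its own. In the sketch below, `ToyEnvironment`, `add_role`, and `publish` are hypothetical names, and treating the literal string `"all"` as a broadcast address is an assumption made for this example. A role receives a message when `send_to` intersects the set of addresses it answers to.

```python
class ToyEnvironment:
    """Routes each published message to the inboxes of matching roles."""

    BROADCAST = "all"   # wildcard address assumed for this sketch

    def __init__(self):
        self.member_addrs = {}   # role name -> set of addresses it answers to
        self.inboxes = {}        # role name -> list of delivered messages
        self.history = []        # everything ever published

    def add_role(self, name, extra_addrs=()):
        self.member_addrs[name] = {name, *extra_addrs}
        self.inboxes[name] = []

    def publish(self, content, send_to) -> int:
        """Deliver to every role whose addresses intersect send_to; return the count."""
        delivered = 0
        for name, addrs in self.member_addrs.items():
            if self.BROADCAST in send_to or addrs & set(send_to):
                self.inboxes[name].append(content)
                delivered += 1
        self.history.append(content)
        return delivered


env = ToyEnvironment()
env.add_role("Alice")
env.add_role("Brian", extra_addrs={"Architect"})
env.add_role("Cathy")

n_direct = env.publish("PRD ready", send_to={"Architect"})   # matches Brian's alias
n_all = env.publish("standup at 10", send_to={"all"})        # broadcast
n_none = env.publish("lost", send_to={"Dave"})               # nobody answers to "Dave"
print(n_direct, n_all, n_none)
```

Letting a role answer to several addresses (its name plus aliases) is what allows a sender to target "the architect" without knowing who currently fills that seat.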

Message structure

class Message(BaseModel):
    content: str           # message body
    send_to: set[str]      # recipients (role names or "all")
    cause_by: str          # which Action produced it
    sent_from: str         # who sent it

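Communication example

Suppose the product manager (Alice) has finished the PRD and needs to tell the architect (Brian):

Alice:
  msg = Message(
    content="PRD finished, includes the game requirements document",
    send_to={"Brian"},  # addressed to the architect
    cause_by="WritePRD",
    sent_from="Alice"
  )
  self.publish_message(msg)

Brian (on his next loop iteration):
  news = await self._observe()  # finds the new message from Alice
  await self._think()           # decides to handle it
  await self._act()             # starts writing the architecture design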
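The phrase "on his next loop iteration" is worth seeing in action. The sketch below is hypothetical (`Env`, `Worker`, and `run_round` are illustration-only names) and assumes that all roles take their turn concurrently within a round, each reading its inbox first and publishing afterwards. The `asyncio.sleep(0)` stands in for the time an LLM call would take.

```python
import asyncio


class Env:
    def __init__(self):
        self.roles = []

    def publish(self, sender, text, send_to):
        for role in self.roles:
            if role.name in send_to:
                role.inbox.append((sender, text))

    async def run_round(self):
        # Every role takes one turn; all turns in a round run concurrently.
        await asyncio.gather(*(role.step(self) for role in self.roles))


class Worker:
    def __init__(self, name, reply_to=None):
        self.name = name
        self.reply_to = reply_to   # who receives this worker's output, if anyone
        self.inbox = []
        self.log = []

    async def step(self, env):
        pending, self.inbox = self.inbox, []   # observe: snapshot the inbox
        await asyncio.sleep(0)                 # stand-in for awaiting an LLM call
        for sender, text in pending:
            self.log.append(f"{sender}: {text}")
            if self.reply_to:
                env.publish(self.name, f"done with '{text}'", {self.reply_to})


async def demo():
    env = Env()
    alice, brian = Worker("Alice", reply_to="Brian"), Worker("Brian")
    env.roles += [alice, brian]
    env.publish("Boss", "write the PRD", {"Alice"})
    await env.run_round()
    after_round_1 = list(brian.log)   # Brian has not seen anything yet
    await env.run_round()
    return after_round_1, brian.log


after_round_1, after_round_2 = asyncio.run(demo())
print(after_round_1, after_round_2)
```

Alice's reply lands in Brian's inbox during round 1, after Brian has already taken his snapshot, so he only processes it in round 2. This is why a pipeline of N roles needs at least N rounds to complete.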

Budget control

MetaGPT uses CostManager to keep costs under control and stop LLM calls from running away.

Workflow

User call
  │
  │ invest(3.0)  sets the budget to $3
  ▼
Team.invest()
  │
  │ cost_manager.max_budget = 3.0
  ▼
┌────────────────────────┐
│      CostManager       │
│     max_budget=$3      │
│     total_cost=$0      │
└───────────┬────────────┘
            │
            │ on every LLM call
            ▼
┌────────────────────────┐
│ LLM API returns        │
│ usage information      │
└───────────┬────────────┘
            │
            ▼
┌────────────────────────┐
│ cost_manager.          │
│ update_cost()          │
│                        │
│ total_cost += new cost │
└───────────┬────────────┘
            │
            ▼
┌────────────────────────┐
│ Team._check_balance()  │
│                        │
│ if total_cost >=       │
│    max_budget:         │
│ raise NoMoneyException │
└────────────────────────┘

Cost calculation code

class CostManager(BaseModel):
    max_budget: float = 10.0          # maximum budget
    total_cost: float = 0             # cost accumulated so far
    token_costs: dict = TOKEN_COSTS   # rate table, per 1K tokens

    def update_cost(self, prompt_tokens, completion_tokens, model):
        """Update the running cost after each LLM call."""
        # Look up the rates in the predefined TOKEN_COSTS table
        cost = (
            prompt_tokens * self.token_costs[model]["prompt"]
            + completion_tokens * self.token_costs[model]["completion"]
        ) / 1000  # divide by 1000 because rates are per 1K tokens

        self.total_cost += cost
        logger.info(f"Total: ${self.total_cost:.3f} / Max: ${self.max_budget:.3f}")
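A worked example makes the formula easier to check. The prices below are invented for illustration (`demo-model` and its rates are hypothetical, not real pricing); only the per-1K-token arithmetic follows the code above.

```python
# Hypothetical prices in USD per 1K tokens; real rates vary by model and change over time.
TOKEN_COSTS = {
    "demo-model": {"prompt": 0.01, "completion": 0.03},
}


def call_cost(prompt_tokens: int, completion_tokens: int, model: str) -> float:
    """Cost of one LLM call, with rates expressed per 1K tokens."""
    rates = TOKEN_COSTS[model]
    return (
        prompt_tokens * rates["prompt"]
        + completion_tokens * rates["completion"]
    ) / 1000


# 1200 prompt tokens and 400 completion tokens:
# (1200 * 0.01 + 400 * 0.03) / 1000 = (12 + 12) / 1000 = 0.024 USD
cost = call_cost(prompt_tokens=1200, completion_tokens=400, model="demo-model")
print(f"${cost:.3f}")
```

At these made-up rates a $3 budget would cover on the order of a hundred such calls, which gives a feel for how quickly a multi-role run can consume it.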

Budget check

class Team:
    def _check_balance(self):
        """Called before every round of the run loop."""
        if self.cost_manager.total_cost >= self.cost_manager.max_budget:
            raise NoMoneyException(...)

    async def run(self, n_round=3, idea=""):
        while n_round > 0:
            if self.env.is_idle:
                break
            n_round -= 1
            self._check_balance()  # ← budget check on every round
            await self.env.run()
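The check-then-run ordering has a consequence that a small simulation makes visible. The sketch below uses hypothetical names (`Budget`, `NoMoneyError`, `run_rounds`) and a fixed cost per round in place of real LLM usage.

```python
class NoMoneyError(Exception):
    """Raised when accumulated spend reaches the budget (name chosen for this sketch)."""


class Budget:
    def __init__(self, max_budget: float):
        self.max_budget = max_budget
        self.total_cost = 0.0

    def charge(self, amount: float) -> None:
        self.total_cost += amount

    def check(self) -> None:
        if self.total_cost >= self.max_budget:
            raise NoMoneyError(f"spent ${self.total_cost:.2f} of ${self.max_budget:.2f}")


def run_rounds(budget: Budget, n_round: int, cost_per_round: float) -> int:
    """Run up to n_round rounds, checking the balance before each; return rounds completed."""
    completed = 0
    while n_round > 0:
        n_round -= 1
        budget.check()                  # refuse to start a round once the budget is spent
        budget.charge(cost_per_round)   # stand-in for the LLM calls made during the round
        completed += 1
    return completed


budget = Budget(max_budget=3.0)
try:
    run_rounds(budget, n_round=5, cost_per_round=1.5)
    stopped_early = False
except NoMoneyError as exc:
    stopped_early = True
    print("stopped:", exc)
```

Because the balance is only checked between rounds, a round that starts just under the limit can still finish over it. The budget behaves as a soft cap, so it is worth leaving some headroom when choosing the amount.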

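The Role state machine in detail

Every Role has a states list and a state attribute that records which stage of the workflow it is currently in.

Defining states

class Role:
    states: list[str] = []   # labels of the selectable states

class RoleContext:
    state: int = -1          # current state index, reached as role.rc.state; -1 means idle or finished

States are generated automatically when Actions are set: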

def set_actions(self, actions: list[Action]):
    for action in actions:
        self.actions.append(action)
        self.states.append(f"{len(self.actions) - 1}. {action}")
    # e.g. states = ["0. WritePRD", "1. WriteSpec", "2. Review"]
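These numbered labels matter because, in REACT mode, they become the menu the LLM chooses from. The sketch below shows one way the round trip could work. The prompt wording and the helper names (`build_states`, `render_menu`, `parse_state`) are assumptions for illustration, not MetaGPT's actual template or parser.

```python
import re


def build_states(action_names: list[str]) -> list[str]:
    """Number each action so it can be referred to by index."""
    return [f"{i}. {name}" for i, name in enumerate(action_names)]


def render_menu(states: list[str], previous_state: int) -> str:
    """Assemble a plain-text menu to embed in a prompt (wording is made up)."""
    lines = [
        "Choose the next state by number, or -1 to stop.",
        *states,
        f"Previous state: {previous_state}",
    ]
    return "\n".join(lines)


def parse_state(reply: str, n_states: int) -> int:
    """Pull the first integer out of a free-form reply; fall back to -1 if unusable."""
    match = re.search(r"-?\d+", reply)
    if not match:
        return -1
    value = int(match.group())
    return value if -1 <= value < n_states else -1


states = build_states(["WritePRD", "WriteSpec", "Review"])
menu = render_menu(states, previous_state=0)
print(menu)
print(parse_state("I pick 2.", len(states)))
```

Falling back to -1 on an unparseable or out-of-range reply is a defensive choice: the role stops cleanly instead of indexing into an action that does not exist.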

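What state is for

1. Tracking progress: knowing which step execution has reached

2. Deciding the next step: picking the next Action based on the current state

# State transition example
# (a pipeline-level illustration; in the code, each Role's state indexes its own actions list)
State 0: WritePRD (write the requirements document)
    ↓ product manager done
State 1: DesignAPI (design the API)
    ↓ architect done
State 2: WriteCode (write the code)
    ↓ engineer done
State -1: finished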

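3. Resuming execution: after a failure partway through, the role can pick up again from a saved state

# On recovery, continue from the previous state
if self.recovered and self.rc.state >= 0:
    self._set_state(self.rc.state)  # resume from the recovered state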
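Resumption can be illustrated with a checkpoint that stores only the state index and the outputs so far. This is a minimal sketch under stated assumptions: `ResumableRole`, `checkpoint`, and `restore` are hypothetical, JSON is used purely for convenience, and `state` points at the action in progress, so a restored role re-runs the step that failed.

```python
import json


class ResumableRole:
    def __init__(self, actions, state=-1, outputs=None):
        self.actions = actions               # list of (name, callable) pairs
        self.state = state                   # index of the action in progress; -1 = not running
        self.outputs = list(outputs or [])

    def run(self):
        if self.state == -1:
            self.state = 0                   # fresh start
        while self.state < len(self.actions):
            _name, fn = self.actions[self.state]
            self.outputs.append(fn())        # may raise; state still points at this action
            self.state += 1
        self.state = -1                      # finished

    def checkpoint(self) -> str:
        return json.dumps({"state": self.state, "outputs": self.outputs})

    @classmethod
    def restore(cls, actions, blob):
        data = json.loads(blob)
        return cls(actions, state=data["state"], outputs=data["outputs"])


calls = {"n": 0}


def flaky():
    """Fails on the first call only, to simulate a transient error."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient failure")
    return "code"


ACTIONS = [("WritePRD", lambda: "prd"), ("WriteCode", flaky), ("Review", lambda: "ok")]

role = ResumableRole(ACTIONS)
try:
    role.run()
except RuntimeError:
    saved = role.checkpoint()   # captured while state points at the failed action

resumed = ResumableRole.restore(ACTIONS, saved)
resumed.run()
print(resumed.outputs)
```

The completed PRD step is not repeated after the restore; only the failed step and the ones after it run. With LLM-backed actions that saves both time and money.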

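State behavior under each react_mode

react_mode	State behavior
`REACT`	The LLM picks the next state dynamically from the history and the current state
`BY_ORDER`	state is incremented by 1 after each act, so actions run in order
`PLAN_AND_ACT`	A Planner first produces a plan, and each task in the plan maps to a state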

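Summary

MetaGPT demonstrates an interesting paradigm for multi-agent collaboration: carrying the SOP of a real company over to a team of AI agents.

Core design patterns

Pattern	Description
Think-Act-Observe loop	The basic operating mechanism of an agent
Publish-subscribe messaging	Decoupled, asynchronous communication
State-machine driven	Flow is controlled through state
Budget protection	Keeps LLM spending from running away
Role specialization	Each role has its own goals, constraints, and Actions

Architecture highlights

Environment as the message bus: all roles broadcast messages through the environment
Private per-Role message buffer: msg_buffer provides asynchronous message delivery
Layered memory: memory (long-term), working_memory (working), msg_buffer (buffer)
Pluggable react_mode: supports several think-act strategies

Takeaways for multi-agent development

If you want to build an "AI company" style multi-agent system, MetaGPT's design is a useful reference:

Codifying the workflow as an SOP is easier to control than fully dynamic collaboration
Message routing and state tracking are the keys to collaboration
Always include budget control to prevent runaway costs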

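Project link: github.com/geekan/Meta…

Reference links:

AFlow paper: openreview.net/forum?id=z5…
MGX product: mgx.dev/

Earlier in this series:

#002: mem0, memory-augmented AI agents
#001: browser-use, letting AI agents drive the browser

"Code = SOP(Team)": MetaGPT opens up a new possibility for collaboration among LLM agents.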
