AutoGPT Source Code Deep Dive and Production Deployment Tutorial

Repository: Significant-Gravitas/AutoGPT
Local path: AutoGPT/
Scope: autogpt_platform/ (the modern production platform, Polyform Shield License) + classic/ (the original agent and the Forge framework, MIT License)
Version: dev branch (2026-04 snapshot)

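1. Project Overview
2. autogpt_platform: Multi-Process Distributed Architecture
3. The Block System: Composable Execution Atoms
4. Graph Execution Engine in Depth
5. Data Layer and Persistence Model
6. Frontend: Next.js + React Flow Visual Builder
7. Classic / Forge: The Agent Component Framework
8. Complete Production Deployment Guide
9. Extension Development in Practice: Custom Blocks
10. Production Operations Essentials
Appendix: Key Source Index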

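1. Project Overview

1.1 Dual-Track Repository Structure

The AutoGPT repository effectively contains two fully independent projects that share one git history:

Module	Path	License	Positioning
AutoGPT Platform (mainline)	autogpt_platform/	Polyform Shield	Production-grade agent orchestration platform: visual drag-and-drop, cloud native
AutoGPT Classic	classic/	MIT	The original autonomous GPT-4 agent + the Forge framework

All new work goes into Platform. Classic is kept for historical and educational purposes only and is officially marked unsupported (see classic/CLAUDE.md).
The two follow entirely different architectural philosophies: Classic is a single-process LLM-as-orchestrator (the LLM decides which command to call next), whereas Platform uses explicit DAG-style orchestration (the user wires deterministic Blocks together by drag and drop).

1.2 Platform Top-Level Directory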

autogpt_platform/
├── backend/                 # FastAPI multi-process backend
│   ├── backend/
│   │   ├── api/             # REST + WebSocket routes
│   │   ├── blocks/          # 92 files, hundreds of Block implementations
│   │   ├── data/            # Prisma models + business entities
│   │   ├── executor/        # graph execution engine (manager is 1976 lines)
│   │   ├── integrations/    # OAuth, webhooks, providers
│   │   ├── notifications/   # email/Discord notifications
│   │   ├── copilot/         # AI copilot (conversational graph building)
│   │   └── app.py           # multi-process entry point
│   ├── schema.prisma        # 1476 lines, ~50 database tables
│   └── Dockerfile
├── autogpt_libs/            # shared libraries: auth, logging, utils
├── frontend/                # Next.js 15 App Router
│   ├── src/app/(platform)/  # authenticated route group
│   └── src/components/      # atoms/molecules/organisms design system
├── db/                      # self-hosted Supabase docker-compose
├── docker-compose.yml       # top-level stack orchestration
└── docker-compose.platform.yml  # core platform services

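1.3 Core Concepts at a Glance

Concept	Description	Source location
Block	An atomic unit of execution (HTTP call, LLM call, text processing, ...)	backend/blocks/_base.py:443 `class Block`
AgentGraph	A directed graph of Blocks (DAG-like, but cycles are allowed)	data/graph.py
Node	One Block instance inside a graph, carrying `input_default`	data/graph.py:106
Link	A directed edge between nodes, from `source_name` to `sink_name`	data/graph.py:84
AgentGraphExecution	One complete run of a graph	schema.prisma:578
AgentNodeExecution	One execution of a single node (the same node may run several times within one graph execution)	schema.prisma:628
AgentPreset	A snapshot of an agent plus user configuration (supports webhook/cron triggers)	schema.prisma:350
LibraryAgent	A reference to an agent on the user's shelf	schema.prisma:430
StoreListing	An agent listed in the Marketplace	schema.prisma:1067
Credentials	Third-party credentials attached by the user (OAuth / API key)	`data/model.py`
Webhook / Trigger	External events that start a graph run	backend/integrations/webhooks/
CoPilot	LLM-driven copilot for conversational graph building	backend/copilot/
HITL / Review	Human-in-the-loop: blocks before a dangerous action until a human reviews it	blocks/_base.py:650 `is_block_exec_need_review`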

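2. autogpt_platform: Multi-Process Distributed Architecture

2.1 Eight Resident Processes

The entry point backend/app.py:34 launches 8 independent processes from a single Python interpreter:

def main(**kwargs):
    run_processes(
        DatabaseManager().set_log_level("warning"),  # Pyro RPC: single entry point for business DB access
        Scheduler(),                                  # APScheduler: cron triggers
        NotificationManager(),                        # notification aggregation + delivery
        PlatformLinkingManager(),                     # third-party platform bot linking
        WebsocketServer(),                            # real-time execution status push
        AgentServer(),                                # FastAPI REST API
        ExecutionManager(),                           # graph execution engine
        CoPilotExecutor(),                            # conversational graph-building LLM agent
        **kwargs,
    )

run_processes runs the first 7 in the background and the last one in the foreground; if any process crashes, the whole set shuts down gracefully.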

2.2 Service Split (docker-compose View)

docker-compose.platform.yml maps the processes above onto separate containers:

┌──────────┐  ┌────────────┐  ┌───────────┐  ┌──────────┐
│ frontend │→ │ rest_server│→ │ websocket │  │  copilot │
│  (3000)  │  │   (8006)   │  │  (8001)   │  │  (8008)  │
└──────────┘  └─────┬──────┘  └─────┬─────┘  └────┬─────┘
                    │                │             │
       ┌────────────┴────────────────┴─────────────┘
       ▼                                            
┌────────────┐  ┌──────────┐  ┌──────────┐  ┌────────────┐
│ database_  │  │executor  │  │scheduler │  │notification│
│  manager   │  │ (8002)   │  │ (8003)   │  │  (8007)    │
│  (8005)    │  └────┬─────┘  └────┬─────┘  └────────────┘
└─────┬──────┘       │             │
      │              ▼             ▼
      ▼      ┌──────────────────────────┐
┌──────────┐ │   RabbitMQ (5672)        │ ←─ graph_execution_queue
│PostgreSQL│ │   Redis (6379)           │ ←─ pubsub + distributed locks + counters
│ (5432)   │ │   ClamAV (3310)          │ ←─ virus scanning of user uploads
│ pgvector │ │   FalkorDB (6380)        │ ←─ Graphiti knowledge graph (CoPilot)
└──────────┘ └──────────────────────────┘

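Service	Port	Responsibility
`rest_server`	8006	Main FastAPI API + source of the OpenAPI spec
`websocket_server`	8001	Pushes graph/node status changes to the frontend in real time
`executor`	8002	Consumes graph execution tasks from RabbitMQ and runs Blocks
`scheduler`	8003	APScheduler-based scheduling + cron parsing
`database_manager`	8005	Pyro5 RPC single-point database access layer (prevents connection exhaustion)
`copilot_executor`	8008	Conversational graph building, backed by RAG / knowledge graph
`notification_server`	8007	Batched email/Discord notifications
`frontend`	3000	Next.js

2.3 Communication Channels

Synchronous RPC: Pyro5 exposes DatabaseManager as an object callable across processes; inside the executor, get_db_async_client() returns a client on which db_client.upsert_execution_output(...) is called just like a local object.
Asynchronous messaging: three kinds of RabbitMQ queues
- graph_execution_queue (dispatches graphs awaiting execution)
- graph_execution_cancel_queue (cancellation events)
- notifications_* (asynchronous notifications)
Real-time events: Redis Pub/Sub via AsyncRedisEventBus (data/event_bus.py); the WebSocket service subscribes and pushes events to the browser.
Distributed locks: Redis locks, used for
- ensuring a credential is used by only one Block at a time (executor/manager.py:295)
- cluster-wide deduplication of graph executions (ClusterLock, executor/cluster_lock.py)
- mutual exclusion when writing inbound edges of the same downstream node (upsert_input-{next_node_id}-{graph_exec_id}, manager.py:472)

2.4 Startup Order and Dependencies

docker-compose enforces ordering with depends_on.condition:

db healthy → migrate (runs Prisma migrations)
migrate completed → database_manager
database_manager started + redis/rabbitmq healthy → rest_server / executor / ws / scheduler / copilot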
3. The Block System: Composable Execution Atoms

3.1 The Block Abstract Base Class

backend/blocks/_base.py:443:

class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
    def __init__(
        self,
        id: str = "",                   # UUID, database primary key (must never change)
        description: str = "",
        input_schema: Type[BlockSchemaInputType] = EmptyInputSchema,
        output_schema: Type[BlockSchemaOutputType] = EmptyOutputSchema,
        test_input=None, test_output=None, test_mock=None, test_credentials=None,
        disabled: bool = False,
        static_output: bool = False,    # whether the output may be consumed repeatedly (fan-out)
        block_type: BlockType = BlockType.STANDARD,
        webhook_config: ... = None,
        is_sensitive_action: bool = False,  # whether this triggers HITL review
    ):
        ...

    @abstractmethod
    async def run(self, input_data, **kwargs) -> BlockOutput:
        """Async generator: yield (output_name, output_data)"""

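Key design points:

Blocks are instantiated only once: at startup, initialize_blocks() (data/block.py:21) scans backend/blocks/* and registers each Block in the AgentBlock table; at run time, get_block(block_id) returns the singleton.
Inputs and outputs are all Pydantic models: type safety plus automatic JSON Schema, which is fed straight into frontend form rendering.
Generator-style output: async for output_name, output_data in block.run(...) lets one Block yield multiple times (for example, a loop Block yields on every iteration).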
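
The generator-style contract can be illustrated without the platform at all. The sketch below is a minimal stand-in, not the real Block class: CountdownBlock, its pin names, and the collect helper are hypothetical, and only the "yield (output_name, output_data) several times" shape is taken from the text above.

```python
import asyncio
from typing import Any, AsyncGenerator

# Same shape as the platform's output contract: a stream of (pin name, value) pairs.
BlockOutput = AsyncGenerator[tuple[str, Any], None]


class CountdownBlock:
    """Hypothetical block that yields one output per loop iteration."""

    async def run(self, start: int) -> BlockOutput:
        for value in range(start, 0, -1):
            yield "tick", value      # emitted once per iteration
        yield "done", True           # a different output pin, emitted last


async def collect(block: CountdownBlock, start: int) -> list[tuple[str, Any]]:
    # Executor-side pattern: consume outputs as they are produced.
    return [(name, data) async for name, data in block.run(start)]


outputs = asyncio.run(collect(CountdownBlock(), 3))
```

Because every yield is a separate event, a downstream node wired to the tick pin would be triggered once per iteration, while a node wired to done would be triggered once.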

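3.2 BlockSchema: Types + Credential Awareness

blocks/_base.py:137:

A constraint is enforced through __pydantic_init_subclass__: any field named credentials or *_credentials must be of type CredentialsMetaInput, and vice versa, so a mis-declared credentials field fails at class-definition time rather than at run time.
jsonschema() produces an OpenAPI-compatible flattened schema (jsonref.replace_refs) that the frontend consumes directly.
validate_data() supports exclude_fields, which lets dry-run skip credential validation.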
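
The naming rule for credentials fields can be sketched with plain Python. This is an illustration under assumptions, not the platform's code: it uses the standard __init_subclass__ hook instead of Pydantic's __pydantic_init_subclass__, and SchemaBase, GoodInput, and define_bad_schema are hypothetical names.

```python
from typing import get_type_hints


class CredentialsMetaInput:
    """Stand-in for the platform's credentials metadata type."""


class SchemaBase:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for name, annotation in get_type_hints(cls).items():
            looks_like_creds = name == "credentials" or name.endswith("_credentials")
            is_creds_type = isinstance(annotation, type) and issubclass(
                annotation, CredentialsMetaInput
            )
            # The rule is symmetric: the name implies the type and the type implies the name.
            if looks_like_creds != is_creds_type:
                raise TypeError(f"{cls.__name__}.{name}: field name and type must agree")


class GoodInput(SchemaBase):
    url: str
    credentials: CredentialsMetaInput


def define_bad_schema() -> None:
    # Raises TypeError as soon as the class body is evaluated.
    class BadInput(SchemaBase):
        credentials: str  # wrong type for a credentials-named field
```

The check runs when the subclass is created, so a mis-declared schema fails when its module is imported, long before any graph executes.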

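3.3 BlockType and BlockCategory

BlockType	Purpose
`STANDARD`	Ordinary business Block
`INPUT` / `OUTPUT`	Graph entry/exit nodes (implemented in blocks/io.py)
`WEBHOOK` / `WEBHOOK_MANUAL`	Entry points triggered by external webhooks (automatic/manual setup)
`AGENT`	Sub-graph invocation (nested call into another agent)
`AI`	LLM Blocks
`HUMAN_IN_THE_LOOP`	Blocks until a human replies
`MCP_TOOL`	Calls a tool exposed by an MCP server
`NOTE`	Sticky note on the canvas (not executed)

3.4 Typical Structure of a Minimal Block

blocks/basic.py:69 StoreValueBlock, the easiest template to understand in the whole repository: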

class StoreValueBlock(Block):
    class Input(BlockSchemaInput):
        input: Any = SchemaField(description="...")
        data: Any = SchemaField(description="...", default=None)

    class Output(BlockSchemaOutput):
        output: Any = SchemaField(description="...")

    def __init__(self):
        super().__init__(
            id="1ff065e9-88e8-4358-9d82-8dc91f622ba9",
            description="...",
            categories={BlockCategory.BASIC},
            input_schema=StoreValueBlock.Input,
            output_schema=StoreValueBlock.Output,
            test_input=[{"input": "Hello, World!"}, ...],
            test_output=[("output", "Hello, World!"), ...],
            static_output=True,
        )

    async def run(self, input_data: Input, **kwargs) -> BlockOutput:
        yield "output", input_data.data or input_data.input

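Notes:

id is a permanent UUID; changing it breaks every existing graph that references this Block.
test_input/test_output are more than tests: the generic test_block.py::test_available_blocks[StoreValueBlock] runs them automatically in CI to verify that they match the schema.
static_output=True means the output can be consumed repeatedly, so it fans out to multiple downstream nodes.

3.5 Credential Lifecycle (Key Security Point)

manager.py:262-310: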

for field_name, input_type in input_model.get_credentials_fields().items():
    field_value = input_data.get(field_name)
    credentials_meta = input_type(**field_value)
    credentials, lock = await creds_manager.acquire(user_id, credentials_meta.id)
    creds_locks.append(lock)
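    extra_exec_kwargs[field_name] = credentials  # decrypted real credentials, visible only to this execution
    ...
try:
    block_iter = node_block.execute(input_data, **extra_exec_kwargs)
    async for ...
finally:
    for creds_lock in creds_locks:
        await creds_lock.release()  # always released

Key points:

Credentials never enter the body of input_data; they are injected into run() only through **kwargs, so they are never serialized into executionData.
A credential can be used by only one Block at a time (Redis lock), which avoids conflicts such as concurrent OAuth refresh_token refreshes.
Credentials are decrypted only at execution time; before that, only CredentialsMetaInput (id + provider + type) is stored.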

3.6 HITL and Sensitive Actions

is_block_exec_need_review in blocks/_base.py:650-714:

if not (self.is_sensitive_action and execution_context.sensitive_action_safe_mode):
    return False, input_data

decision = await HITLReviewHelper.handle_review_decision(...)
if decision is None:
    return True, input_data           # → the whole graph enters REVIEW, persisted while waiting
if not decision.should_proceed:
    raise BlockExecutionError(...)    # user rejected
return False, decision.review_result.data  # the user may have edited the inputs

The PendingHumanReview table (schema.prisma:698) stores pending review items; AgentExecutionStatus.REVIEW is a legitimate "half-finished" state that can be re-entered just like RUNNING/QUEUED.

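4. Graph Execution Engine in Depth

The execution engine is the most complex part of the platform and the most rewarding to study. executor/manager.py is 1976 lines long and is built around "one thread per graph with serial scheduling + node-level asyncio concurrency".

4.1 Three-Layer Concurrency Model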

┌──────────────────────────────────────────────────────────────┐
│ ExecutionManager (main process)                              │
│  - consumes GraphExecutionEntry from RabbitMQ                │
│  - ThreadPoolExecutor(max_workers=N) dispatches to workers   │
└────────────────────────┬─────────────────────────────────────┘
                         │ one worker thread per graph
                         ▼
┌──────────────────────────────────────────────────────────────┐
│ ExecutionProcessor (per-thread singleton, thread-local)      │
│  - on_graph_execution: sync main loop, owns ExecutionQueue   │
│  - each thread has 2 separate asyncio event loops:           │
│      ▸ node_execution_loop  (runs Block.run)                 │
│      ▸ node_evaluation_loop (enqueue/dequeue evaluation)     │
└────────────────────────┬─────────────────────────────────────┘
                         │ run_coroutine_threadsafe
                         ▼
┌──────────────────────────────────────────────────────────────┐
│ on_node_execution (asyncio coroutine)                        │
│  - execute_node: validate → fetch credentials → Block.run    │
│  - persist_output: DB upsert + WS push                       │
│  - billing + Sentry/metrics + release credential locks       │
└──────────────────────────────────────────────────────────────┘

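Why so complex?

Graphs are isolated by thread, so a failure in one graph does not affect the others; the ThreadPoolExecutor limit is the number of concurrently running graphs.
Nodes use asyncio, so I/O-bound work (LLM calls, HTTP) runs concurrently with no per-task thread overhead.
The 2 event loops are kept separate: node execution can be slow, but dequeue evaluation must stay responsive, so neither blocks the other.
init_worker() ensures that each worker thread initializes its ExecutionProcessor only on first entry (stored in threading.local()) and reuses the event loops, instead of restarting asyncio for every graph.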
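
A reduced version of the layering (thread pool for graphs, long-lived event loop for nodes) fits in a few lines of standard-library Python. This is a sketch under assumptions: it keeps one event loop per worker instead of two, and run_node and run_graph are hypothetical stand-ins for on_node_execution and on_graph_execution.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()


def init_worker() -> None:
    """Create one long-lived event loop per worker thread and keep it running."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    _local.node_loop = loop


async def run_node(node_id: str) -> str:
    await asyncio.sleep(0.01)          # stands in for an LLM or HTTP call
    return f"{node_id}:done"


def run_graph(node_ids: list[str]) -> list[str]:
    # Synchronous per-graph loop: hand coroutines to this worker's event loop.
    futures = [
        asyncio.run_coroutine_threadsafe(run_node(n), _local.node_loop)
        for n in node_ids
    ]
    return [f.result(timeout=5) for f in futures]


with ThreadPoolExecutor(max_workers=2, initializer=init_worker) as pool:
    graph_a = pool.submit(run_graph, ["a1", "a2"])
    graph_b = pool.submit(run_graph, ["b1"])
    results = (graph_a.result(), graph_b.result())
```

The initializer runs once per worker thread, so later graphs scheduled on the same worker reuse the loop, which is the effect the text attributes to init_worker().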

4.2 Main Loop Skeleton

executor/manager.py:1011:

while not execution_queue.empty():
    if cancel.is_set(): break

    queued_node_exec = execution_queue.get()

    # 1. Skip condition (optional credentials not configured: mark COMPLETED directly)
    if queued_node_exec.node_id in graph_exec.nodes_to_skip:
        update_node_execution_status(..., COMPLETED); continue

    # 2. Billing (InsufficientBalanceError stops the graph gracefully)
    if not dry_run:
        cost, remaining = billing.charge_usage(...)
        billing.handle_low_balance(...)

    # 3. Input overrides (presets / nodes_input_masks)
    queued_node_exec.inputs.update(node_input_mask)

    # 4. Submit the node execution coroutine to node_execution_loop
    node_execution_task = asyncio.run_coroutine_threadsafe(
        self.on_node_execution(...),
        self.node_execution_loop,
    )
    running_node_execution[node_id].add_task(...)

    # 5. Inner polling: wait until the queue is refilled or all in-flight work finishes
    while execution_queue.empty() and (running_node_execution or running_node_evaluation):
        for node_id, inflight_exec in list(running_node_execution.items()):
            # 5a. evaluation finished → pop
            # 5b. execution finished → pop
            # 5c. new output → submit to node_evaluation_loop to compute downstream inbound edges
            if output := inflight_exec.pop_output():
                running_node_evaluation[node_id] = asyncio.run_coroutine_threadsafe(
                    self._process_node_output(...),  # → _enqueue_next_nodes
                    self.node_evaluation_loop,
                )
        if no_new_output:
            cluster_lock.refresh()  # renew the distributed lock so no other replica takes over
            time.sleep(0.1)

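4.3 Data Flow: Link → Output → Downstream Enqueue

_enqueue_next_nodes at manager.py:415-574:

Each time a node yields (output_name, output_data):

Iterate over all of the node's output_links (outbound edges).
For each edge, check whether source_name matches the current output_name (nested field extraction is supported via parse_execution_output, e.g. response.data.0.url).
Enter the downstream node (sink_id); the key operation is db_client.upsert_execution_input:
- find the downstream node's earliest INCOMPLETE execution (possibly a partial one left by an earlier yield in the same graph run) and fill the current data into the matching sink_name;
- if there is none, create a new AgentNodeExecution with status INCOMPLETE.
Static input backfill: for inbound edges with is_static=True (@@unique marker), missing fields are filled in from the most recent complete execution.
validate_exec fully validates the merged input of the downstream node:
- a required field is missing → stay INCOMPLETE and wait for other upstream nodes to yield;
- complete → update to QUEUED and add to execution_queue.
A static outbound edge makes every INCOMPLETE execution waiting on that static value re-validate, which produces fan-out.

With this mechanism the user never has to manage it by hand: fan-in (several upstream nodes converging), fan-out (one upstream, many downstream), and loops (the same node executing repeatedly) all emerge naturally.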
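
The fan-in half of this flow (merge into the earliest INCOMPLETE execution, validate, then queue) can be modelled in memory. The sketch below is illustrative only: the dict and deque replace the database and execution_queue, REQUIRED is a hypothetical table of required pins, and enqueue_next is a simplified stand-in for _enqueue_next_nodes.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PendingExec:
    node_id: str
    inputs: dict = field(default_factory=dict)
    status: str = "INCOMPLETE"


REQUIRED = {"merge": {"left", "right"}}   # hypothetical required pins per node
pending: dict[str, PendingExec] = {}      # earliest INCOMPLETE execution per node
queue: deque = deque()                    # stand-in for execution_queue


def enqueue_next(sink_id: str, sink_name: str, data) -> PendingExec:
    # Reuse the earliest INCOMPLETE execution for this node, or create one.
    exec_ = pending.setdefault(sink_id, PendingExec(sink_id))
    exec_.inputs[sink_name] = data
    # Validate the merged input: queue only when every required pin has a value.
    if REQUIRED[sink_id].issubset(exec_.inputs):
        exec_.status = "QUEUED"
        queue.append(pending.pop(sink_id))
    return exec_


first = enqueue_next("merge", "left", 1)      # one upstream has yielded
status_after_first = first.status
second = enqueue_next("merge", "right", 2)    # the other upstream arrives later
```

After the second call the pending slot is empty again, so a further yield into the same node would start a fresh execution, which is how repeated (loop) executions arise.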

4.4 State Machine

VALID_STATUS_TRANSITIONS at data/execution.py:134:

INCOMPLETE ─┬─→ QUEUED  ─┬─→ RUNNING ─┬─→ COMPLETED
            │             │            ├─→ FAILED ───┐
            │             │            ├─→ TERMINATED ┤
            │             │            └─→ REVIEW ───┐│
            │             │                          ││
            │             └──────────────────────────┘│
            └─────────────────────────────────────────┘
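                          (any failed/terminated/review state can be resumed and re-entered)

Worth special attention: COMPLETED → RUNNING is also a legal transition, even though the diagram above does not draw it. A run shown as "completed" in the frontend may only be the previous round; when the user supplies more input or a webhook fires again, the execution picks up where it left off.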
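
A transition table of this kind is usually a plain mapping that is consulted before every status update. The sketch below shows the idea only: the contents of ALLOWED_SOURCES are an illustrative guess based on the diagram and the COMPLETED → RUNNING remark, not a copy of VALID_STATUS_TRANSITIONS, and can_transition is a hypothetical helper.

```python
from enum import Enum


class Status(str, Enum):
    INCOMPLETE = "INCOMPLETE"
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    TERMINATED = "TERMINATED"
    REVIEW = "REVIEW"


S = Status

# Illustrative table only: target status -> statuses it may be entered from.
ALLOWED_SOURCES = {
    S.QUEUED: {S.INCOMPLETE, S.FAILED, S.TERMINATED, S.REVIEW},
    S.RUNNING: {S.QUEUED, S.COMPLETED, S.FAILED, S.TERMINATED, S.REVIEW},
    S.COMPLETED: {S.RUNNING},
    S.FAILED: {S.INCOMPLETE, S.QUEUED, S.RUNNING, S.REVIEW},
    S.TERMINATED: {S.INCOMPLETE, S.QUEUED, S.RUNNING, S.REVIEW},
    S.REVIEW: {S.RUNNING},
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True when moving from current to target is allowed by the table."""
    return current in ALLOWED_SOURCES.get(target, set())


resume_ok = can_transition(S.COMPLETED, S.RUNNING)
skip_ok = can_transition(S.INCOMPLETE, S.COMPLETED)
```

Keying the table by target status makes the update a single guarded write: the new status is applied only if the current one is in the allowed set.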

4.5 Sub-Graph Invocation: `AgentExecutorBlock`

blocks/agent.py, the mechanism behind agent nesting:

async def run(self, input_data, *, graph_exec_id, execution_context, **kwargs):
    # 1. Create a child graph execution in the database
    graph_exec = await execution_utils.add_graph_execution(
        graph_id=input_data.graph_id,
        execution_context=execution_context.model_copy(
            update={"parent_execution_id": graph_exec_id},  # records the parent-child link
        ),
    )
    # 2. Subscribe to the child graph's event bus and re-yield the output of every OUTPUT node as-is
    async for event in event_bus.listen(...):
        if event.status in [COMPLETED, TERMINATED, FAILED]:
            ...  # child graph finished; merge stats
        block = get_block(event.block_id)
        if block.block_type != BlockType.OUTPUT:
            continue
        output_name = event.input_data.get("name")  # assumed: the name configured on the OUTPUT node
        for output_data in event.output_data.get("output", []):
            yield output_name, output_data

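What makes this implementation elegant:

The child graph runs the full pipeline as a first-class GraphExecution (billed separately, its own HITL, separately visible in history).
The parent receives the child's output by subscribing to Redis pub/sub rather than through a direct function call, so the child can run on a different executor pod.
parent_execution_id lets the user drill down into nested traces in the UI.

4.6 Dry-Run / Simulated Execution

executor/simulator.py:

When execution_context.dry_run=True:

Most Blocks are not actually executed; instead, a lightweight LLM is called with the platform's OpenRouter key (simulate_block) to generate "plausible-looking" output, so the user can preview data flow on the canvas.
Special Blocks (AgentExecutorBlock, OrchestratorBlock) really execute via prepare_dry_run, but with platform credentials.
Credential fields skip validation during dry-run through validate_data(exclude_fields=cred_field_names).

This is an important difference between Platform and Classic: Classic is "run for real, then debug what breaks", while Platform uses dry-run to let users design graphs without burning money.

4.7 Billing and Rate Limiting

executor/billing.py + data/credit.py:

Each Block has a BlockCost(cost_amount, cost_filter, cost_type ∈ {RUN, BYTE, SECOND}).
Pre-charge: before a node is dispatched, charge_usage deducts a base fee from UserBalance; on failure it raises InsufficientBalanceError and the graph stops gracefully.
Post-charge: after an LLM Block finishes, extra_runtime_cost(stats) charges the additional token cost (for example, OrchestratorBlock calls the LLM several times in one run).
Low-balance alert: handle_low_balance sends an email when a threshold is crossed.
Rate limiting: copilot/rate_limit.py applies different multipliers (5x/20x/60x) according to subscriptionTier (FREE/PRO/BUSINESS/ENTERPRISE).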
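
The pre-charge plus post-charge split can be shown with a toy wallet. This is a sketch, not the platform's billing code: Wallet, run_llm_node, and the per-1k-token pricing are hypothetical, and only the order of operations (base fee before dispatch, usage-based fee afterwards) comes from the text.

```python
class InsufficientBalanceError(Exception):
    """Raised when a charge exceeds the remaining balance."""


class Wallet:
    def __init__(self, balance: int) -> None:
        self.balance = balance

    def charge(self, amount: int) -> int:
        # Check and deduct in one step; a real system would do this atomically in the DB.
        if amount > self.balance:
            raise InsufficientBalanceError(f"need {amount}, have {self.balance}")
        self.balance -= amount
        return self.balance


def run_llm_node(wallet: Wallet, base_cost: int, tokens_used: int, cost_per_1k: int) -> int:
    wallet.charge(base_cost)                       # pre-charge before dispatch
    # ... the block would run here and report its token usage ...
    extra = (tokens_used * cost_per_1k) // 1000    # post-charge from runtime stats
    return wallet.charge(extra)


remaining = run_llm_node(Wallet(100), base_cost=5, tokens_used=2500, cost_per_1k=10)
```

Charging the base fee first means a user with an empty balance is stopped before any paid API call is made; only the variable part is settled after the fact.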

5. Data Layer and Persistence Model

5.1 Schema Overview

schema.prisma has ~50 tables, grouped here into 9 categories:

Category	Main tables
Users and permissions	`User`, `UserOnboarding`, `Profile`, `APIKey`, `UserBalance`
Workspaces and files	`UserWorkspace`, `UserWorkspaceFile`, `SharedExecutionFile`
Agent definitions	`AgentGraph`, `AgentNode`, `AgentNodeLink`, `AgentBlock`, `AgentPreset`
Execution	`AgentGraphExecution`, `AgentNodeExecution`, `AgentNodeExecutionInputOutput`, `AgentNodeExecutionKeyValueData`
Library and store	`LibraryAgent`, `LibraryFolder`, `StoreListing`, `StoreListingVersion`, `StoreListingReview`, `UnifiedContentEmbedding` (pgvector)
Integrations and triggers	`IntegrationWebhook`, `PendingHumanReview`
Billing / analytics / notifications	`CreditTransaction`, `CreditRefundRequest`, `PlatformCostLog`, `AnalyticsDetails`, `NotificationEvent`, `UserNotificationBatch`
OAuth provider (two-way)	`OAuthApplication`, `OAuthAuthorizationCode`, `OAuthAccessToken`, `OAuthRefreshToken`
Platform linking	`PlatformLink`, `PlatformUserLink`, `ChatSession`, `ChatMessage`

5.2 Key Data Model Designs

5.2.1 AgentGraph Versioning

model AgentGraph {
  id      String  @default(uuid())
  version Int     @default(1)
  ...
  forkedFromId      String?
  forkedFromVersion Int?
  forkedFrom        AgentGraph?  @relation("AgentGraphForks", ...)
  forks             AgentGraph[] @relation("AgentGraphForks")

  @@id(name: "graphVersionId", [id, version])  // composite primary key
}

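(id, version) is a composite primary key: every save creates a new version, old versions remain referenceable and executable, and the frontend can "roll back". isActive determines which version is the user's currently active one. The fork chain traces agents copied from the Marketplace.

5.2.2 Separating NodeExecution Inputs and Outputs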

model AgentNodeExecution {
  id               String
  agentNodeId      String
  Input  AgentNodeExecutionInputOutput[] @relation("AgentNodeExecutionInput")
  Output AgentNodeExecutionInputOutput[] @relation("AgentNodeExecutionOutput")
  executionStatus  AgentExecutionStatus
  ...
}

model AgentNodeExecutionInputOutput {
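  name String        // name of the input or output pin
  data Json?
  ...
  @@unique([referencedByInputExecId, referencedByOutputExecId, name])
  @@index([name, time])    // key point: composite index used by upsert_execution_input
}

Why not simply executionData: Json? In a fan-in scenario, several upstream nodes write asynchronously to different pins of the same node. Splitting input/output into separate rows plus @@unique lets Postgres INSERT ... ON CONFLICT provide atomic merging out of the box.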
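
The row-per-pin upsert can be tried locally with the standard-library sqlite3 module, whose ON CONFLICT ... DO UPDATE clause has the same shape as the Postgres one. The table and column names below are simplified, hypothetical stand-ins for AgentNodeExecutionInputOutput, and SQLite is used only so the sketch runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE node_exec_input (
        exec_id TEXT NOT NULL,
        name    TEXT NOT NULL,   -- pin name
        data    TEXT,
        UNIQUE (exec_id, name)   -- one row per (execution, pin)
    )
    """
)


def upsert_input(exec_id: str, name: str, data: str) -> None:
    # Insert the pin value, or overwrite it if this pin already has a row.
    conn.execute(
        """
        INSERT INTO node_exec_input (exec_id, name, data)
        VALUES (?, ?, ?)
        ON CONFLICT (exec_id, name) DO UPDATE SET data = excluded.data
        """,
        (exec_id, name, data),
    )


upsert_input("exec-1", "left", "1")     # first upstream writes its pin
upsert_input("exec-1", "right", "2")    # second upstream writes a different pin
upsert_input("exec-1", "left", "10")    # same pin again: updated, not duplicated

rows = conn.execute(
    "SELECT name, data FROM node_exec_input WHERE exec_id = ? ORDER BY name",
    ("exec-1",),
).fetchall()
```

With a single JSON column, the two upstream writers would have to read, modify, and write the same value; with one row per pin, each writer touches only its own row and the unique constraint resolves collisions.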

5.2.3 Triggers and Webhooks

model AgentPreset {
  ...
  webhookId String?
  Webhook   IntegrationWebhook?
}

model IntegrationWebhook {
  ...
  // one webhook can map to multiple Presets
}

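AgentPreset is the entry point on the trigger side: the user configures a GitHub PR webhook → Platform registers it through the GitHub API → a callback arrives → the matching Preset is looked up → AgentPreset.InputPresets is used as the graph input → add_graph_execution.

5.3 DatabaseManager: the Pyro RPC Gateway

backend/data/db_manager.py is the only entry point through which processes access the DB (apart from migrations):

The server side exposes the DatabaseManager class with Pyro5; all methods are annotated with @expose.
The client side obtains a transparent proxy from get_database_manager_async_client(); calling a method on it is equivalent to an RPC.
Benefits:
- Unified connection pool: the number of Postgres connections does not grow linearly with the number of executor pods.
- Easier to add caching, rate limiting, and auditing: everything goes through one gateway.
- ORM decoupling: the executor does not depend on the Prisma client directly.

6. Frontend: Next.js + React Flow Visual Builder

6.1 Tech Stack at a Glance

Next.js 15 App Router, client-first (server components only for SEO/TTFB).
No hand-written API client: Orval generates React Query hooks from the backend OpenAPI spec; pnpm generate:api resyncs them in one step.
State management: React Query for server state; Zustand for cross-component UI state (the Builder, etc.); plain useState for local UI state.
Three-tier design system: atoms/molecules/organisms (atomic-design style), with Tailwind tokens + shadcn/ui + Phosphor Icons.
Testing pyramid: 90% Vitest + RTL + MSW integration tests (MSW handlers are also generated by Orval); Playwright for critical flows; Storybook + Chromatic for design-system components.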

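6.2 Route Structure

src/app/
├── layout.tsx                    # global root layout
├── (no-navbar)/                  # page group without navigation (pre-login)
└── (platform)/                   # authenticated route group
    ├── layout.tsx                # platform chrome
    ├── library/                  # my agent library
    ├── build/                    # visual builder (where most work happens)
    ├── marketplace/              # store
    ├── copilot/                  # CoPilot chat
    ├── admin/                    # admin console
    ├── login/, signup/, auth/    # authentication flows
    └── profile/

middleware.ts (Supabase session validation) protects the (platform) route group.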

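6.3 FlowEditor: a React Flow-Based Canvas

frontend/src/app/(platform)/build/components/FlowEditor/ ships its own architecture document, ARCHITECTURE_FLOW_EDITOR.md:

Flow.tsx                        # top-level orchestrator
 ├─ ReactFlow canvas
 ├─ 4 Zustand stores
 │   ├─ nodeStore        # nodes + status + "advanced" collapse state
 │   ├─ edgeStore        # edges + animated EdgeBead (data-flow particles)
 │   ├─ graphStore       # graph metadata + dirty flag
 │   └─ controlPanelStore
 ├─ nodes/               # node renderers (one per BlockType)
 ├─ edges/               # edge renderers
 └─ handlers/            # onConnect, onDrop, onPaste, etc.

Typical chain of user actions:

The user drags a block from NewControlPanel → onDrop → nodeStore.addBlock() (schema fetched from /api/blocks).
Dragging out a connection → onConnect → edgeStore.onConnect(), with source/sink handle type checking (color coding derived from BlockSchema).
Node forms are generated automatically from inputSchema (SchemaField → react-hook-form).
Saving a graph: useSaveGraph.ts serializes the current Zustand state into the CreateGraph body the backend expects and calls usePostV1CreateGraph.
Running: usePostV1ExecuteGraph(graphId) → the backend creates a GraphExecution → WebSocket pushes live status → useExecutionUpdates → nodeStore.updateNodeStatus() turns node borders green/red on the canvas.

6.4 Key Frontend Code Conventions (from frontend/CONTRIBUTING.md)

Client-first: do not write server components unless SEO/TTFB requires it.
Data fetching must use the generated hooks (use{Method}{Version}{OperationName}); new uses of BackendAPI (legacy) are forbidden.
Component layering: Component.tsx (pure rendering) + useComponent.ts (logic) + helpers.ts (pure functions); files ≤ 200 lines, functions ≤ 50 lines.
No barrel/index files, and no overuse of useMemo/useCallback.
Three tiers of error handling: render errors use <ErrorCard />, mutation errors use a toast, uncaught errors go to Sentry.
Feature flags go through LaunchDarkly + useGetFlag(Flag.X).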

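7. Classic / Forge: The Agent Component Framework

7.1 Fundamental Differences from Platform

Dimension	Platform	Classic / Forge
Orchestrator	The user (visual)	The LLM (autonomous)
Decision granularity	Fixed at design time	Asks the LLM for the next step at every "step"
State	Graph executions persisted in the DB	Agent State JSON + workspace files
Entry point	FastAPI / WebSocket	Agent Protocol REST (`/ap/v1`)
Concurrency	Multi-process + multi-thread + asyncio	Single process
Security model	Credentials + Webhook + HITL	Command pattern matching + workspace sandbox

7.2 BaseAgent and Pipeline

classic/forge/forge/agent/base.py: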

class BaseAgent(Generic[AnyProposal], metaclass=AgentMeta):
    @abstractmethod
    async def propose_action(self) -> AnyProposal: ...

    @abstractmethod
    async def execute(self, proposal, user_feedback) -> ActionResult: ...

    async def run_pipeline(self, protocol_method, *args, retry_limit=3):
        """
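        Iterate over every component that implements a given Protocol,
        call them in sequence, and collect the results. Tiered retries:
          ComponentEndpointError → retry the same component
          EndpointPipelineError  → restart the whole pipeline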
        """

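Core abstractions: Component + Protocol

AgentComponent: base class for pluggable modules; _run_after declares ordering dependencies (topologically sorted by AgentMeta).
Protocol: an interface contract (a plain Python class); a component inherits from multiple Protocols to declare what it can do:
- DirectiveProvider: supplies constraints/resources/best practices for the system prompt
- CommandProvider: supplies tools (decorated with @command)
- MessageProvider: injects history messages
- AfterParse / AfterExecute / ExecutionFailure: hooks
ConfigurableComponent[BM]: binds a Pydantic config class; from_env() reads environment variables.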
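
The tiered retry described in the run_pipeline docstring can be sketched as two nested retry loops. This is a simplified illustration, not Forge's implementation: the real method receives a protocol method rather than a method name, and the component classes below (Directives, Flaky, NoProtocol) are hypothetical.

```python
import asyncio


class ComponentEndpointError(Exception):
    """Recoverable: retry the same component."""


class EndpointPipelineError(Exception):
    """Recoverable: restart the whole pipeline."""


async def run_pipeline(components, method_name: str, retry_limit: int = 3) -> list:
    pipeline_attempts = 0
    while True:
        results = []
        try:
            for component in components:
                method = getattr(component, method_name, None)
                if method is None:
                    continue                      # component does not implement this protocol
                component_attempts = 0
                while True:
                    try:
                        results.append(await method())
                        break
                    except ComponentEndpointError:
                        component_attempts += 1
                        if component_attempts >= retry_limit:
                            raise
            return results
        except EndpointPipelineError:
            pipeline_attempts += 1                # discard partial results and start over
            if pipeline_attempts >= retry_limit:
                raise


class Directives:
    async def get_directives(self) -> str:
        return "be concise"


class Flaky:
    def __init__(self) -> None:
        self.calls = 0

    async def get_directives(self) -> str:
        self.calls += 1
        if self.calls == 1:
            raise ComponentEndpointError("transient failure")
        return "cite sources"


class NoProtocol:
    pass


flaky = Flaky()
collected = asyncio.run(run_pipeline([Directives(), NoProtocol(), flaky], "get_directives"))
```

Only the failing component is retried in the inner loop, so the work already done by earlier components is kept unless a pipeline-level error forces a full restart.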

7.3 Command System

classic/forge/forge/command/decorator.py:

@command(
    names=["greet"],
    description="...",
    parameters={"name": JSONSchema(type=JSONSchema.Type.STRING, required=True)},
)
def greet(self, name: str) -> str:
    return f"Hello, {name}!"

A decorated method becomes a Command object that can be converted into an OpenAI/Anthropic function-calling spec and handed to the LLM; when the LLM calls it, the call is deserialized and executed.

7.4 Built-in Component Matrix (from classic/forge/CLAUDE.md)

Component	Protocols implemented	Purpose
SystemComponent	Directive/Message/CommandProvider	System prompt + `finish` command
FileManagerComponent	Directive/CommandProvider	read/write/list files
CodeExecutorComponent	CommandProvider	Python/shell (Docker-isolated)
WebSearchComponent	Directive/CommandProvider	DuckDuckGo / Google
WebPlaywrightComponent	Directive/CommandProvider	Browser automation
ActionHistoryComponent	Message/AfterParse/AfterExecute	History summarization
WatchdogComponent	AfterParse	Detects loops, hot-switches to smart_llm
ContextComponent	Message/CommandProvider	Pins specified files in the context
ImageGeneratorComponent	CommandProvider	DALL·E / SD
GitOperationsComponent	CommandProvider	Git
UserInteractionComponent	CommandProvider	`ask_user`

7.5 Permission System (Classic Only)

{workspace}/.autogpt/autogpt.yaml:

allow:
  - read_file({workspace}/**)
  - web_search(*)
deny:
  - read_file(**.env)
  - execute_shell(rm -rf:*)

First-match-wins precedence: agent deny → workspace deny → agent allow → workspace allow → session-level deny → interactive prompt.
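
First-match-wins evaluation is easy to demonstrate with ordered rule layers. The sketch below is an approximation under assumptions: it uses Python's fnmatch wildcards rather than Classic's own pattern syntax (so the patterns differ slightly from the YAML above), it collapses the precedence chain to two layers, and LAYERS and decide are hypothetical names.

```python
from fnmatch import fnmatchcase

# Hypothetical rule layers, listed in evaluation order (first match wins).
LAYERS = [
    ("deny", ["read_file(*.env)", "execute_shell(rm -rf*)"]),
    ("allow", ["read_file(/workspace/*)", "web_search(*)"]),
]


def decide(command: str, argument: str) -> str:
    """Return 'deny', 'allow', or 'ask' for a command invocation."""
    call = f"{command}({argument})"
    for verdict, patterns in LAYERS:
        if any(fnmatchcase(call, pattern) for pattern in patterns):
            return verdict             # stop at the first layer that matches
    return "ask"                       # nothing matched: fall back to prompting the user


env_read = decide("read_file", "/workspace/.env")
notes_read = decide("read_file", "/workspace/notes.txt")
unknown = decide("execute_shell", "ls")
```

Because deny layers are checked before allow layers, the .env file is refused even though it also matches the broader workspace allow pattern.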

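8. Complete Production Deployment Guide

8.1 System Requirements

Linux (Ubuntu 22.04+ recommended) / macOS / Windows + WSL2
Docker Engine 20.10+, Docker Compose v2
4 CPU cores; 8 GB RAM minimum, 16 GB recommended
10 GB disk
Outbound HTTPS

8.2 One-Step Script (Fastest for Local Use)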

# Linux/macOS
curl -fsSL https://setup.agpt.co/install.sh -o install.sh && bash install.sh

# Windows PowerShell
powershell -c "iwr https://setup.agpt.co/install.bat -o install.bat; ./install.bat"

8.3 Manual Deployment (Recommended for Production)

# 1. Clone the code
git clone https://github.com/Significant-Gravitas/AutoGPT.git
cd AutoGPT/autogpt_platform

# 2. Copy the environment configuration
cp backend/.env.default backend/.env
cp frontend/.env.default frontend/.env
cp .env.default .env

# 3. Edit the key settings (required):
#   backend/.env:
#     - OPENAI_API_KEY or ANTHROPIC_API_KEY (used by the built-in LLM Blocks)
#     - JWT_SECRET_KEY (>= 32 random bytes)
#     - SUPABASE_JWT_SECRET
#     - PLATFORM_BASE_URL (used for webhook callbacks)
#     - OAuth client_id/secret pairs (GitHub/Google/Slack/...)
#   frontend/.env:
#     - NEXT_PUBLIC_AGPT_SERVER_URL=https://api.your.domain
#     - NEXT_PUBLIC_AGPT_WS_SERVER_URL=wss://ws.your.domain

# 4. Start the whole stack
docker compose up -d

# 5. Logs / health check
docker compose logs -f rest_server
curl http://localhost:8006/health

8.4 Service Ports and Reverse Proxy

Service	Internal port	Purpose
frontend	3000	Next.js
rest_server	8006	REST API + OpenAPI
websocket_server	8001	WS
executor	8002	(internal)
scheduler	8003	(internal)
database_manager	8005	Pyro RPC (internal)
copilot	8008	(internal)
supabase kong	8000	Auth proxy

Nginx reverse-proxy example (exposes only frontend / rest / ws / kong; the ssl_certificate directives are omitted for brevity):

server {
  listen 443 ssl http2;
  server_name app.your.domain;
  location /     { proxy_pass http://frontend:3000; }
}
server {
  listen 443 ssl http2;
  server_name api.your.domain;
  location /     { proxy_pass http://rest_server:8006; }
}
server {
  listen 443 ssl http2;
  server_name ws.your.domain;
  location / {
    proxy_pass http://websocket_server:8001;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 3600s;     # must be ≥ the longest agent execution time
  }
}
server {
  listen 443 ssl http2;
  server_name auth.your.domain;
  location /     { proxy_pass http://kong:8000; }
}

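8.5 Database and Extensions

Postgres must have pgvector installed (declared at schema.prisma:5) for embeddings.
The migrate container runs prisma migrate deploy automatically; do not run prisma migrate dev by hand in production.
Backup: pg_dump the whole database; after a restore, run prisma generate again.

8.6 Credentials / OAuth Configuration

For each third-party provider implemented under autogpt_platform/backend/backend/integrations/oauth/, the callback URL to register with that provider is https://api.your.domain/api/integrations/{provider}/callback. Common settings: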

# GitHub
GITHUB_CLIENT_ID=...
GITHUB_CLIENT_SECRET=...

# Google
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...

# OpenAI / OpenRouter (system-level; the OpenRouter key is the one used for dry-run simulation)
OPENAI_API_KEY=...
OPENROUTER_API_KEY=...

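8.7 Webhook Triggers

For Webhook Blocks to work, PLATFORM_BASE_URL must be a publicly reachable HTTPS URL. For local development you can use ngrok:

ngrok http 8006
# Write the temporary URL into backend/.env: PLATFORM_BASE_URL=https://abc.ngrok.io

8.8 Horizontal Scaling

executor scales horizontally: docker compose up -d --scale executor=3; RabbitMQ distributes work according to prefetch, and ClusterLock prevents double runs.
rest_server scales horizontally: it is stateless, so put a load balancer in front.
scheduler does not scale horizontally: it holds scheduling state, so run exactly 1.
database_manager also runs as a single instance: it is the connection funnel itself.
websocket can scale, but the deployment needs either sticky sessions (the same user lands on the same pod) or instances that share Redis pub/sub (already implemented).

9. Extension Development in Practice: Custom Blocks

9.1 Minimal Working Block

Create backend/blocks/my_calculator.py: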

from backend.blocks._base import (
    Block, BlockCategory, BlockOutput,
    BlockSchemaInput, BlockSchemaOutput,
)
from backend.data.model import SchemaField


class MultiplyBlock(Block):
    class Input(BlockSchemaInput):
        a: float = SchemaField(description="First operand")
        b: float = SchemaField(description="Second operand")

    class Output(BlockSchemaOutput):
        product: float = SchemaField(description="a * b")

    def __init__(self):
        super().__init__(
            # Example value only: generate your own once with
            #   python -c "import uuid; print(uuid.uuid4())"
            # and paste the literal here. Calling uuid.uuid4() at runtime would change the id on every start.
            id="7c1d3e5a-2b4f-4c6d-8e9f-0a1b2c3d4e5f",
            description="Multiply two numbers",
            categories={BlockCategory.BASIC},
            input_schema=MultiplyBlock.Input,
            output_schema=MultiplyBlock.Output,
            test_input={"a": 6, "b": 7},
            test_output=[("product", 42)],
        )

    async def run(self, input_data: Input, **kwargs) -> BlockOutput:
        yield "product", input_data.a * input_data.b

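Test:

poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[MultiplyBlock]' -xvs

On the next rest_server start, initialize_blocks() registers it in the AgentBlock table and it appears automatically in the frontend block panel.

9.2 A Block with Credentials (HTTP API Example)

from typing import Literal

from backend.blocks._base import Block, BlockOutput, BlockSchemaInput, BlockSchemaOutput
from backend.data.model import APIKeyCredentials, CredentialsField, CredentialsMetaInput, SchemaField
from backend.integrations.providers import ProviderName
from backend.util.request import Requests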

CredsType = CredentialsMetaInput[
    Literal[ProviderName.OPENAI], Literal["api_key"]
]

class MyApiBlock(Block):
    class Input(BlockSchemaInput):
        url: str = SchemaField(description="...")
        credentials: CredsType = CredentialsField(...)  # must be named credentials

    class Output(BlockSchemaOutput):
        body: dict = SchemaField(description="...")

    def __init__(self):
        super().__init__(id="...", ...)

    async def run(
        self, input_data: Input, *,
        credentials: APIKeyCredentials,   # note: this receives the decrypted object, not the meta
        **kwargs,
    ) -> BlockOutput:
        async with Requests() as r:
            resp = await r.get(
                input_data.url,
                headers={"Authorization": f"Bearer {credentials.api_key.get_secret_value()}"},
            )
            yield "body", resp.json()

9.3 Webhook Trigger Block

from backend.blocks._base import BlockWebhookConfig

class GitHubPRWebhookBlock(Block):
    class Input(BlockSchemaInput):
        credentials: CredsType = CredentialsField(...)
        repo: str = SchemaField(description="owner/repo")
        events: GithubPREventFilter = SchemaField(...)  # a BaseModel whose fields are all bool
        payload: dict = SchemaField(hidden=True)        # required for webhook blocks

    def __init__(self):
        super().__init__(
            id="...",
            block_type=BlockType.WEBHOOK,    # set automatically
            webhook_config=BlockWebhookConfig(
                provider=ProviderName.GITHUB,
                webhook_type="repo",
                event_filter_input="events",
                event_format="pull_request.{event}",
                resource_format="{repo}",
            ),
            ...
        )

    async def run(self, input_data: Input, **kwargs) -> BlockOutput:
        yield "payload", input_data.payload

The platform then:

When the user configures an AgentPreset for this Block → automatically calls the GitHub API to register a webhook pointing at PLATFORM_BASE_URL/api/integrations/github/webhook/{webhook_id}.
GitHub pushes a callback → the matching webhook_id is found → the associated Preset is found → a GraphExecution is created.

9.4 File Blocks

store_media_file() in backend/util/file.py is the key:

local_path = await store_media_file(
    file=input_data.image,
    execution_context=execution_context,
    return_format="for_local_processing",   # returns a local path, for ffmpeg/PIL
)

Three return_format values:

"for_local_processing" → local path (for local tools)
"for_external_api" → data URI (for Replicate/OpenAI, etc.)
"for_block_output" → switches automatically (workspace:// inside CoPilot, data URI inside a regular graph)

Always use for_block_output for returned outputs.

9.5 Custom Provider Credentials

ProviderBuilder in backend/sdk/:

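from backend.sdk import BlockCostType, ProviderBuilder  # BlockCostType is assumed to be re-exported by backend.sdk

# Parentheses are required so the chained calls can span several lines.
myservice = (
    ProviderBuilder("myservice")
    .with_api_key(env_var="MYSERVICE_API_KEY", title="MyService API Key")
    .with_base_cost(amount=1, type=BlockCostType.RUN)
    .register()
)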

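This automatically:

exposes the OAuth/API key configuration UI
treats the environment variable as a system-level credential (no per-user setup needed)
syncs the cost into BlockCost

9.6 Unit Test Template

backend/blocks/test/test_block.py provides the framework: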

@pytest.mark.parametrize("block_cls", get_blocks().values(), ids=lambda b: b.__name__)
async def test_available_blocks(block_cls):
    block = block_cls()
    if block.test_input is None:
        pytest.skip("no test_input")

    inputs = block.test_input if isinstance(block.test_input, list) else [block.test_input]
    expected = block.test_output if isinstance(block.test_output, list) else [block.test_output]

    actual_outputs = []
    for ti in inputs:
        async for name, data in block.execute(ti, **mock_kwargs):
            actual_outputs.append((name, data))

    assert outputs_match(actual_outputs, expected)

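If a Block calls an external API, test_mock={"_call_api": lambda url: {...}} replaces that method during tests.

10. Production Operations Essentials

10.1 Observability

Prometheus: rest_server / executor expose /metrics; key metrics:
- execution_manager_active_runs
- execution_manager_pool_size
- execution_manager_utilization_ratio
Sentry: the code makes heavy use of _sentry_capture_exception (see manager.py:387). Set SENTRY_DSN to enable it.
Structured logging: TruncatedLogger in backend/util/logging.py automatically truncates overlong fields (keeps logs from filling the disk).
Auditing: PlatformCostLog attributes every call made with system-level credentials (such as the OpenRouter dry-run key) to a specific user.

10.2 Security Checklist

✅ CORS allowlist: backend_cors_allow_origins must be narrowed down; never use * (rest_api.py allows all methods/headers with * by default for flexibility, but origins are controlled by env).
✅ Cache middleware: middleware/security.py disables all caching by default and allowlists only static assets.
✅ JWT: authentication is fully delegated to Supabase, via autogpt_libs.auth.jwt_utils.get_jwt_payload.
✅ Credential encryption: the User.integrations field is encrypted by migrate_and_encrypt_user_integrations.
✅ Virus scanning: user uploads must pass through ClamAV (CLAMAV_SERVICE_HOST); files are rejected on failure.
✅ Path leak prevention: error messages are trimmed with os.path.basename() so directory structure is not leaked (explicitly required by the backend AGENTS.md).
✅ TOCTOU: billing/file checks avoid check-then-act and use atomic operations.
⚠️ Polyform Shield license: autogpt_platform may not be used for a "substantially competing service"; self-hosting for personal/team use is fine, but read the license text before reselling it as SaaS.

10.3 Performance Tuning

num_graph_workers (env) = number of graphs each executor container runs concurrently. Larger is better as far as CPU/memory allow, but it is bounded by the Postgres connection pool.
fastapi_thread_pool_size (env; the default of 40 is too low) = size of the FastAPI default thread pool that runs synchronous endpoints/dependencies; 200+ is recommended.
GZip is already enabled at rest_api.py:200 with a 50KB threshold (mainly to protect large schema listings such as /api/blocks).
WebSocket long-lived connections: the reverse proxy's proxy_read_timeout must be ≥ the longest single-graph execution time; otherwise a dropped connection leaves the frontend state inconsistent.
RabbitMQ consumer prefetch_count = pool_size (manager.py:1455), which keeps a single pod from draining the queue through excessive prefetching.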

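10.4 Troubleshooting Common Problems

Symptom	Cause	Fix
Graph stuck in QUEUED	executor not running / RabbitMQ unreachable	`docker compose logs executor`, check `rabbitmq` health
Node stuck in INCOMPLETE	An upstream pin has no value or validation failed	Check `validation_msg` in the `validate_exec` logs
Webhook does not fire	`PLATFORM_BASE_URL` is wrong / not HTTPS	Fix the env, restart, then delete and recreate the Preset
Credentials "not found"	User changed password / token expired	Have the user re-authorize under `Integrations` in the frontend
Pyro connection errors	`database_manager` not running / wrong network name	Inspect the docker network
Balance goes negative	Multiple charges during concurrent execution	See `credit_concurrency_test.py`; make sure Redis locks are available
Graph execution idempotency issues	The same node is re-entered	The `ExecutionStatus` state machine already handles this; check whether a status was changed manually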

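10.5 Upgrades and Backups

Schema changes: during development, poetry run prisma migrate dev generates a migration → commit it; in production the migrate container runs migrate deploy automatically.
Rollback: Prisma has no automatic down migrations; either hand-write the down SQL or back up first and roll forward.
Backup: daily pg_dump + 7-day retention + an off-site copy.
Redis does not need persistence (it only holds locks, queue helpers, and counters; RabbitMQ is the source of truth for messages).
RabbitMQ queues are declared durable with persistent messages; when migrating brokers, drain first and then switch.

Appendix: Key Source Index

All paths are relative to a local AutoGPT/ clone.

Docs / Conventions

Repository-wide conventions: AGENTS.md
Platform conventions: autogpt_platform/AGENTS.md
Backend conventions: backend/AGENTS.md
Backend testing guide: backend/TESTING.md
Official documentation: docs.agpt.co

Closing Remarks

AutoGPT Platform embodies the key design elements of a mature, production-grade agent orchestration platform:

Visual + type-driven: users write no code, yet every connection is backed by Pydantic schema validation.
Heterogeneous concurrency: process-level isolation of failure domains + thread-level isolation of graphs + asyncio to squeeze the most out of I/O.
Fully recoverable state: the AgentExecutionStatus state machine can resume from any intermediate state (including human REVIEW).
HITL as a first-class citizen: blocks before dangerous operations, allows edits, leaves an audit trail.
Zero-trust credentials: never in the input, fetched at run time, released after execution, guarded against concurrency by Redis locks.
Dry-run economics: users simulate on platform credentials at no cost to themselves, which lowers the cost of trial and error.
API-as-data: backend OpenAPI → Orval-generated frontend hooks → MSW handlers; defined once, reused in three places.

Classic / Forge, in turn, is a fine specimen of a different paradigm: LLM-as-orchestrator + Component/Protocol plugins, well suited as the skeleton of a research-oriented agent framework.

Two codebases, two philosophies, one repository: AutoGPT's dual-track structure is itself excellent teaching material on how agent systems evolve.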
