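Trae Agent Patch Generation and Selection Mechanism
Overview
The Trae Agent project uses a two-stage strategy to solve software engineering problems: it generates multiple candidate patches, then has a Selector Agent choose among them. This document analyzes how the mechanism works and how its code is structured.
1. Overall Workflow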
┌─────────────────────────────────────────────────────────────────────────────┐
│ Phase 1: Patch Generation │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
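│ 1. Run TraeAgent multiple times (each run is independent) │
│ - Same issue description │
│ - Same codebase │
│ - Different random seed or temperature │
│ - Produces multiple candidate patches │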
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 2. Collect candidate patches │
│ - Save each patch's diff │
│ - Record regression test results (optional) │
│ - Produce candidate_patches.jsonl │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ Phase 2: Patch Selection │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 3. Selector Agent evaluation │
│ - Analyze the correctness of each patch │
│ - Validate in a sandbox (optional) │
│ - Select the best patch │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ 4. Majority voting (optional) │
│ - Run the Selector Agent multiple times │
│ - Count how often each patch is selected │
│ - Select the patch with the most votes │
└─────────────────────────────────────────────────────────────────────────────┘
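2. Phase 1: Patch Generation
2.1 Generating Multiple Patches via Repeated Runs
Principle: running TraeAgent several times independently exploits the LLM's sampling randomness to produce different solutions.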
# Pseudocode sketch
async def generate_multiple_patches(instance, model, num_candidates=10):
    """Generate multiple candidate patches for one instance."""
    candidates = []
    for i in range(num_candidates):
        # Use slightly different parameters on each run
        config = TraeAgentConfig(
            model=model,
            temperature=0.5 + i * 0.05,  # nudge the temperature per run
            max_steps=200,
        )
        agent = TraeAgent(config)
        execution = await agent.execute_task()
        # Extract the generated patch
        patch = extract_patch_from_execution(execution)
        candidates.append({
            "instance_id": instance["instance_id"],
            "patch": patch,
            "regression": [],  # regression test results
        })
    return candidates
2.2 Candidate Patch Data Format
File: evaluation/patch_selection/example/example.jsonl
{
    "instance_id": "astropy__astropy-14369",
    "issue": "issue description...",
    "patches": [
        "patch diff 1",
        "patch diff 2",
        ...,
        "patch diff N"
    ],
    "success_id": [
        1,
        0,
        ...,
        1
    ],
    "regressions": [
        [], // Patch 1 passes all regression tests
        ["test_a"], // Patch 2 fails a regression test
        ...,
        [] // Patch N passes all regression tests
    ]
}
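| Field | Description |
|---|---|
| instance_id | Issue instance ID |
| issue | Issue description |
| patches | List of candidate patches (diff format) |
| success_id | Whether each patch is correct (1 = correct, 0 = incorrect) |
| regressions | For each patch, the names of the regression tests that failed |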
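The record layout above implies that `patches`, `success_id`, and `regressions` are parallel lists with one entry per candidate. A minimal sketch of a loader that checks this, assuming one JSON object per line; the helper names `validate_candidate` and `load_candidates` are hypothetical and not part of the repository:

```python
import json
from typing import Any

PARALLEL_FIELDS = ("patches", "success_id", "regressions")


def validate_candidate(record: dict[str, Any]) -> None:
    """Raise ValueError if a field is missing or the per-patch lists differ in length."""
    required = ("instance_id", "issue", *PARALLEL_FIELDS)
    missing = [key for key in required if key not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    lengths = {key: len(record[key]) for key in PARALLEL_FIELDS}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"per-patch lists differ in length: {lengths}")


def load_candidates(lines: list[str]) -> dict[str, dict[str, Any]]:
    """Parse JSONL lines into a dict keyed by instance_id, skipping blank lines."""
    candidates = {}
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        validate_candidate(record)
        candidates[record["instance_id"]] = record
    return candidates


# Illustrative record with two candidate patches.
sample = json.dumps({
    "instance_id": "astropy__astropy-14369",
    "issue": "issue description",
    "patches": ["patch diff 1", "patch diff 2"],
    "success_id": [1, 0],
    "regressions": [[], ["test_a"]],
})
loaded = load_candidates([sample, ""])
print(list(loaded))
```

Checking the lengths up front turns a misaligned record into an immediate error rather than a silently mismatched patch and regression result later in the pipeline.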
3. Phase 2: Patch Selection
3.1 Entry Point: selector.py
File: evaluation/patch_selection/selector.py
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--candidate_path", required=True)  # candidate patch file
    parser.add_argument("--num_candidate", type=int, default=10)
    parser.add_argument("--group_size", type=int, default=10)
    parser.add_argument("--majority_voting", action="store_true")
    args = parser.parse_args()
    # Excerpt: the other arguments read from args below, and the setup of
    # llm_config, instance_list, and tools_path, are omitted here.
    # Load the candidate patches (one JSON record per line)
    candidate_dic = {}
    with open(args.candidate_path, "r") as file:
        for line in file.readlines():
            candidate = json.loads(line.strip())
            candidate_dic[candidate["instance_id"]] = candidate
    # Create the evaluator
    evaluation = SelectorEvaluation(
        llm_config,
        args.num_candidate,
        args.max_retry,
        args.max_turn,
        args.log_path,
        args.output_path,
        args.patches_path,
        instance_list,
        candidate_dic,
        tools_path,
        args.statistics_path,
        args.group_size,
        majority_voting=args.majority_voting,
    )
    # Run the evaluation
    evaluation.run_all(max_workers=args.max_workers)
3.2 Grouping Strategy
File: evaluation/patch_selection/trae_selector/selector_evaluation.py
def run_instance(
    instance,
    candidate_log,
    num_candidate: int,
    group_size: int,
    ...
):
    """
    Split the candidate patches into groups and process each group.
    Example: 50 candidates with group_size=10 yield 5 groups.
    """
    groups = []
    for i in range(0, num_candidate, group_size):
        this_group = {
            "instance_id": candidate_log["instance_id"],
            "issue": candidate_log["issue"],
            "patches": candidate_log["patches"][i:i + group_size],
            "regressions": candidate_log["regressions"][i:i + group_size],
            "success_id": candidate_log["success_id"][i:i + group_size],
        }
        groups.append(this_group)
    # Each group is selected independently
    for group_id, group in enumerate(groups):
        run_instance_by_group(
            instance=instance,
            candidate_log=group,
            group_id=group_id,
            num_groups=len(groups),
            ...
        )
3.3 Patch Preprocessing Pipeline
def run_instance_by_group(...):
    """Processing pipeline for a single group of patches."""
    # 1. Build the candidate list
    candidate_list = []
    for idx in range(len(candidate_log["patches"])):
        cleaned_patch = clean_patch(candidate_log["patches"][idx])
        is_success_regression = len(candidate_log["regressions"][idx]) == 0
        candidate_list.append(CandidatePatch(
            id=idx,
            patch=candidate_log["patches"][idx],
            cleaned_patch=cleaned_patch,
            is_success_regression=is_success_regression,
            is_success_patch=candidate_log["success_id"][idx],
        ))
    # 2. Regression test filter: keep only passing patches, unless none pass
    candidate_list_regression = [
        c for c in candidate_list if c.is_success_regression
    ]
    if len(candidate_list_regression) > 0:
        candidate_list = candidate_list_regression
    # 3. De-duplicate patches by their cleaned text, keeping the first occurrence
    candidate_list_deduplication = []
    cleaned_candidate_set = set()
    for candidate in candidate_list:
        if candidate.cleaned_patch not in cleaned_candidate_set:
            cleaned_candidate_set.add(candidate.cleaned_patch)
            candidate_list_deduplication.append(candidate)
    candidate_list = candidate_list_deduplication
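The filter-then-deduplicate steps can be tried in isolation. The following is a minimal sketch, not the repository's implementation: the `Candidate` dataclass is a simplified stand-in for `CandidatePatch`, and because the source does not show `clean_patch`, the version here (dropping `index ...` lines and trailing whitespace) is purely an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    id: int
    patch: str
    regressions: list[str]


def clean_patch(patch: str) -> str:
    """Hypothetical normalizer: drop 'index ...' lines and trailing whitespace."""
    kept = [ln.rstrip() for ln in patch.splitlines() if not ln.startswith("index ")]
    return "\n".join(kept).strip()


def preprocess(candidates: list[Candidate]) -> list[Candidate]:
    """Filter by regression results, then de-duplicate by cleaned patch text."""
    # Keep only regression-clean patches, unless that would leave nothing.
    passing = [c for c in candidates if not c.regressions]
    pool = passing or candidates
    # Keep the first occurrence of each distinct cleaned patch.
    seen: set[str] = set()
    unique = []
    for c in pool:
        key = clean_patch(c.patch)
        if key not in seen:
            seen.add(key)
            unique.append(c)
    return unique


candidates = [
    Candidate(0, "index abc..def\n-a\n+b\n", []),
    Candidate(1, "index 123..456\n-a\n+b   \n", []),  # same change as candidate 0
    Candidate(2, "-a\n+c\n", ["test_a"]),             # fails a regression test
    Candidate(3, "-a\n+d\n", []),
]
print([c.id for c in preprocess(candidates)])  # [0, 3]
```

The fallback to the unfiltered list matters when every candidate fails some regression test: the selector still receives candidates to choose from instead of an empty list.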
3.4 The CandidatePatch Data Structure
File: evaluation/patch_selection/trae_selector/selector_agent.py
class CandidatePatch:
    def __init__(
        self,
        id,                     # patch ID
        patch,                  # original patch
        cleaned_patch,          # cleaned patch (used for de-duplication)
        is_success_regression,  # whether it passes the regression tests
        is_success_patch,       # whether it is correct (ground truth)
    ):
        self.id = id
        self.patch = patch
        self.cleaned_patch = cleaned_patch
        self.is_success_regression = is_success_regression
        self.is_success_patch = is_success_patch
4. Selector Agent Core Logic
4.1 System Prompt Design
File: evaluation/patch_selection/trae_selector/selector_agent.py
def build_system_prompt(candidate_length: int) -> str:
    return f"""\
# ROLE: Act as an expert code evaluator.
Given a codebase, an github issue and **{candidate_length} candidate patches**
proposed by your colleagues, your responsibility is to **select the correct one**
to solve the issue.
# WORK PROCESS:
1. Understand the Issue and Codebase
- Read the issue description
- Review the codebase context
2. Analyze the Candidate Patches
- Analyze the logic of each patch
- Compare the code changes against the issue description
3. Validate Functionality (Optional but Recommended)
- Write unit tests to validate
- Run tests to check for side effects
4. Select the Best Patch
- Choose the best solution
# FINAL REPORT:
### Status: succeed
### Result: Patch-x
### Analysis: [explain why Patch-x is correct]
# IMPORTANT TIPS:
1. Never avoid making a selection.
2. Do not propose new patches.
3. There must be at least one correct patch.
"""
4.2 SelectorAgent Execution Flow
class SelectorAgent:
    def __init__(
        self,
        llm_config: ModelConfig,
        sandbox: Sandbox,
        project_path: str,
        issue_description: str,
        candidate_list: list[CandidatePatch],
        max_turn: int = 50,
    ):
        self.llm_config = llm_config
        self.sandbox = sandbox
        self.candidate_list = candidate_list
        self.max_turn = max_turn
        # Initialize the tools
        self.tools = [
            tools_registry["bash"](model_provider=llm_config.model_provider.provider),
            tools_registry["str_replace_based_edit_tool"](...),
        ]
        # Build the initial messages
        self.initial_messages = [
            LLMMessage(role="system", content=build_system_prompt(len(candidate_list)))
        ]
        # Add the user prompt (contains every candidate patch)
        user_prompt = f"""
[Codebase path]: {project_path}
[Github issue description]:
{issue_description}
[Candidate Patches]:
"""
        for idx, candidate in enumerate(candidate_list):
            user_prompt += f"\nPatch-{idx + 1}:\n```\n{candidate.patch}\n```"
        self.initial_messages.append(LLMMessage(role="user", content=user_prompt))
4.3 Selection Loop
def run(self):
    """Selector Agent main loop."""
    messages = self.initial_messages
    turn = 0
    # Default to the first candidate if no selection is ever reported
    final_id = self.candidate_list[0].id
    final_patch = self.candidate_list[0].patch
    while turn < self.max_turn:
        turn += 1
        # 1. Call the LLM
        llm_response = self.llm_client.chat(messages, self.llm_config, self.tools)
        # 2. Check whether a selection has been made
        match = re.search(
            r"Status:\s*(success|succeed).*\n.*Result:\s*Patch-(\d+)",
            llm_response.content,
        )
        if match:
            # Extract the selected patch (Patch-N is 1-based)
            selected_idx = int(match.group(2)) - 1
            final_id = self.candidate_list[selected_idx].id
            final_patch = self.candidate_list[selected_idx].patch
            break
        # 3. Execute tool calls (inspect code, run tests, etc.)
        tool_results = parse_tool_response(llm_response, self.sandbox_session)
        messages.extend(tool_results)
    return final_id, final_patch
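The loop above indexes `candidate_list` directly with the number parsed from the report, so a report that names a non-existent patch would raise an `IndexError`. The sketch below reuses the regular expression from the excerpt and adds a bounds check; the helper name `parse_selection` is hypothetical and the bounds check is an addition for illustration, not something the source shows.

```python
import re
from typing import Optional

# Same pattern as in the excerpt: a Status line followed by a Result line.
REPORT_PATTERN = re.compile(r"Status:\s*(success|succeed).*\n.*Result:\s*Patch-(\d+)")


def parse_selection(content: str, num_candidates: int) -> Optional[int]:
    """Return the zero-based index of the selected patch, or None if absent or invalid."""
    match = REPORT_PATTERN.search(content)
    if match is None:
        return None
    index = int(match.group(2)) - 1  # Patch-1 maps to index 0
    if not 0 <= index < num_candidates:
        return None  # the report named a patch that does not exist
    return index


report = "### Status: succeed\n### Result: Patch-2\n### Analysis: fixes the unit parser."
print(parse_selection(report, num_candidates=3))                 # 1
print(parse_selection("Still investigating.", 3))                # None
print(parse_selection("Status: succeed\nResult: Patch-9", 3))    # None
```

Returning `None` for an invalid report lets the caller fall back to the default candidate or continue the conversation, instead of aborting the run.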
5. Majority Voting
5.1 Implementation
File: evaluation/patch_selection/trae_selector/selector_evaluation.py
from collections import Counter

def run_instance_by_group(..., majority_voting=True):
    """Select a patch, optionally by majority vote."""
    if majority_voting:
        final_id_list = []
        final_patch_list = []
        # Run the Selector Agent up to num_candidate times
        for idx in range(num_candidate):
            select_agent = SelectorAgent(
                llm_config=llm_config,
                sandbox=sandbox,
                project_path=project_path,
                issue_description=instance["problem_statement"],
                candidate_list=candidate_list,
                max_turn=max_turn,
            )
            final_id, final_patch = select_agent.run()
            final_id_list.append(final_id)
            final_patch_list.append(final_patch)
            # Early termination: one patch already holds more than half of the planned votes
            if max(Counter(final_id_list).values()) > num_candidate / 2:
                break
        # Tally the votes
        counter = Counter(final_id_list)
        max_count = max(counter.values())
        most_common_ids = [elem for elem, count in counter.items() if count == max_count]
        # Pick the top-voted patch (ties go to the id that was voted for first)
        final_id = most_common_ids[0]
        final_patch = final_patch_list[final_id_list.index(final_id)]
    else:
        # Single selection
        select_agent = SelectorAgent(...)
        final_id, final_patch = select_agent.run()
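The voting rule is easier to test once it is separated from the agent. A minimal sketch, assuming each Selector Agent run can be modeled as a zero-argument callable that returns a patch id; `majority_vote` and `select_once` are hypothetical names introduced for illustration:

```python
from collections import Counter
from typing import Callable


def majority_vote(select_once: Callable[[], int], num_runs: int) -> tuple[int, int]:
    """Run select_once up to num_runs times; return (winning id, runs used)."""
    if num_runs < 1:
        raise ValueError("num_runs must be at least 1")
    votes: list[int] = []
    for _ in range(num_runs):
        votes.append(select_once())
        # Stop early once one id holds a strict majority of the planned runs.
        if max(Counter(votes).values()) > num_runs / 2:
            break
    # most_common orders equal counts by first appearance, so the id that
    # was voted for first wins a tie.
    winner, _count = Counter(votes).most_common(1)[0]
    return winner, len(votes)


# Scripted stand-in for five Selector Agent runs.
scripted = iter([3, 3, 1, 3, 1])
print(majority_vote(lambda: next(scripted), num_runs=5))  # (3, 4)
```

With five planned runs, the third vote for patch 3 arrives on run 4 and exceeds 5 / 2, so the fifth run is never executed.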
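5.2 Voting Walkthrough
Run 1: Selector Agent → selects Patch-3
Run 2: Selector Agent → selects Patch-3
Run 3: Selector Agent → selects Patch-1
Run 4: Selector Agent → selects Patch-3 ← more than half of num_candidate (3 > 5 / 2, assuming num_candidate = 5), stop early
Vote tally:
- Patch-3: 3 votes (75%)
- Patch-1: 1 vote (25%)
Final result: Patch-3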
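6. Key Design Features
6.1 Why Is a Selector Agent Needed?
| Problem | Solution |
|---|---|
| Which of several patches is best? | The agent analyzes the candidates |
| Code context must be understood | Tools are provided to inspect the code |
| Correctness must be verified | Tests can be run for validation |
| Manual judgment should be avoided | The selection process is automated |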
6.2 Advantages of the Grouping Strategy
# 50 candidates processed directly → context too long for the LLM to handle well
# 50 candidates with group_size=10 → 5 groups of 10
Group 0: Patch 0-9 → selects Patch-3
Group 1: Patch 10-19 → selects Patch-12
Group 2: Patch 20-29 → selects Patch-25
Group 3: Patch 30-39 → selects Patch-31
Group 4: Patch 40-49 → selects Patch-42
Final round: Patch [3, 12, 25, 31, 42] → select the best
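The two rounds listed above can be sketched as a small tournament. The excerpt in section 3.2 only shows the per-group selection, so the final round here is an assumption: it simply reuses the same selector on the group winners. `chunk`, `two_stage_select`, and the `select_best` callback (standing in for one Selector Agent run) are hypothetical names.

```python
from typing import Callable, Sequence


def chunk(items: Sequence[int], group_size: int) -> list[list[int]]:
    """Split items into consecutive groups of at most group_size."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [list(items[i:i + group_size]) for i in range(0, len(items), group_size)]


def two_stage_select(
    patch_ids: Sequence[int],
    group_size: int,
    select_best: Callable[[list[int]], int],
) -> int:
    """Pick one winner per group, then pick among the group winners."""
    if len(patch_ids) == 0:
        raise ValueError("at least one candidate is required")
    winners = [select_best(group) for group in chunk(patch_ids, group_size)]
    # A single group needs no final round.
    return winners[0] if len(winners) == 1 else select_best(winners)


groups = chunk(range(50), 10)
print(len(groups), groups[1][0], groups[1][-1])  # 5 10 19

# Stand-in selector for the demo: prefer the largest id in each group.
print(two_stage_select(range(50), 10, max))  # 49
```

Each selector call sees at most `group_size` patches in the first round and one patch per group in the final round, which keeps every prompt bounded regardless of the total number of candidates.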
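6.3 Role of the Preprocessing Steps
| Step | Purpose |
|---|---|
| Regression test filter | Exclude patches with obvious side effects |
| Patch de-duplication | Avoid repeated analysis and improve efficiency |
| Format cleanup | Normalize patches so they are easy to compare |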
7. Usage Examples
7.1 Generating Candidate Patches
# Run TraeAgent multiple times to generate patches
for i in {1..10}; do
    trae-cli run "Fix the bug" --output patch_$i.diff
done
# Combine them into a single candidates JSONL file
python combine_patches.py --patches patch_*.diff --output candidates.jsonl
7.2 Running Patch Selection
python evaluation/patch_selection/selector.py \
--instances_path "swebench-verified.json" \
--candidate_path "candidates.jsonl" \
--result_path "./results" \
--num_candidate 10 \
--group_size 10 \
--max_workers 10 \
--config_file trae_config.yaml \
--model_name claude-sonnet \
--majority_voting
7.3 Viewing Results
# Analyze the selection results
python evaluation/patch_selection/analysis.py --result_path ./results
# Output layout
results/
├── log/ # LLM interaction logs
├── output/ # standard output
├── patch/ # selected patches
└── statistics/ # statistics
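8. Summary
| Stage | Core component | Key design |
|---|---|---|
| Patch generation | TraeAgent (multiple runs) | Exploits LLM randomness |
| Candidate management | candidate_patches.jsonl | Structured storage |
| Preprocessing | Regression tests + de-duplication | Filters out low-quality patches |
| Selection | Selector Agent | LLM analysis + tool-based validation |
| Result confirmation | Majority voting | Improves selection stability |
Core idea: combining diverse generation with intelligent selection raises the success rate on complex software engineering problems.
Last updated: 2026-03-16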