How the Agent in the IDE's right-hand panel works

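The Agent you mention, the one on the right side of the IDE editor, is essentially an AI worker that can perceive context, take actions, and hold a continuing conversation. Below I go from product form → technical architecture → Agent capability breakdown → implementation path, and lay out how the right-hand Agent in mainstream IDEs (such as Cursor and the VS Code family) is built.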

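I. The role of the right-hand Agent inside the IDE

It is not a chatbot; it is "a programmer who knows how to use the IDE".

The right-hand Agent typically takes on five kinds of responsibility:

Code understanding (Read)
- The currently open file
- Cursor position / selected code
- Project structure (tree)
- Git diff / history
Code generation (Write)
- Insert / replace code
- Edit files in bulk
- Generate templates and unit tests that follow conventions
Tool execution (Act)
- Run commands (lint / test / build)
- Run scripts
- Search code
Continuing conversation (Think)
- Keep context
- Multi-turn reasoning
- Self-correction
State awareness (Observe)
- Compile errors
- Lint results
- Test failure logs

👉 This is a ReAct / tool-using Agent, not a plain ChatCompletion call.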

II. Overall technical architecture (the core)

┌──────── IDE Plugin (frontend) ────────┐
│  - Right-hand panel UI                │
│  - Collects editor context            │
│  - Executes actions (write files,     │
│    run commands)                      │
└───────────────────▲───────────────────┘
                    │ JSON / RPC
┌───────────────────┴───────────────────┐
│          Agent Orchestrator           │
│  - Prompt assembly                    │
│  - Tool dispatch                      │
│  - Memory / State                     │
└───────────────────▲───────────────────┘
                    │
┌───────────────────┴───────────────────┐
│              LLM / Model              │
│  - GPT / Claude / local models        │
│  - Function / Tool calls              │
└───────────────────────────────────────┘

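III. How should the IDE side (the plugin) be designed?

1️⃣ IDE capabilities the plugin must control

Taking the VS Code API as an example:

Capability	API
Current file	`window.activeTextEditor`
Selected code	`editor.selection`
Project structure	`workspace.findFiles`
Write files	`WorkspaceEdit` (applied with `workspace.applyEdit`)
Run commands	`tasks.executeTask` / Node.js `child_process`
Git	API of the built-in `vscode.git` extension

👉 The IDE plugin is not just UI; it is the Agent's "body".

2️⃣ Right-hand panel UI

Typical structure:

Chat area
Execution log (Thinking / Action / Result)
List of operations that can be rolled back
Agent mode switch (Explain / Refactor / Fix)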

IV. Core Agent design (the most important part)

1️⃣ The prompt is not a single sentence; it is a "protocol"

SYSTEM:
You are a coding agent inside an IDE.
You can:
- read files
- modify files
- run commands
- ask for clarification

You must use tools when needed.

USER CONTEXT:
- Current file: src/foo.ts
- Selection: lines 10-30
- Project tree: ...

TASK:
Refactor this function to be pure and add tests.

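👉 In practice the context is sent as structured JSON, not free-form natural language; the listing above is only a readable outline of it.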
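
To make "structured context" concrete, here is a minimal sketch of a context object and a function that renders it into the USER CONTEXT section. The type name `EditorContextPayload`, its fields, and `renderContext` are illustrative assumptions, not a fixed protocol.

```typescript
// Hypothetical shape of the context sent with each task (field names are assumptions).
interface EditorContextPayload {
  currentFile: string;
  selection: { startLine: number; endLine: number } | null;
  projectTree: string[];
}

// Serialize the context as JSON so the model receives it as data, not prose.
function renderContext(ctx: EditorContextPayload): string {
  return ["USER CONTEXT:", JSON.stringify(ctx, null, 2)].join("\n");
}

const exampleContext: EditorContextPayload = {
  currentFile: "src/foo.ts",
  selection: { startLine: 10, endLine: 30 },
  projectTree: ["src/foo.ts", "src/util.ts"],
};

console.log(renderContext(exampleContext));
```

Because the payload is typed, adding a new field (for example Git diff data) becomes a compile-time change rather than a wording change in a prompt string.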

2️⃣ Tool / Action design (the Agent's "hands")

tools = [
  read_file(path),
  write_file(path, content),
  insert_code(path, range, content),
  run_command(cmd),
  search_code(query),
]

The LLM returns:

{
  "tool": "write_file",
  "arguments": {
    "path": "src/foo.ts",
    "content": "..."
  }
}
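
A tool call in this shape can be routed through a small registry. The sketch below is one possible way to do it, assuming string-valued named arguments and an in-memory map standing in for the file system; `toolRegistry`, `dispatch`, and `fakeFiles` are hypothetical names.

```typescript
// A tool call as returned by the model, matching the JSON shape above.
interface ToolCall {
  tool: string;
  arguments: Record<string, string>;
}

// Each tool receives named arguments and returns a string observation.
type ToolHandler = (args: Record<string, string>) => string;

// In-memory stand-in for the file system so the sketch runs anywhere (assumption).
const fakeFiles = new Map<string, string>([["src/foo.ts", "export const x = 1;"]]);

const toolRegistry = new Map<string, ToolHandler>([
  ["read_file", (args) => fakeFiles.get(args["path"] ?? "") ?? `ERROR: no such file ${args["path"]}`],
  ["write_file", (args) => {
    fakeFiles.set(args["path"] ?? "", args["content"] ?? "");
    return "ok";
  }],
]);

// Look up the tool by name; an unknown tool becomes an error observation, not a crash.
function dispatch(call: ToolCall): string {
  const handler = toolRegistry.get(call.tool);
  if (!handler) return `ERROR: unknown tool ${call.tool}`;
  return handler(call.arguments);
}

console.log(dispatch({ tool: "read_file", arguments: { path: "src/foo.ts" } }));
```

Returning errors as observations lets the model see what went wrong and pick a different action on the next turn.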

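3️⃣ The ReAct loop (the core)

Thought → Action → Observation → Thought → ...

For example:

Thought: need to understand the dependencies first
Action: search_code
Observation: found util.ts
Thought: the side effects need to be extracted
Action: write_file

👉 Most of the right-hand Agent's "intelligence" comes from this loop.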
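
The loop can be sketched in a few lines. This is a minimal illustration, assuming a scripted stand-in for the model so it runs offline; a real model call would be asynchronous. `reactLoop`, `ModelStep`, and `scriptedModel` are hypothetical names.

```typescript
// One model turn: either a tool call or a final answer.
type ModelStep =
  | { kind: "action"; thought: string; tool: string; input: string }
  | { kind: "final"; answer: string };

// A model is any function from the transcript so far to the next step.
// Real model calls are asynchronous; this sketch is synchronous so it runs offline.
type Model = (transcript: string[]) => ModelStep;
type Tool = (input: string) => string;

function reactLoop(task: string, model: Model, tools: Map<string, Tool>, maxSteps = 8): string {
  const transcript: string[] = [`Task: ${task}`];
  for (let step = 0; step < maxSteps; step++) {
    const next = model(transcript);
    if (next.kind === "final") return next.answer;
    transcript.push(`Thought: ${next.thought}`, `Action: ${next.tool}(${next.input})`);
    const tool = tools.get(next.tool);
    const observation = tool ? tool(next.input) : `unknown tool ${next.tool}`;
    transcript.push(`Observation: ${observation}`);
  }
  // The step limit keeps a confused model from looping forever.
  return "Stopped: step limit reached";
}

// Scripted stand-in for the LLM: search first, answer once an observation exists.
const scriptedModel: Model = (transcript) =>
  transcript.some((line) => line.startsWith("Observation:"))
    ? { kind: "final", answer: "Dependencies live in util.ts" }
    : { kind: "action", thought: "Need to understand dependencies first", tool: "search_code", input: "import" };

const demoTools = new Map<string, Tool>([["search_code", () => "found util.ts"]]);
console.log(reactLoop("Refactor foo", scriptedModel, demoTools));
```

The transcript is the only state: every observation is appended, and the next model call sees the whole history.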

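V. How to handle context and memory?

Short-term context (Session)

The last N operations
Snapshot of the current file
Error logs

Long-term memory (optional)

Project conventions
Tech-stack preferences
User style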

{
  "coding_style": "functional",
  "test_framework": "vitest"
}

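VI. The three genuinely hard problems (the moat of Cursor and its peers)

1️⃣ Context pruning

You cannot stuff the whole project into the LLM.
Instead, narrow down step by step:
- Tree → relevant files → key snippets

This is an information retrieval problem (RAG for code).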
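
The last narrowing step, choosing which snippets actually fit, can be sketched as a greedy selection under a token budget. This is only an illustration: the relevance scores are assumed to come from some retriever, the 4-characters-per-token estimate is a rough heuristic, and `pruneContext` is a hypothetical name.

```typescript
// A candidate snippet with a relevance score from any retriever (scores are assumed inputs).
interface Snippet {
  file: string;
  text: string;
  score: number;
}

// Rough token estimate: about 4 characters per token (a heuristic, not exact).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily keep the highest-scoring snippets that still fit in the budget.
function pruneContext(snippets: Snippet[], tokenBudget: number): Snippet[] {
  const kept: Snippet[] = [];
  let used = 0;
  for (const s of [...snippets].sort((a, b) => b.score - a.score)) {
    const cost = estimateTokens(s.text);
    if (used + cost > tokenBudget) continue; // skip what does not fit, try smaller ones
    kept.push(s);
    used += cost;
  }
  return kept;
}
```

With a budget of 20 tokens, a 10-token snippet and a 5-token snippet are kept while a higher-ranked 100-token snippet in between is skipped.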

2️⃣ Writing code safely

Diff mode
Preview and confirm
Undoable
Restricted write scope
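
Restricting the write scope can start with a path check before any write tool runs. The sketch below assumes POSIX-style paths and does not resolve symlinks; `isInsideRoot` and `toSegments` are hypothetical helper names.

```typescript
// Normalize a POSIX-style path into segments, resolving "." and "..".
function toSegments(p: string): string[] {
  const out: string[] = [];
  for (const seg of p.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return out;
}

// True if `target` (absolute, or relative to `root`) stays inside `root`.
// Comparing whole segments avoids treating "/work/app2" as inside "/work/app".
function isInsideRoot(root: string, target: string): boolean {
  const rootSegs = toSegments(root);
  const full = target.startsWith("/") ? target : `${root}/${target}`;
  const targetSegs = toSegments(full);
  return rootSegs.every((seg, i) => targetSegs[i] === seg);
}

console.log(isInsideRoot("/work/app", "src/a.ts"));
console.log(isInsideRoot("/work/app", "../secrets.txt"));
```

A write tool would call this check first and return an error observation when it fails, instead of touching the disk.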

3️⃣ Self-healing on failure

Test failure → fix automatically
Compile error → locate the cause
Guard against infinite loops
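
One way to combine automatic fixing with a loop guard is to cap the number of attempts and stop early when the same error comes back unchanged. This is a sketch under those assumptions; `check` and `fix` are injected stand-ins for running tests and asking the model for a patch, and `fixUntilGreen` is a hypothetical name.

```typescript
interface CheckResult {
  ok: boolean;
  error: string;
}

// `check` runs tests or the compiler; `fix` asks the model for a patch.
// Both are injected stand-ins so the sketch stays independent of any tooling.
function fixUntilGreen(
  check: () => CheckResult,
  fix: (error: string) => void,
  maxAttempts = 3
): "passed" | "gave_up" | "stuck" {
  let lastError: string | null = null;
  for (let attempt = 0; attempt <= maxAttempts; attempt++) {
    const result = check();
    if (result.ok) return "passed";
    // The same error twice in a row means the last fix changed nothing: stop early.
    if (result.error === lastError) return "stuck";
    if (attempt === maxAttempts) break;
    lastError = result.error;
    fix(result.error);
  }
  return "gave_up";
}
```

The three outcomes let the panel decide what to show: a success message, a request for human help, or the last error log.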

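VII. If you want to build one yourself (minimum viable plan)

Phase 1: Minimal Agent (1 week)

VS Code plugin
Right-hand chat
Read the current file + write the current file
Single-turn prompt

Phase 2: Tool Agent (2–3 weeks)

Tool calling
ReAct loop
Run command
Automatic fixes

Phase 3: Product grade (long term)

Multi-file edits
Git integration
Plan → Execute
Permissions / rollback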

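VIII. How this connects to your frontend & AI architecture background

You are well placed to enter through:

IDE Plugin + Agent Orchestrator
D2C / Code Agent
AI Copilot for private projects

These directions fit the "frontend architecture + AI Agent" route you mentioned earlier.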

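Below is a "VS Code plugin + Agent" project skeleton you can start from directly. The goal:

👉 Right-hand panel + code-context awareness + tool-calling Agent + extensible

I will cover: directory structure → core code → Agent protocol → minimal runnable flow
(This skeleton is meant to grow into a real product; it is not a Hello World.)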

I. Overall architecture (understand this first)

IDE (VS Code plugin)
   │
   │  JSON-RPC / HTTP
   ▼
Agent Orchestrator (Node / TS)
   │
   │  Tool Calling
   ▼
LLM (OpenAI / Claude / local)

Plugin = body
Agent Orchestrator = brain
LLM = reasoning engine

II. Project directory structure (recommended)

vscode-agent/
├── package.json
├── tsconfig.json
├── src/
│   ├── extension.ts          # Extension entry point
│   ├── panel/
│   │   └── AgentPanel.ts     # Right-hand Webview
│   ├── context/
│   │   └── editorContext.ts  # Editor context collection
│   ├── tools/
│   │   ├── readFile.ts
│   │   ├── writeFile.ts
│   │   ├── runCommand.ts
│   │   └── index.ts
│   ├── agent/
│   │   ├── agent.ts          # ReAct loop
│   │   ├── prompt.ts         # Prompt protocol
│   │   └── types.ts
│   └── llm/
│       └── openai.ts         # Model wrapper
└── media/
    └── panel.html            # Right-hand panel UI

This structure leaves room to grow toward a Cursor / Continue style tool.

III. Extension entry point (extension.ts)

import * as vscode from "vscode";
import { AgentPanel } from "./panel/AgentPanel";

export function activate(context: vscode.ExtensionContext) {
  context.subscriptions.push(
    vscode.commands.registerCommand("agent.open", () => {
      AgentPanel.createOrShow(context.extensionUri);
    })
  );
}

📌 The plugin does only three things:

Register commands
Open the right-hand panel
Expose IDE capabilities to the Agent

IV. The right-hand Agent panel (Webview)

AgentPanel.ts

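import * as vscode from "vscode";
import { getEditorContext } from "../context/editorContext";
import { runAgent } from "../agent/agent";

export class AgentPanel {
  static currentPanel: AgentPanel | undefined;
  private panel: vscode.WebviewPanel;

  static createOrShow(uri: vscode.Uri) {
    if (this.currentPanel) {
      this.currentPanel.panel.reveal();
      return;
    }

    const panel = vscode.window.createWebviewPanel(
      "agent",
      "AI Agent",
      vscode.ViewColumn.Beside,
      { enableScripts: true }
    );

    this.currentPanel = new AgentPanel(panel);
  }

  constructor(panel: vscode.WebviewPanel) {
    this.panel = panel;
    panel.webview.html = this.getHtml();

    // Clear the singleton when the user closes the panel, so it can be reopened.
    panel.onDidDispose(() => {
      AgentPanel.currentPanel = undefined;
    });

    // Handle "run" messages posted by the webview and send the result back.
    panel.webview.onDidReceiveMessage(async (msg) => {
      if (msg.type !== "run") return;
      const result = await runAgent(msg.text, getEditorContext());
      panel.webview.postMessage({ type: "result", text: String(result) });
    });
  }

  getHtml() {
    return `
      <html>
        <body>
          <textarea id="input"></textarea>
          <button onclick="send()">Run</button>
          <pre id="log"></pre>
          <script>
            const vscode = acquireVsCodeApi();
            function send() {
              vscode.postMessage({
                type: 'run',
                text: document.getElementById('input').value
              })
            }
            // Append results sent back by the extension to the log.
            window.addEventListener('message', (event) => {
              document.getElementById('log').textContent += event.data.text + '\\n';
            });
          </script>
        </body>
      </html>
    `;
  }
}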

👉 Keep the UI minimal; the Agent is what matters.

V. Collecting editor context (the Agent's "eyes")

import * as vscode from "vscode";

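export function getEditorContext() {
  // The active editor can be undefined while the webview has focus,
  // so fall back to the first visible text editor.
  const editor =
    vscode.window.activeTextEditor ?? vscode.window.visibleTextEditors[0];
  if (!editor) return null;

  return {
    filePath: editor.document.uri.fsPath,
    language: editor.document.languageId,
    selection: editor.document.getText(editor.selection),
    fullText: editor.document.getText(),
  };
}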

⚠️ A product-grade version must prune the context.
For now, send everything and get it running first.

VI. Tool design (the Agent's "hands")

readFile.ts

import * as fs from "fs";

export function readFile(path: string) {
  return fs.readFileSync(path, "utf-8");
}

writeFile.ts

import * as fs from "fs";

export function writeFile(path: string, content: string) {
  fs.writeFileSync(path, content);
  return "ok";
}

tools/index.ts

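import { readFile } from "./readFile";
import { writeFile } from "./writeFile";

// Keys are the tool names the model is told it can call.
export const tools = {
  read_file: readFile,
  write_file: writeFile,
};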

VII. Agent prompt protocol (this is the soul)

prompt.ts

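export function buildPrompt(ctx: any, task: string) {
  return `
SYSTEM:
You are an AI coding agent inside VS Code.

You can use tools:
- read_file
- write_file

RULES:
- Think step by step
- Use tools when needed
- Never guess file content

RESPONSE FORMAT (JSON only, no extra text):
- To call a tool: {"tool": "<tool name>", "args": {"path": "...", "content": "..."}}
- To finish: {"text": "<final answer>"}

EDITOR CONTEXT:
${JSON.stringify(ctx, null, 2)}

TASK:
${task}
`;
}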

📌 This is a protocol, not copywriting.

VIII. The Agent core (ReAct loop)

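import { buildPrompt } from "./prompt";
import { callLLM } from "../llm/openai";
import { tools } from "../tools";

const MAX_STEPS = 10; // guard against an endless tool-calling loop

export async function runAgent(task: string, context: any) {
  let prompt = buildPrompt(context, task);

  for (let step = 0; step < MAX_STEPS; step++) {
    const res = await callLLM(prompt);

    // No tool requested: the model has produced its final answer.
    if (!res.tool) return res.text;

    const tool = (tools as Record<string, (...args: string[]) => string>)[res.tool];
    let output: string;
    try {
      // Pass arguments by name so the call does not depend on JSON key order.
      output = tool ? tool(res.args?.path, res.args?.content) : `Unknown tool: ${res.tool}`;
    } catch (err) {
      output = `Tool error: ${String(err)}`;
    }
    // Record both the action and its result so the model sees the full history.
    prompt += `\nACTION:\n${JSON.stringify(res)}\nOBSERVATION:\n${output}`;
  }
  return "Stopped: step limit reached";
}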

👉 Most of the Agent's intelligence comes from this loop.

IX. LLM wrapper (minimal)

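import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_KEY });

export async function callLLM(prompt: string) {
  const res = await client.chat.completions.create({
    model: "gpt-4.1-mini",
    messages: [{ role: "user", content: prompt }],
  });

  const content = res.choices[0].message.content ?? "";
  try {
    // The agent loop expects JSON: either a tool call or a final answer.
    return JSON.parse(content);
  } catch {
    // Treat non-JSON output as a plain final answer instead of crashing.
    return { text: content };
  }
}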

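X. Minimal usable demo (what you can do right now)

✔ Type into the right-hand panel:

"Turn the current function into a pure function"

✔ The Agent will:

Read the current file
Understand the context
Modify the code
Write it back to the file

This is roughly the capability of a very early Cursor-style prototype.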

XI. What to upgrade next (key)

🔥 Three must-dos

Diff-based writes
- Not whole-file overwrites
Plan → Execute
- Output a plan first
Run command
- npm test / lint

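XII. One blunt remark (but it matters for you)

The hard part of an Agent project is not the model; it is engineering discipline.

With this skeleton:

Your frontend architecture skills are fully sufficient to handle it
It is a good base for moving on toward AI engineering / AI platform / D2C Agent work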

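Below, in a form you can actually build, I break down the two hard problems you asked about and give implementable algorithms + data structures + code-level skeletons.

Goal:
Make the Agent behave like a "senior engineer": able to change multiple files, aware of dependencies, and never overflowing the context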

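I. Align on the essence of the problem first (very important)

What you need to solve now is not "can the LLM understand code" but two engineering problems:

Multi-file edits + dependency analysis
- How does the Agent know which files to change
- In what order should they change
- Will changing A affect B
Context pruning / Code RAG
- The context window is limited
- How to send only the "relevant code" to the model
- Without relying on a person clicking files by hand

II. Multi-file edits + dependency analysis (the Agent's "global view")

1️⃣ Model the problem as a "dependency graph"

Core idea

Code = a directed graph, not text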

File A  ──imports──▶ File B
  │                    │
  └──calls──────────▶ Function C

The minimal graph structure you need to build

type FileNode = {
  path: string
  imports: string[]
  exports: string[]
  symbols: string[]
}

type DependencyGraph = Map<string, FileNode>

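2️⃣ How to build the dependency graph (a lightweight parser pass is enough; no type checker needed)

MVP approach (strongly recommended as a first step)

JS / TS projects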

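import { parse } from "@babel/parser";
import traverse from "@babel/traverse";

function analyzeFile(code: string) {
  const ast = parse(code, { sourceType: "module", plugins: ["typescript"] });

  const imports: string[] = [];
  const exports: string[] = [];
  const symbols: string[] = [];

  // Record a named export as both a symbol and an export.
  const record = (name: string) => {
    symbols.push(name);
    exports.push(name);
  };

  traverse(ast, {
    ImportDeclaration(path) {
      imports.push(path.node.source.value);
    },
    ExportNamedDeclaration(path) {
      const decl = path.node.declaration;
      if (!decl) return; // `export { a, b }` lists are not handled in this MVP
      if (decl.type === "VariableDeclaration") {
        // export const a = ..., b = ...
        decl.declarations.forEach(d => {
          if (d.id.type === "Identifier") record(d.id.name);
        });
      } else if (decl.type === "FunctionDeclaration" || decl.type === "ClassDeclaration") {
        // export function f() {} / export class C {}
        if (decl.id) record(decl.id.name);
      }
    }
  });

  return { imports, exports, symbols };
}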

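👉 Even this small step gives the Agent structural information (who imports what, who exports what) that plain-text prompting does not have.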

3️⃣ How the Agent "decides" which files to change

The strategy is not a full scan but Plan → Expand

Step 1: the LLM first outputs a "change plan"

{
  "plan": [
    {
      "file": "src/userService.ts",
      "reason": "contains business logic to refactor"
    },
    {
      "file": "src/userController.ts",
      "reason": "calls userService and must adapt API"
    }
  ]
}

⚠️ Note:
No code may be written during the planning stage

Step 2: Dependency expansion (automatic)

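// Assumes `imports` holds resolved file paths (the graph's keys),
// not raw specifiers such as "./util".
function expandDependencies(files: string[], graph: DependencyGraph) {
  const result = new Set(files);

  for (const file of files) {
    graph.get(file)?.imports.forEach(dep => result.add(dep));
  }

  return [...result];
}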

📌 This way the Agent fills in related files automatically instead of guessing.
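
The expansion above follows imports downward. To answer "will changing A affect B", the same graph can be walked in the opposite direction. The sketch below is one possible way, assuming `imports` already holds resolved paths that are keys of the graph; `findDependents` is a hypothetical name.

```typescript
type FileNode = { path: string; imports: string[]; exports: string[]; symbols: string[] };
type DependencyGraph = Map<string, FileNode>;

// Collect every file that directly or transitively imports `changed`.
// Assumes `imports` holds resolved paths that are keys of the graph.
function findDependents(changed: string, graph: DependencyGraph): string[] {
  // Invert the edges once: file -> files that import it.
  const importedBy = new Map<string, string[]>();
  graph.forEach((node) => {
    node.imports.forEach((dep) => {
      importedBy.set(dep, [...(importedBy.get(dep) ?? []), node.path]);
    });
  });

  // Breadth-first walk over the inverted edges.
  const seen = new Set<string>();
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const parent of importedBy.get(current) ?? []) {
      if (parent !== changed && !seen.has(parent)) {
        seen.add(parent);
        queue.push(parent);
      }
    }
  }
  return Array.from(seen);
}
```

If util is imported by service and service by controller, a change to util reports both service and controller as affected.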

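4️⃣ The correct order for multi-file edits

Ordering rules (engineering grade)

Lower layers → upper layers
- util → service → controller → view
Change the depended-upon files first
Interface changes take priority

function sortByDependency(files, graph) {
  // Topological sort
}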
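
The stub above can be filled in with Kahn's algorithm. This is a minimal sketch, not the only way to do it: it takes a plain map from file to resolved import paths (an assumption that simplifies the graph type) and throws when it detects an import cycle.

```typescript
// Order files so that every file comes after the files it imports
// (dependencies first). `imports` maps a file to already-resolved paths.
function sortByDependency(files: string[], imports: Map<string, string[]>): string[] {
  const inSet = new Set(files);
  const pending = new Map<string, number>(); // unmet dependency count per file
  const dependents = new Map<string, string[]>(); // dependency -> files that import it

  for (const file of files) {
    const deps = Array.from(new Set(imports.get(file) ?? []))
      .filter((d) => inSet.has(d) && d !== file);
    pending.set(file, deps.length);
    for (const dep of deps) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), file]);
    }
  }

  // Kahn's algorithm: repeatedly emit files whose dependencies are all emitted.
  const ready = files.filter((f) => pending.get(f) === 0);
  const ordered: string[] = [];
  while (ready.length > 0) {
    const file = ready.shift() as string;
    ordered.push(file);
    for (const parent of dependents.get(file) ?? []) {
      const left = (pending.get(parent) ?? 0) - 1;
      pending.set(parent, left);
      if (left === 0) ready.push(parent);
    }
  }

  if (ordered.length !== files.length) throw new Error("Import cycle detected");
  return ordered;
}
```

For controller → service → util, the result is util, service, controller, which matches the "lower layers first" rule.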

5️⃣ Safety strategy for multi-file writes (mandatory)

Never write directly

❌ write_file
✅ apply_diff

Diff tool design

apply_diff({
  file: "src/a.ts",
  hunks: [
    { start: 10, end: 20, content: "new code" }
  ]
})
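
Applying such hunks to a file's text could look like the sketch below. It assumes `start` and `end` are 1-based, inclusive line numbers in the original file (the call above does not specify this), and it rejects overlapping hunks. `applyHunks` is a hypothetical name.

```typescript
// A hunk replaces lines start..end (assumed 1-based, inclusive) with new content.
interface Hunk {
  start: number;
  end: number;
  content: string;
}

function applyHunks(source: string, hunks: Hunk[]): string {
  const lines = source.split("\n");
  // Apply from the bottom up so earlier hunks keep their original line numbers.
  const sorted = [...hunks].sort((a, b) => b.start - a.start);
  let upperBound = lines.length + 1; // first line number already claimed by a later hunk
  for (const h of sorted) {
    if (h.start < 1 || h.end < h.start || h.end >= upperBound) {
      throw new Error(`Invalid or overlapping hunk ${h.start}-${h.end}`);
    }
    lines.splice(h.start - 1, h.end - h.start + 1, ...h.content.split("\n"));
    upperBound = h.start;
  }
  return lines.join("\n");
}

console.log(applyHunks("a\nb\nc", [{ start: 2, end: 2, content: "B" }]));
```

Because the function is pure (text in, text out), the panel can show the result as a preview and only write it to disk after the user confirms.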

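III. Context pruning / Code RAG (the Agent's "attention system")

This is what makes or breaks an Agent.

1️⃣ Layered context model (you should definitely use this)

Layer 0: the task itself (required)
Layer 1: current file / files being modified directly
Layer 2: direct dependencies (import / call)
Layer 3: indirect dependencies (at most 1 more level)
Layer 4: global index (summaries)

Never put Layer 4 into the prompt directly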
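
Layers 1 to 3 can be derived mechanically from import distance. The sketch below assigns each file a layer by breadth-first depth from the files being modified and drops anything deeper. It considers imports only (not call edges), assumes resolved import paths, and `assignLayers` is a hypothetical name.

```typescript
// Assign each file a context layer by its import distance from the files being modified:
// layer 1 = files to modify, 2 = direct dependencies, 3 = one more hop.
// Files further away get no layer and stay out of the prompt.
function assignLayers(
  targets: string[],
  imports: Map<string, string[]>,
  maxLayer = 3
): Map<string, number> {
  const layerOf = new Map<string, number>();
  let frontier = targets;
  for (let layer = 1; layer <= maxLayer && frontier.length > 0; layer++) {
    const next: string[] = [];
    for (const file of frontier) {
      if (layerOf.has(file)) continue; // keep the smallest layer already assigned
      layerOf.set(file, layer);
      next.push(...(imports.get(file) ?? []));
    }
    frontier = next;
  }
  return layerOf;
}
```

The layer number can then drive how much of each file is sent: full text for layer 1, selected snippets for layer 2, summaries for layer 3.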

2️⃣ A minimal Code RAG implementation (runnable)

Step 1: Chunk the code

type CodeChunk = {
  id: string
  file: string
  content: string
  symbols: string[]
}

Chunk rules

200–400 lines per chunk
Do not split a function / class
Attach symbol names

Step 2: Embedding + vector index

embed(chunk.content)
→ vector
→ store (file, symbols)

You can use:

FAISS
LanceDB
SQLite + cosine (MVP)
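
For the MVP option, the "cosine" part is small enough to write by hand. The sketch below keeps vectors in memory and ranks them by cosine similarity; the vectors are assumed to come from whatever embedding model you choose, and `IndexedChunk` and `topK` are hypothetical names.

```typescript
// A chunk with its embedding; producing the vector is left to your embedding model.
interface IndexedChunk {
  id: string;
  file: string;
  vector: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((ai, i) => {
    const bi = b[i] ?? 0;
    dot += ai * bi;
    normA += ai * ai;
  });
  b.forEach((bi) => {
    normB += bi * bi;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every chunk once, then return the k most similar.
function topK(query: number[], index: IndexedChunk[], k: number): IndexedChunk[] {
  return index
    .map((chunk) => ({ chunk, score: cosine(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

A linear scan like this is fine for a few thousand chunks; a dedicated index such as FAISS or LanceDB becomes worthwhile as the project grows.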

Step 3: Retrieval strategy (matters more than the model)

function retrieveContext(task: string) {
  return [
    ...searchByEmbedding(task),
    ...searchBySymbol(task),
    ...searchByImportGraph(task),
  ]
}

Do not rely on embeddings alone!
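
Simply concatenating the three result lists would produce duplicates. One way to merge them, sketched below, is to sum the scores of hits that refer to the same chunk, so a chunk found by several retrievers ranks higher. The `Hit` shape, the score-summing rule, and the `retrievers` parameter are assumptions for illustration.

```typescript
interface Hit {
  chunkId: string;
  score: number;
}
type Retriever = (task: string) => Hit[];

// Merge hits from several retrievers: a chunk found by more than one
// retriever gets its scores summed, then the best `limit` chunks are kept.
function retrieveContext(task: string, retrievers: Retriever[], limit: number): Hit[] {
  const merged = new Map<string, number>();
  for (const retrieve of retrievers) {
    for (const hit of retrieve(task)) {
      merged.set(hit.chunkId, (merged.get(hit.chunkId) ?? 0) + hit.score);
    }
  }
  const hits: Hit[] = [];
  merged.forEach((score, chunkId) => hits.push({ chunkId, score }));
  return hits.sort((a, b) => b.score - a.score).slice(0, limit);
}
```

Each retriever stays a plain function, so embedding search, symbol search, and import-graph search can be developed and tested separately.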

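3️⃣ Symbols take priority over text (very important)

When the user says:

"Refactor createUser in userService"

You should:

First locate createUser with a regex / the AST
Then add:
- Its definition
- Its call sites
- Its export point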

findSymbol("createUser")
→ definition
→ call sites
→ export file
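
A regex-only version of this lookup might look like the sketch below. It is a heuristic: it recognizes `function` / `const` / `let` / `class` definitions and `name(` call sites line by line, and it will misclassify some cases (for example, class method definitions look like calls). An AST-based lookup would be more precise. `SymbolHit` and the in-memory `files` map are assumptions.

```typescript
interface SymbolHit {
  file: string;
  line: number; // 1-based
  kind: "definition" | "call";
  exported: boolean;
}

// Scan file contents line by line for definitions and call sites of `name`.
function findSymbol(name: string, files: Map<string, string>): SymbolHit[] {
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const definition = new RegExp(`\\b(function|const|let|class)\\s+${escaped}\\b`);
  const call = new RegExp(`\\b${escaped}\\s*\\(`);
  const hits: SymbolHit[] = [];
  files.forEach((text, file) => {
    text.split("\n").forEach((lineText, i) => {
      if (definition.test(lineText)) {
        const exported = /^\s*export\b/.test(lineText);
        hits.push({ file, line: i + 1, kind: "definition", exported });
      } else if (call.test(lineText)) {
        hits.push({ file, line: i + 1, kind: "call", exported: false });
      }
    });
  });
  return hits;
}
```

The hits give exact file and line positions, which can be turned into small snippets instead of sending whole files.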

4️⃣ Context compression (not pruning, but "summarizing")

For files that are not modified directly:

File: src/utils/date.ts
Exports:
- formatDate(date: Date): string
- parseDate(str: string): Date
Used by: userService.ts

📌 The model only needs the "interface semantics", not the implementation.
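
Generating that summary is a simple formatting step once the exports are known. The sketch below assumes the signatures have already been extracted (for example from the AST) and only renders them; `FileSummaryInput` and `summarizeFile` are hypothetical names.

```typescript
interface ExportedSignature {
  name: string;
  signature: string; // e.g. "(date: Date): string"
}

interface FileSummaryInput {
  path: string;
  exports: ExportedSignature[];
  usedBy: string[];
}

// Render a compact, interface-only summary of a file for the prompt.
function summarizeFile(input: FileSummaryInput): string {
  const lines = [`File: ${input.path}`, "Exports:"];
  for (const e of input.exports) {
    lines.push(`- ${e.name}${e.signature}`);
  }
  if (input.usedBy.length > 0) {
    lines.push(`Used by: ${input.usedBy.join(", ")}`);
  }
  return lines.join("\n");
}

console.log(
  summarizeFile({
    path: "src/utils/date.ts",
    exports: [
      { name: "formatDate", signature: "(date: Date): string" },
      { name: "parseDate", signature: "(str: string): Date" },
    ],
    usedBy: ["userService.ts"],
  })
);
```

The summary costs a handful of tokens per file, so many dependencies can be described where only a few full files would fit.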

5️⃣ The right shape for context in the prompt

## FILES TO MODIFY
- src/userService.ts
- src/userController.ts

## FULL CONTENT
### src/userService.ts
<full code>

## DEPENDENCY SUMMARY
### src/utils/date.ts
Exports: formatDate, parseDate

## TASK
Refactor createUser to be pure.

IV. Putting the two together: the full Agent flow

User Task
 ↓
Plan (LLM)
 ↓
Dependency Expand (Graph)
 ↓
Context Retrieve (RAG)
 ↓
Context Compress
 ↓
Execute (Diff)
 ↓
Validate (tsc / test)
 ↓
Auto-fix (optional)

At this point you have the same overall shape as the core loop of tools like Cursor / Windsurf.

V. One architecture-level truth

❌ Strong model + huge prompt = unstable
✅ Slightly weaker model + good context = usable at engineering grade

VI. What you should do now (the best path)

Week 1

import graph
multi-file plan
diff-based writes

Week 2

code chunk
symbol index
embedding retrieval

Week 3

context summary
test → auto fix