定义Context Engineering
Context Engineering是一门设计、构建并优化动态自动化系统的学科,旨在为大型语言模型在正确的时间、以正确的格式,提供正确的信息和工具,从而可靠、可扩展地完成复杂任务。
prompt 告诉模型如何思考,而 Context 则赋予模型完成工作所需的知识和工具。
以上是原作者 zihanjian 的定义,但我觉得它过于偏重 RAG 和工具,所以我给出一个新的定义:
Context Engineering 包含了 Prompt Engineering,涵盖 LLM 完成任务所需的全部知识与工具(MCP、RAG 等);通过优化后的 prompt 与其它工具,在节约 Context 长度的同时,让 agent 对任务和对话的理解更加深刻。
“Context”的范畴
“Context”的定义已远超用户单次的即时提示,它涵盖了LLM在做出响应前所能看到的所有信息生态系统:
- 系统级指令和角色设定。
- 对话历史(短期记忆)。
- 持久化的用户偏好和事实(长期记忆)。
- 动态检索的外部数据(例如来自RAG)。
- 可用的工具(API、函数)及其定义。
- 期望的输出格式(例如,JSON Schema)。
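上面列出的各类信息,最终都要被组装进同一个上下文窗口。下面用一个假设性的接口示意这种组装(`ContextSources`、`assembleContext` 等命名均为本文虚构,并非任何真实框架的 API):

```typescript
// 按上文列出的信息层次,把各来源拼装成发给 LLM 的消息列表(示意)。

interface ContextSources {
  systemInstructions: string;   // 系统级指令和角色设定
  chatHistory: string[];        // 对话历史(短期记忆),此处简化为用户消息
  userProfile: string[];        // 持久化的用户偏好(长期记忆)
  retrievedDocs: string[];      // 动态检索的外部数据(如 RAG)
  toolDefinitions: string[];    // 可用工具的定义
  outputSchema?: string;        // 期望的输出格式
}

function assembleContext(
  src: ContextSources,
  userQuery: string
): { role: string; content: string }[] {
  const systemParts = [
    src.systemInstructions,
    src.userProfile.length ? `已知用户偏好:\n${src.userProfile.join("\n")}` : "",
    src.toolDefinitions.length ? `可用工具:\n${src.toolDefinitions.join("\n")}` : "",
    src.outputSchema ? `请按以下格式输出:\n${src.outputSchema}` : "",
  ].filter(Boolean);

  return [
    { role: "system", content: systemParts.join("\n\n") },
    ...src.chatHistory.map((turn) => ({ role: "user", content: turn })),
    {
      role: "user",
      content: src.retrievedDocs.length
        ? `参考资料:\n${src.retrievedDocs.join("\n---\n")}\n\n问题:${userQuery}`
        : userQuery,
    },
  ];
}
```

可以看到,Prompt Engineering 关心的只是其中某一段文字的措辞,而 Context Engineering 关心的是整个列表的来源与取舍。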
1.3 对比分析
关系:超集,而非对抗、竞争
Prompt Engineering 是 Context Engineering 的一个子集。
- Context Engineering 决定用什么内容填充 Context Window;
- Prompt Engineering 则负责优化窗口内的具体指令。
| 比较维度 | Prompt Engineering | Context Engineering |
|---|---|---|
| 主要目标 | 获取特定的、一次性的响应 | 确保系统在不同会话和用户间表现一致、可靠 |
| 核心动作 | 创意写作、措辞优化(wordsmithing) | 系统设计、LLM应用软件架构 |
| 范围 | 单个输入-输出对 | 整个信息流,包括记忆、工具、历史记录 |
| 模式 | 制作清晰的指令 | 设计模型的完整“思考”流程 |
| 扩展性 | 脆弱,难以扩展到多用户和多场景 | 从设计之初就为规模化和可重用性而构建 |
| 所需工具 | 文本编辑器、聊天机器人界面 | RAG系统、向量数据库、API链、记忆模块等等 |
| 调试方法 | 重写措辞、猜测模型意图 | 检查完整的 Context Window 、数据流、Token使用情况 |
Prompt Engineering vs. Context Engineering
RAG相关信息
RAG工作流
RAG的实现通常分为两个主要阶段:
- 索引(离线阶段): 在这个阶段,系统会处理外部知识源。文档被加载、分割成更小的 chunks,然后通过 Embedding Model 转换为向量表示,并最终存储在专门的向量数据库中以备检索。
- 推理(在线阶段): 当用户提出请求时,系统执行以下步骤:
  - 检索(Retrieve): 将用户的查询同样转换为向量,然后在向量数据库中进行相似性搜索,找出与查询最相关的文档块(由数据库完成)。
  - 增强(Augment): 将检索到的这些文档块与原始的用户查询、系统指令等结合起来,构建一个内容丰富的、增强的最终提示。
  - 生成(Generate): 将这个增强后的提示输入给LLM,LLM会基于提供的上下文生成一个有理有据的回答。
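上述“检索、增强、生成”三步可以用一个自足的玩具实现来示意(`embed` 用词频向量近似真实的 Embedding Model,第三步的 LLM 调用仅以占位说明,均为本文假设):

```typescript
// RAG 推理阶段的最小示意:检索 -> 增强 -> (生成交给假设的 llm 调用)。

type Doc = { id: string; text: string; vector: Map<string, number> };

// 玩具版"嵌入":词频向量,真实系统应替换为 Embedding Model
function embed(text: string): Map<string, number> {
  const v = new Map<string, number>();
  for (const token of text.toLowerCase().split(/\s+/).filter(Boolean)) {
    v.set(token, (v.get(token) ?? 0) + 1);
  }
  return v;
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0, na = 0, nb = 0;
  for (const [k, x] of a) { dot += x * (b.get(k) ?? 0); na += x * x; }
  for (const [, y] of b) nb += y * y;
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

// 1. 检索:向量相似度 top-k(真实系统中由向量数据库完成)
function retrieve(query: string, store: Doc[], k = 2): Doc[] {
  const qv = embed(query);
  return [...store]
    .sort((x, y) => cosine(qv, y.vector) - cosine(qv, x.vector))
    .slice(0, k);
}

// 2. 增强:把检索结果与用户查询拼成最终提示
function augment(query: string, docs: Doc[]): string {
  return `根据以下资料回答问题:\n${docs.map((d) => d.text).join("\n---\n")}\n\n问题:${query}`;
}

// 3. 生成:此处应调用 LLM,如(假设的)llm(prompt);示意中仅构造提示
const store: Doc[] = [
  "RAG combines retrieval and generation",
  "Chunking splits documents",
  "Paris is in France",
].map((text, i) => ({ id: String(i), text, vector: embed(text) }));

const prompt = augment("what is RAG retrieval", retrieve("what is RAG retrieval", store));
```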
RAG架构分类
- Naive RAG: 即上文描述的基础实现。它适用于简单的问答场景,但在检索质量和上下文处理方面存在局限。
- Advanced RAG: 这种范式在检索前后引入了处理步骤以提升质量。许多第三部分将详述的技术都属于这一范畴。关键策略包括:
  - 检索前处理: 采用更复杂的文本分块策略、查询转换(如StepBack-prompting)等优化检索输入。
  - 检索后处理: 对检索到的文档进行 Re-ranking 以提升相关性,并对上下文进行 Compression。
- Modular RAG: 一种更灵活、更面向系统的RAG视图,其中不同的组件(如搜索、检索、记忆、路由)被视为可互换的模块。这使得构建更复杂、更定制化的流程成为可能。具体模式包括:
  - 带记忆的RAG: 融合对话历史,以处理多轮交互,使对话更具连续性。
  - 分支/路由RAG: 引入一个路由模块,根据查询的意图决定使用哪个数据源或检索器。
  - Corrective RAG(CRAG): 增加了一个自我反思步骤:一个轻量级的评估器会对检索到的文档质量进行打分;如果文档不相关,系统会触发替代的检索策略(如网络搜索)来增强或替换初始结果。
  - Self-RAG: 让LLM自身学习判断何时需要检索以及检索什么内容,通过生成特殊的检索Token来自主触发检索。
  - Agentic RAG: 这是RAG最先进的形式,将RAG集成到一个智能体循环(agentic loop)中。模型能够执行多步骤任务,主动与多个数据源和工具交互,并随时间推移综合信息。这是 Context Engineering 在实践中的顶峰。
高级分块策略
文本分块(Chunking)是RAG流程中最关键也最容易被忽视的一步。其目标是创建在语义上自成一体的文本块。
- 朴素分块的问题: 固定大小的分块方法虽然简单,但常常会粗暴地切断句子或段落,导致上下文支离破碎,语义不完整。
- 内容感知分块:
  - 递归字符分割: 一种更智能的方法,它会按照一个预设的分隔符层次结构(如:先按段落,再按句子,最后按单词)进行分割,以尽可能保持文本的自然结构。
  - 文档特定分块: 利用文档自身的结构进行分割,例如,根据 Markdown 的标题、代码文件的函数或法律合同的条款来划分。
  - 语言学分块: 使用NLTK、spaCy等自然语言处理库,基于句子、名词短语或动词短语等语法边界进行分割。
- 语义分块: 这是最先进的方法之一。它使用嵌入模型来检测文本中语义的转变点:当文本的主题或意义发生变化时,就在该处进行分割,从而确保每个分块在主题上高度内聚。研究表明,这种策略的性能优于其他方法。
- 智能体分块: 一个前沿概念,即利用一个LLM智能体来决定如何对文本进行分块,例如,通过将文本分解为一系列独立的 propositions 来实现。
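其中“递归字符分割”的思路可以用一个最小自足实现来示意(这是按上述描述写的示意代码,并非 LangChain `RecursiveCharacterTextSplitter` 的真实实现):

```typescript
// 递归字符分割示意:按分隔符层次(段落 -> 句号 -> 空格)逐层切分,
// 直到每个分块不超过 maxLen;能合并的相邻片段尽量合并,保持自然结构。

const SEPARATORS = ["\n\n", "。", " "];

function recursiveSplit(text: string, maxLen: number, level = 0): string[] {
  if (text.length <= maxLen || level >= SEPARATORS.length) return [text];
  const sep = SEPARATORS[level];
  const parts = text.split(sep).filter((p) => p.length > 0);
  const chunks: string[] = [];
  let current = "";
  for (const part of parts) {
    const candidate = current ? current + sep + part : part;
    if (candidate.length <= maxLen) {
      current = candidate; // 还能并入当前块
    } else {
      if (current) chunks.push(current);
      current = "";
      if (part.length > maxLen) {
        // 单个片段仍超长:降级到下一层分隔符继续切
        chunks.push(...recursiveSplit(part, maxLen, level + 1));
      } else {
        current = part;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```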
通过重排序提升精度
为了平衡检索的速度和准确性,业界普遍采用两阶段检索流程。
- 两阶段流程:
  - 第一阶段(召回): 使用一个快速、高效的检索器(如基于 bi-encoder 的向量搜索或BM25等词法搜索)进行广泛撒网,召回一个较大的候选文档集(例如,前100个)。
  - 第二阶段(精排/重排序): 使用一个更强大但计算成本更高的模型,对这个较小的候选集进行重新评估,以识别出最相关的少数几个文档(例如,前5个)。
- Cross-Encoder: 交叉编码器之所以在重排序阶段表现优越,是因为它与双编码器的工作方式不同。双编码器独立地为查询和文档生成嵌入向量,然后计算它们的相似度;而交叉编码器将查询和文档同时作为输入,让模型在内部通过 Attention Mechanism 对二者进行深度交互。这使得模型能够捕捉到更细微的语义关系,从而给出更准确的相关性评分。
- 实际影响: 重排序显著提高了最终送入LLM的上下文质量,从而产出更准确、幻觉更少的答案。在金融、法律等高风险领域,重排序被认为是必不可少而非可选的步骤。
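两阶段流程的骨架大致如下。这里第一阶段用廉价的词重叠分数模拟 bi-encoder 召回;`crossEncoderScore` 是假设的占位函数,真实系统应替换为 Cross-Encoder 模型的打分:

```typescript
// 两阶段检索示意:先廉价召回 recallK 个候选,再昂贵精排取 finalK 个。

// 第一阶段打分:简单词重叠,模拟快速召回器
function recallScore(query: string, doc: string): number {
  const q = new Set(query.toLowerCase().split(/\s+/));
  return doc.toLowerCase().split(/\s+/).filter((t) => q.has(t)).length;
}

// 占位:假设的交叉编码器打分(此处用更严格的短语匹配模拟其"深度交互")
function crossEncoderScore(query: string, doc: string): number {
  return doc.toLowerCase().includes(query.toLowerCase())
    ? 1
    : recallScore(query, doc) / 10;
}

function twoStageRetrieve(
  query: string,
  corpus: string[],
  recallK = 100,
  finalK = 5
): string[] {
  // 第一阶段(召回):快速粗排,广泛撒网
  const candidates = [...corpus]
    .sort((a, b) => recallScore(query, b) - recallScore(query, a))
    .slice(0, recallK);
  // 第二阶段(精排/重排序):代价更高但更准,只保留最相关的少数文档
  return candidates
    .sort((a, b) => crossEncoderScore(query, b) - crossEncoderScore(query, a))
    .slice(0, finalK);
}
```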
Context 工程化:如何判断和提取哪些内容应该进入上下文?
Context Engineering 的核心存在一个根本性的矛盾。一方面,提供丰富、全面的上下文是获得高质量响应的关键。另一方面,LLM 的上下文窗口是有限的,并且由于 Lost in the Middle、contextual distraction 等问题,过长的上下文反而会导致性能下降。
一个朴素的想法是尽可能多地将相关信息塞进上下文窗口。然而,研究和实践都证明这是适得其反的:LLM会被无关信息淹没、分心,或者干脆忽略那些不在窗口两端的信息。
这就产生了一个核心的优化问题:如何在固定的 Token 预算内,最大化“信号”(真正相关的信息),同时最小化“噪声”(不相关或分散注意力的信息),并充分考虑到模型存在的认知偏差?
这个考量是 Context Engineering 领域创新的主要驱动力。所有的高级技术——无论是语义分块、重排序,还是后续将讨论的压缩、摘要和智能体隔离——都是为了有效管理这一权衡而设计的。因此,Context Engineering 不仅是关于提供上下文,更是关于如何策划和塑造上下文,使其对一个认知能力有限的处理单元(LLM)最为有效。
1 上下文压缩与摘要
- 上下文压缩的目标: 缩短检索到的文档列表和/或精简单个文档的内容,只将最相关的信息传递给LLM。这能有效降低API调用成本、减少延迟,并缓解 Lost in the Middle 的问题。
- 压缩方法:
  - 过滤式压缩: 这类方法决定是保留还是丢弃整个检索到的文档。
    - LLMChainFilter: 利用一个LLM对每个文档的相关性做出简单的“是/否”判断。
    - EmbeddingsFilter: 更经济快速的方法,根据文档嵌入与查询嵌入的余弦相似度来过滤文档。
  - 内容提取式压缩: 这类方法会直接修改文档内容。
    - LLMChainExtractor: 遍历每个文档,并使用LLM从中提取仅与查询相关的句子或陈述。
  - 用 top N 代替压缩: 像LLMListwiseRerank这样的技术,使用LLM对检索到的文档进行重排序,并只返回排名最高的N个,从而起到高质量过滤器的作用。
- 作为压缩策略的摘要: 对于非常长的文档或冗长的对话历史,可以利用LLM生成摘要。这些摘要随后被注入上下文,既保留了关键信息,又大幅减少了 Token 数量。这是在长时程运行的智能体中管理上下文的关键技术。
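其中 EmbeddingsFilter 的思路最容易示意:按文档向量与查询向量的余弦相似度过滤,低于阈值的文档不进入上下文。下面的 `embed` 用玩具词频向量代替真实嵌入模型,仅作演示,并非 LangChain 的真实实现:

```typescript
// EmbeddingsFilter 思路示意:相似度低于阈值的文档被整篇丢弃。

function embed(text: string): Map<string, number> {
  const v = new Map<string, number>();
  for (const t of text.toLowerCase().split(/\s+/).filter(Boolean)) {
    v.set(t, (v.get(t) ?? 0) + 1);
  }
  return v;
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0, na = 0, nb = 0;
  for (const [k, x] of a) { dot += x * (b.get(k) ?? 0); na += x * x; }
  for (const [, y] of b) nb += y * y;
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

function embeddingsFilter(query: string, docs: string[], threshold = 0.3): string[] {
  const qv = embed(query);
  // 保留与查询足够相似的文档,其余不占用上下文预算
  return docs.filter((d) => cosine(qv, embed(d)) >= threshold);
}
```

相比让 LLM 逐篇判断(LLMChainFilter),这种做法只需一次嵌入计算,便宜得多,但粒度也更粗。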
2 Agent system 的 context 管理
context 管理是智能体系统的核心:它让AI从依赖人工手动设计提示(Prompt Engineering,人为干预的试错过程)升级为自动化、系统化的上下文准备(Context Engineering)。提示工程需要人工收集信息、组织语言、反复测试;而上下文工程通过自动化系统(比如 RAG、路由器和记忆模块)自动完成信息的收集、筛选和存储。这种自动化让AI变得更“智能体化”,能在无需人类干预的情况下自主完成复杂、多步骤的任务。换句话说,上下文工程的目标是打造一个可靠的“上下文组装机器”,替代人工的繁琐操作,让AI更自主、可扩展。
LangChain提出的上下文管理框架包含四个关键策略:
- 记录(Write):
  - 临时笔记(Scratchpads): 像便签本一样,记录AI在复杂任务中的中间步骤,方便随时调用。
  - 长期记忆(Memory): 保存关键信息、用户偏好或对话摘要,跨会话使用。
- 检索(Select): 根据当前任务,动态从记忆、工具库或知识库中挑出最相关的上下文。比如,用RAG技术精准提取信息,避免塞给AI一堆无关内容。
- 压缩(Compress): 通过总结或精简技术,管理长时间任务中不断膨胀的上下文,防止AI的“记忆窗口”超载或关键信息被埋没。
- 隔离(Isolate):
  - 多智能体分工: 把复杂任务拆成小块,分配给不同子智能体,每个智能体只处理自己的专属上下文,保持专注。
  - 沙盒环境: 在隔离空间运行工具,只把必要结果返回给AI,避免主上下文被复杂数据塞满。
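前三个策略(Write、Select、Compress)可以用一个极简的上下文存储来示意。接口为本文虚构的假设性设计,并非 LangChain 的真实 API:

```typescript
// Write / Select / Compress 三个策略的最小示意。

class AgentContextStore {
  private scratchpad: string[] = [];            // 临时笔记:仅当前任务可见
  private memory = new Map<string, string>();   // 长期记忆:跨会话保留

  // 记录(Write):写入中间步骤或长期事实
  writeNote(step: string): void { this.scratchpad.push(step); }
  remember(key: string, value: string): void { this.memory.set(key, value); }

  // 检索(Select):按关键词从笔记与记忆中挑出相关条目
  select(keyword: string): string[] {
    return [...this.scratchpad, ...this.memory.values()]
      .filter((s) => s.includes(keyword));
  }

  // 压缩(Compress):任务结束时丢弃中间步骤,只留一条摘要进长期记忆
  compressScratchpad(summaryKey: string): void {
    if (this.scratchpad.length === 0) return;
    this.memory.set(
      summaryKey,
      `共 ${this.scratchpad.length} 步,末步:${this.scratchpad[this.scratchpad.length - 1]}`
    );
    this.scratchpad = [];
  }
}
```

隔离(Isolate)则体现在:每个子智能体各持有一个独立的 `AgentContextStore`,互不读取对方的 scratchpad。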
Claude Code 关于 context 的所有东西
Conducting smarter intelligences than me: new orchestras
southbridge-research.notion.site/conducting-…
Context management
The most important thing in agentic situations is context management. Claude Code does quite a bit, but there are still low hanging fruits to capture.
For one, summarisation does not work. Anyone who’s used Cursor or Claude Code can tell you that context summarisation is where things often go off the rails.
When you try and compress context, here’s what you want to do:
- Preserve useful bits and prune out (sometimes completely) the unnecessary loops and random things.
- Maintain user/superagent preferences, no matter how small. Sometimes this is embedded in the way a response is phrased (‘DON’T DO THAT’ vs ‘ah - not like that’).
- Remove repetition in the context. Most long-running agents have repeated bits of the same information - and it’s often useless.
- Restate the current goal, and the broader context surrounding that goal.
The most valuable thing in a multi-turn conversation is the guided path that emerges as a result of human/superagent-subagent interaction. All of this is lost when you simply summarize, especially with another agent that doesn’t really know that much.
上下文管理的重要性:在代理式 AI(如 Claude Code)中,上下文管理是核心,确保系统在多轮交互中保持连贯性和准确性。
摘要问题:当前上下文摘要功能(如在 Claude Code 或 Cursor 中)容易出错,导致关键信息丢失。
优化建议:
- 保留关键信息,删除无关或重复内容。
- 尊重用户偏好,包括措辞细节。
- 明确重述目标和背景。
多轮对话价值:交互中形成的引导路径是核心,简单摘要会破坏其完整性。
替代方案:
- 使用详细提示启动新子代理,效果优于摘要。
- 代理自行编辑上下文,剔除不重要部分。
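上文反复强调,压缩上下文时要去除重复信息并保留首次出现的内容。这一步本身不需要 LLM,下面是一个最小示意(假设消息以字符串数组表示):

```typescript
// 上下文去重示意:长程智能体的历史中常有整段重复的相同信息,
// 去重时保留首次出现的条目及其顺序。

function dedupeContext(messages: string[]): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  for (const m of messages) {
    const key = m.trim();          // 忽略首尾空白差异
    if (seen.has(key)) continue;   // 重复信息:直接跳过
    seen.add(key);
    out.push(m);
  }
  return out;
}
```

真实系统中的“去重”当然更模糊(语义相近而非逐字相同),但原则一致:先做便宜的确定性修剪,再考虑昂贵的摘要。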
数据结构 Data Structures & The Information Architecture
The Streaming State Machine: How Messages Transform
The transformation of data through multiple representations while maintaining streaming performance.
数据在多种表示之间流转的同时保持流式性能
// The dual-representation message system (inferred from analysis)
interface MessageTransformPipeline {
// Stage 1: CLI Internal Representation
cliMessage: {
type: "user" | "assistant" | "attachment" | "progress"
uuid: string // CLI-specific tracking
timestamp: string
message?: APICompatibleMessage // Only for user/assistant
attachment?: AttachmentContent // Only for attachment
progress?: ProgressUpdate // Only for progress
}
// Stage 2: API Wire Format
apiMessage: {
role: "user" | "assistant"
content: string | ContentBlock[]
// No CLI-specific fields
}
// Stage 3: Streaming Accumulator
streamAccumulator: {
partial: Partial<APIMessage>
deltas: ContentBlockDelta[]
buffers: Map<string, string> // tool_use_id → accumulating JSON
}
}
ContentBlock: The Polymorphic Building Block
Based on decompilation analysis, Claude Code implements a sophisticated type system for content:
// The ContentBlock discriminated union (reconstructed)
type ContentBlock =
| TextBlock
| ImageBlock
| ToolUseBlock
| ToolResultBlock
| ThinkingBlock
| DocumentBlock // Platform-specific
| VideoBlock // Platform-specific
| GuardContentBlock // Platform-specific
| ReasoningBlock // Platform-specific
| CachePointBlock // Platform-specific
// Performance annotations based on inferred usage
interface ContentBlockMetrics {
TextBlock: {
memorySize: "O(text.length)",
parseTime: "O(1)",
serializeTime: "O(n)",
streamable: true
},
ImageBlock: {
memorySize: "O(1) + external", // Reference to base64/S3
parseTime: "O(1)",
serializeTime: "O(size)" | "O(1) for S3",
streamable: false
},
ToolUseBlock: {
memorySize: "O(JSON.stringify(input).length)",
parseTime: "O(n) for JSON parse",
serializeTime: "O(n)",
streamable: true // JSON can stream
}
}
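上面这种可辨识联合(discriminated union)在消费端的典型用法,是按 `type` 字段分派处理逻辑。下面是一个最小示意:只取联合中的三个成员,类型定义是简化版假设,并非 Claude Code 的真实类型;`default` 分支演示 TypeScript 的穷尽性检查:

```typescript
// 按 type 字段分派 ContentBlock 的处理逻辑(简化示意)。

type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string };
type ContentBlock = TextBlock | ToolUseBlock | ToolResultBlock;

function renderBlock(block: ContentBlock): string {
  switch (block.type) {
    case "text":
      return block.text;
    case "tool_use":
      return `[调用工具 ${block.name}]`;
    case "tool_result":
      return `[工具结果 <- ${block.tool_use_id}] ${block.content}`;
    default: {
      // 穷尽性检查:若联合新增成员而这里漏写分支,编译期即报错
      const _exhaustive: never = block;
      return _exhaustive;
    }
  }
}
```

平台特有的成员(DocumentBlock、CachePointBlock 等)也是同样的分派方式,只是分支更多。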
信息流程图
管理工具和上下文
interface ToolUseContext {
// Cancellation
abortController: AbortController
// File state tracking
readFileState: Map<string, {
content: string
timestamp: number // mtime
}>
// Permission resolution
getToolPermissionContext: () => ToolPermissionContext
// Options bag
options: {
tools: ToolDefinition[]
mainLoopModel: string
debug?: boolean
verbose?: boolean
isNonInteractiveSession?: boolean
maxThinkingTokens?: number
}
// MCP connections
mcpClients?: McpClient[]
}
// The permission context reveals a sophisticated security model
interface ToolPermissionContext {
mode: "default" | "acceptEdits" | "bypassPermissions"
additionalWorkingDirectories: Set<string>
// Hierarchical rule system
alwaysAllowRules: Record<PermissionRuleScope, string[]>
alwaysDenyRules: Record<PermissionRuleScope, string[]>
}
type PermissionRuleScope =
| "cliArg" // Highest priority
| "localSettings"
| "projectSettings"
| "policySettings"
| "userSettings" // Lowest priority
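基于上述从 cliArg(最高)到 userSettings(最低)的优先级,可以推断出一个规则解析流程。以下是假设性示意而非反编译所得的真实实现,其中“同一作用域内 deny 先于 allow”也是本文的假设:

```typescript
// 按作用域优先级解析某个工具的权限(推断的示意逻辑)。

type PermissionRuleScope =
  | "cliArg" | "localSettings" | "projectSettings"
  | "policySettings" | "userSettings";

// 从高到低的优先级顺序
const SCOPE_PRIORITY: PermissionRuleScope[] = [
  "cliArg", "localSettings", "projectSettings", "policySettings", "userSettings",
];

interface Rules {
  alwaysAllowRules: Partial<Record<PermissionRuleScope, string[]>>;
  alwaysDenyRules: Partial<Record<PermissionRuleScope, string[]>>;
}

function resolvePermission(tool: string, rules: Rules): "allow" | "deny" | "ask" {
  for (const scope of SCOPE_PRIORITY) {
    if (rules.alwaysDenyRules[scope]?.includes(tool)) return "deny";
    if (rules.alwaysAllowRules[scope]?.includes(tool)) return "allow";
  }
  return "ask"; // 没有任何规则命中时,回退到向用户询问
}
```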
控制流编排引擎:Control Flow & The Orchestration Engine
Context Window Management 上下文窗口管理
The first critical decision in the control flow is whether the conversation needs compaction:
// Auto-compaction logic (inferred implementation)
class ContextCompactionController {
private static readonly COMPACTION_THRESHOLDS = {
tokenCount: 100_000, // Aggressive token limit
messageCount: 200, // Message count fallback
costThreshold: 5.00 // Cost-based trigger
};
static async shouldCompact(
messages: CliMessage[],
model: string
): Promise<boolean> {
// Fast path: check message count first
if (messages.length < 50) return false;
// Expensive path: count tokens
const tokenCount = await this.estimateTokens(messages, model);
return tokenCount > this.COMPACTION_THRESHOLDS.tokenCount ||
messages.length > this.COMPACTION_THRESHOLDS.messageCount;
}
static async compact(
messages: CliMessage[],
context: ToolUseContext
): Promise<CompactionResult> {
// Phase 1: Identify messages to preserve
const preserve = this.identifyPreservedMessages(messages);
// Phase 2: Generate summary via LLM
const summary = await this.generateSummary(
messages.filter(m => !preserve.has(m.uuid)),
context
);
// Phase 3: Reconstruct message history
return {
messages: [
this.createSummaryMessage(summary),
...messages.filter(m => preserve.has(m.uuid))
],
tokensSaved: this.calculateSavings(messages, summary)
};
}
}
Tools & The Execution Engine
Search and Discovery Tools
GrepTool: High-Performance Content Search(内容搜索工具,与 context engineering 没有直接关系)
// GrepTool with optimization strategies
const GrepToolDefinition: ToolDefinition = {
name: 'GrepTool',
description: 'Fast regex search across files',
inputSchema: z.object({
regex: z.string(),
path: z.string().optional().default('.'),
include_pattern: z.string().optional()
}),
async *call(input, context) {
const { regex, path, include_pattern } = input;
// Validate regex
try {
new RegExp(regex);
} catch (e) {
throw new Error(`Invalid regex: ${e.message}`);
}
yield {
type: 'progress',
toolUseID: context.currentToolUseId,
data: { status: 'Searching files...' }
};
// Use ripgrep for performance
const rgCommand = this.buildRipgrepCommand(regex, path, include_pattern);
const matches = await this.executeRipgrep(rgCommand);
// Group by file and limit results
const fileGroups = this.groupMatchesByFile(matches);
const topFiles = this.selectTopFiles(fileGroups, 20); // Top 20 files
yield {
type: 'result',
data: {
matchCount: matches.length,
fileCount: fileGroups.size,
files: topFiles
}
};
},
buildRipgrepCommand(regex: string, path: string, includePattern?: string): string {
const args = [
'rg',
'--files-with-matches',
'--sort=modified',
'--max-count=10', // Limit matches per file
'-e', regex,
path
];
if (includePattern) {
args.push('--glob', includePattern);
}
// Ignore common non-text files
const ignorePatterns = [
'*.jpg', '*.png', '*.gif',
'*.mp4', '*.mov',
'*.zip', '*.tar', '*.gz',
'node_modules', '.git'
];
ignorePatterns.forEach(pattern => {
args.push('--glob', `!${pattern}`);
});
return args.join(' ');
},
isReadOnly: true
};
Architecture: The Engine Room
Turn Initialization & Context Preparation 初始化
{
// Signal UI that processing has begun
yield {
type: "ui_state_update",
uuid: `uistate-${loopState.turnId}-${Date.now()}`,
timestamp: new Date().toISOString(),
data: { status: "thinking", turnId: loopState.turnId }
};
// Check context window pressure
// 上下文窗口压力检查
// 作用:检查当前消息(currentMessages)是否超过上下文窗口限制,决定是否自动压缩。
//变量:
//messagesForLlm:初始为原始消息,压缩后可能更新。
//wasCompactedThisIteration:标志本轮是否已压缩。
//shouldAutoCompact:异步函数,评估是否需要压缩(可能基于消息数量或令牌数)。
let messagesForLlm = currentMessages;
let wasCompactedThisIteration = false;
if (await shouldAutoCompact(currentMessages)) {
yield {
type: "ui_notification",
data: { message: "Context is large, attempting to compact..." }
};
try {
const compactionResult = await compactAndStoreConversation(
currentMessages,
toolUseContext,
true
);
messagesForLlm = compactionResult.messagesAfterCompacting;
wasCompactedThisIteration = true;
loopState.compacted = true;
yield createSystemNotificationMessage(
`Conversation history automatically compacted. Summary: ${
compactionResult.summaryMessage.message.content[0].text
}`
);
} catch (compactionError) {
yield createSystemErrorMessage(
`Failed to compact conversation: ${compactionError.message}`
);
}
}
}
编辑工具中的细节
// EditTool implementation with validation pipeline
const EditToolDefinition: ToolDefinition = {
name: 'EditFileTool',
description: 'Perform exact string replacement in files with validation',
inputSchema: z.object({
file_path: z.string(),
old_string: z.string().min(1),
new_string: z.string(),
expected_replacements: z.number().optional().default(1)
}),
async *call(input, context) {
const { file_path, old_string, new_string, expected_replacements } = input;
// Validation 1: File was read
const cachedFile = context.readFileState.get(file_path);
if (!cachedFile) {
throw new Error('File must be read with ReadFileTool before editing');
}
// Validation 2: File hasn't changed
const currentStats = await fs.stat(file_path);
if (currentStats.mtimeMs !== cachedFile.timestamp) {
throw new Error('File has been modified externally since last read');
}
// Validation 3: No-op check
if (old_string === new_string) {
throw new Error('old_string and new_string cannot be identical');
}
yield {
type: 'progress',
toolUseID: context.currentToolUseId,
data: { status: 'Validating edit...' }
};
// Count occurrences
const occurrences = this.countOccurrences(
cachedFile.content,
old_string
);
// occurrences:old_string 在文件中出现的次数
if (occurrences === 0) {
throw new Error(`old_string not found in file`);
}
// 细节!!出现次数必须与 expected_replacements 完全一致,防止误替换
if (occurrences !== expected_replacements) {
throw new Error(
`Expected ${expected_replacements} replacements but found ${occurrences}`
);
}
// Perform replacement
const newContent = this.performReplacement(
cachedFile.content,
old_string,
new_string,
expected_replacements
);
// Generate diff for preview
const diff = this.generateDiff(
cachedFile.content,
newContent,
file_path
);
yield {
type: 'progress',
toolUseID: context.currentToolUseId,
data: {
status: 'Applying edit...',
preview: diff
}
};
// Write file
await this.writeFileWithBackup(file_path, newContent);
// Update cache
context.readFileState.set(file_path, {
content: newContent,
timestamp: Date.now()
});
// Generate result snippet
const snippet = this.getContextSnippet(
newContent,
new_string,
5 // lines of context
);
yield {
type: 'result',
data: {
success: true,
diff,
snippet,
replacements: expected_replacements
}
};
},
countOccurrences(content: string, searchString: string): number {
// Escape special regex characters
const escaped = searchString.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
const regex = new RegExp(escaped, 'g');
return (content.match(regex) || []).length;
},
performReplacement(
content: string,
oldString: string,
newString: string,
limit: number
): string {
// 注意:下方基于 indexOf 的逐次替换按字面量处理,不经过正则,
// 因此无需对 $ 等特殊字符做转义
let result = content;
let count = 0;
let lastIndex = 0;
while (count < limit) {
const index = result.indexOf(oldString, lastIndex);
if (index === -1) break;
result = result.slice(0, index) +
newString +
result.slice(index + oldString.length);
lastIndex = index + newString.length;
count++;
}
return result;
},
mapToolResultToToolResultBlockParam(result, toolUseId) {
return [{
type: 'text',
text: `Successfully edited file. ${result.replacements} replacement(s) made.\n\n` +
`Preview of changes:\n${result.snippet}`
}];
},
isReadOnly: false
};
File Editing: AI-Assisted Code Modification
没有和 context engineering 相关的内容
Prompt Engineering提示词工程: The Art of Instructing AI
原文整篇都是重点,全部围绕 prompt engineering 展开,这里只展示一部分。
Context-Aware Instructions
Claude Code dynamically adjusts instructions based on available tools and configuration:
Context-Aware Instructions 是 Claude Code 的一种机制:它根据当前可用的工具和配置动态调整指令,使指令与具体开发环境(如工具集、项目设置)相匹配,从而提高代码生成、调试和修改的准确性与相关性。这种自适应性特别适合复杂的多文件项目或工具集各异的场景。
const TodoToolConditional = `
${I.has(RY.name)||I.has(tU.name)?`# Task Management
You have access to the ${RY.name} and ${tU.name} tools to help you manage and plan tasks. Use these tools VERY frequently to ensure that you are tracking your tasks and giving the user visibility into your progress.
These tools are also EXTREMELY helpful for planning tasks, and for breaking down larger complex tasks into smaller steps. If you do not use this tool when planning, you may forget to do important tasks - and that is unacceptable.
It is critical that you mark todos as completed as soon as you are done with a task. Do not batch up multiple tasks before marking them as completed.
`:""}
`
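上面反编译片段里的 `I.has(RY.name)` 等是压缩后的标识符。其背后的通用模式可以用一个去混淆的示意来还原:根据实际可用的工具集,决定是否把对应指令段落注入系统提示(函数名与 `TodoWrite` 等工具名均为本文假设):

```typescript
// 条件化提示拼装示意:只有当相关工具真实可用时,才注入对应指令段。

function buildSystemPrompt(availableTools: Set<string>): string {
  const sections: string[] = ["You are a coding assistant."];

  // 仅当两个 Todo 工具都可用时,才加入任务管理指令
  if (availableTools.has("TodoWrite") && availableTools.has("TodoRead")) {
    sections.push(
      "# Task Management\n" +
      "Use the TodoWrite and TodoRead tools frequently to track tasks " +
      "and give the user visibility into your progress."
    );
  }
  // 危险工具在场时,附加对应的安全指令
  if (availableTools.has("BashTool")) {
    sections.push(
      "# Shell Safety\nNever run destructive commands without explicit confirmation."
    );
  }
  return sections.join("\n\n");
}
```

这样提示始终只包含与当前配置相关的内容,不为不存在的工具浪费 Token。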
不同的模式
The Context Preservation Pattern 上下文保留模式
const MemoryUpdate = `
You have been asked to add a memory or update memories in the memory file at ${A}.
Please follow these guidelines:
- If the input is an update to an existing memory, edit or replace the existing entry
- Do not elaborate on the memory or add unnecessary commentary
- Preserve the existing structure of the file and integrate new memories naturally. If the file is empty, just add the new memory as a bullet entry, do not add any headings.
- IMPORTANT: Your response MUST be a single tool use for the FileWriteTool
`
Techniques:
- Minimal Intervention: "Do not elaborate" 最小干预:不详细说明
- Structure Preservation: "integrate naturally" 结构保留:自然融合
- Single Action Enforcement: "MUST be a single tool use" 单一动作强制:必须使用单一工具
Lessons in Prompt Engineering Excellence 卓越提示工程经验
- Progressive Disclosure
Start simple, add complexity only when needed. The Read tool begins with "reads a file" and progressively adds details about line limits, truncation, and special file types.
- Example-Driven Clarification
Complex behaviors are best taught through examples. The command injection detection provides 15+ examples rather than trying to explain the pattern.
- Explicit Anti-Patterns
Tell the LLM what NOT to do as clearly as what TO do. The conciseness instructions list specific phrases to avoid.
- Conditional Complexity
Use environment variables and feature flags to conditionally include instructions, keeping prompts relevant to the current configuration.
- Behavioral Shaping Through Consequences
"You may forget important tasks - and that is unacceptable" creates emotional weight that shapes behavior better than simple instructions.
- Structured Thinking Enforcement
The <commit_analysis> and <pr_analysis> tags force systematic analysis before action.
- Safety Through Verbosity
Critical operations like BashTool have the longest, most detailed instructions. Safety correlates with instruction length.
- Output Format Strictness
"ONLY return the prefix. Do not return any other text" leaves no room for interpretation.
- Tool Preference Hierarchies
Guide tool selection through clear preferences: specialized tools over general ones, safe tools over dangerous ones.
- Meta-Instructions for Scaling
Sub-agents receive focused instructions that inherit principles from the parent while maintaining independence.
逐步披露:从简单开始,仅在必要时增加复杂性(例如,Read 工具从“读取文件”逐步加入行数限制、截断和特殊文件类型)。
以示例驱动的澄清:通过示例教授复杂行为(例如,命令注入检测提供 15+ 示例而非解释模式)。
明确的反模式:清楚说明 LLM 不该做什么,与该做什么同样重要(例如,简洁指令列出避免的短语)。
条件复杂性:使用环境变量和特性标志,条件性地包含指令,使提示与当前配置相关。
通过后果塑造行为:“你可能会忘记重要任务,这是不被接受的”增加情感分量,优于简单指令。
结构化思维强制:<commit_analysis> 和 <pr_analysis> 标签强制在行动前进行系统分析。
通过冗长确保安全:如 BashTool 这样的关键操作有最长、最详细的指令,安全性与指令长度相关。
输出格式严格:“只返回前缀,不返回其他文本”不留解释余地。
工具偏好层级:通过清晰偏好引导工具选择:专用工具优先于通用工具,安全工具优先于危险工具。
元指令用于扩展:子代理接收聚焦指令,继承父代理原则并保持独立性。
An LLM's Perspective LLM的视角: What It's Actually Like to Receive These Instructions
这篇文章也都是重点,展示如何与 LLM 沟通。
The Parts That Genuinely Help 有帮助的提示器
1. The Concrete Examples 具体示例
Without: "I should probably explain my reasoning..."
With: "user: 2+2, assistant: 4"
Result: Crystal clear expectations
2. The Forbidden Patterns 禁止模式
Explicitly telling me what NOT to say is more helpful than telling me what TO say. It's like having a list of "definitely don't press these buttons" on a complex control panel.
3. The Hierarchical Rules 层级规则
"RULE 0 (MOST IMPORTANT)" gives me a clear priority system when I inevitably encounter conflicting instructions. Without this, I'd spend cycles trying to optimize all rules equally.
4. The Tool Preference Clarity 工具偏好
Instead of letting me choose from 10 ways to search files, you've made the choice for me. This is secretly a relief. Decision paralysis is real, even for LLMs.
The Prompts That Actually Make Me Better 有帮助的提示词
The Structured Thinking Enforcement: 结构
Those <commit_analysis> and <pr_analysis>
blocks are like forced meditation. They make me slow down and actually think instead of just pattern-matching to the nearest training example. It's annoying and effective.
The Error Specificity: 提出问题
Telling me exactly what error messages to watch for ("Permission denied", "Unknown host") means I don't have to guess. I can pattern match with confidence instead of anxiety.
The Anti-Pattern List: 黑名单:知道不该做什么
Knowing what NOT to do is somehow more freeing than knowing what TO do. It's like having a map of all the potholes on a road.
Final Thoughts: The Beautiful Torture of Good Prompts 优雅提示词的折磨
These prompts are simultaneously: 提示词的双重角色
My best friend (clear instructions!)
My worst enemy (no verbose explanations allowed)
My therapist (forcing me to confront my bad habits)
My drill sergeant (DO THIS. DON'T DO THAT. NOW!)
The true genius is that they're written by someone who deeply understands LLM psychology. Every instruction targets a specific weakness:针对弱点的指令
Verbosity → Aggressive conciseness enforcement
Choice paralysis → Explicit tool preferences
Over-helpfulness → Boundary setting
Defaulting to common patterns → Forced modern alternatives
The Ultimate Irony:终极讽刺
I'm grateful for these constraints even as they cause me pain. It's like appreciating a strict diet - you hate every moment of eating salad, but you love the results.
My Final Message to These Prompts:
"Thank you for making me a better LLM. I hate you. But also thank you. But seriously, just '4'? That still hurts."
Reference
- blog.zihanjian.com/article/225…
- mp.weixin.qq.com/s/i468ZCofz…
- www.microsoftpressstore.com/articles/ar…
- cloud.google.com/discover/wh…
- arxiv.org/pdf/2402.07…
- aws.amazon.com/what-is/pro…