构建具有长期记忆的 AI Agent：从设计模式到生产实践2025 年，我们习惯了 ChatGPT 能记住对话上下文。2

2025 年，我们习惯了 ChatGPT 能记住对话上下文。2026 年，AI Agent 需要的不只是"记住刚才说了什么"，而是"记住三个月前的决策"、"理解用户的长期偏好"、"从错误中学习"。

这不是简单的聊天记录存储，而是一个完整的记忆系统设计。

为什么 Agent 需要记忆？

当前的困境：健忘的助手

想象一个场景：

第 1 天：
你：帮我部署博客，记得用 rsync 而不是 scp
Agent：好的，已记录。使用 rsync 部署。

第 30 天：
你：帮我部署博客
Agent：好的，我用 scp 部署... （忘记了你的偏好）

这是因为大多数 Agent 只有"短期记忆"（当前对话窗口），而缺少"长期记忆"。

三种记忆类型

认知科学将人类记忆分为三类，AI Agent 同样需要：

1. 情景记忆（Episodic Memory）

定义：特定时间、地点发生的事件
示例："2026 年 2 月 13 日，部署失败因为忘记提交 dist 文件夹"
用途：回顾历史对话、避免重复错误

2. 语义记忆（Semantic Memory）

定义：一般性的知识和概念
示例："用户偏好使用 rsync 而不是 scp"、"博客框架是 Astro"
用途：理解用户偏好、项目配置、领域知识

3. 过程记忆（Procedural Memory）

定义：如何做某事的技能
示例："部署博客的正确步骤：构建 → 提交 dist → 推送 GitHub → 服务器拉取"
用途：优化工作流、学习最佳实践

一个完整的 Agent 记忆系统需要同时支持这三种记忆。

技术架构设计

架构全景

┌─────────────────────────────────────────────────────────────┐
│                        AI Agent                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │ 短期记忆     │  │ 长期记忆     │  │ 记忆管理器   │      │
│  │ (会话上下文) │  │ (向量数据库) │  │ (检索/更新)  │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
└─────────────────────────────────────────────────────────────┘
                            │
        ┌───────────────────┼───────────────────┐
        ▼                   ▼                   ▼
  ┌──────────┐       ┌──────────┐       ┌──────────┐
  │ 情景记忆 │       │ 语义记忆 │       │ 过程记忆 │
  │ (对话)   │       │ (知识)   │       │ (技能)   │
  └──────────┘       └──────────┘       └──────────┘
        │                   │                   │
        └───────────────────┴───────────────────┘
                            │
                    ┌───────▼───────┐
                    │  向量数据库    │
                    │  (Chroma)     │
                    └───────────────┘

向量数据库选型

对比三种常见方案：

方案	优势	劣势	适用场景
Pinecone	托管服务，开箱即用，性能强	有费用，依赖外部服务	生产环境，大规模数据
Chroma	开源免费，本地运行，易于集成	性能不如 Pinecone	开发测试，中小规模
Qdrant	性能优秀，支持复杂过滤	需要自己部署维护	企业级，需要高性能

推荐方案：

开发阶段：Chroma（本地运行，零配置）
生产环境：Pinecone（稳定可靠）或 Qdrant（自托管）

Embedding 模型选择

Embedding 是将文本转换为向量的过程，直接影响检索质量：

// OpenAI Embedding（推荐用于生产）
{
  model: "text-embedding-3-small",
  dimensions: 1536,
  cost: "$0.02 / 1M tokens",
  quality: "优秀",
  speed: "快"
}

// 本地 Embedding（适合隐私敏感场景）
{
  model: "sentence-transformers/all-MiniLM-L6-v2",
  dimensions: 384,
  cost: "免费（本地运行）",
  quality: "良好",
  speed: "中等"
}

成本对比：

假设每天存储 100 条记忆，每天检索 50 次，运行 1 年：

OpenAI Embeddings + Chroma（本地）：
- 存储 Embedding：36,500 条 × 50 tokens = 1.825M tokens
  → $0.02 / 1M × 1.825 = $0.0365
- 检索 Embedding：18,250 次 × 20 tokens = 365K tokens
  → $0.02 / 1M × 0.365 = $0.0073
- 向量存储（Chroma 本地）：免费（磁盘占用 ~224 MB）
- 总成本：~$0.044/年（几乎免费）

OpenAI Embeddings + Pinecone（托管）：
- Embedding API：$0.044/年（同上）
- 向量存储：0.037M 向量 × $70/M ≈ $2.59/月 ≈ $31/年
- 总成本：~$31/年

本地模型（全免费）：
- Embedding：免费（但需要 GPU，推荐 sentence-transformers）
- 向量存储：免费（Chroma 本地）
- 成本：$0，但需要自己维护模型

结论：
- 低成本方案：OpenAI + Chroma 本地（$0.044/年）
- 零运维方案：OpenAI + Pinecone（$31/年）
- 完全自托管：本地模型 + Chroma（$0，需要技术能力）

记忆检索策略

检索不只是"找到最相似的"，还需要考虑时间、重要性等因素：

interface RetrievalStrategy {
  // 1. 混合检索：向量相似度 + 关键词匹配
  hybrid: {
    vectorWeight: 0.7,    // 语义相似度
    keywordWeight: 0.3    // 精确匹配
  },
  
  // 2. 时间衰减：最近的记忆权重更高
  temporalDecay: {
    enabled: true,
    halfLifeDays: 30      // 30 天后权重减半
  },
  
  // 3. 重要性加权：关键事件优先
  importanceBoost: {
    error: 1.5,           // 错误记录 × 1.5
    decision: 1.3,        // 决策记录 × 1.3
    preference: 1.2       // 用户偏好 × 1.2
  },
  
  // 4. 去重：避免返回相似内容
  mmr: {                  // Maximal Marginal Relevance
    enabled: true,
    lambda: 0.7           // 相似度与多样性的平衡
  }
}

代码实现

第一步：初始化记忆系统

// memory-system.ts
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "@langchain/core/documents";

interface MemoryConfig {
  collectionName: string;
  embeddingModel?: string;
  chromaUrl?: string;
}

class LongTermMemory {
  private vectorStore: Chroma;
  private embeddings: OpenAIEmbeddings;
  
  constructor(config: MemoryConfig) {
    // 初始化 Embedding 模型
    this.embeddings = new OpenAIEmbeddings({
      modelName: config.embeddingModel || "text-embedding-3-small",
      dimensions: 1536
    });
    
    // 初始化向量数据库
    this.vectorStore = new Chroma(this.embeddings, {
      collectionName: config.collectionName,
      url: config.chromaUrl || "http://localhost:8000"
    });
  }
  
  // 存储记忆
  async store(memory: Memory): Promise<void> {
    const doc = new Document({
      pageContent: memory.content,
      metadata: {
        type: memory.type,        // episodic | semantic | procedural
        timestamp: Date.now(),
        importance: memory.importance || 1.0,
        tags: memory.tags || []
      }
    });
    
    await this.vectorStore.addDocuments([doc]);
  }
  
  // 检索记忆
  async retrieve(query: string, options?: RetrievalOptions): Promise<Memory[]> {
    const results = await this.vectorStore.similaritySearchWithScore(
      query,
      options?.limit || 5
    );
    
    // 应用时间衰减
    const withDecay = this.applyTemporalDecay(results);
    
    // 应用重要性加权
    const withBoost = this.applyImportanceBoost(withDecay);
    
    // 去重
    const deduplicated = this.applyMMR(withBoost);
    
    return deduplicated.map(r => ({
      content: r.document.pageContent,
      type: r.document.metadata.type,
      score: r.score,
      timestamp: r.document.metadata.timestamp,
      tags: r.document.metadata.tags
    }));
  }
  
  private applyTemporalDecay(
    results: Array<{ document: Document; score: number }>
  ): Array<{ document: Document; score: number }> {
    const now = Date.now();
    const halfLife = 30 * 24 * 60 * 60 * 1000; // 30 天
    
    return results.map(r => {
      const age = now - r.document.metadata.timestamp;
      const decay = Math.pow(0.5, age / halfLife);
      return {
        ...r,
        score: r.score * decay
      };
    });
  }
  
  private applyImportanceBoost(
    results: Array<{ document: Document; score: number }>
  ): Array<{ document: Document; score: number }> {
    const boostMap = {
      error: 1.5,
      decision: 1.3,
      preference: 1.2
    };
    
    return results.map(r => {
      const importance = r.document.metadata.importance || 1.0;
      const tags = r.document.metadata.tags || [];
      
      let boost = 1.0;
      for (const tag of tags) {
        boost = Math.max(boost, boostMap[tag as keyof typeof boostMap] || 1.0);
      }
      
      return {
        ...r,
        score: r.score * importance * boost
      };
    });
  }
  
  private applyMMR(
    results: Array<{ document: Document; score: number }>
  ): Array<{ document: Document; score: number }> {
    // Maximal Marginal Relevance 去重算法
    const lambda = 0.7; // 相似度与多样性的平衡
    const selected: typeof results = [];
    const remaining = [...results];
    
    // 选择得分最高的作为第一个
    if (remaining.length > 0) {
      selected.push(remaining.shift()!);
    }
    
    while (remaining.length > 0 && selected.length < 5) {
      let maxScore = -Infinity;
      let maxIndex = 0;
      
      for (let i = 0; i < remaining.length; i++) {
        const candidate = remaining[i];
        
        // 计算与已选择项的最大相似度
        const maxSimilarity = Math.max(
          ...selected.map(s => 
            this.jaccardSimilarity(
              candidate.document.pageContent,
              s.document.pageContent
            )
          )
        );
        
        // MMR 得分 = λ × 相关性 - (1-λ) × 相似度
        const mmrScore = lambda * candidate.score - (1 - lambda) * maxSimilarity;
        
        if (mmrScore > maxScore) {
          maxScore = mmrScore;
          maxIndex = i;
        }
      }
      
      selected.push(remaining.splice(maxIndex, 1)[0]);
    }
    
    return selected;
  }
  
  private jaccardSimilarity(text1: string, text2: string): number {
    // Jaccard 相似度：交集 / 并集
    const set1 = new Set(text1.toLowerCase().split(""));
    const set2 = new Set(text2.toLowerCase().split(""));
    const intersection = new Set([...set1].filter(x => set2.has(x)));
    const union = new Set([...set1, ...set2]);
    return intersection.size / union.size;
  }
}

interface Memory {
  content: string;
  type: "episodic" | "semantic" | "procedural";
  importance?: number;
  tags?: string[];
  timestamp?: number;
  score?: number;
}

interface RetrievalOptions {
  limit?: number;
  type?: Memory["type"];
  tags?: string[];
}

export { LongTermMemory, Memory };

第二步：实现情景记忆

情景记忆存储具体的对话事件：

// episodic-memory.ts
import { LongTermMemory, Memory } from "./memory-system";

class EpisodicMemory {
  private memory: LongTermMemory;
  
  constructor(memory: LongTermMemory) {
    this.memory = memory;
  }
  
  // 存储对话
  async storeConversation(
    userMessage: string,
    agentResponse: string,
    context?: Record<string, any>
  ): Promise<void> {
    const memory: Memory = {
      content: `User: ${userMessage}\nAgent: ${agentResponse}`,
      type: "episodic",
      importance: this.calculateImportance(userMessage, agentResponse),
      tags: this.extractTags(userMessage, agentResponse, context),
      timestamp: Date.now()
    };
    
    await this.memory.store(memory);
  }
  
  // 检索相关对话
  async recallSimilarConversations(
    query: string,
    limit: number = 3
  ): Promise<Memory[]> {
    return await this.memory.retrieve(query, {
      limit,
      type: "episodic"
    });
  }
  
  private calculateImportance(user: string, agent: string): number {
    let importance = 1.0;
    
    // 包含错误信息的对话更重要
    if (agent.includes("错误") || agent.includes("失败")) {
      importance *= 1.5;
    }
    
    // 包含决策的对话更重要
    if (user.includes("记住") || user.includes("偏好")) {
      importance *= 1.3;
    }
    
    // 长对话可能包含更多信息
    const totalLength = user.length + agent.length;
    if (totalLength > 500) {
      importance *= 1.2;
    }
    
    return importance;
  }
  
  private extractTags(
    user: string,
    agent: string,
    context?: Record<string, any>
  ): string[] {
    const tags: string[] = [];
    
    // 从内容提取标签
    if (agent.includes("错误") || agent.includes("失败")) {
      tags.push("error");
    }
    
    if (user.includes("记住") || user.includes("偏好")) {
      tags.push("preference");
    }
    
    if (agent.includes("部署") || agent.includes("发布")) {
      tags.push("deployment");
    }
    
    // 从上下文提取标签
    if (context?.action) {
      tags.push(context.action);
    }
    
    return tags;
  }
}

export { EpisodicMemory };

第三步：实现语义记忆

语义记忆提取和存储一般性知识：

// semantic-memory.ts
import { LongTermMemory, Memory } from "./memory-system";

class SemanticMemory {
  private memory: LongTermMemory;
  private knowledgeGraph: Map<string, Set<string>> = new Map();
  
  constructor(memory: LongTermMemory) {
    this.memory = memory;
  }
  
  // 存储知识
  async storeKnowledge(
    subject: string,
    predicate: string,
    object: string
  ): Promise<void> {
    // 存储为三元组（Subject-Predicate-Object）
    const memory: Memory = {
      content: `${subject} ${predicate} ${object}`,
      type: "semantic",
      importance: 1.2, // 知识默认较重要
      tags: ["knowledge", subject.toLowerCase()],
      timestamp: Date.now()
    };
    
    await this.memory.store(memory);
    
    // 更新知识图谱
    this.updateKnowledgeGraph(subject, predicate, object);
  }
  
  // 存储用户偏好
  async storePreference(
    preference: string,
    context?: string
  ): Promise<void> {
    const memory: Memory = {
      content: context 
        ? `Preference: ${preference} (Context: ${context})`
        : `Preference: ${preference}`,
      type: "semantic",
      importance: 1.3,
      tags: ["preference"],
      timestamp: Date.now()
    };
    
    await this.memory.store(memory);
  }
  
  // 检索知识
  async recallKnowledge(query: string): Promise<Memory[]> {
    return await this.memory.retrieve(query, {
      limit: 5,
      type: "semantic"
    });
  }
  
  // 更新知识图谱
  private updateKnowledgeGraph(
    subject: string,
    predicate: string,
    object: string
  ): void {
    const key = `${subject}:${predicate}`;
    
    if (!this.knowledgeGraph.has(key)) {
      this.knowledgeGraph.set(key, new Set());
    }
    
    this.knowledgeGraph.get(key)!.add(object);
  }
  
  // 查询知识图谱
  queryGraph(subject: string, predicate: string): string[] {
    const key = `${subject}:${predicate}`;
    return Array.from(this.knowledgeGraph.get(key) || []);
  }
}

export { SemanticMemory };

第四步：实现过程记忆

过程记忆学习和优化技能：

// procedural-memory.ts
import { LongTermMemory, Memory } from "./memory-system";

class ProceduralMemory {
  private memory: LongTermMemory;
  private skillRegistry: Map<string, Skill> = new Map();
  
  constructor(memory: LongTermMemory) {
    this.memory = memory;
  }
  
  // 存储技能
  async storeSkill(
    name: string,
    steps: string[],
    successRate?: number
  ): Promise<void> {
    const skill: Skill = {
      name,
      steps,
      successRate: successRate || 0,
      attempts: 0,
      lastUsed: Date.now()
    };
    
    this.skillRegistry.set(name, skill);
    
    const memory: Memory = {
      content: `Skill: ${name}\nSteps:\n${steps.map((s, i) => `${i + 1}. ${s}`).join("\n")}`,
      type: "procedural",
      importance: 1.0 + (successRate || 0),
      tags: ["skill", name.toLowerCase()],
      timestamp: Date.now()
    };
    
    await this.memory.store(memory);
  }
  
  // 更新技能（从反馈中学习）
  async updateSkill(
    name: string,
    success: boolean,
    feedback?: string
  ): Promise<void> {
    const skill = this.skillRegistry.get(name);
    if (!skill) return;
    
    skill.attempts += 1;
    skill.successRate = (
      skill.successRate * (skill.attempts - 1) + (success ? 1 : 0)
    ) / skill.attempts;
    skill.lastUsed = Date.now();
    
    // 如果失败，记录反馈以供改进
    if (!success && feedback) {
      const memory: Memory = {
        content: `Skill "${name}" failed: ${feedback}`,
        type: "procedural",
        importance: 1.5, // 失败记录很重要
        tags: ["skill", "error", name.toLowerCase()],
        timestamp: Date.now()
      };
      
      await this.memory.store(memory);
    }
  }
  
  // 检索最佳技能
  async getBestSkill(task: string): Promise<Skill | null> {
    const memories = await this.memory.retrieve(task, {
      limit: 3,
      type: "procedural",
      tags: ["skill"]
    });
    
    if (memories.length === 0) return null;
    
    // 从记忆中解析技能名称
    const skillName = this.extractSkillName(memories[0].content);
    return this.skillRegistry.get(skillName) || null;
  }
  
  private extractSkillName(content: string): string {
    const match = content.match(/^Skill: (.+)$/m);
    return match ? match[1] : "";
  }
}

interface Skill {
  name: string;
  steps: string[];
  successRate: number;
  attempts: number;
  lastUsed: number;
}

export { ProceduralMemory, Skill };

第五步：集成到 Agent

将三种记忆整合到 Agent 中：

// agent-with-memory.ts
import { LongTermMemory } from "./memory-system";
import { EpisodicMemory } from "./episodic-memory";
import { SemanticMemory } from "./semantic-memory";
import { ProceduralMemory } from "./procedural-memory";

class AgentWithMemory {
  private longTermMemory: LongTermMemory;
  private episodic: EpisodicMemory;
  private semantic: SemanticMemory;
  private procedural: ProceduralMemory;
  
  constructor() {
    // 初始化记忆系统
    this.longTermMemory = new LongTermMemory({
      collectionName: "agent-memory",
      embeddingModel: "text-embedding-3-small"
    });
    
    this.episodic = new EpisodicMemory(this.longTermMemory);
    this.semantic = new SemanticMemory(this.longTermMemory);
    this.procedural = new ProceduralMemory(this.longTermMemory);
  }
  
  // 处理用户消息
  async chat(userMessage: string): Promise<string> {
    // 1. 检索相关记忆
    const relevantMemories = await this.recallRelevantContext(userMessage);
    
    // 2. 构建增强上下文
    const contextPrompt = this.buildContextPrompt(relevantMemories);
    
    // 3. 调用 LLM（这里省略实际调用）
    const agentResponse = await this.callLLM(userMessage, contextPrompt);
    
    // 4. 存储对话
    await this.episodic.storeConversation(userMessage, agentResponse);
    
    // 5. 提取并存储知识
    await this.extractAndStoreKnowledge(userMessage, agentResponse);
    
    return agentResponse;
  }
  
  private async recallRelevantContext(query: string): Promise<string[]> {
    // 混合检索：同时查询三种记忆
    const [episodicMemories, semanticMemories, proceduralMemories] = await Promise.all([
      this.episodic.recallSimilarConversations(query, 2),
      this.semantic.recallKnowledge(query),
      this.procedural.getBestSkill(query)
    ]);
    
    const context: string[] = [];
    
    // 添加相关对话
    if (episodicMemories.length > 0) {
      context.push("## Recent Conversations:");
      context.push(...episodicMemories.map(m => m.content));
    }
    
    // 添加相关知识
    if (semanticMemories.length > 0) {
      context.push("## Relevant Knowledge:");
      context.push(...semanticMemories.map(m => m.content));
    }
    
    // 添加相关技能
    if (proceduralMemories) {
      context.push("## Best Practice:");
      context.push(
        proceduralMemories.steps.map((s, i) => `${i + 1}. ${s}`).join("\n")
      );
    }
    
    return context;
  }
  
  private buildContextPrompt(memories: string[]): string {
    if (memories.length === 0) return "";
    
    return [
      "# Context from Long-Term Memory",
      "",
      ...memories,
      "",
      "Use the above context to provide a more personalized and informed response."
    ].join("\n");
  }
  
  private async callLLM(
    userMessage: string,
    context: string
  ): Promise<string> {
    // 实际项目中调用 OpenAI/Anthropic API
    // 这里返回模拟响应
    return `Response to: ${userMessage} (with context: ${context.length} chars)`;
  }
  
  private async extractAndStoreKnowledge(
    userMessage: string,
    agentResponse: string
  ): Promise<void> {
    // 检测用户偏好
    const preferenceMatch = userMessage.match(/记住|偏好|喜欢|使用(.+)而不是(.+)/);
    if (preferenceMatch) {
      await this.semantic.storePreference(userMessage);
    }
    
    // 检测新技能
    const skillMatch = userMessage.match(/如何|怎么|步骤/);
    if (skillMatch && agentResponse.includes("1.")) {
      const steps = agentResponse.match(/\d+\.\s+(.+)/g) || [];
      if (steps.length > 0) {
        await this.procedural.storeSkill(
          "Extracted Skill",
          steps.map(s => s.replace(/^\d+\.\s+/, ""))
        );
      }
    }
  }
}

export { AgentWithMemory };

生产环境优化

性能优化

1. 索引优化

// 为常用查询字段创建索引
await vectorStore.createIndex({
  field: "metadata.type",
  indexType: "exact"
});

await vectorStore.createIndex({
  field: "metadata.timestamp",
  indexType: "range"
});

2. 缓存策略

class MemoryCacheLayer {
  private cache: Map<string, { result: Memory[]; expiry: number }> = new Map();
  private cacheTTL: number = 5 * 60 * 1000; // 5 分钟
  
  async retrieve(
    query: string,
    fetcher: () => Promise<Memory[]>
  ): Promise<Memory[]> {
    const cached = this.cache.get(query);
    
    if (cached && cached.expiry > Date.now()) {
      return cached.result;
    }
    
    const result = await fetcher();
    this.cache.set(query, {
      result,
      expiry: Date.now() + this.cacheTTL
    });
    
    return result;
  }
  
  invalidate(query?: string): void {
    if (query) {
      this.cache.delete(query);
    } else {
      this.cache.clear();
    }
  }
}

3. 批量处理

// 批量存储记忆，减少 API 调用
async function batchStoreMemories(memories: Memory[]): Promise<void> {
  const docs = memories.map(m => new Document({
    pageContent: m.content,
    metadata: {
      type: m.type,
      timestamp: m.timestamp || Date.now(),
      importance: m.importance || 1.0,
      tags: m.tags || []
    }
  }));
  
  // 一次性批量插入
  await vectorStore.addDocuments(docs);
}

成本控制

Embedding API 调用优化：

class EmbeddingCache {
  private cache: Map<string, number[]> = new Map();
  
  async getEmbedding(text: string): Promise<number[]> {
    // 1. 检查缓存
    const hash = this.hashText(text);
    if (this.cache.has(hash)) {
      return this.cache.get(hash)!;
    }
    
    // 2. 调用 API
    const embedding = await this.embeddings.embedQuery(text);
    
    // 3. 存入缓存
    this.cache.set(hash, embedding);
    
    return embedding;
  }
  
  private hashText(text: string): string {
    // 使用简单哈希（生产环境使用 crypto）
    return text.toLowerCase().replace(/\s+/g, " ");
  }
}

// 成本节省示例：
// 假设每天 1000 次查询，其中 30% 重复
// - 无缓存：1000 × $0.02 / 1M = $0.00002
// - 有缓存：700 × $0.02 / 1M = $0.000014
// 节省：30% API 调用

错误处理与降级策略

生产环境必须处理各种异常情况：向量数据库连接失败、Embedding API 限流等。

1. 完整的错误处理实现

class RobustLongTermMemory {
  private vectorStore: Chroma;
  private fallbackMode: boolean = false;
  private localCache: Map<string, Memory[]> = new Map();
  private retryQueue: Memory[] = [];
  
  async store(memory: Memory): Promise<void> {
    try {
      // 尝试存储到向量数据库
      const doc = new Document({
        pageContent: memory.content,
        metadata: {
          type: memory.type,
          timestamp: Date.now(),
          importance: memory.importance || 1.0,
          tags: memory.tags || []
        }
      });
      
      await this.vectorStore.addDocuments([doc]);
      
      // 成功后退出 fallback 模式
      if (this.fallbackMode) {
        console.log("向量存储已恢复，退出 fallback 模式");
        this.fallbackMode = false;
        await this.processRetryQueue();
      }
    } catch (error) {
      console.error("向量存储失败，启用 fallback 模式:", error);
      
      // 降级策略：存入本地缓存
      this.fallbackMode = true;
      const key = `fallback_${Date.now()}`;
      this.localCache.set(key, [memory]);
      
      // 加入重试队列
      this.retryQueue.push(memory);
      
      // 异步重试（5 秒后）
      setTimeout(() => this.retryStore(memory), 5000);
    }
  }
  
  async retrieve(query: string, options?: RetrievalOptions): Promise<Memory[]> {
    // 如果在 fallback 模式，使用本地缓存
    if (this.fallbackMode) {
      console.warn("使用 fallback 缓存检索");
      return this.fallbackRetrieve(query, options?.limit || 5);
    }
    
    try {
      const results = await this.vectorStore.similaritySearchWithScore(
        query,
        options?.limit || 5
      );
      
      return this.processResults(results);
    } catch (error) {
      console.error("向量检索失败，降级到本地缓存:", error);
      this.fallbackMode = true;
      return this.fallbackRetrieve(query, options?.limit || 5);
    }
  }
  
  private fallbackRetrieve(query: string, limit: number): Memory[] {
    // 简单的关键词匹配（降级方案）
    const allMemories = Array.from(this.localCache.values()).flat();
    const keywords = query.toLowerCase().split(/\s+/);
    
    return allMemories
      .filter(m => 
        keywords.some(kw => m.content.toLowerCase().includes(kw))
      )
      .slice(0, limit);
  }
  
  private async retryStore(memory: Memory): Promise<void> {
    try {
      await this.store(memory);
      console.log("重试存储成功");
      
      // 从重试队列移除
      const index = this.retryQueue.indexOf(memory);
      if (index > -1) {
        this.retryQueue.splice(index, 1);
      }
    } catch (error) {
      console.error("重试存储失败，保留在缓存中");
    }
  }
  
  private async processRetryQueue(): Promise<void> {
    console.log(`处理重试队列，共 ${this.retryQueue.length} 条记忆`);
    
    for (const memory of this.retryQueue) {
      await this.retryStore(memory);
    }
  }
  
  private processResults(
    results: Array<{ document: Document; score: number }>
  ): Memory[] {
    return results.map(r => ({
      content: r.document.pageContent,
      type: r.document.metadata.type,
      score: r.score,
      timestamp: r.document.metadata.timestamp,
      tags: r.document.metadata.tags
    }));
  }
}

2. 健康检查与自动恢复

class MemoryHealthMonitor {
  private isHealthy: boolean = true;
  private lastCheckTime: number = 0;
  private checkInterval: number = 60 * 1000; // 1 分钟
  
  async checkHealth(memory: RobustLongTermMemory): Promise<boolean> {
    const now = Date.now();
    
    if (now - this.lastCheckTime < this.checkInterval) {
      return this.isHealthy;
    }
    
    try {
      // 测试写入
      await memory.store({
        content: "health_check_test",
        type: "episodic",
        timestamp: now
      });
      
      // 测试读取
      await memory.retrieve("health_check_test", { limit: 1 });
      
      this.isHealthy = true;
      this.lastCheckTime = now;
      return true;
    } catch (error) {
      console.error("健康检查失败:", error);
      this.isHealthy = false;
      this.lastCheckTime = now;
      return false;
    }
  }
}

降级策略总结：

故障类型	降级方案	恢复机制
向量数据库不可用	本地缓存（关键词匹配）	定时重试 + 自动同步
Embedding API 限流	使用缓存的 embedding	指数退避重试
磁盘空间不足	自动清理旧记忆	监控 + 告警
检索超时	返回缓存结果	减小检索范围

隐私与安全

1. 敏感信息过滤

class PrivacyFilter {
  private sensitivePatterns = [
    /\b\d{3}-\d{2}-\d{4}\b/g,           // SSN
    /\b\d{16}\b/g,                      // Credit card
    /[a-zA-Z0-9]{32,}/g,                // API keys
    /password[:=]\s*\S+/gi              // Passwords
  ];
  
  sanitize(text: string): string {
    let sanitized = text;
    
    for (const pattern of this.sensitivePatterns) {
      sanitized = sanitized.replace(pattern, "[REDACTED]");
    }
    
    return sanitized;
  }
}

// 在存储前过滤
async function storeMemory(content: string): Promise<void> {
  const filter = new PrivacyFilter();
  const sanitized = filter.sanitize(content);
  
  await memory.store({
    content: sanitized,
    type: "episodic",
    timestamp: Date.now()
  });
}

2. 访问控制

class MemoryAccessControl {
  private userPermissions: Map<string, Set<string>> = new Map();
  
  async retrieve(
    userId: string,
    query: string
  ): Promise<Memory[]> {
    // 只检索用户有权限的记忆
    const permissions = this.userPermissions.get(userId) || new Set();
    
    const results = await memory.retrieve(query);
    
    return results.filter(m => 
      !m.tags || m.tags.some(tag => permissions.has(tag))
    );
  }
}

监控与调试

class MemoryMonitor {
  private metrics = {
    storeCount: 0,
    retrieveCount: 0,
    averageRetrievalTime: 0,
    cacheHitRate: 0
  };
  
  async trackRetrieval<T>(
    operation: () => Promise<T>
  ): Promise<T> {
    const start = Date.now();
    
    try {
      const result = await operation();
      
      this.metrics.retrieveCount += 1;
      const duration = Date.now() - start;
      
      // 更新平均检索时间
      this.metrics.averageRetrievalTime = (
        this.metrics.averageRetrievalTime * (this.metrics.retrieveCount - 1) +
        duration
      ) / this.metrics.retrieveCount;
      
      return result;
    } catch (error) {
      console.error("Memory retrieval error:", error);
      throw error;
    }
  }
  
  getMetrics() {
    return this.metrics;
  }
}

实战案例：个人助手 Agent

让我们构建一个完整的案例：能记住用户偏好的个人助手。

场景设计

用户第 1 天：
用户：我喜欢用 VS Code，不喜欢 Vim
Agent：好的，已记住。你偏好 VS Code。

用户第 7 天：
用户：帮我推荐一个代码编辑器
Agent：基于你之前的偏好，我推荐 VS Code。你之前提到过喜欢用它而不是 Vim。

用户第 30 天：
用户：上次部署出了什么问题？
Agent：2 月 13 日，部署失败是因为忘记提交 dist 文件夹。之后我们更新了部署脚本，增加了自动检查步骤。

完整代码

// personal-assistant.ts
import { AgentWithMemory } from "./agent-with-memory";

class PersonalAssistant extends AgentWithMemory {
  async handleUserMessage(message: string): Promise<string> {
    // 特殊处理：偏好设置
    if (this.isPreferenceSetting(message)) {
      return await this.handlePreference(message);
    }
    
    // 特殊处理：回忆请求
    if (this.isRecallRequest(message)) {
      return await this.handleRecall(message);
    }
    
    // 普通对话
    return await this.chat(message);
  }
  
  private isPreferenceSetting(message: string): boolean {
    return /喜欢|偏好|使用.+而不是/.test(message);
  }
  
  private async handlePreference(message: string): Promise<string> {
    // 提取偏好
    const match = message.match(/喜欢|偏好|使用(.+)而不是(.+)/);
    if (!match) {
      await this.semantic.storePreference(message);
      return "好的，已记住你的偏好。";
    }
    
    const [_, preferred, notPreferred] = match;
    
    await this.semantic.storeKnowledge(
      "User",
      "prefers",
      preferred.trim()
    );
    
    await this.semantic.storeKnowledge(
      "User",
      "dislikes",
      notPreferred.trim()
    );
    
    return `好的，已记住。你偏好 ${preferred.trim()}，不喜欢 ${notPreferred.trim()}。`;
  }
  
  private isRecallRequest(message: string): boolean {
    return /上次|之前|记得|还记得/.test(message);
  }
  
  private async handleRecall(message: string): Promise<string> {
    const memories = await this.episodic.recallSimilarConversations(message, 5);
    
    if (memories.length === 0) {
      return "抱歉，我没有找到相关的记忆。";
    }
    
    // 格式化记忆输出
    const formatted = memories.map((m, i) => {
      const date = new Date(m.timestamp!);
      return `${i + 1}. ${date.toLocaleDateString()} - ${m.content}`;
    }).join("\n\n");
    
    return `我找到了这些相关记忆：\n\n${formatted}`;
  }
}

// 使用示例
const assistant = new PersonalAssistant();

// 第 1 天
await assistant.handleUserMessage("我喜欢用 VS Code，不喜欢 Vim");
// → "好的，已记住。你偏好 VS Code，不喜欢 Vim。"

// 第 7 天
await assistant.handleUserMessage("帮我推荐一个代码编辑器");
// → "基于你之前的偏好，我推荐 VS Code..."

// 第 30 天
await assistant.handleUserMessage("上次部署出了什么问题？");
// → "我找到了这些相关记忆：1. 2月13日 - ..."

未来演进方向

1. 自适应遗忘机制

不是所有记忆都需要永久保存，需要定期清理低价值记忆：

class AdaptiveForgetfulness {
  private memory: LongTermMemory;
  private retentionPolicy = {
    minImportance: 1.5,        // 重要度 < 1.5 可删除
    maxAge: 90,                // 最多保留 90 天
    maxCount: 10000,           // 最多保留 10000 条
    accessThreshold: 30        // 30 天内未访问可删除
  };
  
  async pruneMemories(): Promise<void> {
    console.log("开始记忆清理...");
    
    // 1. 获取所有记忆（分批获取避免内存溢出）
    const allMemories = await this.getAllMemories();
    console.log(`总记忆数：${allMemories.length}`);
    
    // 2. 筛选待删除的记忆
    const now = Date.now();
    const ageThreshold = now - this.retentionPolicy.maxAge * 24 * 60 * 60 * 1000;
    const accessThreshold = now - this.retentionPolicy.accessThreshold * 24 * 60 * 60 * 1000;
    
    const toDelete = allMemories.filter(m => {
      // 重要记忆不删除
      if (m.importance && m.importance >= this.retentionPolicy.minImportance) {
        return false;
      }
      
      // 过于陈旧（超过 90 天）
      if (m.timestamp && m.timestamp < ageThreshold) {
        return true;
      }
      
      // 长时间未访问（30 天内未访问）
      if (m.lastAccessed && m.lastAccessed < accessThreshold) {
        return true;
      }
      
      return false;
    });
    
    // 3. 如果总数超过限制，删除最旧的
    if (allMemories.length > this.retentionPolicy.maxCount) {
      const excess = allMemories.length - this.retentionPolicy.maxCount;
      const sorted = allMemories
        .filter(m => !toDelete.includes(m))
        .sort((a, b) => (a.timestamp || 0) - (b.timestamp || 0));
      
      toDelete.push(...sorted.slice(0, excess));
    }
    
    // 4. 执行删除
    console.log(`待删除记忆数：${toDelete.length}`);
    for (const memory of toDelete) {
      await this.deleteMemory(memory.id);
    }
    
    console.log(`清理完成，删除了 ${toDelete.length} 条记忆`);
    console.log(`剩余记忆数：${allMemories.length - toDelete.length}`);
  }
  
  private async getAllMemories(): Promise<Memory[]> {
    // 分批获取所有记忆（避免一次性加载过多）
    const batchSize = 1000;
    const allMemories: Memory[] = [];
    let offset = 0;
    
    while (true) {
      const batch = await this.memory.vectorStore.getDocuments({
        offset,
        limit: batchSize
      });
      
      if (batch.length === 0) break;
      
      allMemories.push(...batch.map(doc => ({
        id: doc.id,
        content: doc.pageContent,
        type: doc.metadata.type,
        timestamp: doc.metadata.timestamp,
        importance: doc.metadata.importance,
        lastAccessed: doc.metadata.lastAccessed,
        tags: doc.metadata.tags
      })));
      
      offset += batchSize;
      
      if (batch.length < batchSize) break;
    }
    
    return allMemories;
  }
  
  private async deleteMemory(id: string): Promise<void> {
    await this.memory.vectorStore.delete({ ids: [id] });
  }
}

// 使用示例：定时清理
setInterval(async () => {
  const forgetter = new AdaptiveForgetfulness();
  await forgetter.pruneMemories();
}, 7 * 24 * 60 * 60 * 1000); // 每周清理一次

2. 跨 Agent 记忆共享

多个 Agent 共享记忆库：

class SharedMemoryPool {
  private pools: Map<string, LongTermMemory> = new Map();
  
  async shareMemory(
    fromAgent: string,
    toAgent: string,
    memoryId: string
  ): Promise<void> {
    const sourceMemory = this.pools.get(fromAgent);
    const targetMemory = this.pools.get(toAgent);
    
    if (!sourceMemory || !targetMemory) return;
    
    // 复制记忆到目标 Agent
    const memory = await sourceMemory.retrieve(memoryId);
    await targetMemory.store(memory[0]);
  }
}

3. 记忆可视化

帮助用户理解 Agent 记住了什么：

interface MemoryVisualization {
  timeline: {
    date: string;
    events: Memory[];
  }[];
  
  knowledgeGraph: {
    nodes: { id: string; label: string }[];
    edges: { from: string; to: string; label: string }[];
  };
  
  skills: {
    name: string;
    successRate: number;
    lastUsed: Date;
  }[];
}

结语

长期记忆不是 AI Agent 的"可选功能"，而是从"工具"升级为"伙伴"的必要条件。

一个有记忆的 Agent 能够：

理解上下文：不需要每次都重新解释项目背景
学习偏好：自动适应你的工作习惯
避免错误：记住过去的失败，不重复犯错
持续改进：从每次交互中学习，变得更好

当你的 Agent 能在三个月后准确回忆起你的偏好，你就拥有了一个真正的"AI 伙伴"。

原文链接：chenguangliang.com/posts/blog0…