Alibaba's Open-Source AgentScope Multi-Agent Framework Explained (Part 7), Chapter 7: The Memory System (Memory)

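Chapter Overview

This chapter takes a close look at AgentScope's memory system. Memory is one of an Agent's core capabilities: it lets the Agent remember earlier conversation turns, learn user preferences, and accumulate knowledge across sessions. An Agent without memory is like a person with amnesia and cannot hold a coherent multi-turn conversation.

After reading this chapter, you should be able to:

Understand the layered design of the memory system
Use short-term memory (InMemoryMemory)
Configure long-term memory (LongTermMemory)
Apply best practices for memory management
Isolate memory in multi-tenant scenarios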

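7.1 Design Philosophy of the Memory System

7.1.1 What Is Memory

In AgentScope, **Memory** is the component an Agent uses to store and retrieve historical information. Like human memory, it is divided into short-term memory and long-term memory.

Why is memory needed?

Scenario 1: Multi-turn conversation
User: My name is Zhang San
Agent: Hello, Zhang San!
User: What is my name?
Agent: Your name is Zhang San. (requires memory)

Scenario 2: Context understanding
User: What are the advantages of Python?
Agent: Python is concise, easy to learn, and has a rich ecosystem...
User: Which fields is it suited for?
Agent: Python is suited for web development, data science, AI... (must remember that the topic is Python)

Scenario 3: Personalized service
User: I like coffee
Agent: Got it, I will remember that
(the next day)
User: Recommend a drink for me
Agent: Since you like coffee, I recommend a latte... (requires long-term memory)

7.1.2 Three-Layer Memory Architecture

AgentScope uses a three-layer memory architecture. Each layer handles information at a different time scale and for a different purpose: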

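━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Layer 1: Short-term Memory
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Function: stores the message history of the current session
Scope: the most recent N conversation rounds (typically 10-50)
Implementation: InMemoryMemory
Characteristics:
  - Fast access
  - Held in process memory
  - Cleared when the session ends
Use cases:
  - Multi-turn conversation context
  - Linking information within a single session

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Layer 2: Context Compression
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Function: automatically summarizes and compresses the conversation history
Scope: the important information that survives compression
Implementation: AutoContextMemory
Characteristics:
  - Compression is triggered automatically
  - Key information is retained
  - Token consumption is reduced
Use cases:
  - Long conversations (more than 100 rounds)
  - Token cost control

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Layer 3: Long-term Memory
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Function: cross-session persistence and semantic search
Scope: knowledge from all past sessions
Implementation: LongTermMemory
Characteristics:
  - Persistent storage (database)
  - Vector-based search
  - Multi-tenant isolation
Use cases:
  - Learning user preferences
  - Knowledge accumulation
  - Personalized recommendations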

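7.1.3 Memory Interface Design

package io.agentscope.core.memory;

import io.agentscope.core.message.Msg;
import io.agentscope.core.state.StateModule;

import java.util.List;

/**
 * The Memory interface defines the core operations of the memory system.
 */
public interface Memory extends StateModule {

    /**
     * Adds a message to memory.
     *
     * @param message the message to add
     */
    void add(Msg message);

    /**
     * Adds a batch of messages, in list order.
     *
     * @param messages the messages to add
     */
    default void addAll(List<Msg> messages) {
        messages.forEach(this::add);
    }

    /**
     * Returns all messages held in memory.
     *
     * @return the list of messages
     */
    List<Msg> getMemory();

    /**
     * Returns the most recent N messages.
     *
     * @param n the number of messages
     * @return the most recent N messages
     */
    List<Msg> getLastN(int n);

    /**
     * Deletes the message at the given index.
     *
     * @param index the message index
     */
    void delete(int index);

    /**
     * Removes all messages.
     */
    void clear();

    /**
     * Returns the total number of messages.
     *
     * @return the message count
     */
    int size();
}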
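To make the contract above concrete, here is a minimal list-backed sketch of the same operations. It is not AgentScope code: `SimpleListMemory` is a hypothetical stand-in that stores plain `String` messages instead of `Msg` and does not implement `StateModule`, so that it compiles with the JDK alone. It mainly illustrates one reasonable reading of `getLastN` (return everything when fewer than N messages exist) and of `delete` (ignore out-of-range indexes); the real implementation may behave differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Memory contract; messages are plain Strings.
class SimpleListMemory {
    private final List<String> messages = new ArrayList<>();

    // Appends a message at the end, preserving insertion order.
    void add(String message) {
        messages.add(message);
    }

    // Returns a defensive copy so callers cannot mutate internal state.
    List<String> getMemory() {
        return new ArrayList<>(messages);
    }

    // Returns the last n messages, or all of them if fewer than n exist.
    List<String> getLastN(int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        int from = Math.max(0, messages.size() - n);
        return new ArrayList<>(messages.subList(from, messages.size()));
    }

    // Removes the message at the index; out-of-range indexes are ignored (an assumption).
    void delete(int index) {
        if (index >= 0 && index < messages.size()) {
            messages.remove(index);
        }
    }

    void clear() {
        messages.clear();
    }

    int size() {
        return messages.size();
    }
}
```

Returning copies from `getMemory()` and `getLastN()` keeps the internal list private, which matters once several components (agent, cleanup jobs, persistence) read the same memory.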

7.2 InMemoryMemory - Short-term Memory (How to Use It)

7.2.1 Basic Usage

InMemoryMemory is the simplest memory implementation: it keeps every message in process memory.

import io.agentscope.core.ReActAgent;
import io.agentscope.core.memory.InMemoryMemory;
import io.agentscope.core.memory.Memory;
import io.agentscope.core.message.Msg;

import java.util.List;

public class BasicMemoryExample {

    public static void main(String[] args) {
        // Create the short-term memory
        Memory memory = new InMemoryMemory();

        // Use it in an Agent
        // (`model` is assumed to be a chat model instance created elsewhere)
        ReActAgent agent = ReActAgent.builder()
            .name("MemoryAgent")
            .model(model)
            .memory(memory)  // attach the memory
            .build();

        // The Agent uses the memory automatically while it runs:
        // 1. The user message is added to Memory
        // 2. The history is read from Memory when the Model is called
        // 3. The Agent's reply is added to Memory

        Msg response = agent.call("Hello").block();
        System.out.println("Agent: " + response.getTextContent());

        // Inspect the memory contents
        List<Msg> history = memory.getMemory();
        System.out.println("Memory size: " + history.size());  // expected: 2 (user message + Agent reply)
    }
}

Internal workflow:

User calls: agent.call("Hello")
  ↓
① Agent.call() is invoked
  ↓
② The user message is added to Memory
   memory.add(Msg{role=USER, content="Hello"})
  ↓
③ The full history is read from Memory
   List<Msg> history = memory.getMemory()
  ↓
④ The Model is called with the history
   model.generate(history, tools, options)
  ↓
⑤ The Model generates a reply
   Msg response = Msg{role=ASSISTANT, content="Hello! How can I help you?"}
  ↓
⑥ The reply is added to Memory
   memory.add(response)
  ↓
⑦ The reply is returned to the user
   return response
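
The seven steps above can be sketched in plain Java. This is an illustration of the flow only, not the framework's implementation: `ChatTurn` and `MemoryBackedAgent` are hypothetical names, the "model" is just a function from history to reply text, and tool calls and reactive types are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical message type: a role plus text content.
class ChatTurn {
    final String role;
    final String content;

    ChatTurn(String role, String content) {
        this.role = role;
        this.content = content;
    }
}

// Hypothetical agent that mirrors steps 2-7 of the workflow.
class MemoryBackedAgent {
    private final List<ChatTurn> memory = new ArrayList<>();
    private final Function<List<ChatTurn>, String> model;

    MemoryBackedAgent(Function<List<ChatTurn>, String> model) {
        this.model = model;
    }

    String call(String userText) {
        // Step 2: record the user message before calling the model.
        memory.add(new ChatTurn("USER", userText));
        // Step 3: read the full history (a copy, so the model cannot mutate memory).
        List<ChatTurn> history = new ArrayList<>(memory);
        // Steps 4-5: the model produces a reply from the history.
        String replyText = model.apply(history);
        // Step 6: record the reply so the next call sees it.
        memory.add(new ChatTurn("ASSISTANT", replyText));
        // Step 7: return the reply to the caller.
        return replyText;
    }

    int memorySize() {
        return memory.size();
    }
}
```

With a stub model such as `history -> "seen " + history.size()`, the first call returns `seen 1` and leaves two entries in memory, which matches the "2 messages per round" growth shown in the examples of this section.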

7.2.2 Multi-turn Conversation Example

import io.agentscope.core.ReActAgent;
import io.agentscope.core.memory.InMemoryMemory;
import io.agentscope.core.message.Msg;

public class MultiRoundConversationExample {

    public static void main(String[] args) {
        // Create the Agent (`model` is assumed to be created elsewhere)
        ReActAgent agent = ReActAgent.builder()
            .name("ConversationBot")
            .model(model)
            .memory(new InMemoryMemory())
            .sysPrompt("You are a friendly AI assistant that remembers the conversation history.")
            .build();

        // Simulate a multi-turn conversation
        simulateConversation(agent);
    }

    private static void simulateConversation(ReActAgent agent) {
        String[] questions = {
            "My name is Zhang San and I am 28 years old",
            "My favorite programming language is Python",
            "What is my name?",
            "Which programming language do I like?",
            "Based on what you know about me, recommend a technical book"
        };

        System.out.println("===== Multi-turn conversation demo =====\n");

        for (int i = 0; i < questions.length; i++) {
            String question = questions[i];

            System.out.println("Round " + (i + 1) + ":");
            System.out.println("User: " + question);

            Msg response = agent.call(
                Msg.builder().textContent(question).build()
            ).block();

            System.out.println("Agent: " + response.getTextContent());

            // Show the current memory size
            int memorySize = agent.getMemory().size();
            System.out.println("(Memory now holds " + memorySize + " messages)\n");
        }
    }
}

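Sample run (the Agent's wording will vary with the model):

===== Multi-turn conversation demo =====

Round 1:
User: My name is Zhang San and I am 28 years old
Agent: Hello, Zhang San! Nice to meet you. 28 is an age full of energy.
(Memory now holds 2 messages)

Round 2:
User: My favorite programming language is Python
Agent: Python is a great choice! It is concise, easy to learn, and especially popular in data science and AI.
(Memory now holds 4 messages)

Round 3:
User: What is my name?
Agent: Your name is Zhang San.
(Memory now holds 6 messages)

Round 4:
User: Which programming language do I like?
Agent: Your favorite programming language is Python.
(Memory now holds 8 messages)

Round 5:
User: Based on what you know about me, recommend a technical book
Agent: Given that you are 28 and like Python, I recommend "Python Machine Learning in Action" or
      "Fluent Python"; both fit your level and interests.
(Memory now holds 10 messages)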

7.2.3 Managing Memory Manually

import io.agentscope.core.memory.InMemoryMemory;
import io.agentscope.core.memory.Memory;
import io.agentscope.core.message.Msg;
import io.agentscope.core.message.MsgRole;  // assumed to live alongside Msg

import java.util.List;

public class ManualMemoryManagementExample {

    public static void main(String[] args) {
        Memory memory = new InMemoryMemory();

        // Add messages manually
        memory.add(Msg.builder()
            .role(MsgRole.USER)
            .textContent("Hello")
            .build());

        memory.add(Msg.builder()
            .role(MsgRole.ASSISTANT)
            .textContent("Hi! How can I help you?")
            .build());

        // Get all messages
        List<Msg> allMessages = memory.getMemory();
        System.out.println("Total messages: " + allMessages.size());

        // Get the most recent N messages
        List<Msg> lastTwo = memory.getLastN(2);
        System.out.println("Last 2 messages: " + lastTwo.size());

        // Delete a specific message
        memory.delete(0);  // removes the first (oldest) message
        System.out.println("After deletion: " + memory.size());

        // Clear all messages
        memory.clear();
        System.out.println("After clear: " + memory.size());
    }
}

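7.3 LongTermMemory - Long-term Memory (Why It Is Needed)

7.3.1 Why Long-term Memory Is Necessary

Short-term memory is lost when the session ends, but many scenarios need memory that spans sessions:

Scenario 1: Personalized service
Session 1 (Monday):
  User: I like Americano coffee
  Agent: Got it, I will remember that

Session 2 (Wednesday):
  User: Recommend a drink for me
  Agent: Since you like Americano, I recommend a cold brew
  (must remember Monday's preference)

Scenario 2: Knowledge accumulation
Session 1:
  User: Our company's product code name is Phoenix
  Agent: Got it, I will remember that

Session 2:
  User: Who is in charge of the Phoenix project?
  Agent: Based on what you told me earlier, Phoenix is your company's product code name...
  (must remember the product information)

Scenario 3: Learning user habits
After many sessions, the Agent has learned that:
  - The user usually holds meetings at 9 a.m.
  - The user prefers concise replies
  - The user often asks technical questions
  (long-term memory lets the Agent understand the user better the more it is used)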

7.3.2 Configuring LongTermMemory

import io.agentscope.core.ReActAgent;
import io.agentscope.core.memory.InMemoryMemory;
import io.agentscope.core.message.Msg;
import io.agentscope.core.plan.LongTermMemory;
import io.agentscope.core.plan.LongTermMemoryMode;
import io.agentscope.core.rag.embedding.EmbeddingService;
import io.agentscope.core.rag.vectorstore.VectorStore;

public class LongTermMemoryExample {

    public static void main(String[] args) {
        // Step 1: create the embedding service (turns text into vectors)
        EmbeddingService embeddingService = createEmbeddingService();

        // Step 2: create the vector store (stores and retrieves vectors)
        VectorStore vectorStore = createVectorStore();

        // Step 3: create the long-term memory
        LongTermMemory ltm = LongTermMemory.builder()
            .embeddingService(embeddingService)
            .vectorStore(vectorStore)
            .mode(LongTermMemoryMode.HYBRID)  // hybrid mode
            .build();

        // Step 4: use it in an Agent (`model` is assumed to be created elsewhere)
        ReActAgent agent = ReActAgent.builder()
            .name("LongTermAgent")
            .model(model)
            .memory(new InMemoryMemory())  // short-term memory
            .withLongTermMemory(ltm, LongTermMemoryMode.HYBRID)  // long-term memory
            .build();

        // Exercise the Agent
        testLongTermMemory(agent, ltm);
    }

    private static void testLongTermMemory(ReActAgent agent, LongTermMemory ltm) {
        // === Session 1 ===
        System.out.println("===== Session 1 =====");

        agent.call("My name is Alice and I am a product manager").block();
        agent.call("I like Americano coffee").block();
        agent.call("My team has 5 people").block();

        // In HYBRID mode the Agent stores this information in long-term memory automatically

        // === Simulate the end of the session by clearing short-term memory ===
        agent.getMemory().clear();

        // === Session 2 (a few days later) ===
        System.out.println("\n===== Session 2 (a few days later) =====");

        Msg response1 = agent.call("Who am I?").block();
        System.out.println("Agent: " + response1.getTextContent());
        // Example output: You are Alice, a product manager.

        Msg response2 = agent.call("What do I like to drink?").block();
        System.out.println("Agent: " + response2.getTextContent());
        // Example output: You like Americano coffee.

        // The Agent retrieved the earlier information from long-term memory
    }

    private static EmbeddingService createEmbeddingService() {
        // Create the embedding service (DashScope or OpenAI)
        return DashScopeEmbeddingService.builder()
            .apiKey(System.getenv("DASHSCOPE_API_KEY"))
            .modelName("text-embedding-v2")
            .build();
    }

    private static VectorStore createVectorStore() {
        // Create the vector store (in-memory, Redis, or a dedicated vector database)
        return InMemoryVectorStore.create();
    }
}

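7.3.3 LongTermMemory Working Modes

LongTermMemory supports three working modes:

public enum LongTermMemoryMode {
    /**
     * Store-only mode.
     * Every conversation is stored to long-term memory automatically,
     * but nothing is retrieved.
     */
    STORE_ONLY,

    /**
     * Retrieve-only mode.
     * The Agent retrieves relevant long-term memories for the current query,
     * but does not store new conversations.
     */
    RETRIEVE_ONLY,

    /**
     * Hybrid mode (recommended).
     * Stores new conversations and also retrieves relevant history.
     */
    HYBRID
}

Usage examples:

// Mode 1: store only (suited to a knowledge-accumulation phase)
ReActAgent storageAgent = ReActAgent.builder()
    .name("StorageAgent")
    .withLongTermMemory(ltm, LongTermMemoryMode.STORE_ONLY)
    .build();

// Every conversation is stored, but history is never retrieved
// Suitable for: first-time use, building up knowledge

// Mode 2: retrieve only (suited to read-only queries)
ReActAgent queryAgent = ReActAgent.builder()
    .name("QueryAgent")
    .withLongTermMemory(ltm, LongTermMemoryMode.RETRIEVE_ONLY)
    .build();

// Relevant history is retrieved, but new conversations are not stored
// Suitable for: ad hoc queries that should not pollute long-term memory

// Mode 3: hybrid (recommended)
ReActAgent hybridAgent = ReActAgent.builder()
    .name("HybridAgent")
    .withLongTermMemory(ltm, LongTermMemoryMode.HYBRID)
    .build();

// Both retrieves and stores, so the Agent reads and extends its history
// Suitable for: production environments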
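
The effect of the three modes can be pictured as two independent switches, one for writes and one for reads. The sketch below is a hypothetical illustration of that gating, not AgentScope's implementation: `SketchMode` and `ModeGatedStore` are made-up names, and retrieval is a simple substring match rather than a vector search.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mirror of the three modes described above.
enum SketchMode { STORE_ONLY, RETRIEVE_ONLY, HYBRID }

// Hypothetical store whose reads and writes are gated by the mode.
class ModeGatedStore {
    private final List<String> entries = new ArrayList<>();
    private final SketchMode mode;

    ModeGatedStore(SketchMode mode) {
        this.mode = mode;
    }

    // Pre-loads an entry regardless of mode (for example, data written earlier by another agent).
    void seed(String entry) {
        entries.add(entry);
    }

    // Stores the text unless the mode is RETRIEVE_ONLY; returns whether it was stored.
    boolean record(String text) {
        if (mode == SketchMode.RETRIEVE_ONLY) {
            return false;
        }
        entries.add(text);
        return true;
    }

    // Returns matching entries unless the mode is STORE_ONLY, which never retrieves.
    List<String> recall(String keyword) {
        List<String> hits = new ArrayList<>();
        if (mode == SketchMode.STORE_ONLY) {
            return hits;
        }
        for (String entry : entries) {
            if (entry.contains(keyword)) {
                hits.add(entry);
            }
        }
        return hits;
    }

    int size() {
        return entries.size();
    }
}
```

Seen this way, HYBRID is simply the mode in which neither switch is off, which is why it suits agents that both learn from and rely on past sessions.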

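7.3.4 Semantic Search

Long-term memory supports semantic search, not just keyword matching.

public class SemanticSearchExample {

    public static void main(String[] args) {
        LongTermMemory ltm = createLongTermMemory();

        // Assume long-term memory already holds these entries:
        // "The user likes coffee"
        // "The user has a meeting at 9 a.m. every day"
        // "The user's team has 5 people"
        // "The user has recently been learning Python"

        // Semantic search example 1: user preferences
        System.out.println("===== Query: the user's drink preference =====");
        List<MemoryEntry> results1 = ltm.search(
            "What does the user like to drink?",
            5,      // return the top 5
            0.7     // similarity threshold of 0.7
        );

        results1.forEach(entry -> {
            System.out.println("Similarity: " + entry.getScore());
            System.out.println("Content: " + entry.getContent());
            System.out.println();
        });

        // Example output (scores are illustrative):
        // Similarity: 0.92
        // Content: The user likes coffee

        // Semantic search example 2: work habits
        System.out.println("===== Query: work habits =====");
        List<MemoryEntry> results2 = ltm.search(
            "The user's working schedule",
            5,
            0.7
        );

        results2.forEach(entry -> {
            System.out.println("Similarity: " + entry.getScore());
            System.out.println("Content: " + entry.getContent());
        });

        // Example output (scores are illustrative):
        // Similarity: 0.88
        // Content: The user has a meeting at 9 a.m. every day
    }
}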
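
The "top K above a similarity threshold" behaviour used in the calls above is commonly built on cosine similarity between embedding vectors. The following sketch shows that technique in isolation. It is an assumption about how such a search could work, not a description of LongTermMemory's internals: `ScoredEntry` and `VectorSearchSketch` are hypothetical names, the vectors are supplied by hand instead of coming from an embedding model, and all vectors are assumed to have the same length.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical search hit: the stored text and its similarity to the query.
class ScoredEntry {
    final String content;
    final double score;

    ScoredEntry(String content, double score) {
        this.content = content;
        this.score = score;
    }
}

// Hypothetical brute-force vector index using cosine similarity.
class VectorSearchSketch {
    private final List<String> contents = new ArrayList<>();
    private final List<double[]> vectors = new ArrayList<>();

    void add(String content, double[] vector) {
        contents.add(content);
        vectors.add(vector);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Keeps entries scoring at least minScore, sorts best-first, returns at most topK.
    List<ScoredEntry> search(double[] query, int topK, double minScore) {
        List<ScoredEntry> hits = new ArrayList<>();
        for (int i = 0; i < vectors.size(); i++) {
            double score = cosine(query, vectors.get(i));
            if (score >= minScore) {
                hits.add(new ScoredEntry(contents.get(i), score));
            }
        }
        hits.sort(Comparator.comparingDouble((ScoredEntry e) -> e.score).reversed());
        return new ArrayList<>(hits.subList(0, Math.min(Math.max(topK, 0), hits.size())));
    }
}
```

A linear scan like this is fine for a handful of entries; dedicated vector stores replace it with approximate nearest-neighbour indexes once the number of memories grows.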

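7.4 Best Practices for Memory Management

7.4.1 Avoiding Memory Explosion

As a conversation goes on, memory keeps growing, which drives up token consumption and degrades performance.

Problem scenario:

Round 1: 2 messages (user + Agent)
Round 10: 20 messages
Round 100: 200 messages
Round 1000: 2000 messages ❌ Token limit exceeded!

Solution 1: Limit the window size

import io.agentscope.core.memory.Memory;
import io.agentscope.core.message.Msg;

import java.util.ArrayList;
import java.util.List;

/**
 * Sliding-window memory.
 * Keeps only the most recent N messages.
 */
public class WindowedMemory implements Memory {

    private final List<Msg> messages = new ArrayList<>();
    private final int maxSize;

    public WindowedMemory(int maxSize) {
        this.maxSize = maxSize;
    }

    @Override
    public void add(Msg message) {
        messages.add(message);

        // Drop the oldest messages once the limit is exceeded
        while (messages.size() > maxSize) {
            messages.remove(0);
        }
    }

    @Override
    public List<Msg> getMemory() {
        return new ArrayList<>(messages);
    }

    @Override
    public List<Msg> getLastN(int n) {
        int from = Math.max(0, messages.size() - Math.max(n, 0));
        return new ArrayList<>(messages.subList(from, messages.size()));
    }

    @Override
    public void delete(int index) {
        messages.remove(index);
    }

    @Override
    public void clear() {
        messages.clear();
    }

    @Override
    public int size() {
        return messages.size();
    }

    // StateModule methods omitted...
}

// Usage example
ReActAgent agent = ReActAgent.builder()
    .name("Agent")
    .memory(new WindowedMemory(50))  // keep only the most recent 50 messages
    .build();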

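Solution 2: Automatic compression

/**
 * Uses AutoContextMemory to compress the history automatically.
 */
public class AutoCompressionExample {

    public static void main(String[] args) {
        // Create the auto-compressing memory
        AutoContextMemory memory = AutoContextMemory.builder()
            .model(model)  // the LLM used to generate summaries
            .compressionThreshold(10000)  // trigger compression above 10K tokens
            .mode(CompressionMode.PROGRESSIVE)  // progressive compression
            .build();

        ReActAgent agent = ReActAgent.builder()
            .name("Agent")
            .memory(memory)
            .build();

        // When the message history exceeds 10K tokens:
        // 1. AutoContextMemory calls the LLM automatically
        // 2. The LLM generates a summary of the earlier conversation
        // 3. The summary replaces the original messages
        // 4. The token count drops substantially
    }
}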
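
The four steps listed in the comments above can be sketched as a threshold check on every `add`. This is a simplified illustration under stated assumptions, not how AutoContextMemory is implemented: `CompressingMemorySketch` is a hypothetical class, the token count is a crude estimate of one token per four characters, the summarizer is an arbitrary function standing in for the LLM call, and the `keepRecent` parameter (how many recent messages stay verbatim) is an addition of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical memory that replaces old messages with a summary once a token budget is exceeded.
class CompressingMemorySketch {
    private final List<String> messages = new ArrayList<>();
    private final int tokenThreshold;
    private final int keepRecent;
    private final Function<List<String>, String> summarizer;

    CompressingMemorySketch(int tokenThreshold, int keepRecent,
                            Function<List<String>, String> summarizer) {
        this.tokenThreshold = tokenThreshold;
        this.keepRecent = keepRecent;
        this.summarizer = summarizer;
    }

    // Crude estimate (assumption): roughly one token per four characters, rounded up.
    static int estimateTokens(List<String> msgs) {
        int chars = 0;
        for (String m : msgs) {
            chars += m.length();
        }
        return (chars + 3) / 4;
    }

    void add(String message) {
        messages.add(message);
        // Compress only when over budget and there is something older than the recent window.
        if (estimateTokens(messages) > tokenThreshold && messages.size() > keepRecent) {
            int cut = messages.size() - keepRecent;
            List<String> old = new ArrayList<>(messages.subList(0, cut));
            List<String> recent = new ArrayList<>(messages.subList(cut, messages.size()));
            messages.clear();
            messages.add(summarizer.apply(old));  // the summary replaces the old messages
            messages.addAll(recent);              // recent messages are kept verbatim
        }
    }

    List<String> getMemory() {
        return new ArrayList<>(messages);
    }
}
```

Note that a single pass does not guarantee the history ends up under the threshold; a production implementation would also have to bound the summary length and decide what to do when the recent window alone is too large.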

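Solution 3: Periodic cleanup

import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Periodically removes old messages.
 */
public class ScheduledMemoryCleanup {

    private final ScheduledExecutorService scheduler =
        Executors.newScheduledThreadPool(1);

    public void startCleanup(Memory memory, long intervalMinutes) {
        scheduler.scheduleAtFixedRate(
            () -> cleanupOldMessages(memory, Duration.ofDays(7)),
            0,
            intervalMinutes,
            TimeUnit.MINUTES
        );
    }

    // Stops the background cleanup thread when it is no longer needed
    public void stop() {
        scheduler.shutdown();
    }

    private void cleanupOldMessages(Memory memory, Duration maxAge) {
        List<Msg> messages = memory.getMemory();
        Instant cutoff = Instant.now().minus(maxAge);

        // Iterate from the end so deletions do not shift the indexes still to be visited
        for (int i = messages.size() - 1; i >= 0; i--) {
            Msg msg = messages.get(i);

            // Assumes each message carries a timestamp in its metadata
            String timestampStr = msg.getMetadata("timestamp");
            if (timestampStr != null) {
                Instant msgTime = Instant.parse(timestampStr);

                if (msgTime.isBefore(cutoff)) {
                    memory.delete(i);
                }
            }
        }

        System.out.println("Cleanup finished, remaining messages: " + memory.size());
    }
}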

7.4.2 Multi-tenant Memory Isolation

In production, one Agent service usually serves many users at the same time, and each user's memory must be isolated from the others.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Multi-tenant memory manager.
 */
public class MultiTenantMemoryManager {

    // A separate short-term memory per user
    private final Map<String, Memory> userMemories = new ConcurrentHashMap<>();

    // A separate long-term memory per user
    private final Map<String, LongTermMemory> userLongTermMemories = new ConcurrentHashMap<>();

    private final LongTermMemoryConfig ltmConfig;

    public MultiTenantMemoryManager(LongTermMemoryConfig config) {
        this.ltmConfig = config;
    }

    /**
     * Gets or creates the user's short-term memory.
     */
    public Memory getOrCreateMemory(String userId) {
        return userMemories.computeIfAbsent(
            userId,
            key -> new WindowedMemory(100)  // at most 100 messages per user
        );
    }

    /**
     * Gets or creates the user's long-term memory.
     */
    public LongTermMemory getOrCreateLongTermMemory(String userId) {
        return userLongTermMemories.computeIfAbsent(
            userId,
            key -> LongTermMemory.builder()
                .embeddingService(ltmConfig.getEmbeddingService())
                .vectorStore(ltmConfig.getVectorStore())
                .namespace(userId)  // userId serves as the isolating namespace
                .mode(LongTermMemoryMode.HYBRID)
                .build()
        );
    }

    /**
     * Creates an Agent for the user.
     */
    public ReActAgent createAgentForUser(String userId, Model model) {
        return ReActAgent.builder()
            .name("Agent-" + userId)
            .model(model)
            .memory(getOrCreateMemory(userId))
            .withLongTermMemory(
                getOrCreateLongTermMemory(userId),
                LongTermMemoryMode.HYBRID
            )
            .build();
    }

    /**
     * Clears the user's short-term memory.
     */
    public void clearUserMemory(String userId) {
        Memory memory = userMemories.get(userId);
        if (memory != null) {
            memory.clear();
        }
    }

    /**
     * Deletes all of the user's memory (including long-term memory).
     */
    public void deleteUserData(String userId) {
        // Remove the short-term memory
        userMemories.remove(userId);

        // Remove the long-term memory
        LongTermMemory ltm = userLongTermMemories.remove(userId);
        if (ltm != null) {
            ltm.clear();  // delete the entries from the vector database
        }
    }

    /**
     * Returns system-wide statistics.
     */
    public MemoryStats getStats() {
        return new MemoryStats(
            userMemories.size(),
            userMemories.values().stream()
                .mapToInt(Memory::size)
                .sum(),
            userLongTermMemories.size()
        );
    }
}

// Usage example
public class MultiUserScenario {

    public static void main(String[] args) {
        // Create the multi-tenant manager (`config` and `model` are assumed to be created elsewhere)
        MultiTenantMemoryManager manager = new MultiTenantMemoryManager(config);

        // Session for user 1
        ReActAgent agent1 = manager.createAgentForUser("user_001", model);
        agent1.call("My name is Zhang San").block();
        agent1.call("I like coffee").block();

        // Session for user 2
        ReActAgent agent2 = manager.createAgentForUser("user_002", model);
        agent2.call("My name is Li Si").block();
        agent2.call("I like tea").block();

        // Verify memory isolation
        Msg response1 = agent1.call("What do I like to drink?").block();
        System.out.println("User 1: " + response1.getTextContent());
        // Example output: You like coffee

        Msg response2 = agent2.call("What do I like to drink?").block();
        System.out.println("User 2: " + response2.getTextContent());
        // Example output: You like tea

        // The two users' memories are fully isolated

        // Clear user 1's short-term memory
        manager.clearUserMemory("user_001");

        // Inspect the statistics
        MemoryStats stats = manager.getStats();
        System.out.println("Active users: " + stats.getActiveUsers());
        System.out.println("Total messages: " + stats.getTotalMessages());
    }
}

7.4.3 Memory Persistence

In production, memory needs to be persisted so that it can be restored after a system restart.

/**
 * A memory that can be persisted.
 */
public class PersistentMemory implements Memory {

    private final List<Msg> messages = new ArrayList<>();
    private final MemoryStore store;  // storage backend (database, Redis, etc.)
    private final String userId;

    public PersistentMemory(String userId, MemoryStore store) {
        this.userId = userId;
        this.store = store;

        // Load from the store on startup
        loadFromStore();
    }

    @Override
    public void add(Msg message) {
        messages.add(message);

        // Write through to the store
        store.save(userId, message);
    }

    @Override
    public List<Msg> getMemory() {
        return new ArrayList<>(messages);
    }

    @Override
    public void clear() {
        messages.clear();
        store.clear(userId);
    }

    // getLastN(), delete(), size() and the StateModule methods are omitted for brevity

    private void loadFromStore() {
        List<Msg> stored = store.load(userId);
        messages.addAll(stored);
    }
}

// Automatic persistence with Session
ReActAgent agent = ReActAgent.builder()
    .name("PersistentAgent")
    .memory(new InMemoryMemory())
    .build();

// Save the Agent state (including Memory)
Session session = Session.create();
session.saveAgent(agent);

// Restore the Agent state
ReActAgent restored = session.loadAgent("PersistentAgent", ReActAgent.class);
// Memory is restored along with it

7.5 Production Scenario: Memory Management for Intelligent Customer Service

The following end-to-end example shows how the memory system is applied in a realistic production scenario.

import io.agentscope.core.ReActAgent;
import io.agentscope.core.memory.Memory;
import io.agentscope.core.message.Msg;
import io.agentscope.core.plan.LongTermMemory;
import io.agentscope.core.plan.LongTermMemoryMode;

import java.util.List;
import java.util.stream.Collectors;

/**
 * Memory management for an intelligent customer service system.
 *
 * Features:
 * 1. Multi-user isolation
 * 2. Short-term memory: context of the current session
 * 3. Long-term memory: user preferences and past questions
 * 4. Automatic compression: avoids token explosion
 * 5. Persistence: state is restored after a system restart
 */
public class CustomerServiceMemorySystem {

    private final MultiTenantMemoryManager memoryManager;
    private final Model model;

    public CustomerServiceMemorySystem(String apiKey) {
        this.model = createModel(apiKey);
        this.memoryManager = new MultiTenantMemoryManager(
            createLongTermMemoryConfig()
        );
    }

    /**
     * Handles a customer service conversation.
     */
    public String handleCustomerQuery(String userId, String query) {
        // Build an Agent for the user; the manager reuses the user's existing memories
        ReActAgent agent = memoryManager.createAgentForUser(userId, model);

        // Handle the query
        Msg response = agent.call(query).block();

        return response.getTextContent();
    }

    /**
     * Looks up the user's past preferences.
     */
    public List<String> getUserPreferences(String userId) {
        LongTermMemory ltm = memoryManager.getOrCreateLongTermMemory(userId);

        // Semantic search for user preferences
        List<MemoryEntry> preferences = ltm.search(
            "The user's preferences and likes",
            10,
            0.6
        );

        return preferences.stream()
            .map(MemoryEntry::getContent)
            .collect(Collectors.toList());
    }

    /**
     * Ends the user's session.
     */
    public void endSession(String userId) {
        // Clear short-term memory (long-term memory is kept)
        memoryManager.clearUserMemory(userId);
    }

    /**
     * Deletes all user data (GDPR compliance).
     */
    public void deleteUserData(String userId) {
        memoryManager.deleteUserData(userId);
    }

    // Helper methods...
    private Model createModel(String apiKey) {
        return DashScopeChatModel.builder()
            .apiKey(apiKey)
            .modelName("qwen-plus")
            .build();
    }

    private LongTermMemoryConfig createLongTermMemoryConfig() {
        return new LongTermMemoryConfig(
            createEmbeddingService(),
            createVectorStore()
        );
    }
}

// Usage example
public class CustomerServiceDemo {

    public static void main(String[] args) {
        CustomerServiceMemorySystem system = new CustomerServiceMemorySystem(
            System.getenv("DASHSCOPE_API_KEY")
        );

        String userId = "customer_12345";

        // ===== Conversation 1 =====
        System.out.println("===== Conversation 1 (Monday) =====");

        String resp1 = system.handleCustomerQuery(userId,
            "My name is Zhang San and I often buy your products");
        System.out.println("Support: " + resp1);

        String resp2 = system.handleCustomerQuery(userId,
            "My favorite product of yours is the iPhone 15 Pro");
        System.out.println("Support: " + resp2);

        // End the session
        system.endSession(userId);

        // ===== Conversation 2 (a few days later) =====
        System.out.println("\n===== Conversation 2 (Friday) =====");

        String resp3 = system.handleCustomerQuery(userId,
            "Who am I?");
        System.out.println("Support: " + resp3);
        // Example output: You are Zhang San, one of our long-time customers.

        String resp4 = system.handleCustomerQuery(userId,
            "Recommend a product for me");
        System.out.println("Support: " + resp4);
        // Example output: Since you like the iPhone 15 Pro, have a look at the new AirPods Pro...

        // Look up the user's preferences
        List<String> prefs = system.getUserPreferences(userId);
        System.out.println("\nUser preferences:");
        prefs.forEach(pref -> System.out.println("  - " + pref));
    }
}

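7.6 Chapter Summary

Key Takeaways

Three-layer memory architecture
- Short-term memory: InMemoryMemory, stores the current session
- Context compression: AutoContextMemory, automatic summarization
- Long-term memory: LongTermMemory, cross-session persistence

Memory interface
- add(): add a message
- getMemory(): get all messages
- getLastN(): get the most recent N messages
- clear(): clear the memory

Long-term memory
- Semantic vector search
- Three working modes: STORE_ONLY, RETRIEVE_ONLY, HYBRID
- Ability to learn across sessions

Best practices
- Limit the memory window to avoid token explosion
- Isolate tenants to keep data secure
- Persist memory so the system can recover
- Clean up regularly to keep performance up

Memory Management Checklist

Design phase:
☐ Decide on the memory layers (is long-term memory needed?)
☐ Design the memory isolation strategy (single user vs. multiple users)
☐ Plan the token budget

Implementation phase:
☐ Configure short-term memory (InMemoryMemory or AutoContextMemory)
☐ Configure long-term memory (if needed)
☐ Implement multi-tenant isolation (if needed)
☐ Configure persistence (if needed)

Optimization phase:
☐ Monitor memory size and token usage
☐ Set a memory window limit
☐ Implement periodic cleanup
☐ Test memory isolation

Production deployment:
☐ Configure persistent storage
☐ Implement data backups
☐ Support GDPR compliance (user data deletion)
☐ Monitor the performance of the memory system

Next Chapter

Chapter 8 covers the Hook system in depth and shows how its event-interception mechanism can extend an Agent with logging, performance monitoring, RAG integration, and other advanced features.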