Lesson 20: SessionMemory, Real-Time Conversation Memory

Module 7: Memory and Context | Prerequisite: Lesson 17 | Estimated study time: 65 minutes

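Learning Objectives

After completing this lesson, you will be able to:

Explain SessionMemory's trigger thresholds and its two-condition decision logic
Describe the forked subagent's execution model and permission constraints
Understand the template structure and section management of the session memory file
Distinguish the four persistent memory types (user/feedback/project/reference) and when each applies
Explain how the extractMemories system writes session knowledge to disk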

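20.1 System Overview

SessionMemory is Claude Code's real-time conversation memory system. While a conversation is in progress, it automatically extracts key information and writes it to a structured Markdown file, so that core information survives context compaction (compact).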

┌─────────────────────────────────────────────────────────────┐
│                   Main conversation loop                    │
│                                                             │
│  User prompt → model reply → tool call → model reply → ...  │
│       │                                                     │
│       ├── postSamplingHook fires                            │
│       │      │                                              │
│       │      ▼                                              │
│       │  shouldExtractMemory()                              │
│       │      │                                              │
│       │      ├── below threshold → skip                     │
│       │      │                                              │
│       │      └── threshold met → runForkedAgent()           │
│       │             │                                       │
│       │             ▼                                       │
│       │  ┌───────────────────────────┐                      │
│       │  │      Forked Subagent      │                      │
│       │  │  (separate context fork)  │                      │
│       │  │                           │                      │
│       │  │ Reads current memory file │                      │
│       │  │ Analyzes new conversation │                      │
│       │  │ Updates it with Edit tool │                      │
│       │  └───────────────────────────┘                      │
│       │             │                                       │
│       │             ▼                                       │
│       │  ~/.claude/session-memory/MEMORY.md updated         │
│       │                                                     │
│       └── main loop continues (uninterrupted)               │
└─────────────────────────────────────────────────────────────┘

20.2 Feature Gate and Initialization

Gate check

SessionMemory is controlled by the GrowthBook feature flag tengu_session_memory:

function isSessionMemoryGateEnabled(): boolean {
  return getFeatureValue_CACHED_MAY_BE_STALE('tengu_session_memory', false)
}

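The _CACHED_MAY_BE_STALE suffix signals that the function reads a locally cached value and does not block on GrowthBook's remote configuration. The value may be stale, but the read incurs no network latency.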
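
The cached-read pattern can be sketched as follows. This is a minimal illustration, not the actual GrowthBook integration: flagCache, getCachedBooleanFlag, and refreshFlags are hypothetical names introduced here. The point it shows is that the read path is synchronous and falls back to the default, while a separate background refresh updates the cache whenever the remote fetch completes.

```typescript
// Hypothetical sketch of a "cached, may be stale" flag read.
// None of these names come from the Claude Code source.
const flagCache = new Map<string, unknown>()

// Synchronous read: returns the cached value if it is a boolean,
// otherwise the caller-supplied default. Never touches the network.
function getCachedBooleanFlag(name: string, defaultValue: boolean): boolean {
  const cached = flagCache.get(name)
  return typeof cached === 'boolean' ? cached : defaultValue
}

// Background refresh: overwrites cached values once the remote
// configuration has been fetched. Callers of the read path never await it.
async function refreshFlags(
  fetchRemote: () => Promise<Record<string, unknown>>,
): Promise<void> {
  const remote = await fetchRemote()
  for (const [name, value] of Object.entries(remote)) {
    flagCache.set(name, value)
  }
}

// Before any refresh the default wins; after the cache is filled,
// the cached value wins.
const beforeRefresh = getCachedBooleanFlag('tengu_session_memory', false)
flagCache.set('tengu_session_memory', true)
const afterRefresh = getCachedBooleanFlag('tengu_session_memory', false)
```

The trade-off is the one the suffix advertises: a caller may act on an old value for a while, in exchange for a read that cannot stall the conversation loop.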

Initialization flow

export function initSessionMemory(): void {
  if (getIsRemoteMode()) return
  const autoCompactEnabled = isAutoCompactEnabled()
  if (!autoCompactEnabled) return
  registerPostSamplingHook(extractSessionMemory)
}

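Key decisions in initialization:

Skip in remote mode: remote mode (CCR containers) does not use local memory
Depend on autoCompact: if autoCompact is off, session memory is off too (its main value is supporting compaction)
Register a hook: registerPostSamplingHook attaches the extraction function so that it runs after every API response
Check the gate lazily: the feature gate is not checked when the hook is registered, only when the hook actually runs (lazy evaluation)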
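
The lazy gate check in the last point can be illustrated with a small sketch. Everything here (registerHook, runHooks, gateEnabled, extractionCount) is a hypothetical stand-in rather than the real hook registry; the sketch only shows why a hook registered while the gate is off still starts working once the flag flips.

```typescript
// Hypothetical stand-ins; none of these names come from the Claude Code source.
type PostSamplingHook = () => void

const hooks: PostSamplingHook[] = []
let gateEnabled = false // stand-in for the cached feature flag
let extractionCount = 0

function registerHook(hook: PostSamplingHook): void {
  hooks.push(hook)
}

function extractIfGateEnabled(): void {
  // The gate is evaluated on every run, not once at registration time.
  if (!gateEnabled) return
  extractionCount += 1
}

function runHooks(): void {
  for (const hook of hooks) hook()
}

registerHook(extractIfGateEnabled) // registered while the gate is still off
runHooks() // no-op: gate is off
gateEnabled = true // the flag value arrives later
runHooks() // now the extraction runs
```

Had the gate been checked at registration time, the hook would never have been registered and the later flag change would have had no effect until restart.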

20.3 Trigger Thresholds: When to Extract Memory

Configuration parameters

export const DEFAULT_SESSION_MEMORY_CONFIG: SessionMemoryConfig = {
  minimumMessageTokensToInit: 10000,   // start only after 10K tokens of context
  minimumTokensBetweenUpdate: 5000,    // require 5K tokens of growth per update
  toolCallsBetweenUpdates: 3,          // tool-call count checked alongside token growth
}

These defaults can be overridden by remote configuration (tengu_sm_config).

Two-phase trigger logic

Phase 1: initialization threshold

if (!isSessionMemoryInitialized()) {
  if (!hasMetInitializationThreshold(currentTokenCount)) {
    return false  // context is still too small to be worth extracting
  }
  markSessionMemoryInitialized()
}

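At the start of a conversation, the context window holds only the system prompt and a few exchanges. Extracting memory at that point has little value, hence the 10K-token initialization threshold.

Phase 2: update decision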

const shouldExtract =
  (hasMetTokenThreshold && hasMetToolCallThreshold) ||
  (hasMetTokenThreshold && !hasToolCallsInLastTurn)

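This is a compound condition:

Trigger extraction =
  (token growth >= 5K AND tool calls >= 3)
  OR
  (token growth >= 5K AND no tool calls in the last turn)

The second clause captures a "natural conversation breakpoint": when the model has finished a round of work (it is no longer calling tools) and enough new content has accumulated, it is a good moment to extract.

Important constraint: the token threshold is a necessary condition. Even if the tool-call count is met, extraction does not fire while token growth is below 5K. This prevents over-extraction during rapid bursts of small tool calls.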
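
The compound condition can be written as a small pure function and checked against a few cases. This is a sketch under assumptions: the TurnSnapshot and UpdateConfig shapes and the function name shouldExtract are invented here for illustration, and the condition is factored as token AND (tool calls OR no tool calls in the last turn), which is logically equivalent to the two-clause form above.

```typescript
// Hypothetical shapes for illustration; the real code tracks this state internally.
interface UpdateConfig {
  minimumTokensBetweenUpdate: number
  toolCallsBetweenUpdates: number
}

interface TurnSnapshot {
  tokensSinceLastExtraction: number
  toolCallsSinceLastExtraction: number
  hasToolCallsInLastTurn: boolean
}

function shouldExtract(snapshot: TurnSnapshot, config: UpdateConfig): boolean {
  const hasMetTokenThreshold =
    snapshot.tokensSinceLastExtraction >= config.minimumTokensBetweenUpdate
  const hasMetToolCallThreshold =
    snapshot.toolCallsSinceLastExtraction >= config.toolCallsBetweenUpdates
  // Token growth is necessary; either of the other two conditions completes it.
  return hasMetTokenThreshold &&
    (hasMetToolCallThreshold || !snapshot.hasToolCallsInLastTurn)
}

const config: UpdateConfig = { minimumTokensBetweenUpdate: 5000, toolCallsBetweenUpdates: 3 }

const decisions = [
  // enough tokens, enough tool calls, model still using tools
  shouldExtract({ tokensSinceLastExtraction: 6000, toolCallsSinceLastExtraction: 3, hasToolCallsInLastTurn: true }, config),
  // enough tokens, few tool calls, but the last turn was a natural breakpoint
  shouldExtract({ tokensSinceLastExtraction: 6000, toolCallsSinceLastExtraction: 1, hasToolCallsInLastTurn: false }, config),
  // enough tokens, few tool calls, model still mid-task
  shouldExtract({ tokensSinceLastExtraction: 6000, toolCallsSinceLastExtraction: 1, hasToolCallsInLastTurn: true }, config),
  // many tool calls, but not enough token growth
  shouldExtract({ tokensSinceLastExtraction: 4000, toolCallsSinceLastExtraction: 10, hasToolCallsInLastTurn: false }, config),
]
```

The four cases evaluate to true, true, false, false: the last one shows that no amount of tool activity compensates for insufficient token growth.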

How tokens are counted

export function hasMetUpdateThreshold(currentTokenCount: number): boolean {
  const tokensSinceLastExtraction = currentTokenCount - tokensAtLastExtraction
  return tokensSinceLastExtraction >= sessionMemoryConfig.minimumTokensBetweenUpdate
}

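The measure is the context-window delta (the current token count minus the token count at the last extraction), not cumulative API usage. autoCompact uses the same metric, which keeps the two systems' behavior consistent.

20.4 Forked Subagent Execution Model

Why fork

SessionMemory extraction has to call the Claude API (so that the model can analyze the conversation and produce a memory update), but it must not block the main conversation loop. The solution is a forked subagent: a separate API call context that shares the main conversation's prompt cache while the two do not interfere with each other.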

await runForkedAgent({
  promptMessages: [createUserMessage({ content: userPrompt })],
  cacheSafeParams: createCacheSafeParams(context),
  canUseTool: createMemoryFileCanUseTool(memoryPath),
  querySource: 'session_memory',
  forkLabel: 'session_memory',
  overrides: { readFileState: setupContext.readFileState },
})

Permission constraints

The forked agent's tool permissions are tightly restricted:

export function createMemoryFileCanUseTool(memoryPath: string): CanUseToolFn {
  return async (tool: Tool, input: unknown) => {
    if (
      tool.name === FILE_EDIT_TOOL_NAME &&
      typeof input === 'object' && input !== null &&
      'file_path' in input
    ) {
      const filePath = input.file_path
      if (typeof filePath === 'string' && filePath === memoryPath) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
    }
    return {
      behavior: 'deny' as const,
      message: `only ${FILE_EDIT_TOOL_NAME} on ${memoryPath} is allowed`,
      // ...
    }
  }
}

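Exactly one operation is allowed: using the Edit tool on the specific session memory file. Every other tool call is denied. This is the principle of least privilege taken to its extreme.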
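
To see the check in action, the sketch below restates it as a synchronous function with simplified types. The Tool and Decision types, the value 'Edit' for FILE_EDIT_TOOL_NAME, the helper name checkMemoryEdit, and the file paths are all assumptions made for illustration; only the decision logic mirrors the code above.

```typescript
// Simplified stand-ins for the real Tool and CanUseToolFn types (assumptions).
const FILE_EDIT_TOOL_NAME = 'Edit' // assumed value of the constant
interface Tool { name: string }
type Decision =
  | { behavior: 'allow'; updatedInput: unknown }
  | { behavior: 'deny'; message: string }

// Synchronous restatement of the permission check, so it can be exercised directly.
function checkMemoryEdit(memoryPath: string, tool: Tool, input: unknown): Decision {
  const filePath =
    typeof input === 'object' && input !== null
      ? (input as { file_path?: unknown }).file_path
      : undefined
  // Both the tool name and the exact path string must match.
  if (tool.name === FILE_EDIT_TOOL_NAME && filePath === memoryPath) {
    return { behavior: 'allow', updatedInput: input }
  }
  return {
    behavior: 'deny',
    message: `only ${FILE_EDIT_TOOL_NAME} on ${memoryPath} is allowed`,
  }
}

const memoryPath = '/tmp/session-memory.md' // hypothetical path
const behaviors = [
  checkMemoryEdit(memoryPath, { name: 'Edit' }, { file_path: memoryPath }),      // right tool, right file
  checkMemoryEdit(memoryPath, { name: 'Edit' }, { file_path: '/tmp/other.md' }), // right tool, wrong file
  checkMemoryEdit(memoryPath, { name: 'Read' }, { file_path: memoryPath }),      // wrong tool, right file
  checkMemoryEdit(memoryPath, { name: 'Edit' }, { file_path: '/tmp/./session-memory.md' }), // same file, different string
].map(decision => decision.behavior)
```

The last case is worth noticing: the comparison is plain string equality, so a path that resolves to the same file but is spelled differently is still denied.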

Concurrency control

The extraction function is wrapped in sequential(), which ensures that only one extraction runs at a time:

const extractSessionMemory = sequential(async function (
  context: REPLHookContext,
): Promise<void> {
  // ...
})
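
The source does not show how sequential() is implemented. One common way to get this behavior is to chain every call onto the promise of the previous one; the sketch below is an assumption about the shape of such a helper, not the actual implementation.

```typescript
// A minimal sketch of a serializing wrapper (assumed design, not the real sequential()).
function sequential<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  // Tail of the queue: resolves when the most recently scheduled call settles.
  let tail: Promise<unknown> = Promise.resolve()
  return (...args: A): Promise<R> => {
    const run = tail.then(() => fn(...args))
    // Swallow errors on the queue itself so one failure does not block later calls;
    // the caller still sees the rejection through `run`.
    tail = run.catch(() => undefined)
    return run
  }
}

// Usage: two overlapping calls execute strictly one after the other.
const log: string[] = []
const task = sequential(async (id: string): Promise<void> => {
  log.push(`start ${id}`)
  await new Promise<void>(resolve => setTimeout(resolve, 5))
  log.push(`end ${id}`)
})
const bothDone = Promise.all([task('a'), task('b')])
```

With this design, calls queue up rather than being dropped, so a burst of hook invocations results in back-to-back extractions instead of concurrent edits to the same file.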

20.5 Memory File Template

Default template structure

The session memory file uses a fixed structure of Markdown sections:

# Session Title
_A short and distinctive 5-10 word descriptive title for the session._

# Current State
_What is actively being worked on right now? Pending tasks not yet completed._

# Task specification
_What did the user ask to build? Any design decisions or other context_

# Files and Functions
_What are the important files? In short, what do they contain?_

# Workflow
_What bash commands are usually run and in what order?_

# Errors & Corrections
_Errors encountered and how they were fixed._

# Codebase and System Documentation
_What are the important system components? How do they fit together?_

# Learnings
_What has worked well? What has not? What to avoid?_

# Key results
_If the user asked a specific output, repeat the exact result here_

# Worklog
_Step by step, what was attempted, done? Very terse summary for each step_

Section protection rules

The extraction prompt contains strict structure-protection instructions:

CRITICAL RULES FOR EDITING:
- NEVER modify, delete, or add section headers (the lines starting with '#')
- NEVER modify or delete the italic _section description_ lines
- ONLY update the actual content that appears BELOW the italic descriptions
- Do NOT add any new sections, summaries, or information outside the structure

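This keeps the file structure consistent across many updates, which makes it easier for the downstream compaction system to parse.

Section size control

const MAX_SECTION_LENGTH = 2000            // per-section cap, ~2000 tokens
const MAX_TOTAL_SESSION_MEMORY_TOKENS = 12000  // whole-file cap, ~12000 tokens

When a section grows too long, the system adds a warning to the extraction prompt: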

function generateSectionReminders(sectionSizes, totalTokens): string {
  // over the total budget → CRITICAL-level warning
  // a single section over its limit → IMPORTANT-level warning
}
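
The outline above omits the body. A possible filling-in is sketched below, assuming sectionSizes maps section titles to token counts. The two severity levels come from the outline; the exact warning wording and the parameter types are invented for illustration.

```typescript
const MAX_SECTION_LENGTH = 2000
const MAX_TOTAL_SESSION_MEMORY_TOKENS = 12000

// Sketch only: the warning text is hypothetical, the two severity levels are from the outline.
function generateSectionReminders(
  sectionSizes: Record<string, number>, // assumed shape: section title → token count
  totalTokens: number,
): string {
  const lines: string[] = []

  if (totalTokens > MAX_TOTAL_SESSION_MEMORY_TOKENS) {
    lines.push(
      `CRITICAL: the file is ~${totalTokens} tokens, over the ` +
        `${MAX_TOTAL_SESSION_MEMORY_TOKENS}-token budget. Condense it.`,
    )
  }

  const oversized = Object.entries(sectionSizes).filter(
    ([, size]) => size > MAX_SECTION_LENGTH,
  )
  if (oversized.length > 0) {
    const names = oversized.map(([name, size]) => `"${name}" (~${size})`).join(', ')
    lines.push(`IMPORTANT: sections over ${MAX_SECTION_LENGTH} tokens: ${names}.`)
  }

  // Empty string when everything is within budget, so nothing is added to the prompt.
  return lines.join('\n')
}

const withinBudget = generateSectionReminders({ Learnings: 300 }, 5000)
const oneLongSection = generateSectionReminders({ Worklog: 2500, Learnings: 300 }, 5000)
const overTotal = generateSectionReminders({}, 13000)
```

Because the limits are enforced through prompt warnings rather than truncation, they act as soft caps: the model is asked to condense, and the file is never cut mechanically in this sketch.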

Custom templates

Users can place a custom template at ~/.claude/session-memory/config/template.md and a custom extraction prompt at ~/.claude/session-memory/config/prompt.md. Template variables use the {{variableName}} syntax:

function substituteVariables(template: string, variables: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key: string) =>
    Object.prototype.hasOwnProperty.call(variables, key)
      ? variables[key]!
      : match,
  )
}
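
A short usage example makes the substitution rules concrete. The function is repeated so that the block is self-contained; the variable name notesPath and the template text are hypothetical.

```typescript
function substituteVariables(template: string, variables: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key: string) =>
    Object.prototype.hasOwnProperty.call(variables, key)
      ? variables[key]!
      : match,
  )
}

// `notesPath` is a made-up variable name used only for this example.
const rendered = substituteVariables(
  'Update {{notesPath}}. Unknown: {{missing}}. Inherited: {{toString}}.',
  { notesPath: '/tmp/session-memory.md' },
)
// rendered === 'Update /tmp/session-memory.md. Unknown: {{missing}}. Inherited: {{toString}}.'
```

Two details follow from the code: unknown placeholders are left in place rather than replaced with an empty string, and the hasOwnProperty check keeps inherited names such as toString from being substituted with a prototype method.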

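20.6 extractMemories: The Persistent Memory System

Besides SessionMemory (session-level memory), Claude Code has a more durable memory system: services/extractMemories/. It runs at the end of each query loop and writes knowledge worth keeping long term to disk.

Four memory types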

export const MEMORY_TYPES = ['user', 'feedback', 'project', 'reference'] as const

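Type	Purpose	Storage location	Example
`user`	User profile	Private directory	The user is a data scientist who cares about observability
`feedback`	Working guidance	Private by default	Do not mock the database in these tests
`project`	Project knowledge	Leans toward team-shared	The Q2 refactoring goal is to migrate to gRPC
`reference`	Reference information	Case by case	The build command is `make build-prod`

Memory file format

Each memory is a standalone Markdown file, with frontmatter carrying its metadata: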

---
type: feedback
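title: Do not use mocks in integration tests
---

Integration tests must connect to a real database and must not use mocks.

**Why:** Last quarter, tests passed against mocks but the production migration failed.

**How to apply:** Tests under tests/integration/ must use testcontainers.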
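
Reading such a file back can be sketched with a small parser. This is an illustration under assumptions: it handles only flat key: value frontmatter like the example above (not full YAML), and parseMemoryFile, MemoryFile, and isMemoryType are hypothetical names rather than functions from the source.

```typescript
const MEMORY_TYPES = ['user', 'feedback', 'project', 'reference'] as const
type MemoryType = (typeof MEMORY_TYPES)[number]

interface MemoryFile {
  type: MemoryType
  title: string
  body: string
}

function isMemoryType(value: string | undefined): value is MemoryType {
  return MEMORY_TYPES.some(candidate => candidate === value)
}

// Hypothetical parser: supports only flat `key: value` frontmatter lines.
function parseMemoryFile(text: string): MemoryFile | null {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(text)
  if (!match) return null

  const meta = new Map<string, string>()
  for (const line of (match[1] ?? '').split('\n')) {
    const separator = line.indexOf(':')
    if (separator > 0) {
      meta.set(line.slice(0, separator).trim(), line.slice(separator + 1).trim())
    }
  }

  const type = meta.get('type')
  const title = meta.get('title')
  // Reject files with a missing title or a type outside the four known values.
  if (!title || !isMemoryType(type)) return null
  return { type, title, body: (match[2] ?? '').trim() }
}

const parsed = parseMemoryFile(
  '---\ntype: feedback\ntitle: Do not use mocks in integration tests\n---\n\n' +
    'Integration tests must connect to a real database.\n',
)
```

Validating the type against MEMORY_TYPES at read time means a hand-edited file with a misspelled type is ignored instead of being treated as an unknown fifth category.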

Memory index: MEMORY.md

All memory files are indexed through MEMORY.md. Each index entry is a one-line description:

- [User Profile](user_role.md) — data scientist focused on observability
- [No DB Mocks](feedback_testing.md) — integration tests must hit real DB
- [Q2 Migration](project_grpc.md) — gRPC migration target for services layer

MEMORY.md is loaded into the system prompt, so it has a soft cap of 200 lines.

extractMemories execution flow

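handleStopHooks (end of query loop)
  │
  ▼
executeExtractMemories()
  │
  ├── not the main agent → skip
  ├── feature gate off → skip
  ├── autoMemory not enabled → skip
  ├── remote mode → skip
  ├── extraction already running → stash the context, wait for the trailing run
  │
  └── runExtraction()
      │
      ├── main agent already wrote memory files → skip (mutual exclusion)
      ├── turn throttling (runs only once every N turns)
      │
      └── runForkedAgent()
          ├── scan existing memory files (avoid duplicates)
          ├── build the extraction prompt
          ├── cap at 5 turns (Read → Write pattern)
          └── permissions: Read/Grep/Glob unrestricted
              + read-only Bash
              + Edit/Write only inside the memory directory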
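
The "stash the context, wait for the trailing run" step in the flow above can be sketched as a coalescing runner. The source does not show this code, so the helper name createCoalescingRunner and the latest-wins policy (a newer stashed context replaces an older one) are assumptions.

```typescript
// Hypothetical helper: the name and the latest-wins policy are assumptions.
function createCoalescingRunner<C>(run: (context: C) => Promise<void>) {
  let running = false
  let stashed: C[] = [] // holds at most one context: the most recent one

  return async function trigger(context: C): Promise<void> {
    if (running) {
      // A run is in flight: remember only the latest context for a trailing run.
      stashed = [context]
      return
    }
    running = true
    try {
      let batch: C[] = [context]
      while (batch.length > 0) {
        stashed = []
        for (const item of batch) await run(item)
        batch = stashed // non-empty only if a trigger arrived during the run
      }
    } finally {
      running = false
    }
  }
}

// Usage: three triggers in quick succession produce two runs, 'a' and 'c'.
const seen: string[] = []
const trigger = createCoalescingRunner(async (context: string): Promise<void> => {
  seen.push(context)
  await new Promise<void>(resolve => setTimeout(resolve, 5))
})
const allSettled = (async () => {
  const first = trigger('a')
  await trigger('b') // stashed, then overwritten by 'c'
  await trigger('c')
  await first // resolves after the trailing run for 'c' completes
})()
```

Under this policy, intermediate contexts are dropped on purpose: the trailing run sees the most recent state of the conversation, which already contains everything the skipped ones would have covered.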

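Relationship to SessionMemory

The two systems complement each other; they do not compete:

┌─────────────────────────────┐        ┌─────────────────────────────┐
│  SessionMemory              │        │  extractMemories            │
│                             │        │                             │
│  Session-level memory       │        │  Cross-session memory       │
│  One Markdown file          │        │  Many topic files           │
│  Structured sections        │        │  Frontmatter metadata       │
│  Supports compact           │        │  Loaded into system prompt  │
│  Checked after API replies  │        │  Checked at end of queries  │
│  → nothing gets lost        │        │  → knowledge builds up      │
└─────────────────────────────┘        └─────────────────────────────┘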

20.7 Waiting and Cleanup

Waiting for an in-progress extraction

export async function waitForSessionMemoryExtraction(): Promise<void> {
  const startTime = Date.now()
  while (extractionStartedAt) {
    // extraction has been running for over 1 minute → treat it as stale, stop waiting
    if (Date.now() - extractionStartedAt > 60000) return
    // total wait has exceeded 15 seconds → time out and move on
    if (Date.now() - startTime > 15000) return
    await sleep(1000)
  }
}

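This function is called before compact so that an in-progress memory extraction finishes before compaction starts, which avoids data loss. Two safety valves prevent it from waiting forever: an extraction older than 60 seconds is treated as stale, and the total wait is capped at 15 seconds.

The drain mechanism (extractMemories)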

drainer = async (timeoutMs = 60_000) => {
  if (inFlightExtractions.size === 0) return
  await Promise.race([
    Promise.all(inFlightExtractions).catch(() => {}),
    new Promise<void>(r => setTimeout(r, timeoutMs).unref()),
  ])
}

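Before the process exits (gracefulShutdown in print.ts), it waits for all in-flight memory extractions to finish, with a soft timeout of 60 seconds. The .unref() call ensures that the timer does not keep the Node.js process from exiting.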

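Exercises

Exercise 1: Threshold simulation

Suppose a conversation's token count follows the sequence [0, 3000, 8000, 11000, 14000, 16000, 22000], with 2 tool calls between each pair of consecutive steps. Under the default configuration, at which steps is session memory extraction triggered?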

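Exercise 2: Permission analysis

What happens if the forked subagent tries to call BashTool to run ls ~/.claude/? Trace the decision path through createMemoryFileCanUseTool and explain why the call is denied. Then compare its permission scope with that of createAutoMemCanUseTool (used by extractMemories).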

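Exercise 3: Template design

Design a custom session memory template for a data analysis project (placed at ~/.claude/session-memory/config/template.md). Consider the particular needs of data analysis: data source information, SQL queries, analytical conclusions, visualization settings, and so on.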

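Exercise 4: Memory classification

Which memory type (user/feedback/project/reference) should each of the following be saved as? Explain your reasoning.

"The user prefers not to use squash merge in PRs"
"A migration from REST to GraphQL is currently in progress"
"The user is the tech lead of the frontend team"
"The project uses pnpm rather than npm as its package manager"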

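Lesson Summary

Key point	Details
Trigger thresholds	Initialization at 10K tokens; an update requires 5K tokens of growth AND (3 tool calls OR a natural breakpoint)
Execution model	Forked subagent, shared prompt cache, separate context
Permission constraints	SessionMemory may only Edit one file; extractMemories allows reads + read-only Bash + writes inside the memory directory
File structure	Fixed section template, 2K-token cap per section, 12K-token cap for the whole file
Four memory types	user (user profile), feedback (working guidance), project (project knowledge), reference (reference information)
Storage format	Standalone MD files + frontmatter + MEMORY.md index
Concurrency control	sequential() wrapper + 15s/60s wait timeouts + drain mechanism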

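Next Lesson Preview

Lesson 21: autoDream, the "Dreaming" System. Explore how Claude Code "dreams" between sessions, consolidating memories scattered across multiple sessions into more durable, better-organized knowledge. Topics include the three gates (24-hour interval + 5 sessions + consolidation lock), the four-phase consolidation flow (Orient→Gather→Consolidate→Prune), and the file lock's PID-based contention detection.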