NanoBot: Never Lose an Important Memory Again (A Vector Memory System Upgrade Guide)


NanoBot's Lightweight Vector Memory System: From Keyword Matching to Semantic Understanding

Project: NanoBot, a personal AI assistant
Topic: lightweight vector memory system rewrite
Goals: zero intrusion, <100MB of dependencies, 5-minute deployment, AI-maintainable
Tech stack: SQLite + sentence-transformers + numpy
Code size: 949 lines of Python (4 modules)
Note: you can feed this article directly to a running NanoBot and let it carry out the migration and replacement on its own.


1. About NanoBot

1.1 What is NanoBot?

NanoBot is an ultra-lightweight personal AI assistant built for individual developers:

┌──────────────────────────────────────────────────┐
│               NanoBot Architecture               │
├──────────────────────────────────────────────────┤
│  CLI layer    │ natural-language command parsing │
├──────────────────────────────────────────────────┤
│  Agent layer  │ LLM calls + tool execution       │
├──────────────────────────────────────────────────┤
│  Memory layer │ memory storage + context mgmt    │  ← target of this rewrite
├──────────────────────────────────────────────────┤
│  Tool layer   │ file ops, search, shell commands │
└──────────────────────────────────────────────────┘

Key characteristics:

  • 🚀 Lightweight: <4,000 lines of code in total, single-file deployment
  • 🔒 Local-first: all data stored locally, no network required
  • 🛠️ Rich tooling: built-in file I/O, command execution, code search, and more
  • 💾 Persistent memory: important information survives across sessions

1.2 The Original Memory System

The original version stores memories in plain Markdown files:

~/.nanobot/workspace/memory/
├── MEMORY.md          # long-term memory (important settings, habits)
├── 2026-02-13.md      # today's conversation log
├── 2026-02-12.md      # yesterday's log
└── ...

Storage example:

## 2026-02-13
- User prefers Python
- Project directory: ~/projects/nanobot

## Important settings
- Editor: VSCode
- Theme: Dark+

2. The Problem: Pain Points of the Original Design

2.1 A Real-World Scenario

Suppose you have this exchange with NanoBot:

User: Remember that I like writing code in VSCode.
NanoBot: Got it, noted.

User (3 days later): Which editor do I usually use?
NanoBot: [searches for the keyword "editor"] No matching memory found.
User: What about the development tools I mentioned?
NanoBot: [searches for the keyword "development tools"] No matching memory found.

The problem: NanoBot stored "I like using VSCode", but it cannot connect that memory to semantically related terms such as "editor" or "development tools".

2.2 Five Core Pain Points

| Pain point | Symptom | Severity |
| --- | --- | --- |
| ① Rigid keyword matching | "send the email" ≠ "email sending" ≠ "email" | 🔴 High |
| ② No priority mechanism | important settings mixed in with small talk | 🔴 High |
| ③ No self-maintenance | files grow without bound and need manual cleanup | 🟡 Medium |
| ④ No semantic linking | no way to relate one memory to another | 🔴 High |
| ⑤ Slow retrieval | full-file scans get slower over time | 🟡 Medium |

2.3 Technical Analysis

# Original retrieval logic (the source sketches this as pseudocode; made concrete here)
from pathlib import Path

def search_memory(keyword: str, memory_dir: Path) -> list:
    results = []
    for file in memory_dir.glob("*.md"):
        if keyword in file.read_text():  # ← literal substring match
            results.append(file)
    return results

Drawbacks:

  • cannot handle synonyms ("editor" vs. "IDE")
  • cannot handle semantic relations ("Python" vs. "programming language")
  • no way to rank results by importance
  • time complexity O(n×m), where n = number of files and m = content length

3. The Proposal: A Lightweight Vector Memory System

3.1 Core Idea

Upgrade memories from plain-text storage to semantic vector storage:

text memory → embedding model → vector (384-dim) → similarity retrieval

Example:
"I like using VSCode"  → [0.23, -0.15, 0.87, ...] (384-dim)
"My editor is VSCode"  → [0.25, -0.12, 0.85, ...] (384-dim)
                                                  ↓
                          cosine similarity = 0.92 (highly related)
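
As a minimal self-contained sketch of the idea (the sentence pair is illustrative, and the exact similarity value will vary by model):

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Encode two paraphrases into 384-dim vectors (one row each)
a, b = model.encode(["I like using VSCode", "My editor is VSCode"])

# Cosine similarity: dot product normalized by the vector lengths
similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"cosine similarity = {similarity:.2f}")  # high for paraphrases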

3.2 Design Principles

Tailored to NanoBot's positioning as a lightweight personal assistant:

| Principle | Description | Constraint |
| --- | --- | --- |
| Ultra-lightweight | total dependencies <100MB | rules out Milvus/PGVector |
| Zero intrusion | only the memory module changes | the Agent Loop stays untouched |
| Local-first | single-file storage | no external services |
| Self-maintaining | the AI manages memories itself | no manual intervention |
| Backward compatible | fallback mechanism | can roll back at any time |

3.3 Architecture

┌──────────────────────────────────────────────────────────┐
│  Application layer: EnhancedMemoryStore (compat wrapper) │  ← 139 lines
├──────────────────────────────────────────────────────────┤
│  Management layer: MemoryOptimizer (4-way self-tuning)   │  ← 298 lines
│  - content: merge duplicates (similarity > 0.9)          │
│  - retrieval: dynamically tune threshold / top-k         │
│  - priority: boost frequently accessed memories          │
│  - storage: scheduled cleanup of stale memories          │
├──────────────────────────────────────────────────────────┤
│  Retrieval layer: VectorMemoryManager (SQLite + search)  │  ← 334 lines
│  - storage: single-file SQLite                           │
│  - retrieval: cosine similarity + time filtering         │
│  - indexing: multi-column indexes for performance        │
├──────────────────────────────────────────────────────────┤
│  Embedding layer: EmbeddingGenerator (model + cache)     │  ← 178 lines
│  - model: all-MiniLM-L6-v2 (80MB, 384-dim)               │
│  - compression: float16 (50% storage savings)            │
│  - caching: disk cache avoids re-encoding                │
└──────────────────────────────────────────────────────────┘

3.4 Technology Comparison

| Option | Dependency size | Deployment | Performance | Verdict |
| --- | --- | --- | --- | --- |
| Milvus | 2GB+ | needs Docker | very high | ❌ too heavy |
| PGVector | 500MB+ | needs PostgreSQL | high | ❌ heavy dependency |
| ChromaDB | 200MB+ | medium | - | ⚠️ tried before, had issues |
| SQLite + lightweight model | 100MB | zero deployment | medium-high | ✅ perfect fit |

4. Detailed Comparison: Original vs. New

4.1 Feature Comparison

| Feature | Original (Markdown) | New (vector) | Improvement |
| --- | --- | --- | --- |
| Matching | literal keyword match | semantic similarity | 🚀 qualitative leap |
| Synonym handling | ❌ unsupported | ✅ linked automatically | e.g. IDE ↔ editor |
| Priority management | ❌ none | ✅ dynamic priority (0-10) | important memories surface first |
| Self-maintenance | ❌ manual cleanup | ✅ self-tuning in 4 dimensions | zero maintenance cost |
| Retrieval speed | O(n×m) linear scan | O(n) with a small constant | ~10× faster |
| Storage footprint | unbounded growth | <100MB, bounded | controllable |
| Fuzzy queries | ❌ unsupported | ✅ natural language | better experience |
| Memory linking | ❌ isolated entries | ✅ similarity links | surfaces hidden relations |

4.2 Scenario Comparison

Scenario 1: synonym retrieval

Original:

User: Remember that I like using PyCharm
     [stored] "User preference: PyCharm"

3 days later
User: What is my IDE?
     [search "IDE"] → no results ❌
User: What about my development tools?
     [search "development tools"] → no results ❌

New:

User: Remember that I like using PyCharm
     [vectorize and store] "I like using PyCharm" → [0.12, -0.34, ...]

3 days later
User: What is my IDE?
     [vectorize query] "What is my IDE?" → [0.15, -0.31, ...]
     [similarity] vs. "I like using PyCharm" = 0.89
     [answer] Your IDE is PyCharm ✅

Scenario 2: protecting important memories

Original:

memory/
├── MEMORY.md (1,000 lines, important and trivial entries mixed together)
├── 2026-01-01.md (stale)
├── 2026-01-02.md (stale)
└── ... (100 files in total)

The problem: important settings get buried, and nobody cleans up the old files.

New:

sqlite> SELECT content, priority, access_count FROM memories
        WHERE priority > 8 ORDER BY access_count DESC;

"User preference: Python"  | 9.5 | 42
"API key: sk-xxx"          | 9.0 | 15
"Project path: ~/work"     | 8.5 | 38

Auto-cleanup: delete memories untouched for 180 days with priority < 2

Scenario 3: self-maintenance

Original: the user has to edit the Markdown files by hand and decide what to keep or delete.

New:

Every 10 operations → light optimization:
  - merge duplicate memories (similarity > 0.9)
  - update access-frequency statistics

Every day at 2 a.m. → deep optimization (a scheduling sketch follows):
  - purge stale memories (>180 days old, low priority)
  - VACUUM the database to reclaim fragmented space
  - tune retrieval parameters based on hit rate
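
A minimal scheduling sketch for the daily pass, using only the standard library and assuming a long-lived NanoBot process (full_optimize is the deep-pass entry point referenced by __init__.py in section 6.4):

import threading
from datetime import datetime, timedelta

def schedule_daily_optimization(optimizer, hour: int = 2):
    """Arm a timer that fires once a day at the given hour."""
    now = datetime.now()
    next_run = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if next_run <= now:
        next_run += timedelta(days=1)  # today's slot has already passed

    def job():
        optimizer.full_optimize()                     # cleanup + VACUUM + parameter tuning
        schedule_daily_optimization(optimizer, hour)  # re-arm for tomorrow

    timer = threading.Timer((next_run - now).total_seconds(), job)
    timer.daemon = True  # don't keep the process alive just for this timer
    timer.start()
    return timer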

4.3 Performance Numbers

| Metric | Original | New | Test environment |
| --- | --- | --- | --- |
| Startup time | 0ms | ~2s (model loading) | i5-8400 |
| Memory write | 5ms | 15ms (incl. encoding) | 1,000-entry dataset |
| Memory retrieval | 50ms | 30ms | 1,000-entry dataset |
| Storage footprint | unbounded | ~80KB per 1,000 entries | float16 compression |
| Model size | 0 | 80MB | all-MiniLM-L6-v2 |
| Total dependencies | 0 | ~100MB | incl. PyTorch |

5. The Four-Layer Architecture in Detail

5.1 Layer 1: Embedding

Responsibility: convert text into semantic vectors.

Model choice: sentence-transformers/all-MiniLM-L6-v2

  • Size: 80MB
  • Dimensions: 384
  • Languages: multilingual support
  • Performance: <10ms per encoded entry

Compression strategy:

raw:        float32 × 384 = 1536 bytes
compressed: float16 × 384 = 768 bytes  (50% savings)

Caching: the hash of the text serves as the cache key, so identical content is never encoded twice.
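
A quick numeric check of the compression claim, in plain numpy and independent of the generator code below (the random vector stands in for a real embedding):

import numpy as np

vec32 = np.random.rand(384).astype(np.float32)  # raw embedding
vec16 = vec32.astype(np.float16)                # compressed copy

print(vec32.nbytes)  # 1536 bytes
print(vec16.nbytes)  # 768 bytes -> 50% savings

# Precision loss barely affects similarity ranking:
a, b = vec32, vec16.astype(np.float32)
print(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))  # ~1.0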

5.2 Layer 2: Storage (VectorManager)

Responsibility: vector storage and semantic retrieval.

SQLite schema:

CREATE TABLE memories (
    id TEXT PRIMARY KEY,             -- unique ID (SHA-256)
    content TEXT NOT NULL,           -- raw text
    category TEXT DEFAULT 'general', -- category
    priority REAL DEFAULT 5.0,       -- dynamic priority, 0-10
    access_count INTEGER DEFAULT 0,  -- access count
    created_at TIMESTAMP,            -- creation time
    last_accessed TIMESTAMP,         -- last access time
    embedding BLOB NOT NULL,         -- float16-compressed vector
    is_deleted INTEGER DEFAULT 0,    -- soft-delete flag
    metadata TEXT                    -- JSON metadata
);

-- Multi-column indexes
CREATE INDEX idx_priority ON memories(priority DESC);
CREATE INDEX idx_category ON memories(category);
CREATE INDEX idx_created ON memories(created_at);
CREATE INDEX idx_deleted ON memories(is_deleted);

Retrieval flow:

  1. Vectorize the query text
  2. Scan all memories and compute cosine similarity
  3. Drop results below the similarity threshold
  4. Sort by similarity with a priority boost (see the sketch below)
  5. Update access counts
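
Step 4's weighted ranking mirrors the sort key used in vector_manager.py below; a worked example with made-up numbers shows how the priority boost breaks near-ties:

# score = similarity + priority / 100
results = [
    {"content": "small talk",       "similarity": 0.85, "priority": 2.0},
    {"content": "important config", "similarity": 0.80, "priority": 9.0},
]
results.sort(key=lambda r: r["similarity"] + r["priority"] / 100, reverse=True)
for r in results:
    print(r["content"], round(r["similarity"] + r["priority"] / 100, 3))
# important config 0.89
# small talk 0.87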

5.3 Layer 3: Optimization (Optimizer)

Responsibility: self-optimization along four dimensions.

| Dimension | What it does | Trigger |
| --- | --- | --- |
| Content | merge duplicates (similarity > 0.9) | every 10 operations |
| Retrieval | dynamically tune threshold / top-k | every 10 conversation rounds |
| Priority | +0.3 priority for frequently accessed memories | after every retrieval |
| Storage | purge stale memories, VACUUM | daily at 2 a.m. |

Duplicate-detection algorithm:

# Compare every unordered pair and collect those with similarity > 0.9
from itertools import combinations

duplicates = []
for mem1, mem2 in combinations(memories, 2):
    sim = cosine_similarity(mem1.embedding, mem2.embedding)
    if sim > 0.9:
        duplicates.append((mem1, mem2, sim))

5.4 Layer 4: Application (EnhancedMemoryStore)

Responsibility: a backward-compatible wrapper.

Design:

  • inherits from the original MemoryStore to keep the API compatible
  • detects at startup whether the vector features are available
  • falls back to file-based storage when they are not

6. Complete Implementation

6.1 embedding.py - Embedding Generator (178 lines)

"""Lightweight embedding generation with local caching and compression."""

import pickle
import hashlib
from pathlib import Path
from typing import Union, List
import numpy as np

try:
    from sentence_transformers import SentenceTransformer
    SENTENCE_TRANSFORMERS_AVAILABLE = True
except ImportError:
    SENTENCE_TRANSFORMERS_AVAILABLE = False


class EmbeddingGenerator:
    """
    Lightweight embedding generator with caching and compression.
    
    Features:
    - Local model caching to avoid repeated downloads
    - float16 compression to save 50% storage
    - Batch processing for efficiency
    """
    
    DEFAULT_MODEL = 'sentence-transformers/all-MiniLM-L6-v2'
    
    def __init__(self, cache_dir: Path = None, model_name: str = None):
        if not SENTENCE_TRANSFORMERS_AVAILABLE:
            raise ImportError(
                "sentence-transformers not installed. "
                "Run: pip install sentence-transformers"
            )
        
        self.model_name = model_name or self.DEFAULT_MODEL
        self.cache_dir = cache_dir or Path.home() / ".nanobot" / "embedding_cache"
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        
        # Force offline mode
        import os
        os.environ['HF_DATASETS_OFFLINE'] = '1'
        os.environ['TRANSFORMERS_OFFLINE'] = '1'
        os.environ['HF_HUB_OFFLINE'] = '1'
        
        self._model = None
        self._embedding_dim = None
    
    @property
    def model(self) -> SentenceTransformer:
        """Lazy load model from local cache only."""
        if self._model is None:
            local_model_path = Path.home() / ".cache" / "torch" / "sentence_transformers" / "sentence-transformers_all-MiniLM-L6-v2"
            
            if local_model_path.exists():
                self._model = SentenceTransformer(str(local_model_path))
            else:
                cache_folder = Path.home() / ".cache" / "torch" / "sentence_transformers"
                self._model = SentenceTransformer(
                    self.model_name,
                    cache_folder=str(cache_folder),
                    local_files_only=True
                )
            self._embedding_dim = self._model.get_sentence_embedding_dimension()
        return self._model
    
    def encode(self, text: Union[str, List[str]], use_cache: bool = True, 
               compress: bool = True) -> np.ndarray:
        """Generate embedding for text with caching and compression."""
        is_single = isinstance(text, str)
        texts = [text] if is_single else text
        
        results = []
        texts_to_encode = []
        indices_to_encode = []
        
        # Check the cache first
        if use_cache:
            for i, t in enumerate(texts):
                cache_key = hashlib.md5(t.encode()).hexdigest()
                cache_path = self.cache_dir / f"{cache_key}.pkl"
                
                if cache_path.exists():
                    with open(cache_path, 'rb') as f:
                        results.append((i, pickle.load(f)))
                else:
                    texts_to_encode.append(t)
                    indices_to_encode.append(i)
        else:
            # Cache disabled: encode every input
            texts_to_encode = list(texts)
            indices_to_encode = list(range(len(texts)))
        
        # Encode missing texts
        if texts_to_encode:
            embeddings = self.model.encode(texts_to_encode, convert_to_numpy=True)
            
            if compress:
                embeddings = embeddings.astype(np.float16)
            
            for idx, t, emb in zip(indices_to_encode, texts_to_encode, embeddings):
                if use_cache:
                    cache_key = hashlib.md5(t.encode()).hexdigest()
                    cache_path = self.cache_dir / f"{cache_key}.pkl"
                    with open(cache_path, 'wb') as f:
                        pickle.dump(emb, f)
                results.append((idx, emb))
        
        # Sort by original index
        results.sort(key=lambda x: x[0])
        embeddings = np.array([r[1] for r in results])
        
        return embeddings[0] if is_single else embeddings
    
    def cosine_similarity(self, a: np.ndarray, b: np.ndarray) -> float:
        """Calculate cosine similarity between two vectors."""
        a = a.astype(np.float32) if a.dtype == np.float16 else a
        b = b.astype(np.float32) if b.dtype == np.float16 else b
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

6.2 vector_manager.py - Vector Manager (334 lines)

"""Vector memory manager with SQLite backend and semantic retrieval."""

import sqlite3
import pickle
import json
import hashlib
from datetime import datetime, timedelta
from pathlib import Path
from typing import List, Dict, Optional, Any
import numpy as np

from loguru import logger
from nanobot.agent.memory.embedding import EmbeddingGenerator


class VectorMemoryManager:
    """
    Lightweight vector memory manager with SQLite backend.
    
    Architecture:
    - Storage Layer: SQLite with custom schema
    - Embedding Layer: Compressed float16 vectors
    - Retrieval Layer: Cosine similarity with dynamic filtering
    - Management Layer: Standardized CRUD interface
    """
    
    def __init__(self, workspace: Path, model_name: str = None):
        self.workspace = workspace
        self.db_path = workspace / "memory.db"
        
        # Initialize embedding generator
        self.embedder = EmbeddingGenerator(
            cache_dir=workspace / ".embedding_cache",
            model_name=model_name
        )
        
        # Initialize database
        self._init_database()
        
        # Retrieval parameters (for self-optimization)
        self.retrieval_params = {
            "top_k": 5,
            "similarity_threshold": 0.6,
            "time_window_days": 365,
        }
        
        logger.info(f"VectorMemoryManager initialized | db={self.db_path}")
    
    def _init_database(self):
        """Initialize SQLite database with optimized schema."""
        with sqlite3.connect(self.db_path) as conn:
            conn.execute("""
                CREATE TABLE IF NOT EXISTS memories (
                    id TEXT PRIMARY KEY,
                    content TEXT NOT NULL,
                    category TEXT DEFAULT 'general',
                    priority REAL DEFAULT 5.0,
                    access_count INTEGER DEFAULT 0,
                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                    last_accessed TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                    embedding BLOB NOT NULL,
                    is_deleted INTEGER DEFAULT 0,
                    metadata TEXT
                )
            """)
            
            # Indexes for fast retrieval
            conn.execute("CREATE INDEX IF NOT EXISTS idx_priority ON memories(priority DESC)")
            conn.execute("CREATE INDEX IF NOT EXISTS idx_category ON memories(category)")
            conn.execute("CREATE INDEX IF NOT EXISTS idx_created ON memories(created_at)")
            conn.execute("CREATE INDEX IF NOT EXISTS idx_deleted ON memories(is_deleted)")
            conn.execute("CREATE INDEX IF NOT EXISTS idx_accessed ON memories(last_accessed)")
            
            # Statistics table for self-optimization
            conn.execute("""
                CREATE TABLE IF NOT EXISTS memory_stats (
                    key TEXT PRIMARY KEY,
                    value TEXT,
                    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                )
            """)
            
            conn.commit()
    
    def _generate_id(self, content: str) -> str:
        """Generate unique ID for memory."""
        return hashlib.sha256(f"{content}:{datetime.now().isoformat()}".encode()).hexdigest()[:16]
    
    def add(self, content: str, category: str = "general", 
            priority: float = 5.0, metadata: Dict = None) -> str:
        """Add a memory entry."""
        memory_id = self._generate_id(content)
        
        # Generate embedding
        embedding = self.embedder.encode(content)
        embedding_blob = pickle.dumps(embedding, protocol=pickle.HIGHEST_PROTOCOL)
        
        # Insert into database
        with sqlite3.connect(self.db_path) as conn:
            conn.execute(
                """INSERT INTO memories 
                   (id, content, category, priority, embedding, metadata)
                   VALUES (?, ?, ?, ?, ?, ?)""",
                (memory_id, content, category, priority, embedding_blob,
                 json.dumps(metadata) if metadata else None)
            )
            conn.commit()
        
        logger.debug(f"Added memory [{memory_id}]: {content[:50]}...")
        return memory_id
    
    def search(self, query: str, top_k: int = None, 
               threshold: float = None, category: str = None) -> List[Dict]:
        """Semantic search using cosine similarity."""
        # Explicit None checks so a caller can legitimately pass 0 or 0.0
        top_k = top_k if top_k is not None else self.retrieval_params["top_k"]
        threshold = threshold if threshold is not None else self.retrieval_params["similarity_threshold"]
        
        # Generate query embedding
        query_embedding = self.embedder.encode(query)
        
        results = []
        
        with sqlite3.connect(self.db_path) as conn:
            if category:
                cursor = conn.execute(
                    """SELECT id, content, category, priority, access_count, 
                              created_at, embedding, metadata
                       FROM memories WHERE is_deleted = 0 AND category = ?""",
                    (category,)
                )
            else:
                cursor = conn.execute(
                    """SELECT id, content, category, priority, access_count, 
                              created_at, embedding, metadata
                       FROM memories WHERE is_deleted = 0"""
                )
            
            for row in cursor:
                memory_id, content, cat, priority, access_count, created_at, emb_blob, meta = row
                
                # Deserialize and calculate similarity
                memory_embedding = pickle.loads(emb_blob)
                similarity = self.embedder.cosine_similarity(query_embedding, memory_embedding)
                
                if similarity >= threshold:
                    results.append({
                        "id": memory_id,
                        "content": content,
                        "category": cat,
                        "priority": priority,
                        "access_count": access_count,
                        "created_at": created_at,
                        "similarity": similarity,
                        "metadata": json.loads(meta) if meta else {}
                    })
        
        # Sort by similarity (with priority boost)
        results.sort(key=lambda x: x["similarity"] + (x["priority"] / 100), reverse=True)
        
        # Update access count
        selected = results[:top_k]
        if selected:
            with sqlite3.connect(self.db_path) as conn:
                now = datetime.now().isoformat()
                for r in selected:
                    conn.execute(
                        """UPDATE memories 
                           SET access_count = access_count + 1, last_accessed = ?
                           WHERE id = ?""",
                        (now, r["id"])
                    )
                conn.commit()
        
        return selected
    
    def get_stats(self) -> Dict[str, Any]:
        """Get memory statistics."""
        with sqlite3.connect(self.db_path) as conn:
            cursor = conn.execute(
                "SELECT COUNT(*), AVG(priority), SUM(access_count) FROM memories WHERE is_deleted = 0"
            )
            total, avg_priority, total_accesses = cursor.fetchone()
            
            cursor = conn.execute(
                "SELECT category, COUNT(*) FROM memories WHERE is_deleted = 0 GROUP BY category"
            )
            categories = {row[0]: row[1] for row in cursor}
            
            return {
                "total_memories": total or 0,
                "average_priority": round(avg_priority, 2) if avg_priority else 0,
                "total_accesses": total_accesses or 0,
                "categories": categories,
                "retrieval_params": self.retrieval_params
            }
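
A hypothetical usage sketch (the workspace path and memory contents are illustrative):

from pathlib import Path

manager = VectorMemoryManager(workspace=Path.home() / ".nanobot" / "workspace")

manager.add("User prefers Python", category="preference", priority=9.0)
manager.add("Project path: ~/work", category="config", priority=8.5)

for hit in manager.search("which language does the user like?"):
    print(f'{hit["similarity"]:.2f}  [{hit["category"]}] {hit["content"]}')

print(manager.get_stats())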

6.3 optimizer.py - Self-Optimizer (298 lines)

"""Self-optimization system for vector memories."""

import json
import pickle  # needed to deserialize embeddings in _get_all_memories_with_embeddings
from datetime import datetime, timedelta
from pathlib import Path
from typing import List, Dict, Any, Tuple
import numpy as np

from loguru import logger
from nanobot.agent.memory.vector_manager import VectorMemoryManager


class MemoryOptimizer:
    """
    Four-dimensional self-optimization system:
    - Content optimization: merge duplicates, clean invalid
    - Retrieval optimization: dynamic top-k and threshold
    - Priority optimization: boost high-value memories
    - Storage optimization: cleanup and compression
    """
    
    def __init__(self, manager: VectorMemoryManager):
        self.manager = manager
        self.operation_count = 0
        
        self.config = {
            "merge_threshold": 0.90,
            "cleanup_days": 180,
            "min_access_for_keep": 1,
            "min_priority_for_keep": 2.0,
            "optimization_interval": 10,
        }
    
    def record_operation(self) -> bool:
        """Record operation and check if optimization needed."""
        self.operation_count += 1
        return self.operation_count >= self.config["optimization_interval"]
    
    def find_duplicates(self) -> List[Tuple[str, str, float]]:
        """Find similar memories that might be duplicates."""
        duplicates = []
        threshold = self.config["merge_threshold"]
        
        # Get all memories with embeddings
        memories = self._get_all_memories_with_embeddings(limit=1000)
        
        if len(memories) < 2:
            return []
        
        checked = set()
        
        for i, mem1 in enumerate(memories):
            for j, mem2 in enumerate(memories):
                if i >= j:
                    continue
                
                pair = tuple(sorted([mem1["id"], mem2["id"]]))
                if pair in checked:
                    continue
                checked.add(pair)
                
                sim = self.manager.embedder.cosine_similarity(
                    mem1["embedding"], mem2["embedding"]
                )
                
                if sim >= threshold:
                    duplicates.append((mem1["id"], mem2["id"], sim))
        
        duplicates.sort(key=lambda x: x[2], reverse=True)
        return duplicates
    
    def merge_memories(self, id1: str, id2: str) -> str:
        """Merge two similar memories."""
        mem1 = self.manager.get_by_id(id1)
        mem2 = self.manager.get_by_id(id2)
        
        if not mem1 or not mem2:
            return None
        
        # Merge content (longer one as base)
        if len(mem2["content"]) > len(mem1["content"]):
            merged_content = f"{mem2['content']}\n\n[Related]: {mem1['content'][:200]}"
            base_priority = max(mem1["priority"], mem2["priority"])
        else:
            merged_content = f"{mem1['content']}\n\n[Related]: {mem2['content'][:200]}"
            base_priority = max(mem1["priority"], mem2["priority"])
        
        # Add merged memory with boosted priority
        merged_id = self.manager.add(
            content=merged_content,
            category=mem1["category"],
            priority=min(10.0, base_priority + 0.5),
            metadata={"merged_from": [id1, id2]}
        )
        
        # Soft delete originals
        self.manager.soft_delete(id1, reason="merged")
        self.manager.soft_delete(id2, reason="merged")
        
        logger.info(f"Merged memories {id1[:8]} and {id2[:8]} into {merged_id[:8]}")
        return merged_id
    
    def light_optimize(self) -> Dict[str, Any]:
        """Light optimization (run every N operations)."""
        results = {
            "type": "light",
            "duplicates_merged": 0,
            "timestamp": datetime.now().isoformat()
        }
        
        # Merge top duplicates (limit to 3 per cycle)
        duplicates = self.find_duplicates()
        for id1, id2, sim in duplicates[:3]:
            if self.merge_memories(id1, id2):
                results["duplicates_merged"] += 1
        
        self.operation_count = 0
        
        logger.info(f"Light optimization complete: {results}")
        return results
    
    def _get_all_memories_with_embeddings(self, limit: int = 1000) -> List[Dict]:
        """Get all memories for optimization analysis."""
        import sqlite3
        
        with sqlite3.connect(self.manager.db_path) as conn:
            cursor = conn.execute(
                """SELECT id, content, category, priority, access_count, 
                          created_at, embedding, metadata
                   FROM memories 
                   WHERE is_deleted = 0
                   LIMIT ?""",
                (limit,)
            )
            
            results = []
            for row in cursor:
                results.append({
                    "id": row[0],
                    "content": row[1],
                    "category": row[2],
                    "priority": row[3],
                    "access_count": row[4],
                    "created_at": row[5],
                    "embedding": pickle.loads(row[6]),
                    "metadata": json.loads(row[7]) if row[7] else {}
                })
            return results
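
A sketch of driving the optimizer by hand (full_optimize is the deep-pass entry point referenced by __init__.py below; it lives in the portion of the file not excerpted here):

optimizer = MemoryOptimizer(manager)  # `manager` from the previous section

# Light pass: merge the top duplicate pairs (at most 3 per cycle)
report = optimizer.light_optimize()
print(report["duplicates_merged"])

# Or let it trigger itself every N operations:
if optimizer.record_operation():
    optimizer.light_optimize()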

6.4 __init__.py - Module Entry Point (139 lines)

"""Memory system for persistent agent memory - Lightweight Vector Edition."""

import os
from pathlib import Path

from nanobot.agent.memory.store import MemoryStore

# Try to import the new lightweight vector memory system
try:
    from nanobot.agent.memory.embedding import EmbeddingGenerator
    from nanobot.agent.memory.vector_manager import VectorMemoryManager
    from nanobot.agent.memory.optimizer import MemoryOptimizer
    
    class EnhancedMemoryStore(MemoryStore):
        """
        Enhanced memory store with SQLite + lightweight embedding.
        
        Provides backward-compatible API while using the new vector system.
        Falls back to file-based storage if embedding model not available.
        """
        
        def __init__(self, workspace):
            """Initialize enhanced memory store."""
            super().__init__(workspace)
            
            # Try to initialize vector manager, fallback to None if model not available
            self._vector_manager = None
            self._optimizer = None
            self._vector_available = False
            
            try:
                self._vector_manager = VectorMemoryManager(workspace)
                self._optimizer = MemoryOptimizer(self._vector_manager)
                self._vector_available = True
            except Exception as e:
                import logging
                logging.getLogger(__name__).warning(
                    f"Vector memory not available: {e}"
                )
        
        @property
        def vector_manager(self):
            return self._vector_manager
        
        @property
        def optimizer(self):
            return self._optimizer
        
        @property
        def _vector_enabled(self):
            return self._vector_available
        
        def add_memory(self, content: str, category: str = "general", 
                      priority: float = 5.0) -> str:
            """Add a memory (new API)."""
            if self._vector_available:
                memory_id = self._vector_manager.add(content, category, priority)
                
                if self._optimizer.record_operation():
                    self._optimizer.light_optimize()
                
                return memory_id
            else:
                return super().append_long_term(f"[{category}] {content}")
        
        def search_similar(self, query: str, n_results: int = 5,
                           category: str = None) -> list:
            """Search similar memories, optionally filtered by category."""
            if self._vector_available:
                return self._vector_manager.search(query, top_k=n_results, category=category)
            else:
                return super().search(query, limit=n_results)
        
        def get_relevant_context(self, query: str, max_memories: int = 3) -> str:
            """Get relevant memory context."""
            if not self._vector_available:
                return super().get_memory_context(query)
            
            results = self._vector_manager.search(query, top_k=max_memories)
            
            if not results:
                return super().get_memory_context(query)
            
            parts = ["## Relevant Memories\n"]
            for r in results:
                parts.append(f"### [{r['category']}] (relevance: {r['similarity']:.2f})")
                parts.append(r["content"])
                parts.append("")
            
            return "\n".join(parts)
        
        def get_memory_stats(self) -> dict:
            """Get memory statistics."""
            if self._vector_available:
                return self._vector_manager.get_stats()
            else:
                return {"status": "fallback_mode"}
        
        def optimize_memories(self, force: bool = False) -> dict:
            """Run memory optimization."""
            if self._vector_available:
                return self._optimizer.full_optimize()
            else:
                return {"status": "unavailable"}
    
    VECTOR_AVAILABLE = True
    
except ImportError:
    EnhancedMemoryStore = MemoryStore
    EmbeddingGenerator = None
    VectorMemoryManager = None
    MemoryOptimizer = None
    VECTOR_AVAILABLE = False

__all__ = [
    "MemoryStore", 
    "EnhancedMemoryStore", 
    "EmbeddingGenerator",
    "VectorMemoryManager",
    "MemoryOptimizer"
]
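
A usage sketch of the wrapper (the workspace path is illustrative; the API degrades gracefully to file storage when the embedding model is unavailable):

from pathlib import Path
from nanobot.agent.memory import EnhancedMemoryStore, VECTOR_AVAILABLE

store = EnhancedMemoryStore(Path.home() / ".nanobot" / "workspace")
print("vector mode available:", VECTOR_AVAILABLE)

store.add_memory("My name is Zhang San", category="identity", priority=9.0)
print(store.search_similar("who am I?", n_results=3))
print(store.get_relevant_context("who am I?"))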

7. Deployment and Maintenance

7.1 Quick Deployment (5 Minutes)

# 1. Install dependencies
pip install sentence-transformers

# 2. Download the model (~80MB)
# Option A: automatic (fetched on first use)
# Option B: manual (if the automatic download fails)
mkdir -p ~/.cache/torch/sentence_transformers/sentence-transformers_all-MiniLM-L6-v2
cd ~/.cache/torch/sentence_transformers/sentence-transformers_all-MiniLM-L6-v2
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/pytorch_model.bin
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/config.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer_config.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/vocab.txt
# (loading the directory directly may also require modules.json,
#  sentence_bert_config.json and 1_Pooling/config.json from the same repo)

# 3. Drop in the code
# Place the four code files from this article into nanobot/agent/memory/

# 4. Verify
nanobot agent -m "Remember: my name is Zhang San"
nanobot agent -m "Who am I?"  # should answer "Zhang San"

7.2 Rollback Plan

To revert to the old design:

# Restore the backed-up store.py (original Markdown storage)
cp backup/memory_store_backup.py nanobot/agent/memory/store.py
rm nanobot/agent/memory/embedding.py
rm nanobot/agent/memory/vector_manager.py
rm nanobot/agent/memory/optimizer.py
# Keep the MemoryStore export in __init__.py

7.3 Maintenance Guide for Future AI Edits

Adding a new optimization rule:

# Add a new method in optimizer.py
def optimize_by_user_feedback(self, memory_id: str, feedback: str):
    """Adjust a memory's priority based on user feedback."""
    if feedback == "important":
        self.manager.update_priority(memory_id, 10.0)
    elif feedback == "irrelevant":
        self.manager.soft_delete(memory_id, reason="user_feedback")

Adding a new memory category:

# Specify a category when adding a memory
memory.add_memory("content", category="code_snippet", priority=8.0)

# Filter by category at retrieval time
results = memory.search_similar("Python", category="code_snippet")

Debugging:

# Inspect memory statistics
stats = memory.get_memory_stats()
print(f"total memories: {stats['total_memories']}")

# Run optimization manually
result = memory.optimize_memories(force=True)
print(f"duplicates merged: {result['duplicates_merged']}")

# Find duplicates
duplicates = memory.optimizer.find_duplicates()

8. Summary

8.1 Key Results

| Dimension | Original | New | Improvement |
| --- | --- | --- | --- |
| Matching | keywords | semantic understanding | 🚀 qualitative leap |
| Maintenance cost | manual | fully automatic | zero cost |
| Storage footprint | unbounded | <100MB | controllable |
| Extensibility | fixed | customizable | |

8.2 Technical Highlights

  1. Ultra-lightweight: a single-file SQLite database + an 80MB model
  2. Zero intrusion: only the memory module changes
  3. Backward compatible: automatic fallback mechanism
  4. AI self-maintenance: automatic optimization along four dimensions
  5. Semantic retrieval: no more keyword matching

8.3 File Inventory

nanobot/agent/memory/
├── __init__.py          # 139 lines - module entry point
├── store.py             # 139 lines - base storage (original, unchanged)
├── embedding.py         # 178 lines - embedding generation
├── vector_manager.py    # 334 lines - vector management
└── optimizer.py         # 298 lines - self-optimizer

Total: 949 lines of new Python code across the four new modules (store.py is unchanged from the original).