Building a Front-End RAG Component from Scratch: A Local Document Q&A Tool with Vue3+Pinia

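Have you run into these problems in front-end work: too much project documentation to find the key information, or users whose experience suffers because they keep having to consult the help docs? RAG (retrieval-augmented generation) addresses both, and it is increasingly used in intelligent front-end components. This article uses Vue3+Pinia and a local vector store to build, from scratch, an embeddable RAG document Q&A component that performs retrieval entirely in the browser, with no backend required.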

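I. Technology Choices and Core Principles

1. Why these technologies?

Vue3+TypeScript: keeps the component type-safe; the Composition API makes it easier to split up complex logic

Pinia: manages vector data and Q&A state, shared across components

@xenova/transformers: runs a BERT model locally in the browser to convert text into vectors (no external API calls)

lance-vector: a lightweight front-end vector database with fast nearest-neighbor search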

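2. Core flow of RAG on the front end

Document processing: split the Markdown document into short sentences and convert them into vectors for storage

User query: when the user enters a question, the front end converts it into a vector

Vector retrieval: find the most relevant document chunks in the local vector store

Answer generation: concatenate the retrieved chunks; an LLM can optionally turn them into a natural-language answer (this article implements the basic version)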
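To make the retrieval step concrete, here is a minimal sketch of brute-force nearest-neighbor search with cosine similarity. It is only a stand-in for what a vector store does internally, not the lance-vector implementation; the names `StoredChunk`, `cosineSimilarity`, and `searchTopK` are hypothetical and introduced purely for illustration.

```typescript
// A tiny in-memory stand-in for the vector store (hypothetical, for illustration only).
interface StoredChunk {
  text: string;
  vector: number[];
}

// Cosine similarity between two vectors of equal length (1 = same direction).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vector dimensions differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as unrelated
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force search: score every chunk against the query and keep the best topK.
function searchTopK(query: number[], chunks: StoredChunk[], topK = 5) {
  return chunks
    .map(chunk => ({ text: chunk.text, similarity: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, topK);
}
```

A linear scan like this is usually adequate for a few thousand chunks; a dedicated vector store mainly pays off with larger collections or when persistence is needed.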

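II. Environment Setup and Dependency Installation

# Create a Vue project (skip this if you already have one)
npm create vue@latest frontend-rag-component -- --typescript --pinia
cd frontend-rag-component
# Install the core dependencies
npm install @xenova/transformers@2.17.2 lance-vector@0.1.15 markdown-it@14.1.0
# Type definitions for markdown-it (needed for TypeScript)
npm install -D @types/markdown-it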

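III. Core Component Implementation

1. Vector helper utilities (utils/vectorHelper.ts)

Start with the core utilities for converting text to vectors and searching them: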

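import { AutoModel, AutoTokenizer } from '@xenova/transformers';
import type { PreTrainedModel, PreTrainedTokenizer } from '@xenova/transformers';
import { LanceVectorDB } from 'lance-vector';
// Module-level singletons (avoid loading the model more than once)
let model: PreTrainedModel | null = null;
let tokenizer: PreTrainedTokenizer | null = null;
let vectorDB: LanceVectorDB | null = null;
// Initialize the model and the vector store
export async function initVectorTools() {
  // Load the BERT base model (roughly 400 MB unquantized; the first load is slow)
  if (!tokenizer) tokenizer = await AutoTokenizer.from_pretrained('Xenova/bert-base-uncased');
  if (!model) model = await AutoModel.from_pretrained('Xenova/bert-base-uncased');
  // Create the vector store (dimension matches BERT's hidden size: 768)
  vectorDB = await LanceVectorDB.create({
    dim: 768,
    path: 'local-rag-vector-db', // local storage path
  });
  return { tokenizer, model, vectorDB };
}
// Convert text to a vector
export async function textToVector(text: string): Promise<number[]> {
  if (!model || !tokenizer) throw new Error('Initialize the vector tools first');

  // Tokenize the text and run the model
  const inputs = await tokenizer(text, { padding: true, truncation: true });
  const outputs = await model(inputs);

  // Use the [CLS] token's output as the text vector (a common, simple choice for BERT).
  // last_hidden_state has shape [1, tokens, hidden] stored as a flat row-major buffer,
  // so its first `hidden` values are the [CLS] vector.
  const { data, dims } = outputs.last_hidden_state;
  const hiddenSize = dims[dims.length - 1];
  return Array.from(data.slice(0, hiddenSize) as Float32Array);
}
// Vector search (returns the top 5 matches by default)
export async function searchSimilarVectors(queryVector: number[], topK = 5) {
  if (!vectorDB) throw new Error('The vector store is not initialized');
  const results = await vectorDB.search(queryVector, { topK });
  return results.map(item => ({
    score: item.score, // distance score (smaller means more similar)
    text: item.metadata?.text as string // the stored document text
  }));
}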
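The [CLS] vector is only one way to turn token outputs into a single text vector. Another commonly used option is mean pooling: average the embeddings of the real (non-padding) tokens, then L2-normalize the result. The sketch below works on plain numeric arrays so it can be read in isolation; `meanPool` and `l2Normalize` are hypothetical helper names, and it assumes the attention mask has already been converted to ordinary numbers.

```typescript
// Mean-pool token embeddings into one sentence vector.
// `tokenEmbeddings` is a flat, row-major buffer of shape [tokenCount, hiddenSize];
// `attentionMask` holds 1 for real tokens and 0 for padding (assumed to be plain numbers).
function meanPool(
  tokenEmbeddings: ArrayLike<number>,
  attentionMask: ArrayLike<number>,
  hiddenSize: number
): number[] {
  const tokenCount = attentionMask.length;
  if (tokenEmbeddings.length !== tokenCount * hiddenSize) {
    throw new Error('Embedding buffer does not match mask length and hidden size');
  }
  const pooled: number[] = [];
  for (let h = 0; h < hiddenSize; h++) pooled.push(0);

  let realTokens = 0;
  for (let t = 0; t < tokenCount; t++) {
    if (attentionMask[t] === 0) continue; // skip padding positions
    realTokens++;
    for (let h = 0; h < hiddenSize; h++) {
      pooled[h] += tokenEmbeddings[t * hiddenSize + h];
    }
  }
  if (realTokens === 0) return pooled; // nothing to average
  return pooled.map(value => value / realTokens);
}

// L2-normalize a vector so that a dot product equals cosine similarity.
function l2Normalize(vector: number[]): number[] {
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? vector.slice() : vector.map(x => x / norm);
}
```

Whether pooling beats [CLS] depends on the model; it is worth comparing both on a handful of real questions before switching.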

2. Pinia state management (stores/ragStore.ts)

The store manages the document data, the vector store state, and the Q&A history:

import { defineStore } from 'pinia';
import { initVectorTools, textToVector, searchSimilarVectors } from '@/utils/vectorHelper';
import MarkdownIt from 'markdown-it';
const md = new MarkdownIt();
export const useRagStore = defineStore('rag', {
  state: () => ({
    isReady: false, // whether the vector tools have finished initializing
    documentChunks: [] as string[], // document chunks
    history: [] as { question: string; answer: string }[], // Q&A history
  }),
  actions: {
    // Initialize the vector tools and load the document
    async initRag(documentContent: string) {
      this.isReady = false;
      try {
        // 1. Initialize the embedding model and the vector store
        const { vectorDB } = await initVectorTools();

        // 2. Process the Markdown document (split it into sentences)
        const html = md.render(documentContent);
        const text = html.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' '); // extract plain text
        // Split on sentence-ending punctuation (Chinese marks, or English marks followed
        // by a space) and drop very short chunks
        this.documentChunks = text
          .split(/[。！？；]|[.!?;]\s+/)
          .map(chunk => chunk.trim())
          .filter(chunk => chunk.length > 10);

        // 3. Embed each chunk and add it to the vector store
        for (const chunk of this.documentChunks) {
          const vector = await textToVector(chunk);
          await vectorDB.add(vector, { text: chunk }); // store the vector with its text as metadata
        }

        this.isReady = true;
      } catch (error) {
        console.error('RAG initialization failed:', error);
        throw error;
      }
    },
    // Handle a user query
    async handleQuery(question: string) {
      if (!this.isReady) throw new Error('The RAG tools are not ready');

      // 1. Convert the query to a vector
      const queryVector = await textToVector(question);

      // 2. Retrieve similar document chunks
      const similarChunks = await searchSimilarVectors(queryVector);

      // 3. Build the answer (basic version: concatenate the chunks; an advanced version could call an LLM)
      const answer = `Found the following relevant information:\n\n${similarChunks.map((item, i) =>
        `${i + 1}. ${item.text} (similarity: ${(1 - item.score).toFixed(2)})`
      ).join('\n\n')}`;

      // 4. Record the exchange in the history
      this.history.unshift({ question, answer });
      return answer;
    }
  }
});
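The basic version above simply concatenates the retrieved chunks. If an LLM is connected later, the same chunks would typically be packed into a prompt first. The following is a minimal sketch of that packing step only; `buildRagPrompt`, `RetrievedChunk`, and the `maxChars` budget are hypothetical, and the prompt wording is just one plausible choice.

```typescript
// Shape of a retrieved chunk, mirroring what searchSimilarVectors returns.
interface RetrievedChunk {
  text: string;
  score: number;
}

// Pack retrieved chunks into a prompt, stopping once the character budget is used up.
function buildRagPrompt(question: string, chunks: RetrievedChunk[], maxChars = 2000): string {
  const context: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    if (used + chunk.text.length > maxChars) break; // keep the prompt within budget
    context.push(`[${context.length + 1}] ${chunk.text}`);
    used += chunk.text.length;
  }
  return [
    'Answer the question using only the context below.',
    'If the context does not contain the answer, say so.',
    '',
    'Context:',
    ...context,
    '',
    `Question: ${question}`,
  ].join('\n');
}
```

Numbering the chunks makes it possible to ask the model to cite which chunk supports its answer, which helps users verify the result against the original document.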

3. RAG Q&A component (components/RagQaComponent.vue)

The component provides the user interface: document upload, question input, and result display:

<template>
  <div class="rag-qa-container">
    <!-- Document upload area -->
    <div class="document-upload">
      <h3>1. Upload a Markdown document</h3>
      <input 
        type="file" 
        accept=".md" 
        @change="handleFileUpload"
        class="file-input"
      >
      <p v-if="isLoading" class="loading-text">Loading the model... (about 30 seconds on first load)</p>
    </div>
    <!-- Q&A area -->
    <div class="qa-area" v-if="ragStore.isReady">
      <h3>2. Ask a question about the document</h3>
      <div class="input-group">
        <input
          v-model="question"
          type="text"
          placeholder="e.g. How is vector retrieval implemented?"
          class="question-input"
        >
        <button @click="handleSubmit" class="submit-btn">Search</button>
      </div>
      <!-- History -->
      <div class="history-container">
        <h4>Q&A history</h4>
        <div 
          class="history-item" 
          v-for="(item, index) in ragStore.history" 
          :key="index"
        >
          <div class="question">Q: {{ item.question }}</div>
          <div class="answer">A: {{ item.answer }}</div>
        </div>
      </div>
    </div>
  </div>
</template>
<script setup lang="ts">
import { ref } from 'vue';
import { useRagStore } from '@/stores/ragStore';
import { readAsText } from '@/utils/fileHelper'; // FileReader wrapper (see utils/fileHelper.ts)
const ragStore = useRagStore();
const question = ref('');
const isLoading = ref(false); // true only while a document is being loaded and indexed
// Handle the Markdown file upload
const handleFileUpload = async (e: Event) => {
  const target = e.target as HTMLInputElement;
  const file = target.files?.[0];
  if (!file) return;
  isLoading.value = true;
  try {
    const fileContent = await readAsText(file);
    await ragStore.initRag(fileContent); // initialize RAG and index the document
  } catch (error) {
    console.error('Failed to process the document:', error);
  } finally {
    isLoading.value = false;
  }
};
// Submit a query
const handleSubmit = async () => {
  if (!question.value.trim()) return;
  await ragStore.handleQuery(question.value.trim());
  question.value = ''; // clear the input box
};
</script>
<style scoped>
.rag-qa-container {
  max-width: 1000px;
  margin: 20px auto;
  padding: 0 20px;
}
.document-upload, .qa-area {
  margin-bottom: 30px;
  padding: 20px;
  border: 1px solid #e5e7eb;
  border-radius: 8px;
}
.loading-text {
  color: #6b7280;
  margin-top: 10px;
}
.input-group {
  display: flex;
  gap: 10px;
  margin-top: 10px;
}
.question-input {
  flex: 1;
  padding: 10px 15px;
  border: 1px solid #e5e7eb;
  border-radius: 4px;
  font-size: 14px;
}
.submit-btn {
  padding: 10px 20px;
  background-color: #3b82f6;
  color: white;
  border: none;
  border-radius: 4px;
  cursor: pointer;
}
.history-container {
  margin-top: 20px;
}
.history-item {
  margin-top: 15px;
  padding: 10px;
  border-left: 3px solid #3b82f6;
  background-color: #f9fafb;
}
.question {
  font-weight: 600;
  margin-bottom: 5px;
}
.answer {
  color: #4b5563;
  white-space: pre-wrap; /* keep the line breaks between retrieved chunks */
}
</style>

4. File helper (utils/fileHelper.ts)

A thin wrapper around FileReader for reading the Markdown file:

export function readAsText(file: File): Promise<string> {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result as string);
    reader.onerror = () => reject(reader.error);
    reader.readAsText(file);
  });
}

IV. Using and Optimizing the Component

1. Add the component to a page (views/RagDemo.vue)

<template>
  <div class="rag-demo-page">
    <h2>Front-End RAG Document Q&A Demo</h2>
    <RagQaComponent />
  </div>
</template>
<script setup lang="ts">
import RagQaComponent from '@/components/RagQaComponent.vue';
</script>
<style scoped>
.rag-demo-page {
  padding: 20px;
}
h2 {
  text-align: center;
  margin-bottom: 30px;
  color: #1f2937;
}
</style>

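2. Key optimizations

Model loading: the first model load is slow, so add a loading animation and a caching mechanism (store the model files in IndexedDB)

Document splitting strategy: this article splits naively on sentence punctuation; a better approach is to split by heading level (for example, treat each ### section as its own chunk)

Performance: move vector store initialization and document processing into a Web Worker so they do not block the main thread

Feature extension: integrate the front-end SDK of an LLM such as Claude 3 or GPT-4o to generate more natural answers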
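As an illustration of the heading-based splitting mentioned in the list above, here is a minimal sketch that cuts raw Markdown into sections at headings up to a given level. `splitByHeadings` and `MarkdownSection` are hypothetical names; the function works on the Markdown source before rendering and only recognizes `#`-style headings, which is an assumption about how the documents are written.

```typescript
interface MarkdownSection {
  heading: string; // empty for text that appears before the first heading
  body: string;
}

// Split Markdown into sections at headings of level 1..maxLevel.
function splitByHeadings(markdown: string, maxLevel = 3): MarkdownSection[] {
  const headingPattern = new RegExp(`^#{1,${maxLevel}}\\s+(.*)$`);
  const fencePattern = /^\s*(`{3,}|~{3,})/; // opening or closing code fence
  const sections: MarkdownSection[] = [];
  let current: MarkdownSection = { heading: '', body: '' };
  let inFence = false;

  for (const line of markdown.split(/\r?\n/)) {
    if (fencePattern.test(line)) inFence = !inFence;
    // A '#' inside a fenced code block is a comment, not a heading.
    const match = inFence ? null : headingPattern.exec(line);
    if (match) {
      if (current.heading || current.body.trim()) sections.push(current);
      current = { heading: match[1].trim(), body: '' };
    } else {
      current.body += line + '\n';
    }
  }
  if (current.heading || current.body.trim()) sections.push(current);
  return sections.map(s => ({ heading: s.heading, body: s.body.trim() }));
}
```

Each section can then be embedded as `heading + body`, so that a question mentioning the section title still matches even when the body text uses different wording.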

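V. Common Problems and Solutions

Model fails to load?

Check the network: the first load downloads the model files, so make sure the connection is stable

Browser compatibility: ES6 Modules and WebAssembly support is required (the latest Chrome or Firefox is recommended)

Retrieval results are inaccurate?

Improve document splitting: make sure each chunk covers a single topic

Switch to a larger model: try Xenova/bert-large-uncased (higher accuracy, but a larger download)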
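One splitting refinement that is often tried when single sentences carry too little context is a sliding window: group a few consecutive sentences into each chunk and let neighboring chunks overlap. The sketch below assumes the document has already been split into sentences; `windowSentences` and its default sizes are hypothetical and would need tuning per document.

```typescript
// Group sentences into overlapping windows so each chunk keeps some surrounding context.
function windowSentences(sentences: string[], windowSize = 3, overlap = 1): string[] {
  if (windowSize < 1) throw new Error('windowSize must be at least 1');
  if (overlap < 0 || overlap >= windowSize) {
    throw new Error('overlap must be at least 0 and smaller than windowSize');
  }
  const step = windowSize - overlap; // how far the window advances each time
  const chunks: string[] = [];
  for (let start = 0; start < sentences.length; start += step) {
    chunks.push(sentences.slice(start, start + windowSize).join(' '));
    if (start + windowSize >= sentences.length) break; // the window reached the end
  }
  return chunks;
}
```

Larger windows give the embedding more context but blur topic boundaries, so the window should stay small enough that a chunk still covers one topic.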

Local storage is full?

Clear the vector store: call vectorDB?.drop() to remove the data

Limit document size: keep each document under 10 MB
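The size limit can be enforced before the file is read at all. This is a small sketch of such a check; `checkDocumentSize`, `SizeCheck`, and `MAX_DOCUMENT_BYTES` are hypothetical names, and the 10 MB default simply mirrors the recommendation above.

```typescript
// Default limit, mirroring the 10 MB recommendation (an assumption, adjust as needed).
const MAX_DOCUMENT_BYTES = 10 * 1024 * 1024;

interface SizeCheck {
  ok: boolean;
  message: string; // empty when the document is within the limit
}

// Decide whether a document is small enough to index in the browser.
function checkDocumentSize(sizeInBytes: number, limit = MAX_DOCUMENT_BYTES): SizeCheck {
  if (sizeInBytes <= limit) return { ok: true, message: '' };
  const sizeMb = (sizeInBytes / (1024 * 1024)).toFixed(1);
  const limitMb = (limit / (1024 * 1024)).toFixed(0);
  return { ok: false, message: `Document is ${sizeMb} MB; the limit is ${limitMb} MB.` };
}
```

In the upload handler this could be called with `file.size` before `readAsText`, showing `message` to the user when `ok` is false.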

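With this component, you can embed RAG capabilities into any Vue project, whether as the help center of an admin dashboard or as a product's documentation Q&A feature. It can also serve as a starting point for more complex features, such as multi-document retrieval or integration with a cloud-hosted LLM.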