RAG: Retrieval-Augmented Generation

RAG stands for Retrieval-Augmented Generation.

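Spring official site: docs.spring.io/spring-ai/r…

Spring AI Alibaba official site: java2ai.com/docs/framew…

Requirement: an AI-powered operations assistant. Given an error code, it explains the exception, helping operations staff locate problems and maintain the system.

Tech stack: Spring AI + Alibaba Cloud Bailian embedding model text-embedding-v4 + Redis Stack as the vector database + DeepSeek, combined to implement RAG.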

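Limitations of LLMs

  • An LLM's knowledge is not real-time; it is not updated after training.
  • An LLM may not know your private domain or business knowledge.
  • An LLM sometimes hallucinates, producing answers that look plausible but are actually wrong.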

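What RAG is

RAG (Retrieval-Augmented Generation) is an AI architecture that combines information retrieval with large language models (LLMs) to generate answers that are more accurate, reliable, and grounded in facts.

Put simply, RAG finds relevant information in your own data and injects it into the prompt before the prompt is sent to the LLM. The LLM then receives (hopefully) relevant information and answers based on it, which lowers the chance of hallucination.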
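The "inject into the prompt" step can be sketched in plain Java. This is only an illustration under assumptions: the class name, the method name, and the prompt wording are hypothetical and are not what Spring AI generates internally.

```java
import java.util.List;

class PromptAugmentationSketch {

    // Builds the final prompt: an instruction, the retrieved snippets, then the question.
    // The wording of the template is an assumption chosen for illustration.
    static String augment(String question, List<String> retrievedSnippets) {
        StringBuilder sb = new StringBuilder("Answer using only the context below.\n");
        sb.append("Context:\n");
        for (String snippet : retrievedSnippets) {
            sb.append("- ").append(snippet).append('\n');
        }
        sb.append("Question: ").append(question);
        return sb.toString();
    }
}
```

Calling augment with the question and the snippets returned by retrieval yields the text that is actually sent to the LLM, so the model answers from the supplied context instead of from memory alone.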

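What RAG can do

Large language models (LLMs) are powerful, but they have two key limitations:

  • Limited context: they cannot ingest an entire corpus at once.
  • Static knowledge: their training data is frozen at a point in time.

Retrieval addresses both by fetching relevant external knowledge at query time. This is the foundation of retrieval-augmented generation (RAG): augmenting the LLM's answer with context-specific information.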

How RAG works

Two stages: indexing and retrieval.

Each component is modular: you can swap the loader, splitter, embedding model, or vector store without rewriting the application logic.
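The two stages can be sketched with an in-memory index. This is a toy under stated assumptions: the "embedding" here is just a word-count map and the store is a pair of lists, standing in for text-embedding-v4 and Redis Stack. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TinyRagIndexSketch {

    private final List<String> chunks = new ArrayList<>();
    private final List<Map<String, Integer>> vectors = new ArrayList<>();

    // Toy "embedding": lower-cased word counts. A real system calls an embedding model here.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9_]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse count vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = Math.sqrt(a.values().stream().mapToDouble(v -> v * v).sum());
        double normB = Math.sqrt(b.values().stream().mapToDouble(v -> v * v).sum());
        return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
    }

    // Indexing stage: embed a chunk and store it next to its text.
    void index(String chunk) {
        chunks.add(chunk);
        vectors.add(embed(chunk));
    }

    // Retrieval stage: embed the query and return the topK most similar chunks.
    List<String> retrieve(String query, int topK) {
        Map<String, Integer> q = embed(query);
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < chunks.size(); i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingDouble((Integer i) -> -cosine(q, vectors.get(i))));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(topK, order.size()); i++) {
            result.add(chunks.get(order.get(i)));
        }
        return result;
    }
}
```

Because embed and the storage lists are the only places that know how vectors are produced and kept, either can be replaced without touching retrieve, which is the modularity the paragraph above describes.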

Development steps

Create the module

POM

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>com.miao</groupId>
        <artifactId>SpringAIAlibaba-test01</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>

    <artifactId>SAA-09RAG</artifactId>

    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <!--  DashScope model service: Alibaba's own API protocol, the counterpart of the OpenAI protocol  -->
        <dependency>
            <groupId>com.alibaba.cloud.ai</groupId>
            <artifactId>spring-ai-alibaba-starter-dashscope</artifactId>
            <version>1.0.0.2</version>
        </dependency>

        <!--   Vector database dependency     -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-vector-store-redis</artifactId>
        </dependency>

        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.18.38</version>
        </dependency>
    </dependencies>
</project>

YML file

server:
  port: 8082
  servlet:
    encoding:
      enabled: true
      force: true
      charset: UTF-8

spring:
  application:
    name: SAA-09RAG
  ai:
    dashscope:
      api-key: ${qwen-api-key}
      chat:
        options:
          model: qwen3-vl-flash
      embedding:
        options:
          model: text-embedding-v4
    vectorstore:
      redis:
        initialize-schema: true
        index-name: custom-index
        prefix: custom-index
  data:
    redis:
      host: redis-16002.c1.us-east1-2.gce.cloud.redislabs.com
      port: 16002
      password: password

Main application class

package com.miao;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SAA09RAGApplication {
    public static void main(String[] args) {
        SpringApplication.run(SAA09RAGApplication.class, args);
    }
}

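Business classes

Provide the error-code file to be stored in the vector database

Place the error-code file ops_error_code.txt on the application classpath (for example under src/main/resources); the initialization class loads it as classpath:ops_error_code.txt.

AUTH_001: Authentication module - incorrect username or password
PAY_002: Payment module - insufficient balance
DB_003: Database - connection timeout
404: Resource not found (HTTP)
500: Internal server error (HTTP)
ERR_CONNECTION_TIMEOUT: Connection timeout (network library)
E001: User does not exist (custom business code)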
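Each line of the file pairs a code with its description, separated by a colon. As a rough sketch of that format (not part of the project; the class and method names are hypothetical), the file can be parsed into one entry per code, which is also a natural unit if you later want one vector-store document per error code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ErrorCodeFileSketch {

    // Parses "CODE: description" lines into an ordered map.
    // Accepts both the ASCII colon and the full-width colon (U+FF1A) as separator.
    static Map<String, String> parse(String fileContent) {
        Map<String, String> codes = new LinkedHashMap<>();
        for (String line : fileContent.split("\\R")) {
            String normalized = line.replace('\uFF1A', ':').trim();
            int sep = normalized.indexOf(':');
            if (sep <= 0) {
                continue; // skip blank lines and lines without a code
            }
            String code = normalized.substring(0, sep).trim();
            String description = normalized.substring(sep + 1).trim();
            codes.put(code, description);
        }
        return codes;
    }
}
```

Splitting on the first colon only means a description may itself contain colons without being truncated.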

Configuration classes

1. Multi-model configuration

package com.miao.config;

import com.alibaba.cloud.ai.dashscope.api.DashScopeApi;
import com.alibaba.cloud.ai.dashscope.chat.DashScopeChatModel;
import com.alibaba.cloud.ai.dashscope.chat.DashScopeChatOptions;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.prompt.ChatOptions;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SaaLLMConfig {

    private static final String QWEN_MODEL = "qwen3-max";
    private static final String DEEPSEEK_MODEL = "deepseek-v3.1";

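    // Currently unused by the beans below; the empty default keeps startup
    // working when the property is not set in application.yml
    @Value("${spring.ai.dashscope.url:}")
    private String qwenUrl;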

    @Bean(name = "deepseek")
    public ChatModel deepSeek() {
        return DashScopeChatModel.builder()
                .dashScopeApi(DashScopeApi.builder()
                        .apiKey(System.getenv("qwen-api-key"))
                        .build())
                .defaultOptions(DashScopeChatOptions.builder()
                        .withModel(DEEPSEEK_MODEL)
                        .build())
                .build();
    }

    @Bean(name = "qwen")
    public ChatModel qwen() {
        return DashScopeChatModel.builder()
                .dashScopeApi(DashScopeApi.builder()
                        .apiKey(System.getenv("qwen-api-key"))
                        .build())
                .defaultOptions(DashScopeChatOptions.builder().withModel(QWEN_MODEL).build())
                .build();
    }

    @Bean(name = "deepseekChatClient")
    public ChatClient deepSeekChatClient(@Qualifier("deepseek") ChatModel deepseekModel) {
        return ChatClient.builder(deepseekModel)
                .defaultOptions(ChatOptions.builder().model(DEEPSEEK_MODEL).build())
                .build();
    }

    @Bean(name = "qwenChatClient")
    public ChatClient qwenChatClient(@Qualifier("qwen") ChatModel qwenModel) {
        return ChatClient.builder(qwenModel)
                .defaultOptions(ChatOptions.builder().model(QWEN_MODEL).build())
                .build();
    }
}

2. Redis configuration

package com.miao.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {

    @Bean
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory redisConnectionFactory) {
        RedisTemplate<String, Object> redisTemplate = new RedisTemplate<>();
        // Additional configuration can be added here if necessary
        redisTemplate.setConnectionFactory(redisConnectionFactory);

        // Serialize keys as strings
        redisTemplate.setKeySerializer(new StringRedisSerializer());
        // Serialize values as JSON
        redisTemplate.setValueSerializer(new GenericJackson2JsonRedisSerializer());

        redisTemplate.setHashKeySerializer(new StringRedisSerializer());
        redisTemplate.setHashValueSerializer(new GenericJackson2JsonRedisSerializer());
        return redisTemplate;
    }
}

3. Load the index data into the vector database at startup

package com.miao.config;

import jakarta.annotation.PostConstruct;
import jakarta.annotation.Resource;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.core.RedisTemplate;

import java.nio.charset.Charset;
import java.util.List;

@Slf4j
@Configuration
public class InitVectorDatabaseConfig {

    @Resource
    private VectorStore vectorStore;

    @Value("classpath:ops_error_code.txt")
    private org.springframework.core.io.Resource opsErrorCodeFile;

    @Resource
    private RedisTemplate<String, String> redisTemplate;

    @PostConstruct
    public void init() {
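        // 1. Read the file
        TextReader textReader = new TextReader(opsErrorCodeFile);
        // Read as UTF-8 explicitly; Charset.defaultCharset() depends on the platform
        textReader.setCharset(java.nio.charset.StandardCharsets.UTF_8);

        // 2. Split the text into chunks; the vector store embeds them when they are added
        List<Document> list = new TokenTextSplitter().transform(textReader.read());

        // 3. Naive write to Redis Stack: this re-inserts the same vectors on every restart
        //        vectorStore.add(list);

        // 4. Deduplicate
        // Use the source file name as the marker key (it is sometimes MD5-hashed first)
        String sourceMetadata = (String) textReader.getCustomMetadata().get("source");

        // setIfAbsent returns true only if the key did not exist yet
        Boolean firstLoad = redisTemplate.opsForValue().setIfAbsent(sourceMetadata, "1");
        if (Boolean.TRUE.equals(firstLoad)) {
            // Not loaded before, so insert the data
            vectorStore.add(list);
        } else {
            log.info("Data already loaded; skipping duplicate insert");
        }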

    }
}
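The deduplication above keys on the file name, so an edited file with the same name would not be reloaded. One possible variation, sketched here as an assumption rather than as the project's method, keys on a SHA-256 digest of the file content instead. The in-memory Set is a stand-in for the Redis setIfAbsent call, and the key prefix rag:ingested: is hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class IngestDedupSketch {

    // Stand-in for Redis: Set.add returns true only on first insertion, like setIfAbsent.
    private final Set<String> seenKeys = ConcurrentHashMap.newKeySet();

    // Builds a marker key from the file name plus a hex SHA-256 digest of the content.
    static String contentKey(String fileName, String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            String hex = String.format("%064x", new BigInteger(1, digest));
            return "rag:ingested:" + fileName + ":" + hex;
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    // Returns true when this exact content has not been ingested before.
    boolean shouldIngest(String fileName, String content) {
        return seenKeys.add(contentKey(fileName, content));
    }
}
```

With this key, restarting with an unchanged file skips the insert, while a changed file is ingested again. Note that the old vectors would then still need to be removed from the store, which this sketch does not cover.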

Controller class

package com.miao.controller;

import jakarta.annotation.Resource;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.rag.advisor.RetrievalAugmentationAdvisor;
import org.springframework.ai.rag.retrieval.search.VectorStoreDocumentRetriever;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

import java.util.List;

@RestController
public class RAGController {

    @Resource(name = "deepseekChatClient")
    private ChatClient deepSeekClient;

    @Resource
    private VectorStore vectorStore;


    @GetMapping("/rag")
    public Flux<String> rag(@RequestParam(name = "msg") String msg) {
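        String sysMsg = "You are a professional operations expert. Given an error code, explain the corresponding fault; if no matching information is found, reply that you cannot find the information.";
        // 1. Vector retrieval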
        RetrievalAugmentationAdvisor advisor = RetrievalAugmentationAdvisor
                .builder()
                .documentRetriever(VectorStoreDocumentRetriever.builder().vectorStore(vectorStore).build())
                .build();

        return deepSeekClient.prompt()
                .system(sysMsg)
                .user(msg)
                .advisors(advisor)
                .stream()
                .content();
    }
}
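To try the endpoint, a small JDK-only client is enough. The sketch below assumes the service runs locally on port 8082 as configured in the YML file; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class RagEndpointClientSketch {

    // Builds the /rag URL, URL-encoding the msg query parameter.
    static URI ragUri(String baseUrl, String msg) {
        return URI.create(baseUrl + "/rag?msg=" + URLEncoder.encode(msg, StandardCharsets.UTF_8));
    }

    // Sends the question and returns the whole response body once the stream has finished.
    static String ask(String baseUrl, String msg) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(ragUri(baseUrl, msg)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

For example, ask("http://localhost:8082", "E001") should return an explanation based on the E001 entry in the indexed file, while a code that is not in the file should produce the "cannot find the information" reply requested by the system message.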