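RAG: Retrieval-Augmented Generation
Spring official docs: docs.spring.io/spring-ai/r…
Spring AI Alibaba official docs: java2ai.com/docs/framew…
Requirement: an AI operations assistant. Given an error code, it returns an explanation of the fault, helping operations staff locate problems and maintain the system.
Tech stack: Spring AI + Alibaba Cloud Bailian embedding model text-embedding-v4 + Redis Stack as the vector database + DeepSeek, combined to implement RAG.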
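LLM limitations
- LLM knowledge is not real-time; a model cannot update what it knows after training.
- An LLM may not know your private domain or business knowledge.
- LLMs sometimes hallucinate: the answer contains information that looks plausible but is actually wrong.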
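What RAG is
RAG (Retrieval-Augmented Generation) is an AI architecture that combines information retrieval with a large language model (LLM) to generate answers that are more accurate, more reliable, and grounded in facts.
Put simply, RAG finds relevant information in your own data and injects it into the prompt before the prompt is sent to the LLM. The LLM then receives (hopefully) relevant information and answers based on it, which lowers the chance of hallucination.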
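The "inject it into the prompt" step can be sketched in plain Java. This is a minimal illustration, not Spring AI's implementation: the class PromptAugmenter, the method augment, and the prompt wording are all hypothetical, and the retrieved snippets are assumed to come from some earlier search step.

```java
import java.util.List;

// Hypothetical helper, not part of Spring AI: shows only the prompt-injection step of RAG.
class PromptAugmenter {

    // Builds the final prompt: instructions, then retrieved snippets, then the user's question.
    static String augment(String question, List<String> retrievedSnippets) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer using only the context below. ")
          .append("If the context does not contain the answer, say you cannot find it.\n\n")
          .append("Context:\n");
        for (String snippet : retrievedSnippets) {
            sb.append("- ").append(snippet).append('\n');
        }
        sb.append("\nQuestion: ").append(question);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The snippet is assumed to have been found by a retrieval step that is not shown here.
        String prompt = augment("What does PAY_002 mean?",
                List.of("PAY_002: Payment module - insufficient balance"));
        System.out.println(prompt);
    }
}
```

The string returned by augment is what would be sent to the model in place of the bare question.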
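What RAG can do
Large language models (LLMs) are powerful, but they have two key limitations:
- Limited context: they cannot ingest an entire corpus at once.
- Static knowledge: their training data is frozen at a point in time.
Retrieval addresses both by fetching relevant external knowledge at query time. This is the foundation of **retrieval-augmented generation (RAG)**: augmenting the LLM's answer with context-specific information.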
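How RAG works
Two phases: indexing and retrieval.
Every component is modular: you can swap loaders, splitters, embedding models, or vector stores without rewriting the application logic.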
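The two phases and the swappable-component idea can be sketched without any framework. Everything below is a toy under stated assumptions: TinyRag, Embedder, and WORD_COUNTS are hypothetical names, and word counts stand in for a real embedding model such as text-embedding-v4, which would return dense float vectors.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy pipeline, not Spring AI: an in-memory list stands in for the vector database.
class TinyRag {

    // Swappable component: any text-to-vector function can be plugged in.
    interface Embedder {
        Map<String, Double> embed(String text);
    }

    // Hypothetical stand-in embedder: lowercased word counts as a sparse vector.
    static final Embedder WORD_COUNTS = text -> {
        Map<String, Double> vector = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                vector.merge(token, 1.0, Double::sum);
            }
        }
        return vector;
    };

    record Indexed(String text, Map<String, Double> vector) {}

    private final Embedder embedder;
    private final List<Indexed> index = new ArrayList<>();

    TinyRag(Embedder embedder) {
        this.embedder = embedder;
    }

    // Phase 1, indexing: embed every chunk once and store it.
    void index(List<String> chunks) {
        for (String chunk : chunks) {
            index.add(new Indexed(chunk, embedder.embed(chunk)));
        }
    }

    // Phase 2, retrieval: embed the query and return the topK most similar chunks.
    List<String> retrieve(String query, int topK) {
        Map<String, Double> q = embedder.embed(query);
        return index.stream()
                .sorted(Comparator.comparingDouble((Indexed e) -> -cosine(q, e.vector())))
                .limit(topK)
                .map(Indexed::text)
                .toList();
    }

    // Cosine similarity between two sparse vectors; 0 when either vector is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        double norms = norm(a) * norm(b);
        return norms == 0 ? 0 : dot / norms;
    }

    private static double norm(Map<String, Double> v) {
        return Math.sqrt(v.values().stream().mapToDouble(x -> x * x).sum());
    }

    public static void main(String[] args) {
        TinyRag rag = new TinyRag(WORD_COUNTS);
        rag.index(List.of(
                "AUTH_001: Authentication module - incorrect username or password",
                "PAY_002: Payment module - insufficient balance",
                "DB_003: Database - connection timeout"));
        System.out.println(rag.retrieve("What does PAY_002 mean?", 1));
    }
}
```

Because TinyRag depends only on the Embedder interface, replacing WORD_COUNTS with a call to a real embedding service would leave index and retrieve unchanged, which is the modularity the text describes.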
Development steps
Create the module
POM
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>com.miao</groupId>
        <artifactId>SpringAIAlibaba-test01</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <artifactId>SAA-09RAG</artifactId>
    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- DashScope (Alibaba's model service): uses the Alibaba ecosystem's API protocol, the counterpart of the OpenAI protocol -->
        <dependency>
            <groupId>com.alibaba.cloud.ai</groupId>
            <artifactId>spring-ai-alibaba-starter-dashscope</artifactId>
            <version>1.0.0.2</version>
        </dependency>
        <!-- Vector database dependency; no version is given here, so it must be managed by the parent POM or an imported BOM -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-vector-store-redis</artifactId>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.18.38</version>
        </dependency>
    </dependencies>
</project>
YML file
server:
  port: 8082
  servlet:
    encoding:
      enabled: true
      force: true
      charset: UTF-8
spring:
  application:
    name: SAA-07
  ai:
    dashscope:
      api-key: ${qwen-api-key}
      chat:
        options:
          model: qwen3-vl-flash
      embedding:
        options:
          model: text-embedding-v4
    vectorstore:
      redis:
        initialize-schema: true
        index-name: custom-index
        prefix: custom-index
  data:
    redis:
      host: redis-16002.c1.us-east1-2.gce.cloud.redislabs.com
      port: 16002
      password: password
Main application class
package com.miao;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class SAA09RAGApplication {
    public static void main(String[] args) {
        SpringApplication.run(SAA09RAGApplication.class, args);
    }
}
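Business classes
Provide the error-code script to be stored in the vector database
Place the error-code script in the local service as ops_error_code.txt:
AUTH_001: Authentication module - incorrect username or password
PAY_002: Payment module - insufficient balance
DB_003: Database - connection timeout
404: Resource not found (HTTP)
500: Internal server error (HTTP)
ERR_CONNECTION_TIMEOUT: Connection timeout (network library)
E001: User does not exist (business-defined)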
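Each line of ops_error_code.txt is a self-contained "code, colon, description" record. One possible variation on the loading code in this article is to parse the file line by line so that every error code can become its own document. The sketch below shows only the parsing; ErrorCodeParser and ErrorCode are hypothetical names, and the parser accepts both the ASCII colon and the full-width colon as the separator, which is an assumption about the file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for "code: description" lines; not part of the article's project.
class ErrorCodeParser {

    record ErrorCode(String code, String description) {}

    // Turns the file content into one record per non-empty, well-formed line.
    static List<ErrorCode> parse(String fileContent) {
        List<ErrorCode> result = new ArrayList<>();
        for (String line : fileContent.split("\\R")) {
            String trimmed = line.strip();
            if (trimmed.isEmpty()) {
                continue;
            }
            int sep = firstColon(trimmed);
            if (sep <= 0) {
                continue; // no separator, or nothing before it: skip the line
            }
            result.add(new ErrorCode(
                    trimmed.substring(0, sep).strip(),
                    trimmed.substring(sep + 1).strip()));
        }
        return result;
    }

    // Index of the first separator: ASCII ':' or full-width colon (U+FF1A); -1 if neither is present.
    private static int firstColon(String s) {
        int ascii = s.indexOf(':');
        int wide = s.indexOf('\uFF1A');
        if (ascii < 0) {
            return wide;
        }
        if (wide < 0) {
            return ascii;
        }
        return Math.min(ascii, wide);
    }

    public static void main(String[] args) {
        String content = "AUTH_001: Authentication module - incorrect username or password\n"
                + "PAY_002: Payment module - insufficient balance\n";
        for (ErrorCode e : parse(content)) {
            System.out.println(e.code() + " -> " + e.description());
        }
    }
}
```

Whether per-line documents retrieve better than one chunk for the whole file depends on the embedding model and the data, so treat this as an option to try rather than a required step.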
Configuration classes
1. Multi-model configuration
package com.miao.config;
import com.alibaba.cloud.ai.dashscope.api.DashScopeApi;
import com.alibaba.cloud.ai.dashscope.chat.DashScopeChatModel;
import com.alibaba.cloud.ai.dashscope.chat.DashScopeChatOptions;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.prompt.ChatOptions;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
public class SaaLLMConfig {
    private static final String QWEN_MODEL = "qwen3-max";
    private static final String DEEPSEEK_MODEL = "deepseek-v3.1";
    // Not used below; the empty default keeps startup from failing when the property is absent.
    @Value("${spring.ai.dashscope.url:}")
    private String qwenUrl;
    // Both chat models are created through DashScope and share the same API key.
    @Bean(name = "deepseek")
    public ChatModel deepSeek() {
        return DashScopeChatModel.builder()
                .dashScopeApi(DashScopeApi.builder()
                        .apiKey(System.getenv("qwen-api-key"))
                        .build())
                .defaultOptions(DashScopeChatOptions.builder()
                        .withModel(DEEPSEEK_MODEL)
                        .build())
                .build();
    }
    @Bean(name = "qwen")
    public ChatModel qwen() {
        return DashScopeChatModel.builder()
                .dashScopeApi(DashScopeApi.builder()
                        .apiKey(System.getenv("qwen-api-key"))
                        .build())
                .defaultOptions(DashScopeChatOptions.builder()
                        .withModel(QWEN_MODEL)
                        .build())
                .build();
    }
    // One ChatClient per model; inject by bean name to choose which model answers.
    @Bean(name = "deepseekChatClient")
    public ChatClient deepSeekChatClient(@Qualifier("deepseek") ChatModel deepseekModel) {
        return ChatClient.builder(deepseekModel)
                .defaultOptions(ChatOptions.builder().model(DEEPSEEK_MODEL).build())
                .build();
    }
    @Bean(name = "qwenChatClient")
    public ChatClient qwenChatClient(@Qualifier("qwen") ChatModel qwenModel) {
        return ChatClient.builder(qwenModel)
                .defaultOptions(ChatOptions.builder().model(QWEN_MODEL).build())
                .build();
    }
}
2. Redis configuration
package com.miao.config;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.StringRedisSerializer;
@Configuration
public class RedisConfig {
    @Bean
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory redisConnectionFactory) {
        RedisTemplate<String, Object> redisTemplate = new RedisTemplate<>();
        // Additional configuration can be added here if necessary
        redisTemplate.setConnectionFactory(redisConnectionFactory);
        // Serialize keys as strings
        redisTemplate.setKeySerializer(new StringRedisSerializer());
        // Serialize values as JSON
        redisTemplate.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        redisTemplate.setHashKeySerializer(new StringRedisSerializer());
        redisTemplate.setHashValueSerializer(new GenericJackson2JsonRedisSerializer());
        return redisTemplate;
    }
}
3. Load the index data into the vector database on startup
package com.miao.config;
import jakarta.annotation.PostConstruct;
import jakarta.annotation.Resource;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.core.RedisTemplate;
import java.nio.charset.StandardCharsets;
import java.util.List;
@Slf4j
@Configuration
public class InitVectorDatabaseConfig {
    @Resource
    private VectorStore vectorStore;
    @Value("classpath:ops_error_code.txt")
    private org.springframework.core.io.Resource opsErrorCodeFile;
    @Resource
    private RedisTemplate<String, String> redisTemplate;
    @PostConstruct
    public void init() {
        // 1. Read the file, explicitly as UTF-8 rather than the platform default charset
        TextReader textReader = new TextReader(opsErrorCodeFile);
        textReader.setCharset(StandardCharsets.UTF_8);
        // 2. Split the text into chunks; the vector store embeds them when they are added
        List<Document> list = new TokenTextSplitter().transform(textReader.read());
        // 3. Calling vectorStore.add(list) unconditionally would insert duplicate vectors on every restart
        // 4. Deduplicate: use the file's "source" metadata as a marker key in Redis
        //    (the file name is sometimes MD5-hashed to form the key instead)
        String sourceMetadata = (String) textReader.getCustomMetadata().get("source");
        // setIfAbsent writes the key only if it does not exist yet and reports whether it did so
        Boolean firstLoad = redisTemplate.opsForValue().setIfAbsent(sourceMetadata, "1");
        if (Boolean.TRUE.equals(firstLoad)) {
            // The key was absent, so the data has not been inserted before
            vectorStore.add(list);
        } else {
            log.info("Data already exists; skipping duplicate insert");
        }
    }
}
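The deduplication step relies on an atomic "set if absent" marker, and the comment about MD5 suggests hashing to form the key. A minimal sketch of that idea follows, with two deliberate differences that are assumptions of this example and not what the class above does: it hashes the file content instead of the file name, and a ConcurrentHashMap stands in for Redis. LoadOnceGuard, md5Hex, firstTime, and the "rag:loaded:" key prefix are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical guard that lets a given file content be loaded only once.
class LoadOnceGuard {

    // In-memory stand-in for Redis: putIfAbsent plays the role of setIfAbsent.
    private final Map<String, String> markers = new ConcurrentHashMap<>();

    // Lowercase hex MD5 of the UTF-8 bytes of the text.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return HexFormat.of().formatHex(md.digest(text.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Returns true only for the first call that presents this exact content.
    boolean firstTime(String fileContent) {
        String key = "rag:loaded:" + md5Hex(fileContent);
        return markers.putIfAbsent(key, "1") == null;
    }

    public static void main(String[] args) {
        LoadOnceGuard guard = new LoadOnceGuard();
        String content = "PAY_002: Payment module - insufficient balance";
        System.out.println(guard.firstTime(content)); // first presentation of this content
        System.out.println(guard.firstTime(content)); // same content again
    }
}
```

Keying on the content hash means an edited file is treated as new data, whereas a key based on the file name would skip it; which behavior is wanted depends on how the error-code file is maintained. Vectors inserted from an older version of the file are not removed by either approach.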
Business class
package com.miao.controller;
import jakarta.annotation.Resource;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.rag.advisor.RetrievalAugmentationAdvisor;
import org.springframework.ai.rag.retrieval.search.VectorStoreDocumentRetriever;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;
@RestController
public class RAGController {
    @Resource(name = "deepseekChatClient")
    private ChatClient deepSeekClient;
    @Resource
    private VectorStore vectorStore;
    @GetMapping("/rag")
    public Flux<String> rag(@RequestParam(name = "msg") String msg) {
        String sysMsg = "You are a professional operations expert. Explain the fault that corresponds to the given error code; if no matching information is found, reply that you cannot find it.";
        // 1. Vector retrieval: the advisor queries the vector store and adds the matching documents to the prompt
        RetrievalAugmentationAdvisor advisor = RetrievalAugmentationAdvisor
                .builder()
                .documentRetriever(VectorStoreDocumentRetriever.builder().vectorStore(vectorStore).build())
                .build();
        // 2. Stream the model's answer back to the caller
        return deepSeekClient.prompt()
                .system(sysMsg)
                .user(msg)
                .advisors(advisor)
                .stream()
                .content();
    }
}
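The system message asks the model to say it cannot find anything when no error code matches. Retrieval can support that by discarding weak matches before they reach the prompt. The sketch below shows the idea in plain Java: RetrievalFilter, Scored, the threshold value, and the fallback text are hypothetical, and the scores are assumed to be similarity values where higher means closer. Spring AI's VectorStoreDocumentRetriever builder offers similarityThreshold and topK settings for the same purpose; check the reference documentation of your version for their exact behavior.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical post-retrieval filter; not the logic Spring AI runs internally.
class RetrievalFilter {

    record Scored(String text, double score) {}

    // Keeps candidates at or above the threshold, best first, at most topK of them.
    static List<String> select(List<Scored> candidates, double threshold, int topK) {
        return candidates.stream()
                .filter(c -> c.score() >= threshold)
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(topK)
                .map(Scored::text)
                .toList();
    }

    // Context to place in the prompt, or a fixed fallback when nothing passed the filter.
    static String contextOrFallback(List<String> selected) {
        return selected.isEmpty()
                ? "No matching information found."
                : String.join("\n", selected);
    }

    public static void main(String[] args) {
        List<Scored> candidates = List.of(
                new Scored("PAY_002: Payment module - insufficient balance", 0.91),
                new Scored("DB_003: Database - connection timeout", 0.32));
        System.out.println(contextOrFallback(select(candidates, 0.5, 3)));
    }
}
```

With a threshold in place, a query such as an unknown code yields an empty context, which makes it easier for the model to follow the "cannot find it" instruction instead of guessing from a loosely related entry.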