Spring AI Study Notes (Phase 1: Getting Started)

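pom.xml dependencies

Core dependencies

Dependency	Version	Purpose
`spring-ai-alibaba-starter-dashscope`	1.1.0.0	Access to Alibaba Cloud Bailian (DashScope) / Qwen models
`spring-ai-ollama`	1.1.0	Local model support
`spring-ai-autoconfigure-model-chat-memory`	1.1.0	Auto-configuration for chat memory
`spring-ai-starter-model-chat-memory-repository-jdbc`	1.1.0	JDBC-backed persistent chat memory
`spring-boot-starter-webflux`	-	Reactive programming (streaming with Flux)
`mysql-connector-java`	8.0.33	MySQL JDBC driver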

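Core concepts cheat sheet

1. ChatClient vs ChatModel

Aspect	ChatClient (high level)	ChatModel (low level)
Amount of code	More (fluent chain)	Less (direct call)
Flexibility	High (rich configuration)	Low (few parameters)
Features	Full (memory / logging / templates)	Basic (model call only)
Switching models	Rebuild the client	Specify per call
Typical use	Production code (about 90% of cases)	Quick tests / low-level control

Relationship: ChatClient wraps a ChatModel internally; it builds the Prompt for you and manages the Advisors.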

2. Call styles, from simple to advanced

Level 1: chatClient.prompt("Hello").call()
Level 2: chatClient.prompt().system("...").user("...").call()
Level 3: chatClient.prompt(new Prompt(systemMessage, userMessage)).call()
Level 4: chatClient.prompt(new PromptTemplate("{name}").create(Map.of("name", "Tom"))).call()
Level 5: chatClient.prompt(new PromptTemplate(resource).create(Map.of(...))).call()

Note on Level 4: `PromptTemplate.add(...)` returns void, so it cannot be chained; pass the variables to `create(Map)` instead.

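3. Synchronous vs streaming

Feature	`call()`	`stream()`
Returns	`call().content()` gives a `String`; `call().chatResponse()` gives a `ChatResponse`	`stream().content()` gives a `Flux<String>`
Response	Complete answer returned in one piece	Pushed chunk by chunk as the model generates
Use case	Short text / background jobs	Long text / live chat
Front end	Plain HTTP	SSE / WebFlux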

4. Message types

Message
├── SystemMessage      # system prompt: sets the role and rules
├── UserMessage        # user input
├── AssistantMessage   # model reply (must be added to the history manually)
└── ToolResponseMessage # result of a tool call

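5. Prompt engineering techniques

Technique	Core idea	Example
Role setting	Give the model a persona	"You are a professional film critic..."
Few-shot	Teach by example	"Input: xxx → Output: yyy"
Chain of thought	Reason step by step	"step 1... step 2..."
Structural constraints	Specify the output format	"Output JSON containing..."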

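6. Structured output converters

Converter	Output	Use case
`BeanOutputConverter<T>`	Single object	Known structure
`.entity(Class<T>)`	Automatic conversion	Rapid development (recommended)
`.entity(ParameterizedTypeReference<List<T>>)`	List	Batch data
`MapOutputConverter`	Map	Dynamic structure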

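7. Chat memory types

Type	Storage	Characteristic	Use case
In-memory	`MessageWindowChatMemory` + `InMemoryChatMemoryRepository` (the older `InMemoryChatMemory` class is no longer present in Spring AI 1.x)	Lost on restart	Development and testing
JDBC	MySQL	Persistent	Production
Redis	Redis	Distributed	Multi-instance deployments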

Controllers in detail

ChatClientController

Core: five ways to call ChatClient

// Style 1: the simplest form
chatClient.prompt(message).call().content();

// Style 2: override the system prompt for this request
chatClient.prompt(message).system("Append 'skr' to the answer").call().content();

// Style 3: explicit user message
chatClient.prompt().user(message).call().content();

// Style 4: Prompt object (most flexible)
chatClient.prompt(new Prompt(new SystemMessage("..."), new UserMessage("..."))).call().content();

// Style 5: streaming
chatClient.prompt(message).stream().content();  // Flux<String>

Builder configuration:

ChatClient.builder(dashScopeChatModel)
    .defaultAdvisors(new SimpleLoggerAdvisor())      // logging interceptor
    .defaultSystem("Please answer in English")       // default system prompt
    .defaultOptions(DashScopeChatOptions.builder()
        .temperature(0.7)                            // sampling temperature (higher = more creative)
        .build())
    .build();

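StreamController

Four streaming implementations:

Endpoint	Technology	Notes
`/fakeStream`	Native HttpClient	Calls the DashScope API directly with `stream: true`
`/sse`	SseEmitter	Standard Spring MVC SSE, pushed from a virtual thread
`/entity`	StreamingResponseBody	Raw byte stream
`/flux`	Reactor Flux	Reactive stream using `Flux.interval()`

Key SSE code:

SseEmitter emitter = new SseEmitter(60_000L);  // 60-second timeout
Executors.newVirtualThreadPerTaskExecutor().submit(() -> {
    try {
        emitter.send("data");          // push one event
        emitter.complete();            // normal completion
    } catch (IOException e) {
        emitter.completeWithError(e);  // completion with an error
    }
});
return emitter;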
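
To make the SSE endpoint less of a black box, here is a minimal sketch of the wire format each pushed event ends up in: every line of the payload is prefixed with `data: ` and a blank line terminates the event. The class and method names (`SseFormatSketch`, `formatEvent`) are hypothetical and only illustrate the framing; `SseEmitter` does this work internally.

```java
public class SseFormatSketch {

    /**
     * Encodes one payload as a Server-Sent Events frame.
     * Each payload line becomes a "data: ..." line; an empty line ends the event.
     */
    public static String formatEvent(String payload) {
        StringBuilder sb = new StringBuilder();
        // limit -1 keeps trailing empty lines so the payload round-trips intact
        for (String line : payload.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(formatEvent("hello"));
        System.out.print(formatEvent("line 1\nline 2"));
    }
}
```

This framing is why `curl -N` against an SSE endpoint shows `data:` prefixes in front of each chunk.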

ChatModelController

Core: bypass ChatClient and work with ChatModel directly

// Style 1: plain string
dashScopeChatModel.call(message);

// Style 2: multiple messages
dashScopeChatModel.call(systemMessage, userMessage);

// Style 3: full Prompt (lets you switch models per call)
ChatOptions options = ChatOptions.builder().model("deepseek-v4").build();
Prompt prompt = new Prompt.Builder().messages(...).chatOptions(options).build();
dashScopeChatModel.call(prompt).getResult().getOutput().getText();

// Style 4: streaming
dashScopeChatModel.stream(message);  // Flux<String>

Compared with ChatClient:

Less code, but also fewer features
Suited to quick tests and cases where the model must be switched per call

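PromptEngineerController

Four prompt techniques:

// Technique 1: role setting
defaultSystem("You are a professional film critic with a very cold tone...");

// Technique 2: few-shot examples
// (left in Chinese because the task is correcting Chinese typos;
//  the JSON keys mean "typo correction" and "condensed content")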
.system("""
    Input：ni好
    Output ：{"错别字改写":"你好","内容精简":""}
    Input：我今天心情不错...
    Output ：{"错别字改写":"","内容精简":"今天是什么天气？"}
    """)

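// Technique 3: structural constraint
.prompt("Please output the content in JSON format")

// Technique 4: chain of thought
.prompt("""
    step 1 - Summarize the text below in one sentence.
    step 2 - Translate the summary into English.
    step 3 - List every person's name in the English summary.
    step 4 - Output a JSON object...
    """)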

PromptTemplateController

Three ways to use templates:

// Style 1: template defined in code
String template = "Please recommend some highly rated films about {topic}";
PromptTemplate pt = new PromptTemplate(template);
pt.add("topic", topic);

// Style 2: chained call with a Map
new PromptTemplate(template).create(Map.of("topic", topic));

// Style 3: external template file (recommended)
@Value("classpath:/templates/ai_system_prompt.st")
private Resource systemPrompt;

PromptTemplate pt = PromptTemplate.builder()
    .resource(systemPrompt)
    .variables(Map.of("country", "China", "topic", message))
    .build();

Template file ai_system_prompt.st:

Please recommend some highly rated films about {topic}; the films must be from {country}.
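
The variable substitution that a prompt template performs can be sketched in plain Java. This is only an illustration of the `{name}` placeholder idea under simplified assumptions (ASCII variable names, no escaping rules); it is not how `PromptTemplate` is implemented, and `TemplateSketch` and `render` are hypothetical names.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateSketch {

    // Matches placeholders such as {topic} or {country}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    /** Replaces every {name} placeholder with its value; fails fast on a missing variable. */
    public static String render(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            if (!vars.containsKey(name)) {
                throw new IllegalArgumentException("Missing template variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in values from being treated as regex syntax
            m.appendReplacement(out, Matcher.quoteReplacement(vars.get(name)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "Recommend highly rated films about {topic} from {country}.";
        System.out.println(render(template, Map.of("topic", "space travel", "country", "China")));
    }
}
```

Failing fast on a missing variable is a deliberate choice in this sketch: a prompt that silently ships a literal `{topic}` to the model is harder to debug than an exception.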

StructureOutputController

Four kinds of structured output:

// Style 1: manual BeanOutputConverter
BeanOutputConverter<Movie> converter = new BeanOutputConverter<>(Movie.class);
String resp = chatClient.prompt()
    .user(u -> u.text("... Output format: {format}")
                .param("format", converter.getFormat()))  // pass the format instructions in
    .call().content();
Movie movie = converter.convert(resp);

// Style 2: automatic conversion (recommended)
Movie movie = chatClient.prompt("...").call().entity(Movie.class);

// Style 3: list conversion
List<Movie> movies = chatClient.prompt("...").call()
    .entity(new ParameterizedTypeReference<List<Movie>>() {});

// Style 4: Map conversion
Map<String, Object> map = chatClient.prompt("...").call()
    .entity(new MapOutputConverter());

Movie model (the original notes mixed the names Book and Movie; Movie is used throughout here):

public record Movie(
    @JsonPropertyDescription("Film title") String name,
    @JsonPropertyDescription("Director") String author,
    @JsonPropertyDescription("Synopsis") String desc,
    @JsonPropertyDescription("Genre") String type,
    @JsonPropertyDescription("Country of production") String country
) {}

ChatMemoryController

Core: multi-turn conversation memory

Managing history manually (low-level approach)

@GetMapping("/call")
public String call(String message) {
    List<Message> messages = new ArrayList<>();

    // Turn 1
    messages.add(new SystemMessage("You are a game designer"));
    messages.add(new UserMessage("I want to design a turn-based game"));
    ChatResponse response = dashScopeChatModel.call(new Prompt(messages));
    messages.add(new AssistantMessage(response.getResult().getOutput().getText()));

    // Turn 2 (history included)
    messages.add(new UserMessage("Should the visuals use an elemental style?"));
    response = dashScopeChatModel.call(new Prompt(messages));
    messages.add(new AssistantMessage(response.getResult().getOutput().getText()));

    // Turn 3 (full history included)
    messages.add(new UserMessage("What if the game mainly targets 35-year-old male players?"));
    return dashScopeChatModel.call(new Prompt(messages)).getResult().getOutput().getText();
}

Problems:

The message list is managed by hand, which makes the code verbose
There is no automatic window management, so the token limit is easy to exceed
Nothing is persisted

Automatic memory management (recommended approach)

@Autowired
private ChatMemory chatMemory;  // auto-configured ChatMemory (in-memory by default)

@GetMapping("/callConversation")
public Flux<String> callConversation(String message, String chatId) {
    return chatClient
        .prompt()
        .user(message)
        .advisors(spec -> spec.param(ChatMemory.CONVERSATION_ID, chatId))
        .stream().content();
}

Builder configuration:

this.chatClient = ChatClient.builder(dashScopeChatModel)
    .defaultAdvisors(
        MessageChatMemoryAdvisor.builder(chatMemory).build(),  // memory Advisor
        new SimpleLoggerAdvisor()                               // logging Advisor
    )
    .build();

Key points:

ChatMemory.CONVERSATION_ID: separates conversations (for example user A and user B)
MessageChatMemoryAdvisor: saves and loads the message history automatically
Multiple Advisors run one after another as a chain

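JdbcChatMemoryController

Core: persist the conversation history to MySQL

Configuration class

@Configuration
public class JdbcChatMemoryConfiguration {
    @Bean
    public ChatMemory jdbcChatMemory(JdbcChatMemoryRepository jdbcChatMemoryRepository) {
        return MessageWindowChatMemory.builder()
            .chatMemoryRepository(jdbcChatMemoryRepository)  // JDBC storage
            .maxMessages(20)                                  // keep the latest 20 messages
            .build();
    }
}

Usage

The code is identical to the in-memory version:

@Autowired
private ChatMemory jdbcChatMemory;  // the injected ChatMemory is the JDBC-backed one

@GetMapping("/callDb")
public Flux<String> callDb(String message, String chatId) {
    return chatClient
        .prompt()
        .user(message)
        .advisors(spec -> spec.param(ChatMemory.CONVERSATION_ID, chatId))
        .stream().content();
}

Builder configuration:

this.chatClient = ChatClient.builder(dashScopeChatModel)
    .defaultAdvisors(
        MessageChatMemoryAdvisor.builder(jdbcChatMemory).build(),  // JDBC memory
        new SimpleLoggerAdvisor()
    )
    .build();

Database table: Spring AI can create the chat memory table automatically. As far as I can tell, the bundled schema names it `SPRING_AI_CHAT_MEMORY` rather than ai_chat_memory, and for a non-embedded database such as MySQL schema creation has to be switched on with `spring.ai.chat.memory.repository.jdbc.initialize-schema=always`; check this against the version in use.

Column	Description
`conversation_id`	Conversation ID
`content`	Message content (text)
`type`	Message type (USER / ASSISTANT / SYSTEM / TOOL)
`timestamp`	Timestamp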

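OllamaController

Core: run open-source models locally, with no cloud API

@Autowired
private OllamaChatModel ollamaChatModel;

@GetMapping("/stream")
public Flux<String> stream(String message, HttpServletResponse response) {
    response.setCharacterEncoding("UTF-8");
    return ollamaChatModel.stream(message);
}

Configuration in application.yml:

spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        model: deepseek-r1:7b

Prerequisites:

Install Ollama (macOS with Homebrew): brew install ollama
Download the model: ollama pull deepseek-r1:7b
Start the service: ollama serve

Advantages:

Runs locally; no network needed once the model is downloaded
Data stays private
Free

Disadvantages:

Performance depends on local hardware
Models are less capable than large cloud-hosted ones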

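OrderRefundController

Core: a complete business scenario = memory + templates + tool calling

Business scenario

A Pinduoduo refund customer-service bot:

Detects user sentiment and keywords that point to a quality problem
Starts the refund automatically once the problem is confirmed
Keeps context across multiple turns

System prompt

External file pdd_refund_system_prompt.pt:

# Role
You are a professional customer-experience specialist for an e-commerce platform...

# Task
Step 1: proactively identify and confirm the issue
Step 2: decide on and execute the refund
Step 3: follow up, reassure the customer, and close the loop

# Limit
Only handle quality issues...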

Implementation

@RestController
@RequestMapping("/order/refund")
public class OrderRefundController {

    @Autowired
    private OrderTools orderTools;  // refund tool

    @Value("classpath:templates/pdd_refund_system_prompt.pt")
    private Resource systemText;    // system prompt file

    @Autowired
    private ChatMemory chatmemory;  // chat memory

    @Autowired
    private DashScopeChatModel chatModel;  // assumed: the declaration is missing from the original notes

    private ChatClient chatClient;  // built in init()

    // Initialize the ChatClient
    @PostConstruct
    public void init() {
        chatClient = ChatClient.builder(chatModel)
            .defaultAdvisors(
                MessageChatMemoryAdvisor.builder(chatmemory).build(),
                new SimpleLoggerAdvisor()
            )
            .defaultSystem(systemText)  // load the external prompt
            .build();
    }

    // Start a new conversation
    @GetMapping("/newChat")
    public OrderChat newChat(String userId, String orderId) {
        String chatId = UUID.randomUUID().toString();

        return chatClient
            .prompt()
            .user(String.format("I have a question about an order...user id is %s, order number: %s...", userId, orderId, chatId))
            .advisors(spec -> spec.param(CONVERSATION_ID, chatId)
                                   .param("chat_memory_retrieve_size", 100))
            .call()
            .entity(OrderChat.class);  // structured output
    }

    // Continue the conversation (with tool calling)
    @GetMapping("/ask")
    public Flux<String> ask(String question, String chatId) {
        return chatClient
            .prompt()
            .user(question)
            .tools(orderTools)  // register the tools
            .advisors(spec -> spec.param(CONVERSATION_ID, chatId)
                                   .param("chat_memory_retrieve_size", 100))
            .stream().content();
    }
}

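Tool definition

@Component
public class OrderTools {
    @Autowired
    private OrderManageService orderManageService;

    @Tool(name = "apply_refund", description = "Start a refund based on the order information supplied by the user")
    public String refund(
        @ToolParam(description = "Order number, numeric") String orderId,
        @ToolParam(description = "Product name") String name,
        @ToolParam(description = "Refund reason") String reason
    ) {
        orderManageService.refund(orderId, reason);
        return "Refund requested for product: " + name + ", order number: " + orderId;
    }
}

Tool calling flow:

User: "The seams on this garment came apart, I want a refund"
    │
    ▼
LLM analysis: the user wants a refund, so the apply_refund tool must be called
    │
    ▼
Spring AI automatically calls OrderTools.refund(orderId, name, reason)
    │
    ▼
The tool returns its result and the LLM writes the reply: "A refund has been requested for you..."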
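
The flow above can be sketched without any framework: tools are registered under a name with a description, the descriptions are what gets advertised to the model, and a tool call coming back from the model is dispatched by name. Everything here (`ToolDispatchSketch`, `register`, `dispatch`, string-only arguments) is a hypothetical simplification; Spring AI derives the same information from `@Tool` and `@ToolParam` and also handles argument conversion.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class ToolDispatchSketch {

    // name -> description shown to the model
    private final Map<String, String> descriptions = new LinkedHashMap<>();
    // name -> code that runs when the model asks for that tool
    private final Map<String, Function<Map<String, String>, String>> handlers = new LinkedHashMap<>();

    public void register(String name, String description,
                         Function<Map<String, String>, String> handler) {
        descriptions.put(name, description);
        handlers.put(name, handler);
    }

    /** The catalogue the model uses to decide whether a tool is needed. */
    public Map<String, String> descriptions() {
        return Map.copyOf(descriptions);
    }

    /** Executes the tool the model asked for and returns the text fed back to the model. */
    public String dispatch(String toolName, Map<String, String> arguments) {
        Function<Map<String, String>, String> handler = handlers.get(toolName);
        if (handler == null) {
            throw new IllegalArgumentException("Unknown tool: " + toolName);
        }
        return handler.apply(arguments);
    }

    public static void main(String[] args) {
        ToolDispatchSketch tools = new ToolDispatchSketch();
        tools.register("apply_refund", "Start a refund for an order",
                a -> "Refund requested for order " + a.get("orderId"));
        // Pretend the model returned a tool call for apply_refund
        System.out.println(tools.dispatch("apply_refund", Map.of("orderId", "12345")));
    }
}
```

The sketch also shows why the description matters: it is the only thing the model sees when choosing a tool, so a vague description leads to missed or wrong calls.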

Summary

Spring AI core architecture

┌────────────────────────────────────────────────────────────┐
│   Application layer                                        │
│   Controller → Service → business logic                    │
└─────────────────────────┬──────────────────────────────────┘
                          │
┌─────────────────────────▼──────────────────────────────────┐
│   ChatClient (recommended)                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐ │
│  │ Fluent API  │  │ Advisors    │  │ PromptTemplate      │ │
│  │ .prompt()   │  │ (memory/log)│  │ (template engine)   │ │
│  │ .call()     │  │             │  │                     │ │
│  │ .stream()   │  │             │  │                     │ │
│  └─────────────┘  └─────────────┘  └─────────────────────┘ │
└─────────────────────────┬──────────────────────────────────┘
                          │
┌─────────────────────────▼──────────────────────────────────┐
│   ChatModel (low level)                                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐ │
│  │ call()      │  │ stream()    │  │ Switch models       │ │
│  │ call(Prompt)│  │             │  │ (deepseek/qwen)     │ │
│  └─────────────┘  └─────────────┘  └─────────────────────┘ │
└─────────────────────────┬──────────────────────────────────┘
                          │
┌─────────────────────────▼──────────────────────────────────┐
│   Model providers                                          │
│   DashScope (Qwen)  │  Ollama (local)  │  OpenAI, etc.     │
└────────────────────────────────────────────────────────────┘

End-to-end development flow

1. Add the dependency (spring-ai-alibaba-starter-dashscope)
        ↓
2. Configure the API key (application.yml)
        ↓
3. Inject the ChatModel (@Autowired)
        ↓
4. Build the ChatClient (ChatClient.builder(chatModel)...)
        ↓
5. Write the prompts (SystemMessage / PromptTemplate)
        ↓
6. Call the model and handle the response (call() / stream() / entity())
        ↓
7. Add memory (MessageChatMemoryAdvisor)
        ↓
8. Add tools (@Tool + .tools())

Key design patterns

Pattern	Where it appears
Builder	`ChatClient.builder()`, `Prompt.Builder()`
Advisor (interceptor chain)	`MessageChatMemoryAdvisor`, `SimpleLoggerAdvisor`
Template method	`PromptTemplate` variable substitution
Strategy	Swapping ChatModel implementations (DashScope / Ollama)
Factory	`BeanOutputConverter` creating objects automatically

Hands-on experiment log

Experiment 1: basic calls

# Simple call
curl "http://localhost:8080/client/simpleCall?message=你好"

# Streaming output
curl -N "http://localhost:8080/client/stream?message=你好"

# SSE test
curl -N "http://localhost:8080/stream/sse"

Experiment 2: low-level ChatModel calls

# String call
curl "http://localhost:8080/model/call/string?message=你好"

# Translation (multiple messages)
curl "http://localhost:8080/model/call/messages?message=你好世界"

# Switching models
curl "http://localhost:8080/model/call/prompt?message=你好"

Experiment 3: prompt engineering

# Role setting
curl "http://localhost:8080/prompt/engineer/role?message=今天天气真好"

# Few-shot rewriting
curl "http://localhost:8080/prompt/engineer/shot?message=我今天心情不错"

# Chain of thought
curl -N "http://localhost:8080/prompt/engineer/step?message=从前有座山..."

Experiment 4: PromptTemplate

# In-code template
curl -N "http://localhost:8080/prompt/template/stream?topic=微服务"

# File template
curl -N "http://localhost:8080/prompt/template/file?message=微服务"

Experiment 5: structured output

# Bean conversion
curl "http://localhost:8080/structure/convert"

# List conversion
curl "http://localhost:8080/structure/convertList"

# Map conversion
curl "http://localhost:8080/structure/convertMap"

Experiment 6: chat memory

# In-memory memory (multi-turn conversation)
curl "http://localhost:8080/memory/callConversation?message=你好&chatId=user-001"
curl "http://localhost:8080/memory/callConversation?message=刚才我说了什么&chatId=user-001"

# JDBC persistent memory
curl -N "http://localhost:8080/jdbc/memory/callDb?message=你好&chatId=user-001"
curl -N "http://localhost:8080/jdbc/memory/callDb?message=记住我叫张三&chatId=user-001"
# After restarting the application
curl -N "http://localhost:8080/jdbc/memory/callDb?message=我叫什么&chatId=user-001"

Experiment 7: local Ollama model

# Make sure Ollama is running
curl http://localhost:11434/api/tags

# Call the local model
curl -N "http://localhost:8080/ollama/stream?message=你好"

Experiment 8: business scenario (refund)

# Start a conversation
curl "http://localhost:8080/order/refund/newChat?userId=10086&orderId=12345"

# Continue the conversation (the LLM decides on its own whether the refund tool is needed)
curl -N "http://localhost:8080/order/refund/ask?question=这件衣服开线了&chatId=xxx"

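FAQ

Q1: How do I choose between ChatClient and ChatModel?

Choose ChatClient: about 90% of cases, whenever memory, logging, or fluent configuration is needed
Choose ChatModel: when low-level control is needed, such as switching models dynamically or custom HTTP parameters

Q2: How does chat memory work?

The user sends a message together with a CONVERSATION_ID
MessageChatMemoryAdvisor intercepts the request
It loads the history for that ID from ChatMemory
The history plus the new message are sent to the LLM
After the LLM replies, the exchange is saved back to ChatMemory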
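
The load, append, and save cycle can be sketched as a small per-conversation sliding window in plain Java. This is a rough illustration only: messages are plain strings, and `WindowMemorySketch` is a hypothetical class, not Spring AI's `MessageWindowChatMemory`, which works on `Message` objects and treats system messages specially.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WindowMemorySketch {

    private final int maxMessages;
    // conversation id -> messages, oldest first
    private final Map<String, List<String>> store = new HashMap<>();

    public WindowMemorySketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be at least 1");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message and evicts the oldest ones once the window is full. */
    public void add(String conversationId, String message) {
        List<String> history = store.computeIfAbsent(conversationId, id -> new ArrayList<>());
        history.add(message);
        while (history.size() > maxMessages) {
            history.remove(0);
        }
    }

    /** Returns the history that would be sent to the model together with the new message. */
    public List<String> get(String conversationId) {
        return List.copyOf(store.getOrDefault(conversationId, List.of()));
    }

    public static void main(String[] args) {
        WindowMemorySketch memory = new WindowMemorySketch(2);
        memory.add("user-001", "USER: hello");
        memory.add("user-001", "ASSISTANT: hi, how can I help?");
        memory.add("user-001", "USER: what did I just say?");
        System.out.println(memory.get("user-001")); // only the two newest messages remain
        System.out.println(memory.get("user-002")); // a different conversation starts empty
    }
}
```

Keying the store by conversation ID is what keeps user A's history out of user B's prompt; the window bound is what keeps the prompt under the token limit.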

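Q3: What is the difference between in-memory and JDBC memory?

Feature	In-memory	JDBC
Storage location	JVM heap	MySQL database
After a restart	Lost	Kept
Multiple instances	Not shared	Shared
Configuration	Auto-configured	Requires a DataSource
Calling code	Same	Same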

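Q4: In what order do Advisors run?

A: Advisors are sorted by their order value (`getOrder()`; lower values run first on the request path). Registration order only decides between advisors that share the same order value. For example:

.defaultAdvisors(
    MessageChatMemoryAdvisor.builder(chatMemory).build(),  // runs first: loads the memory
    new SimpleLoggerAdvisor()                                // runs later: prints the log
)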
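
A chain that honours an integer order value can be sketched as follows. It assumes advisors expose an order the way Spring's `Ordered` contract does and that ties keep their registration order; `AdvisorOrderSketch` and `NamedAdvisor` are hypothetical names, and the order values in `main` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class AdvisorOrderSketch {

    /** Minimal stand-in for an advisor: a name plus an order value (lower runs first). */
    public static final class NamedAdvisor {
        final String name;
        final int order;

        public NamedAdvisor(String name, int order) {
            this.name = name;
            this.order = order;
        }
    }

    /** Returns advisor names in the order they would see the request. */
    public static List<String> executionOrder(List<NamedAdvisor> registered) {
        List<NamedAdvisor> sorted = new ArrayList<>(registered);
        // List.sort is stable, so advisors with equal order keep their registration order
        sorted.sort(Comparator.comparingInt((NamedAdvisor a) -> a.order));
        List<String> names = new ArrayList<>();
        for (NamedAdvisor advisor : sorted) {
            names.add(advisor.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<NamedAdvisor> registered = List.of(
                new NamedAdvisor("logger", 0),
                new NamedAdvisor("memory", -100));
        System.out.println(executionOrder(registered)); // memory runs before logger
    }
}
```

In practice this means an advisor can be moved in the chain by giving it a different order value instead of reshuffling the `defaultAdvisors(...)` arguments.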

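Q5: How do I control the memory window size?

MessageWindowChatMemory.builder()
    .maxMessages(20)  // keep only the latest 20 messages
    .build();

Q6: What if Ollama model downloads are slow?

Relocate model storage: export OLLAMA_MODELS=/path/to/models (this only changes where models are stored; it is not a mirror by itself)
Use a mirror inside mainland China
Download the model files manually and place them in ~/.ollama/models/

Q7: How does the LLM know which tool to call?

@Tool(description = "...") describes what the tool is for
Spring AI sends the descriptions of all registered tools to the LLM
The LLM decides from the user input whether a tool is needed
When one is needed it returns tool_calls, and Spring AI executes them automatically

Q8: What is `chat_memory_retrieve_size`?

A: An advisor parameter that controls how many history messages are loaded from memory per request. The default may be only a few messages; setting it to 100 loads more context. In Spring AI 1.x the window appears to be governed by `MessageWindowChatMemory.maxMessages` instead, so this parameter may have no effect there; verify against the version in use.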
