Introduction
In AI application development, streaming output greatly improves the user experience: the AI's answer appears character by character, like a typewriter, instead of arriving only after a long wait for the full generation. This article walks through building a complete AI chat project from scratch, covering synchronous and streaming endpoints, frontend SSE integration, and rate limiting.
Tech stack
- Backend: NestJS + LangChain
- Frontend: React + Ant Design + EventSource
- AI model: Tongyi Qianwen (qwen-plus), compatible with the OpenAI API format
Project initialization
Create the project
pnpm install -g @nestjs/cli
nest new hello-nest-langchain
Install dependencies
pnpm install @nestjs/config
pnpm install @langchain/core @langchain/openai
Generate the ai module
nest g res ai --no-spec
Configure environment variables (.env)
MODEL_NAME=qwen-plus
OPENAI_API_KEY=sk-xxx
OPENAI_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
Globally register ConfigModule
import { Module } from '@nestjs/common';
import { AppController } from './app.controller';
import { AppService } from './app.service';
import { AiModule } from './ai/ai.module';
import { ConfigModule } from '@nestjs/config';
@Module({
imports: [
AiModule,
ConfigModule.forRoot({
isGlobal: true,
envFilePath: '.env',
}),
],
controllers: [AppController],
providers: [AppService],
})
export class AppModule {}
`isGlobal: true` registers ConfigModule as a global module, so it does not need to be re-imported in every module's imports array.
Enable CORS in main.ts
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
async function bootstrap() {
const app = await NestFactory.create(AppModule);
app.enableCors();
await app.listen(process.env.PORT ?? 3000);
}
bootstrap();
Synchronous endpoint
Create the LangChain chain in AiService
import { StringOutputParser } from '@langchain/core/output_parsers';
import { PromptTemplate } from '@langchain/core/prompts';
import { Runnable } from '@langchain/core/runnables';
import { ChatOpenAI } from '@langchain/openai';
import { Injectable } from '@nestjs/common';
@Injectable()
export class AiService {
private readonly chain: Runnable<{ query: string }, string>;
constructor() {
const prompt = PromptTemplate.fromTemplate('请回答以下问题: \n\n{query}');
const model = new ChatOpenAI({
temperature: 0.7,
// Hardcoded for now; the optimizations section below moves this into DI + ConfigService.
modelName: 'qwen-plus',
apiKey: 'xxx',
configuration: {
baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
},
});
this.chain = prompt.pipe(model).pipe(new StringOutputParser());
}
async runChain(query: string): Promise<string> {
return await this.chain.invoke({ query });
}
}
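The `prompt.pipe(model).pipe(new StringOutputParser())` line composes three stages left to right: format the prompt, call the model, parse the output into a string. As a rough plain-TypeScript illustration of that composition idea (toy synchronous functions, not LangChain's actual Runnable machinery):

```typescript
// Toy pipeline mirroring the shape of prompt.pipe(model).pipe(parser).
type Step<I, O> = (input: I) => O;

// Compose two steps into one, left to right.
function pipe<I, M, O>(first: Step<I, M>, second: Step<M, O>): Step<I, O> {
  return (input) => second(first(input));
}

// Hypothetical stand-ins for the three stages of the real chain.
const formatPrompt: Step<{ query: string }, string> = ({ query }) => `Q: ${query}`;
const fakeModel: Step<string, { content: string }> = (prompt) => ({
  content: `echo: ${prompt}`,
});
const parseOutput: Step<{ content: string }, string> = (msg) => msg.content;

const chain = pipe(pipe(formatPrompt, fakeModel), parseOutput);
console.log(chain({ query: 'hello' })); // "echo: Q: hello"
```

The real chain works the same way, except every stage is asynchronous and streaming-aware.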
Expose the endpoint in AiController
import { Controller, Get, Query } from '@nestjs/common';
import { AiService } from './ai.service';
@Controller('ai')
export class AiController {
constructor(private readonly aiService: AiService) {}
@Get('chat')
async chat(@Query('query') query: string) {
const answer = await this.aiService.runChain(query);
return { answer };
}
}
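With the server running, the endpoint can be exercised from any HTTP client. A sketch (the URL and port follow the bootstrap above; `buildChatUrl` and `askOnce` are illustrative helpers, not part of the project):

```typescript
// Build the request URL for the synchronous endpoint.
function buildChatUrl(base: string, query: string): string {
  return `${base}/ai/chat?query=${encodeURIComponent(query)}`;
}

// Usage with fetch (built into Node 18+); expects { answer: string } back.
async function askOnce(query: string): Promise<string> {
  const res = await fetch(buildChatUrl('http://localhost:3000', query));
  const body = (await res.json()) as { answer: string };
  return body.answer;
}

console.log(buildChatUrl('http://localhost:3000', 'hi')); // "http://localhost:3000/ai/chat?query=hi"
```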
Streaming endpoint
Add a streaming method to AiService
async *streamChain(query: string): AsyncGenerator<string> {
const stream = await this.chain.stream({ query });
for await (const chunk of stream) {
yield chunk;
}
}
This is an async generator that returns a stream, letting the AI's answer appear character by character like a typewriter instead of arriving all at once after generation finishes.
It uses JavaScript's generator syntax: the `*` after the method keyword marks a generator, and `yield` emits chunks asynchronously one by one.
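The generator mechanics can be demonstrated standalone, independent of LangChain. A minimal sketch (`fakeStream` is a stand-in for `this.chain.stream`):

```typescript
// A minimal async generator, the same shape as streamChain():
// each `yield` hands one chunk to the consumer as soon as it's ready.
async function* fakeStream(text: string): AsyncGenerator<string> {
  for (const ch of text) {
    yield ch; // in the real chain, each chunk is a model token, not a character
  }
}

async function consume(): Promise<string> {
  let result = '';
  // for await...of pulls chunks one by one, without waiting for the whole text.
  for await (const chunk of fakeStream('hello')) {
    result += chunk;
  }
  return result;
}

consume().then((r) => console.log(r)); // logs "hello"
```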
Frontend page
pnpm create vite
pnpm i @tanstack/react-query
pnpm i antd
Core component code
import { useState, useRef, useEffect } from "react";
import "./App.css";
import "antd/dist/reset.css";
import { Card, Input, Button, Typography, Space, Form } from "antd";
const { Title } = Typography;
const { TextArea } = Input;
function App() {
const [apiUrl, setApiUrl] = useState("http://localhost:3000");
const [question, setQuestion] = useState("你是谁?");
const [responseText, setResponseText] = useState("回复将显示在这里...");
const esRef = useRef<EventSource | null>(null);
const [isStreaming, setIsStreaming] = useState(false);
const responseRef = useRef<HTMLDivElement | null>(null);
const handleStart = () => {
setResponseText("");
const base = apiUrl.replace(/\/+$/, "");
const url = `${base}/ai/chat/stream?query=${encodeURIComponent(question)}`;
if (esRef.current) {
esRef.current.close();
esRef.current = null;
}
try {
const es = new EventSource(url);
esRef.current = es;
setIsStreaming(true);
es.onmessage = (ev) => {
const chunk = ev.data;
setResponseText((prev) => {
if (
!prev ||
prev === "回复将显示在这里..." ||
prev.startsWith("(演示)")
)
return chunk;
return prev + chunk;
});
};
es.onerror = () => {
const ready = es.readyState;
if (ready === 2) {
setResponseText((prev) => (prev ? prev + "\n【已结束】" : "已结束"));
}
try {
es.close();
} catch {}
esRef.current = null;
setIsStreaming(false);
};
// Optional custom event (the backend may send event: done)
es.addEventListener("done", () => {
try {
es.close();
} catch {}
esRef.current = null;
setIsStreaming(false);
setResponseText((prev) => (prev ? prev + "\n【已完成】" : "已完成"));
});
} catch (err) {
setResponseText(`错误:${String(err)}`);
setIsStreaming(false);
}
};
const handleStop = () => {
if (esRef.current) {
esRef.current.close();
esRef.current = null;
}
setIsStreaming(false);
setResponseText((prev) => (prev ? prev + "\n【已停止】" : "已停止"));
};
// Auto-scroll to the bottom
useEffect(() => {
const el = responseRef.current;
if (!el) return;
el.scrollTop = el.scrollHeight;
}, [responseText]);
return (
<div className="sse-page">
<Card className="sse-card" bordered={false}>
<Title level={2} className="sse-title">
SSE 流式接口测试
</Title>
<Form layout="vertical">
<Form.Item label="API 地址">
<Input value={apiUrl} onChange={(e) => setApiUrl(e.target.value)} />
</Form.Item>
<Form.Item label="问题">
<TextArea
value={question}
onChange={(e) => setQuestion(e.target.value)}
rows={3}
/>
</Form.Item>
<Form.Item>
<Space>
<Button
type="primary"
onClick={handleStart}
disabled={isStreaming}
>
开始流式请求
</Button>
<Button danger onClick={handleStop} disabled={!isStreaming}>
停止
</Button>
</Space>
</Form.Item>
<Form.Item label="">
<Card className="response-box" bordered={false}>
<div className="response-content" ref={responseRef}>
<div className="response-text">{responseText}</div>
</div>
</Card>
</Form.Item>
</Form>
</Card>
</div>
);
}
export default App;
Result
Optimizations
Dependency injection
Manage the ChatOpenAI instance through NestJS's DI container, decoupling configuration from business logic.
NestJS dependency injection means you never `new` your dependencies: declare them, and the runtime injects the instances for you.
Create a CHAT_MODEL provider with useFactory in AiModule
Both services declared with @Injectable and objects created with useFactory can be registered as providers and injected.
import { Module } from '@nestjs/common';
import { AiService } from './ai.service';
import { AiController } from './ai.controller';
import { ConfigService } from '@nestjs/config';
import { ChatOpenAI } from '@langchain/openai';
@Module({
controllers: [AiController],
providers: [
AiService,
{
provide: 'CHAT_MODEL',
useFactory: (configService: ConfigService) => {
return new ChatOpenAI({
modelName: configService.get<string>('MODEL_NAME'),
apiKey: configService.get<string>('OPENAI_API_KEY'),
configuration: {
baseURL: configService.get<string>('OPENAI_BASE_URL'),
},
});
},
inject: [ConfigService],
},
],
})
export class AiModule {}
Use it directly in AiService
import { StringOutputParser } from '@langchain/core/output_parsers';
import { PromptTemplate } from '@langchain/core/prompts';
import { Runnable } from '@langchain/core/runnables';
import { ChatOpenAI } from '@langchain/openai';
import { Inject, Injectable } from '@nestjs/common';
@Injectable()
export class AiService {
private readonly chain: Runnable<{ query: string }, string>;
constructor(@Inject('CHAT_MODEL') private model: ChatOpenAI) {
const prompt = PromptTemplate.fromTemplate('请回答以下问题: \n\n{query}');
this.chain = prompt.pipe(model).pipe(new StringOutputParser());
}
async runChain(query: string): Promise<string> {
return await this.chain.invoke({ query });
}
async *streamChain(query: string): AsyncGenerator<string> {
const stream = await this.chain.stream({ query });
for await (const chunk of stream) {
yield chunk;
}
}
}
IP rate limiting
Install the throttler module
pnpm i @nestjs/throttler
Configure trust proxy to get the client's real IP
`trust proxy` is an Express setting: when requests pass through Nginx, a CDN, or a load balancer, it tells Express to read the original client IP from the X-Forwarded-For header instead of using the proxy's address.
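For illustration, the header looks like `X-Forwarded-For: client, proxy1`, listing hops from the original client (left) to the nearest proxy (right). A toy sketch of how a numeric `trust proxy` value resolves the client address (this is a simplification, not Express's actual proxy-addr implementation):

```typescript
// The address chain runs from the server outward:
// [socket address, right-most XFF entry, ..., left-most XFF entry].
// "trust proxy = n" trusts the first n entries; the client is the next one.
function clientIp(socketAddr: string, xff: string, trustedHops: number): string {
  const chain = [socketAddr, ...xff.split(',').map((s) => s.trim()).reverse()];
  // Take the first address after the trusted prefix (clamped to the chain).
  return chain[Math.min(trustedHops, chain.length - 1)];
}

// One trusted proxy in front: XFF holds only the real client.
console.log(clientIp('10.0.0.1', '203.0.113.7', 1)); // "203.0.113.7"
```

With `set('trust proxy', 1)`, only the directly connected proxy is trusted, which is the common single-Nginx deployment.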
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
import type { Express } from 'express';
async function bootstrap() {
const app = await NestFactory.create(AppModule);
const expressApp = app.getHttpAdapter().getInstance() as Express;
expressApp.set('trust proxy', 1);
app.enableCors();
await app.listen(process.env.PORT ?? 3000);
}
bootstrap();
Configure global rate limiting
In AppModule, allow each IP at most 30 requests per 60 seconds (ttl is in milliseconds), applied to all routes.
import { Module } from '@nestjs/common';
import { APP_GUARD } from '@nestjs/core';
import { AppController } from './app.controller';
import { AppService } from './app.service';
import { AiModule } from './ai/ai.module';
import { ConfigModule } from '@nestjs/config';
import { ThrottlerGuard, ThrottlerModule } from '@nestjs/throttler';
@Module({
imports: [
AiModule,
ConfigModule.forRoot({
isGlobal: true,
envFilePath: '.env',
}),
ThrottlerModule.forRoot([
{
ttl: 60000,
limit: 30,
},
]),
],
controllers: [AppController],
providers: [
AppService,
{
provide: APP_GUARD,
useClass: ThrottlerGuard,
},
],
})
export class AppModule {}
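The `ttl: 60000, limit: 30` pair means: within any 60-second window, at most 30 requests per tracked key (here, per IP). A minimal fixed-window sketch of those semantics (illustration only; @nestjs/throttler's real storage and algorithm differ):

```typescript
// A minimal fixed-window limiter illustrating ttl/limit semantics.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  constructor(private ttlMs: number, private limit: number) {}

  /** Returns true if the request identified by `key` (e.g. an IP) is allowed. */
  allow(key: string, now: number = Date.now()): boolean {
    const w = this.windows.get(key);
    if (!w || now - w.start >= this.ttlMs) {
      // First request, or the previous window expired: start a new window.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (w.count < this.limit) {
      w.count++;
      return true;
    }
    return false; // over the limit inside the current window
  }
}

const limiter = new FixedWindowLimiter(60_000, 3);
console.log(limiter.allow('1.2.3.4', 0));      // true
console.log(limiter.allow('1.2.3.4', 10));     // true
console.log(limiter.allow('1.2.3.4', 20));     // true
console.log(limiter.allow('1.2.3.4', 30));     // false — over the limit
console.log(limiter.allow('1.2.3.4', 60_000)); // true — new window
```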
In AiController, tighten the SSE endpoint to 5 requests per minute (and the sync endpoint to 20), overriding the global default.
import { Controller, Get, Query, Sse } from '@nestjs/common';
import { AiService } from './ai.service';
import { from, Observable } from 'rxjs';
import { map } from 'rxjs/operators';
import { Throttle } from '@nestjs/throttler';
@Controller('ai')
export class AiController {
constructor(private readonly aiService: AiService) {}
@Get('chat')
@Throttle({ default: { ttl: 60000, limit: 20 } })
async chat(@Query('query') query: string) {
const answer = await this.aiService.runChain(query);
return { answer };
}
@Sse('/chat/stream')
@Throttle({ default: { ttl: 60000, limit: 5 } })
chatStream(@Query('query') query: string): Observable<{ data: string }> {
return from(this.aiService.streamChain(query)).pipe(
map((chunk) => ({ data: chunk })),
);
}
}
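On the wire, `@Sse` sends a `text/event-stream` response where each Observable emission becomes a `data:` line followed by a blank line. A simplified sketch of that framing (Nest's actual serializer also supports id and retry fields; `formatSseMessage` is an illustrative helper):

```typescript
// Simplified SSE framing: each message is an optional "event:" line plus
// one "data:" line per payload line, terminated by a blank line.
function formatSseMessage(data: string, event?: string): string {
  const lines = data.split('\n').map((line) => `data: ${line}`);
  const head = event ? [`event: ${event}`] : [];
  return [...head, ...lines].join('\n') + '\n\n';
}

console.log(JSON.stringify(formatSseMessage('hello')));
// "data: hello\n\n"
console.log(JSON.stringify(formatSseMessage('a\nb', 'done')));
// "event: done\ndata: a\ndata: b\n\n"
```

This is why the frontend's `EventSource` receives each chunk as a separate `message` event, and why a custom `event: done` needs its own `addEventListener`.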
Encapsulating a useSseChat hook
Extract the SSE logic into a reusable custom hook
import { useState, useRef, useEffect, useCallback } from "react";
type Status = "idle" | "connecting" | "streaming" | "done" | "error";
interface UseSseChatOptions {
/** Base URL of the SSE endpoint, without a trailing slash */
baseUrl: string;
}
interface UseSseChatReturn {
/** Accumulated response text so far */
responseText: string;
/** Current connection status */
status: Status;
/** Whether the stream is active (streaming / connecting) */
isStreaming: boolean;
/** Scroll anchor ref; attach it to the response container for auto-scroll */
responseRef: React.RefObject<HTMLDivElement | null>;
/** Start a streaming request */
start: (query: string) => void;
/** Stop the current stream manually */
stop: () => void;
}
export function useSseChat({ baseUrl }: UseSseChatOptions): UseSseChatReturn {
const [responseText, setResponseText] = useState("回复将显示在这里...");
const [status, setStatus] = useState<Status>("idle");
const esRef = useRef<EventSource | null>(null);
const responseRef = useRef<HTMLDivElement | null>(null);
const isStreaming = status === "connecting" || status === "streaming";
// Auto-scroll to the bottom whenever the response changes
useEffect(() => {
const el = responseRef.current;
if (!el) return;
el.scrollTop = el.scrollHeight;
}, [responseText]);
// Close the connection on unmount
useEffect(() => {
return () => {
esRef.current?.close();
};
}, []);
const stop = useCallback(() => {
esRef.current?.close();
esRef.current = null;
setStatus("idle");
setResponseText((prev) => (prev ? prev + "\n【已停止】" : "已停止"));
}, []);
const start = useCallback(
(query: string) => {
// Close any previous connection that hasn't finished
esRef.current?.close();
esRef.current = null;
setResponseText("");
setStatus("connecting");
const base = baseUrl.replace(/\/+$/, "");
const url = `${base}/ai/chat/stream?query=${encodeURIComponent(query)}`;
try {
const es = new EventSource(url);
esRef.current = es;
es.onmessage = (ev) => {
setStatus("streaming");
setResponseText((prev) => prev + ev.data);
};
es.onerror = () => {
// readyState === EventSource.CLOSED means the connection has closed (normal end or abnormal drop)
if (es.readyState === EventSource.CLOSED) {
setResponseText((prev) =>
prev ? prev + "\n【已结束】" : "已结束",
);
setStatus("done");
} else {
setStatus("error");
}
es.close();
esRef.current = null;
};
// The backend can send event: done to mark a clean finish
es.addEventListener("done", () => {
es.close();
esRef.current = null;
setStatus("done");
setResponseText((prev) => (prev ? prev + "\n【已完成】" : "已完成"));
});
} catch (err) {
setResponseText(`错误:${String(err)}`);
setStatus("error");
}
},
[baseUrl],
);
return { responseText, status, isStreaming, responseRef, start, stop };
}
Usage example
const { responseText, isStreaming, responseRef, start, stop } = useSseChat({
baseUrl: apiUrl,
});
Summary
- `invoke` and `stream` power the synchronous and streaming endpoints respectively.
- The service layer produces the stream; the controller layer exposes it as an SSE endpoint returning the streamed data.
- The frontend listens to the streaming endpoint's message events with EventSource.
- Finally, the SSE endpoint is rate limited, the model dependency is decoupled through DI, and the SSE client logic is encapsulated in a reusable hook.