Section 11: Streaming Responses and Real-Time Interaction

Reading time: about 8 minutes
Difficulty: Hands-on practice
Prerequisites: FastAPI, Workflow, basics of the SSE protocol

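Section Overview

In this section you will learn:

How the Server-Sent Events (SSE) protocol works
How to implement SSE streaming responses in FastAPI
How to convert Workflow output into the SSE format
How to handle errors and exceptions inside a streaming response
How to improve the performance and user experience of streaming responses
How to test and debug a streaming endpoint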

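Introduction

Streaming responses noticeably improve the user experience by letting users watch the AI's reasoning as it is produced. In this section we use the Server-Sent Events (SSE) protocol to implement streaming and deliver the Workflow's output to the frontend in real time.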

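🎯 Goals for This Section

By the end, you will have:

✅ An SSE streaming endpoint
✅ Workflow event-stream handling
✅ A frontend/backend protocol design
✅ An error-handling mechanism
✅ A real-time user experience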

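🌊 What Is a Streaming Response?

Traditional vs. Streaming Responses

Traditional response:

User request → waiting... → complete result returned

Streaming response:

User request → live output → live output → ... → done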

SSE (Server-Sent Events)

SSE is a technique for pushing data from the server to the client over a single long-lived HTTP response:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

data: first chunk\n\n
data: second chunk\n\n
data: third chunk\n\n
data: [DONE]\n\n
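
As a minimal sketch of the framing shown above, the helper below JSON-encodes a payload and wraps it in a single `data:` line followed by a blank line. The name `format_sse` is hypothetical and does not appear in the project code. JSON encoding is used here because it escapes newlines inside the payload, so they cannot be mistaken for the end of an event.

```python
import json
from typing import Any


def format_sse(payload: Any) -> str:
    """Frame one payload as a single SSE event (hypothetical helper)."""
    # json.dumps escapes embedded newlines, so the payload stays on one line
    body = json.dumps(payload, ensure_ascii=False)
    # An SSE event is terminated by a blank line, i.e. two newline characters
    return f"data: {body}\n\n"
```

Calling `format_sse("first chunk")` produces one frame for a text chunk, and passing a dict such as `{"type": "step_start"}` produces a frame carrying a JSON object.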

📡 Implementing the Backend Streaming Endpoint

Step 1: Create the Workflow Router

backend/routers/workflow.py:

"""
Workflow router
Endpoints for the Text-to-BI Workflow
"""
from fastapi import APIRouter, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from typing import Optional
import json

from workflows.text_to_bi import text_to_bi_workflow

router = APIRouter(prefix="/workflow", tags=["Workflow"])


class QueryRequest(BaseModel):
    """Query request model"""
    message: str
    cubejs_url: Optional[str] = "http://localhost:4000"

Step 2: Implement the Streaming Endpoint

@router.post("/query")
async def query_with_workflow(request: QueryRequest):
    """
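    Handle a natural-language query with the Workflow (streaming response)
    
    The full Text-to-BI pipeline:
    1. Generate the CubeJS query
    2. Get the SQL
    3. Execute the query
    4. Format the result
    5. Generate the analysis
    
    Response: a Server-Sent Events (SSE) stream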
    """
    message = request.message
    
    if not message:
        raise HTTPException(status_code=400, detail="Message is required")
    
    def generate_response():
        """Generator function that produces the streamed output"""
        try:
            current_step_content = ""
            step_index = 0
            
            # Run the workflow and iterate over its streamed events
            for event in text_to_bi_workflow.run(
                input=message, 
                stream=True, 
                stream_events=True
            ):
                event_type = event.event
                
                # Step started
                if event_type == "StepStarted":
                    if current_step_content:
                        # Send the accumulated content
                        json_content = json.dumps(
                            current_step_content, 
                            ensure_ascii=False
                        )
                        yield f"data: {json_content}\n\n"
                        current_step_content = ""
                    
                    step_index += 1
                    # Send the step-separator signal
                    yield 'data: {"type":"step_start"}\n\n'
                
                # Run content: streamed output (Agent steps)
                elif event_type == "RunContent":
                    if hasattr(event, 'content') and event.content:
                        content_chunk = event.content
                        current_step_content += content_chunk
                        json_content = json.dumps(
                            content_chunk, 
                            ensure_ascii=False
                        )
                        yield f"data: {json_content}\n\n"
                
                # Step output: non-streamed steps (function steps)
                elif event_type == "StepOutput":
                    if hasattr(event, 'content') and event.content:
                        content = event.content
                        json_content = json.dumps(
                            content, 
                            ensure_ascii=False
                        )
                        yield f"data: {json_content}\n\n"
                        current_step_content = content
                
                # Step completed
                elif event_type == "StepCompleted":
                    if hasattr(event, 'content') and event.content:
                        if event.content != current_step_content:
                            content = event.content
                            json_content = json.dumps(
                                content, 
                                ensure_ascii=False
                            )
                            yield f"data: {json_content}\n\n"
                            current_step_content = content
                    
                    # Send the step-end signal
                    yield 'data: {"type":"step_end"}\n\n'
                    current_step_content = ""
            
            yield 'data: "[DONE]"\n\n'
            
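        except Exception as e:
            # JSON-encode the error so the client can parse it like any other frame
            error_message = json.dumps(f"Error: {str(e)}", ensure_ascii=False)
            yield f"data: {error_message}\n\n"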
    
    return StreamingResponse(
        generate_response(),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
            "X-Accel-Buffering": "no",
        }
    )
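
Because the generator above is nested inside the route handler, its event-to-SSE mapping is awkward to test in isolation. One option, sketched below under stated assumptions, is to pull the mapping into a pure function and feed it stand-in events. `FakeEvent` and `events_to_sse` are hypothetical names. The sketch assumes only that events expose `.event` and `.content` as in the handler above, and it covers a simplified subset of the branches (step start, streamed content, step completion).

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class FakeEvent:
    """Stand-in for a workflow event (hypothetical, for testing only)."""
    event: str
    content: Optional[str] = None


def events_to_sse(events: Iterable[FakeEvent]) -> Iterator[str]:
    """Map workflow-style events to SSE frames (simplified sketch)."""
    for ev in events:
        if ev.event == "StepStarted":
            yield 'data: {"type":"step_start"}\n\n'
        elif ev.event == "RunContent" and ev.content:
            # Streamed text chunk: JSON-encode so newlines stay escaped
            chunk = json.dumps(ev.content, ensure_ascii=False)
            yield f"data: {chunk}\n\n"
        elif ev.event == "StepCompleted":
            yield 'data: {"type":"step_end"}\n\n'
    # Completion marker, JSON-encoded like every other payload
    yield 'data: "[DONE]"\n\n'
```

With the mapping separated this way, a unit test can pass in a plain list of `FakeEvent` objects and compare the resulting frames, without starting the server or calling a model.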

Step 3: Implement the Synchronous Endpoint

@router.post("/query-sync")
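def query_with_workflow_sync(request: QueryRequest):
    """
    Handle a natural-language query with the Workflow (synchronous response)
    
    Non-streaming version, for cases that do not need real-time feedback.
    Declared with plain `def` so FastAPI runs the blocking workflow call
    in its thread pool instead of on the event loop.
    
    Response: the complete result as JSON
    """
    message = request.message
    
    if not message:
        raise HTTPException(status_code=400, detail="Message is required")
    
    try:
        # Run the workflow without streaming
        response = text_to_bi_workflow.run(input=message, stream=False)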
        
        return {
            "success": True,
            "content": response.content,
            "workflow_id": response.workflow_id if hasattr(response, 'workflow_id') else None,
            "run_id": response.run_id if hasattr(response, 'run_id') else None,
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

🎨 Implementing the Frontend SSE Client

Step 1: Create the API Wrapper

frontend/src/api/workflow.ts:

import type { AxiosProgressEvent, GenericAbortSignal } from 'axios'
import { post } from './request'

/**
 * Text-to-BI Workflow query (streaming)
 */
export const streamWorkflow = (
  params: { message: string; cubejs_url?: string },
  onMessage: (content: string, isNewStep: boolean) => void,
  abortSignal?: GenericAbortSignal
) => {
  let previousLength = 0
  
  return post({
    url: '/workflow/query',
    data: params,
    signal: abortSignal,
    responseType: 'text',
    onDownloadProgress: (progressEvent: AxiosProgressEvent) => {
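      // Read the full response text received so far
      const rawData = progressEvent.event.target.response
      if (!rawData || typeof rawData !== 'string') return
      
      // Process only complete events: a chunk can end mid-line, so consume
      // up to the last "\n\n" terminator and leave the remainder for the next call
      const lastBreak = rawData.lastIndexOf('\n\n')
      if (lastBreak < previousLength) return
      
      const newData = rawData.slice(previousLength, lastBreak)
      previousLength = lastBreak + 2
      
      if (!newData) return
      
      // Parse the SSE format: data: {content}\n\n
      const lines = newData.split('\n')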
      
      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data = line.slice(6).trim()
          
          if (!data) continue
          
          try {
            // Parse the JSON-encoded payload
            const parsed = JSON.parse(data)
            
            if (parsed === '[DONE]') {
              return
            }
            
            // Check whether this is a step-control signal
            if (typeof parsed === 'object' && parsed.type) {
              if (parsed.type === 'step_start') {
                onMessage('', true) // Tell the caller to start a new step
              } else if (parsed.type === 'step_end') {
                // Step finished
              }
            } else if (typeof parsed === 'string') {
              // Plain content
              onMessage(parsed, false)
            }
          } catch (e) {
            // JSON parse error: skip this line
          }
        }
      }
    },
  })
}

Step 2: Use It in a Component

frontend/src/components/WorkflowPage.vue:

const executeWorkflow = async () => {
  if (!inputMessage.value.trim() || isLoading.value) return

  const query = inputMessage.value.trim()
  userQuery.value = query
  inputMessage.value = ''

  // Reset the content
  allContent.value = ''

  isLoading.value = true
  await nextTick()
  scrollToBottom()

  try {
    await streamWorkflow(
      { message: query },
      (content: string, isNewStep: boolean) => {
        if (isNewStep) {
          // A new step is starting: add a separator
          if (allContent.value) {
            if (!allContent.value.endsWith('\n')) {
              allContent.value += '\n'
            }
            allContent.value += '\n\n'
          }
        } else if (content) {
          // Append the content as-is
          allContent.value += content
        }
        
        // Debounced scroll
        debouncedScrollToBottom()
      }
    )
    
  } catch (error) {
    allContent.value = 'Sorry, an error occurred while running the Workflow.'
    message.error('Workflow execution failed, please try again later')
  } finally {
    isLoading.value = false
    await nextTick()
    scrollToBottom()
  }
}

🔍 SSE Protocol Design

Data Format

# Plain content
data: "This is a piece of text"\n\n

# Step control
data: {"type":"step_start"}\n\n
data: {"type":"step_end"}\n\n

# Completion signal
data: "[DONE]"\n\n

# Error message
data: "Error: error description"\n\n
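
To check that the format is unambiguous, it can help to write the decoding side as well. The sketch below turns raw stream text into `(kind, value)` pairs. `parse_sse` and the kind labels (`control`, `content`, `done`) are hypothetical names chosen for illustration. Error messages arrive as ordinary JSON strings, so this sketch reports them as `content`.

```python
import json
from typing import Iterator, Tuple


def parse_sse(raw: str) -> Iterator[Tuple[str, str]]:
    """Decode protocol frames into (kind, value) pairs (illustrative sketch)."""
    # Frames are separated by a blank line; JSON encoding guarantees that
    # no payload contains a raw newline, so a plain split is safe here
    for frame in raw.split("\n\n"):
        if not frame.startswith("data: "):
            continue  # skip empty or unrelated fragments
        try:
            parsed = json.loads(frame[len("data: "):])
        except json.JSONDecodeError:
            continue  # ignore frames that are not valid JSON
        if parsed == "[DONE]":
            yield ("done", "")
            return  # nothing after the completion signal is processed
        if isinstance(parsed, dict) and "type" in parsed:
            yield ("control", str(parsed["type"]))
        elif isinstance(parsed, str):
            yield ("content", parsed)
```

A round trip through this parser is a quick way to confirm that step signals, text chunks, and the completion marker cannot be confused with one another.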

Example Event Stream

data: {"type":"step_start"}\n\n
data: "## 🔍 Query Analysis"\n\n
data: "\n\n"\n\n
data: "Count the total number of employees"\n\n
data: "\n\n"\n\n
data: "```json"\n\n
data: "\n"\n\n
data: "{"\n\n
data: "  \"measures\": [\"employees.total_employees\"]"\n\n
data: "\n"\n\n
data: "}"\n\n
data: "\n"\n\n
data: "```"\n\n
data: {"type":"step_end"}\n\n

data: {"type":"step_start"}\n\n
data: "\n\n"\n\n
data: "**SQL Query**"\n\n
data: "\n\n"\n\n
data: "```sql"\n\n
data: "\n"\n\n
data: "SELECT COUNT(*) FROM employees"\n\n
data: "\n"\n\n
data: "```"\n\n
data: {"type":"step_end"}\n\n

data: "[DONE]"\n\n

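🐛 Error Handling

Backend Error Handling

import json
import logging

logger = logging.getLogger(__name__)

def generate_response():
    try:
        # format_event is a placeholder for the event-to-SSE mapping from Step 2
        for event in workflow.run(...):
            yield format_event(event)
        yield 'data: "[DONE]"\n\n'
    except ValueError as ve:
        # Known error: the message is safe to report to the client
        error_message = json.dumps(f"Error: {ve}", ensure_ascii=False)
        yield f"data: {error_message}\n\n"
    except Exception:
        # Unknown error: log the details, send only a generic message
        logger.exception("Workflow error")
        error_message = json.dumps("Error: system error, please try again later")
        yield f"data: {error_message}\n\n"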
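
One way to keep this error handling out of each individual generator is a small wrapper around any frame iterator. The sketch below is an assumption about how that could look, not part of the project code. `guard_stream` is a hypothetical name, and the generic message text is a placeholder.

```python
import json
import logging
from typing import Iterable, Iterator

logger = logging.getLogger(__name__)


def guard_stream(frames: Iterable[str]) -> Iterator[str]:
    """Pass frames through; on failure emit one JSON-encoded error frame."""
    try:
        yield from frames
    except ValueError as exc:
        # Expected errors: the message is safe to show to the user
        text = json.dumps(f"Error: {exc}", ensure_ascii=False)
        yield f"data: {text}\n\n"
    except Exception:
        # Unexpected errors: log the traceback, send only a generic message
        logger.exception("stream failed")
        text = json.dumps("Error: internal error")
        yield f"data: {text}\n\n"
```

The response has already started with status 200 by the time a generator fails, so the failure has to travel inside the stream. Wrapping the generator as `guard_stream(generate_response())` would make sure the client always receives a parseable final frame.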

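Frontend Error Handling

import axios from 'axios'

try {
  await streamWorkflow(params, onMessage)
} catch (error) {
  if (axios.isCancel(error)) {
    // Cancelled by the user
    message.info('Query cancelled')
  } else if (axios.isAxiosError(error) && error.response) {
    // The server answered with an error status
    message.error(`Server error: ${error.response.status}`)
  } else if (axios.isAxiosError(error) && error.request) {
    // The request was sent but no response arrived
    message.error('Network connection failed')
  } else {
    // Anything else
    message.error('Unknown error')
  }
}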

🧪 Testing the Streaming Response

Testing with curl

curl -N -X POST "http://localhost:8000/workflow/query" \
  -H "Content-Type: application/json" \
  -d '{"message": "Count the total number of employees"}'

Testing with Python

import requests

def test_stream():
    url = "http://localhost:8000/workflow/query"
    data = {"message": "Count the total number of employees"}
    
    with requests.post(url, json=data, stream=True) as response:
        for line in response.iter_lines():
            if line:
                decoded_line = line.decode('utf-8')
                if decoded_line.startswith('data: '):
                    content = decoded_line[6:]
                    print(content)

if __name__ == "__main__":
    test_stream()

Testing in the Browser

// Run this in the browser console
const response = await fetch('http://localhost:8000/workflow/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message: 'Count the total number of employees' })
});
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  
  // stream: true keeps multi-byte characters intact when they span two chunks
  const chunk = decoder.decode(value, { stream: true });
  console.log(chunk);
}

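💡 Vibe Coding Tips

1. Protocol First

Design the frontend/backend protocol before you implement anything:

Prompt to the AI:
"Design an SSE protocol for streaming Workflow output:
- Support step separators
- Support streamed content
- Support error handling
- Use JSON encoding"

2. Build Incrementally

Version 1: a plain text stream
Version 2: add JSON encoding
Version 3: add step control
Version 4: add error handling
Version 5: tune performance

3. Test Thoroughly

# Test the normal flow
curl -N ...

# Test error handling
curl -N ... -d '{"message": ""}'

# Test a network interruption
# Start a request, then stop the server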

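Section Summary

In this section we implemented streaming responses:

SSE protocol: how Server-Sent Events work and how events are framed
FastAPI integration: an SSE endpoint built on StreamingResponse
Workflow streaming: converting Workflow output into SSE events
Format specification: one consistent SSE data format
Error handling: exception handling inside the streaming response
Performance: proxy buffering disabled via the X-Accel-Buffering header, plus debounced scrolling on the frontend
Testing: several ways to exercise the streaming endpoint

We now have a complete streaming feature, and users can follow the query as it runs.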

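Questions and Exercises

Questions

How do SSE and WebSocket differ? In which scenarios is SSE the better choice?
What happens to an SSE connection when the network drops? How would you implement automatic reconnection?
How do streaming responses affect server performance, and how can that be improved?
How would you implement a progress bar on top of a streaming response?

Exercises

Add progress information:
- Add a progress field to the SSE data
- Show the current step and the total number of steps
- Estimate the remaining time
Error recovery:
- Implement automatic reconnection for the SSE connection
- Support resuming from the point of interruption
- Test a range of network failure scenarios
Performance testing:
- Test multiple concurrent SSE connections
- Monitor server resource usage
- Find the bottlenecks and optimize them
Protocol extension:
- Let the client send control signals (such as pause or cancel)
- Implement two-way communication
- Compare the SSE and WebSocket implementations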

Previous: Section 10: Implementing Workflow Orchestration
Next: Section 12: Backend Testing and Debugging