Section 12: Backend Testing and Debugging

Reading time: about 7 minutes
Difficulty: Hands-on
Prerequisites: Python testing basics, FastAPI

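Section Overview

In this section you will learn:

Testing strategies and methods for FastAPI applications
Writing unit tests and integration tests with pytest
Techniques for testing SSE streaming endpoints
Mocking external dependencies for isolated tests
Troubleshooting with logging and debugging tools
Performance testing and optimization methods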

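Introduction

Testing is a key way to safeguard code quality. This section shows how to write tests for the backend services of the Text-to-BI system, and how to use debugging and performance tools to track down problems.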

🎯 Goals for This Chapter

After finishing, you will be able to:

✅ Test Agents
✅ Test Workflows
✅ Test API endpoints
✅ Use debugging tools
✅ Troubleshoot common problems

🧪 Testing the Agent

Unit Tests

Create the test file:

# backend/tests/test_cubejs_agent.py
import pytest
from agents.cubejs_agent import build_cubejs_agent, query_cubejs

def test_agent_creation():
    """The Agent can be created."""
    agent = build_cubejs_agent()
    assert agent is not None
    assert agent.name is not None

def test_simple_query():
    """A simple aggregate query."""
    # Prompt: "count all employees"
    response = query_cubejs("统计员工总数", stream=False)
    
    # The response must contain the required elements
    # ("查询分析" means "query analysis")
    assert "查询分析" in response
    assert "json" in response.lower()
    assert "measures" in response.lower()

def test_group_query():
    """A grouped query."""
    # Prompt: "count employees by gender"
    response = query_cubejs("按性别统计员工数量", stream=False)
    
    assert "dimensions" in response.lower()
    assert "gender" in response.lower()

def test_invalid_query():
    """An off-topic query."""
    # Prompt: "this is an unrelated question"
    response = query_cubejs("这是一个无关的问题", stream=False)
    
    # The Agent should either handle or reject unrelated queries
    assert response is not None
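
The tests above call the real model, so they are slow and can fail for reasons unrelated to your code. The overview lists mocking external dependencies as a goal, and one way to do that is sketched below with `unittest.mock`. It assumes the agent exposes `run(message, stream=False)` and returns an object with a `.content` attribute, as the integration test later in this section does. The helper names, the canned reply and the measure name `employees.count` are hypothetical.

```python
import json
import re
from unittest.mock import MagicMock

# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3

# Canned reply standing in for real model output (illustrative, not a captured response).
CANNED_REPLY = (
    "查询分析: count all employees\n"
    f"{FENCE}json\n"
    '{"measures": ["employees.count"]}\n'
    f"{FENCE}\n"
)


def extract_json_query(text: str):
    """Return the first fenced JSON block in text as a dict, or None."""
    match = re.search(FENCE + r"json\s*([\s\S]*?)\s*" + FENCE, text)
    return json.loads(match.group(1)) if match else None


def make_fake_agent(reply: str) -> MagicMock:
    """Build a stand-in agent whose run() returns an object with .content."""
    agent = MagicMock(name="fake_cubejs_agent")
    agent.run.return_value = MagicMock(content=reply)
    return agent


def test_query_extraction_without_llm():
    agent = make_fake_agent(CANNED_REPLY)
    response = agent.run("统计员工总数", stream=False)

    assert extract_json_query(response.content) == {"measures": ["employees.count"]}
    # The mock also records how it was called.
    agent.run.assert_called_once_with("统计员工总数", stream=False)
```

A test like this only checks the parsing and plumbing around the model. It says nothing about whether the real Agent produces a good query, so it complements the tests above rather than replacing them.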

Integration Tests

# backend/tests/test_integration.py
# Requires a CubeJS instance running on http://localhost:4000.
import pytest
from agents.cubejs_agent import build_cubejs_agent
from services.cubejs_service import CubeJSService
import json
import re

def extract_json_from_response(response: str):
    """Extract the first fenced JSON block from the response text."""
    json_pattern = r'```json\s*([\s\S]*?)\s*```'
    matches = re.findall(json_pattern, response)
    if matches:
        return json.loads(matches[0])
    return None

def test_agent_to_cubejs_integration():
    """The query generated by the Agent can be executed by CubeJS."""
    # 1. Generate the query with the Agent (prompt: "count all employees")
    agent = build_cubejs_agent()
    response = agent.run("统计员工总数", stream=False)
    query = extract_json_from_response(response.content)
    
    assert query is not None
    assert "measures" in query
    
    # 2. Execute the query through the CubeJS service
    service = CubeJSService(base_url="http://localhost:4000")
    result = service.load(query)
    
    # 3. Verify the result
    assert "data" in result
    assert len(result["data"]) > 0
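
Because this integration test depends on a live CubeJS instance, it fails with a connection error whenever the service is down. A minimal sketch of a reachability check that could guard such tests follows. The function name `service_reachable` is hypothetical, and the default host and port are taken from the `base_url` used above.

```python
import socket


def service_reachable(host: str = "localhost", port: int = 4000, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts and timeouts.
        return False


# In a test module this could back a skip marker, for example:
#
#   import pytest
#
#   requires_cubejs = pytest.mark.skipif(
#       not service_reachable(),
#       reason="CubeJS is not running on localhost:4000",
#   )
#
#   @requires_cubejs
#   def test_agent_to_cubejs_integration():
#       ...
```

A TCP check only shows that something is listening on the port. It does not prove that the CubeJS API is healthy, so treat it as a cheap guard and not as a health check.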

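Running the Tests

# Install pytest (httpx is needed for the API tests later in this section)
pip install pytest pytest-asyncio httpx

# Run all tests
pytest

# Run a specific test file
pytest tests/test_cubejs_agent.py

# Show verbose output
pytest -v

# Show print output
pytest -s

# Run a specific test
pytest tests/test_cubejs_agent.py::test_simple_query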

🔄 Testing the Workflow

Basic Tests

# backend/tests/test_workflow.py
import pytest
from workflows.text_to_bi import text_to_bi_workflow

def test_workflow_execution():
    """The Workflow runs to completion."""
    # Prompt: "count all employees"
    response = text_to_bi_workflow.run(
        input="统计员工总数",
        stream=False
    )
    
    assert response is not None
    assert response.content is not None
    assert len(response.content) > 0

def test_workflow_steps():
    """The Workflow output reflects each step."""
    # Prompt: "count employees by gender"
    response = text_to_bi_workflow.run(
        input="按性别统计员工数量",
        stream=False
    )
    
    # The content should include the key parts
    # ("查询分析" = "query analysis", "查询结果" = "query result", "分析" = "analysis")
    content = response.content
    assert "查询分析" in content or "SQL" in content
    assert "查询结果" in content or "分析" in content

Streaming Tests

def test_workflow_streaming():
    """Streaming output."""
    chunks = []
    
    # Prompt: "count all employees"
    for event in text_to_bi_workflow.run(
        input="统计员工总数",
        stream=True
    ):
        if hasattr(event, 'content') and event.content:
            chunks.append(event.content)
    
    # At least one chunk was received
    assert len(chunks) > 0
    
    # The concatenated content is not empty
    full_content = "".join(chunks)
    assert len(full_content) > 0

Error-Handling Tests

def test_workflow_error_handling():
    """Error handling."""
    try:
        response = text_to_bi_workflow.run(
            input="",  # empty input
            stream=False
        )
        # The Workflow should cope with empty input
        assert response is not None
    except Exception as e:
        # or raise an explicit error
        assert "empty" in str(e).lower() or "required" in str(e).lower()

🌐 Testing the API Endpoints

Using pytest + httpx

# backend/tests/test_api.py
import pytest
from httpx import ASGITransport, AsyncClient
from main import app

def make_client() -> AsyncClient:
    """Build a client that calls the ASGI app in-process, with no server needed.

    Recent httpx releases no longer accept AsyncClient(app=...), so the app
    is wrapped in an ASGITransport.
    """
    return AsyncClient(transport=ASGITransport(app=app), base_url="http://test")

@pytest.mark.asyncio
async def test_health_check():
    """Health-check endpoint."""
    async with make_client() as client:
        response = await client.get("/health")
        assert response.status_code == 200
        assert response.json()["status"] == "healthy"

@pytest.mark.asyncio
async def test_chat_api():
    """Chat endpoint."""
    async with make_client() as client:
        response = await client.post(
            "/chat/ask",
            json={"message": "Hello"}
        )
        assert response.status_code == 200

@pytest.mark.asyncio
async def test_workflow_sync_api():
    """Synchronous Workflow endpoint."""
    async with make_client() as client:
        # Prompt: "count all employees"
        response = await client.post(
            "/workflow/query-sync",
            json={"message": "统计员工总数"}
        )
        assert response.status_code == 200
        data = response.json()
        assert data["success"] is True
        assert "content" in data
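
The tests above only cover request/response endpoints. For the streaming `/workflow/query` endpoint, a test has to read the body incrementally and split it into SSE events. The sketch below shows a small parser for that purpose. It assumes the endpoint emits standard `data:` lines separated by blank lines, and the JSON payload shape `{"content": ...}` in the sample is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines.

    An event ends at a blank line, and multiple data: lines within one
    event are joined with newlines.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the payload.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (":...") and other fields (event:, id:, retry:) are ignored.
    if buffer:
        # Lenient choice for tests: flush an event that lacks its closing
        # blank line. A strict SSE client would discard it instead.
        yield "\n".join(buffer)


# Sample stream with a keep-alive comment between two events.
sample = [
    'data: {"content": "Hello"}',
    "",
    ": keep-alive",
    'data: {"content": " world"}',
    "",
]
chunks = [json.loads(payload)["content"] for payload in iter_sse_data(sample)]
assert "".join(chunks) == "Hello world"
```

In an API test, the lines could come from httpx: open the request with `client.stream("POST", "/workflow/query", json=...)`, collect `response.aiter_lines()` into a list, and pass that list to `iter_sse_data`.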

Testing with curl

# Health check
curl http://localhost:8000/health

# Chat endpoint
curl -X POST http://localhost:8000/chat/ask \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}'

# Synchronous Workflow
curl -X POST http://localhost:8000/workflow/query-sync \
  -H "Content-Type: application/json" \
  -d '{"message": "统计员工总数"}'

# Streaming Workflow (-N disables output buffering so chunks appear as they arrive)
curl -N -X POST http://localhost:8000/workflow/query \
  -H "Content-Type: application/json" \
  -d '{"message": "统计员工总数"}'

🐛 Debugging Tips

1. Enable Debug Mode

# Enable debugging on an Agent
agent = Agent(
    model=DeepSeek(id="deepseek-chat"),
    instructions="...",
    debug_mode=True  # print detailed logs
)

# Enable debugging on a Workflow
workflow = Workflow(
    name="TextToBIWorkflow",
    steps=[...],
    debug_mode=True
)

2. Add Logging

import logging

# Configure logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

logger = logging.getLogger(__name__)

# Log at each stage of a step
# (%s arguments are only formatted when the DEBUG level is enabled)
def get_sql_and_execute(step_input):
    logger.debug("Input content: %s", step_input.get_last_step_content())
    
    query = extract_json_query(step_input.get_last_step_content())
    logger.debug("Extracted query: %s", query)
    
    result = service.load(query)
    logger.debug("Query result: %s", result)
    
    return StepOutput(content=...)

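3. Use the Python Debugger

# Set a breakpoint in the code
import pdb; pdb.set_trace()

# Or use ipdb (friendlier; install it with pip install ipdb)
import ipdb; ipdb.set_trace()

# Python 3.7+: the built-in breakpoint() does the same without an import

# Debugger commands
# n - next line
# s - step into function
# c - continue execution
# p variable - print a variable
# l - list the current code
# q - quit the debugger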

4. Debug with VS Code

Create .vscode/launch.json:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: FastAPI",
      "type": "python",
      "request": "launch",
      "module": "uvicorn",
      "args": [
        "main:app",
        "--reload",
        "--host",
        "0.0.0.0",
        "--port",
        "8000"
      ],
      "jinja": true,
      "justMyCode": false
    },
    {
      "name": "Python: Workflow Test",
      "type": "python",
      "request": "launch",
      "program": "${workspaceFolder}/backend/workflows/text_to_bi.py",
      "console": "integratedTerminal"
    }
  ]
}

📊 Performance Testing

Using pytest-benchmark

# backend/tests/test_performance.py
# Requires: pip install pytest-benchmark
import pytest
from workflows.text_to_bi import text_to_bi_workflow

def test_workflow_performance(benchmark):
    """Benchmark a full Workflow run."""
    def run_workflow():
        # Prompt: "count all employees"
        return text_to_bi_workflow.run(
            input="统计员工总数",
            stream=False
        )
    
    result = benchmark(run_workflow)
    assert result is not None

# Run the performance tests
# pytest tests/test_performance.py --benchmark-only

Load Testing with locust

# backend/tests/locustfile.py
# Requires: pip install locust
from locust import HttpUser, task, between

class WorkflowUser(HttpUser):
    # Each simulated user waits 1 to 3 seconds between tasks
    wait_time = between(1, 3)
    
    @task
    def query_workflow(self):
        # Prompt: "count all employees"
        self.client.post(
            "/workflow/query-sync",
            json={"message": "统计员工总数"}
        )

# Run the load test
# locust -f tests/locustfile.py

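🔍 Troubleshooting Common Problems

Problem 1: The Agent's output format is wrong

Symptom: JSON parsing fails

Diagnosis:

# Print the Agent's raw output
response = agent.run("统计员工总数", stream=False)
print("Raw output:")
print(response.content)
print("\n" + "="*60 + "\n")

# Check whether it contains a JSON code block
if "```json" in response.content:
    print("✓ Contains a JSON code block")
else:
    print("✗ Missing JSON code block")

Fix: Refine the Agent's instructions so that it always returns the query in a JSON code block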
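
Even with tighter instructions, a model will occasionally omit the fence or the `json` tag. A more tolerant extractor can reduce how often that turns into a parsing failure. The sketch below is one possible approach, not the project's actual helper: the function name `extract_query` and the measure name in the usage example are hypothetical.

```python
import json
import re
from typing import Optional

# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def extract_query(text: str) -> Optional[dict]:
    """Best-effort extraction of a JSON object from model output."""
    # 1. Prefer the contents of a fenced block, with or without a "json" tag.
    fenced = re.search(FENCE + r"(?:json)?\s*([\s\S]*?)\s*" + FENCE, text, re.IGNORECASE)
    candidates = [fenced.group(1)] if fenced else []
    # 2. Fall back to scanning the whole text.
    candidates.append(text)

    decoder = json.JSONDecoder()
    for candidate in candidates:
        start = candidate.find("{")
        while start != -1:
            try:
                # raw_decode parses one JSON value starting at `start`
                # and ignores whatever follows it.
                obj, _end = decoder.raw_decode(candidate, start)
                return obj
            except json.JSONDecodeError:
                start = candidate.find("{", start + 1)
    return None


# Works on unfenced output too.
bare = 'Here is the query: {"measures": ["employees.count"]} Hope it helps.'
assert extract_query(bare) == {"measures": ["employees.count"]}
```

Returning `None` instead of raising lets the calling step decide whether to retry the Agent or report a clear error.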

Problem 2: Cannot connect to CubeJS

Symptom: Connection refused

Diagnosis:

# Check whether CubeJS is running
curl http://localhost:4000/cubejs-api/v1/meta

# Check the Docker containers
cd backend/cubejs
docker-compose ps

# View the logs
docker-compose logs

Fix: Start the CubeJS service

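Problem 3: The Workflow runs slowly

Symptom: Response times are too long

Diagnosis:

import time

def get_sql_and_execute(step_input):
    # perf_counter is a monotonic clock intended for measuring durations
    start = time.perf_counter()
    
    # Step 1
    t1 = time.perf_counter()
    query = extract_json_query(...)
    print(f"Extracting the query took: {time.perf_counter() - t1:.2f}s")
    
    # Step 2
    t2 = time.perf_counter()
    sql_result = service.sql(query)
    print(f"Fetching the SQL took: {time.perf_counter() - t2:.2f}s")
    
    # Step 3
    t3 = time.perf_counter()
    results = service.load(query)
    print(f"Executing the query took: {time.perf_counter() - t3:.2f}s")
    
    print(f"Total time: {time.perf_counter() - start:.2f}s")
    
    return StepOutput(...)

Fix: Optimize whichever step the timings show to be slowest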
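
Repeating the timing lines in every step gets noisy. The same measurement can be wrapped in a small context manager, sketched here with only the standard library. The name `timed` and the labels are hypothetical, and the `time.sleep` calls stand in for the real `extract_json_query(...)` and `service.load(query)` calls.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("workflow.timing")


@contextmanager
def timed(label: str, timings: dict):
    """Record the wall-clock duration of the enclosed block under timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed steps are timed too.
        elapsed = time.perf_counter() - start
        timings[label] = elapsed
        logger.debug("%s took %.3fs", label, elapsed)


timings = {}
with timed("extract_query", timings):
    time.sleep(0.01)  # stand-in for extract_json_query(...)
with timed("load", timings):
    time.sleep(0.05)  # stand-in for service.load(query)

# The label with the largest duration is the first candidate for optimization.
slowest = max(timings, key=timings.get)
```

Collecting the durations in a dict makes it easy to log them once per request or to assert an upper bound on a step in a test.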

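💡 Vibe Coding Takeaways

1. Test First

Prompt for the AI:
"Create unit tests for the CubeJS Agent that cover:
1. Agent creation
2. Simple queries
3. Grouped queries
4. Error handling"

2. Debug Step by Step

Step 1: Print variables
Step 2: Add logging
Step 3: Use a debugger
Step 4: Unit tests
Step 5: Integration tests

3. Automate the Tests

# Create a test script
cat > test.sh << 'EOF'
#!/bin/bash
# Stop at the first failing suite so the final message is only printed on success
set -e

echo "Running unit tests..."
pytest tests/test_cubejs_agent.py

echo "Running integration tests..."
pytest tests/test_integration.py

echo "Running API tests..."
pytest tests/test_api.py

echo "All tests finished!"
EOF

chmod +x test.sh
./test.sh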

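Section Summary

In this section we worked through backend testing and debugging:

Testing strategy: combined unit tests, integration tests, and end-to-end tests
pytest usage: covered running, selecting, and writing tests with pytest
FastAPI testing: tested the API endpoints with httpx's AsyncClient
SSE testing: covered ways to test the streaming interface
Mocking: used mocks to isolate external dependencies
Logging and debugging: configured logging and used pdb and VS Code
Performance testing: benchmarked with pytest-benchmark and load-tested with locust

With these tests in place, regressions are far more likely to be caught before they reach production.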

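Reflection and Exercises

Questions to Consider

Where do you draw the line between unit tests and integration tests?
When should you use mocks, and what goes wrong when they are overused?
How do you balance test coverage against development speed?
Does test code itself need to be tested?

Hands-on Exercises

Raise test coverage:
- Add tests for the existing code
- Aim for at least 80% coverage
- Generate a report with the coverage tool
Test-driven development:
- Pick a new feature
- Write the tests before the implementation
- Note what TDD does and does not buy you
Performance optimization:
- Use a profiler to find the bottlenecks
- Optimize slow queries and slow endpoints
- Compare performance before and after
CI/CD integration:
- Configure GitHub Actions
- Run the tests automatically
- Generate a test report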

Previous: Section 11: Streaming Responses and Real-Time Interaction
Next: Section 13: Vue 3 + TypeScript Project Setup