🚀 Vercel AI SDK Guide: Reranking

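When building RAG (retrieval-augmented generation) applications, we often hit the same pain point: vector search (embedding search) is fast, but its precision is sometimes not good enough.

It may recall many documents that look related but do not actually match, or bury the key information under a large number of results. This is where reranking comes in. It acts as a more precise second-pass filter and can noticeably improve the final quality of a RAG system.

This article walks through how to use the rerank function in the Vercel AI SDK.

What is Reranking?

Put simply, reranking is the fine-ranking step in a three-stage process:

Retrieval: Use vector search (embeddings) to quickly pull the top 100 possibly relevant documents from the database.
Reranking: Use a dedicated rerank model (such as Cohere Rerank) to score each of those 100 documents for relevance to the query.
Cutoff: Send the 5 highest-scoring documents to the LLM.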
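
The rerank and cutoff stages can be sketched without any external service. The snippet below is only an illustration of the idea, not how the SDK or a real rerank model works: `overlapScore` is a hypothetical word-overlap scorer standing in for the model, and `rerankAndCut` and `ScoredCandidate` are made-up names.

```typescript
// Hypothetical stand-in for a rerank model:
// the fraction of query words that also appear in the text.
function overlapScore(query: string, text: string): number {
  const queryWords = query.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  const textWords = text.toLowerCase().split(/\s+/);
  if (queryWords.length === 0) return 0;
  const hits = queryWords.filter((w) => textWords.indexOf(w) !== -1).length;
  return hits / queryWords.length;
}

interface ScoredCandidate {
  index: number; // position in the candidate list produced by retrieval
  score: number;
  text: string;
}

// Rerank + cutoff: score every candidate, sort by score descending, keep topN.
function rerankAndCut(query: string, candidateTexts: string[], topN: number): ScoredCandidate[] {
  return candidateTexts
    .map((text, index) => ({ index, score: overlapScore(query, text), text }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}

// Retrieval is assumed to have produced these candidates already.
const retrievedTexts = [
  'sunny day at the beach',
  'rainy afternoon in the city',
  'heavy rain in the city today',
];

// Keeps candidates 2 and 1, in that order (scores 1 and 0.75).
const topCandidates = rerankAndCut('rain in the city', retrievedTexts, 2);
console.log(topCandidates);
```

A real rerank model replaces `overlapScore` with a learned relevance score, but the sort-and-truncate shape stays the same.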

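Vercel AI SDK Core provides a standardized rerank function, which lets you plug in reranking models from providers such as Cohere and Amazon Bedrock.

🛠️ Setup

First, make sure you have installed the ai SDK and the matching provider (Cohere is used here; it is one of the most popular rerank providers):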

Bash

npm install ai @ai-sdk/cohere

Make sure you have a Cohere API key and have set it as an environment variable:

COHERE_API_KEY=your_key_here

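💻 Basic Usage: Reranking an Array of Strings

This is the simplest case. Suppose you have a set of text snippets and want to find the ones most relevant to "rain".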

TypeScript

import { rerank } from 'ai';
import { cohere } from '@ai-sdk/cohere';

async function main() {
  const documents = [
    'sunny day at the beach',       // document 0
    'rainy afternoon in the city',  // document 1
    'snowy night in the mountains', // document 2
  ];

  // Call rerank; the sorted entries come back in `ranking`
  const { ranking } = await rerank({
    model: cohere.reranking('rerank-v3.5'), // rerank model to use
    query: 'talk about rain',               // the user's query
    documents,                              // documents to rank
    topN: 2,                                // keep only the top 2
  });

  // Each entry holds the document's position in the input array, its score, and the document
  ranking.forEach((entry) => {
    console.log(`Index: ${entry.originalIndex}, Score: ${entry.score}, Content: ${entry.document}`);
  });
}

main().catch(console.error);

Example output:

Plaintext

Index: 1, Score: 0.98, Content: rainy afternoon in the city
Index: 0, Score: 0.01, Content: sunny day at the beach

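As you can see, the rerank model scores each document, and the results come back sorted by score in descending order.

🏗️ Advanced Usage: Structured Data (JSON)

In real projects, what we retrieve is usually objects from a database, with fields such as title, author, and body. The rerank function accepts these directly.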

TypeScript

import { rerank } from 'ai';
import { cohere } from '@ai-sdk/cohere';

// Simulated list of emails fetched from a database
const emails = [
  {
    id: 'msg_01',
    from: 'Paul Doe',
    subject: 'Follow-up',
    text: 'We are happy to give you a discount of 20% on your next order.',
  },
  {
    id: 'msg_02',
    from: 'John McGill',
    subject: 'Missing Info',
    text: 'Sorry, but here is the pricing information from Oracle: $5000/month',
  },
];

async function rankEmails() {
  const { ranking } = await rerank({
    model: cohere.reranking('rerank-v3.5'),
    documents: emails, // pass the object array directly
    query: 'Which pricing did we get from Oracle?', // ask about the Oracle quote
    topN: 1, // we only need the single best match
  });

  // ranking[0].document is the original email object
  const topMatch = ranking[0].document;

  console.log('Most relevant email is from:', topMatch.from);
  console.log('Email body:', topMatch.text);
}

rankEmails().catch(console.error);

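Key point: Object documents are handed to the rerank model without any manual preprocessing, so you do not need to concatenate their fields into a string yourself.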
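
If you would rather control exactly which fields influence the ranking, one option is to flatten each object into a string yourself and then map the ranked positions back to the original objects. The sketch below assumes two hypothetical helpers, `emailToText` and `pickByIndex`; neither is part of the AI SDK, and the ranked index list is hard-coded rather than produced by a model.

```typescript
interface EmailRecord {
  id: string;
  from: string;
  subject: string;
  text: string;
}

// Hypothetical helper: keep only the fields that should affect relevance
// (the internal id is deliberately left out).
function emailToText(email: EmailRecord): string {
  return `From: ${email.from}\nSubject: ${email.subject}\n\n${email.text}`;
}

// Hypothetical helper: map ranked positions back to the original objects.
function pickByIndex<T>(items: T[], rankedIndices: number[]): T[] {
  return rankedIndices.map((i) => items[i]);
}

const sampleEmails: EmailRecord[] = [
  {
    id: 'msg_01',
    from: 'Paul Doe',
    subject: 'Follow-up',
    text: 'We are happy to give you a discount of 20% on your next order.',
  },
  {
    id: 'msg_02',
    from: 'John McGill',
    subject: 'Missing Info',
    text: 'Sorry, but here is the pricing information from Oracle: $5000/month',
  },
];

// These strings are what you would pass to the reranker as documents.
const emailTexts = sampleEmails.map(emailToText);

// Assume the reranker reported position 1 as the best match.
const bestEmails = pickByIndex(sampleEmails, [1]);
console.log(bestEmails[0].id);
```

The trade-off is extra code in exchange for a predictable input format.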

⚙️ Core Parameters

A few key parameters of rerank determine your cost and result quality:

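Parameter	Type	Description
`model`	Model	Required. The model to use, e.g. `cohere.reranking('rerank-v3.5')`.
`query`	string	Required. The user's search query.
`documents`	array	Required. The documents to rank (strings or objects).
`topN`	number	Strongly recommended. Number of top results to return. For example, with 100 input documents and `topN: 5`, only the 5 highest-scoring ones are returned.
`minScore`	number	(Supported by some models only.) Minimum score threshold that filters out results with low relevance. If your SDK version or provider does not expose it, filter the returned results by score in your own code.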
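
Where a score threshold is not available as an option, the same effect can be achieved in application code. This is a minimal sketch, assuming each ranked item carries a numeric `score`; `applyMinScore` and `RankedItem` are hypothetical names, not SDK exports, and the sample scores are taken from the example output earlier in this article.

```typescript
interface RankedItem<T> {
  score: number;
  document: T;
}

// Hypothetical helper: drop items below minScore, sort by score descending,
// then keep at most topN.
function applyMinScore<T>(ranked: RankedItem<T>[], minScore: number, topN: number): RankedItem<T>[] {
  return ranked
    .filter((item) => item.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}

const rankedSample: RankedItem<string>[] = [
  { score: 0.98, document: 'rainy afternoon in the city' },
  { score: 0.01, document: 'sunny day at the beach' },
];

// Only the first item clears the 0.5 threshold.
const relevantOnly = applyMinScore(rankedSample, 0.5, 5);
console.log(relevantOnly);
```

A sensible threshold depends on the model, since score distributions differ between rerankers, so it is worth calibrating on your own data.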

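🌟 Best Practice: RAG Pipeline Integration

A mature RAG retrieval flow usually looks like this:

DB Query: SELECT * FROM knowledge_base ORDER BY embedding <=> query_embedding LIMIT 50 (recall 50 rows first)
Rerank: Pass those 50 rows to the rerank function with topN: 5.
LLM Generation: Feed the 5 best chunks to the LLM as context.

This saves LLM token cost (only 5 chunks are sent) and helps reduce hallucinations (because those 5 chunks were carefully selected).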
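
The three steps above can be sketched as a small pipeline with each stage injected as a function. The stage types (`RetrieveFn`, `RerankFn`, `GenerateFn`), `buildPrompt`, and `answerWithRag` are all hypothetical names chosen for this illustration, and the prompt wording is only an assumption; plug in your own vector query, rerank call, and LLM call.

```typescript
// Hypothetical stage signatures for the three steps.
type RetrieveFn = (query: string, limit: number) => Promise<string[]>;
type RerankFn = (query: string, docs: string[], topN: number) => Promise<string[]>;
type GenerateFn = (prompt: string) => Promise<string>;

// Builds the LLM prompt from the reranked context chunks.
function buildPrompt(question: string, contextChunks: string[]): string {
  const context = contextChunks.map((chunk, i) => `[${i + 1}] ${chunk}`).join('\n');
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

async function answerWithRag(
  question: string,
  retrieveStage: RetrieveFn,
  rerankStage: RerankFn,
  generateStage: GenerateFn,
): Promise<string> {
  const recalled = await retrieveStage(question, 50);    // step 1: recall 50 candidates
  const best = await rerankStage(question, recalled, 5); // step 2: keep the 5 best
  return generateStage(buildPrompt(question, best));     // step 3: generate from those chunks
}
```

Keeping the stages as parameters makes it easy to swap the reranker or to test the pipeline with stubs.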

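💡 Tips

Token limits: Rerank models usually have a token limit too (for example 4k or 8k). Do not pass in a whole book at once; split it into chunks first.
Fallback: If the rerank endpoint goes down, simply fall back to the top few vector search results so the system keeps working.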
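
Both tips can be sketched in a few lines. `chunkByWords` and `rerankWithFallback` are hypothetical helpers, not SDK functions. Word count is used here only as a rough proxy for tokens, since real limits depend on the model's tokenizer, and the rerank call is passed in as a parameter so the wrapper does not assume any particular API.

```typescript
// Hypothetical helper: split text into chunks of at most maxWords words.
function chunkByWords(text: string, maxWords: number): string[] {
  if (maxWords < 1) throw new Error('maxWords must be at least 1');
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(' '));
  }
  return chunks;
}

// Hypothetical wrapper: if the rerank call throws, fall back to the first topN
// candidates in their original (vector search) order.
async function rerankWithFallback(
  vectorHits: string[],
  topN: number,
  rerankCall: (docs: string[], topN: number) => Promise<string[]>,
): Promise<string[]> {
  try {
    return await rerankCall(vectorHits, topN);
  } catch (error) {
    console.warn('Rerank failed, falling back to vector search order:', error);
    return vectorHits.slice(0, topN);
  }
}
```

In production you would probably also want a timeout on the rerank call, so that a slow endpoint triggers the fallback as well as a failing one.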

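Summary

The rerank function in the Vercel AI SDK is like adding an expert reviewer to your search system. It adds the latency of one more API call (typically a few hundred milliseconds), but for improving answer quality on complex questions the cost is well worth it.

If you are building an enterprise knowledge base assistant, Embedding + Reranking is the current gold-standard combination.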