Vertex AI 与 Elasticsearch 开放推理 API 集成，为你的 RAG 应用程序带来重新排名

作者：来自 Elastic Tim Grein

你可以使用 Google Vertex AI 和 Elasticsearch 开放推理 API 构建语义搜索和语义重新排名！

在我们与 Google Vertex AI 团队密切合作之后，我们很高兴地宣布 Elasticsearch 向量数据库现已与 Google Vertex AI 原生集成，使开发人员能够存储由 Google Vertex AI 文本嵌入 API（Google Vertex AI Text Embeddings API）中可用的任何文本嵌入模型生成的嵌入。Elasticsearch 还通过 Elastic 开放推理 API 与 Google Vertex AI 的重新排名功能原生集成。开发人员可以结合使用这两种功能，也可以单独使用，以构建强大的语义搜索和 RAG 应用程序。

Google Vertex AI 是一个用于 AI 应用程序的托管开发平台。你可以访问各种模型，包括文本嵌入和重新排名模型。在这篇博文中，我们将使用 text-embedding-004 为英文文本生成嵌入。为了提高搜索结果的质量，我们将使用 semantic-ranker-512@latest 模型。

在此博客中，我们将：

使用 text-embedding-004 生成嵌入
使用 Elastic 的新 semantic_text 字段在 Elasticsearch 中分块和存储嵌入
使用 semantic-ranker-512@latest 构建语义重新排名示例
使用 Elastic 的检索器构建具有 BM25 和语义重新排名的两阶段检索

开始生成嵌入

首先，你需要一个可以访问 Google Vertex AI 平台的 Google 帐户。然后，你需要在 Google Cloud 控制台中选择一个现有项目或创建一个新的 Google Cloud 项目。将你的项目 ID 保存在某个地方，因为我们稍后会用到它。

之后你需要启用 Vertex AI API：

然后你需要在 IAM & Admin > Service Accounts 下创建一个服务帐户：

你需要确保你的服务帐户具有正确的角色和权限，以便能够使用 Google Vertex AI 生成嵌入。分配包含权限 aiplatform.endpoints.predict 的 Vertex AI 用户（roles aiplatform.user）。

为你的服务帐户创建密钥时，请确保选择 json。下载服务帐户 JSON 并保存以供日后使用。

现在，打开 Kibana 的开发控制台。你还可以使用 HTTP GUI 客户端（如 postman）或任何其他工具（如 curl）执行以下步骤。

你将使用创建推理 API（Create inference API）创建推理端点，方法是提供服务帐户 JSON、位置、要使用的模型和项目 ID：



1.  PUT _inference/text_embedding/google_vertex_ai_embedding
2.  {
3.      "service": "googlevertexai",
4.      "service_settings": {
5.          "service_account_json": "<service-account-json>",
6.          "model_id": "<model-id>",
7.          "location": "<location>",
8.          "project_id": "<project-id>"
9.      }
10.  }

你将收到来自 Elasticsearch 的响应，其中包含创建的端点：



1.  {
2.    "inference_id": "google_vertex_ai_embedding",
3.    "task_type": "text_embedding",
4.    "service": "googlevertexai",
5.    "service_settings": {
6.      "location": "<location>",
7.      "project_id": "<project-id>",
8.      "model_id": "<model-id>",
9.      "dimensions": 768,
10.      "similarity": "dot_product",
11.      "rate_limit": {
12.        "requests_per_minute": 30000
13.      }
14.    },
15.    "task_settings": {}
16.  }

在底层，Elasticsearch 将使用你的凭据连接到 Google Vertex AI，以获取用于生成嵌入的维度数。它还会将检索期间使用的相似度度量设置为合理的默认值（在本例中为 dot_product）。

你可以通过调用执行推理 API（ perform inference API）来测试你的端点：



1.  POST _inference/text_embedding/google_vertex_ai_embedding
2.  {
3.    "input": "This text will be embedded"
4.  }

API 将返回一个响应，其中包含针对你的输入生成的嵌入：



1.  {
2.    "text_embedding": [
3.      {
4.        "embedding": [
5.          -0.014122169,
6.          0.044469967,
7.          0.02421774,
8.          -0.003546892,
9.          ...
10.         ]
11.      }
12.     ]
13.  }

将 semantic_text 与 Google Vertex AI 嵌入结合使用

现在，我们已经使用 Google Vertex AI 嵌入设置了推理端点，我们可以使用 Elasticsearch 中的新 semantic_text 字段类型来执行开箱即用的语义搜索，而无需设置额外的应用程序代码，在提取和查询期间明确调用推理 API 来生成嵌入。

请注意，如果要生成嵌入的文本太大，推理 API 和 semantic_text 的组合会执行自动分块（将较大的文本分成较小的块）！我们非常高兴看到开发人员如何使用此功能。

为了继续我们的示例，我们将创建一个索引，该索引引用我们的推理端点 google_vertex_ai_embedding，以便在索引文档和查询时搜索时生成嵌入：



1.  PUT my-index
2.  {
3.    "mappings": {
4.      "properties": {
5.        "my_field": {
6.          "type": "semantic_text",
7.          "inference_id": "google_vertex_ai_embedding"
8.        }
9.      }
10.    }
11.  }

我们现在可以索引文档，semantic_text 字段类型将负责使用我们的推理端点生成密集嵌入，该端点在后台调用 Google Vertex AI API。出于演示目的，我们只索引一个文档：



1.  PUT my-index/_doc/doc1
2.  {
3.    "my_field": "These are not the droids you're looking for. He's free to go around"
4.  }

你现在可以使用以下请求发出语义搜索请求：



1.  GET my-index/_search

3.  {
4.    "query": {
5.      "semantic": {
6.        "field": "my_field",
7.        "query": "robots you're searching for"
8.      }
9.    }
10.  }

你应该取回我们之前索引的文档：



1.  {
2.    "took": 818,
3.    "timed_out": false,
4.    "_shards": {
5.      "total": 1,
6.      "successful": 1,
7.      "skipped": 0,
8.      "failed": 0
9.    },
10.    "hits": {
11.      "total": {
12.        "value": 1,
13.        "relation": "eq"
14.      },
15.      "max_score": 0.7608695,
16.      "hits": [
17.        {
18.          "_index": "my-index",
19.          "_id": "doc1",
20.          "_score": 0.7608695,
21.          "_source": {
22.            "my_field": {
23.              "text": "These are not the droids you're looking for. He's free to go around",
24.              "inference": {
25.                "inference_id": "google_vertex_ai_embedding",
26.                "model_settings": {
27.                  "task_type": "text_embedding",
28.                  "dimensions": 768,
29.                  "similarity": "dot_product",
30.                  "element_type": "float"
31.                },
32.                "chunks": [
33.                  {
34.                    "text": "These are not the droids you're looking for. He's free to go around",
35.                    "embeddings": [
36.                      -0.0022440397,
37.                      -0.028731884,
38.                      ...
39.  		     ]
40.                  }
41.                ]
42.             }
43.        }
44.     ]
45.  }

开始使用语义重新排序

我们已经创建了一个帐户和一个项目，我们将重复使用它们来设置重新排序（reranking）。

为什么要使用重新排序器？重新排序器可以提高早期检索机制结果的相关性。语义重新排序器使用机器学习模型根据搜索结果与查询的语义相似性对其进行重新排序。

回到我们的实现，你需要启用 Discovery Engine API 才能对文档进行重新排序：

你可以重复使用我们之前创建的服务帐户，也可以创建一个新的服务帐户。将包含权限 discoveryengine.rankingConfigs.rank 的角色 Discovery Engine Viewer (roles/discoveryengine.viewer) 分配给你将使用的服务帐户。

对于以下步骤，你可以再次使用 Kibana Dev Console 或你喜欢的任何 http 客户端。

首先，我们将创建一个推理端点，以便能够对文档进行重新排名：



1.  PUT _inference/rerank/google_vertex_ai_rerank
2.  {
3.      "service": "googlevertexai",
4.      "service_settings": {
5.          "service_account_json": "<service-account-json>",
6.          "project_id": "<project-id>"
7.      }
8.  }

你将再次收到表明推理端点已成功创建的响应：



1.  {
2.    "inference_id": "google_vertex_ai_rerank",
3.    "task_type": "rerank",
4.    "service": "googlevertexai",
5.    "service_settings": {
6.      "project_id": "<project-id>",
7.      "rate_limit": {
8.        "requests_per_minute": 300
9.      }
10.    },
11.    "task_settings": {}
12.  }

让我们将一组文档发送到我们新创建的端点，以确保重新排名有效。我们将使用 Google Vertex AI 的重新排名文档作为示例：



1.  POST _inference/rerank/google_vertex_ai_rerank
2.  {
3.    "query": "Why is the sky blue?",
4.    "input": [
5.     "A canvas stretched across the day,\nWhere sunlight learns to dance and play.\nBlue, a hue of scattered light,\nA gentle whisper, soft and bright.",
6.     "The sky appears blue due to a phenomenon called Rayleigh scattering. Sunlight is comprised of all the colors of the rainbow. Blue light has shorter wavelengths than other colors, and is thus scattered more easily."
7.    ]
8.  }

你将收到回复，其中科学解释的排名应该更高：



1.  {
2.    "rerank": [
3.      {
4.        "index": 0,
5.        "relevance_score": 0.82,
6.        "text": "The sky appears blue due to a phenomenon called Rayleigh scattering. Sunlight is comprised of all the colors of the rainbow. Blue light has shorter wavelengths than other colors, and is thus scattered more easily."
7.      },
8.      {
9.        "index": 1,
10.        "relevance_score": 0.43,
11.        "text": """A canvas stretched across the day,
12.  Where sunlight learns to dance and play.
13.  Blue, a hue of scattered light,
14.  A gentle whisper, soft and bright."""
15.      }
16.    ]
17.  }

将检索器与 Google Vertex AI 语义重新排名结合使用

在 8.14 中，我们引入了另一个令人兴奋的功能，称为检索器（retrievers），它提供了一个直观的 API，可以在单个 _search API 调用中定义多阶段检索管道。这消除了应用程序向 Elasticsearch 发出多个搜索请求并在之后相应地合并结果的负担。在我们的示例中，我们定义了一个简单的多阶段检索管道，它使用常见的 BM25 算法来检索一组与术语 “sky” 相关的文档。之后，这组文档将传递到我们的推理端点 google_vertex_ai_rerank，以进一步优化结果集的顺序，从而为我们提供天空为何是蓝色的科学解释。

我们正在使用 Bulk API 创建一小部分文档，其中包括关于 mountains、sky、ocean 的诗歌，以及天空为何是蓝色的一个科学解释：



1.  PUT _bulk
2.  {"index": {"_index": "another-index", "_id": "1"} }
3.  {"text": "A canvas stretched across the day,\nWhere sunlight learns to dance and play.\nBlue, a hue of scattered light,\nA gentle whisper, soft and bright."}
4.  {"index": {"_index": "another-index", "_id": "2"} }
5.  {"text": "The sky appears blue due to a phenomenon called Rayleigh scattering. Sunlight is comprised of all the colors of the rainbow. Blue light has shorter wavelengths than other colors, and is thus scattered more easily."}
6.  {"index": {"_index": "another-index", "_id": "3"} }
7.  {"text": "The sky wraps around the earth so wide, A tender touch where dreams reside. Golden streaks at dawn’s first light, The sky awakens, pure and bright."}
8.  {"index": {"_index": "another-index", "_id": "4"} }
9.  {"text": "The sky dims as the day unwinds, A lullaby of soft winds kind. Purple hues in twilight's grace, The night arrives, a tender embrace."}
10.  {"index": {"_index": "another-index", "_id": "5"} }
11.  {"text": "The sky at night, a velvet sea, With stars that shine so endlessly. A canvas dark, yet full of light, Guiding travelers through the night."}
12.  {"index": {"_index": "another-index", "_id": "6"} }
13.  {"text": "The ocean hums a gentle tune, Beneath the light of the silver moon. Waves that cradle dreams so deep, In their rhythm, we find sleep."}
14.  {"index": {"_index": "another-index", "_id": "7"} }
15.  {"text": "The tide retreats with a whispered sigh, Leaving shells as memories pass by. A dance of water, soft and slow, In and out, a constant flow."}
16.  {"index": {"_index": "another-index", "_id": "8"} }
17.  {"text": "The ocean’s arms are vast and wide, A place where endless wonders hide. Blue as far as the eye can see, A world of calm and mystery."}
18.  {"index": {"_index": "another-index", "_id": "9"} }
19.  {"text": "The ocean speaks in murmurs low, Secrets only the dolphins know. A world below the surface bright, Where colors blend in liquid light."}
20.  {"index": {"_index": "another-index", "_id": "10"} }
21.  {"text": "The ocean sings a song so clear, A melody that draws us near. Waves that rise and gently fall, In their call, we hear it all."}
22.  {"index": {"_index": "another-index", "_id": "11"} }
23.  {"text": "The mountains stand with ancient pride, Their secrets in the rocks they hide. A silent strength, so bold and high, Reaching peaks that touch the eye."}
24.  {"index": {"_index": "another-index", "_id": "12"} }
25.  {"text": "A mountain trail that winds and weaves, Through forests thick with whispering leaves. Each step a journey, each climb a test, Until you find the summit's rest."}
26.  {"index": {"_index": "another-index", "_id": "13"} }
27.  {"text": "The mountains echo with the sound, Of winds that dance and streams that bound. A fortress carved by time’s own hand, Where nature’s might and beauty stand."}



1.  POST another-index/_search
2.  {
3.    "retriever": { // Retriever query
4.      "text_similarity_reranker": { // Outermost retriever will perform reranking
5.        "retriever": {
6.          "standard": { // First-stage retriever is a standard Elasticsearch query
7.            "query": {
8.              "match": { // Standard BM25 matching
9.                "text": "sky"
10.              }
11.            }
12.          }
13.        },
14.        "field": "text", // Document field to send to reranker
15.        "rank_window_size": 10, // Reranking will work on top K hits
16.        "inference_id": "google_vertex_ai_rerank", // Inference endpoint
17.        "inference_text": "Why is the sky blue?",
18.        "min_score": 0.6 // Minimum relevance score
19.      }
20.    }
21.  }

结论

只需使用几个简单的 API 调用，即可利用 semantic_text 和 retrievers 的强大功能以及 Google Vertex AI 的密集嵌入和重新排名功能，从而抽象出语义搜索和重新排名的复杂部分。所有这些功能都已在我们的 serverless 产品中提供，请立即试用！

访问 Search Labs 上的 Google Vertex AI 页面，或尝试搜索实验室 GitHub 上的其他示例笔记本。

准备好自己尝试了吗？开始免费试用。

Elasticsearch 集成了 LangChain、Cohere 等工具。加入我们的 Beyond RAG Basics 网络研讨会，构建您的下一个 GenAI 应用程序！

原文：Elasticsearch vector database adds support for Google Vertex AI — Search Labs