Elasticsearch open inference API adds support for IBM watsonx.ai rerank models

By Saikat Sarkar, Elastic

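Explore how to use IBM watsonx™ reranking when building search experiences in the Elasticsearch vector database.

Elasticsearch has native integrations with industry-leading generative AI tools and providers. Check out our webinars on going beyond RAG basics, or on building production-ready apps with the Elastic Vector Database.

To build the best search solution for your use case, start a free cloud trial now or try Elastic on your local machine.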

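Elastic released Elastic Rerank in December 2024, bringing semantic search capabilities with high relevance, strong performance, and efficiency, all without reindexing. That core capability is now more flexible: developers can bring their own models from Cohere, Vertex AI, Hugging Face, Jina AI, and now IBM watsonx.ai. With the open Inference API, you can integrate, test, and tune reranking as needed.

In addition to supporting IBM watsonx™ Slate embedding models, the Elasticsearch vector database provides conversational search capabilities for watsonx Assistant and improves answer quality through semantic reranking.

Reranking improves large language model responses by scoring the retrieved documents and putting the most relevant ones first. Because it runs as a second stage on top of an existing retrieval step, it can be applied to a dataset without reindexing or remapping.

IBM watsonx offers high-quality reranker models that score and rank passages by their relevance to a query, which helps sharpen the precision of search results. These models support tasks such as semantic search and document comparison, and they are key to producing highly relevant answers in AI-driven retrieval systems.

In this blog, we explore how to use IBM watsonx™ reranking in the Elasticsearch vector database to semantically reorder search results, giving you more precise and more contextual answers without changing your existing indices.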

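How reranking creates a powerful search experience

Semantic reranking matters because users expect the best answers at the top, and generative AI models need accurate results to avoid generating incorrect information. Semantic reranking provides consistent scores, ensures that AI models use the most relevant documents, and enables effective cutoff points that help prevent hallucinations.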
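To make the two-stage idea and the score cutoff concrete, here is a minimal Python sketch. It is not how Elasticsearch or watsonx.ai implement reranking: `first_stage`, `rerank`, and `toy_score` are hypothetical names, and `toy_score` is a simple term-overlap placeholder standing in for a real cross-encoder model.

```python
from typing import Callable, List, Tuple


def first_stage(query: str, docs: List[str]) -> List[str]:
    """Cheap lexical recall: keep documents sharing at least one query term."""
    terms = set(query.lower().split())
    return [d for d in docs if terms & set(d.lower().split())]


def toy_score(query: str, doc: str) -> float:
    """Placeholder scorer (assumption): fraction of query terms found in doc.

    A real reranker would call a cross-encoder model here instead.
    """
    terms = set(query.lower().split())
    return len(terms & set(doc.lower().split())) / len(terms)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    min_score: float,
) -> List[Tuple[float, str]]:
    """Second stage: score each candidate, sort descending, apply a cutoff."""
    scored = [(score_fn(query, d), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(s, d) for s, d in scored if s >= min_score]


docs = [
    "I have a bad feeling about this",
    "a feeling of calm",
    "give me a break",
]
candidates = first_stage("bad feeling", docs)
top = rerank("bad feeling", candidates, toy_score, min_score=0.75)
print(top)
```

The cutoff is what keeps weakly related passages out of the context handed to a generative model: only candidates at or above `min_score` survive the second stage.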

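Prerequisites and inference endpoint creation

Create an Elasticsearch Serverless project.

Elasticsearch Cloud Serverless offers fast query execution and seamless integration with the open Inference API, which makes it well suited to deploying reranking without infrastructure overhead.

Generate an API key in IBM Cloud

Go to IBM watsonx.ai Cloud and log in with your credentials. You will land on the welcome page.

Go to the API keys page.
Create an API key.

Steps in Elasticsearch

In Kibana DevTools, create an inference endpoint for reranking using the watsonxai service. This example uses the MS Marco MiniLM L-12 v2 model, which IBM supports, to ensure high relevance in passage retrieval.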

```
PUT _inference/rerank/ibm_watsonx_rerank
{
    "service": "watsonxai",
    "service_settings": {
        "api_key": "<api_key>",
        "url": "xxx.ml.cloud.ibm.com",
        "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
        "project_id": "<project_id>",
        "api_version": "2024-05-02"
    }
}
```
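If you prefer to create the endpoint from a script rather than DevTools, the same request can be assembled with Python's standard library. This is a sketch under assumptions: `build_rerank_endpoint_request` is a hypothetical helper, the Elasticsearch URL and API key are placeholders, and the `<api_key>`, `<project_id>`, and `xxx.ml.cloud.ibm.com` values are the unfilled placeholders from the request above. The function only builds the request; nothing is sent.

```python
import json
import urllib.request
from typing import Any, Dict


def build_rerank_endpoint_request(
    es_url: str,
    es_api_key: str,
    inference_id: str,
    service_settings: Dict[str, Any],
) -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates a rerank endpoint."""
    body = {"service": "watsonxai", "service_settings": service_settings}
    return urllib.request.Request(
        url=f"{es_url.rstrip('/')}/_inference/rerank/{inference_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Elasticsearch API key authentication header.
            "Authorization": f"ApiKey {es_api_key}",
        },
    )


request = build_rerank_endpoint_request(
    es_url="https://example.invalid",  # placeholder, not a real deployment
    es_api_key="<elasticsearch_api_key>",  # placeholder
    inference_id="ibm_watsonx_rerank",
    service_settings={
        "api_key": "<api_key>",
        "url": "xxx.ml.cloud.ibm.com",
        "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
        "project_id": "<project_id>",
        "api_version": "2024-05-02",
    },
)
print(request.get_method(), request.full_url)
```

Once the placeholders are replaced with real values, passing `request` to `urllib.request.urlopen` would send it.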

After the inference endpoint is created successfully, you will receive the following response:

```
{
  "inference_id": "ibm_watsonx_rerank",
  "task_type": "rerank",
  "service": "watsonxai",
  "service_settings": {
    "url": "xxx.ml.cloud.ibm.com",
    "api_version": "2024-05-02",
    "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
    "project_id": "<project_id>",
    "rate_limit": {
      "requests_per_minute": 120
    }
  }
}
```

Now let's create an index.

```
PUT quotes-index
{
  "mappings": {
    "properties": {
      "movie_title": {
        "type": "text"
      },
      "quotes": {
        "type": "text"
      }
    }
  }
}
```

Next, insert data into the index you created.

```
PUT quotes-index/_doc/1
{
  "movie_title": "The Big Lebowski",
  "quotes": [
    "That rug really tied the room together",
    "Yeah, well, you know, that's just like, uh, your opinion, man"
  ]
}

PUT quotes-index/_doc/2
{
  "movie_title": "Star Wars",
  "quotes": [
    "These are not the droids you're looking for",
    "I have a bad feeling about this",
    "Do. Or do not. There is no try."
  ]
}

PUT quotes-index/_doc/3
{
  "movie_title": "The Avengers",
  "quotes": [
    "What's the matter, scared of a little lightning?",
    "Superheroes? In New York? Give me a break!"
  ]
}
```
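For more than a handful of documents, the individual `PUT` requests above can be combined into a single `_bulk` request. The sketch below only builds the newline-delimited JSON payload that the Bulk API expects (an action line followed by the document source, with a trailing newline); `build_bulk_payload` is a hypothetical helper name, and sending the payload is left out.

```python
import json
from typing import Any, Dict


def build_bulk_payload(index: str, docs: Dict[str, Dict[str, Any]]) -> str:
    """Return an NDJSON body for the _bulk API: action line, then source line."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


payload = build_bulk_payload(
    "quotes-index",
    {
        "1": {
            "movie_title": "The Big Lebowski",
            "quotes": ["That rug really tied the room together"],
        },
        "2": {
            "movie_title": "Star Wars",
            "quotes": ["I have a bad feeling about this"],
        },
    },
)
print(payload)
```

The resulting string would be sent as the body of `POST _bulk` with the `Content-Type: application/x-ndjson` header.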

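Next, let's search using the text_similarity_reranker retriever. It improves search results by using a machine learning model to reorder documents according to their semantic similarity to the specified inference text.

This retriever lets you configure both the retrieval and the reranking of search results in a single API call.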

```
POST quotes-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "quotes": "feeling lightning"
            }
          }
        }
      },
      "field": "quotes",
      "inference_id": "ibm_watsonx_rerank",
      "inference_text": "feeling lightning",
      "rank_window_size": 50
    }
  },
  "size": 50
}
```
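When the query text comes from user input, it helps to generate this request body in code so that the `match` query and `inference_text` stay in sync. The following Python sketch mirrors the request above; `build_rerank_search` is a hypothetical helper, and the defaults simply repeat the values used in this example.

```python
import json
from typing import Any, Dict


def build_rerank_search(
    field: str,
    query_text: str,
    inference_id: str,
    rank_window_size: int = 50,
    size: int = 50,
) -> Dict[str, Any]:
    """Build a _search body: lexical match first, then semantic reranking."""
    return {
        "retriever": {
            "text_similarity_reranker": {
                # First stage: standard lexical retrieval on the same field.
                "retriever": {
                    "standard": {"query": {"match": {field: query_text}}}
                },
                # Second stage: rerank the top candidates with the endpoint.
                "field": field,
                "inference_id": inference_id,
                "inference_text": query_text,
                "rank_window_size": rank_window_size,
            }
        },
        "size": size,
    }


search_body = build_rerank_search("quotes", "feeling lightning", "ibm_watsonx_rerank")
print(json.dumps(search_body, indent=2))
```

The dictionary can be serialized and posted to `quotes-index/_search` by whichever HTTP or Elasticsearch client you already use.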

Next, let's verify the returned results.

```
{
  "took": 718,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.003072956,
    "hits": [
      {
        "_index": "quotes-index",
        "_id": "3",
        "_score": 0.003072956,
        "_source": {
          "movie_title": "The Avengers",
          "quotes": [
            "What's the matter, scared of a little lightning?",
            "Superheroes? In New York? Give me a break!"
          ]
        }
      },
      {
        "_index": "quotes-index",
        "_id": "2",
        "_score": 0.000024473073,
        "_source": {
          "movie_title": "Star Wars",
          "quotes": [
            "These are not the droids you're looking for",
            "I have a bad feeling about this",
            "Do. Or do not. There is no try."
          ]
        }
      }
    ]
  }
}
```

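The passages are now reranked, with the highest-scoring passage first. In this example, lexical retrieval based on term matching initially selected The Avengers (_id: 3) and Star Wars (_id: 2), because one passage contains "lightning" and the other contains "feeling". That first stage considers only surface overlap and keywords.

IBM watsonx.ai rerank then re-evaluates the results in context and ranks The Avengers higher, because "lightning" relates directly to the query "feeling lightning". This shows how reranking, by prioritizing semantics over simple keyword matching, produces more relevant search results.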
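To check the ordering programmatically, you can read the reranked hits straight out of the search response. The sketch below uses a trimmed copy of the response shown earlier (only the fields it needs); `ranked_titles` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Tuple


def ranked_titles(response: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (movie_title, _score) pairs in the order Elasticsearch ranked them."""
    return [
        (hit["_source"]["movie_title"], hit["_score"])
        for hit in response["hits"]["hits"]
    ]


# Trimmed copy of the search response from this walkthrough.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "3",
                "_score": 0.003072956,
                "_source": {"movie_title": "The Avengers"},
            },
            {
                "_id": "2",
                "_score": 0.000024473073,
                "_source": {"movie_title": "Star Wars"},
            },
        ]
    }
}

results = ranked_titles(sample_response)
for position, (title, score) in enumerate(results, start=1):
    print(position, title, score)
```

Note how far apart the two scores are: a gap like this is what makes a score-based cutoff practical when deciding which passages to pass on to a generative model.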

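Try semantic reranking with watsonx and Elasticsearch today

With the integration of IBM watsonx™ rerank models, the Elasticsearch open Inference API continues to give developers more powerful and flexible capabilities for building AI-driven search experiences. Explore more of the supported encoder foundation models available in watsonx.ai.

You can also start using the new conversational search feature of IBM watsonx Assistant together with IBM watsonx Discovery today. Visit IBM watsonx Discovery to learn about this new capability, which is built on Elasticsearch. You can follow these steps to complete the setup and integration with IBM watsonx Assistant.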

Original article: Elasticsearch open inference API adds support for IBM watsonx.ai rerank models - Elasticsearch Labs