Elasticsearch: Intelligent search - AI Builder, workflow, and skills


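Imagine how we would search for a request like this one:

`Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.`

Requests like this are common in e-commerce site search, and they are a very practical way to search. There are several ways to implement this kind of search: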

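  1. Implement tools in Python and let the LLM call them. We call the LLM to extract the search parameters, and for precise results we then run the search through a search template. For details, see the article "Unifying the Elastic vector database with LLM functions for intelligent queries".
  2. Write a custom MCP server for this search in Python and call it from a client. See the article "Elasticsearch: MCP for intelligent search".
  3. Use AI Builder and a workflow. Inside the workflow we implement something similar to a DSL search template to get precise results. For detailed instructions, see the article "Elasticsearch: Intelligent search - AI Builder and Workflow".

Of these three approaches, the third is the simplest to implement because it needs no separate programming: we only create an agent and a workflow in Kibana, and maintenance is straightforward. Is there an even more convenient way? Yes. In the upcoming Elastic Stack 9.4 (currently available in Elastic Serverless cloud), we can use a skill to simplify things further.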

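Step 1: Ingest the data

We write the documents to Elasticsearch as described in the article "Elasticsearch: MCP for intelligent search".

Step 2: Create the geocoding workflow and its tool

Our implementation needs an exact latitude and longitude for the location in the request so that it can run the search. We can follow the earlier article "Elasticsearch: Create a geocoding workflow and use it in an agent for location search".

Note: The interface shown above is the Elastic Serverless Cloud UI. It differs from the UI of a locally deployed Elastic Stack.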

Step 3: Create the DSL search template

At present, an agent only lets us create a limited set of tool types, and DSL is not among them. So if we want a search that behaves like a DSL search template, how can we implement it?

The answer is to design a skill for the agent. A skill gives the agent an additional capability. We create an elasticsearch_dsl_template skill with the following settings:

  • ID: elasticsearch_dsl_template
  • Name: Elasticsearch DSL search template
  • Description: Construct Elasticsearch using DSL search template.
  • Instructions:
```
The details for implementing a search template can be found at https://www.elastic.co/docs/solutions/search/search-templates.

For our search template, we need to use the following search template to do the DSL search:

{
    "_source": false,
    "size": 5,
    "fields": ["title", "tax", "maintenance_fee", "bathrooms", "bedrooms", "square_footage", "home_price", "property_features"],
    "retriever": {
        "standard": {
            "query": {
                "semantic": {
                    "field": "body_content_semantic_text",
                    "query": "{{query}}"
                }
            },
            "filter": {
                "bool": {
                    "must": [
                        {{#distance}}{
                            "geo_distance": {
                                "distance": "{{distance}}",
                                "location": {
                                    "lat": {{latitude}},
                                    "lon": {{longitude}}
                                }
                            }
                        }{{/distance}}
                        {{#bedrooms}}{{#distance}},{{/distance}}{
                            "range": {
                                "bedrooms": {
                                    "gte": {{bedrooms}}
                                }
                            }
                        }{{/bedrooms}}
                        {{#bathrooms}}{{#distance}}{{^bedrooms}},{{/bedrooms}}{{/distance}}{{#bedrooms}},{{/bedrooms}}{
                            "range": {
                                "bathrooms": {
                                    "gte": {{bathrooms}}
                                }
                            }
                        }{{/bathrooms}}
                        {{#tax}}{{#distance}}{{^bedrooms}}{{^bathrooms}},{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}},{{/bathrooms}}{{/bedrooms}}{{#bathrooms}},{{/bathrooms}}{
                            "range": {
                                "tax": {
                                    "lte": {{tax}}
                                }
                            }
                        }{{/tax}}
                        {{#maintenance}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{#tax}},{{/tax}}{
                            "range": {
                                "maintenance_fee": {
                                    "lte": {{maintenance}}
                                }
                            }
                        }{{/maintenance}}
                        {{#square_footage_max}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{#tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{#maintenance}},{{/maintenance}}{
                            "range": {
                                "square_footage": {
                                    "gte": {{#square_footage_min}}{{square_footage_min}}{{/square_footage_min}}{{^square_footage_min}}0{{/square_footage_min}},
                                    "lte": {{square_footage_max}}
                                }
                            }
                        }{{/square_footage_max}}
                        {{#home_price_max}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{#tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{#maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{#square_footage_max}},{{/square_footage_max}}{
                            "range": {
                                "home_price": {
                                    "gte": {{#home_price_min}}{{home_price_min}}{{/home_price_min}}{{^home_price_min}}0{{/home_price_min}},
                                    "lte": {{home_price_max}}
                                }
                            }
                        }{{/home_price_max}}
                        {{#feature}},{
                            "bool": {
                                "should": [
                                    {
                                        "match": {
                                            "property_features": {
                                                "query": "{{feature}}",
                                                "operator": "or"
                                            }
                                        }
                                    }
                                ],
                                "minimum_should_match": 1
                            }
                        }{{/feature}}
                    ]
                }
            }
        }
    }
}

We need to use the "properties" index to do the search.
```

We save the skill above.
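To make the template's conditional logic easier to follow, here is a minimal Python sketch of the same idea: each filter is added only when its parameter is present, so no comma bookkeeping is needed. The function name `build_property_query` and the rule "present means not None" are assumptions made for illustration. They are not part of the skill or of any Elastic API, and Mustache's own truthiness rules may differ for values such as 0.

```python
from typing import Any

RESULT_FIELDS = ["title", "tax", "maintenance_fee", "bathrooms", "bedrooms",
                 "square_footage", "home_price", "property_features"]


def build_property_query(params: dict[str, Any]) -> dict[str, Any]:
    """Build a search body shaped like the one the template renders (hypothetical helper)."""
    filters: list[dict[str, Any]] = []

    # The geo_distance filter is only added when a distance is supplied.
    if params.get("distance"):
        filters.append({
            "geo_distance": {
                "distance": params["distance"],
                "location": {"lat": params["latitude"], "lon": params["longitude"]},
            }
        })

    # Lower-bound filters: at least N bedrooms / bathrooms.
    for field in ("bedrooms", "bathrooms"):
        if params.get(field) is not None:
            filters.append({"range": {field: {"gte": params[field]}}})

    # Upper-bound filters: the parameter name and the index field can differ.
    for param, field in (("tax", "tax"), ("maintenance", "maintenance_fee")):
        if params.get(param) is not None:
            filters.append({"range": {field: {"lte": params[param]}}})

    # Min/max ranges: a missing minimum defaults to 0, as in the template.
    for field in ("square_footage", "home_price"):
        upper = params.get(f"{field}_max")
        if upper is not None:
            lower = params.get(f"{field}_min") or 0
            filters.append({"range": {field: {"gte": lower, "lte": upper}}})

    # Free-text match on the property features.
    if params.get("feature"):
        filters.append({
            "bool": {
                "should": [{"match": {"property_features": {
                    "query": params["feature"], "operator": "or"}}}],
                "minimum_should_match": 1,
            }
        })

    return {
        "_source": False,
        "size": 5,
        "fields": RESULT_FIELDS,
        "retriever": {
            "standard": {
                "query": {"semantic": {"field": "body_content_semantic_text",
                                       "query": params["query"]}},
                "filter": {"bool": {"must": filters}},
            }
        },
    }
```

Building the filter list in code avoids the nested `{{#...}}`/`{{^...}}` sections that the template needs only to place commas correctly.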

Step 4: Create the Property search agent

We create the Property search agent with the following settings:

  • Agent ID: property_search
  • Custom Instructions:

```
This agent is used to search for properties:

# Step 1:
You are an information extraction assistant.
Extract real estate search parameters from the user query.

Parameter descriptions:
- bathrooms: Number of bathrooms
- bedrooms: Number of bedrooms
- tax: Real estate tax amount
- maintenance: Maintenance fee amount
- square_footage_min: Minimum property square footage. If only a max square footage is provided, set this to 0. Otherwise set this to the minimum square footage specified by the user.
- square_footage_max: Maximum property square footage
- home_price_min: Minimum home price. If only a max home price is provided, set this to 0. Otherwise set this to the minimum home price specified by the user.
- home_price_max: Maximum home price
- property_features: Home features such as AC, pool, updated kitchens, etc should be listed as a single string.
- location: City, state, or full address if present.

Rules:
- Only include parameters explicitly mentioned.
- property_features must be a single space-separated string.
- Return ONLY a JSON object (not a string, no quotes, no extra text, no explanations).
- Do not include explanations.

Example JSON:
{
  "query": "Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.",
  "bathrooms": 2,
  "bedrooms": 2,
  "home_price_min": 0,
  "home_price_max": 300000,
  "property_features": "central air tile floors",
  "location": "Miami, Florida"
}

# Step 2:
Based on the last extracted location, please use the get_coordinate_by_location tool to get the location, and finally get its coordinate info. The final data format is like:

Example JSON:
{
  "query": "Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.",
  "bathrooms": "2",
  "bedrooms": "2",
  "home_price_max": "300000",
  "property_features": "central air, tile floors",
  "longitude": -80.1917902,
  "latitude": 25.7616798,
  "distance_meters": 16093.4
}

# Step 3:
Use the above constructed JSON format, and do a DSL template search. Please print out the search template used for search, and then print out the top **4 results** for viewing.
```

  • Display name: Property search
  • Display description: Search for property

Next, add the geocoding tool created in step 2 to the agent, and then add the elasticsearch_dsl_template skill to it.

Our agent is now ready.
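The agent instructions describe a JSON object whose numeric values may arrive either as numbers or as strings (the Step 2 example quotes "2" and "300000"), and a distance that starts out in miles. The sketch below shows one way such output could be normalized before use. The names `normalize_params`, `miles_to_meters`, and `NUMERIC_FIELDS` are hypothetical and exist only for this illustration; the agent itself does this work through the LLM, not through this code.

```python
import json
from typing import Any

MILES_TO_METERS = 1609.344  # one international mile in meters

# Fields from the agent instructions that should hold numbers.
NUMERIC_FIELDS = (
    "bathrooms", "bedrooms", "tax", "maintenance",
    "square_footage_min", "square_footage_max",
    "home_price_min", "home_price_max",
    "longitude", "latitude", "distance_meters",
)


def miles_to_meters(miles: float) -> float:
    """Convert a distance in miles to meters (10 miles is about 16093.44 m)."""
    return miles * MILES_TO_METERS


def normalize_params(raw_json: str) -> dict[str, Any]:
    """Parse the extracted JSON and coerce numeric fields to int or float."""
    data = json.loads(raw_json)
    cleaned: dict[str, Any] = {}
    for key, value in data.items():
        if key in NUMERIC_FIELDS and value is not None:
            number = float(value)  # accepts 2, "2", or "300000"
            cleaned[key] = int(number) if number.is_integer() else number
        else:
            cleaned[key] = value  # strings such as query or property_features
    return cleaned
```

With this helper, `"bedrooms": "2"` and `"bedrooms": 2` both end up as the integer 2, so range filters receive numbers rather than strings.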

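Test

We test with the following search:

`Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.`

The agent returns the results we expect, the same as in our earlier searches. Let's look at its reasoning process:

Clearly, it used the skill we provided: the reasoning trace shows `Calling tool filestore.read`, which is where it reads our skill. More interestingly, it understood our intent and generated the corresponding ES|QL query: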

```
FROM properties
| WHERE ST_DISTANCE(location, TO_GEOPOINT("POINT(-80.1917902 25.7616798)")) <= 16093.44
  AND bedrooms >= 2
  AND bathrooms >= 2
  AND home_price >= 0
  AND home_price <= 300000
  AND MATCH(property_features, "central air tile floors")
| KEEP title, tax, maintenance_fee, bathrooms, bedrooms, square_footage, home_price, property_features
| LIMIT 4
```

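This is the kind of query we wanted: like the DSL template, it filters only on the fields present in the request, even though it is an ES|QL query rather than a DSL template search.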
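As a rough illustration of "filter only on the fields present", the following Python sketch assembles a query of the same shape from a parameter dictionary. It is not how the agent produces its ES|QL. The helpers `build_esql` and `quote_esql_string` are hypothetical, and the parameter names are borrowed from the agent instructions. Values are interpolated as text, so the numeric parameters are assumed to have been validated already.

```python
KEEP_FIELDS = ["title", "tax", "maintenance_fee", "bathrooms", "bedrooms",
               "square_footage", "home_price", "property_features"]


def quote_esql_string(text: str) -> str:
    """Wrap text in double quotes, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_esql(params: dict, index: str = "properties", limit: int = 4) -> str:
    """Assemble an ES|QL query containing only the conditions that apply."""
    conditions = []

    # Distance filter around the geocoded point (WKT order is longitude latitude).
    if params.get("distance_meters") is not None:
        point = f'POINT({params["longitude"]} {params["latitude"]})'
        conditions.append(
            f"ST_DISTANCE(location, TO_GEOPOINT({quote_esql_string(point)}))"
            f" <= {params['distance_meters']}"
        )

    # Minimum room counts.
    for field in ("bedrooms", "bathrooms"):
        if params.get(field) is not None:
            conditions.append(f"{field} >= {params[field]}")

    # Price range: a missing minimum defaults to 0.
    if params.get("home_price_max") is not None:
        conditions.append(f"home_price >= {params.get('home_price_min') or 0}")
        conditions.append(f"home_price <= {params['home_price_max']}")

    # Full-text match on the features string.
    if params.get("property_features"):
        features = quote_esql_string(params["property_features"])
        conditions.append(f"MATCH(property_features, {features})")

    lines = [f"FROM {index}"]
    if conditions:
        lines.append("| WHERE " + "\n  AND ".join(conditions))
    lines.append("| KEEP " + ", ".join(KEEP_FIELDS))
    lines.append(f"| LIMIT {limit}")
    return "\n".join(lines)
```

Called with the Miami parameters (2 bedrooms, 2 bathrooms, a maximum price of 300000, and a 16093.44 meter radius), this yields a query with the same conditions as the one the agent generated.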

Here is another example:

`Find a home within 10 miles of DeBary, Florida with 5 bedrooms, at least 2 bathrooms, central air, and a garage, with a budget up to $600,000.`

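Conclusion

Skills are a very important feature of the upcoming Elastic Stack 9.4. They make designing agents much easier. In many designs we can build everything we need in Kibana without writing any code at all.

Happy learning!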