Elasticsearch: Intelligent search - AI builder and skills


Imagine how we would run a search for a question like this:

`Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.`

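Questions like this come up in search on many e-commerce sites, and this is a very practical way to search. There are several ways to implement it:

  1. Implement the tools in Python and let the LLM call them. We call the LLM to extract the search parameters, and for precise results we then run the query through a search template. For details, see the article "Unifying the Elastic vector database and LLM functions for intelligent queries".
  2. Create a custom MCP server for this search in Python and call it from a client. See the article "Elasticsearch: MCP for intelligent search".
  3. Use AI Builder and Workflow. The workflow runs a templated search similar to a DSL search template, which gives precise results. For detailed instructions, see the article "Elasticsearch: Intelligent search - AI Builder and Workflow".
  4. Use AI Builder, Workflow, and Skills together. A geocoding workflow resolves the geographic location. For the full implementation, see the article "Elasticsearch: Intelligent search - AI builder, workflow and skills".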

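Of the first three approaches, the third is the simplest to implement because it needs no separate programming. We only create an agent and a workflow in Kibana, and maintenance is simple and direct. Is there an even more convenient way? Yes. In the upcoming Elastic Stack 9.4 (currently available in Elastic Serverless cloud), we can use skills to simplify things further.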

The fourth approach still uses a workflow for geocoding. Can we drop that step and do the geocoding inside the skill? Yes. The rest of this article shows how.

Step 1: Ingest the data

We need to write the documents into Elasticsearch as described in the article "Elasticsearch: MCP for intelligent search".
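The search template used later in this article depends on two field types: the `geo_distance` filter needs a `geo_point` field, and the `semantic` query needs a `semantic_text` field. As a rough sketch only, the index mapping could look like the Python dictionary below. The field names are taken from the search template; the numeric and text types are assumptions, and the authoritative mapping is the one in the referenced article.

```python
# Hypothetical mapping for the "properties" index.
# Field names come from the search template; most types are assumed.
PROPERTIES_MAPPING = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},                       # assumed
            "property_features": {"type": "text"},           # assumed
            "bedrooms": {"type": "float"},                   # assumed
            "bathrooms": {"type": "float"},                  # assumed
            "square_footage": {"type": "float"},             # assumed
            "home_price": {"type": "float"},                 # assumed
            "tax": {"type": "float"},                        # assumed
            "maintenance_fee": {"type": "float"},            # assumed
            # geo_distance filters only work on geo_point fields
            "location": {"type": "geo_point"},
            # the semantic query only works on semantic_text fields
            "body_content_semantic_text": {"type": "semantic_text"},
        }
    }
}

fields = PROPERTIES_MAPPING["mappings"]["properties"]
for name, spec in fields.items():
    print(f"{name}: {spec['type']}")
```

If the mapping in your cluster differs for `location` or `body_content_semantic_text`, the geo filter and the semantic query in the template will fail, so those two are worth checking first.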

Step 2: Create property_search_skills

Configuration

  • ID: property_search_skills
  • Name: Property search skills
  • Description:
`

- Invoke a Python script to do geocoding
- Construct the Elasticsearch query using the DSL search template

`
  • Instructions:
`

- Used to do geocoding based on the queries
- How to construct the DSL search template to search the "properties" index

`
  • Files
    • File name: property_search_skill

    • Folder path

      `./`
      
    • Content

      ``
      
      ---
      name: Property Search Skills
      description: Skills related to property search functionality, including geocoding and location-based queries.
      ---
      
      # Property Search Skills
      This document outlines the skills and tools used for implementing property search functionality, particularly focusing on geocoding and location-based queries.
      
      ## Geocoding Tool
      The geocoding tool converts addresses into geographic coordinates (latitude and longitude) using the Google Maps Geocoding API, which requires an API key for authentication. This allows for location-based searches and queries in property search applications.
      
      ### Environment Variables
      To use the geocoding tool, the following environment variable needs to be set:
      - `GOOGLE_MAPS_API_KEY`: Your Google Maps API key for accessing the Geocoding API.
      
      **geocode_tool.py is the script that implements the geocoding functionality. It takes an address as input and returns the corresponding latitude and longitude coordinates. The results can be stored in Elasticsearch for further querying and analysis in property search applications.**
      
      ```python
      import os
      import sys
      import json
      import argparse
      import requests
      
      GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"
      
      
      def geocode(address: str, api_key: str) -> dict:
          """Geocode an address with the Google Maps Geocoding API."""
          params = {
              "address": address,
              "key": api_key
          }
      
          resp = requests.get(GEOCODE_URL, params=params, timeout=10)
          resp.raise_for_status()
          data = resp.json()
      
          # The API reports failures such as ZERO_RESULTS in the "status" field
          if data.get("status") != "OK":
              return {
                  "success": False,
                  "error": data.get("status"),
                  "raw": data
              }
      
          # Use the first (best) match
          result = data["results"][0]
          location = result["geometry"]["location"]
      
          return {
              "success": True,
              "formatted_address": result["formatted_address"],
              "location": {
                  "lat": location["lat"],
                  "lon": location["lng"]  # Google -> Elasticsearch format
              },
              "place_id": result.get("place_id"),
              "types": result.get("types")
          }
      
      
      def main():
          parser = argparse.ArgumentParser(description="Geocode an address")
          parser.add_argument("--address", help="Address to geocode")
      
          args = parser.parse_args()
      
          try:
              # Determine input source (stdin JSON takes priority when present).
              # An empty stdin, common when run as a subprocess, falls back to --address.
              address = args.address
              if not sys.stdin.isatty():
                  raw = sys.stdin.read().strip()
                  if raw:
                      address = json.loads(raw).get("address") or address
      
              if not address:
                  raise ValueError("Missing address (provide via --address or stdin JSON)")
      
              # Prefer the GOOGLE_MAPS_API_KEY environment variable; otherwise
              # replace the placeholder below with your own key.
              api_key = os.environ.get("GOOGLE_MAPS_API_KEY") or "<Your Google API Key>"
      
              result = geocode(address, api_key)
      
              print(json.dumps(result))
              sys.exit(0 if result["success"] else 1)
      
          except Exception as e:
              print(json.dumps({
                  "success": False,
                  "error": str(e)
              }))
              sys.exit(1)
      
      
      if __name__ == "__main__":
          main()
      ```
      
      ### Usage
      To run the geocoding tool, use the following command in your terminal:
      ```bash
      python geocode_tool.py --address "1600 Amphitheatre Parkway, Mountain View, CA"
      ```
      
      This command geocodes the provided address and returns the corresponding latitude and longitude coordinates. The results can then be stored in Elasticsearch for further querying and analysis in property search applications.
      
      # DSL search templates
      In addition to geocoding, property search applications often require the ability to perform complex queries on the indexed property data. This can be achieved using Elasticsearch's Query DSL (Domain Specific Language) for searching and filtering data based on various criteria such as location, price range, property type, etc.
      
      The details for implementing a search template can be found at https://www.elastic.co/docs/solutions/search/search-templates.
      
      For our search, we need to use the following search template to do the DSL search:
      
      {
          "_source": false,
          "size": 5,
          "fields": ["title", "tax", "maintenance_fee", "bathrooms", "bedrooms", "square_footage", "home_price", "property_features"],
          "retriever": {
              "standard": {
                  "query": {
                      "semantic": {
                          "field": "body_content_semantic_text",
                          "query": "{{query}}"
                      }
                  },
                  "filter": {
                      "bool": {
                          "must": [
                              {{#distance}}{
                                  "geo_distance": {
                                      "distance": "{{distance}}",
                                      "location": {
                                          "lat": {{latitude}},
                                          "lon": {{longitude}}
                                      }
                                  }
                              }{{/distance}}
                              {{#bedrooms}}{{#distance}},{{/distance}}{
                                  "range": {
                                      "bedrooms": {
                                          "gte": {{bedrooms}}
                                      }
                                  }
                              }{{/bedrooms}}
                              {{#bathrooms}}{{#distance}}{{^bedrooms}},{{/bedrooms}}{{/distance}}{{#bedrooms}},{{/bedrooms}}{
                                  "range": {
                                      "bathrooms": {
                                          "gte": {{bathrooms}}
                                      }
                                  }
                              }{{/bathrooms}}
                              {{#tax}}{{#distance}}{{^bedrooms}}{{^bathrooms}},{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}},{{/bathrooms}}{{/bedrooms}}{{#bathrooms}},{{/bathrooms}}{
                                  "range": {
                                      "tax": {
                                          "lte": {{tax}}
                                      }
                                  }
                              }{{/tax}}
                              {{#maintenance}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}},{{/tax}}{{/bathrooms}}{{#tax}},{{/tax}}{
                                  "range": {
                                      "maintenance_fee": {
                                          "lte": {{maintenance}}
                                      }
                                  }
                              }{{/maintenance}}
                              {{#square_footage_max}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{/bathrooms}}{{#tax}}{{^maintenance}},{{/maintenance}}{{/tax}}{{#maintenance}},{{/maintenance}}{
                                  "range": {
                                      "square_footage": {
                                          "gte": {{#square_footage_min}}{{square_footage_min}}{{/square_footage_min}}{{^square_footage_min}}0{{/square_footage_min}},
                                          "lte": {{square_footage_max}}
                                      }
                                  }
                              }{{/square_footage_max}}
                              {{#home_price_max}}{{#distance}}{{^bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{/distance}}{{#bedrooms}}{{^bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{/bedrooms}}{{#bathrooms}}{{^tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{/bathrooms}}{{#tax}}{{^maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{/tax}}{{#maintenance}}{{^square_footage_max}},{{/square_footage_max}}{{/maintenance}}{{#square_footage_max}},{{/square_footage_max}}{
                                  "range": {
                                      "home_price": {
                                          "gte": {{#home_price_min}}{{home_price_min}}{{/home_price_min}}{{^home_price_min}}0{{/home_price_min}},
                                          "lte": {{home_price_max}}
                                      }
                                  }
                              }{{/home_price_max}}
                              {{#feature}},{
                                  "bool": {
                                      "should": [
                                          {
                                              "match": {
                                                  "property_features": {
                                                      "query": "{{feature}}",
                                                      "operator": "or"
                                                  }
                                              }
                                          }
                                      ],
                                      "minimum_should_match": 1
                                  }
                              }{{/feature}}
                          ]
                      }
                  }
              }
          }
      }
      
      We need to use the "properties" index to do the search. **Please note the range searches for bedrooms and bathrooms**: they must match values greater than or equal to the requested number. For the price, we need matches less than or equal to the requested maximum.
      
      ``
      

Note: in the code above, set api_key to your own Google Maps API key.
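Before spending API quota, the response handling can be exercised offline against a canned payload. The sketch below mirrors the parsing logic in `geocode_tool.py`; the function name `parse_geocode_response` and the sample payload, including its coordinates, are made up for illustration.

```python
def parse_geocode_response(data: dict) -> dict:
    """Turn a Geocoding API style payload into an Elasticsearch-friendly result."""
    # Treat a non-OK status or an empty result list as a failure.
    if data.get("status") != "OK" or not data.get("results"):
        return {"success": False, "error": data.get("status")}

    result = data["results"][0]
    loc = result["geometry"]["location"]
    return {
        "success": True,
        "formatted_address": result.get("formatted_address"),
        # Google returns "lng"; Elasticsearch geo_point objects use "lon".
        "location": {"lat": loc["lat"], "lon": loc["lng"]},
    }


# Hypothetical payload with illustrative coordinates, not a recorded API response.
SAMPLE = {
    "status": "OK",
    "results": [{
        "formatted_address": "Miami, FL, USA",
        "geometry": {"location": {"lat": 25.76, "lng": -80.19}},
    }],
}

ok = parse_geocode_response(SAMPLE)
failed = parse_geocode_response({"status": "ZERO_RESULTS", "results": []})
print(ok)
print(failed)
```

The `lng` to `lon` rename is the only transformation that matters for the search: the `geo_distance` filter in the template reads `lat` and `lon`.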

We save the skill configuration above.
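Most of the Mustache in the search template exists only to place commas between optional clauses. To see what the template is meant to render, here is a minimal Python sketch that builds the equivalent `bool` filter from a parameter dictionary. `build_filter` is a hypothetical helper, not part of Elasticsearch or of the skill; the parameter names follow the template, and the coordinates in the usage example are illustrative.

```python
def build_filter(p: dict) -> dict:
    """Build the bool filter that the search template renders for parameters p."""
    must = []

    # Geo filter: only when a distance was given (needs latitude/longitude too).
    if p.get("distance"):
        must.append({"geo_distance": {
            "distance": p["distance"],
            "location": {"lat": p["latitude"], "lon": p["longitude"]},
        }})

    # "At least" filters.
    for field in ("bedrooms", "bathrooms"):
        if p.get(field) is not None:
            must.append({"range": {field: {"gte": p[field]}}})

    # "At most" filters; the template parameter and index field names differ.
    for param, field in (("tax", "tax"), ("maintenance", "maintenance_fee")):
        if p.get(param) is not None:
            must.append({"range": {field: {"lte": p[param]}}})

    # Min/max ranges; the minimum defaults to 0 as in the template.
    for field in ("square_footage", "home_price"):
        if p.get(f"{field}_max") is not None:
            must.append({"range": {field: {
                "gte": p.get(f"{field}_min") or 0,
                "lte": p[f"{field}_max"],
            }}})

    # Free-text feature match.
    if p.get("feature"):
        must.append({"bool": {
            "should": [{"match": {"property_features": {
                "query": p["feature"], "operator": "or"}}}],
            "minimum_should_match": 1,
        }})

    return {"bool": {"must": must}}


miami = build_filter({
    "distance": "10mi", "latitude": 25.76, "longitude": -80.19,  # illustrative
    "bedrooms": 2, "bathrooms": 2, "home_price_max": 300000,
    "feature": "central air tile floors",
})
print(len(miami["bool"]["must"]), "clauses")
```

Building the list in code avoids the comma bookkeeping entirely, which is why the Mustache version is so much longer than the logic it expresses.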

Step 3: Create the property_search_skill agent

We create the agent with the following steps:

  

Configuration:

  • ID: property_search_skill

  • Custom Instructions

    `
    
    This agent is used to search for properties:
    
    # Step 1:
    You are an information extraction assistant.
    Extract real estate search parameters from the user query.
    
    Parameter descriptions:
    - bathrooms: Number of bathrooms
    - bedrooms: Number of bedrooms
    - tax: Real estate tax amount
    - maintenance: Maintenance fee amount
    - square_footage_min: Minimum property square footage. If only a max square footage is provided, set this to 0. Otherwise set this to the minimum square footage specified by the user.
    - square_footage_max: Maximum property square footage
    - home_price_min: Minimum home price. If only a max home price is provided, set this to 0. Otherwise set this to the minimum home price specified by the user.
    - home_price_max: Maximum home price
    - property_features: Home features such as AC, pool, updated kitchens, etc. should be listed as a single string.
    - location: City, state, or full address if present.
    
    Rules:
    - Only include parameters explicitly mentioned.
    - property_features must be a single space-separated string.
    - Return ONLY a JSON object (not a string, no quotes, no extra text, no explanations).
    - Do not include explanations.
    
    Example JSON:
    {
      "query": "Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.",
      "bathrooms": 2,
      "bedrooms": 2,
      "home_price_min": 0,
      "home_price_max": 300000,
      "property_features": "central air tile floors",
      "location": "Miami, Florida"
    }
    
    # Step 2:
    - Use the JSON constructed above to do a DSL template search. If you need to convert it to ES|QL queries, follow the DSL template search ranges exactly:
    1. bathrooms is greater than or equal to the extracted value
    2. bedrooms is greater than or equal to the extracted value
    3. home_price is less than or equal to the extracted value (home_price_max)
    
    - Before you do the searches, please DO refer to the requirements specified by the property search skills/property_search_skill.md.
    
    - Print out the search template used for the search, and then print out the top **4 results** for viewing.
    
    `
    
  • Display name: Property search skills

  • Display description: Search for property
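The Step 1 output and the template parameters do not line up one to one: the prompt extracts `property_features` and `location`, while the template expects `feature`, `distance`, `latitude`, and `longitude`. In this setup the agent is left to bridge that gap itself. A minimal sketch of that glue, assuming a hypothetical `to_template_params` helper, a geocoding result in the shape returned by `geocode_tool.py`, and a distance string such as `10mi` derived from the query text, could look like this:

```python
import json


def to_template_params(extracted_json: str, geo: dict, distance: str) -> dict:
    """Map the Step 1 JSON onto the parameter names the search template uses."""
    extracted = json.loads(extracted_json)

    # Copy everything except the two fields that need renaming or geocoding.
    params = {k: v for k, v in extracted.items()
              if k not in ("property_features", "location")}

    # The template calls the feature string "feature".
    if extracted.get("property_features"):
        params["feature"] = extracted["property_features"]

    # Only add the geo filter inputs when geocoding succeeded.
    if geo.get("success"):
        params["distance"] = distance
        params["latitude"] = geo["location"]["lat"]
        params["longitude"] = geo["location"]["lon"]

    return params


step1_output = """{
  "query": "Find a home within 10 miles of Miami, Florida",
  "bathrooms": 2,
  "bedrooms": 2,
  "home_price_min": 0,
  "home_price_max": 300000,
  "property_features": "central air tile floors",
  "location": "Miami, Florida"
}"""

# Made-up geocoding result with illustrative coordinates.
geo_result = {"success": True, "location": {"lat": 25.76, "lon": -80.19}}

params = to_template_params(step1_output, geo_result, distance="10mi")
print(json.dumps(params, indent=2))
```

Parsing with `json.loads` also catches malformed model output early, for example a missing comma between fields.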

Add skills:

We add the skill created above to the agent we just created.

Testing

We test with the same test case as before:

`Find a home within 10 miles of Miami, Florida that has 2 bedrooms, 2 bathrooms, central air, and tile floors, with a budget up to $300,000.`

Here is a second example:

`Find a home within 10 miles of DeBary, Florida with 5 bedrooms, at least 2 bathrooms, central air, and a garage, with a budget up to $600,000.`
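When reviewing the agent's answers, it helps to confirm that each returned listing really satisfies the range rules from Step 2. A small checker, sketched here with a hypothetical `satisfies` function and made-up sample listings (not real search results), makes the rules explicit:

```python
def satisfies(hit: dict, bedrooms=None, bathrooms=None, home_price_max=None) -> bool:
    """Return True if a listing meets the Step 2 range rules."""
    # bedrooms and bathrooms: greater than or equal to the requested value
    if bedrooms is not None and hit["bedrooms"] < bedrooms:
        return False
    if bathrooms is not None and hit["bathrooms"] < bathrooms:
        return False
    # home_price: less than or equal to the requested maximum
    if home_price_max is not None and hit["home_price"] > home_price_max:
        return False
    return True


# Made-up listings for illustration only.
hits = [
    {"title": "A", "bedrooms": 5, "bathrooms": 3, "home_price": 550000},
    {"title": "B", "bedrooms": 4, "bathrooms": 2, "home_price": 400000},
    {"title": "C", "bedrooms": 5, "bathrooms": 2, "home_price": 650000},
]

matching = [h["title"] for h in hits
            if satisfies(h, bedrooms=5, bathrooms=2, home_price_max=600000)]
print(matching)  # ['A']
```

Listing B fails the bedroom rule and listing C exceeds the budget, so only A passes the DeBary query's constraints.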

Conclusion

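In this example we saw how powerful skills are. We could skip both the tedious code and the workflow creation; a skill alone is enough. Skills are loaded only when they are needed, which saves memory.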

Happy learning!