Common Elasticsearch Queries Explained

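Our projects often use Elasticsearch (ES) for queries. This post walks through a few examples of the most common query types: match, term, and bool. All of them run against a products index created with the following mapping.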

PUT /products
{
  "mappings": {
    "properties": {
      "product_id": { "type": "integer" },
      "name": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword" } }
      },
      "category": { "type": "keyword" },
      "price": { "type": "float" }
    }
  }
}
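
Which field paths a query can target follows directly from the mapping. The sketch below lists them for a variant of the products mapping in which name carries a keyword sub-field (an assumption made for illustration). The helper field_paths is hypothetical and is not part of any Elasticsearch client.

```python
def field_paths(properties, prefix=""):
    """Return every field path a query could target, including multi-fields."""
    paths = []
    for name, spec in properties.items():
        path = prefix + name
        paths.append(path)
        # Multi-fields (for example a keyword sub-field of a text field)
        # are declared under "fields".
        for sub in spec.get("fields", {}):
            paths.append(f"{path}.{sub}")
        # Object fields nest their children under "properties".
        if "properties" in spec:
            paths.extend(field_paths(spec["properties"], path + "."))
    return paths


# Assumed variant of the products mapping: name has a keyword sub-field.
mapping = {
    "product_id": {"type": "integer"},
    "name": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
    "category": {"type": "keyword"},
    "price": {"type": "float"},
}

paths = field_paths(mapping)
print(paths)
```

Under this mapping name.keyword is a valid path, while category.keyword is not: category is already a keyword field and is queried by its own name.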

Sample data

[
  {
    "product_id": 1,
    "name": "iPhone 13 Pro",
    "category": "Electronics",
    "price": 1099
  },
  {
    "product_id": 2,
    "name": "Samsung Galaxy",
    "category": "Electronics",
    "price": 899
  },
  {
    "product_id": 3,
    "name": "Nike Running Shoes",
    "category": "Sportswear",
    "price": 99
  },
  {
    "product_id": 4,
    "name": "Sony Headphones",
    "category": "Electronics",
    "price": 199
  },
  {
    "product_id": 5,
    "name": "Canon EOS R5",
    "category": "Photography",
    "price": 3499
  },
  {
    "product_id": 6,
    "name": "Adidas Soccer Ball",
    "category": "Sports",
    "price": 20
  },
  {
    "product_id": 7,
    "name": "Logitech Keyboard",
    "category": "Electronics",
    "price": 79
  },
  {
    "product_id": 8,
    "name": "Dell Laptop",
    "category": "Electronics",
    "price": 1299
  }
]
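
The post does not show how the sample data is loaded. One common way is the Bulk API, whose request body is newline-delimited JSON: an action line followed by the document source, with a trailing newline at the end. The sketch below only builds that body; build_bulk_body is a hypothetical helper, and using product_id as the document _id is an assumption.

```python
import json

products = [
    {"product_id": 1, "name": "iPhone 13 Pro", "category": "Electronics", "price": 1099},
    {"product_id": 2, "name": "Samsung Galaxy", "category": "Electronics", "price": 899},
    {"product_id": 3, "name": "Nike Running Shoes", "category": "Sportswear", "price": 99},
    {"product_id": 4, "name": "Sony Headphones", "category": "Electronics", "price": 199},
    {"product_id": 5, "name": "Canon EOS R5", "category": "Photography", "price": 3499},
    {"product_id": 6, "name": "Adidas Soccer Ball", "category": "Sports", "price": 20},
    {"product_id": 7, "name": "Logitech Keyboard", "category": "Electronics", "price": 79},
    {"product_id": 8, "name": "Dell Laptop", "category": "Electronics", "price": 1299},
]


def build_bulk_body(docs, index="products"):
    """Build a newline-delimited JSON body for the Bulk API."""
    lines = []
    for doc in docs:
        # Action line: index this document under an explicit _id (assumed choice).
        action = {"index": {"_index": index, "_id": str(doc["product_id"])}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(products)
print(body)
```

The resulting text can be sent as the body of POST /_bulk with the Content-Type application/x-ndjson.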

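1. Match query:

A match query performs full-text search: the query text is analyzed (tokenized) before it is matched. How a field's value is processed when it is indexed depends on the field type and on the analyzer configured for it.

For example:

  • A text field is analyzed when it is written to ES: its content is split into tokens, and the resulting tokens depend on the analyzer in use.
  • A keyword field is not analyzed when it is written to ES: the value is stored exactly as written, so querying it amounts to an exact match on the whole value.

For a text field you can also run an exact keyword query through ${field}.keyword, provided the mapping defines a keyword sub-field (dynamic mapping adds one to string fields by default). Do not confuse the analysis applied by match with the analysis applied by a text field:

  • match analysis applies to the query text at search time
  • field analysis applies to the field content when it is written to ES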
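
The difference between index-time and search-time analysis can be sketched with a toy model. Nothing below is Elasticsearch code: analyze is only a rough stand-in for the standard analyzer (split on non-alphanumeric characters, then lowercase), and index_text, index_keyword, match, and term are hypothetical helpers.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: tokenize and lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def keyword_analyze(text):
    """A keyword field keeps the whole value as a single term."""
    return [text]


def index_text(value):
    # Index time, text field: store the analyzed tokens.
    return set(analyze(value))


def index_keyword(value):
    # Index time, keyword field: store the value unchanged.
    return set(keyword_analyze(value))


def match(indexed_terms, query_text, analyzer):
    # Search time: analyze the query text with the field's analyzer,
    # then match if any resulting token was indexed (default OR behavior).
    return any(tok in indexed_terms for tok in analyzer(query_text))


def term(indexed_terms, value):
    # Search time: no analysis, the value must equal an indexed term.
    return value in indexed_terms


name_terms = index_text("iPhone 13 Pro")
category_terms = index_keyword("Electronics")

print(sorted(name_terms))
print(match(name_terms, "iPhone", analyze))
print(term(name_terms, "iPhone"))
print(term(category_terms, "Electronics"))
```

In this model a term lookup for "iPhone" on the text field fails because only the lowercased token "iphone" was indexed, while a match for "iPhone" succeeds because the query text goes through the same analysis.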

Query:

{
  "query": {
    "match": {
      "category": "Electronics"
    }
  }
}

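Returns all products whose category is "Electronics". Because category is mapped as keyword, the query text is not tokenized here, so this match behaves like an exact match on the whole value.

Result: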

[
  {
    "product_id": 1,
    "name": "iPhone 13 Pro",
    "category": "Electronics",
    "price": 1099
  },
  {
    "product_id": 2,
    "name": "Samsung Galaxy",
    "category": "Electronics",
    "price": 899
  },
  {
    "product_id": 4,
    "name": "Sony Headphones",
    "category": "Electronics",
    "price": 199
  },
  {
    "product_id": 7,
    "name": "Logitech Keyboard",
    "category": "Electronics",
    "price": 79
  },
  {
    "product_id": 8,
    "name": "Dell Laptop",
    "category": "Electronics",
    "price": 1299
  }
]

2. Term query:

A term query matches the exact value stored for a field; the query value is not analyzed.

Query:

{
  "query": {
    "term": {
      "category": "Electronics"
    }
  }
}

Result:

Returns all products whose category is exactly "Electronics".

[
  {
    "product_id": 1,
    "name": "iPhone 13 Pro",
    "category": "Electronics",
    "price": 1099
  },
  {
    "product_id": 2,
    "name": "Samsung Galaxy",
    "category": "Electronics",
    "price": 899
  },
  {
    "product_id": 4,
    "name": "Sony Headphones",
    "category": "Electronics",
    "price": 199
  },
  {
    "product_id": 7,
    "name": "Logitech Keyboard",
    "category": "Electronics",
    "price": 79
  },
  {
    "product_id": 8,
    "name": "Dell Laptop",
    "category": "Electronics",
    "price": 1299
  }
]

3. Bool query:

A bool query combines multiple query clauses, including must (must match), should (optional match), and must_not (must not match).

Query:

{
  "query": {
    "bool": {
      "must": [
        { "match": { "category": "Electronics" }},
        { "range": { "price": { "gte": 200 }}}
      ],
      "must_not": [
        { "term": { "name.keyword": "Samsung Galaxy" }}
      ]
    }
  }
}

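Result:

Returns products whose category is "Electronics" and whose price is at least 200, excluding the product named "Samsung Galaxy". The must_not clause relies on name having a keyword sub-field in the mapping.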

[
  {
    "product_id": 1,
    "name": "iPhone 13 Pro",
    "category": "Electronics",
    "price": 1099
  },
  {
    "product_id": 8,
    "name": "Dell Laptop",
    "category": "Electronics",
    "price": 1299
  }
]

Filter query:

Query:

{
  "query": {
    "bool": {
      "filter": [
        { "range": { "price": { "lte": 100 }}}
      ]
    }
  }
}

Result:

[
  {
    "product_id": 3,
    "name": "Nike Running Shoes",
    "category": "Sportswear",
    "price": 99
  },
  {
    "product_id": 6,
    "name": "Adidas Soccer Ball",
    "category": "Sports",
    "price": 20
  },
  {
    "product_id": 7,
    "name": "Logitech Keyboard",
    "category": "Electronics",
    "price": 79
  }
]

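Strictly speaking, filter is a clause of the bool query rather than a separate query type. By combining the filter, must, should, and must_not clauses, a bool query can express very flexible conditions.

The filter, must, should, and must_not clauses of the bool query

The following examples illustrate the filter, must, should, and must_not clauses of the bool query.

Assume the products index contains the following data: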

product_id  name                category     price
1           iPhone 13 Pro       Electronics  1099
2           Samsung Galaxy      Electronics  899
3           Nike Running Shoes  Sportswear   99
4           Sony Headphones     Electronics  199
5           Canon EOS R5        Photography  3499
6           Adidas Soccer Ball  Sports       20
7           Logitech Keyboard   Electronics  79
8           Dell Laptop         Electronics  1299

Filter query examples:

1. Price at most 100 and category Electronics:

{
  "query": {
    "bool": {
      "filter": [
        { "range": { "price": { "lte": 100 }}},
        { "term": { "category": "Electronics" }}
      ]
    }
  }
}

2. Not Electronics, and price between 200 and 1000:

{
  "query": {
    "bool": {
      "filter": [
        { "range": { "price": { "gte": 200, "lte": 1000 }}}
      ],
      "must_not": [
        { "term": { "category": "Electronics" }}
      ]
    }
  }
}

3. Price between 100 and 500, or category Sportswear:

{
  "query": {
    "bool": {
      "should": [
        { "range": { "price": { "gte": 100, "lte": 500 }}},
        { "term": { "category": "Sportswear" }}
      ],
      "minimum_should_match": 1
    }
  }
}

4. Either not Electronics with a price of at most 1000, or category Photography:

{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "filter": [
              { "range": { "price": { "lte": 1000 }}}
            ],
            "must_not": [
              { "term": { "category": "Electronics" }}
            ]
          }
        },
        { "term": { "category": "Photography" }}
      ],
      "minimum_should_match": 1
    }
  }
}
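
Independently of Elasticsearch, the four conditions as worded in the headings can be evaluated against the sample data with plain Python predicates. This is only a sanity check on the intended logic, not a query against an index. The helper ids is hypothetical, and reading example 4 as "(not Electronics and price at most 1000) or Photography" is an assumption about the intended grouping.

```python
# (product_id, name, category, price) rows from the sample data.
products = [
    (1, "iPhone 13 Pro", "Electronics", 1099),
    (2, "Samsung Galaxy", "Electronics", 899),
    (3, "Nike Running Shoes", "Sportswear", 99),
    (4, "Sony Headphones", "Electronics", 199),
    (5, "Canon EOS R5", "Photography", 3499),
    (6, "Adidas Soccer Ball", "Sports", 20),
    (7, "Logitech Keyboard", "Electronics", 79),
    (8, "Dell Laptop", "Electronics", 1299),
]


def ids(predicate):
    """Return the product_ids whose (category, price) satisfy the predicate."""
    return [pid for pid, _name, category, price in products
            if predicate(category, price)]


# 1. Price at most 100 AND Electronics.
ex1 = ids(lambda c, p: p <= 100 and c == "Electronics")
# 2. NOT Electronics AND price between 200 and 1000.
ex2 = ids(lambda c, p: c != "Electronics" and 200 <= p <= 1000)
# 3. Price between 100 and 500 OR Sportswear.
ex3 = ids(lambda c, p: 100 <= p <= 500 or c == "Sportswear")
# 4. (NOT Electronics AND price at most 1000) OR Photography (assumed grouping).
ex4 = ids(lambda c, p: (c != "Electronics" and p <= 1000) or c == "Photography")

print(ex1, ex2, ex3, ex4)
```

With this data, example 2 matches nothing, and example 3 does not pick up the soccer ball because its category is "Sports", not "Sportswear".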

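With these examples in mind, a question remains: when all four clauses are used together, how does each one affect the result, and in what order are they executed?

After some searching, I found no source that specifies an execution order for the four clauses. What each clause does can still be worked out from its role in matching and in scoring:

  • filter and must_not run in filter context: they decide whether a document matches but do not contribute to the score.
  • must and should run in query context: the clauses that match contribute to the score.
  • A document is returned only if it matches every filter clause and every must clause and matches no must_not clause. should clauses are optional when must or filter clauses are present, unless minimum_should_match requires them; with no must or filter clause, at least one should clause must match.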
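
The roles listed above can be condensed into a toy evaluator. This is a simplified model, not how Elasticsearch executes or scores a query: evaluate_bool is a hypothetical function, clauses are plain Python predicates, and each matching scoring clause is assumed to contribute a flat 1.0 instead of a real relevance score.

```python
def evaluate_bool(doc, must=(), filter_=(), should=(), must_not=(),
                  minimum_should_match=None):
    """Return a toy score if doc matches the bool query, else None."""
    # Matching: every must and filter clause has to match,
    # and no must_not clause may match.
    if not all(clause(doc) for clause in must):
        return None
    if not all(clause(doc) for clause in filter_):
        return None
    if any(clause(doc) for clause in must_not):
        return None

    matched_should = sum(1 for clause in should if clause(doc))
    if minimum_should_match is None:
        # Default: 1 if there are should clauses but no must or filter, else 0.
        minimum_should_match = 1 if should and not (must or filter_) else 0
    if matched_should < minimum_should_match:
        return None

    # Scoring: only must and matching should clauses count (flat 1.0 each, assumed).
    return float(len(must) + matched_should)


laptop = {"name": "Dell Laptop", "category": "Electronics", "price": 1299}
iphone = {"name": "iPhone 13 Pro", "category": "Electronics", "price": 1099}
headphones = {"name": "Sony Headphones", "category": "Electronics", "price": 199}
galaxy = {"name": "Samsung Galaxy", "category": "Electronics", "price": 899}

query = dict(
    must=[lambda d: d["category"] == "Electronics"],
    filter_=[lambda d: d["price"] >= 200],
    should=[lambda d: d["price"] >= 1200],
    must_not=[lambda d: d["name"] == "Samsung Galaxy"],
)

for doc in (laptop, iphone, headphones, galaxy):
    print(doc["name"], evaluate_bool(doc, **query))
```

In this model the laptop and the iPhone both match, but only the laptop gets the extra should contribution; the headphones are rejected by filter and the Galaxy by must_not, and neither rejection involves any scoring.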

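Summary

In summary, filter, must, and must_not determine which documents match, while must and should determine the score. Put simply, filter, must, and must_not narrow down the set of matching documents, and must and should then score the documents that remain before the result set is returned. A convenient mental model is that filter, must, and must_not run first, in no particular order relative to each other, and that should is used mainly for scoring. Note that should also restricts matching when minimum_should_match is set or when the query has no must or filter clause. This ordering is only an aid to understanding, not a description of how Elasticsearch actually executes the query.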

References

In which order are my Elasticsearch queries/filters executed? | Elastic Blog

Elasticsearch Bool Query - Filter, Must, Should & Must Not Queries (opster.com)

elasticsearch - Does bool query exist execution order? - Stack Overflow