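In a previous article, I listed many examples of Elasticsearch queries. On the principle that more practice is better, this article brings you additional examples to work through. I hope they deepen your understanding of Elasticsearch.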

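Elasticsearch provides a powerful set of options for querying documents across a variety of use cases, so it is useful to know which query to apply to a particular case. What follows is a hands-on tutorial to help you make use of the most important queries Elasticsearch offers.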

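In this guide, you will learn many popular query examples with detailed explanations. Each query covered here falls into one of two types:

Structured queries: queries used to retrieve structured data such as dates, numbers, PIN codes, and so on.
Full-text queries: queries used to search plain text.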

Note: in this article I use the latest Elastic Stack release, 7.16.3, for the demonstrations. Because the article is long, it is split into two parts.

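Setting up the demo index

Let's start by creating a new index with some sample data so that you can follow along with each search example. Create an index named "employees":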

PUT employees

Define a mapping (schema) for a field contained in the ingested documents, here date_of_birth:

PUT employees/_mapping
{
  "properties": {
    "date_of_birth": {
      "type": "date",
      "format": "dd/MM/yyyy"
    }
  }
}

As shown above, the dates in our documents use the format dd/MM/yyyy.

Now let's ingest some documents into the newly created index using Elasticsearch's _bulk API, as shown in the following example:

POST _bulk
{"index":{"_index":"employees","_id":"1"}}
{"id":1,"name":"Huntlee Dargavel","email":"hdargavel0@japanpost.jp","gender":"male","ip_address":"58.11.89.193","date_of_birth":"11/09/1990","company":"Talane","position":"Research Associate","experience":7,"country":"China","phrase":"Multi-channelled coherent leverage","salary":180025}
{"index":{"_index":"employees","_id":"2"}}
{"id":2,"name":"Othilia Cathel","email":"ocathel1@senate.gov","gender":"female","ip_address":"3.164.153.228","date_of_birth":"22/07/1987","company":"Edgepulse","position":"Structural Engineer","experience":11,"country":"China","phrase":"Grass-roots heuristic help-desk","salary":193530}
{"index":{"_index":"employees","_id":"3"}}
{"id":3,"name":"Winston Waren","email":"wwaren2@4shared.com","gender":"male","ip_address":"202.37.210.94","date_of_birth":"10/11/1985","company":"Yozio","position":"Human Resources Manager","experience":12,"country":"China","phrase":"Versatile object-oriented emulation","salary":50616}
{"index":{"_index":"employees","_id":"4"}}
{"id":4,"name":"Alan Thomas","email":"athomas2@example.com","gender":"male","ip_address":"200.47.210.95","date_of_birth":"11/12/1985","company":"Yamaha","position":"Resources Manager","experience":12,"country":"China","phrase":"Emulation of roots heuristic coherent systems","salary":300000}
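If you generate the bulk request from a script rather than typing it into Kibana, the body has to be newline-delimited JSON: one action line followed by one source line per document, ending with a newline. The following is a minimal Python sketch of that layout; the helper name `build_bulk_body` and the shortened sample documents are illustrative assumptions, not part of any Elasticsearch client.

```python
import json


def build_bulk_body(index_name, docs):
    """Return an NDJSON string: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        # Action line: tells _bulk which index and _id the next line belongs to.
        action = {"index": {"_index": index_name, "_id": str(doc["id"])}}
        lines.append(json.dumps(action))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The _bulk API expects the body to end with a newline character.
    return "\n".join(lines) + "\n"


# Shortened sample documents (hypothetical subset of the fields used above).
sample_docs = [
    {"id": 1, "name": "Huntlee Dargavel", "experience": 7},
    {"id": 2, "name": "Othilia Cathel", "experience": 11},
]
body = build_bulk_body("employees", sample_docs)
print(body)
```

The resulting string can be sent as the request body of `POST _bulk` with any HTTP client.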

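Now that we have an index with documents and an explicit mapping, we are ready to try the example searches. For easier reading, here are the fields of one of the documents: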

GET employees/_doc/1?filter_path=_source



{
  "_source" : {
    "id" : 1,
    "name" : "Huntlee Dargavel",
    "email" : "hdargavel0@japanpost.jp",
    "gender" : "male",
    "ip_address" : "58.11.89.193",
    "date_of_birth" : "11/09/1990",
    "company" : "Talane",
    "position" : "Research Associate",
    "experience" : 7,
    "country" : "China",
    "phrase" : "Multi-channelled coherent leverage",
    "salary" : 180025
  }
}

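The response above shows the fields contained in each document. The whole index holds only 4 documents, but that is enough to understand each search, and a small data set makes the results easier to follow.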

Match query

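The "match" query is one of the most basic and most commonly used queries in Elasticsearch, and it works as a full-text query. We can use it to search text, numbers, or boolean values.

Let's search the documents we ingested earlier for the word "heuristic" in the field named "phrase".

POST employees/_search
{
  "query": {
    "match": {
      "phrase": {
        "query": "heuristic"
      }
    }
  }
}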

Of the 4 documents in our index, only 2 contain the word "heuristic" in the "phrase" field:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.6785375,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.6785375,
        "_source" : {
          "id" : 2,
          "name" : "Othilia Cathel",
          "email" : "ocathel1@senate.gov",
          "gender" : "female",
          "ip_address" : "3.164.153.228",
          "date_of_birth" : "22/07/1987",
          "company" : "Edgepulse",
          "position" : "Structural Engineer",
          "experience" : 11,
          "country" : "China",
          "phrase" : "Grass-roots heuristic help-desk",
          "salary" : 193530
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.62577873,
        "_source" : {
          "id" : 4,
          "name" : "Alan Thomas",
          "email" : "athomas2@example.com",
          "gender" : "male",
          "ip_address" : "200.47.210.95",
          "date_of_birth" : "11/12/1985",
          "company" : "Yamaha",
          "position" : "Resources Manager",
          "experience" : 12,
          "country" : "China",
          "phrase" : "Emulation of roots heuristic coherent systems",
          "salary" : 300000
        }
      }
    ]
  }
}

What happens if we search for multiple words? Using the same query we just ran, let's search for "heuristic roots help":

POST employees/_search
{
  "query": {
    "match": {
      "phrase": {
        "query": "heuristic roots help"
      }
    }
  }
}

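This returns the same documents as before because, by default, Elasticsearch combines the words of a search query with the OR operator. In our case, the query matches any document whose "phrase" field contains "heuristic" or "roots" or "help".

OR is the default behavior of match for multi-word searches, but we can change it with the "operator" parameter passed along with the "match" query. The operator parameter accepts the value "OR" or "AND".
Let's see what happens when we add the operator "AND" to the query we just ran.

POST employees/_search
{
  "query": {
    "match": {
      "phrase": {
        "query": "heuristic roots help",
        "operator": "AND"
      }
    }
  }
}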

The result now contains only one document (id=2), because it is the only document whose "phrase" field contains all three search keywords.
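To make the OR versus AND behavior concrete, here is a simplified pure-Python sketch that imitates it on the four "phrase" values from our index. It is not how Elasticsearch is implemented: the `analyze` function only approximates the standard analyzer (lowercase, split on non-alphanumeric characters), scoring is ignored, and the function names are illustrative.

```python
import re

# The "phrase" values of the four sample documents, keyed by _id.
PHRASES = {
    "1": "Multi-channelled coherent leverage",
    "2": "Grass-roots heuristic help-desk",
    "3": "Versatile object-oriented emulation",
    "4": "Emulation of roots heuristic coherent systems",
}


def analyze(text):
    # Rough stand-in for the standard analyzer: lowercase and split on
    # anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def match(query, operator="OR"):
    """Return the ids whose phrase satisfies the query under the operator."""
    query_terms = set(analyze(query))
    hits = []
    for doc_id, phrase in PHRASES.items():
        found = query_terms & set(analyze(phrase))
        if operator == "AND":
            matched = found == query_terms  # every query term must be present
        else:
            matched = bool(found)           # any single query term is enough
        if matched:
            hits.append(doc_id)
    return hits


or_hits = match("heuristic roots help")
and_hits = match("heuristic roots help", operator="AND")
print(or_hits, and_hits)
```

With OR the sketch selects documents 2 and 4; with AND only document 2 remains, which mirrors the responses described above.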

minimum_should_match

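Going a step further, we can set a threshold for the minimum number of query terms a document must contain. For example, if we set this parameter to 1, the query matches any document with at least 1 matching term.
If we set the "minimum_should_match" parameter to 3, all three words must appear in the document for it to count as a match.

In our case, the following query returns only 1 document (id=2), because it is the only one that meets the condition:

POST employees/_search
{
  "query": {
    "match": {
      "phrase": {
        "query" : "heuristic roots help",
        "minimum_should_match": 3
      }
    }
  }
}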
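The threshold logic can be pictured as counting how many distinct query terms each document contains. The sketch below is a simplified model under the same assumptions as before (an approximate analyzer, no scoring, integer thresholds only); `match_minimum` is a hypothetical helper name.

```python
import re

PHRASES = {
    "1": "Multi-channelled coherent leverage",
    "2": "Grass-roots heuristic help-desk",
    "3": "Versatile object-oriented emulation",
    "4": "Emulation of roots heuristic coherent systems",
}


def analyze(text):
    # Approximation of the standard analyzer: lowercase, split on punctuation.
    return re.findall(r"[a-z0-9]+", text.lower())


def match_minimum(query, minimum_should_match=1):
    """Return ids containing at least `minimum_should_match` query terms."""
    query_terms = set(analyze(query))
    hits = []
    for doc_id, phrase in PHRASES.items():
        matched_terms = query_terms & set(analyze(phrase))
        if len(matched_terms) >= minimum_should_match:
            hits.append(doc_id)
    return hits


at_least_one = match_minimum("heuristic roots help", 1)
at_least_two = match_minimum("heuristic roots help", 2)
all_three = match_minimum("heuristic roots help", 3)
print(at_least_one, at_least_two, all_three)
```

Document 4 contains "roots" and "heuristic" but not "help", so it survives a threshold of 2 and drops out at 3.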

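Multi-Match Query

So far we have been matching against a single field; that is, we searched for keywords in the single field named "phrase". But what if we need to search for keywords across several fields of a document? This is where the multi-match (multi_match) query comes into play.
Let's try an example that searches for the keywords "research help" in the "position" and "phrase" fields of the documents.

POST employees/_search
{
  "query": {
    "multi_match": {
      "query": "research help",
      "fields": [
        "position",
        "phrase"
      ]
    }
  }
}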

This produces the following response:

{
  "took" : 24,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.2613049,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.2613049,
        "_source" : {
          "id" : 1,
          "name" : "Huntlee Dargavel",
          "email" : "hdargavel0@japanpost.jp",
          "gender" : "male",
          "ip_address" : "58.11.89.193",
          "date_of_birth" : "11/09/1990",
          "company" : "Talane",
          "position" : "Research Associate",
          "experience" : 7,
          "country" : "China",
          "phrase" : "Multi-channelled coherent leverage",
          "salary" : 180025
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.1785964,
        "_source" : {
          "id" : 2,
          "name" : "Othilia Cathel",
          "email" : "ocathel1@senate.gov",
          "gender" : "female",
          "ip_address" : "3.164.153.228",
          "date_of_birth" : "22/07/1987",
          "company" : "Edgepulse",
          "position" : "Structural Engineer",
          "experience" : 11,
          "country" : "China",
          "phrase" : "Grass-roots heuristic help-desk",
          "salary" : 193530
        }
      }
    ]
  }
}

The results above show that any document containing "research" or "help" in either the position or the phrase field is returned.

Match Phrase

match_phrase is another commonly used query; as its name suggests, it matches a phrase in a field.
If we need to search for the phrase "roots heuristic coherent" in the "phrase" field of the employees index, we can use the "match_phrase" query:

GET employees/_search
{
  "query": {
    "match_phrase": {
      "phrase": {
        "query": "roots heuristic coherent"
      }
    }
  }
}

This returns documents that contain the exact phrase "roots heuristic coherent", with the words in that order. In our case only one result meets this condition, as shown in the response below:

{
  "took" : 23,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.877336,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.877336,
        "_source" : {
          "id" : 4,
          "name" : "Alan Thomas",
          "email" : "athomas2@example.com",
          "gender" : "male",
          "ip_address" : "200.47.210.95",
          "date_of_birth" : "11/12/1985",
          "company" : "Yamaha",
          "position" : "Resources Manager",
          "experience" : 12,
          "country" : "China",
          "phrase" : "Emulation of roots heuristic coherent systems",
          "salary" : 300000
        }
      }
    ]
  }
}

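The slop parameter

A useful feature of the match_phrase query is the "slop" parameter, which lets us create more flexible searches.
Suppose we search for "roots coherent" with a match_phrase query. No documents come back from the employees index, because match_phrase requires the terms to appear next to each other in exactly the given order, and in our data "heuristic" sits between them.
Now let's use the slop parameter and see what happens:

GET employees/_search
{
  "query": {
    "match_phrase": {
      "phrase": {
        "query": "roots coherent",
        "slop": 1
      }
    }
  }
}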

With slop=1, the query allows the terms to be moved by one position in order to match, so we receive the following response. In it, "roots coherent" matches the document containing "roots heuristic coherent", because the slop parameter allows 1 term to be skipped.

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.7873249,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.7873249,
        "_source" : {
          "id" : 4,
          "name" : "Alan Thomas",
          "email" : "athomas2@example.com",
          "gender" : "male",
          "ip_address" : "200.47.210.95",
          "date_of_birth" : "11/12/1985",
          "company" : "Yamaha",
          "position" : "Resources Manager",
          "experience" : 12,
          "country" : "China",
          "phrase" : "Emulation of roots heuristic coherent systems",
          "salary" : 300000
        }
      }
    ]
  }
}
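The effect of slop can be illustrated by comparing token positions. The Python sketch below is a deliberately simplified model: it only handles terms that appear in query order and treats slop as the total number of extra positions allowed between them. Lucene's real sloppy-phrase matching also allows reordered terms at a higher slop cost, which this sketch does not attempt. The function names are illustrative.

```python
import re


def analyze(text):
    # Approximation of the standard analyzer: lowercase, split on punctuation.
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_matches(text, query, slop=0):
    """True if the query terms occur in order in `text`, with at most
    `slop` extra token positions between them in total."""
    tokens = analyze(text)
    terms = analyze(query)

    def search(term_index, prev_pos, budget):
        if term_index == len(terms):
            return True  # every query term has been placed
        for pos, token in enumerate(tokens):
            if token != terms[term_index]:
                continue
            if term_index == 0:
                gap = 0                   # the first term may start anywhere
            elif pos > prev_pos:
                gap = pos - prev_pos - 1  # tokens skipped since previous term
            else:
                continue                  # in-order matches only
            if gap <= budget and search(term_index + 1, pos, budget - gap):
                return True
        return False

    return search(0, -1, slop)


doc = "Emulation of roots heuristic coherent systems"
exact = phrase_matches(doc, "roots heuristic coherent")
strict = phrase_matches(doc, "roots coherent")
sloppy = phrase_matches(doc, "roots coherent", slop=1)
print(exact, strict, sloppy)
```

In the document, "roots" is at position 2 and "coherent" at position 4, so one position must be skipped: the match fails with slop 0 and succeeds with slop 1.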

Match Phrase Prefix

The match_phrase_prefix query is similar to match_phrase, but it treats the last term of the search text as a prefix and matches any term that begins with it.
First, let's insert a document into the index to better illustrate the match_phrase_prefix query:

PUT employees/_doc/5
{
  "id": 4,
  "name": "Jennifer Lawrence",
  "email": "jlaw@example.com",
  "gender": "female",
  "ip_address": "100.37.110.59",
  "date_of_birth": "17/05/1995",
  "company": "Monsnto",
  "position": "Resources Manager",
  "experience": 10,
  "country": "Germany",
  "phrase": "Emulation of roots heuristic complete systems",
  "salary": 300000
}

Now let's apply match_phrase_prefix:

GET employees/_search
{
  "_source": [
    "phrase"
  ],
  "query": {
    "match_phrase_prefix": {
      "phrase": {
        "query": "roots heuristic co"
      }
    }
  }
}

In the results below, we can see that the documents containing "coherent" and "complete" both match the query. The slop parameter can also be used in the "match_phrase_prefix" query.

{
  "took" : 61,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 3.0871696,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 3.0871696,
        "_source" : {
          "phrase" : "Emulation of roots heuristic coherent systems"
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 3.0871696,
        "_source" : {
          "phrase" : "Emulation of roots heuristic complete systems"
        }
      }
    ]
  }
}

Note: "match_phrase_prefix" expands the last term provided (co in our example) to at most 50 matching terms by default. This limit applies to the number of terms the prefix is expanded to, not to the number of documents returned, and it can be raised or lowered with the "max_expansions" parameter.

GET employees/_search
{
  "_source": [
    "phrase"
  ],
  "query": {
    "match_phrase_prefix": {
      "phrase": {
        "query": "roots heuristic co",
        "max_expansions": 1
      }
    }
  }
}

For example, the query above returns only one document:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.805721,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.805721,
        "_source" : {
          "phrase" : "Emulation of roots heuristic coherent systems"
        }
      }
    ]
  }
}
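One way to picture why max_expansions=1 keeps only the "coherent" document is to model prefix expansion as taking the first N indexed terms that start with the prefix. The sketch below assumes that candidate terms are enumerated in sorted order, which is an assumption made for illustration; `expand_prefix` is a hypothetical helper, not an Elasticsearch API.

```python
import re


def analyze(text):
    # Approximation of the standard analyzer: lowercase, split on punctuation.
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_prefix(prefix, phrases, max_expansions=50):
    """Return up to `max_expansions` distinct terms starting with `prefix`,
    taken in sorted order (an assumption for this sketch)."""
    terms = sorted({token for phrase in phrases for token in analyze(phrase)})
    return [term for term in terms if term.startswith(prefix)][:max_expansions]


phrases = [
    "Multi-channelled coherent leverage",
    "Grass-roots heuristic help-desk",
    "Versatile object-oriented emulation",
    "Emulation of roots heuristic coherent systems",
    "Emulation of roots heuristic complete systems",
]
default_expansion = expand_prefix("co", phrases)
single_expansion = expand_prefix("co", phrases, max_expansions=1)
print(default_expansion, single_expansion)
```

Under that assumption the prefix "co" expands to "coherent" and "complete" by default, and to "coherent" alone when only one expansion is allowed, which is consistent with the single hit shown above.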

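Because of this prefix behavior and how easy it is to set up, the match_phrase_prefix query is often used for autocomplete.
Now let's delete the document with id=5 that we just added: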

DELETE employees/_doc/5

Term-level queries

Term-level queries are used to query structured data, usually by exact value.

Term Query/Terms Query

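This is the simplest term-level query. It searches the given field for an exact match of the search keyword.
For example, if we run a term query for the word "Female" on the "gender" field, it searches for exactly that term, including its letter case.
The following two queries demonstrate this:

GET employees/_search
{
  "query": {
    "term": {
      "gender": {
        "value": "female"
      }
    }
  }
}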

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.87546873,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.87546873,
        "_source" : {
          "id" : 2,
          "name" : "Othilia Cathel",
          "email" : "ocathel1@senate.gov",
          "gender" : "female",
          "ip_address" : "3.164.153.228",
          "date_of_birth" : "22/07/1987",
          "company" : "Edgepulse",
          "position" : "Structural Engineer",
          "experience" : 11,
          "country" : "China",
          "phrase" : "Grass-roots heuristic help-desk",
          "salary" : 193530
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 0.87546873,
        "_source" : {
          "id" : 4,
          "name" : "Jennifer Lawrence",
          "email" : "jlaw@example.com",
          "gender" : "female",
          "ip_address" : "100.37.110.59",
          "date_of_birth" : "17/05/1995",
          "company" : "Monsnto",
          "position" : "Resources Manager",
          "experience" : 10,
          "country" : "Germany",
          "phrase" : "Emulation of roots heuristic complete systems",
          "salary" : 300000
        }
      }
    ]
  }
}

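The search above returns two results. (This response was captured while the temporary document with _id 5 was still in the index; once that document has been deleted, only document 2 matches.) Now suppose we run the following query:

GET employees/_search
{
  "query": {
    "term": {
      "gender": {
        "value": "Female"
      }
    }
  }
}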

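Here we changed female to Female, and the search finds no documents. The only difference between the two queries is the case of the search keyword. Case 1 is all lowercase and matches, because the values of this field are stored in lowercase. Case 2 returns nothing, because the "gender" field has no token with an uppercase "F".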
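The difference comes down to whether the search input is analyzed before it is compared with the indexed tokens. The following Python sketch contrasts the two behaviors in a simplified way; the `analyze` function approximates the standard analyzer, and `term_query` and `match_query` are illustrative names rather than real client calls.

```python
import re


def analyze(text):
    # Approximation of the standard analyzer: lowercase, split on punctuation.
    return re.findall(r"[a-z0-9]+", text.lower())


def term_query(value, field_text):
    # A term query does not analyze its input: the value is compared
    # with the indexed tokens exactly as given.
    return value in analyze(field_text)


def match_query(value, field_text):
    # A match query analyzes its input first, so letter case is normalized.
    indexed_tokens = analyze(field_text)
    return any(token in indexed_tokens for token in analyze(value))


lower_term = term_query("female", "female")
upper_term = term_query("Female", "female")
upper_match = match_query("Female", "female")
print(lower_term, upper_term, upper_match)
```

The term lookup for "Female" fails because only the lowercase token exists, while the analyzed match lookup still succeeds.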

We can also use the terms query to pass several terms to search for in the same field. Let's search the gender field for "female" and "male" with the following terms query:

POST employees/_search
{
  "query": {
    "terms": {
      "gender": [
        "female",
        "male"
      ]
    }
  }
}

The query above returns all 4 documents.

Exists query

Sometimes a field has no indexed value, or the field is missing from a document altogether. An exists query helps identify such documents and analyze the impact.
For example, let's index the following document into the "employees" index:

PUT employees/_doc/5
{
  "id": 5,
  "name": "Michael Bordon",
  "email": "mbordon@example.com",
  "gender": "male",
  "ip_address": "10.47.210.65",
  "date_of_birth": "12/12/1995",
  "position": "Resources Manager",
  "experience": 12,
  "country": null,
  "phrase": "Emulation of roots heuristic coherent systems",
  "salary": 300000
}

This document has no field named "company", and its "country" field has the value null.

Now, if we want to find the documents that have a "company" field, we can use the following exists query:

GET employees/_search
{
  "query": {
    "exists": {
      "field": "company"
    }
  }
}

The query above lists all documents that have a "company" field. It returns 4 documents, all of which contain the company field; the document with id=5 is not returned.
Perhaps more useful is listing all documents that do not have a "company" field. This can also be done with an exists query, as follows:

GET employees/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "exists": {
            "field": "company"
          }
        }
      ]
    }
  }
}

The bool query is explained in detail in a later section. The query above returns only the document with id=5:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 0.0,
        "_source" : {
          "id" : 5,
          "name" : "Michael Bordon",
          "email" : "mbordon@example.com",
          "gender" : "male",
          "ip_address" : "10.47.210.65",
          "date_of_birth" : "12/12/1995",
          "position" : "Resources Manager",
          "experience" : 12,
          "country" : null,
          "phrase" : "Emulation of roots heuristic coherent systems",
          "salary" : 300000
        }
      }
    ]
  }
}
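The two cases mentioned earlier, a missing field and a field set to null, both count as "not existing". A minimal Python sketch of that rule follows, using trimmed versions of two of our documents; the helper name `exists` and the reduced field set are assumptions for illustration.

```python
def exists(doc, field):
    # A field "exists" only if it is present and holds a non-null value;
    # a missing key, an explicit null, and an empty array all count as absent.
    value = doc.get(field)
    return value is not None and value != []


# Trimmed versions of documents 4 and 5 from the examples above.
docs = {
    "4": {"name": "Alan Thomas", "company": "Yamaha", "country": "China"},
    "5": {"name": "Michael Bordon", "country": None},
}

has_company = [doc_id for doc_id, doc in docs.items() if exists(doc, "company")]
no_company = [doc_id for doc_id, doc in docs.items() if not exists(doc, "company")]
has_country = [doc_id for doc_id, doc in docs.items() if exists(doc, "country")]
print(has_company, no_company, has_country)
```

Document 5 fails the check for both fields: "company" is missing and "country" is null.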

To keep the data set consistent, let's delete the document we just inserted with the following request:

DELETE employees/_doc/5

Range queries

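Another of the most commonly used queries in the Elasticsearch world is the range query. A range query lets us fetch documents whose values fall within a specified range. It is a term-level query (meaning it is used to query structured data) and can be applied to numeric fields, date fields, and so on.

Range query on numeric fields

For example, in the data set we created, if we need to find the people whose experience is between 5 and 10 years, we can apply the following range query:

POST employees/_search
{
  "query": {
    "range": {
      "experience": {
        "gte": 5,
        "lte": 10
      }
    }
  }
}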

What are gte, gt, lte, and lt?

gte: greater than or equal to. gte: 5 means greater than or equal to 5, including 5.
gt: greater than. gt: 5 means greater than 5, not including 5.
lte: less than or equal to. lte: 5 means less than or equal to 5, including 5.
lt: less than. lt: 5 means less than 5, not including 5.

The result of the query above is:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "name" : "Huntlee Dargavel",
          "email" : "hdargavel0@japanpost.jp",
          "gender" : "male",
          "ip_address" : "58.11.89.193",
          "date_of_birth" : "11/09/1990",
          "company" : "Talane",
          "position" : "Research Associate",
          "experience" : 7,
          "country" : "China",
          "phrase" : "Multi-channelled coherent leverage",
          "salary" : 180025
        }
      }
    ]
  }
}

In other words, only one document has an experience value between 5 and 10.
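The four range operators map directly onto ordinary comparisons. As a small illustration, the Python sketch below applies them to the experience values of our four documents; `RANGE_OPS` and `in_range` are hypothetical names chosen for this example.

```python
import operator

# Each range keyword corresponds to one comparison operator.
RANGE_OPS = {
    "gt": operator.gt,   # strictly greater than
    "gte": operator.ge,  # greater than or equal to
    "lt": operator.lt,   # strictly less than
    "lte": operator.le,  # less than or equal to
}


def in_range(value, **bounds):
    """True if `value` satisfies every bound, e.g. in_range(7, gte=5, lte=10)."""
    return all(RANGE_OPS[name](value, bound) for name, bound in bounds.items())


# Experience (in years) of the four sample documents, keyed by _id.
experience = {"1": 7, "2": 11, "3": 12, "4": 12}
hits = [doc_id for doc_id, years in experience.items()
        if in_range(years, gte=5, lte=10)]
print(hits)
```

Only document 1, with 7 years of experience, falls inside the inclusive range from 5 to 10.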

Range query on date fields

Range queries can likewise be applied to date fields. If we need to find the people born on or after 1 January 1986, we can issue the following query:

GET employees/_search
{
    "query": {
        "range" : {
            "date_of_birth" : {
                "gte" : "01/01/1986"
            }
        }
    }
}

This fetches only the documents whose date_of_birth is 01/01/1986 or later.
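The comparison works on parsed dates, not on the strings themselves, which matters because dd/MM/yyyy strings do not sort chronologically. A short Python sketch of the same filter follows; the format string `%d/%m/%Y` is Python's counterpart of the mapping's dd/MM/yyyy, and the helper name is illustrative.

```python
from datetime import datetime

DATE_FORMAT = "%d/%m/%Y"  # Python equivalent of the mapping's dd/MM/yyyy

# date_of_birth values of the four sample documents, keyed by _id.
births = {
    "1": "11/09/1990",
    "2": "22/07/1987",
    "3": "10/11/1985",
    "4": "11/12/1985",
}


def born_on_or_after(date_text, lower_bound):
    # Parse both strings into dates before comparing them.
    born = datetime.strptime(date_text, DATE_FORMAT)
    bound = datetime.strptime(lower_bound, DATE_FORMAT)
    return born >= bound


hits = [doc_id for doc_id, date_text in births.items()
        if born_on_or_after(date_text, "01/01/1986")]
print(hits)
```

Documents 1 and 2 (born in 1990 and 1987) pass the filter, while documents 3 and 4 (both born in 1985) do not.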

Ids queries

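The ids query is used relatively rarely, but it is one of the most useful queries, so it earns a place on this list. Sometimes we need to retrieve documents by their IDs. For a single document this can be done with a GET request, as follows: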

GET indexname/_doc/documentId

That is a fine solution when we need only one document per request, but what if we need more?

This is where the ids query comes in handy: it lets us do this in a single request.
In the example below, we fetch the documents with ids 1 and 4 from the employees index in one request.

POST employees/_search
{
    "query": {
        "ids" : {
            "values" : ["1", "4"]
        }
    }
}

The query above returns:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "name" : "Huntlee Dargavel",
          "email" : "hdargavel0@japanpost.jp",
          "gender" : "male",
          "ip_address" : "58.11.89.193",
          "date_of_birth" : "11/09/1990",
          "company" : "Talane",
          "position" : "Research Associate",
          "experience" : 7,
          "country" : "China",
          "phrase" : "Multi-channelled coherent leverage",
          "salary" : 180025
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "id" : 4,
          "name" : "Alan Thomas",
          "email" : "athomas2@example.com",
          "gender" : "male",
          "ip_address" : "200.47.210.95",
          "date_of_birth" : "11/12/1985",
          "company" : "Yamaha",
          "position" : "Resources Manager",
          "experience" : 12,
          "country" : "China",
          "phrase" : "Emulation of roots heuristic coherent systems",
          "salary" : 300000
        }
      }
    ]
  }
}

Prefix Queries

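The prefix query fetches documents in which the specified field contains a term that starts with the given search string.
Suppose we need all documents with a term starting with "al" in the "name" field; we can use the prefix query as follows:

GET employees/_search
{
  "query": {
    "prefix": {
      "name": "al"
    }
  }
}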

This produces the following response:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "id" : 4,
          "name" : "Alan Thomas",
          "email" : "athomas2@example.com",
          "gender" : "male",
          "ip_address" : "200.47.210.95",
          "date_of_birth" : "11/12/1985",
          "company" : "Yamaha",
          "position" : "Resources Manager",
          "experience" : 12,
          "country" : "China",
          "phrase" : "Emulation of roots heuristic coherent systems",
          "salary" : 300000
        }
      }
    ]
  }
}

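Because the prefix query is a term-level query, it passes the search string through as-is, so searching for "al" and "Al" is not the same. If we search for "Al" in the example above, we get 0 results, because the inverted index of the "name" field has no token that starts with "Al". However, if we query the "name.keyword" field with "Al", we get the result above, and in that case querying "al" yields zero hits.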

Wildcard queries

The wildcard query fetches documents that contain a term matching the given wildcard pattern. For example, let's use a wildcard query on the "country" field to search for "c*a", as shown below:

GET employees/_search
{
    "query": {
        "wildcard": {
            "country": {
                "value": "c*a"
            }
        }
    }
}

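The query above fetches all documents whose "country" field contains a term that starts with "c" and ends with "a" (for example China, Canada, Cambodia).

Here the * operator matches zero or more characters.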
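To see how such a pattern behaves against individual terms, the sketch below translates a wildcard pattern into a Python regular expression and tests it against a few lowercase tokens. It is an illustration only: it handles `*` (zero or more characters) and `?` (exactly one character), compares case-sensitively, and the helper names are assumptions.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")             # zero or more characters
        elif char == "?":
            parts.append(".")              # exactly one character
        else:
            parts.append(re.escape(char))  # literal character
    return re.compile("".join(parts))


def wildcard_match(pattern, term):
    # The pattern must cover the whole term, not just a part of it.
    return wildcard_to_regex(pattern).fullmatch(term) is not None


candidates = ["china", "canada", "cambodia", "chile", "ca", "China"]
matches = [term for term in candidates if wildcard_match("c*a", term)]
print(matches)
```

Note that "ca" matches because `*` may stand for zero characters, and "China" does not match because this comparison is case-sensitive; the query above works on our data because the indexed tokens of the "country" field are lowercase.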

Regexp

This is the regular-expression query. It is similar to the wildcard query we saw above, but it accepts a regular expression as input and fetches the matching documents.

GET employees/_search
{
  "query": {
    "regexp": {
      "position": "res[a-z]*"
    }
  }
}

The query above returns the documents containing a word that matches the regular expression res[a-z]*.

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "name" : "Huntlee Dargavel",
          "email" : "hdargavel0@japanpost.jp",
          "gender" : "male",
          "ip_address" : "58.11.89.193",
          "date_of_birth" : "11/09/1990",
          "company" : "Talane",
          "position" : "Research Associate",
          "experience" : 7,
          "country" : "China",
          "phrase" : "Multi-channelled coherent leverage",
          "salary" : 180025
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "id" : 3,
          "name" : "Winston Waren",
          "email" : "wwaren2@4shared.com",
          "gender" : "male",
          "ip_address" : "202.37.210.94",
          "date_of_birth" : "10/11/1985",
          "company" : "Yozio",
          "position" : "Human Resources Manager",
          "experience" : 12,
          "country" : "China",
          "phrase" : "Versatile object-oriented emulation",
          "salary" : 50616
        }
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "id" : 4,
          "name" : "Alan Thomas",
          "email" : "athomas2@example.com",
          "gender" : "male",
          "ip_address" : "200.47.210.95",
          "date_of_birth" : "11/12/1985",
          "company" : "Yamaha",
          "position" : "Resources Manager",
          "experience" : 12,
          "country" : "China",
          "phrase" : "Emulation of roots heuristic coherent systems",
          "salary" : 300000
        }
      }
    ]
  }
}
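The three hits can be reproduced by testing the pattern against each token of the "position" field, with the pattern required to cover a whole token. The Python sketch below does this with the `re` module; Python's regular-expression syntax is not identical to the one Elasticsearch uses, but this particular pattern means the same in both. The helper names are illustrative.

```python
import re

# "position" values of the four sample documents, keyed by _id.
POSITIONS = {
    "1": "Research Associate",
    "2": "Structural Engineer",
    "3": "Human Resources Manager",
    "4": "Resources Manager",
}


def analyze(text):
    # Approximation of the standard analyzer: lowercase, split on punctuation.
    return re.findall(r"[a-z0-9]+", text.lower())


def regexp_hits(pattern):
    """Return the ids with at least one token fully matched by `pattern`."""
    compiled = re.compile(pattern)
    return [
        doc_id
        for doc_id, text in POSITIONS.items()
        if any(compiled.fullmatch(token) for token in analyze(text))
    ]


hits = regexp_hits("res[a-z]*")
print(hits)
```

Documents 1, 3, and 4 contain the tokens "research" or "resources"; document 2 has no token starting with "res".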

Fuzzy

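The fuzzy query returns documents that contain terms similar to the search term, which is especially useful for handling spelling mistakes. Even if we search for "Chnia" instead of "China", a fuzzy query can still return results.
Let's look at an example:

GET employees/_search
{
  "query": {
    "fuzzy": {
      "country": {
        "value": "Chnia",
        "fuzziness": "2"
      }
    }
  }
}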

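Fuzziness here is the maximum edit distance allowed for a match. Parameters such as "max_expansions", which we saw in the "match_phrase_prefix" query, can also be used. See the Elasticsearch documentation on the fuzzy query for more details.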
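Edit distance counts the single-character changes needed to turn one string into another. The sketch below implements one common variant (optimal string alignment, where swapping two adjacent characters counts as a single edit) to show why "Chnia" is within distance 2 of the indexed token "china". It is an illustration of the concept, not Elasticsearch's own implementation, and `edit_distance` is a name chosen for this example.

```python
def edit_distance(a, b):
    """Optimal string alignment distance: insertions, deletions,
    substitutions, and swaps of adjacent characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or exact match)
            )
            # Swap of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


distance = edit_distance("Chnia", "china")
print(distance)
```

Here one edit replaces the uppercase "C" with "c" (the fuzzy query does not analyze its input, and the indexed token is lowercase) and a second edit swaps "n" and "i", giving a distance of 2.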

Fuzziness can also be used with the "match" family of queries. The following example shows fuzziness used in a multi_match query:

POST employees/_search
{
  "query": {
    "multi_match": {
      "query": "heursitic reserch",
      "fields": [
        "phrase",
        "position"
      ],
      "fuzziness": 2
    }
  },
  "size": 10
}

Despite the spelling mistakes in the query, the query above still returns documents matching "heuristic" or "research".

Boosting

When querying, it often helps to get the preferred results first. The simplest way to do this in Elasticsearch is called boosting, and it comes in handy when we query multiple fields. For example, consider the following query:

POST employees/_search
{
    "query": {
        "multi_match" : {
            "query" : "versatile Engineer",
            "fields": ["position^3", "phrase"]
        }
    }
}

This returns a response in which documents that match on the "position" field rank above those that match on the "phrase" field.
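The ranking effect of `position^3` can be sketched with a toy scoring model. In the Python example below, the per-field score is simply the number of matching query terms, which is a stand-in for Elasticsearch's real relevance scoring and is used only to show how a boost multiplies a field's contribution before the best field is chosen. All names and the scoring rule are assumptions for illustration.

```python
import re

# Trimmed versions of documents 2 and 3, the two that match the query.
DOCS = {
    "3": {"position": "Human Resources Manager",
          "phrase": "Versatile object-oriented emulation"},
    "2": {"position": "Structural Engineer",
          "phrase": "Grass-roots heuristic help-desk"},
}


def analyze(text):
    # Approximation of the standard analyzer: lowercase, split on punctuation.
    return re.findall(r"[a-z0-9]+", text.lower())


def boosted_score(doc, query, boosts):
    """Toy score: matching-term count per field, multiplied by the field's
    boost; the best field wins."""
    query_terms = set(analyze(query))
    best = 0.0
    for field, boost in boosts.items():
        overlap = len(query_terms & set(analyze(doc[field])))
        best = max(best, overlap * boost)
    return best


boosts = {"position": 3.0, "phrase": 1.0}  # mirrors ["position^3", "phrase"]
ranked = sorted(
    DOCS,
    key=lambda doc_id: boosted_score(DOCS[doc_id], "versatile Engineer", boosts),
    reverse=True,
)
print(ranked)
```

Document 2 matches "engineer" in the boosted "position" field and therefore ranks above document 3, which matches "versatile" only in "phrase".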

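Sorting

Default sorting

When a search request specifies no sort parameter, Elasticsearch returns documents in descending order of the "_score" field. This "_score" is computed from how well the document matches the query, using Elasticsearch's default scoring method. You can see this behavior in the results of all the examples discussed above.
Scores are not computed only when we use the "filter" context, which lets results be returned faster.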

Sorting by a field

Elasticsearch gives us the option of sorting by a field. Say we need to sort the employees by experience in descending order. We can do so with the following query, which has the sort option enabled:

GET employees/_search
{
  "_source": [
    "name",
    "experience",
    "salary"
  ],
  "sort": [
    {
      "experience": {
        "order": "desc"
      }
    }
  ]
}

The results of the query above are as follows:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "Winston Waren",
          "experience" : 12,
          "salary" : 50616
        },
        "sort" : [
          12
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "Alan Thomas",
          "experience" : 12,
          "salary" : 300000
        },
        "sort" : [
          12
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "Othilia Cathel",
          "experience" : 11,
          "salary" : 193530
        },
        "sort" : [
          11
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "Huntlee Dargavel",
          "experience" : 7,
          "salary" : 180025
        },
        "sort" : [
          7
        ]
      }
    ]
  }
}

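As the response above shows, the results are sorted in descending order of employee experience. Note also that two employees share the same experience level, 12.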
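
As a local illustration of the same ordering, the sketch below sorts the four sample documents in plain Python; it emulates the sort clause and does not call Elasticsearch. Note the one difference that matters for ties: Python's `sorted` is stable, so the two employees with experience 12 keep their input order, whereas in Elasticsearch the relative order of tied documents is not something a single-field sort clause controls.

```python
# The four sample documents, reduced to the fields requested via _source.
employees = [
    {"_id": "1", "name": "Huntlee Dargavel", "experience": 7, "salary": 180025},
    {"_id": "2", "name": "Othilia Cathel", "experience": 11, "salary": 193530},
    {"_id": "3", "name": "Winston Waren", "experience": 12, "salary": 50616},
    {"_id": "4", "name": "Alan Thomas", "experience": 12, "salary": 300000},
]

# Local stand-in for "sort": [{"experience": {"order": "desc"}}].
by_experience = sorted(employees, key=lambda doc: doc["experience"], reverse=True)

for doc in by_experience:
    print(doc["_id"], doc["name"], doc["experience"])
```

If the order among tied documents matters, a second sort key is the usual way to make it explicit.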

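How to sort by multiple fields

In the example above, two employees share the same experience level, 12, and we now want to order them further by salary, in descending order. We can supply multiple fields to sort on, as in the following query: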



GET employees/_search?filter_path=**.hits
{
  "_source": [
    "name",
    "experience",
    "salary"
  ],
  "sort": [
    {
      "experience": {
        "order": "desc"
      }
    },
    {
      "salary": {
        "order": "desc"
      }
    }
  ]
}

We now get the following result:

{
  "hits" : {
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "Alan Thomas",
          "experience" : 12,
          "salary" : 300000
        },
        "sort" : [
          12,
          300000
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "Winston Waren",
          "experience" : 12,
          "salary" : 50616
        },
        "sort" : [
          12,
          50616
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "Othilia Cathel",
          "experience" : 11,
          "salary" : 193530
        },
        "sort" : [
          11,
          193530
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "Huntlee Dargavel",
          "experience" : 7,
          "salary" : 180025
        },
        "sort" : [
          7,
          180025
        ]
      }
    ]
  }
}

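In the results above, among employees with the same experience level, the one with the higher salary now ranks first (Alan and Winston have the same experience level, but unlike in the previous results, Alan is ranked higher here because his salary is higher).

Note: if we change the order of the entries in the sort array, putting "salary" first and "experience" second, the search results change as well. Results are sorted by salary first; experience is then used only to order documents with equal salaries, so it never disturbs the salary-based ordering.

Let's reverse the sort order of the query above, putting "salary" first and then "experience", as shown below: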



GET employees/_search?filter_path=**.hits
{
  "_source": [
    "name",
    "experience",
    "salary"
  ],
  "sort": [
    {
      "salary": {
        "order": "desc"
      }
    },
    {
      "experience": {
        "order": "desc"
      }
    }
  ]
}

The result is as follows:

{
  "hits" : {
    "hits" : [
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "Alan Thomas",
          "experience" : 12,
          "salary" : 300000
        },
        "sort" : [
          300000,
          12
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "Othilia Cathel",
          "experience" : 11,
          "salary" : 193530
        },
        "sort" : [
          193530,
          11
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "Huntlee Dargavel",
          "experience" : 7,
          "salary" : 180025
        },
        "sort" : [
          180025,
          7
        ]
      },
      {
        "_index" : "employees",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "Winston Waren",
          "experience" : 12,
          "salary" : 50616
        },
        "sort" : [
          50616,
          12
        ]
      }
    ]
  }
}

You can see that a candidate with experience 12 now ranks below a candidate with experience 7, because the latter earns the higher salary.
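
The effect of swapping the sort keys can also be checked locally. The following is a minimal Python sketch, not an Elasticsearch call: the helper `sort_desc` is a hypothetical name introduced here, and it emulates a multi-field descending sort by comparing tuples of field values, with the first field being the most significant.

```python
# The four sample documents, reduced to the fields requested via _source.
employees = [
    {"_id": "1", "name": "Huntlee Dargavel", "experience": 7, "salary": 180025},
    {"_id": "2", "name": "Othilia Cathel", "experience": 11, "salary": 193530},
    {"_id": "3", "name": "Winston Waren", "experience": 12, "salary": 50616},
    {"_id": "4", "name": "Alan Thomas", "experience": 12, "salary": 300000},
]


def sort_desc(docs, *fields):
    """Sort docs by the given fields, all descending; earlier fields win."""
    return sorted(docs, key=lambda doc: tuple(doc[f] for f in fields), reverse=True)


# Stand-in for "sort": [{"experience": ...}, {"salary": ...}]
experience_first = sort_desc(employees, "experience", "salary")
# Stand-in for "sort": [{"salary": ...}, {"experience": ...}]
salary_first = sort_desc(employees, "salary", "experience")

print([doc["name"] for doc in experience_first])
print([doc["name"] for doc in salary_first])
```

Because every salary in the sample data is distinct, the second key never comes into play when "salary" is listed first, which is why Winston Waren drops to the bottom despite his experience of 12.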

Continue reading "Elasticsearch: Elasticsearch query examples - hands-on practice (2)"
