- This is an original article; please credit the source when reposting.
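Query overview
Querying multiple indices
Name the target indices explicitly, separated by commas (no spaces); wildcards are supported.
GET /index_1,index_2,.../_search
GET /index*/_search
Searching across all indices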
GET /_search
GET /_all/_search
GET /*/_search
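Basic building blocks of a search request
- query: the body of the query and the most important part of the request; it selects the documents to return.
- from: offset of the first hit to return; used together with size for pagination.
- size: number of documents to return.
- _source: controls how the _source field is returned. By default the complete _source is returned. It can be set in three ways:
  - set _source to false to omit the source content;
  - list the fields to return (wildcards are supported), for example: "_source": ["date", "title"]
  - select fields with "includes" and "excludes" (wildcards are supported), for example: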
GET /mt_product/_search
{
  "query": {
    "match_all": {}
  },
  "_source": {
    "includes": [ "location.*", "date" ],
    "excludes": [ "location.geolocation" ]
  }
}
- sort: the sort criteria; by default hits are sorted by relevance score.
Example using the basic building blocks:
GET /mt_product/_search
{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 10,
  "_source": ["date", "title"],
  "sort": [{"create_time": "desc"}]
}
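Because from is an offset rather than a page number, client code usually has to convert one into the other. The sketch below shows one way to do that. The helper names page_to_from_size and build_search_body are hypothetical, and the 10,000 limit assumes the index keeps the default index.max_result_window setting.

```python
import json

# Assumption: the index uses the default index.max_result_window of 10,000.
MAX_RESULT_WINDOW = 10_000


def page_to_from_size(page, size=10):
    """Map a 1-based page number to the from/size pair of a search body."""
    if page < 1 or size < 0:
        raise ValueError("page must be >= 1 and size must be >= 0")
    offset = (page - 1) * size
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the assumed result window")
    return {"from": offset, "size": size}


def build_search_body(page, size=10):
    """Hypothetical helper: match_all query, newest first, one page of hits."""
    body = {"query": {"match_all": {}}, "sort": [{"create_time": "desc"}]}
    body.update(page_to_from_size(page, size))
    return body


if __name__ == "__main__":
    # Prints the JSON body that would follow GET /mt_product/_search.
    print(json.dumps(build_search_body(page=3, size=10), indent=2))
```

Deep pages become expensive with from/size, which is why the sketch rejects windows beyond the assumed limit instead of sending them.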
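Response structure: a response carries took, timed_out, _shards and hits (with total, max_score and the hits array), as in the results shown for the queries below.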
Fuzzy (full-text) queries
Single-field full-text search (match, match_phrase)
GET /mt_product/_search
{
"query": {
"match": {
"name": "好吃 寿司" // roughly name LIKE '%好吃%' OR name LIKE '%寿司%'
}
}
}
The query returns three documents:
{"took":5,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":3,"relation":"eq"},"max_score":11.056448,"hits":[{"_index":"mt_product","_id":"0","_score":11.056448,"_source":{"id": 0,"name": "好吃的寿司","tags": ["寿司"],"price": 20.9,"sales": 1000,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}},{"_index":"mt_product","_id":"3","_score":3.2359123,"_source":{"id": 3,"name": "三文鱼寿司","tags": ["寿司,鱼肉"],"price": 16.9,"sales": 820,"score": 4.9,"store_id": 2,"store_name": "爱食寿司","create_time": "2023-01-18T08:03:00"
}},{"_index":"mt_product","_id":"4","_score":2.7279496,"_source":{"id": 4,"name": "极上全品寿司套餐","tags": ["寿司"],"price": 25,"sales": 1500,"score": 4.6,"store_id": 2,"store_name": "爱食寿司","create_time": "2023-01-18T08:04:00"
}}]}}
match_phrase is a full-text query type in Elasticsearch that matches documents containing the given phrase. The terms must appear in the field in the same order as in the query string.
GET /mt_product/_search
{
"query": {
"match_phrase": {
"name": "好吃 寿司" // roughly name LIKE '%好吃%' AND name LIKE '%寿司%', with 好吃 appearing before 寿司
}
}
}
{"took":19,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":11.056448,"hits":[{"_index":"mt_product","_id":"0","_score":11.056448,"_source":{"id": 0,"name": "好吃的寿司","tags": ["寿司"],"price": 20.9,"sales": 1000,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}}]}}
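The difference between the two queries can be sketched in a few lines of Python. This is only a rough model: the analyze function below is a toy analyzer (lowercase plus whitespace split) and does not reproduce ik_max_word, match is modelled with its default OR operator, and match_phrase with a slop of 0.

```python
from typing import List


def analyze(text: str) -> List[str]:
    """Toy analyzer (an assumption): lowercase, then split on whitespace."""
    return text.lower().split()


def match(field_text: str, query: str) -> bool:
    """match with the OR operator: any query token occurs in the field."""
    field_tokens = set(analyze(field_text))
    return any(token in field_tokens for token in analyze(query))


def match_phrase(field_text: str, query: str) -> bool:
    """match_phrase with slop 0: query tokens occur consecutively, in order."""
    field_tokens = analyze(field_text)
    query_tokens = analyze(query)
    n = len(query_tokens)
    if n == 0:
        return False
    return any(field_tokens[i:i + n] == query_tokens
               for i in range(len(field_tokens) - n + 1))


if __name__ == "__main__":
    name = "tasty salmon sushi"
    print(match(name, "tasty roll"))            # one shared token is enough
    print(match_phrase(name, "salmon sushi"))   # adjacent and in order
    print(match_phrase(name, "sushi salmon"))   # wrong order
```

With a real analyzer the token boundaries and positions differ, so the results for Chinese text depend on how the analyzer splits the field.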
Multi-field full-text search (multi_match)
GET /mt_product/_search
{
"query": {
"multi_match": {
"query":"好吃 寿司",
"fields": ["name", "store_name"] // roughly name LIKE '%好吃%' OR name LIKE '%寿司%' OR store_name LIKE '%好吃%' OR store_name LIKE '%寿司%'
}
}
}
{"took":6,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":5,"relation":"eq"},"max_score":8.449602,"hits":[{"_index":"mt_product","_id":"0","_score":8.449602,"_source":{"id": 0,"name": "好吃的寿司","tags": ["寿司"],"price": 20.9,"sales": 1000,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}},{"_index":"mt_product","_id":"3","_score":3.2359123,"_source":{"id": 3,"name": "三文鱼寿司","tags": ["寿司,鱼肉"],"price": 16.9,"sales": 820,"score": 4.9,"store_id": 2,"store_name": "爱食寿司","create_time": "2023-01-18T08:03:00"
}},{"_index":"mt_product","_id":"4","_score":2.7279496,"_source":{"id": 4,"name": "极上全品寿司套餐","tags": ["寿司"],"price": 25,"sales": 1500,"score": 4.6,"store_id": 2,"store_name": "爱食寿司","create_time": "2023-01-18T08:04:00"
}},{"_index":"mt_product","_id":"1","_score":1.7392528,"_source":{"id": 1,"name": "招牌海苔单人餐","tags": ["寿司"],"price": 9.9,"sales": 1000,"score": 4.7,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:01:00"
}},{"_index":"mt_product","_id":"2","_score":1.7392528,"_source":{"id": 2,"name": "1-2人招牌双拼套餐","tags": ["寿司"],"price": 18.9,"sales": 1200,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}}]}}
Primary-key query (ids)
Fetch documents by several IDs, similar to id IN (...) in SQL. Example:
GET /mt_product/_search
{
"query": {
"ids" : {
"values" : ["1", "4"]
}
}
}
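Exact matching
Single-field, single-value exact search (term)
To find all products of a given store ID in a relational database, we would typically write:
SELECT * FROM mt_product WHERE store_id = 1;
In Elasticsearch the same result is obtained with a term query, which looks for the exact value we specify.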
GET /mt_product/_search
{
"query": {
"term": {
"store_id": 1 // matches documents with store_id = 1
}
}
}
The query returns three documents:
{"took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":3,"relation":"eq"},"max_score":1.3862942,"hits":[{"_index":"mt_product","_id":"1","_score":1.3862942,"_source":{"id": 1,"name": "招牌海苔单人餐","tags": ["寿司"],"price": 9.9,"sales": 1000,"score": 4.7,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:01:00"
}},{"_index":"mt_product","_id":"0","_score":1.3862942,"_source":{"id": 0,"name": "好吃的寿司","tags": ["寿司"],"price": 20.9,"sales": 1000,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}},{"_index":"mt_product","_id":"2","_score":1.3862942,"_source":{"id": 2,"name": "1-2人招牌双拼套餐","tags": ["寿司"],"price": 18.9,"sales": 1200,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}}]}}
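When running a term query against text, check whether the field is analyzed at index time, because this affects the result: term does not analyze the search value, so it only matches if that exact value exists as a term in the index.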
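A small model makes the effect visible. The sketch assumes a toy analyzer (lowercase plus whitespace split) for text fields, which is not ik_max_word, and treats a keyword field as storing the whole value as a single term. All function names are hypothetical.

```python
from typing import List, Set


def analyze(text: str) -> List[str]:
    """Toy analyzer (an assumption): lowercase, then split on whitespace."""
    return text.lower().split()


def index_text(value: str) -> Set[str]:
    """A text field stores the tokens produced by the analyzer."""
    return set(analyze(value))


def index_keyword(value: str) -> Set[str]:
    """A keyword field stores the whole value as one term."""
    return {value}


def term_query(indexed_terms: Set[str], value: str) -> bool:
    """term compares the search value to the indexed terms without analysis."""
    return value in indexed_terms


if __name__ == "__main__":
    as_text = index_text("Quick Brown Fox")
    as_keyword = index_keyword("Quick Brown Fox")
    print(term_query(as_text, "Quick Brown Fox"))     # full value is not a token
    print(term_query(as_text, "quick"))               # a single token matches
    print(term_query(as_keyword, "Quick Brown Fox"))  # whole value matches
```

This is why store_id, mapped as keyword in the appendix, is a natural target for term, while name and tags are better searched with match.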
Single-field, multi-value exact search (terms)
To query several exact values, use terms and pass the values as an array.
GET /mt_product/_search
{
"query": {
"terms": {
"store_id": [1, 2] // matches documents with store_id IN (1, 2)
}
},
"_source":["id", "store_id"]
}
{"took":3,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":5,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"mt_product","_id":"1","_score":1.0,"_source":{"id":1,"store_id":1}},{"_index":"mt_product","_id":"0","_score":1.0,"_source":{"id":0,"store_id":1}},{"_index":"mt_product","_id":"2","_score":1.0,"_source":{"id":2,"store_id":1}},{"_index":"mt_product","_id":"3","_score":1.0,"_source":{"id":3,"store_id":2}},{"_index":"mt_product","_id":"4","_score":1.0,"_source":{"id":4,"store_id":2}}]}}
Range queries
A range query accepts both inclusive and exclusive bounds. The following options can be combined:
- gt: greater than
- gte: greater than or equal to
- lt: less than
- lte: less than or equal to
GET /mt_product/_search
{
"query": {
"range": {
"price": { "gte": 10,"lte": 20}
}
}
}
{"took":18,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":3,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"mt_product","_id":"2","_score":1.0,"_source":{"id": 2,"name": "1-2人招牌双拼套餐","tags": ["寿司"],"price": 18.9,"sales": 1200,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}},{"_index":"mt_product","_id":"3","_score":1.0,"_source":{"id": 3,"name": "三文鱼寿司","tags": ["寿司,鱼肉"],"price": 16.9,"sales": 820,"score": 4.9,"store_id": 2,"store_name": "爱食寿司","create_time": "2023-01-18T08:03:00"
}},{"_index":"mt_product","_id":"9","_score":1.0,"_source":{"id": 9,"name": "霸道小酥鸡+薯霸王","tags": ["汉堡,鸡肉"],"price": 19,"sales": 300,"score": 4.2,"store_id": 4,"store_name": "汉堡王","create_time": "2023-01-18T08:09:00"
}}]}}
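The way the four bounds combine can be checked with a short Python sketch. The prices are copied from the appendix data set; in_range and range_query are hypothetical names, and the code only models the comparison logic, not how Elasticsearch evaluates ranges internally.

```python
import operator

# Each range option maps to a plain comparison operator.
_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

# Prices of the appendix documents, keyed by document id.
PRICES = {0: 20.9, 1: 9.9, 2: 18.9, 3: 16.9, 4: 25, 5: 21.5, 6: 21.5,
          7: 99, 8: 29, 9: 19, 10: 29, 11: 8, 12: 9.9}


def in_range(value, bounds):
    """True when the value satisfies every bound in the dict."""
    return all(_OPS[name](value, limit) for name, limit in bounds.items())


def range_query(prices, bounds):
    """Return the sorted ids whose price satisfies all bounds."""
    return sorted(doc_id for doc_id, price in prices.items()
                  if in_range(price, bounds))


if __name__ == "__main__":
    print(range_query(PRICES, {"gte": 10, "lte": 20}))
```

For the bounds used in the query above, the sketch selects ids 2, 3 and 9, the same documents as in the response.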
Existence checks: exists and missing
exists returns the documents that have a value in the given field:
GET /my_index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "exists": { "field": "tags" } // equivalent to SELECT tags FROM posts WHERE tags IS NOT NULL
      }
    }
  }
}
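The missing query was the exact opposite of exists: it returned the documents that have no value in the given field. It was removed in Elasticsearch 5.0; the same result is obtained by placing exists in the must_not clause of a bool query:
GET /my_index/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": { "field": "tags" } // equivalent to SELECT tags FROM posts WHERE tags IS NULL
      }
    }
  }
}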
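Sometimes we need to tell apart a field that has no value from one that was explicitly set to null. The default behavior shown in the examples cannot do this. One option is to replace explicit null values with a placeholder of our choosing, using the null_value mapping parameter.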
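Compound queries
The compound queries are: Bool query, Boosting query, Constant_score (constant score query), Dis_max (disjunction max, or best-match query) and Function_score (function score query). Bool query selects documents by combining match conditions; Boosting query, Constant_score, Dis_max and Function_score change the ranking of the results by adjusting relevance scores.
Bool query
A bool query combines one or more clauses, each of which is a sub-query. The clauses fall into the following types:
- must: every clause must match, similar to AND in SQL; contributes to the score.
- should: any of the clauses may match, similar to OR in SQL; contributes to the score.
- must_not: none of the clauses may match, similar to NOT IN in SQL; does not contribute to the score.
- filter: filter context; unlike must, it does not affect the score of matching documents.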
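How the four clause types decide whether a document is returned can be sketched as follows. This is a simplified model under stated assumptions: contains is a crude substring test standing in for a match query (no analysis), scoring is ignored, and should is only required when there is no must or filter clause. The documents carry the store_name and tags values from the appendix.

```python
from typing import Any, Callable, Dict, List, Sequence

Doc = Dict[str, Any]
Clause = Callable[[Doc], bool]

# store_name and tags of the appendix documents (tags kept as one string).
DOCS: List[Doc] = [
    {"id": 0, "store_name": "M多寿司", "tags": "寿司"},
    {"id": 1, "store_name": "M多寿司", "tags": "寿司"},
    {"id": 2, "store_name": "M多寿司", "tags": "寿司"},
    {"id": 3, "store_name": "爱食寿司", "tags": "寿司,鱼肉"},
    {"id": 4, "store_name": "爱食寿司", "tags": "寿司"},
    {"id": 5, "store_name": "肯德基", "tags": "汉堡,鸡肉"},
    {"id": 6, "store_name": "肯德基", "tags": "汉堡,鸡肉"},
    {"id": 7, "store_name": "肯德基", "tags": "鸡肉"},
    {"id": 8, "store_name": "汉堡王", "tags": "汉堡"},
    {"id": 9, "store_name": "汉堡王", "tags": "汉堡,鸡肉"},
    {"id": 10, "store_name": "麦当劳", "tags": "汉堡,鸡肉"},
    {"id": 11, "store_name": "麦当劳", "tags": "汉堡"},
    {"id": 12, "store_name": "麦当劳", "tags": "汉堡,鸡肉"},
]


def contains(field: str, text: str) -> Clause:
    """Crude stand-in for match: substring test instead of analysis."""
    return lambda doc: text in doc[field]


def bool_matches(doc: Doc, must: Sequence[Clause] = (),
                 should: Sequence[Clause] = (),
                 must_not: Sequence[Clause] = (),
                 filter_: Sequence[Clause] = ()) -> bool:
    """Decide whether one document passes a bool query (scores ignored)."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # Without must or filter, at least one should clause has to match.
    if should and not must and not filter_:
        return any(clause(doc) for clause in should)
    return True


def search(docs: List[Doc], **clauses: Sequence[Clause]) -> List[int]:
    """Return the ids of all documents that pass the bool query."""
    return [doc["id"] for doc in docs if bool_matches(doc, **clauses)]


if __name__ == "__main__":
    excluded = [contains("store_name", "麦当劳"), contains("tags", "寿司")]
    print(search(DOCS, must_not=excluded))
    print(search(DOCS, must_not=excluded,
                 filter_=[contains("store_name", "汉堡王")]))
```

Because the substring test ignores the analyzer, the sketch can disagree with Elasticsearch whenever ik_max_word splits a value into smaller tokens.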
Must match (must)
To find products whose tags match "寿司" and whose price is at most 15 yuan:
GET /mt_product/_search
{
"query": {
"bool": {
"must": [
{"match": {"tags": "寿司"}},
{"range": {"price": {"lte": 15}}}
]
}
}
}
{"took":16,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":3.145831,"hits":[{"_index":"mt_product","_id":"1","_score":3.145831,"_source":{"id": 1,"name": "招牌海苔单人餐","tags": ["寿司"],"price": 9.9,"sales": 1000,"score": 4.7,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:01:00"
}}]}}
May match (should)
To find products whose tags match "鱼肉" or whose store_name matches "麦当劳":
GET /mt_product/_search
{
"query": {
"bool": {
"should": [
{"match": {"tags": "鱼肉"}},
{"match": {"store_name": "麦当劳"}}
]
}
},
"_source":["id", "store_name", "tags"]
}
{"took":7,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":8,"relation":"eq"},"max_score":4.90405,"hits":[{"_index":"mt_product","_id":"10","_score":4.90405,"_source":{"id":10,"tags":["汉堡,鸡肉"],"store_name":"麦当劳"}},{"_index":"mt_product","_id":"12","_score":4.90405,"_source":{"id":12,"tags":["汉堡,鸡肉"],"store_name":"麦当劳"}},{"_index":"mt_product","_id":"11","_score":4.3616457,"_source":{"id":11,"tags":["汉堡"],"store_name":"麦当劳"}},{"_index":"mt_product","_id":"3","_score":2.4834473,"_source":{"id":3,"tags":["寿司,鱼肉"],"store_name":"爱食寿司"}}]}}
Must not match (must_not)
To find products whose store_name is not "麦当劳" and whose tags do not contain "寿司":
GET /mt_product/_search
{
"query":{
"bool":{
"must_not":[
{
"match":{"store_name":"麦当劳"}
},
{
"match":{"tags":"寿司"}
}
]
}
},
"_source":["id", "store_name", "tags"]
}
{"took":11,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":5,"relation":"eq"},"max_score":0.0,"hits":[{"_index":"mt_product","_id":"5","_score":0.0,"_source":{"id":5,"tags":["汉堡,鸡肉"],"store_name":"肯德基"}},{"_index":"mt_product","_id":"6","_score":0.0,"_source":{"id":6,"tags":["汉堡,鸡肉"],"store_name":"肯德基"}},{"_index":"mt_product","_id":"7","_score":0.0,"_source":{"id":7,"tags":["鸡肉"],"store_name":"肯德基"}},{"_index":"mt_product","_id":"8","_score":0.0,"_source":{"id":8,"tags":["汉堡"],"store_name":"汉堡王"}},{"_index":"mt_product","_id":"9","_score":0.0,"_source":{"id":9,"tags":["汉堡,鸡肉"],"store_name":"汉堡王"}}]}}
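Filtering (filter)
filter works like must in that every clause has to match, but it does not affect the score of the original query. For example, if we take the previous query and add the condition that store_name is "汉堡王" as a filter, the scores of the hits are the same as in the previous query.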
GET /mt_product/_search
{
"query":{
"bool":{
"must_not":[
{
"match":{"store_name":"麦当劳"}
},
{
"match":{"tags":"寿司"}
}
],
"filter":[
{
"match":{"store_name":"汉堡王"}
}
]
}
},
"_source":["id", "store_name", "tags"]
}
{"took":7,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":2,"relation":"eq"},"max_score":0.0,"hits":[{"_index":"mt_product","_id":"8","_score":0.0,"_source":{"id":8,"tags":["汉堡"],"store_name":"汉堡王"}},{"_index":"mt_product","_id":"9","_score":0.0,"_source":{"id":9,"tags":["汉堡,鸡肉"],"store_name":"汉堡王"}}]}}
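Boosting query
A boosting query returns the documents that match the positive sub-query and adjusts their relevance score depending on whether they also match the negative sub-query. Unlike a bool query, where must + must_not removes non-matching documents, a boosting query lowers the score of documents that match the negative sub-query and thereby improves the ordering of the results.
A boosting query has the following important parameters:
- positive: the positive clause (the condition we want to match), similar to must in a bool query; every returned document must match it.
- negative: the negative clause (the condition we would rather not match); unlike must_not in a bool query, it only lowers the relevance score and does not remove documents.
- negative_boost: a floating-point number between 0 and 1.0 used to lower the relevance score of documents that match the negative query.
Score calculation rules:
- if the document does not match negative, the original score is returned;
- if the document matches negative, the original score is multiplied by negative_boost.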
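The two rules above fit in a single function. The sketch below is only an illustration of the arithmetic; boosting_score is a hypothetical name and the scores are made-up inputs, not values produced by Elasticsearch.

```python
def boosting_score(positive_score: float, matches_negative: bool,
                   negative_boost: float) -> float:
    """Apply the boosting rule to the score of the positive query."""
    if not 0.0 <= negative_boost <= 1.0:
        raise ValueError("negative_boost must be between 0 and 1.0")
    if matches_negative:
        # Demote the document instead of removing it.
        return positive_score * negative_boost
    return positive_score


if __name__ == "__main__":
    print(boosting_score(4.0, matches_negative=False, negative_boost=0.5))
    print(boosting_score(4.0, matches_negative=True, negative_boost=0.5))
```

A negative_boost of 1.0 leaves every score unchanged, while a value of 0 pushes documents that match negative to a score of 0 without removing them from the results.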
Example from the official documentation:
GET /_search
{
"query": {
"boosting": {
"positive": {
"term": {
"text": "apple"
}
},
"negative": {
"term": {
"text": "pie tart fruit crumble tree"
}
},
"negative_boost": 0.5
}
}
}
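Example scenario: suppose we search an index for information about Apple Inc. using must = 'apple' AND must_not = 'tree' OR 'fruit'. That is too blunt: a document that contains both 'apple' and 'tree' is not necessarily about apple trees; it could just as well say 'Apple organized a tree-planting day for its employees'. Such a document should still be returned instead of being filtered out, and this is exactly what a boosting query makes possible.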
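Constant Score Query
A Constant Score Query assigns a fixed relevance score to every matching document, regardless of the document's content or any other factor. It suits cases where we only care whether a document matches, not how well it matches.
The parameters of a Constant Score Query are:
- filter: the filter condition;
- boost: the constant relevance score assigned to the returned documents; the default is 1.0.
Score calculation rule: final score = boost.
Example from the official documentation: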
GET /_search
{
"query": {
"constant_score": {
"filter": {
"term": { "user.id": "kimchy" }
},
"boost": 1.2
}
}
}
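Dis Max Query
When there are several sub-queries, the relevance score of a returned document is the highest score among the matching sub-queries. A bool query adds up the scores of its should clauses to obtain the final score; a Dis Max Query takes the maximum instead, and the tie_breaker parameter can be used to give the other sub-queries a weighted contribution.
The parameters of a Dis Max Query are:
- queries: an array of sub-queries; a returned document must match one or more of them. If a document matches several, Elasticsearch uses the highest relevance score.
- tie_breaker: (optional, float) a number between 0 and 1.0 used to raise the relevance score of documents that match more than one clause. The default is 0.0. If a document matches several clauses, dis_max computes its relevance score as follows:
  (1) take the relevance score of the best-scoring matching clause;
  (2) multiply the score of every other matching clause by tie_breaker;
  (3) add the results of steps 1 and 2 to obtain the final relevance score.
  If tie_breaker is greater than 0.0, all matching clauses count, but the best-scoring clause counts the most.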
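The three steps can be written out as a short Python function. This is a sketch of the arithmetic only; dis_max_score is a hypothetical name and the clause scores are made-up inputs.

```python
from typing import Sequence


def dis_max_score(clause_scores: Sequence[float],
                  tie_breaker: float = 0.0) -> float:
    """Combine the scores of the matching clauses the way dis_max does."""
    if not clause_scores:
        raise ValueError("the document must match at least one clause")
    if not 0.0 <= tie_breaker <= 1.0:
        raise ValueError("tie_breaker must be between 0 and 1.0")
    best = max(clause_scores)            # step 1: best-scoring clause
    others = sum(clause_scores) - best   # scores of the remaining clauses
    return best + tie_breaker * others   # steps 2 and 3


if __name__ == "__main__":
    scores = [4.0, 2.0, 1.0]
    print(dis_max_score(scores))                   # only the best clause counts
    print(dis_max_score(scores, tie_breaker=0.5))  # the others count at half weight
```

With tie_breaker set to 1.0 the result equals the plain sum of all matching clause scores, and with the default of 0.0 only the best clause counts.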
Example from the official documentation:
GET /_search
{
"query": {
"dis_max": {
"queries": [
{ "term": { "title": "Quick pets" } },
{ "term": { "body": "Quick pets" } }
],
"tie_breaker": 0.7
}
}
}
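Function Score Query
A Function Score Query lets custom functions take part in computing a document's relevance score. These functions can be based on field values, distance and similar information. This query suits cases that need fine-grained control over the ranking of results. For details, see the official documentation: Function score query | Elasticsearch Guide [7.8] | Elastic.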
Appendix
Data set used by the examples in this article:
PUT mt_product
{"mappings": {"properties": {"id": {"type": "long"},"name": {"type": "text","analyzer": "ik_max_word"},"tags": {"type": "text","analyzer": "ik_max_word"},"price": {"type": "float"},"sales": {"type": "integer"},"score": {"type": "float"},"store_id": {"type": "keyword"},"store_name": {"type": "text","analyzer": "ik_max_word"},"create_time": {"type": "date"}}}
}
POST /mt_product/_doc/0
{"id": 0,"name": "好吃的寿司","tags": ["寿司"],"price": 20.9,"sales": 1000,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}
POST /mt_product/_doc/1
{"id": 1,"name": "招牌海苔单人餐","tags": ["寿司"],"price": 9.9,"sales": 1000,"score": 4.7,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:01:00"
}
POST /mt_product/_doc/2
{"id": 2,"name": "1-2人招牌双拼套餐","tags": ["寿司"],"price": 18.9,"sales": 1200,"score": 4.8,"store_id": 1,"store_name": "M多寿司","create_time": "2023-01-18T08:02:00"
}
POST /mt_product/_doc/3
{"id": 3,"name": "三文鱼寿司","tags": ["寿司,鱼肉"],"price": 16.9,"sales": 820,"score": 4.9,"store_id": 2,"store_name": "爱食寿司","create_time": "2023-01-18T08:03:00"
}
POST /mt_product/_doc/4
{"id": 4,"name": "极上全品寿司套餐","tags": ["寿司"],"price": 25,"sales": 1500,"score": 4.6,"store_id": 2,"store_name": "爱食寿司","create_time": "2023-01-18T08:04:00"
}
POST /mt_product/_doc/5
{"id": 5,"name": "劲脆鸡腿汉堡","tags": ["汉堡,鸡肉"],"price": 21.5,"sales": 200,"score": 4.5,"store_id": 3,"store_name": "肯德基","create_time": "2023-01-18T08:05:00"
}
POST /mt_product/_doc/6
{"id": 6,"name": "香辣鸡腿汉堡","tags": ["汉堡,鸡肉"],"price": 21.5,"sales": 98,"score": 4.4,"store_id": 3,"store_name": "肯德基","create_time": "2023-01-18T08:06:00"
}
POST /mt_product/_doc/7
{"id": 7,"name": "20块香辣鸡翅","tags": ["鸡肉"],"price": 99,"sales": 5,"score": 4.8,"store_id": 3,"store_name": "肯德基","create_time": "2023-01-18T08:07:00"
}
POST /mt_product/_doc/8
{"id": 8,"name": "3层芝士年堡套餐","tags": ["汉堡"],"price": 29,"sales": 4000,"score": 4.9,"store_id": 4,"store_name": "汉堡王","create_time": "2023-01-18T08:08:00"
}
POST /mt_product/_doc/9
{"id": 9,"name": "霸道小酥鸡+薯霸王","tags": ["汉堡,鸡肉"],"price": 19,"sales": 300,"score": 4.2,"store_id": 4,"store_name": "汉堡王","create_time": "2023-01-18T08:09:00"
}
POST /mt_product/_doc/10
{"id": 10,"name": "双层原味板烧鸡腿麦满分四件套","tags": ["汉堡,鸡肉"],"price": 29,"sales": 3000,"score": 4.8,"store_id": 5,"store_name": "麦当劳","create_time": "2023-01-18T08:10:00"
}
POST /mt_product/_doc/11
{"id": 11,"name": "火腿扒麦满分组合","tags": ["汉堡"],"price": 8,"sales": 100000,"score": 4.9,"store_id": 5,"store_name": "麦当劳","create_time": "2023-01-18T08:11:00"
}
POST /mt_product/_doc/12
{"id": 12,"name": "原味板烧鸡腿麦满组件","tags": ["汉堡,鸡肉"],"price": 9.9,"sales": 140000,"score": 4.9,"store_id": 5,"store_name": "麦当劳","create_time": "2023-01-18T08:12:00"
}