This article is based on Elasticsearch 7.4.2.
Test data:
POST /cross_fileds_test/_doc/1
{
  "last_name": "Will",
  "first_name": "Smith"
}
POST /cross_fileds_test/_doc/2
{
  "last_name": "Smith",
  "first_name": "Will"
}
POST /cross_fileds_test/_doc/3
{
  "last_name": "John",
  "first_name": "Smith"
}
POST /cross_fileds_test/_doc/4
{
  "last_name": "Smith",
  "first_name": "John"
}
Suppose we want to find the documents that contain Will Smith. If we query with best_fields and require that both words be present, that is, operator=and:
GET /cross_fileds_test/_search
{
  "query": {
    "multi_match": {
      "query": "Will Smith",
      "type": "best_fields",
      "fields": [ "first_name", "last_name" ],
      "operator": "and"
    }
  }
}
However, best_fields is field-centric, which means the operator parameter is applied to each field independently. The query above therefore behaves like this:
(first_name:will AND first_name:smith) OR (last_name:will AND last_name:smith)
Response:
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
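In other words, all terms are required to appear in the same field: either both will and smith appear in first_name, or both will and smith appear in last_name. That is clearly not what we want from a search that spans first_name and last_name, and it is why no document is returned.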
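The field-centric behaviour can be reproduced with a small Python model. This is only a sketch of the boolean logic, not of how Elasticsearch executes or scores the query; `DOCS`, `analyze`, and `field_centric_and` are hypothetical names, and the analyzer is assumed to do nothing more than lowercase the text and split it on whitespace.

```python
# Simplified model of best_fields with operator=and (matching only, no scoring).
DOCS = {
    1: {"last_name": "Will", "first_name": "Smith"},
    2: {"last_name": "Smith", "first_name": "Will"},
    3: {"last_name": "John", "first_name": "Smith"},
    4: {"last_name": "Smith", "first_name": "John"},
}


def analyze(text):
    """Assumed stand-in for the analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def field_centric_and(doc, query, fields):
    """True if at least one single field contains every query term."""
    terms = analyze(query)
    return any(
        all(term in analyze(doc[field]) for term in terms)
        for field in fields
    )


field_centric_hits = [
    doc_id
    for doc_id, doc in DOCS.items()
    if field_centric_and(doc, "Will Smith", ["first_name", "last_name"])
]
print(field_centric_hits)  # []
```

No test document holds both terms in one field, so the list is empty, which agrees with the zero hits in the response.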
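So how can we get the result we need? This is where cross_fields fits our requirement well.
cross_fields: looks for each term in any of the fields. It takes a term-centric approach: it first analyzes the query string into individual terms, then looks for each term in any of the fields, as though they were one big field.
Let's search again for the documents that contain Will Smith: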
GET /cross_fileds_test/_search
{
  "query": {
    "multi_match": {
      "query": "Will Smith",
      "type": "cross_fields",
      "fields": [ "first_name", "last_name" ],
      "operator": "and"
    }
  }
}
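How the query above is processed:
- The query string is analyzed into individual terms: will and smith;
- Each term is then looked up in any of the fields: (first_name:will OR last_name:will) AND (first_name:smith OR last_name:smith)
In other words, every term must appear in at least one field for a document to match. With the test data above, where each field holds a single word, the query therefore matches two kinds of documents:
- documents whose first_name is will and whose last_name is smith;
- documents whose first_name is smith and whose last_name is will.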
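The term-centric logic can be checked against the test data with a short Python model. As before, this is a sketch of the matching rule only and says nothing about scoring; `DOCS`, `analyze`, and `term_centric_and` are hypothetical names, and the analyzer is assumed to only lowercase the text and split it on whitespace.

```python
# Simplified model of cross_fields with operator=and (matching only, no scoring).
DOCS = {
    1: {"last_name": "Will", "first_name": "Smith"},
    2: {"last_name": "Smith", "first_name": "Will"},
    3: {"last_name": "John", "first_name": "Smith"},
    4: {"last_name": "Smith", "first_name": "John"},
}


def analyze(text):
    """Assumed stand-in for the analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def term_centric_and(doc, query, fields):
    """True if every query term appears in at least one of the fields."""
    return all(
        any(term in analyze(doc[field]) for field in fields)
        for term in analyze(query)
    )


term_centric_hits = [
    doc_id
    for doc_id, doc in DOCS.items()
    if term_centric_and(doc, "Will Smith", ["first_name", "last_name"])
]
print(term_centric_hits)  # [1, 2]
```

The only difference from a field-centric check is the nesting order: here the outer loop runs over terms and the inner one over fields, so the two terms are free to match in different fields.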
Response; only documents 1 and 2 meet the requirement:
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.89712,
    "hits" : [
      {
        "_index" : "cross_fileds_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.89712,
        "_source" : {
          "last_name" : "Will",
          "first_name" : "Smith"
        }
      },
      {
        "_index" : "cross_fileds_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.89712,
        "_source" : {
          "last_name" : "Smith",
          "first_name" : "Will"
        }
      }
    ]
  }
}