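I ran into a problem: the same query string finds nothing with a term query but matches with a match query. The details are as follows.
Document mapping: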
{
  "mappings": {
    "tweet": {
      "properties": {
        "message": {
          "type": "text"
        },
        "name": {
          "type": "text"
        },
        "post_date": {
          "type": "date"
        },
        "user": {
          "type": "text"
        }
      }
    }
  }
}
Index a document:
{
  "user": "kimchy",
  "post_date": "2019-11-15T14:12:12",
  "message": "trying out ElasticSearch"
}
Now query the message field with a term query:
{
  "query": {
    "term": {
      "message": "trying out ElasticSearch"
    }
  }
}
The result is empty:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
Querying the message field with a match query instead:
{
  "query": {
    "match": {
      "message": "trying out ElasticSearch"
    }
  }
}
This returns the document:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.7594807,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "1",
        "_score": 0.7594807,
        "_source": {
          "user": "kimchy",
          "post_date": "2019-11-15T14:12:12",
          "message": "trying out ElasticSearch"
        }
      }
    ]
  }
}
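The cause is that the message field has type text. Fields of type text are analyzed: the analyzer breaks the value of message into a series of terms, and those terms are what get stored in the inverted index. For the message above, assuming the default standard analyzer (which also lowercases tokens), the terms stored in the inverted index are [trying, out, elasticsearch]. A term query does not go through the analyzer; it matches exactly against the terms in the inverted index. That is why the full string trying out ElasticSearch matches nothing, while trying or out on its own does match. For the same reason, a term query for ElasticSearch finds nothing, because the indexed term is the lowercase elasticsearch.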
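A match query first runs the query text through the analyzer, then matches the resulting terms against the index and computes a relevance score, so match does find the document.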
Why is it called a term query? Each entry in the inverted index is called a term, and a query that matches directly against those terms is a term query.
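The difference between the two queries can be sketched with a toy inverted index in Python. This is only an illustration, not how Elasticsearch is implemented: the `analyze` function is a rough stand-in for the standard analyzer (split on non-word characters, lowercase), and `build_index`, `term_query`, and `match_query` are hypothetical helper names.

```python
import re
from collections import defaultdict


def analyze(text):
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def build_index(docs):
    """Map each analyzed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def term_query(index, value):
    """No analysis: look the value up as one exact term."""
    return set(index.get(value, set()))


def match_query(index, text):
    """Analyze the query text first, then OR together the hits per term."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


docs = {1: "trying out ElasticSearch"}
index = build_index(docs)

print(sorted(index))                                    # ['elasticsearch', 'out', 'trying']
print(term_query(index, "trying out ElasticSearch"))    # set()
print(term_query(index, "trying"))                      # {1}
print(term_query(index, "ElasticSearch"))               # set()
print(match_query(index, "trying out ElasticSearch"))   # {1}
```

The whole phrase is never a key in the index, so the term lookup comes back empty, while the match lookup succeeds because it is broken into the same terms that were indexed.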
Sometimes we need exact matching on a field and do not want it tokenized. In that case, set its mapping type to keyword, as follows:
{
  "mappings": {
    "tweet": {
      "properties": {
        "message": {
          "type": "keyword"
        },
        "name": {
          "type": "keyword"
        },
        "post_date": {
          "type": "date"
        },
        "user": {
          "type": "keyword"
        }
      }
    }
  }
}
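Since the type of an existing field cannot be changed in place, recreate the index with this mapping and index the document again. Then run the term query once more: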
{
  "query": {
    "term": {
      "message": "trying out ElasticSearch"
    }
  }
}
This time the document is matched:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "user": "kimchy",
          "post_date": "2019-11-15T14:12:12",
          "message": "trying out ElasticSearch"
        }
      }
    ]
  }
}
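Why the keyword mapping makes the term query succeed can be illustrated with a small Python sketch. It is a simplified model, not Elasticsearch's actual implementation, and `build_keyword_index` and `term_query` are hypothetical names: a keyword field is not tokenized, so the whole value is stored as a single term.

```python
def build_keyword_index(docs):
    """Store each field value, unchanged, as one single term."""
    index = {}
    for doc_id, value in docs.items():
        index.setdefault(value, set()).add(doc_id)
    return index


def term_query(index, value):
    """Exact lookup of the value as one term, with no analysis."""
    return set(index.get(value, set()))


docs = {1: "trying out ElasticSearch"}
keyword_index = build_keyword_index(docs)

print(term_query(keyword_index, "trying out ElasticSearch"))  # {1}
print(term_query(keyword_index, "trying"))                    # set()
print(term_query(keyword_index, "trying out elasticsearch"))  # set()
```

The trade-off is visible in the last two lines: with a keyword field, a single word or a differently cased string no longer matches, because only the exact original value exists as a term.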
In summary: to match exact values as they are stored in the inverted index, use a term query; to search within analyzed text, use a match query.