Why a term query fails to match a document


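I ran into a problem: the same search finds nothing with a term query but finds the document with a match query. The details are as follows.
Document mapping: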

{
    "mappings": {
        "tweet": {
            "properties": {
                "message": {
                    "type": "text"
                },
                "name": {
                    "type": "text"
                },
                "post_date": {
                    "type": "date"
                },
                "user": {
                    "type": "text"
                }
            }
        }
    }
}

Index a document:

{
	"user" : "kimchy",
	"post_date" : "2019-11-15T14:12:12",
	"message" : "trying out ElasticSearch"
}

Now query on message with a term query:

{
	"query" : {
		"term" : {
			"message" : "trying out ElasticSearch"
		}
	}
}

The result is empty:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
    }
}

Querying on message with a match query instead:

{
	"query" : {
		"match" : {
			"message" : "trying out ElasticSearch"
		}
	}
}

successfully returns the document:

{
	"took": 3,
	"timed_out": false,
	"_shards": {
		"total": 5,
		"successful": 5,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": 1,
		"max_score": 0.7594807,
		"hits": [
			{
				"_index": "twitter",
				"_type": "tweet",
				"_id": "1",
				"_score": 0.7594807,
				"_source": {
					"user": "kimchy",
					"post_date": "2019-11-15T14:12:12",
					"message": "trying out ElasticSearch"
				}
			}
		]
	}
}

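The cause is that the message field has type text. Fields of type text are analyzed: at index time the analyzer breaks the value of message into a series of terms, and those terms are what the inverted index stores. Assuming the default standard analyzer (the mapping specifies none), the message above is stored in the inverted index as the terms [trying, out, elasticsearch]; note that the terms are lowercased. A term query does not go through the analyzer. It looks for an exact match against the terms in the inverted index, and no single term equals the full string trying out ElasticSearch, so that query finds nothing. Querying for trying or out alone does match. So does elasticsearch, but not ElasticSearch, because the indexed term is lowercase.
A match query first runs the query string through the analyzer, then matches the resulting terms and computes a relevance score, so match does find the document.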
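
The behaviour described above can be reproduced with a small toy model. This is only a sketch, not how Elasticsearch is implemented: the `analyze` function is an assumption that roughly imitates the standard analyzer (lowercase, split on non-alphanumeric characters), and `build_index`, `term_query`, and `match_query` are hypothetical helper names introduced for illustration.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def build_index(docs):
    """Build a toy inverted index: term -> set of ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def term_query(index, value):
    """Look the value up as-is; the query string is NOT analyzed."""
    return set(index.get(value, set()))


def match_query(index, text):
    """Analyze the query string first, then match any of the resulting terms."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return hits


docs = {"1": "trying out ElasticSearch"}
index = build_index(docs)

print(sorted(index))                                   # ['elasticsearch', 'out', 'trying']
print(term_query(index, "trying out ElasticSearch"))   # set()
print(term_query(index, "trying"))                     # {'1'}
print(term_query(index, "ElasticSearch"))              # set()
print(term_query(index, "elasticsearch"))              # {'1'}
print(match_query(index, "trying out ElasticSearch"))  # {'1'}
```

The only difference between the two lookups is whether the query string passes through `analyze` before it reaches the index, which is the point of the explanation above.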

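Why is it called a term query? Each entry in the inverted index is called a term, and a query that matches directly against those terms is called a term query.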

Sometimes a field needs exact matching and should not be tokenized. In that case, map the field as type keyword, as follows:

{
    "mappings": {
        "tweet": {
            "properties": {
                "message": {
                    "type": "keyword"
                },
                "name": {
                    "type": "keyword"
                },
                "post_date": {
                    "type": "date"
                },
                "user": {
                    "type": "keyword"
                }
            }
        }
    }
}

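The type of an existing field cannot be changed in place, so recreate the index with this mapping and index the document again. Then run the same term query: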

{
	"query" : {
		"term" : {
			"message" : "trying out ElasticSearch"
		}
	}
}

This time the document matches:

{
	"took": 1,
	"timed_out": false,
	"_shards": {
		"total": 5,
		"successful": 5,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": 1,
		"max_score": 0.2876821,
		"hits": [
			{
				"_index": "twitter",
				"_type": "tweet",
				"_id": "1",
				"_score": 0.2876821,
				"_source": {
					"user": "kimchy",
					"post_date": "2019-11-15T14:12:12",
					"message": "trying out ElasticSearch"
				}
			}
		]
	}
}
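
Why the keyword mapping changes the outcome can also be shown with a toy model. Again this is a sketch under stated assumptions rather than Elasticsearch's actual implementation: it assumes a keyword field with no normalizer configured, so the whole field value is stored as one term, unchanged. `index_keyword` and `term_query` are hypothetical helpers.

```python
def index_keyword(docs):
    """Toy keyword indexing (assumption): the entire value is a single term."""
    index = {}
    for doc_id, value in docs.items():
        index.setdefault(value, set()).add(doc_id)
    return index


def term_query(index, value):
    """Exact lookup of the query value against the stored terms."""
    return set(index.get(value, set()))


docs = {"1": "trying out ElasticSearch"}
keyword_index = index_keyword(docs)

print(term_query(keyword_index, "trying out ElasticSearch"))  # {'1'}
print(term_query(keyword_index, "trying"))                    # set()
print(term_query(keyword_index, "trying out elasticsearch"))  # set()
```

The trade-off is visible in the last two lines: with a keyword field, a single word no longer matches, and the comparison is case-sensitive.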

In short: to match a value exactly against the terms in the inverted index, use a term query; to search within text, use a match query.
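
As a rough illustration of that rule of thumb, the hypothetical helper below picks the query type from the field's mapping type and returns a request body in the same shape as the ones used in this post. The function name `build_query` and its parameters are assumptions made for this sketch; it only builds a dictionary and does not talk to Elasticsearch.

```python
import json


def build_query(field, value, field_type):
    """Return a query body: term for keyword fields, match for text fields."""
    if field_type == "keyword":
        # Exact match against the single stored term.
        return {"query": {"term": {field: value}}}
    if field_type == "text":
        # Let the query string be analyzed before matching.
        return {"query": {"match": {field: value}}}
    raise ValueError(f"unsupported field type in this sketch: {field_type}")


body = build_query("message", "trying out ElasticSearch", "text")
print(json.dumps(body, indent=4))
```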