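The Elasticsearch version used in this article is 7.4.2.
The term query looks up a single value. For example, to find the technical articles tagged Elasticsearch, you can query as follows.
Test data: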
POST /term_test_index/_doc/1
{
  "tag": "elasticsearch"
}
POST /term_test_index/_doc/2
{
  "tag": ["elasticsearch", "python"]
}
POST /term_test_index/_doc/3
{
  "tag": ["elasticsearch", "golang"]
}
Query for articles tagged Elasticsearch:
GET /term_test_index/_search
{
  "query": {
    "term": {
      "tag": "elasticsearch"
    }
  }
}
Returned data:
{
  "took" : 630,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.08860736,
    "hits" : [
      {
        "_index" : "term_test_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.08860736,
        "_source" : {
          "tag" : "elasticsearch"
        }
      },
      {
        "_index" : "term_test_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.06850317,
        "_source" : {
          "tag" : [
            "elasticsearch",
            "python"
          ]
        }
      },
      {
        "_index" : "term_test_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.06850317,
        "_source" : {
          "tag" : [
            "elasticsearch",
            "golang"
          ]
        }
      }
    ]
  }
}
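The results show that term is not an exact match against the whole field value; it behaves like a "contains" check. The query returned not only the article whose sole tag is elasticsearch, but also the articles tagged with both elasticsearch and python or golang. Why is that?
Elasticsearch looks in the inverted index for every document that contains the term being searched for. In the example we built, the inverted index looks like this:
| Term | Document _id |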
|---|---|
| elasticsearch | 1, 2, 3 |
| python | 2 |
| golang | 3 |
A term query looks up the term elasticsearch in this inverted index. The document IDs listed for it are 1, 2, and 3, so all three documents are returned.
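The lookup described above can be illustrated with a small Python model. This is only a sketch of the idea, not how Elasticsearch or Lucene is implemented; the names `build_inverted_index` and `term_lookup` are hypothetical helpers introduced here for illustration.

```python
from collections import defaultdict

# The same three documents as the test data (IDs 1-3).
docs = {
    1: "elasticsearch",
    2: ["elasticsearch", "python"],
    3: ["elasticsearch", "golang"],
}


def build_inverted_index(documents):
    """Map each tag value to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, tags in documents.items():
        # A field may hold a single value or an array of values.
        values = [tags] if isinstance(tags, str) else tags
        for value in values:
            postings[value].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def term_lookup(index, term):
    """Return the IDs listed under the exact term, or an empty list."""
    return index.get(term, [])


inverted_index = build_inverted_index(docs)
print(term_lookup(inverted_index, "elasticsearch"))  # [1, 2, 3]
print(term_lookup(inverted_index, "python"))         # [2]
```

In this model the lookup never inspects how many values a document has; it only reads the list stored under one term, which is why multi-tag documents are returned too.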
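So how can we return only the articles that have exactly one tag, where that tag is elasticsearch?
Method 1: first find the documents that contain elasticsearch, then go through the inverted index entry by entry to check whether those documents also contain other tags. This is too inefficient.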
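To make the cost of Method 1 concrete, here is a rough Python sketch over a toy inverted index that mirrors the table above. The function `only_tag` is a hypothetical name, and the scan is an illustration of the approach rather than anything Elasticsearch does internally.

```python
# Toy inverted index mirroring the table above: term -> document IDs.
inverted_index = {
    "elasticsearch": [1, 2, 3],
    "python": [2],
    "golang": [3],
}


def only_tag(index, term):
    """Return IDs of documents whose only tag is `term`.

    For every candidate document, every other entry of the index is
    scanned, so the work grows with candidates x distinct terms.
    """
    result = []
    for doc_id in index.get(term, []):
        has_other_tag = any(
            doc_id in ids
            for other_term, ids in index.items()
            if other_term != term
        )
        if not has_other_tag:
            result.append(doc_id)
    return result


print(only_tag(inverted_index, "elasticsearch"))  # [1]
```

With three terms the scan is trivial, but an index with many distinct tags would have to be walked once per candidate document, which is the inefficiency noted above.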
Method 2: add a field that records the number of tags.
Test data:
POST /term_test_index/_doc/4
{
  "tag": "elasticsearch",
  "tag_count": 1
}
POST /term_test_index/_doc/5
{
  "tag": ["elasticsearch", "python"],
  "tag_count": 2
}
POST /term_test_index/_doc/6
{
  "tag": ["elasticsearch", "golang"],
  "tag_count": 2
}
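In the test data the tag_count values are written by hand. In practice the count would probably be derived by the application before the document is indexed. A minimal sketch, assuming documents are plain Python dicts and using a hypothetical helper named `with_tag_count`:

```python
def with_tag_count(doc):
    """Return a copy of `doc` with a `tag_count` field added.

    `tag` may be a single string or a list of strings; duplicate
    values are counted once (an assumption made for this sketch).
    """
    tags = doc["tag"]
    values = [tags] if isinstance(tags, str) else list(tags)
    return {**doc, "tag_count": len(set(values))}


print(with_tag_count({"tag": "elasticsearch"}))
print(with_tag_count({"tag": ["elasticsearch", "python"]}))
```

The returned dict can then be sent as the request body when indexing, so the count always stays consistent with the tag list.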
Query for articles whose only tag is elasticsearch:
GET /term_test_index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {"term": {"tag": "elasticsearch"}},
            {"term": {"tag_count": 1}}
          ]
        }
      }
    }
  }
}
The returned data now matches expectations:
{
  "took" : 777,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "term_test_index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "tag" : "elasticsearch",
          "tag_count" : 1
        }
      }
    ]
  }
}
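One way to picture why only document 4 comes back is to treat each term clause as a set of document IDs and the bool must as their intersection. The Python sketch below models that idea for all six documents in the index; the `postings` layout and the `bool_must` function are hypothetical and do not reflect Elasticsearch internals.

```python
# Toy postings keyed by (field, value) for all six documents in the index.
# Documents 1-3 were indexed without tag_count, so they appear under no
# tag_count entry.
postings = {
    ("tag", "elasticsearch"): {1, 2, 3, 4, 5, 6},
    ("tag", "python"): {2, 5},
    ("tag", "golang"): {3, 6},
    ("tag_count", 1): {4},
    ("tag_count", 2): {5, 6},
}


def bool_must(index, clauses):
    """Return sorted IDs of documents that satisfy every term clause."""
    id_sets = [index.get(clause, set()) for clause in clauses]
    if not id_sets:
        return []
    return sorted(set.intersection(*id_sets))


matches = bool_must(postings, [("tag", "elasticsearch"), ("tag_count", 1)])
print(matches)  # [4]
```

Document 1 also has elasticsearch as its only tag, but it has no tag_count field, so it does not satisfy the second clause. The hit's `_score` of 1.0 comes from constant_score, which assigns the same score to every document matching the filter.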