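The examples in this article were run on Elasticsearch 7.4.2.
The default sort field: _score
To rank documents, Elasticsearch represents each document's relevance as a floating-point number, _score. By default, results are sorted by _score in descending order.
Sometimes, though, relevance means nothing. If you only want the documents whose gender is male, every document either matches or does not; no matching document is more relevant than another.
Test data: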
POST /sort_test_index/_doc/1
{
  "gender": "M"
}
POST /sort_test_index/_doc/2
{
  "gender": "F"
}
POST /sort_test_index/_doc/3
{
  "gender": "M"
}
POST /sort_test_index/_doc/4
{
  "gender": "F"
}
Fetch the documents whose gender is male:
GET /sort_test_index/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "gender.keyword": "M"
        }
      }
    }
  }
}
Response:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "sort_test_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "gender" : "M"
        }
      },
      {
        "_index" : "sort_test_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.0,
        "_source" : {
          "gender" : "M"
        }
      }
    ]
  }
}
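A filter query sets every _score to 0. Because all hits then have the same score, the order in which they come back is not a ranking and should not be relied on. If a score of 0 feels awkward, wrap the filter in constant_score, which gives every matching document a score of 1 by default.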
GET /sort_test_index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "gender.keyword": "M"
        }
      }
    }
  }
}
Every returned hit now has a _score of 1:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "sort_test_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "gender" : "M"
        }
      },
      {
        "_index" : "sort_test_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "gender" : "M"
        }
      }
    ]
  }
}
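To make the difference between the two responses concrete, here is a toy in-memory model in Python. It is only a sketch of the observable behavior, not how Elasticsearch works internally. The function name `filter_hits` and its `constant_score` parameter are made up for this illustration; the data mirrors the four test documents above.

```python
# Toy model of filter scoring; illustrative only, not Elasticsearch code.
DOCS = [
    {"_id": "1", "gender": "M"},
    {"_id": "2", "gender": "F"},
    {"_id": "3", "gender": "M"},
    {"_id": "4", "gender": "F"},
]


def filter_hits(docs, gender, constant_score=None):
    """Return the matching docs with the score a filter would give them.

    constant_score=None models a plain bool filter (every hit scores 0.0).
    A number models a constant_score query with that value as its boost.
    """
    score = 0.0 if constant_score is None else float(constant_score)
    return [
        {"_id": d["_id"], "_score": score}
        for d in docs
        if d["gender"] == gender
    ]


if __name__ == "__main__":
    print(filter_hits(DOCS, "M"))                      # all scores 0.0
    print(filter_hits(DOCS, "M", constant_score=1.0))  # all scores 1.0
```

Either way, every hit carries the same score, so the score tells you nothing about which hit should come first.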
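Sorting by a specific field
Suppose a library has just received a batch of new books and we want the most recently added ones listed first. First, index some test data: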
POST /library_index/_doc/1
{
  "on_live": "2021-08-18",
  "name": "Python入门"
}
POST /library_index/_doc/2
{
  "on_live": "2021-08-30",
  "name": "Python进阶"
}
POST /library_index/_doc/3
{
  "on_live": "2021-08-24",
  "name": "Golang入门"
}
POST /library_index/_doc/4
{
  "on_live": "2021-08-31",
  "name": "Golang实战"
}
POST /library_index/_doc/5
{
  "on_live": "2021-08-31",
  "name": "Elasticsearch优化实战"
}
Show the most recently added books first:
GET /library_index/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "on_live": {
      "order": "desc"
    }
  }
}
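As a rough picture of what this sort does, the following Python sketch orders the same five test documents in memory. The helper name `sort_by_on_live_desc` is invented for this example, and the sketch relies on the fact that yyyy-MM-dd strings compare in chronological order.

```python
# In-memory sketch of a single-field descending sort; not Elasticsearch code.
BOOKS = [
    {"_id": "1", "on_live": "2021-08-18", "name": "Python入门"},
    {"_id": "2", "on_live": "2021-08-30", "name": "Python进阶"},
    {"_id": "3", "on_live": "2021-08-24", "name": "Golang入门"},
    {"_id": "4", "on_live": "2021-08-31", "name": "Golang实战"},
    {"_id": "5", "on_live": "2021-08-31", "name": "Elasticsearch优化实战"},
]


def sort_by_on_live_desc(books):
    """Sort books by on_live, newest first."""
    # ISO-style date strings sort lexicographically in date order.
    return sorted(books, key=lambda b: b["on_live"], reverse=True)


if __name__ == "__main__":
    for book in sort_by_on_live_desc(BOOKS):
        print(book["on_live"], book["_id"], book["name"])
```

Documents 4 and 5 share the same date, so a sort on on_live alone does not say which of them comes first. Python's stable sort happens to keep their input order, but Elasticsearch makes no such promise for ties.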
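Sorting on multiple fields
To sort by on_live in descending order first and, when two dates are equal, by book name in ascending order, list the sort fields in priority order: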
GET /library_index/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "on_live": {
        "order": "desc"
      }
    },
    {
      "name.keyword": {
        "order": "asc"
      }
    }
  ]
}
Response (_score is null because scores are not computed when sorting on fields):
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "library_index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : null,
        "_source" : {
          "on_live" : "2021-08-31",
          "name" : "Elasticsearch优化实战"
        },
        "sort" : [
          1630368000000,
          "Elasticsearch优化实战"
        ]
      },
      {
        "_index" : "library_index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "on_live" : "2021-08-31",
          "name" : "Golang实战"
        },
        "sort" : [
          1630368000000,
          "Golang实战"
        ]
      },
      {
        "_index" : "library_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "on_live" : "2021-08-30",
          "name" : "Python进阶"
        },
        "sort" : [
          1630281600000,
          "Python进阶"
        ]
      },
      {
        "_index" : "library_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "on_live" : "2021-08-24",
          "name" : "Golang入门"
        },
        "sort" : [
          1629763200000,
          "Golang入门"
        ]
      },
      {
        "_index" : "library_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "on_live" : "2021-08-18",
          "name" : "Python入门"
        },
        "sort" : [
          1629244800000,
          "Python入门"
        ]
      }
    ]
  }
}
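The sort array in each hit holds the values that were actually compared: the on_live date as epoch milliseconds (UTC midnight), followed by the name. The Python sketch below reproduces both the ordering and those values for the test data. The helper names `epoch_millis`, `sort_books`, and `sort_values` are invented for this example, and comparing names by Unicode code point is an assumption that holds for this data set.

```python
# In-memory sketch of a two-key sort; not Elasticsearch code.
from datetime import datetime, timezone

BOOKS = [
    {"_id": "1", "on_live": "2021-08-18", "name": "Python入门"},
    {"_id": "2", "on_live": "2021-08-30", "name": "Python进阶"},
    {"_id": "3", "on_live": "2021-08-24", "name": "Golang入门"},
    {"_id": "4", "on_live": "2021-08-31", "name": "Golang实战"},
    {"_id": "5", "on_live": "2021-08-31", "name": "Elasticsearch优化实战"},
]


def epoch_millis(date_str):
    """Convert a yyyy-MM-dd date, read as UTC midnight, to epoch milliseconds."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def sort_books(books):
    """Sort by on_live descending, then by name ascending."""
    # Negating the timestamp turns the ascending tuple sort into
    # "date descending, name ascending".
    return sorted(books, key=lambda b: (-epoch_millis(b["on_live"]), b["name"]))


def sort_values(book):
    """Return the pair of values used to order this book."""
    return [epoch_millis(book["on_live"]), book["name"]]


if __name__ == "__main__":
    for book in sort_books(BOOKS):
        print(book["_id"], sort_values(book))
```

The second key only matters for documents 4 and 5, which share a date; the name breaks the tie and puts "Elasticsearch优化实战" ahead of "Golang实战".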
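Sorting on multi-valued fields
To sort on a field that holds several values, you have to say which of those values takes part in the sort.
For numeric data such as dates, the mode option does this: min, max, avg, or sum.
For example, suppose each document had an update_at field recording the time of every update, and we want to sort in descending order by each document's most recent update. The query could look like this: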
GET /library_index/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "update_at": {
      "order": "desc",
      "mode": "max"
    }
  }
}
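The effect of mode can be sketched in Python: each document's list of values is reduced to one number, and the documents are then sorted by that number. Everything here is hypothetical: the three documents, their update_at values (epoch milliseconds), and the helper names `pick_sort_value` and `sort_by_update_at`. The avg here is plain floating-point division, which may not match how Elasticsearch rounds.

```python
# In-memory sketch of sort "mode" on a multi-valued field; not Elasticsearch code.

# Hypothetical documents: update_at holds every update time in epoch millis.
DOCS = [
    {"_id": "a", "update_at": [1629244800000, 1630368000000]},  # 08-18, 08-31
    {"_id": "b", "update_at": [1630281600000]},                 # 08-30
    {"_id": "c", "update_at": [1629763200000, 1629849600000]},  # 08-24, 08-25
]

REDUCERS = {
    "min": min,
    "max": max,
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
}


def pick_sort_value(values, mode):
    """Reduce a list of numeric values to the single value used for sorting."""
    return REDUCERS[mode](values)


def sort_by_update_at(docs, mode, descending=True):
    """Sort docs by the reduced update_at value."""
    return sorted(
        docs,
        key=lambda d: pick_sort_value(d["update_at"], mode),
        reverse=descending,
    )


if __name__ == "__main__":
    print([d["_id"] for d in sort_by_update_at(DOCS, "max")])
    print([d["_id"] for d in sort_by_update_at(DOCS, "min")])
```

With mode max the documents are ranked by their latest update, with mode min by their earliest, and the two orders differ for this data.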