Elasticsearch script sort

Preface

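In day-to-day development we often run into sorting requirements that a plain "field plus ascending/descending" sort cannot express. Take books as an example: history books must come before science and technology books, history books must be ordered by publication time ascending, and science and technology books must be ordered by publication time descending. This is where Elasticsearch script sorting comes in.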

Building the scenario

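First, we create a simple index in Elasticsearch and write some data into it. The version used here is 6.5.4. The index is defined as follows: publish_time is the publication time and book_type is the kind of book, where we assume 1 means history and 2 means science and technology. Note that the ik_smart analyzer comes from the IK analysis plugin, so the plugin has to be installed for this mapping to be accepted.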

PUT test_book
{
  "aliases": {
    "book": {}
  },
  "mappings": {
    "_doc": {
      "properties": {
        "id": {
          "type": "integer"
        },
        "brief": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "name": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "publish_time": {
          "type": "long"
        },
        "book_type": {
          "type": "integer"
        }
      }
    }
  },
  "settings": {
    "index": {
      "refresh_interval": "1s",
      "number_of_shards": "3",
      "number_of_replicas": "1"
    }
  }
}

With the index in place, let's bulk-insert some data.

POST _bulk
{"index":{"_index":"test_book","_type":"_doc","_id":1}}
{"brief":"是本很值得收藏的书","name":"资治通鉴","publish_time":1,"id":1,"book_type":1}
{"index":{"_index":"test_book","_type":"_doc","_id":2}}
{"brief":"是本很值得收藏的书","name":"史记","publish_time":2,"id":2,"book_type":1}
{"index":{"_index":"test_book","_type":"_doc","_id":3}}
{"brief":"是本很值得收藏的书","name":"JAVA从入门到放弃","publish_time":3,"id":3,"book_type":2}
{"index":{"_index":"test_book","_type":"_doc","_id":4}}
{"brief":"是本很值得收藏的书","name":"C++从入门到放弃","publish_time":4,"id":4,"book_type":2}

Sorting

Let's start with the simplest sort: history books first, science and technology books after.

GET /book/_search
{
  "sort": [
    {
      "book_type": {
        "order": "asc"
      }
    }
  ]
}

That is simple to write, and the result comes back quickly (only name is shown in _source here):

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "test_book",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "史记"
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "test_book",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "资治通鉴"
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "test_book",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "C++从入门到放弃"
        },
        "sort" : [
          2
        ]
      },
      {
        "_index" : "test_book",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "JAVA从入门到放弃"
        },
        "sort" : [
          2
        ]
      }
    ]
  }
}

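But the requirement is that history books are in ascending order of publication time while science and technology books are in descending order. How do we sort that?

No rush. This is where Elasticsearch script sorting helps. Here is the answer up front: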

GET /book/_search
{
  "sort": [
    {
      "book_type": {
        "order": "asc"
      }
    },
    {
      "_script": {
        "script": {
          "source": "def type = doc['book_type'].value; def time = doc['publish_time'].value;if(type==1) return time; else return -time",
          "lang": "painless"
        },
        "type": "number",
        "order": "asc"
      }
    }
  ]
}

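The script returns publish_time unchanged for history books and its negation for science and technology books, so a single ascending sort on that value yields ascending order for the former and descending order for the latter. With this query we get the correct order: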

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "test_book",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "资治通鉴"
        },
        "sort" : [
          1,
          1.0
        ]
      },
      {
        "_index" : "test_book",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "史记"
        },
        "sort" : [
          1,
          2.0
        ]
      },
      {
        "_index" : "test_book",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "C++从入门到放弃"
        },
        "sort" : [
          2,
          -4.0
        ]
      },
      {
        "_index" : "test_book",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "JAVA从入门到放弃"
        },
        "sort" : [
          2,
          -3.0
        ]
      }
    ]
  }
}
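
To see why the negation trick works, the two-level sort can be modelled offline. The following is a minimal Python sketch, not Elasticsearch code: it assumes the four sample documents from the bulk request, and the helper names script_value and sort_key are made up for illustration. It reproduces the sort arrays shown in the response above.

```python
# Offline model of the two-level sort; no Elasticsearch connection is needed.
books = [
    {"id": 1, "name": "资治通鉴", "publish_time": 1, "book_type": 1},
    {"id": 2, "name": "史记", "publish_time": 2, "book_type": 1},
    {"id": 3, "name": "JAVA从入门到放弃", "publish_time": 3, "book_type": 2},
    {"id": 4, "name": "C++从入门到放弃", "publish_time": 4, "book_type": 2},
]


def script_value(book):
    """Mirror the Painless script: keep the time for history, negate it otherwise."""
    time = float(book["publish_time"])  # a "number" script sort reports doubles
    return time if book["book_type"] == 1 else -time


def sort_key(book):
    """First level: book_type ascending. Second level: script value ascending."""
    return (book["book_type"], script_value(book))


ordered = sorted(books, key=sort_key)
for book in ordered:
    print(book["id"], list(sort_key(book)))
```

Because both levels are ascending, the larger publish_time of a science and technology book becomes the smaller (more negative) key, which is what moves the newest book to the front of that group.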

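That is the basic usage of script sorting in Elasticsearch. In this example there are only two types, history and science and technology, so a plain field sort is enough for the first level. Suppose there were three: history, science and technology, and literature, with type values 1, 2 and 3, and the first level had to be literature > history > science and technology. The first-level sort { "book_type": { "order": "asc" } } would then no longer work. If you are interested, think about how to express both levels with one script, or with two.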
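
One possible answer to the three-type question, using two scripts, is sketched below in Python. It only builds the request body and checks the ordering locally; it does not talk to a cluster. Several things here are assumptions rather than part of the original setup: the rank values 0/1/2, the rule that literature is sorted by time ascending (the requirement above does not say), the helper names script_sort and local_key, and the sample (book_type, publish_time) pairs.

```python
import json

# Assumed first-level order: literature (3) first, then history (1), then science (2).
RANK_SCRIPT = (
    "def t = doc['book_type'].value; "
    "if (t == 3) return 0; "
    "if (t == 1) return 1; "
    "return 2;"
)
# Assumption: only science books (type 2) sort newest-first; all others oldest-first.
TIME_SCRIPT = (
    "def t = doc['book_type'].value; "
    "def time = doc['publish_time'].value; "
    "if (t == 2) return -time; "
    "return time;"
)


def script_sort(source):
    """Wrap a Painless source string in an ascending numeric _script sort clause."""
    return {
        "_script": {
            "type": "number",
            "order": "asc",
            "script": {"lang": "painless", "source": source},
        }
    }


# Request body that could be sent to GET /book/_search.
body = {"sort": [script_sort(RANK_SCRIPT), script_sort(TIME_SCRIPT)]}


def local_key(book_type, publish_time):
    """Python mirror of the two scripts, for checking the order without a cluster."""
    rank = {3: 0, 1: 1}.get(book_type, 2)
    signed_time = -publish_time if book_type == 2 else publish_time
    return (rank, signed_time)


# Hypothetical (book_type, publish_time) pairs, including two literature books.
sample = [(1, 1), (1, 2), (2, 3), (2, 4), (3, 5), (3, 6)]
result = sorted(sample, key=lambda pair: local_key(*pair))

print(json.dumps(body, indent=2))
print(result)
```

The first script replaces the plain book_type sort with an explicit rank, so the group order no longer depends on the numeric type values; the second script is the same sign trick as before.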

Summary

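After reading this, you should have a grasp of basic script usage, though applying it to your own business logic may still leave you unsure what to write. For the exact syntax, refer to the official documentation.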

Scripts can be used not only for sorting but also for queries.
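
As a rough illustration of the query side, the Python sketch below builds a request body that uses the script query inside a bool filter. The threshold value and the parameter name min_time are assumptions chosen to fit the sample data, and in practice a plain range query would be the better tool for this particular condition; the script is used only to show the shape of the request.

```python
import json

# Assumed threshold, chosen to match the sample data; not part of the original post.
MIN_PUBLISH_TIME = 2

# Request body that could be sent to GET /book/_search.
body = {
    "query": {
        "bool": {
            "filter": {
                "script": {
                    "script": {
                        "lang": "painless",
                        "source": "doc['publish_time'].value > params.min_time",
                        "params": {"min_time": MIN_PUBLISH_TIME},
                    }
                }
            }
        }
    }
}

# Local mirror of the condition over the four sample publish_time values (keyed by id).
publish_times = {1: 1, 2: 2, 3: 3, 4: 4}
matching_ids = sorted(i for i, t in publish_times.items() if t > MIN_PUBLISH_TIME)

print(json.dumps(body, indent=2))
print(matching_ids)
```

Passing the threshold through params instead of hard-coding it in the source keeps the script text constant across requests, so the compiled script can be reused.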
