开启掘金成长之旅!这是我参与「掘金日新计划 · 12 月更文挑战」的第34天,点击查看活动详情
前文参考如下:
9.工具使用:Elasticsearch从入门到放弃(1)-Elasticsearch概念篇
10.工具使用:Elasticsearch从入门到放弃(2)-相关性算法
11.工具使用:Elasticsearch从入门到放弃(3)-权重及打分 (qq.com)
3.Elasticsearch从入门到放弃-Elasticsearch概念篇 - 掘金 (juejin.cn)
4.Elasticsearch从入门到放弃-相关性算法 - 掘金 (juejin.cn)
5.Elasticsearch从入门到放弃-权重及打分 - 掘金 (juejin.cn)
59.工具使用-Elasticsearch从入门到放弃-聚合概述 - 掘金 (juejin.cn)
60.工具使用-Elasticsearch从入门到放弃-聚合方式排序及分页 - 掘金 (juejin.cn)
63.工具使用-Elasticsearch从入门到放弃-nested对象 - 掘金 (juejin.cn)
参考文档; Elasticsearch教程(32) ES 聚合查询后过滤 Distinct Group By Having功能
ElasticSearch实现类SQL的sum,count,group by,having功能
ES分组聚合Agg nested_华生cn的博客-CSDN博客_es nested 聚合
elasticsearch - Elasticsearch在inner_hits上聚合 - Thinbug
ElasticSearch---es之Post Filter,聚合后过滤
1.项目背景
这几天产品提出了一个新需求,提供一个民宿搜索的功能,有点类似于携程飞猪这些平台酒店民宿搜索,设计如图:
具体需求
基于当前平台已有民宿商品,提供民宿搜索。
- 默认定位宁波·余姚,可切换定位,底部商品列表需要更新为对应区域下的酒店民宿类商品
- 默认选中今天和明天搜索,最多搜索范围3个月(单个房间需要当前日期往后推三个月(90)之内的价格和库存,最大查询入住和离店在此范围内间隔两个月的房间)
- 提示文案改成“店名/地点/关键字”(店名商品名) 点击显示输入键盘,可输入搜索关键字,输入键盘有下家显示搜索按钮,点击带日期及关键字跳转酒店民宿搜索结果页
- 显示选中日期有房可订的房间。 (搜索时判断每天库存是否大于0)
- 默认按综合排序, 排序条件 选项包括:综合推荐(默认选中)、离我最近、低价优先
- 增加评分/价格筛选
2. 设计
再确认了需求之后,接下来就是分析难点,做具体设计。
首先,一开始的计划是,根据产品的需求将每个商品最近90天的日期价格和库存,以nested对象形式存储起来。这样在做时间价格的时候,可以将内部价格集合也筛选出来。参考:63.工具使用-Elasticsearch从入门到放弃-nested对象 - 掘金 (juejin.cn)
其次,这一块有一个需求是:用户可以选择入住和离开的时间,并可以筛选出这段时间内每天价格在预算范围,且每天都有库存的民宿。 这样这块就涉及到一个问题,由于90天价格列表是以nested对象列表的子集合存储的,那么判断每天的价格和库存的时候,就有可能遍历每天的库存和价格去做比较。如果选择的天数只有一两天,三四天,那么这样比较最多循环比较几次还是可以的。可是产品的需求偏偏最多可以选择到60天的民宿搜索,那么假设选择90天里面的60天,然后基于价格搜索,那么就得比较60次。这样显然特别消耗性能,也是不太合适的。
于是硬着头皮,找产品同学battle,想改一改需求,间隔时间可以调短一些。交涉的结果就是,就算时间间隔调短了,但是不排除后续用户需求可以搜索到一个月以上间隔的民宿,所以还是得想办法实现掉。
所以,只能硬着头皮往下做了。
想来想去,问题无非就是,如果我选择搜索的时间很长,怎么保证价格预算?以及每天都有房源?而且比较过程相对减少比较次数?
3. 方案
想来想去,最终决定方案如下:
1.每个商品最近90天的日期价格和库存,以nested对象形式存储起来 2.在选择时间范围搜索时,为了减少比较次数,对价格日期地区,搜索框条件进行过滤后,再对均价进行二次筛选,保证均价在搜索范围内,并且保证有库存的天数等于搜索的时间天数 (例如:12.1到12.30号这三十天价格范围100-200的民宿;那么30天均价在100-200即可;** 房间状态的逻辑是库存的判断,根据库存大于0的天数和查询时间范围的天数对比,天数相同则认为每天都有库存;综合条件就是均价在搜索范围内,有库存的天数等于搜索天数的民宿)
上述做的好处是:比较无需遍历每天价格和库存,只需要在汇总时做二次筛选。 上述做的弊端是:分页相对麻烦,二次筛选每页数据不固定。
4. 落地
基于上述方案,其实最终落地效果就类似于先按照时间,价格,搜索名称(不为空)搜索;然后基于商品分组;然后在基于分组后,每个民宿平均价格过滤。
如果用sql实现就类似于 select * from goods_index where time>=? and time<=? and price>=? and price<=? and good_name=? group by good_id having(agv(price)>=? and agv(price)<=?);
也就是如何在es搜索中实现 select where 和group by having?
数据结构
nested嵌套存储民俗价格
{
"goods_index" : {
"mappings" : {
"properties" : {
"ancestryCategoryId" : {
"type" : "long"
},
"createTime" : {
"type" : "long"
},
"district" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"goodCommentRate" : {
"type" : "long"
},
"goodsAttribute" : {
"properties" : {
"netWeight" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"packaging" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"goodsTags" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"hotelPrices" : {
"type" : "nested",
"properties" : {
"goodsId" : {
"type" : "long"
},
"promPrice" : {
"type" : "float"
},
"sellPrice" : {
"type" : "float"
},
"specValue" : {
"type" : "date"
},
"stockQuantity" : {
"type" : "long"
}
}
},
"id" : {
"type" : "long"
},
"imageMainUrl" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"latitude" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"location" : {
"type" : "geo_point"
},
"longitude" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text"
},
"price" : {
"type" : "float"
},
"province" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"city" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"salesSum" : {
"type" : "long"
},
"storeId" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"storeName" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"supplierId" : {
"type" : "long"
}
}
}
}
}
其中在商品索引中维护了nested类的民俗价格列表
"hotelPrices" : {
"type" : "nested",
"properties" : {
"goodsId" : {
"type" : "long"
},
"promPrice" : {
"type" : "float"
},
"sellPrice" : {
"type" : "float"
},
"specValue" : {
"type" : "date"
},
"stockQuantity" : {
"type" : "long"
}
}
},
大概数据格式
{
"_index" : "goods_index",
"_type" : "_doc",
"_id" : "10027",
"_score" : 1.0,
"_source" : {
"ancestryCategoryId" : 2,
"city" : "3301",
"cityName" : "杭州市",
"createTime" : 1604371603000,
"district" : "330122",
"districtName" : "桐庐县",
"goodsAttribute" : { },
"goodsTags" : "[打卡地, 可长住, 视野开阔, 环境安静]",
"hotelPrices" : [
{
"goodsId" : 10027,
"promPrice" : 0.0,
"sellPrice" : 1299.0,
"specValue" : "2022-12-21",
"stockQuantity" : 1
},
{
"goodsId" : 10027,
"promPrice" : 0.0,
"sellPrice" : 1299.0,
"specValue" : "2022-12-22",
"stockQuantity" : 1
},
{
"goodsId" : 10027,
"promPrice" : 0.0,
"sellPrice" : 1299.0,
"specValue" : "2022-12-23",
"stockQuantity" : 1
},
{
"goodsId" : 10027,
"promPrice" : 0.0,
"sellPrice" : 1299.0,
"specValue" : "2022-12-24",
"stockQuantity" : 1
},
...90天价格....
{
"goodsId" : 10027,
"promPrice" : 0.0,
"sellPrice" : 1299.0,
"specValue" : "2023-03-19",
"stockQuantity" : 1
},
{
"goodsId" : 10027,
"promPrice" : 0.0,
"sellPrice" : 1299.0,
"specValue" : "2023-03-20",
"stockQuantity" : 1
}
],
"id" : 10027,
"imageMainUrl" : "https://1111.png",
"imageUrls" : "https://111.png",
"latitude" : "29.684106",
"location" : "29.684106,119.677711",
"longitude" : "119.677711",
"name" : "aaaa大床房",
"price" : 1299.0,
"province" : "3300",
"provinceName" : "浙江省",
"storeId" : "1",
"storeName" : "测试店铺",
"supplierId" : 1,
"totalCommentStar" : 5.0,
"trait" : "舒适",
"videoUrl" : "https://111.mp4",
"visitCount" : 94
}
},
实现group by having效果
第一次尝试-不对
select * from goods_index where ancestryCategoryId=2 and hotelPrices.sellPrice>=200 and hotelPrices.sellPrice<=1000 and hotelPrices.stockQuantity>0 group by hotelPrices.goodsId
GET goods_index/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"term": {
"ancestryCategoryId": {
"value": "2"
}
}
},
{
"nested": {
"path": "hotelPrices",
"ignore_unmapped": true,
"score_mode": "none",
"boost": 1,
"inner_hits": {
"ignore_unmapped": true,
"from": 0,
"size": 3,
"version": false,
"seq_no_primary_term": false,
"explain": false,
"track_scores": false
},
"query": {
"bool": {
"must": [
{
"range": {
"hotelPrices.sellPrice": {
"gte": 200,
"lte": 1000
}
}
},
{
"range": {
"hotelPrices.stockQuantity": {
"gt":0
}
}
}
]
}
}
}
}
]
}
},
"aggregations": {
"salesNested": {
"nested": {
"path": "hotelPrices"
},
"aggregations": {
"group_by": {
"terms": {
"field": "hotelPrices.goodsId",
"size": 10,
"order": {
"_key": "asc"
}
}
}
}
}
}
}
查看结果,好像不太对,因为聚合计算的不是过滤之后的嵌套文档,即不是inner_hits,因为nested类型会将筛选之后的结果单独存在inner_hits,可以看到每个商品都是90天的价格列表,并不是按照时间价格过滤的数据,不合适。结果大致如下:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 11,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "goods_test2",
"_type" : "_doc",
"_id" : "12278",
"_score" : 1.0,
"_source" : {
"ancestryCategoryId" : 2,
"city" : "3301",
"cityName" : "杭州市",
"createTime" : 1607136062000,
"district" : "330122",
"districtName" : "桐庐县",
"goodCommentRate" : 0,
"goodsAttribute" : { },
"goodsTags" : "[免费云停车, 可长住]",
"hotelPrices" : [
{
"goodsId" : 12278,
"promPrice" : 0.0,
"sellPrice" : 258.0,
"specValue" : "2022-12-07",
"stockQuantity" : 10
},
//....90天价格...
{
"goodsId" : 12278,
"promPrice" : 0.0,
"sellPrice" : 258.0,
"specValue" : "2023-03-06",
"stockQuantity" : 10
}
],
"id" : 12278,
"imageMainUrl" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg",
"imageUrls" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg",
"isDistribute" : 0,
"name" : "芦茨白云源 双源民宿双床房 落地窗",
"onOff" : 1,
"originImageUrls" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg,https://wsnbh-img.hzanchu.com/acimg/52fbc6815688984a6bed22df3373227f.jpeg,https://wsnbh-img.hzanchu.com/acimg/ce2024cc2f221c3f86f162bd970f71ed.jpeg,",
"price" : 258.0,
"storeId" : "1765",
"supplierId" : 2242,
"totalCommentStar" : 0.0,
"trait" : "老板娘独自经营 舒适温馨",
"videoUrl" : "",
"visitCount" : 26
},
"inner_hits" : {
"hotelPrices" : {
"hits" : {
"total" : {
"value" : 90,
"relation" : "eq"
},
"max_score" : 2.0,
"hits" : [
{
"_index" : "goods_test2",
"_type" : "_doc",
"_id" : "12278",
"_nested" : {
"field" : "hotelPrices",
"offset" : 0
},
"_score" : 2.0,
"_source" : {
"goodsId" : 12278,
"specValue" : "2022-12-07",
"stockQuantity" : 10,
"sellPrice" : 258.0,
"promPrice" : 0.0
}
},
{
"_index" : "goods_test2",
"_type" : "_doc",
"_id" : "12278",
"_nested" : {
"field" : "hotelPrices",
"offset" : 1
},
"_score" : 2.0,
"_source" : {
"goodsId" : 12278,
"specValue" : "2022-12-08",
"stockQuantity" : 10,
"sellPrice" : 258.0,
"promPrice" : 0.0
}
},
{
"_index" : "goods_test2",
"_type" : "_doc",
"_id" : "12278",
"_nested" : {
"field" : "hotelPrices",
"offset" : 2
},
"_score" : 2.0,
"_source" : {
"goodsId" : 12278,
"specValue" : "2022-12-09",
"stockQuantity" : 10,
"sellPrice" : 258.0,
"promPrice" : 0.0
}
}
]
}
}
}
}
]
},
"aggregations" : {
"salesNested" : {
"doc_count" : 990,
"group_by" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 90,
"buckets" : [
{
"key" : 12278,
"doc_count" : 90
},
{
"key" : 12304,
"doc_count" : 90
},
{
"key" : 13759,
"doc_count" : 90
},
{
"key" : 13850,
"doc_count" : 90
},
{
"key" : 13856,
"doc_count" : 90
},
{
"key" : 17330,
"doc_count" : 90
},
{
"key" : 17331,
"doc_count" : 90
},
{
"key" : 17332,
"doc_count" : 90
},
{
"key" : 17333,
"doc_count" : 90
},
{
"key" : 17334,
"doc_count" : 90
}
]
}
}
}
}
第二次尝试-在筛选中加入前置过滤
因为第一次尝试中聚合计算的不是 过滤之后的嵌套文档,即不是inner_hits,所以在聚合时也对nested里面的内容做了层层的过滤) select * from goods_index where ancestryCategoryId=2 and hotelPrices.sellPrice>=200 and hotelPrices.sellPrice<=1000 and hotelPrices.stockQuantity>0 group by hotelPrices.goodsId (但是聚合过滤后每个商品符合条件的天数还需要再过滤,目前没做having(count>天数)
这块涉及到一个概念 前置过滤器:有时需要对聚合条件进一步地过滤,但是又不能影响当前的查询条件,可以加入前置过滤器
后置过滤器: 在有些场景中,需要根据条件进行数据查询,但是聚合的结果集不受影响,可以加入后置过滤器
GET goods_test2/_search
{
"from": 0,
"size": 1,
"query": {
"bool": {
"must": [
{
"term": {
"ancestryCategoryId": {
"value": "2"
}
}
},
{
"nested": {
"path": "hotelPrices",
"ignore_unmapped": true,
"score_mode": "none",
"boost": 1,
"inner_hits": {
"ignore_unmapped": true,
"from": 0,
"size": 90,
"version": false,
"seq_no_primary_term": false,
"explain": false,
"track_scores": false
},
"query": {
"bool": {
"must": [
{
"range": {
"hotelPrices.sellPrice": {
"gte": 200,
"lte": 1000
}
}
},
{
"range": {
"hotelPrices.stockQuantity": {
"gt": 0
}
}
}
]
}
}
}
}
]
}
},
"aggs": {
"filtered_nested": {
"nested": {
"path": "hotelPrices"
},
"aggs": {
"where": {
"range": {
"field": "hotelPrices.sellPrice",
"ranges": [
{
"from": 200,
"to": 1000
}
]
},
"aggs": {
"and_where": {
"range": {
"field": "hotelPrices.stockQuantity",
"ranges": [
{
"from": 0.1
}
]
},
"aggs": {
"group_by": {
"terms": {
"field": "hotelPrices.goodsId",
"size": 10,
"order": {
"_key": "asc"
}
}
}
}
}
}
}
}
}
}
}
结果如下:
{
"took" : 31,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 11,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "goods_test2",
"_type" : "_doc",
"_id" : "12278",
"_score" : 1.0,
"_source" : {
"ancestryCategoryId" : 2,
"atmosphereImage" : "https://wsnbh-img.hzanchu.com/test/168650774865616896.png",
"backCatId" : 357,
"city" : "3301",
"cityName" : "杭州市",
"createTime" : 1607136062000,
"district" : "330122",
"districtName" : "桐庐县",
"frontendIds" : "_,2_157",
"goodCommentRate" : 0,
"goodsAttribute" : { },
"goodsTags" : "[免费云停车, 可长住]",
"hotelPrices" : [
{
"goodsId" : 12278,
"promPrice" : 0.0,
"sellPrice" : 258.0,
"specValue" : "2022-12-07",
"stockQuantity" : 10
},
{
"goodsId" : 12278,
"promPrice" : 0.0,
"sellPrice" : 258.0,
"specValue" : "2022-12-08",
"stockQuantity" : 10
},
{
"goodsId" : 12278,
"promPrice" : 0.0,
"sellPrice" : 258.0,
"specValue" : "2022-12-09",
"stockQuantity" : 10
},
//...90天民宿....
{
"goodsId" : 12278,
"promPrice" : 0.0,
"sellPrice" : 258.0,
"specValue" : "2023-03-06",
"stockQuantity" : 10
}
],
"id" : 12278,
"imageMainUrl" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg",
"imageUrls" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg",
"isDistribute" : 0,
"name" : "芦茨白云源 双源民宿双床房 落地窗",
"onOff" : 1,
"originImageUrls" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg,https://wsnbh-img.hzanchu.com/acimg/52fbc6815688984a6bed22df3373227f.jpeg,https://wsnbh-img.hzanchu.com/acimg/ce2024cc2f221c3f86f162bd970f71ed.jpeg,",
"price" : 258.0,
"projectCode" : 3300,
"promType" : 0,
"province" : "3300",
"provinceName" : "浙江省",
"salesSum" : 1,
"salesSumTrue" : 1,
"sellerCity" : 1,
"sellerDistinct" : 17,
"sellerProvince" : 43282904292360190,
"siteName" : "杭州馆",
"storeId" : "1765",
"supplierId" : 2242,
"totalCommentStar" : 0.0,
"trait" : "老板娘独自经营 舒适温馨",
"videoUrl" : "",
"visitCount" : 26
},
"inner_hits" : {
"hotelPrices" : {
"hits" : {
"total" : {
"value" : 90,
"relation" : "eq"
},
"max_score" : 2.0,
"hits" : [
{
"_index" : "goods_test2",
"_type" : "_doc",
"_id" : "12278",
"_nested" : {
"field" : "hotelPrices",
"offset" : 0
},
"_score" : 2.0,
"_source" : {
"goodsId" : 12278,
"specValue" : "2022-12-07",
"stockQuantity" : 10,
"sellPrice" : 258.0,
"promPrice" : 0.0
}
},
//....90天民宿...
{
"_index" : "goods_test2",
"_type" : "_doc",
"_id" : "12278",
"_nested" : {
"field" : "hotelPrices",
"offset" : 89
},
"_score" : 2.0,
"_source" : {
"goodsId" : 12278,
"specValue" : "2023-03-06",
"stockQuantity" : 10,
"sellPrice" : 258.0,
"promPrice" : 0.0
}
}
]
}
}
}
}
]
},
"aggregations" : {
"filtered_nested" : {
"doc_count" : 990,
"where" : {
"buckets" : [
{
"key" : "200.0-1000.0",
"from" : 200.0,
"to" : 1000.0,
"doc_count" : 904,
"and_where" : {
"buckets" : [
{
"key" : "0.1-*",
"from" : 0.1,
"doc_count" : 903,
"group_by" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 4,
"buckets" : [
{
"key" : 12278,
"doc_count" : 90
},
{
"key" : 12304,
"doc_count" : 90
},
{
"key" : 13759,
"doc_count" : 89
},
{
"key" : 13850,
"doc_count" : 90
},
{
"key" : 13856,
"doc_count" : 90
},
{
"key" : 17330,
"doc_count" : 90
},
{
"key" : 17331,
"doc_count" : 90
},
{
"key" : 17332,
"doc_count" : 90
},
{
"key" : 17333,
"doc_count" : 90
},
{
"key" : 17334,
"doc_count" : 90
}
]
}
}
]
}
}
]
}
}
}
}
可以看到 13759 这个商品符合条件的只有89天符合条件
第三次尝试-group by having
select * from goods_index where ancestryCategoryId=2 and hotelPrices.specValue>? and hotelPrices.specValue<? group by goodId having(avg(hotelPrices.sellPrice)>? and avg(hotelPrices.sellPrice)<? )
聚合过滤后对每个商品库存>0的天数还需要再过滤做having(count>天数);以及计算均价
GET goods_test2/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"term": {
"ancestryCategoryId": {
"value": "2"
}
}
},
{
"nested": {
"path": "hotelPrices",
"ignore_unmapped": true,
"score_mode": "none",
"boost": 1,
"inner_hits": {
"ignore_unmapped": true,
"from": 0,
"size": 90,
"version": false,
"seq_no_primary_term": false,
"explain": false,
"track_scores": false
},
"query": {
"bool": {
"must": [
{
"range": {
"hotelPrices.sellPrice": {
"gte": 200,
"lte": 1000
}
}
},
{
"range": {
"hotelPrices.stockQuantity": {
"gt": 0
}
}
},
{
"range": {
"hotelPrices.specValue": {
"gte": "2022-12-01",
"lte": "2022-12-30"
}
}
}
]
}
}
}
}
]
}
},
"aggs": {
"filtered_nested": {
"nested": {
"path": "hotelPrices"
},
"aggs": {
"where": {
"filter": {
"bool": {
"filter": [
{
"range": {
"hotelPrices.specValue": {
"gte": "2022-12-08",
"lte": "2022-12-30"
}
}
},
{
"range": {
"hotelPrices.stockQuantity": {
"gt": 0
}
}
}
]
}
},
"aggs": {
"group_by": {
"terms": {
"field": "hotelPrices.goodsId",
"size": 90,
"order": {
"_key": "asc"
}
},
"aggs": {
"avg_price": {
"avg": {
"field": "hotelPrices.sellPrice"
}
},
"aggs": {
//"having": {
//基于筛选后的内容算平均值,且平均值大于600小于900
"bucket_selector": {
"buckets_path": {
"avgprice": "avg_price",
"counts": "_count"
},
"script": "params.avgprice>600 && params.avgprice<900 &¶ms.counts==22"
}
// }
}
}
}
}
}
}
}
}
}
注意对于查询之后的均值做二次筛选,需要用到bucket_selector,avg_price和_count均为上述聚合统计算的结果,aggregations从外到里面会分别存各筛选条件的统计维度
最终尝试,带入排序等规则
2022-12-22入住,2022-12-23号离开,均价300-800的民宿,综合排序
GET goods_index/_search
{
"from": 0,
"size": 10,
"timeout": "5s",
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"terms": {
"ancestryCategoryId": [
"2"
],
"boost": 1
}
},
{
"term": {
"projectCode": {
"value": 3300,
"boost": 1
}
}
},
{
"terms": {
"city": [
"3301"
],
"boost": 1
}
},
{
"range": {
"totalCommentStar": {
"from": 0,
"to": null,
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
},
{
"nested": {
"query": {
"bool": {
"must": [
{
"range": {
"hotelPrices.specValue": {
"from": "2022-12-21T16:00:00.000Z",
"to": "2022-12-22T16:00:00.000Z",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
},
{
"range": {
"hotelPrices.stockQuantity": {
"from": 0,
"to": null,
"include_lower": false,
"include_upper": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"path": "hotelPrices",
"ignore_unmapped": true,
"score_mode": "none",
"boost": 1,
"inner_hits": {
"ignore_unmapped": true,
"from": 0,
"size": 90,
"version": false,
"seq_no_primary_term": false,
"explain": false,
"track_scores": false
}
}
},
{
"match_all": {
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
}
},
"_source": {
"includes": [
"supplierId",
"storeId",
"goodsTags",
"price",
"salesSum",
"imageUrls",
"name",
"trait",
"id",
"ancestryCategoryId",
"province",
"city",
"district"
"imageMainUrl",
"goodsAttributes",
"hotelPrices",
"storeName",
"latitude",
"longitude"
],
"excludes": [
"backCatId"
]
},
"sort": [
{
"salesSum": {
"order": "asc"
}
},
{
"goodCommentRate": {
"order": "desc"
}
}
],
"aggregations": {
"filtered_nested": {
"nested": {
"path": "hotelPrices"
},
"aggregations": {
"where": {
"filter": {
"bool": {
"filter": [
{
"range": {
"hotelPrices.specValue": {
"from": "2022-12-21T16:00:00.000Z",
"to": "2022-12-22T16:00:00.000Z",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
},
{
"range": {
"hotelPrices.stockQuantity": {
"from": 0,
"to": null,
"include_lower": false,
"include_upper": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"aggregations": {
"group_by": {
"terms": {
"field": "hotelPrices.goodsId",
"size": 100,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_key": "asc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"avg_price": {
"avg": {
"field": "hotelPrices.sellPrice"
}
},
"having": {
"bucket_selector": {
"buckets_path": {
"counts": "_count",
"avgprice": "avg_price"
},
"script": {
"source": "params.avgprice>=300 && params.avgprice<=800&¶ms.counts==1",
"lang": "painless"
},
"gap_policy": "skip"
}
}
}
}
}
}
}
}
}
}
结果如下:
{
"took" : 15,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1192,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "goods_index",
"_type" : "_doc",
"_id" : "41378",
"_score" : null,
"_source" : {
"supplierId" : 103149696022265856,
"city" : "3301",
"goodsAttribute" : { },
"latitude" : "30.409022",
"siteName" : "杭州馆",
"goodsTags" : "[老友相聚, 有KTV, 卫生优秀, 环境安静]",
"cityName" : "杭州市",
"province" : "3300",
"projectCode" : 3300,
"price" : 580,
"trait" : "阳光房 棋牌室 K歌 烧烤 投影仪 花园",
"storeName" : "xxx民宿",
"ancestryCategoryId" : 2,
"id" : 41378,
"longitude" : "119.805612",
"promType" : 0,
"districtName" : "余杭区",
"hotelPrices" : [
{
"goodsId" : 41378,
"specValue" : "2022-12-21",
"stockQuantity" : 1,
"sellPrice" : 580.0,
"promPrice" : 0.0
},
//90天价格
{
"goodsId" : 41378,
"specValue" : "2023-03-20",
"stockQuantity" : 1,
"sellPrice" : 580.0,
"promPrice" : 0.0
}
],
"imageMainUrl" : "https://111.jpeg",
"salesSum" : 0,
"storeId" : "103887395157442560",
"totalCommentStar" : 5.0,
"district" : "330110",
"imageUrls" : "https://111.jpeg",
"name" : "xxx双床房",
"provinceName" : "浙江省"
},
"sort" : [
0,
1
],
"inner_hits" : {
"hotelPrices" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.0,
"hits" : [
{
"_index" : "goods_pre",
"_type" : "_doc",
"_id" : "41378",
"_nested" : {
"field" : "hotelPrices",
"offset" : 1
},
"_score" : 2.0,
"_source" : {
"goodsId" : 41378,
"specValue" : "2022-12-22",
"stockQuantity" : 1,
"sellPrice" : 580.0,
"promPrice" : 0.0
}
}
]
}
}
}
}]
}
}
}
}
]
},
"aggregations" : {
"filtered_nested" : {
"doc_count" : 107280,
"where" : {
"doc_count" : 1192,
"group_by" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 1092,
"buckets" : [
{
"key" : 12262,
"doc_count" : 1,
"avg_price" : {
"value" : 320.0
}
},
//..........
{
"key" : 19282,
"doc_count" : 1,
"avg_price" : {
"value" : 780.0
}
},
{
"key" : 19283,
"doc_count" : 1,
"avg_price" : {
"value" : 780.0
}
}
]
}
}
}
}
}
结果来看,aggregations从外到里面会分别存各筛选条件的统计维度;先嵌套filtered_nested前置过滤,再group_by分组,avg_price计算均值;bucket_selector基于均值和库存天数二次筛选。然后取bucket_selector结果集作为搜索结果列表
部分代码逻辑
// 设置查询条件
buildHotelSearchQuery(searchHotelRequest, sourceBuilder, boolQueryBuilder);
// 设置排序规则
buildHotelSearchSort(searchHotelRequest, sourceBuilder);
// 设置聚合
buildHotelSearchAggregation(searchHotelRequest,sourceBuilder);
// 设置结果
buildHotelSearchResponse(searchHotelRequest,response);
1. 设置查询条件
private void buildHotelSearchQuery(SearchHotelRequest searchHotelRequest, SearchSourceBuilder sourceBuilder, BoolQueryBuilder boolQueryBuilder) {
//顶级类目
if (searchHotelRequest.getAncesCategory() != null && searchHotelRequest.getAncesCategory() > 0) {
Long ancesCateGory = searchHotelRequest.getAncesCategory();
TermsQueryBuilder supplierIdQueryBuilder = QueryBuilders.termsQuery("ancestryCategoryId", ancesCateGory.toString());
boolQueryBuilder.must(supplierIdQueryBuilder);
}
//省级行政码
if (searchHotelRequest.getProvince() != null && searchHotelRequest.getProvince() != 0L) {
Long province = searchHotelRequest.getProvince();
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("projectCode", province);
boolQueryBuilder.must(termQueryBuilder);
}
//市行政码
if (StringUtils.isNotBlank(searchHotelRequest.getCityCode())) {
TermsQueryBuilder cityCodeQueryBuilder = QueryBuilders.termsQuery("city", searchHotelRequest.getCityCode());
boolQueryBuilder.must(cityCodeQueryBuilder);
}
//县行政码
if (StringUtils.isNotBlank(searchHotelRequest.getDistrictCode())) {
TermsQueryBuilder districtCodeQueryBuilder = QueryBuilders.termsQuery("district", searchHotelRequest.getDistrictCode());
boolQueryBuilder.must(districtCodeQueryBuilder);
}
//评分
if (ObjectUtil.isNotEmpty(searchHotelRequest.getTotalCommentStarRange())) {
RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("totalCommentStar");
//4.8分以上; 2:4.5分以上; 3:4分以上; 4:3.5分以上
switch (searchHotelRequest.getTotalCommentStarRange()){
case 1:
rangeQueryBuilder.from(4.8);
break;
case 2:
rangeQueryBuilder.from(4.5);
break;
case 3:
rangeQueryBuilder.from(4);
break;
case 4:
rangeQueryBuilder.from(3.5);
break;
default:
rangeQueryBuilder.from(0);
break;
}
boolQueryBuilder.must(rangeQueryBuilder);
}
//距离
if(ObjectUtil.isEmpty(searchHotelRequest.getDistance())) {
searchHotelRequest.setDistance(50000L);//默认50000米
}
if (ObjectUtil.isNotEmpty(searchHotelRequest.getLatitude()) && ObjectUtil.isNotEmpty(searchHotelRequest.getLongitude())) {
double latitude = searchHotelRequest.getLatitude();
double longitude = searchHotelRequest.getLongitude();
GeoDistanceQueryBuilder geoDistanceQueryBuilder = QueryBuilders.geoDistanceQuery("location").point(latitude, longitude)
.distance(searchHotelRequest.getDistance(), DistanceUnit.METERS);
boolQueryBuilder.filter(geoDistanceQueryBuilder);
}
//嵌套文档条件:起止时间,价格,库存
BoolQueryBuilder boolQueryBuilder2 = QueryBuilders.boolQuery();
//入住起止时间
if (ObjectUtil.isNotEmpty(searchHotelRequest.getStartTime())&&ObjectUtil.isNotEmpty(searchHotelRequest.getEndTime())) {
RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.specValue");
rangeQueryBuilder.from(searchHotelRequest.getStartTime());
rangeQueryBuilder.to(searchHotelRequest.getEndTime());
boolQueryBuilder2.must(rangeQueryBuilder);
}
//价格区间
if (ObjectUtil.isNotEmpty(searchHotelRequest.getPriceRange())) {
RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.sellPrice");
//1:0-300; 2:300-800; 3:800-1500; 4:1500以上
switch (searchHotelRequest.getPriceRange()){
case 1:
rangeQueryBuilder.from(0);
rangeQueryBuilder.to(300);
break;
case 2:
rangeQueryBuilder.from(300);
rangeQueryBuilder.to(800);
break;
case 3:
rangeQueryBuilder.from(800);
rangeQueryBuilder.to(1500);
break;
case 4:
rangeQueryBuilder.from(1500);
break;
default:
rangeQueryBuilder.from(0);
break;
}
boolQueryBuilder2.must(rangeQueryBuilder);
}
//库存大于0
RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.stockQuantity");
rangeQueryBuilder.gt(0);
boolQueryBuilder2.must(rangeQueryBuilder);
NestedQueryBuilder nestedQueryBuilder = QueryBuilders.nestedQuery("hotelPrices", boolQueryBuilder2, ScoreMode.None);
InnerHitBuilder innerHitBuilder = new InnerHitBuilder();
innerHitBuilder.setIgnoreUnmapped(true).setSize(90);
nestedQueryBuilder.ignoreUnmapped(true).innerHit(innerHitBuilder);
boolQueryBuilder.must(nestedQueryBuilder);
if (ObjectUtil.isNotEmpty(searchHotelRequest.getGoodsTags())) {
MatchPhraseQueryBuilder matchPhraseQuery = QueryBuilders.matchPhraseQuery("goodsTags", searchHotelRequest.getGoodsTags());
boolQueryBuilder.must(matchPhraseQuery);
}
//搜索条件:商品名,店铺名
Map<String, Float> fields = new HashMap<>();
fields.put("name", 3f);
fields.put("storeName", 2f);
fields.put("trait", 1f);
MultiMatchQueryBuilder multiMatchQueryBuilder = new MultiMatchQueryBuilder(searchHotelRequest.getSearchKey(), "name", "trait", "storeName").fields(fields);
boolQueryBuilder.must(multiMatchQueryBuilder);
}
2. 设置排序规则
private void buildHotelSearchSort(SearchHotelRequest searchHotelRequest, SearchSourceBuilder sourceBuilder) {
GeoDistanceSortBuilder geoDistanceSortBuilder = null;
if (ObjectUtil.isNotEmpty(searchHotelRequest.getLatitude()) && ObjectUtil.isNotEmpty(searchHotelRequest.getLongitude())) {
double latitude = searchHotelRequest.getLatitude();
double longitude = searchHotelRequest.getLongitude();
geoDistanceSortBuilder = SortBuilders.geoDistanceSort("location", latitude, longitude)
.point(latitude, longitude).unit(DistanceUnit.METERS);
}
if ("priceDESC".equals(searchHotelRequest.getSortType())) {
sourceBuilder.sort("price", SortOrder.DESC);
sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则
} else if ("priceASC".equals(searchHotelRequest.getSortType())) {
sourceBuilder.sort("price", SortOrder.ASC);
sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则
} else if ("soldDESC".equals(searchHotelRequest.getSortType())) {
sourceBuilder.sort("salesSum", SortOrder.DESC);
sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则
} else if ("soldASC".equals(searchHotelRequest.getSortType())) {
sourceBuilder.sort("salesSum", SortOrder.ASC);
sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则
} else if ("distinctASC".equals(searchHotelRequest.getSortType())) {
if (ObjectUtil.isNotEmpty(geoDistanceSortBuilder)) {
sourceBuilder.sort(geoDistanceSortBuilder.order(SortOrder.ASC));
}
sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则
} else if (" distinctDESC".equals(searchHotelRequest.getSortType())) {
if (ObjectUtil.isNotEmpty(geoDistanceSortBuilder)) {
sourceBuilder.sort(geoDistanceSortBuilder.order(SortOrder.DESC));
}
sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则
}
}
3. 设置聚合 group by having
private void buildHotelSearchAggregation(SearchHotelRequest searchHotelRequest,SearchSourceBuilder sourceBuilder) {
List<AggregationBuilder> aggregationBuilders=new ArrayList<>();
String nestedAggregationName = "filtered_nested";
String whereAggregationName = "where";
String groupByAggregationName = "group_by";
String avgPriceAggregationName = "avg_price";
String havingBucketSelectorName = "having";
String countName = "_count";
//filtered_nested
NestedAggregationBuilder nestedAggregation =
AggregationBuilders.nested(nestedAggregationName,"hotelPrices");
//where
BoolQueryBuilder boolQueryBuilder =QueryBuilders.boolQuery();
boolQueryBuilder.filter();
//入住起止时间
Long days= null;
if (ObjectUtil.isNotEmpty(searchHotelRequest.getStartTime())&&ObjectUtil.isNotEmpty(searchHotelRequest.getEndTime())) {
RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.specValue");
rangeQueryBuilder.from(searchHotelRequest.getStartTime());
rangeQueryBuilder.to(searchHotelRequest.getEndTime());
boolQueryBuilder.filter(rangeQueryBuilder);
days = DateUtil.betweenDay(searchHotelRequest.getStartTime(),searchHotelRequest.getEndTime(),true);
}
//库存大于0
RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.stockQuantity");
rangeQueryBuilder.gt(0);
boolQueryBuilder.filter(rangeQueryBuilder);
FilterAggregationBuilder whereAggregation =
AggregationBuilders.filter(whereAggregationName,boolQueryBuilder);
//group_by true asc;false desc
BucketOrder groupByAggregationOrder = BucketOrder.aggregation("_key",true);
TermsAggregationBuilder groupByAggregation =
AggregationBuilders.terms(groupByAggregationName).
field("hotelPrices.goodsId").
size(searchHotelRequest.getPageSize()).order(groupByAggregationOrder);
//avg_price
AvgAggregationBuilder avgPriceAggregation =
AggregationBuilders.avg(avgPriceAggregationName)
.field("hotelPrices.sellPrice");
//having avg_price bucket_selector
Map<String, String> bucketsPathsMap = new HashMap<>(2);
bucketsPathsMap.put("avgprice", avgPriceAggregationName);
bucketsPathsMap.put("counts", countName);
//计算起始天数差 &¶ms.counts==22
String scriptCmd = "params.avgprice>=0";
//平均价格在价格区间 及天数差
if (ObjectUtil.isNotEmpty(searchHotelRequest.getPriceRange())) {
//1:0-300; 2:300-800; 3:800-1500; 4:1500以上
switch (searchHotelRequest.getPriceRange()){
case 1:
scriptCmd ="params.avgprice>=0 && params.avgprice<=300";
break;
case 2:
scriptCmd ="params.avgprice>=300 && params.avgprice<=800";
break;
case 3:
scriptCmd ="params.avgprice>=800 && params.avgprice<=1500";
break;
case 4:
scriptCmd ="params.avgprice>=1500";
break;
default:
break;
}
}
if(ObjectUtil.isNotEmpty(days)) {
scriptCmd+="&¶ms.counts==" + days;
}
Script script = new Script(scriptCmd);
BucketSelectorPipelineAggregationBuilder havingBucketSelector =
PipelineAggregatorBuilders.bucketSelector(havingBucketSelectorName, bucketsPathsMap, script);
//avgPriceAggregation嵌套在groupByAggregation中
groupByAggregation
.subAggregation(avgPriceAggregation)
.subAggregation(havingBucketSelector)
//注意这里的havingBucketSelector是紧跟在avgPriceAggregation的后面,非嵌套
//.order()
.size(searchHotelRequest.getPageSize());
//groupByAggregation嵌套在whereAggregation中
whereAggregation.subAggregation(groupByAggregation);
//whereAggregation嵌套在nestedAggregation中
nestedAggregation.subAggregation(whereAggregation);
sourceBuilder.aggregation(nestedAggregation);
}
4. 设置结果集,反过来解析第三步的聚合桶
private SearchGoodsResponse buildHotelSearchResponse(SearchHotelRequest searchHotelRequest,SearchResponse searchResponse) {
String nestedAggregationName = "filtered_nested";
String whereAggregationName = "where";
String groupByAggregationName = "group_by";
String avgPriceAggregationName = "avg_price";
String havingBucketSelectorName = "having";
String countName = "_count";
SearchGoodsResponse searchGoodsResponse = new SearchGoodsResponse();
if (ObjectUtil.isEmpty(searchResponse)) {
return null;
}
Aggregations aggregations = searchResponse.getAggregations();
ParsedNested nestedAggregation = aggregations.get(nestedAggregationName);
ParsedFilter whereAggregation = nestedAggregation.getAggregations().get(whereAggregationName);
Terms groupByAggregation=whereAggregation.getAggregations().get(groupByAggregationName);
List<? extends Terms.Bucket> avgBucketList = groupByAggregation.getBuckets();groupByAggregation.getBuckets();
//获取符合条件的商品id列表
List <String> goodsIds =new ArrayList<>();
avgBucketList.forEach(v -> {
if (ObjectUtil.isNotEmpty(v.getKey())) {
goodsIds.add(v.getKey().toString());
}
});
Map res = new HashMap();
SearchHits searchHits = searchResponse.getHits();
ArrayList<Map> searchResult = new ArrayList<>();
if (searchHits != null) {
for (SearchHit hit : searchHits) {
String id= hit.getId();
if(ObjectUtil.isNotEmpty(searchHotelRequest.getStartTime())
&&ObjectUtil.isNotEmpty(searchHotelRequest.getEndTime())) {
if (goodsIds.contains(id)) {
//计算活动价
//距离优先,距离指标在排序的第一位;商品评分优先,距离指标在排序的第二位
//@TODO 距离指标在排序的第一位
if ((ObjectUtil.isNotEmpty(searchHotelRequest.getLatitude()) && ObjectUtil.isNotEmpty(searchHotelRequest.getLongitude()))
&& ("distinctASC".equals(searchHotelRequest.getSortType()) || "distinctDESC".equals(searchHotelRequest.getSortType()))) {
result.put("distance", new BigDecimal(String.valueOf(hit.getSortValues()[0])).setScale(6, RoundingMode.HALF_UP).doubleValue());
} else {
result.put("distance", Double.valueOf(0.00));
}
result.put("hotelPrices", null);
Object goodsTags = hit.getSourceAsMap().get("goodsTags");
if (ObjectUtil.isNotEmpty(goodsTags)) {
result.put("goodsTags", goodsTags.toString().replace("[", "").replace("]", ""));
}
searchResult.add(result);
}
}
//起止时间首页不传需要取所有结果,不做筛选
if(ObjectUtil.isEmpty(searchHotelRequest.getStartTime())
&&ObjectUtil.isEmpty(searchHotelRequest.getEndTime())){
//计算活动价
//距离优先,距离指标在排序的第一位;商品评分优先,距离指标在排序的第二位
//@TODO 距离指标在排序的第一位
if ((ObjectUtil.isNotEmpty(searchHotelRequest.getLatitude()) && ObjectUtil.isNotEmpty(searchHotelRequest.getLongitude()))
&& ("distinctASC".equals(searchHotelRequest.getSortType()) || "distinctDESC".equals(searchHotelRequest.getSortType()))) {
result.put("distance", new BigDecimal(String.valueOf(hit.getSortValues()[0])).setScale(6, RoundingMode.HALF_UP).doubleValue());
} else {
result.put("distance", Double.valueOf(0.00));
}
result.put("hotelPrices", null);
Object goodsTags = hit.getSourceAsMap().get("goodsTags");
if (ObjectUtil.isNotEmpty(goodsTags)) {
result.put("goodsTags", goodsTags.toString().replace("[", "").replace("]", ""));
}
searchResult.add(result);
}
}
}
res.put("searchResult", searchResult);
searchGoodsResponse.setData(res);
return searchGoodsResponse;
}