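1. Compound queries in ES
What is a compound query? A compound query combines several query conditions with some logic. Put simply, it groups a set of filter conditions together and runs them as one query.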
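The main compound queries are:
- bool query: Boolean query. The default query for combining multiple leaf or compound query clauses as must, should, must_not, or filter clauses.
- boosting query: Returns documents that match the positive query, but lowers the score of documents that also match the negative query.
- constant_score: Constant-score query. Wraps another query but executes it in filter context. Every matching document is given the same "constant" _score.
- dis_max: Best-match query. Accepts multiple queries and returns every document that matches any of the query clauses. While the bool query combines the scores of all matching queries, the dis_max query uses the score of the single best-matching query clause.
- function_score: Function query. Uses functions to modify the scores returned by the main query, to take into account factors such as popularity, recency, distance, or a custom algorithm implemented with a script.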
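To make the scoring difference between bool and dis_max concrete, here is a minimal Java sketch. It is not Elasticsearch code: the class and method names are made up for illustration and the clause scores are hypothetical. It assumes the documented behaviour that bool adds up the scores of its matching must and should clauses, while dis_max keeps the best clause score and adds the remaining ones only after scaling them by tie_breaker (0.0 by default).

```java
/** Sketch: how bool and dis_max turn per-clause scores into one document score. */
final class ScoreCombinationSketch {

    /** bool: the scores of all matching must/should clauses are added together. */
    static double boolScore(double[] matchingClauseScores) {
        double sum = 0.0;
        for (double s : matchingClauseScores) {
            sum += s;
        }
        return sum;
    }

    /**
     * dis_max: the best clause score wins; every other matching clause
     * contributes only tieBreaker * its score.
     */
    static double disMaxScore(double[] matchingClauseScores, double tieBreaker) {
        double best = 0.0;
        double sum = 0.0;
        for (double s : matchingClauseScores) {
            best = Math.max(best, s);
            sum += s;
        }
        return best + tieBreaker * (sum - best);
    }

    public static void main(String[] args) {
        // Hypothetical scores of two clauses that both match one document
        double[] scores = {2.0, 0.5};
        System.out.println("bool: " + boolScore(scores));
        System.out.println("dis_max: " + disMaxScore(scores, 0.0));
        System.out.println("dis_max, tie_breaker 0.5: " + disMaxScore(scores, 0.5));
    }
}
```

With clause scores 2.0 and 0.5 the sketch gives 2.5 for bool, 2.0 for dis_max, and 2.25 for dis_max with a tie_breaker of 0.5.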
The next two sections work through these queries with a few examples.
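2. Bool query
2.1 Clauses of the bool query
The bool query is the most commonly used compound query. ES returns a document only when it satisfies the rules of the clauses the query is built from. A bool query lets you combine clauses freely: clauses that must match (must), clauses that should match (should), and clauses that must not match (must_not).
- must: the clause must match. A document is returned only if it matches these queries. Equivalent to a logical AND.
- should: the clause should match. Equivalent to a logical OR. If the query already contains a filter or must clause, should only affects the score, and documents that match none of the should clauses are still returned. If the query contains neither filter nor must, a document has to match at least one of the should clauses.
- must_not: the clause must not match. Only documents that do not match these queries are returned.
- filter: like must, only documents that match the queries under filter are returned. The difference from must is that filter is not scored (it does not affect _score) and only filters. It is also equivalent to a logical AND.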
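The matching rules above can be sketched in plain Java. This is an illustration rather than Elasticsearch code: BoolMatchSketch and its matches method are hypothetical names, each boolean says whether the document matches one clause, and the sketch assumes the default minimum_should_match (no explicit override). Scoring is left out on purpose.

```java
/** Sketch of the matching rules of a bool query (scoring is not modelled). */
final class BoolMatchSketch {

    /**
     * Each array holds one entry per clause: true if the document matches it.
     * Pass an empty array for a clause type the query does not use.
     */
    static boolean matches(boolean[] must, boolean[] filter,
                           boolean[] should, boolean[] mustNot) {
        for (boolean m : must) {
            if (!m) return false;   // every must clause has to match
        }
        for (boolean f : filter) {
            if (!f) return false;   // every filter clause has to match
        }
        for (boolean n : mustNot) {
            if (n) return false;    // no must_not clause may match
        }
        // With a must or filter clause present, should clauses are optional
        // and only influence the score.
        if (must.length > 0 || filter.length > 0 || should.length == 0) {
            return true;
        }
        // Otherwise at least one should clause has to match.
        for (boolean s : should) {
            if (s) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        boolean[] none = new boolean[0];
        // must matches, should does not: the document is still returned
        System.out.println(matches(new boolean[]{true}, none, new boolean[]{false}, none));
        // only should clauses, none of them matches: the document is not returned
        System.out.println(matches(none, none, new boolean[]{false, false}, none));
    }
}
```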
2.2 Bool compound queries in DSL
- must query
The requirement: find the products whose base price (basePrice) must be 800.
GET /goods_item_index/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"basePrice": {
"value": "800.00"
}
}
}
]
}
}
}
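term is the query condition; it matches the exact value of a field.
The command above returns the products whose basePrice equals 800.00.
- must_not query
Next, use must_not to return the documents that must not match the condition, here every product whose basePrice is not 800.00: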
GET /goods_item_index/_search
{
"query": {
"bool": {
"must_not": [
{
"term": {
"basePrice": {
"value": "800.00"
}
}
}
]
}
}
}
- should query
Now we want the products whose basePrice equals 800.00 and whose skuName should match "书":
GET /goods_item_index/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"basePrice": {
"value": "800.00"
}
}
}
],
"should": [
{"term": {
"skuName": {
"value": "书"
}
}}
]
}
}
}
The result is as follows:
"hits": [
{
"_index": "goods_item_index",
"_id": "96788",
"_score": 6.0290394,
"_source": {
"id": 96788,
"category": "10",
"basePrice": 800,
"marketPrice": 650.09,
"stockNum": 3481,
"skuImgUrl": "https://lorempixel.com/1920/1200/nature/",
"skuId": "74050301848",
"skuName": "书架",
"createTime": "1993-10-27 07:46:37",
"updateTime": "1985-08-22 06:20:54"
}
},
{
"_index": "goods_item_index",
"_id": "81113",
"_score": 1,
"_source": {
"id": 81113,
"category": "3",
"basePrice": 800,
"marketPrice": 295.82,
"stockNum": 727,
"skuImgUrl": "https://lorempixel.com/g/1680/1050/city/",
"skuId": "35020730408",
"skuName": "各种型号电池",
"createTime": "1995-05-04 15:53:58",
"updateTime": "1997-02-22 10:02:17"
}
}
...
...
...
}
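Only part of the result is shown. Documents whose skuName contains "书" get a higher _score, while the others, which do not contain "书", all score 1.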
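As a small worked example of where these two scores may come from, the sketch below assumes that the score of a bool query is the sum of its matching must and should clause scores, and that filter and must_not clauses add nothing. The names are hypothetical. The should contribution of 5.0290394 is an inference, obtained by subtracting the must score of 1 from the first hit's 6.0290394; ES does not report it in the output above.

```java
/** Sketch: score of a bool query as the sum of its scoring clauses. */
final class BoolScoreSketch {

    /**
     * Adds the scores of the must clauses and of the should clauses that matched.
     * filter and must_not clauses are not parameters because they add no score.
     */
    static double score(double[] mustScores, double[] matchedShouldScores) {
        double total = 0.0;
        for (double s : mustScores) {
            total += s;
        }
        for (double s : matchedShouldScores) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        double priceTerm = 1.0;       // score of the must clause, as seen on the second hit
        double nameTerm = 5.0290394;  // assumption: inferred as 6.0290394 - 1
        // skuName "书架": must and should both match
        System.out.println(score(new double[]{priceTerm}, new double[]{nameTerm}));
        // skuName "各种型号电池": only must matches
        System.out.println(score(new double[]{priceTerm}, new double[0]));
    }
}
```

Under the same assumption, moving the basePrice term from must to filter would still restrict the hits to the same products, but the term would no longer add its score.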
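- filter query: not demonstrated again here, since the previous part already has an example.
2.3 Bool query with the Java client
In the service layer, create a new compound-query interface with the following method: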
/**
* Bool query
* @param basePrice base price
* @param skuName SKU name
* @return the matching products
*/
List<GoodsItemRep> boolQuery(String basePrice, String skuName);
Implementation:
@Override
public List<GoodsItemRep> boolQuery(String basePrice, String skuName) {
    List<GoodsItemRep> goodsItemReps = new ArrayList<>();
    // Combine the query conditions
    Query query = BoolQuery.of(b -> b
            .must(q -> q.term(t -> t.field("basePrice").value(basePrice)))  // must
            .should(q -> q.term(t -> t.field("skuName").value(skuName)))    // should
    )._toQuery();
    // client.search(...) appears to be this project's own helper (not shown here) that returns the hits as a list
    List<GoodsItem> result = client.search(index, GoodsItem.class, query);
    // Copy each hit into the response type
    result.forEach(goodsItem -> {
        GoodsItemRep goodsItemRep = new GoodsItemRep();
        BeanUtils.copyProperties(goodsItem, goodsItemRep);
        goodsItemReps.add(goodsItemRep);
    });
    return goodsItemReps;
}
Call it from Postman:
curl --location --request POST 'http://localhost:8089/goods/boolQuery?basePrice=800&skuName=书'
Query result:
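3. Boosting query
3.1 What is a boosting query
In practice, whichever database you use for storage, you will run into scenarios where query results have to be ordered so that the results we most want to see come first and everything else comes last. This is where the boosting query comes in. ES uses relevance to express how closely a result matches a query. The boosting query lowers the score of the documents we would rather not see instead of excluding them, so that better matches rank first and weaker matches rank later.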
3.2 Boosting query in DSL
Let's walk through a demo. First insert a few documents into the index for testing:
POST /goods_item_index/_bulk
{"index":{"_id":1000002}}
{"id":1000002,"category":"8","basePrice":778.35,"marketPrice":53.66,"stockNum":3303,"skuImgUrl":"tmall","skuId":"13242377281","skuName":"apple phone","createTime":"1994-03-31 07:03:44","updateTime":"1988-09-10 00:45:46"}
{"index":{"_id":1000003}}
{"id":1000003,"category":"8","basePrice":778.35,"marketPrice":53.66,"stockNum":3303,"skuImgUrl":"taobao","skuId":"13242377281","skuName":"apple phone","createTime":"1994-03-31 07:03:44","updateTime":"1988-09-10 00:45:46"}
{"index":{"_id":1000004}}
{"id":1000004,"category":"8","basePrice":778.35,"marketPrice":53.66,"stockNum":3303,"skuImgUrl":"jd","skuId":"13242377281","skuName":"apple phone","createTime":"1994-03-31 07:03:44","updateTime":"1988-09-10 00:45:46"}
First run a simple DSL query in Kibana:
GET /goods_item_index/_search
{
"query": {
"match": {
"skuName": "apple"
}
}
}
The result is as follows. The scores of the documents are nearly identical:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 27.226557,
"hits": [
{
"_index": "goods_item_index",
"_id": "1000002",
"_score": 27.226557,
"_source": {
"id": 1000002,
"category": "8",
"basePrice": 778.35,
"marketPrice": 53.66,
"stockNum": 3303,
"skuImgUrl": "tmall",
"skuId": "13242377281",
"skuName": "apple phone",
"createTime": "1994-03-31 07:03:44",
"updateTime": "1988-09-10 00:45:46"
}
},
{
"_index": "goods_item_index",
"_id": "1000003",
"_score": 27.099957,
"_source": {
"id": 1000003,
"category": "8",
"basePrice": 778.35,
"marketPrice": 53.66,
"stockNum": 3303,
"skuImgUrl": "taobao",
"skuId": "13242377281",
"skuName": "apple phone",
"createTime": "1994-03-31 07:03:44",
"updateTime": "1988-09-10 00:45:46"
}
},
{
"_index": "goods_item_index",
"_id": "1000004",
"_score": 27.096817,
"_source": {
"id": 1000004,
"category": "8",
"basePrice": 778.35,
"marketPrice": 53.66,
"stockNum": 3303,
"skuImgUrl": "jd",
"skuId": "13242377281",
"skuName": "apple phone",
"createTime": "1994-03-31 07:03:44",
"updateTime": "1988-09-10 00:45:46"
}
}
]
}
}
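Next we want to query for skuName "apple phone" and push the documents whose skuImgUrl is tmall toward the end. The DSL is shown below. positive holds the query whose matches we want ranked first, negative holds the query whose matches we want ranked later, and negative_boost is the weight, a number between 0 and 1.0, by which the score of documents matching negative is multiplied.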
GET /goods_item_index/_search
{
"query": {
"boosting": {
"positive": {
"match": {
"skuName": "apple phone"
}
},
"negative": {
"match": {
"skuImgUrl": "tmall"
}
},
"negative_boost": 0.5
}
}
}
Run the query again. In the result below, the score of the last hit is clearly much lower, close to half of the others:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 27.677588,
"hits": [
{
"_index": "goods_item_index",
"_id": "1000004",
"_score": 27.677588,
"_source": {
"id": 1000004,
"category": "8",
"basePrice": 778.35,
"marketPrice": 53.66,
"stockNum": 3303,
"skuImgUrl": "jd",
"skuId": "13242377281",
"skuName": "apple phone",
"createTime": "1994-03-31 07:03:44",
"updateTime": "1988-09-10 00:45:46"
}
},
{
"_index": "goods_item_index",
"_id": "1000003",
"_score": 27.457375,
"_source": {
"id": 1000003,
"category": "8",
"basePrice": 778.35,
"marketPrice": 53.66,
"stockNum": 3303,
"skuImgUrl": "taobao",
"skuId": "13242377281",
"skuName": "apple phone",
"createTime": "1994-03-31 07:03:44",
"updateTime": "1988-09-10 00:45:46"
}
},
{
"_index": "goods_item_index",
"_id": "1000002",
"_score": 13.582527,
"_source": {
"id": 1000002,
"category": "8",
"basePrice": 778.35,
"marketPrice": 53.66,
"stockNum": 3303,
"skuImgUrl": "tmall",
"skuId": "13242377281",
"skuName": "apple phone",
"createTime": "1994-03-31 07:03:44",
"updateTime": "1988-09-10 00:45:46"
}
}
]
}
}
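The halving seen on the tmall document follows directly from negative_boost. The sketch below is a plain Java illustration with hypothetical names, not Elasticsearch code. It assumes that a document matching the negative query keeps its positive score multiplied by negative_boost. The positive score of 27.165054 used for the tmall document is an inference (13.582527 divided by 0.5), since the output above only shows the final score.

```java
/** Sketch: how negative_boost changes the score of a hit in a boosting query. */
final class BoostingSketch {

    /**
     * Returns the positive score unchanged, or multiplied by negativeBoost
     * when the document also matches the negative query.
     */
    static double boostedScore(double positiveScore, boolean matchesNegative,
                               double negativeBoost) {
        return matchesNegative ? positiveScore * negativeBoost : positiveScore;
    }

    public static void main(String[] args) {
        double negativeBoost = 0.5;
        // skuImgUrl "jd": does not match the negative query, score is kept
        System.out.println(boostedScore(27.677588, false, negativeBoost));
        // skuImgUrl "tmall": matches the negative query, score is halved
        // (27.165054 is an assumed positive score, inferred from the result above)
        System.out.println(boostedScore(27.165054, true, negativeBoost));
    }
}
```

A negative_boost closer to 0 pushes the matching documents further down, while a value of 1.0 leaves the ranking as it was.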
3.3 Java client implementation
First add a boosting query method to the service interface:
/**
* Boosting query
* @param skuImgUrl image URL
* @param skuName SKU name
* @return the matching products
*/
List<GoodsItemRep> boostingQuery(String skuImgUrl, String skuName);
Implementation:
@Override
public List<GoodsItemRep> boostingQuery(String skuImgUrl, String skuName) {
    List<GoodsItemRep> goodsItemReps = new ArrayList<>();
    // Combine the query conditions
    Query query = BoostingQuery.of(b -> b
            .positive(q -> q.match(m -> m.field("skuName").query(skuName)))      // ranked first
            .negative(q -> q.match(m -> m.field("skuImgUrl").query(skuImgUrl)))  // demoted
            .negativeBoost(0.5))._toQuery();
    // client.search(...) appears to be this project's own helper (not shown here) that returns the hits as a list
    List<GoodsItem> result = client.search(index, GoodsItem.class, query);
    // Copy each hit into the response type
    result.forEach(goodsItem -> {
        GoodsItemRep goodsItemRep = new GoodsItemRep();
        BeanUtils.copyProperties(goodsItem, goodsItemRep);
        goodsItemReps.add(goodsItemRep);
    });
    return goodsItemReps;
}
Then call the endpoint with Postman or a similar tool:
curl --location --request POST 'http://localhost:8089/goods/boostingQuery?skuImgUrl=tmall&skuName=apple%20phone'
Result of the call: