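What is aggregation
Aggregation is similar to group by and sum(...) in MySQL, which should sound familiar, but aggregations in ES are considerably more powerful.
Two concepts come first: buckets and metrics. An aggregation is a combination of one or more buckets and zero or more metrics.
Aggregations have many use cases. Back-office reports, for example, usually need many complex statistics over a large volume of data. Computing them in the database alone can be very slow or even bring the database down, whereas ES handles this kind of work with relative ease.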
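Buckets
A bucket is a set of documents stored in ES that satisfy a particular condition.
When an aggregation runs, the values in each document are evaluated to decide which bucket's condition the document meets. If it matches, the document is placed into that bucket and the aggregation continues. Buckets can also be nested inside other buckets, giving a hierarchical or conditional partitioning scheme.
Metrics
A metric is a computation over the documents in a bucket, such as the average, maximum, or minimum of a field's values. Next, let's see how to run aggregations.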
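To make the bucket/metric split concrete, here is a minimal plain-Python sketch. It only mimics the idea and is not how Elasticsearch works internally; the documents and the helper names `bucket_by` and `avg_metric` are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents, used only for illustration.
docs = [
    {"path": "/a", "times": 10},
    {"path": "/a", "times": 30},
    {"path": "/b", "times": 50},
]

def bucket_by(documents, field):
    """Bucketing: put each document into the bucket named by its field value."""
    buckets = defaultdict(list)
    for doc in documents:
        buckets[doc[field]].append(doc)
    return dict(buckets)

def avg_metric(bucket, field):
    """Metric: compute one number (here the average) over a bucket's documents."""
    return sum(doc[field] for doc in bucket) / len(bucket)

buckets = bucket_by(docs, "path")
averages = {key: avg_metric(bucket, "times") for key, bucket in buckets.items()}
print(averages)  # {'/a': 20.0, '/b': 50.0}
```

One bucketing step plus one metric per bucket is the "one or more buckets and zero or more metrics" combination described above.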
Aggregation syntax
The general syntax is:
GET index_name/_search
{
"aggs": {
"NAME": {
"AGG_TYPE": {}
}
}
}
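- index_name: the index name
- aggs: the aggregation keyword
- NAME: a name of your choice, used as the key under which the aggregation result is returned
- AGG_TYPE: the aggregation type
Note that NAME and AGG_TYPE are placeholders: NAME is any name you choose, and AGG_TYPE must be one of the supported aggregation types. Some commonly used ones:
- terms: buckets documents by the value of a field, so that documents land in different buckets for further processing
- histogram: a histogram over a numeric field; you specify an interval and buckets are formed at that step
- date_histogram: a histogram over time; you specify a time interval and buckets are formed at that interval
- cardinality: distinct count, with some margin of error
- percentiles: returns the field values found at the given percentiles
- percentile_ranks: returns the percentile ranks of the given values
- filter: filters the documents that go into the aggregation without filtering the query hits
- post_filter: filters the query hits without filtering the aggregation results (it is a top-level search parameter rather than an aggregation type)
- avg: average
- sum: sum
- min: minimum
- max: maximum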
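As a rough illustration of how the histogram type assigns values to buckets, the sketch below uses the bucket-key formula from the Elasticsearch histogram documentation, `floor((value - offset) / interval) * interval + offset`. The function names and the sample values are hypothetical, and the sketch leaves out empty buckets, which Elasticsearch may return depending on `min_doc_count`.

```python
import math
from collections import Counter

def histogram_key(value, interval, offset=0):
    """Round a value down to the start of its fixed-width bucket."""
    return math.floor((value - offset) / interval) * interval + offset

def histogram(values, interval):
    """Count values per bucket key, sorted by key (empty buckets are omitted)."""
    counts = Counter(histogram_key(v, interval) for v in values)
    return dict(sorted(counts.items()))

# Hypothetical response times, bucketed in steps of 500.
print(histogram([20, 80, 400, 630, 960, 1300], 500))  # {0: 3, 500: 2, 1000: 1}
```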
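Let's look at an example:
In this section we create a new index: a simple request log index with fields for the request method, path, elapsed time, and log creation time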
PUT req_log
{
"mappings": {
"properties" : {
"method" : {
"type" : "keyword"
},
"path" : {
"type" : "keyword"
},
"times" : {
"type" : "long"
},
"created" : {
"type" : "date"
}
}
}
}
Next, load some data into it
POST /req_log/_bulk
{ "index": {}}
{ "times" : 80, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-09" }
{ "index": {}}
{ "times" : 30, "method" : "GET", "path" : "/api/post/2", "created" : "2023-02-07" }
{ "index": {}}
{ "times" : 20, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-08" }
{ "index": {}}
{ "times" : 120, "method" : "GET", "path" : "/api/post/20", "created" : "2023-02-06" }
{ "index": {}}
{ "times" : 150, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-05" }
{ "index": {}}
{ "times" : 80, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-04" }
{ "index": {}}
{ "times" : 960, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-03" }
{ "index": {}}
{ "times" : 9000, "method" : "GET", "path" : "/api/post/8", "created" : "2023-02-02" }
{ "index": {}}
{ "times" : 1300, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-01" }
{ "index": {}}
{ "times" : 400, "method" : "GET", "path" : "/api/post/4", "created" : "2023-02-10" }
{ "index": {}}
{ "times" : 89, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-11" }
{ "index": {}}
{ "times" : 380, "method" : "GET", "path" : "/api/post/2", "created" : "2023-02-12" }
{ "index": {}}
{ "times" : 270, "method" : "GET", "path" : "/api/post/10", "created" : "2023-02-13" }
{ "index": {}}
{ "times" : 630, "method" : "GET", "path" : "/api/post/12", "created" : "2023-02-14" }
{ "index": {}}
{ "times" : 210 , "method" : "GET", "path" : "/api/post/4", "created" : "2023-02-15" }
{ "index": {}}
{ "times" : 900, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-16" }
{ "index": {}}
{ "times" : 870, "method" : "GET", "path" : "/api/post/7", "created" : "2023-02-17" }
Count the number of requests for each request path (path)
GET req_log/_search
{
"aggs": {
"counts": {
"terms": {
"field": "path"
}
}
}
}
Response:
{
"took" : 995,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 17,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "GUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/1",
"created" : "2023-02-09"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "GkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 30,
"method" : "GET",
"path" : "/api/post/2",
"created" : "2023-02-07"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "G0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 20,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-08"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HEK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 120,
"method" : "GET",
"path" : "/api/post/20",
"created" : "2023-02-06"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 150,
"method" : "GET",
"path" : "/api/post/1",
"created" : "2023-02-05"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-04"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "H0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 960,
"method" : "GET",
"path" : "/api/post/6",
"created" : "2023-02-03"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IEK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 9000,
"method" : "GET",
"path" : "/api/post/8",
"created" : "2023-02-02"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 1300,
"method" : "GET",
"path" : "/api/post/6",
"created" : "2023-02-01"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 400,
"method" : "GET",
"path" : "/api/post/4",
"created" : "2023-02-10"
}
}
]
},
"aggregations" : {
"counts" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "/api/post/3",
"doc_count" : 3
},
{
"key" : "/api/post/6",
"doc_count" : 3
},
{
"key" : "/api/post/1",
"doc_count" : 2
},
{
"key" : "/api/post/2",
"doc_count" : 2
},
{
"key" : "/api/post/4",
"doc_count" : 2
},
{
"key" : "/api/post/10",
"doc_count" : 1
},
{
"key" : "/api/post/12",
"doc_count" : 1
},
{
"key" : "/api/post/20",
"doc_count" : 1
},
{
"key" : "/api/post/7",
"doc_count" : 1
},
{
"key" : "/api/post/8",
"doc_count" : 1
}
]
}
}
}
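In the response, counts is the name we chose, the aggregation type is terms, and the aggregated field is path, so the documents are bucketed by path.
The result makes this clear: buckets holds 10 buckets, where key is the value of the aggregated field (a path) and doc_count is the number of documents in that bucket.
terms also accepts the following options: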
GET req_log/_search
{
"aggs": {
"counts": {
"terms": {
"field": "path",
"size": 10,
"collect_mode": "depth_first",
"order": {
"_count": "desc"
}
}
}
}
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
......,
"aggregations" : {
"counts" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "/api/post/3",
"doc_count" : 3
},
{
"key" : "/api/post/6",
"doc_count" : 3
},
{
"key" : "/api/post/1",
"doc_count" : 2
},
{
"key" : "/api/post/2",
"doc_count" : 2
},
{
"key" : "/api/post/4",
"doc_count" : 2
},
{
"key" : "/api/post/10",
"doc_count" : 1
},
{
"key" : "/api/post/12",
"doc_count" : 1
},
{
"key" : "/api/post/20",
"doc_count" : 1
},
{
"key" : "/api/post/7",
"doc_count" : 1
},
{
"key" : "/api/post/8",
"doc_count" : 1
}
]
}
}
}
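- size: how many buckets to return, usually combined with an ordering; the default is 10
- collect_mode: the collection mode, either depth-first (depth_first) or breadth-first (breadth_first). With array-type fields, depth-first collection can use too much memory, because it builds the complete set of buckets in memory before trimming it
- order: the sort order; by default buckets are sorted by doc_count descending, and you can sort by a built-in key (such as _count or _key) or by a sub-aggregation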
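The effect of `size` and `order` on the sample data can be mimicked in plain Python. This is a sketch, not Elasticsearch's implementation: the `terms` function is hypothetical, and breaking ties by ascending key is an assumption chosen because it reproduces the bucket order in the response above.

```python
from collections import Counter

# The path suffix of each of the 17 sample documents, in insertion order.
post_ids = [1, 2, 3, 20, 1, 3, 6, 8, 6, 4, 3, 2, 10, 12, 4, 6, 7]
paths = [f"/api/post/{i}" for i in post_ids]

def terms(values, size=10):
    """Count values, sort by count descending (ties by key ascending), keep `size` buckets."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered[:size]]

buckets = terms(paths)
print(len(buckets))          # 10
print(buckets[0])            # {'key': '/api/post/3', 'doc_count': 3}
print(terms(paths, size=2))  # only the two largest buckets
```

Because the keys are compared as strings, "/api/post/10" sorts before "/api/post/7", which matches the order of the single-document buckets in the response.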
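Nested aggregations
ES supports nesting aggregations: a bucket can itself be divided into further buckets. Nesting comes in two forms, sibling (same level) and child (sub-level). Here is an example:
Sibling aggregations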
GET req_log/_search
{
"aggs": {
"path_count": {
"terms": {
"field": "path"
}
},
"method_count": {
"terms": {
"field": "method"
}
}
}
}
Response:
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 17,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "GUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/1",
"created" : "2023-02-09"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "GkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 30,
"method" : "GET",
"path" : "/api/post/2",
"created" : "2023-02-07"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "G0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 20,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-08"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HEK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 120,
"method" : "GET",
"path" : "/api/post/20",
"created" : "2023-02-06"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 150,
"method" : "GET",
"path" : "/api/post/1",
"created" : "2023-02-05"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-04"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "H0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 960,
"method" : "GET",
"path" : "/api/post/6",
"created" : "2023-02-03"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IEK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 9000,
"method" : "GET",
"path" : "/api/post/8",
"created" : "2023-02-02"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 1300,
"method" : "GET",
"path" : "/api/post/6",
"created" : "2023-02-01"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 400,
"method" : "GET",
"path" : "/api/post/4",
"created" : "2023-02-10"
}
}
]
},
"aggregations" : {
"method_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "GET",
"doc_count" : 17
}
]
},
"path_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "/api/post/3",
"doc_count" : 3
},
{
"key" : "/api/post/6",
"doc_count" : 3
},
{
"key" : "/api/post/1",
"doc_count" : 2
},
{
"key" : "/api/post/2",
"doc_count" : 2
},
{
"key" : "/api/post/4",
"doc_count" : 2
},
{
"key" : "/api/post/10",
"doc_count" : 1
},
{
"key" : "/api/post/12",
"doc_count" : 1
},
{
"key" : "/api/post/20",
"doc_count" : 1
},
{
"key" : "/api/post/7",
"doc_count" : 1
},
{
"key" : "/api/post/8",
"doc_count" : 1
}
]
}
}
}
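The result clearly shows the number of requests per method and per path
Child aggregations
Suppose I have the following requirements:
- the number of requests for each request method
- the number of requests for each path under each method
- the average request time for each path
Query example: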
GET req_log/_search
{
"aggs": {
"method_count": {
"terms": {
"field": "method"
},
"aggs": {
"path_count": {
"terms": {
"field": "path"
},
"aggs": {
"avg_times": {
"avg": {
"field": "times"
}
}
}
}
}
}
}
}
Response:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 17,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "GUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/1",
"created" : "2023-02-09"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "GkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 30,
"method" : "GET",
"path" : "/api/post/2",
"created" : "2023-02-07"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "G0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 20,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-08"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HEK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 120,
"method" : "GET",
"path" : "/api/post/20",
"created" : "2023-02-06"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 150,
"method" : "GET",
"path" : "/api/post/1",
"created" : "2023-02-05"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-04"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "H0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 960,
"method" : "GET",
"path" : "/api/post/6",
"created" : "2023-02-03"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IEK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 9000,
"method" : "GET",
"path" : "/api/post/8",
"created" : "2023-02-02"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 1300,
"method" : "GET",
"path" : "/api/post/6",
"created" : "2023-02-01"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 400,
"method" : "GET",
"path" : "/api/post/4",
"created" : "2023-02-10"
}
}
]
},
"aggregations" : {
"method_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "GET",
"doc_count" : 17,
"path_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "/api/post/3",
"doc_count" : 3,
"avg_times" : {
"value" : 63.0
}
},
{
"key" : "/api/post/6",
"doc_count" : 3,
"avg_times" : {
"value" : 1053.3333333333333
}
},
{
"key" : "/api/post/1",
"doc_count" : 2,
"avg_times" : {
"value" : 115.0
}
},
{
"key" : "/api/post/2",
"doc_count" : 2,
"avg_times" : {
"value" : 205.0
}
},
{
"key" : "/api/post/4",
"doc_count" : 2,
"avg_times" : {
"value" : 305.0
}
},
{
"key" : "/api/post/10",
"doc_count" : 1,
"avg_times" : {
"value" : 270.0
}
},
{
"key" : "/api/post/12",
"doc_count" : 1,
"avg_times" : {
"value" : 630.0
}
},
{
"key" : "/api/post/20",
"doc_count" : 1,
"avg_times" : {
"value" : 120.0
}
},
{
"key" : "/api/post/7",
"doc_count" : 1,
"avg_times" : {
"value" : 870.0
}
},
{
"key" : "/api/post/8",
"doc_count" : 1,
"avg_times" : {
"value" : 9000.0
}
}
]
}
}
]
}
}
}
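In the result, the buckets are nested according to the hierarchy: first by method, then by path, with the average time computed inside each path bucket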
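The same method, path, average hierarchy can be sketched in plain Python over the sample data. The `group` helper and the shape of `result` are hypothetical and only loosely mirror the response structure.

```python
from collections import defaultdict

# The 17 sample documents: path suffix and elapsed time, in insertion order.
post_ids = [1, 2, 3, 20, 1, 3, 6, 8, 6, 4, 3, 2, 10, 12, 4, 6, 7]
times = [80, 30, 20, 120, 150, 80, 960, 9000, 1300, 400, 89, 380, 270, 630, 210, 900, 870]
docs = [{"method": "GET", "path": f"/api/post/{i}", "times": t}
        for i, t in zip(post_ids, times)]

def group(documents, field):
    """Split documents into buckets keyed by the value of `field`."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc[field]].append(doc)
    return groups

result = {}
for method, method_docs in group(docs, "method").items():
    # Each method bucket is divided again by path; the metric runs per path bucket.
    result[method] = {
        "doc_count": len(method_docs),
        "path_count": {
            path: {
                "doc_count": len(path_docs),
                "avg_times": sum(d["times"] for d in path_docs) / len(path_docs),
            }
            for path, path_docs in group(method_docs, "path").items()
        },
    }

print(result["GET"]["path_count"]["/api/post/3"])  # {'doc_count': 3, 'avg_times': 63.0}
```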
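Filtering aggregations
Filtering both the query and the aggregation
This is the most common kind of filtering: both the query hits and the aggregation results are filtered. Just add a query at the same level as aggs. Queries were covered in the previous parts; here is an example: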
GET req_log/_search
{
"query": {
"constant_score": {
"filter": {
"term": {
"method": "POST"
}
}
}
},
"aggs": {
"path_count": {
"terms": {
"field": "path"
}
}
}
}
Response:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"path_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ ]
}
}
}
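The result is empty because we filtered on method and no document in the index has the method POST
Sometimes we want to compare one subset of the data against all documents. How? The global aggregation (global:{}) makes it easy to aggregate over every document, regardless of the query. Here is an example:
Get the average request time for path=/api/post/3 and the average request time across all requests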
GET req_log/_search
{
"query": {
"constant_score": {
"filter": {
"term": {
"path": "/api/post/3"
}
}
}
},
"aggs": {
"path_avg": {
"avg": {
"field": "times"
}
},
"all_order":{
"global": {},
"aggs": {
"all_avg": {
"avg": {
"field": "times"
}
}
}
}
}
}
Response:
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "G0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 20,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-08"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-04"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "I0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 89,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-11"
}
}
]
},
"aggregations" : {
"path_avg" : {
"value" : 63.0
},
"all_order" : {
"doc_count" : 17,
"all_avg" : {
"value" : 911.1176470588235
}
}
}
}
Comparing the two, requests to /api/post/3 are fast: they average 63, while the average across all requests is about 911
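The two averages are easy to check by hand. A small Python sketch over the sample data, where the variable names simply reuse the aggregation names from the request:

```python
# The 17 sample documents: path suffix and elapsed time, in insertion order.
post_ids = [1, 2, 3, 20, 1, 3, 6, 8, 6, 4, 3, 2, 10, 12, 4, 6, 7]
times = [80, 30, 20, 120, 150, 80, 960, 9000, 1300, 400, 89, 380, 270, 630, 210, 900, 870]
docs = [{"path": f"/api/post/{i}", "times": t} for i, t in zip(post_ids, times)]

def avg(documents, field):
    """Average of `field` over the given documents."""
    return sum(d[field] for d in documents) / len(documents)

# path_avg is scoped to the query hits; all_avg ignores the query, like global.
hits = [d for d in docs if d["path"] == "/api/post/3"]
path_avg = avg(hits, "times")
all_avg = avg(docs, "times")

print(path_avg, round(all_avg, 2))  # 63.0 911.12
```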
Filtering only the aggregation
Sometimes we do not need to filter the query hits, only the aggregation results. How? Here is another example:
Query the requests with method=GET and compute the average request time for path=/api/post/3
- filter
GET req_log/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "method": "GET"
        }
      }
    }
  },
  "aggs": {
    "req_count": {
      "filter": {
        "term": {
          "path": "/api/post/3"
        }
      },
      "aggs": {
        "req_path_order": {
          "terms": {
            "field": "path"
          },
          "aggs": {
            "avg_times": {
              "avg": {
                "field": "times"
              }
            }
          }
        }
      }
    }
  }
}
Response:
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 17,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "GUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/1",
"created" : "2023-02-09"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "GkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 30,
"method" : "GET",
"path" : "/api/post/2",
"created" : "2023-02-07"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "G0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 20,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-08"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HEK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 120,
"method" : "GET",
"path" : "/api/post/20",
"created" : "2023-02-06"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 150,
"method" : "GET",
"path" : "/api/post/1",
"created" : "2023-02-05"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-04"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "H0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 960,
"method" : "GET",
"path" : "/api/post/6",
"created" : "2023-02-03"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IEK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 9000,
"method" : "GET",
"path" : "/api/post/8",
"created" : "2023-02-02"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IUK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 1300,
"method" : "GET",
"path" : "/api/post/6",
"created" : "2023-02-01"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "IkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 400,
"method" : "GET",
"path" : "/api/post/4",
"created" : "2023-02-10"
}
}
]
},
"aggregations" : {
"req_count" : {
"doc_count" : 3,
"req_path_order" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "/api/post/3",
"doc_count" : 3,
"avg_times" : {
"value" : 63.0
}
}
]
}
}
}
}
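In the result, the hits are not narrowed by the path condition (all 17 GET documents match the query), while the aggregation covers only the 3 documents for /api/post/3
Filtering the query but not the aggregation
This is the opposite of the previous case: only the query hits are filtered, and the aggregation results are not. Here is an example:
Query the requests with path=/api/post/3 and method=GET, and aggregate the number of requests for each path
- post_filter
GET req_log/_search
{
  "aggs": {
    "path_count": {
      "terms": {
        "field": "path"
      }
    }
  },
  "post_filter": {
    "bool": {
      "must": [
        {
          "term": {
            "path": {
              "value": "/api/post/3"
            }
          }
        },
        {
          "term": {
            "method": {
              "value": "GET"
            }
          }
        }
      ]
    }
  }
}
Response: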
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "G0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 20,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-08"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "HkK3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 80,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-04"
}
},
{
"_index" : "req_log",
"_type" : "_doc",
"_id" : "I0K3NIYBdXrpvlCF01bz",
"_score" : 1.0,
"_source" : {
"times" : 89,
"method" : "GET",
"path" : "/api/post/3",
"created" : "2023-02-11"
}
}
]
},
"aggregations" : {
"path_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "/api/post/3",
"doc_count" : 3
},
{
"key" : "/api/post/6",
"doc_count" : 3
},
{
"key" : "/api/post/1",
"doc_count" : 2
},
{
"key" : "/api/post/2",
"doc_count" : 2
},
{
"key" : "/api/post/4",
"doc_count" : 2
},
{
"key" : "/api/post/10",
"doc_count" : 1
},
{
"key" : "/api/post/12",
"doc_count" : 1
},
{
"key" : "/api/post/20",
"doc_count" : 1
},
{
"key" : "/api/post/7",
"doc_count" : 1
},
{
"key" : "/api/post/8",
"doc_count" : 1
}
]
}
}
}
In the result, the hits contain only /api/post/3, while the aggregation was not filtered by that condition
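The difference between query, the filter aggregation, and post_filter comes down to which set of documents each stage sees. The plain-Python sketch below models that under simplifying assumptions; the `search` function and its parameters are hypothetical and are not an Elasticsearch API.

```python
from collections import Counter

# The 17 sample documents: path suffix, all with method GET.
post_ids = [1, 2, 3, 20, 1, 3, 6, 8, 6, 4, 3, 2, 10, 12, 4, 6, 7]
docs = [{"method": "GET", "path": f"/api/post/{i}"} for i in post_ids]

def search(documents, query=None, agg_filter=None, post_filter=None):
    """Return the hit count and per-path counts under the three kinds of filter."""
    # query narrows both the hits and the aggregation input.
    matched = [d for d in documents if query is None or query(d)]
    # A filter aggregation narrows only what the aggregation sees.
    agg_docs = [d for d in matched if agg_filter is None or agg_filter(d)]
    # post_filter narrows only the hits, after aggregations are computed.
    hits = [d for d in matched if post_filter is None or post_filter(d)]
    return {"hits": len(hits), "path_counts": dict(Counter(d["path"] for d in agg_docs))}

def is_get(doc):
    return doc["method"] == "GET"

def is_post3(doc):
    return doc["path"] == "/api/post/3"

filter_agg = search(docs, query=is_get, agg_filter=is_post3)
post_filtered = search(docs, post_filter=lambda d: is_post3(d) and is_get(d))

print(filter_agg)                                              # 17 hits, one path bucket
print(post_filtered["hits"], len(post_filtered["path_counts"]))  # 3 10
```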
Reposted from: Alibaba Cloud Community (阿里云社区), author 程序员皮卡秋 (WeChat official account 程序员皮卡秋)