背景
统计周期内累计实时订单量,要求每分钟一个点,展示累计订单量
数据表
Doris聚合模型表
CREATE TABLE rt_aggr_submit_cancel_order(
statusINT COMMENT '订单状态 1、提单 2、推单 4、接单 8、成单 9 已取消',
categoryINT COMMENT '订单分类,14010 ;14060',
minINT COMMENT '分钟',
hourINT COMMENT '小时',
dtINT COMMENT '日期',
real_priceDOUBLE SUM COMMENT '实付交易额',
order_idBIGINT SUM COMMENT '订单ID'
) AGGREGATE KEY(status,category,poiCate,shipping_service,min,hour,dt) COMMENT '订单数据'
PARTITION BY RANGE(dt)( PARTITION p20220804 VALUES LESS THAN ("20220804") )DISTRIBUTED BY HASH(min) BUCKETS 32 PROPERTIES ( "replication_num" = "3" )
解决方案
累计求和解决方案
sum over 窗口函数
select dt,hour,min,finish_cnt,
sum(finish_cnt) over ( order by dt,hour,min ) as finishcnt
from (
select dt,hour,min,
sum(order_id) as finish_cnt,
from rt_aggr_submit_cancel_order
where dt=20220803
-- status=1,状态过滤
and status=1
group by dt,hour,min,category
order by 1,2,3
) f order by 1,2,3
查询结果: finish_cnt表示当前分钟的统计结果,finishcnt表示累计到当前分钟的统计结果
分类别累计求和解决方案
sum over(partition by )
分类别分时累计求和:按分钟累计,并给出累计至当前时间点的和
select dt,hour,min,category,finish_cnt,
sum(finish_cnt) over ( PARTITION by category order by dt,hour,min ) as finishcnt
from (
select dt,hour,min,category,
sum(order_id) as finish_cnt
from rt_aggr_submit_cancel_order
where dt=20220803
-- status=1,状态过滤
and status=1
group by dt,hour,min,category
order by 1,2,3,4
) f order by 1,2,3,4
查询结果:
问题
统计出现断点
现象
连续时序数据中,出现断点统计的情况。如下图,没有第0分钟,第2分钟的数据。
原因
明细数据中,缺失了第0分钟,第2分钟的数据。里层的SQL查询出来的结果不存在0,2分钟的数据,导致外层累计求和时,第0分钟,第2分钟的统计结果出现断点。
解决方案
补全里面sql查询结果的时间。如何补全?找一个有这个时间点数据的表,join一下,就可以实现补全操作。具体SQL如下
select c.dt,c.hour,c.min,finish_cnt,
sum(finish_cnt) over ( order by c.dt,c.hour,c.min ) as finishcnt
from (
select dt,hour,min
from rt_aggr_submit_cancel_order
where dt=20220803
group by dt,hour,min
)c
left join (
select dt,hour,min,
sum(order_id) as finish_cnt
from rt_aggr_wm_finish_order_v2
where dt=20220803
-- status=1,状态过滤
and status=1
group by dt,hour,min
order by 1,2,3
)d on c.dt=d.dt and c.hour=d.hour and c.min=d.min
order by 1,2,3
查询结果
如图:没有数据的时间点,最终显示为null,我们可以处理一下,使数据更加人性化。
(case when cancelcnt is null then 0 else finishcnt end) order_cnt