MySQL 实现分时累计统计

744 阅读2分钟

背景

统计周期内累计实时订单量,要求每分钟一个点,展示累计订单量

数据表

Doris聚合模型表

CREATE TABLE rt_aggr_submit_cancel_order(
status INT COMMENT '订单状态 1、提单 2、推单 4、接单 8、成单 9 已取消',
category INT COMMENT '订单分类,14010 ;14060',
min INT COMMENT '分钟',
hour INT COMMENT '小时',
dt INT COMMENT '日期',
real_price DOUBLE SUM COMMENT '实付交易额',
order_id BIGINT SUM COMMENT '订单ID'
) AGGREGATE KEY(status,category,poiCate,shipping_service,min,hour,dt) COMMENT '订单数据'
PARTITION BY RANGE(dt)( PARTITION p20220804 VALUES LESS THAN ("20220804") )DISTRIBUTED BY HASH(min) BUCKETS 32 PROPERTIES ( "replication_num" = "3" )

解决方案

累计求和解决方案

sum over 窗口函数

select dt,hour,min,finish_cnt,
        sum(finish_cnt) over ( order by dt,hour,min ) as finishcnt
from (
        select dt,hour,min,
            sum(order_id) as finish_cnt,
        from rt_aggr_submit_cancel_order 
        where dt=20220803   
            -- status=1,状态过滤 
            and status=1 
        group by dt,hour,min,category
        order by 1,2,3
    ) f order by 1,2,3

查询结果: finish_cnt表示当前分钟的统计结果,finishcnt表示累计到当前分钟的统计结果

image.png

分类别累计求和解决方案

sum over(partition by )

分类别分时累计求和:按分钟累计,并给出累计至当前时间点的和

select dt,hour,min,category,finish_cnt,
        sum(finish_cnt) over ( PARTITION by category order by dt,hour,min ) as finishcnt
from (
        select dt,hour,min,category,
            sum(order_id) as finish_cnt
        from rt_aggr_submit_cancel_order 
        where dt=20220803   
            -- status=1,状态过滤 
            and status=1 
        group by dt,hour,min,category
        order by 1,2,3,4
    ) f order by 1,2,3,4

查询结果:

image.png

问题

统计出现断点

现象

连续时序数据中,出现断点统计的情况。如下图,没有第0分钟,第2分钟的数据。

image.png

原因

明细数据中,缺失了第0分钟,第2分钟的数据。里层的SQL查询出来的结果不存在0,2分钟的数据,导致外层累计求和时,第0分钟,第2分钟的统计结果出现断点。

解决方案

补全里面sql查询结果的时间。如何补全?找一个有这个时间点数据的表,join一下,就可以实现补全操作。具体SQL如下

select c.dt,c.hour,c.min,finish_cnt,
        sum(finish_cnt) over ( order by c.dt,c.hour,c.min ) as finishcnt
from (
	select dt,hour,min
	from rt_aggr_submit_cancel_order 
	where dt=20220803
	group by dt,hour,min
)c
left join (
	select dt,hour,min,
            sum(order_id) as finish_cnt
        from rt_aggr_wm_finish_order_v2 
        where dt=20220803   
        -- status=1,状态过滤 
            and status=1 
        group by dt,hour,min
        order by 1,2,3
)d on c.dt=d.dt and c.hour=d.hour and c.min=d.min
order by 1,2,3

查询结果

image.png

如图:没有数据的时间点,最终显示为null,我们可以处理一下,使数据更加人性化。

    (case when cancelcnt is null then 0 else finishcnt end) order_cnt

参考资料