SQL案例分析:商品销量排行榜

476 阅读3分钟

1.问题描述

  1. 某一时间段各个商品的销量排行。
  2. 销量飙升榜。

2.初始化脚本

 -- 创建示例表
 create table products(
   product_id integer not null primary key,
   product_name varchar(100) not null unique,
   product_subcategory varchar(100) not null,
   product_category varchar(100) not null
 );
 ​
 insert into products values(1, 'iPhone 11', '手机', '手机通讯');
 insert into products values(2, 'HUAWEI P40', '手机', '手机通讯');
 insert into products values(3, '小米10', '手机', '手机通讯');
 insert into products values(4, 'OPPO Reno4', '手机', '手机通讯');
 insert into products values(5, 'vivo Y70s', '手机', '手机通讯');
 insert into products values(6, '海尔BCD-216STPT', '冰箱', '大家电');
 insert into products values(7, '康佳BCD-155C2GBU', '冰箱', '大家电');
 insert into products values(8, '容声BCD-529WD11HP', '冰箱', '大家电');
 insert into products values(9, '美的BCD-213TM(E)', '冰箱', '大家电');
 insert into products values(10, '格力BCD-230WETCL', '冰箱', '大家电');
 insert into products values(11, '格力KFR-35GW', '空调', '大家电');
 insert into products values(12, '美的KFR-35GW', '空调', '大家电');
 insert into products values(13, 'TCLKFRd-26GW', '空调', '大家电');
 insert into products values(14, '奥克斯KFR-35GW', '空调', '大家电');
 insert into products values(15, '海尔KFR-35GW', '空调', '大家电');
 ​
 create table sales(
   product_id integer not null,
   sale_time timestamp not null,
   quantity integer not null
 );
 ​
 -- 生成模拟销量数据
 insert into sales
 with recursive s(product_id, sale_time, quantity) as (
   select product_id, '2022-04-01 00:00:00', floor(10*rand(0)) from products
   union all
   select product_id, sale_time + interval 1 minute, floor(10*rand(0))
   from s 
   where sale_time < '2022-04-01 10:00:00'
 )
 select * from s;

3.思路分析

1.某一时间段各个商品的销量排行
  1. 首先查询出某一时间段某商品的销售情况,以9点钟为例

     select s.product_id ,date_format(s.sale_time,'%Y%m%d%H') ymdh  ,sum(s.quantity)  
     from sales s where s.sale_time between '2022-04-01 09:00:00' and '2022-04-01 09:59:59'
     group by product_id,ymdh
    
  2. 以此为基础,通过rank()函数进行排行。

     with hourly_sales (product_id,ymdh,quantity) as(
     select s.product_id ,date_format(s.sale_time,'%Y%m%d%H') ymdh  ,sum(s.quantity)  
     from sales s where s.sale_time between '2022-04-01 09:00:00' and '2022-04-01 09:59:59'
     group by product_id,ymdh),
     hourly_rank as (
     select P.product_id,hs .quantity ,p .product_category ,p.product_name ,p.product_subcategory ,rank() over(partition by ymdh,p.product_category  ,p.product_subcategory order by hs .quantity desc) AS rk
     from hourly_sales hs
     left join products p on p.product_id =hs .product_id )
     select * from hourly_rank
    
2.销量飙升榜
  1. 在第一个问题的基础上进行改进,例如筛选出8-9点商品销量排行。

     with hourly_sales (product_id,ymdh,quantity) as(
     select s.product_id ,date_format(s.sale_time,'%Y%m%d%H') ymdh  ,sum(s.quantity)  
     from sales s where s.sale_time between '2022-04-01 08:00:00' and '2022-04-01 09:59:59'
     group by product_id,ymdh),
     hourly_rank as (
     select P.product_id,hs .quantity ,p .product_category ,p.product_name ,hs.ymdh  ,rank() over(partition by ymdh,p.product_category  ,p.product_subcategory order by hs .quantity desc) AS rk
     from hourly_sales hs
     left join products p on p.product_id =hs .product_id )
     select * from hourly_rank;
    
  2. 在此基础上可以通过lag函数进行处理,来对比9点某商品的排名是上升还是下降

     with hourly_sales (product_id,ymdh,quantity) as(
     select s.product_id ,date_format(s.sale_time,'%Y%m%d%H') ymdh  ,sum(s.quantity)  
     from sales s where s.sale_time between '2022-04-01 08:00:00' and '2022-04-01 09:59:59'
     group by product_id,ymdh),
     hourly_rank as (
     select P.product_id,hs .quantity ,p .product_category ,p.product_name ,hs.ymdh  ,rank() over(partition by ymdh,p.product_category  ,p.product_subcategory order by hs .quantity desc) AS rk
     from hourly_sales hs
     left join products p on p.product_id =hs .product_id ),
     rank_gain as(
     select product_name,product_category,ymdh,quantity,rk,
     lag(rk,1) over(partition by product_name,product_category order by ymdh) pre_rk,
     (ifnull(lag(rk,1) over(partition by product_name,product_category order by ymdh),9999)-rk)/rk as gain 
     from hourly_rank )
     select * ,rank () over (partition by product_category order by gain desc) gain_rk 
     from rank_gain 
     where pre_rk is not null
    

    这里需要注意两个null的处理

    1. pre_rk值为空时,说明时间段是8点,可以通过pre_rk is not null来进行过滤,从而只得到9点时间段商品的信息。
    2. gain字段,在该商品八点时间段时使用lag函数得到的pre_rk值为null,在与rk比较时应给一个默认值。