今天也是来自牛客的试题。
SQL143每份试卷每月作答数和截止当月的作答总数
题目链接:每份试卷每月作答数和截止当月的作答总数。_牛客题霸_牛客网 (nowcoder.com)
题干信息
现有试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分):
| id | uid | exam_id | start_time | submit_time | score |
|---|---|---|---|---|---|
| 1 | 1001 | 9001 | 2020-01-01 09:01:01 | 2020-01-01 09:21:59 | 90 |
| 2 | 1002 | 9001 | 2020-01-20 10:01:01 | 2020-01-20 10:10:01 | 89 |
| 3 | 1002 | 9001 | 2020-02-01 12:11:01 | 2020-02-01 12:31:01 | 83 |
| 4 | 1003 | 9001 | 2020-03-01 19:01:01 | 2020-03-01 19:30:01 | 75 |
| 5 | 1004 | 9001 | 2020-03-01 12:01:01 | 2020-03-01 12:11:01 | 60 |
| 6 | 1003 | 9001 | 2020-03-01 12:01:01 | 2020-03-01 12:41:01 | 90 |
| 7 | 1002 | 9001 | 2020-05-02 19:01:01 | 2020-05-02 19:32:00 | 90 |
| 8 | 1001 | 9002 | 2020-01-02 19:01:01 | 2020-01-02 19:59:01 | 69 |
| 9 | 1004 | 9002 | 2020-02-02 12:01:01 | 2020-02-02 12:20:01 | 99 |
| 10 | 1003 | 9002 | 2020-02-02 12:01:01 | 2020-02-02 12:31:01 | 68 |
| 11 | 1001 | 9002 | 2020-02-02 12:01:01 | 2020-02-02 12:43:01 | 81 |
| 12 | 1001 | 9002 | 2020-03-02 12:11:01 | (NULL) | (NULL) |
请输出每份试卷每月作答数和截止当月的作答总数。
由示例数据结果输出如下:
| exam_id | start_month | month_cnt | cum_exam_cnt |
|---|---|---|---|
| 9001 | 202001 | 2 | 2 |
| 9001 | 202002 | 1 | 3 |
| 9001 | 202003 | 3 | 6 |
| 9001 | 202005 | 1 | 7 |
| 9002 | 202001 | 1 | 1 |
| 9002 | 202002 | 3 | 4 |
| 9002 | 202003 | 1 | 5 |
解释:试卷9001在202001、202002、202003、202005共4个月有被作答记录,每个月被作答数分别为2、1、3、1,截止当月累积作答总数为2、3、6、7。
测试数据
drop table if exists exam_record;
CREATE TABLE exam_record (
id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
uid int NOT NULL COMMENT '用户ID',
exam_id int NOT NULL COMMENT '试卷ID',
start_time datetime NOT NULL COMMENT '开始时间',
submit_time datetime COMMENT '提交时间',
score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;
INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2020-01-01 09:01:01', '2020-01-01 09:21:59', 90),
(1002, 9001, '2020-01-20 10:01:01', '2020-01-20 10:10:01', 89),
(1002, 9001, '2020-02-01 12:11:01', '2020-02-01 12:31:01', 83),
(1003, 9001, '2020-03-01 19:01:01', '2020-03-01 19:30:01', 75),
(1004, 9001, '2020-03-01 12:01:01', '2020-03-01 12:11:01', 60),
(1003, 9001, '2020-03-01 12:01:01', '2020-03-01 12:41:01', 90),
(1002, 9001, '2020-05-02 19:01:01', '2020-05-02 19:32:00', 90),
(1001, 9002, '2020-01-02 19:01:01', '2020-01-02 19:59:01', 69),
(1004, 9002, '2020-02-02 12:01:01', '2020-02-02 12:20:01', 99),
(1003, 9002, '2020-02-02 12:01:01', '2020-02-02 12:31:01', 68),
(1001, 9002, '2020-02-02 12:01:01', '2020-02-02 12:43:01', 81),
(1001, 9002, '2020-03-02 12:11:01', null, null);
题干分析
这道题不难,先分组统计每个月份的答题数量(注意需要过滤掉未提交的试卷记录),然后再按照分组累加每月的答题数量。
累加SQL代码可以参考下这个博客: blog.csdn.net/weixin_3614…
作答
先分组统计各个月份的答题数量
select
distinct exam_id,
date_format(start_time, "%Y%m") as start_month,
count(start_time) over(partition by exam_id, date_format(start_time, "%Y%m")) as month_cnt,
from exam_record
;
如果基于上述的临时表进行分组累加会发现最终的结果是无法满足需求的(这里代码我已经删掉了)。然后我构造了一个序号列,用于辅助进行分组累加操作。如下:
select t.exam_id, t.start_month, t.month_cnt, row_number() over(partition by t.exam_id) as rn
from
(select
distinct exam_id,
date_format(start_time, "%Y%m") as start_month,
count(start_time) over(partition by exam_id, date_format(start_time, "%Y%m")) as month_cnt
from exam_record) as t
;
然后就可以计算最终的结果了
with temp as (
select t.exam_id, t.start_month, t.month_cnt, row_number() over(partition by t.exam_id) as rn
from
(select
distinct exam_id,
date_format(start_time, "%Y%m") as start_month,
count(start_time) over(partition by exam_id, date_format(start_time, "%Y%m")) as month_cnt
from exam_record) as t
)
select
tb.exam_id,
tb.start_month,
tb.month_cnt,
(select sum(month_cnt) from temp where exam_id=tb.exam_id and rn<=tb.rn) as cum_exam_cnt
from temp as tb
;