来自牛客的窗口函数SQL练习题。 近三个月未完成试卷数为0的用户完成情况_牛客题霸_牛客网 (nowcoder.com) SQL139 近三个月未完成试卷数为0的用户完成情况
题干
现有试卷作答记录表exam_record(uid:用户ID, exam_id:试卷ID, start_time:开始作答时间, submit_time:交卷时间,为空的话则代表未完成, score:得分):
| id | uid | exam_id | start_time | submit_time | score |
|---|---|---|---|---|---|
| 1 | 1006 | 9003 | 2021-09-06 10:01:01 | 2021-09-06 10:21:02 | 84 |
| 2 | 1006 | 9001 | 2021-08-02 12:11:01 | 2021-08-02 12:31:01 | 89 |
| 3 | 1006 | 9002 | 2021-06-06 10:01:01 | 2021-06-06 10:21:01 | 81 |
| 4 | 1006 | 9002 | 2021-05-06 10:01:01 | 2021-05-06 10:21:01 | 81 |
| 5 | 1006 | 9001 | 2021-05-01 12:01:01 | (NULL) | (NULL) |
| 6 | 1001 | 9001 | 2021-09-05 10:31:01 | 2021-09-05 10:51:01 | 81 |
| 7 | 1001 | 9003 | 2021-08-01 09:01:01 | 2021-08-01 09:51:11 | 78 |
| 8 | 1001 | 9002 | 2021-07-01 09:01:01 | 2021-07-01 09:31:00 | 81 |
| 9 | 1001 | 9002 | 2021-07-01 12:01:01 | 2021-07-01 12:31:01 | 81 |
| 10 | 1001 | 9002 | 2021-07-01 12:01:01 | (NULL) | (NULL) |
找到每个人近三个有试卷作答记录的月份中没有试卷是未完成状态的用户的试卷作答完成数,按试卷完成数和用户ID降序排名。由示例数据结果输出如下:
| uid | exam_complete_cnt |
|---|---|
| 1006 | 3 |
解释:用户1006近三个有作答试卷的月份为202109、202108、202106,作答试卷数为3,全部完成;用户1001近三个有作答试卷的月份为202109、202108、202107,作答试卷数为5,完成试卷数为4,因为有未完成试卷,故过滤掉。
测试数据
drop table if exists exam_record;
CREATE TABLE exam_record (
id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
uid int NOT NULL COMMENT '用户ID',
exam_id int NOT NULL COMMENT '试卷ID',
start_time datetime NOT NULL COMMENT '开始时间',
submit_time datetime COMMENT '提交时间',
score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;
INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1006, 9003, '2021-09-06 10:01:01', '2021-09-06 10:21:02', 84),
(1006, 9001, '2021-08-02 12:11:01', '2021-08-02 12:31:01', 89),
(1006, 9002, '2021-06-06 10:01:01', '2021-06-06 10:21:01', 81),
(1006, 9002, '2021-05-06 10:01:01', '2021-05-06 10:21:01', 81),
(1006, 9001, '2021-05-01 12:01:01', null, null),
(1001, 9001, '2021-09-05 10:31:01', '2021-09-05 10:51:01', 81),
(1001, 9003, '2021-08-01 09:01:01', '2021-08-01 09:51:11', 78),
(1001, 9002, '2021-07-01 09:01:01', '2021-07-01 09:31:00', 81),
(1001, 9002, '2021-07-01 12:01:01', '2021-07-01 12:31:01', 81),
(1001, 9002, '2021-07-01 12:01:01', null, null);
题干分析
用户在最近作答的3个月内没有未提交的试卷的用户,统计该用户3个月内的作答试卷个数。 按试卷完成数和用户ID降序排序。
作答
-- 按照用户作答试卷的月份进行降序分组排序
select
uid,
exam_id,
submit_time,
dense_rank() over(partition by uid order by date_format(start_time, "%Y%m") desc) as ranking
from exam_record
;
-- 过滤出近3个月的数据
select * from
(select
uid,
exam_id,
submit_time,
dense_rank() over(partition by uid order by date_format(start_time, "%Y%m") desc) as ranking
from exam_record) as t where ranking<=3
;
-- 这里应该将存在null的对应的uid去掉的,但是这里不好处理,而且用where submit_time is not null只过滤掉一行数据,不符合要求。
-- 所以这里建了一个临时表,用临时表来做跳板,避免过度的嵌套影响代码的可读性
-- 最终的答案:
-- 创建临时表
with temp as (
select
uid,
exam_id,
submit_time,
dense_rank() over(partition by uid order by date_format(start_time, "%Y%m") desc) as ranking
from exam_record)
-- 解答sql
select
uid,
count(exam_id) as exam_complete_cnt
from temp
where
ranking<=3
and
-- 查询出存在null的对应的uid,然后通过not in的方式过滤掉这些uid
uid not in (select uid from temp where ranking<=3 and submit_time is null)
group by uid
order by exam_complete_cnt desc, uid desc
;
这里的窗口函数只用到了排序,具体row_number、rank、dense_rank这三个函数的区别可以看下这个博客: zhuanlan.zhihu.com/p/514345120