牛客SQL.1.29

119 阅读2分钟

「这是我参与11月更文挑战的第13天,活动详情查看:2021最后一次更文挑战

描述

题目:现在运营想要查看用户在某天刷题后第二天还会再来刷题的平均概率。请你取出相应数据。

示例:question_practice_detail

img

根据示例,你的查询应返回以下结果:

img

示例1

drop table if exists `user_profile`;
drop table if  exists `question_practice_detail`;
drop table if  exists `question_detail`;
CREATE TABLE `user_profile` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`gender` varchar(14) NOT NULL,
`age` int ,
`university` varchar(32) NOT NULL,
`gpa` float,
`active_days_within_30` int ,
`question_cnt` int ,
`answer_cnt` int 
);
CREATE TABLE `question_practice_detail` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`question_id`int NOT NULL,
`result` varchar(32) NOT NULL,
`date` date NOT NULL
);
CREATE TABLE `question_detail` (
`id` int NOT NULL,
`question_id`int NOT NULL,
`difficult_level` varchar(32) NOT NULL
);
​
INSERT INTO user_profile VALUES(1,2138,'male',21,'北京大学',3.4,7,2,12);
INSERT INTO user_profile VALUES(2,3214,'male',null,'复旦大学',4.0,15,5,25);
INSERT INTO user_profile VALUES(3,6543,'female',20,'北京大学',3.2,12,3,30);
...
INSERT INTO question_practice_detail VALUES(1,2138,111,'wrong','2021-05-03');
INSERT INTO question_practice_detail VALUES(2,3214,112,'wrong','2021-05-09');
INSERT INTO question_practice_detail VALUES(3,3214,113,'wrong','2021-06-15');
...
INSERT INTO question_detail VALUES(1,111,'hard');
INSERT INTO question_detail VALUES(2,112,'medium');
INSERT INTO question_detail VALUES(3,113,'easy');

0.3000

方法一

题目所求第二天刷题的平均概率,首先我们先将表中的所有数据看成是第一天刷题,在此基础上,求出每条记录是否第二天有做题记录,并用外连接将记录连接起来;若确实存在则不为NULL那么用COUNT统计时,统计一个具体的字段时,NULL字段是不会被统计,有值则会被统计进去;SQL语句如下:

SELECT date1, date2, device_id AS avg_ret
FROM (
    SELECT DISTINCT qpd.device_id, qpd.date AS date1, uni_qpd.date AS date2
    FROM (
        question_practice_detail AS qpd
        LEFT JOIN 
        (
            SELECT device_id, date FROM question_practice_detail
        ) AS uni_qpd
        ON qpd.device_id=uni_qpd.device_id AND DATE_ADD(qpd.date, INTERVAL 1 DAY)=uni_qpd.date
    )  
) AS last_t

运行效果

image-20211114225549348

执行过程为:以qpd驱动表,对于该表采用的是全表扫描,对于被驱动表uni_qpd采用同样是全表扫描,最终的开销为56.4;