「这是我参与11月更文挑战的第13天,活动详情查看:2021最后一次更文挑战」
描述
题目:现在运营想要查看用户在某天刷题后第二天还会再来刷题的平均概率。请你取出相应数据。
示例:question_practice_detail
根据示例,你的查询应返回以下结果:
示例1
drop table if exists `user_profile`;
drop table if exists `question_practice_detail`;
drop table if exists `question_detail`;
CREATE TABLE `user_profile` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`gender` varchar(14) NOT NULL,
`age` int ,
`university` varchar(32) NOT NULL,
`gpa` float,
`active_days_within_30` int ,
`question_cnt` int ,
`answer_cnt` int
);
CREATE TABLE `question_practice_detail` (
`id` int NOT NULL,
`device_id` int NOT NULL,
`question_id`int NOT NULL,
`result` varchar(32) NOT NULL,
`date` date NOT NULL
);
CREATE TABLE `question_detail` (
`id` int NOT NULL,
`question_id`int NOT NULL,
`difficult_level` varchar(32) NOT NULL
);
INSERT INTO user_profile VALUES(1,2138,'male',21,'北京大学',3.4,7,2,12);
INSERT INTO user_profile VALUES(2,3214,'male',null,'复旦大学',4.0,15,5,25);
INSERT INTO user_profile VALUES(3,6543,'female',20,'北京大学',3.2,12,3,30);
...
INSERT INTO question_practice_detail VALUES(1,2138,111,'wrong','2021-05-03');
INSERT INTO question_practice_detail VALUES(2,3214,112,'wrong','2021-05-09');
INSERT INTO question_practice_detail VALUES(3,3214,113,'wrong','2021-06-15');
...
INSERT INTO question_detail VALUES(1,111,'hard');
INSERT INTO question_detail VALUES(2,112,'medium');
INSERT INTO question_detail VALUES(3,113,'easy');
0.3000
方法一
题目所求第二天刷题的平均概率,首先我们先将表中的所有数据看成是第一天刷题,在此基础上,求出每条记录是否第二天有做题记录,并用外连接将记录连接起来;若确实存在则不为NULL那么用COUNT统计时,统计一个具体的字段时,NULL字段是不会被统计,有值则会被统计进去;SQL语句如下:
SELECT date1, date2, device_id AS avg_ret
FROM (
SELECT DISTINCT qpd.device_id, qpd.date AS date1, uni_qpd.date AS date2
FROM (
question_practice_detail AS qpd
LEFT JOIN
(
SELECT device_id, date FROM question_practice_detail
) AS uni_qpd
ON qpd.device_id=uni_qpd.device_id AND DATE_ADD(qpd.date, INTERVAL 1 DAY)=uni_qpd.date
)
) AS last_t
运行效果
执行过程为:以qpd驱动表,对于该表采用的是全表扫描,对于被驱动表uni_qpd采用同样是全表扫描,最终的开销为56.4;