LeetCode--1107. 每日新用户统计

110 阅读2分钟

1. 题目描述

Traffic 表:

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| activity      | enum    |
| activity_date | date    |
+---------------+---------+

该表可能有重复的行 activity 列是 ENUM 类型, 可能取 ('login', 'logout', 'jobs', 'groups', 'homepage') 几个值之一 编写解决方案, 找出从今天起最多90天内, 每个日期该日期首次登录的用户数, 假设今天是2019-06-30

2. 测试用例

输入:

Traffic 表:

+---------+----------+---------------+
| user_id | activity | activity_date |
+---------+----------+---------------+
| 1       | login    | 2019-05-01    |
| 1       | homepage | 2019-05-01    |
| 1       | logout   | 2019-05-01    |
| 2       | login    | 2019-06-21    |
| 2       | logout   | 2019-06-21    |
| 3       | login    | 2019-01-01    |
| 3       | jobs     | 2019-01-01    |
| 3       | logout   | 2019-01-01    |
| 4       | login    | 2019-06-21    |
| 4       | groups   | 2019-06-21    |
| 4       | logout   | 2019-06-21    |
| 5       | login    | 2019-03-01    |
| 5       | logout   | 2019-03-01    |
| 5       | login    | 2019-06-21    |
| 5       | logout   | 2019-06-21    |
+---------+----------+---------------+

输出:

+------------+-------------+
| login_date | user_count  |
+------------+-------------+
| 2019-05-01 | 1           |
| 2019-06-21 | 2           |
+------------+-------------+

解释:

请注意,我们只关心用户数非零的日期 ID 为 5 的用户第一次登陆于 2019-03-01, 因此他不算在 2019-06-21 的的统计内

3. 解题思路

  1. 查询每个用户最终登录日期
select user_id, min(activity_date) as first_login_date
from Traffic
where activity = 'login'
group by user_id

查询结果

+-------+----------------+
|user_id|first_login_date|
+-------+----------------+
|1      |2019-05-01      |
|2      |2019-06-21      |
|3      |2019-01-01      |
|4      |2019-06-21      |
|5      |2019-03-01      |
+-------+----------------+
  1. 查找用户的最早登录日期在2019-06-3090天内, 需要使用到*DATEDIFF函数*, 然后按照登录日期分组统计用户数量
select t.first_login_date as login_date, count(t.user_id) as user_count
from (select user_id, min(activity_date) as first_login_date
      from Traffic
      where activity = 'login'
      group by user_id) as t
where DATEDIFF('2019-06-30', t.first_login_date) <= 90
group by t.first_login_date;

查询结果

+----------+----------+
|login_date|user_count|
+----------+----------+
|2019-05-01|     1    |
|2019-06-21|     2    |
+----------+----------+