数据
第一列是登录日期,第二列是登录用户名
查询连续3天登录的用户有几个
(开发软件:datagrip)
思路一
- 按照name分组,order by日期排序作为rn(row_number)
- 计算临时日期date2 = date - rn,date2有几个相同,说明有几天连续
- 统计相同用户且相同date2的个数记录为cnt
- 筛选个数cnt >= 3的组,cnt = 3说明该name有3天连续登录
--核心代码
distinct
-> row_number
-> date_sub(dt,rn) as dt2
-> group by dt2,name
-> having count(1)>=N天
-> distinct name 明细
-> count(name) 人数
题目一 OPPO
查询连续三天登录的人员姓名
create or replace temporary view
game(name,date) as values
('张三','2021-01-01'),
('张三','2021-01-02'),
('张三','2021-01-03'),
('张三','2021-01-02'),
('李四','2021-01-01'),
('李四','2021-01-02'),
('王五','2021-01-03'),
('王五','2021-01-02'),
('王五','2021-01-02');
思路
需求:人员明细 易错点:
- 必须保证同一日期下只有一个name 如果出现同一天内有同一用户的多条记录,则需要去重
- with t1 as:子查询
with t1 as (
select distinct name,`date` from game
)
select * from t1
- 增加rn字段,分组编号日期
with t1 as (
select distinct name,`date` from game
),
t2 as (
select *,
row_number() over (partion by name order by `data`) as rn
from t1
)
select * from t2
- 计算data2 = data_sub(
date,rn)
with t1 as (
select distinct name,`date` from game
),
t2 as (
select *,
row_number() over (partion by name order by `data`) as rn
from t1
)
t3 as (
select *,
data_sub(`date`,rn) as temp
from t2
)
select * from t3
- 相同的name和temp是一组,count(*)
with t1 as (
select distinct name,`date` from game
),
t2 as (
select *,
row_number() over (partion by name order by `data`) as rn
from t1
)
t3 as (
select *,
data_sub(`date`,rn) as temp
from t2
)
t4 as (
select name,temp,
count(*) as cnt
from t3
group by name,temp
)
select * from t4
- 筛选出cnt大于3的name
with t1 as (
select distinct name,`date` from game
),
t2 as (
select *,
row_number() over (partion by name order by `data`) as rn
from t1
)
t3 as (
select *,
data_sub(`date`,rn) as temp
from t2
)
t4 as (
select name,temp,
count(*) as cnt
from t3
group by name,temp
)
t5 as (
select distinct name,
where cnt >= 3
from t4
)
select * from t5
简化版
with t1 as ( select distinct name,date from game),
t2 as ( select *, (row_number() over (partition by name order by date) rn
from t1) ,
t3 as ( select *,date_sub(date,rn) date2 from t2 )
select distinct name from t3 group by name,date2 having count(1)>=3;
思路二
- N是连续天数
- lag(repay_amount,n,0) over(order by cur_stage) 是根据字段cur_stage排序,看cur_stage的前n期,repay_amount, 0为缺省值,默认为null
- lead(a,n,default = null) over(order by B) 是根据字段B看后n期