背景
工作中经常遇见求top N或者top N%的需求,下面针对这些做个小小的总结
数据准备
``create table t_score`
(
``student_id varchar(``10``) comment ``'学生ID'``,
``course_id varchar(``10``) comment ``'课程ID'``,
``course_score decimal(``18``, ``1``) comment ``'课程成绩'
);
insert into t_score
values (``'01'``, ``'01'``, ``80``);
insert into t_score
values (``'01'``, ``'02'``, ``90``);
insert into t_score
values (``'01'``, ``'03'``, ``99``);
insert into t_score
values (``'02'``, ``'01'``, ``70``);
insert into t_score
values (``'02'``, ``'02'``, ``60``);
insert into t_score
values (``'02'``, ``'03'``, ``80``);
insert into t_score
values (``'03'``, ``'01'``, ``80``);
insert into t_score
values (``'03'``, ``'02'``, ``80``);
insert into t_score
values (``'03'``, ``'03'``, ``80``);
insert into t_score
values (``'04'``, ``'01'``, ``50``);
insert into t_score
values (``'04'``, ``'02'``, ``30``);
insert into t_score
values (``'04'``, ``'03'``, ``20``);
insert into t_score
values (``'05'``, ``'01'``, ``76``);
insert into t_score
values (``'05'``, ``'02'``, ``87``);
insert into t_score
values (``'06'``, ``'01'``, ``31``);
insert into t_score
values (``'06'``, ``'03'``, ``34``);
insert into t_score
values (``'07'``, ``'02'``, ``89``);
insert into t_score
`values ('07', '03', 98);``
TOP1 求每门课程中成绩最好的人
1:利用表链接
``SELECT a.*, b.max_score`
from t_score a
``INNER JOIN (
``SELECT course_id, max(course_score) max_score
``from t_score
``GROUP BY course_id) b
` on a.course_id = b.course_id and a.course_score = b.max_score;
这种办法在效率上还算比较高
2:使用子查询
``SELECT *`
from t_score a
`where course_score = (SELECT max(course_score) from t_score where a.course_id = course_id);`` 虽然看起来简洁,但是数据量大的时候会很慢
3:使用exists
``SELECT *`
from t_score a
where not EXISTS
``(SELECT ``1 `from t_score where a.course_score < course_score and a.course_id = course_id);``
使用这个方法效率还可以,比较执行的次数 = count(*) from t_score 但是这种方法一定要保证,对比的列不为NULL,否则不能返回正确答案。
上面三种方法都可以考虑进去分数重复的情况。
4:窗口函数
#mysql 8开始支持
#mysql 8``开始支持`
SELECT student_id, course_id, course_score
from (
``SELECT student_id,
``course_id,
``course_score,
``dense_rank() over (partition by student_id ORDER BY course_score desc) as ranking
``from t_score) ``new
`where ranking = 1;``
因为成绩表有重复,这里只能用dense_rank() 和rank()
TOP N 求每门课程成绩前两人
1:使用自身左链接
``SELECT student_id, course_id, course_score`
from (
``SELECT a.*
``from t_score a
``left JOIN t_score b
``on a.course_id = b.course_id and a.course_score < b.course_score
``ORDER BY a.course_score DESC) ``new
GROUP BY student_id, course_id, course_score
HAVING count(course_id) < ``2
`ORDER BY course_id;``
2:使用子查询
``SELECT *`
from t_score a
where ``2 > (SELECT count(*) from t_score where a.course_id = course_id and a.course_score < course_score)
`ORDER BY course_id`` 数据量大的情况就会很慢
3:使用窗口函数
``SELECT student_id, course_id, course_score`
from (
``SELECT student_id,
``course_id,
``course_score,
``rank() over (PARTITION by course_id ORDER BY course_score DESC) as ranking
``from t_score) b
`where ranking <= 2;``
写法与TOP1的写法一样,就是where后面跟的条件要变一下。
三种排名的方式
``SELECT student_id as Sid,`
``course_id as Cid,
``course_score as score,
``row_number() over (partition by course_id ORDER BY course_score desc) as row_number2,
``rank() over (partition by course_id ORDER BY course_score desc) as rank2,
``dense_rank() over (partition by course_id ORDER BY course_score desc) as dense_rank2
from t_score
`where course_score != '03';``
上图中右侧3列就是3中不同的排名方式,对应sql窗口函数的三种
row_number : 就是按1-2-3顺序排
rank:就是按 1-1-3 排列--这就是 美国式排名
dense_rank:就是按 1-1-2-2-3排列--这就是中国式排名
row_number(): 不间断,序号不重复,如1、2、3、4、5
rank(): 间断,相同值同序号,如 1、1、3、4、5
dense_rank(): 不间断,相同值同序号,如1、1、2、3、4
请牢记这三种排名哦!!!
你的点赞和评论是作者继续下去的动力