SQL窗口函数

238 阅读3分钟

语法

关键子
over 用于为行定义一个“窗口”
选项:选项用以改变进行聚合运算的窗口范围,有如下关键字
partition by,order by,rows

rows含义如下

rows between 2 preceding and current row -- 取前两行与当前行之间
rows between unbounded preceding and current row -- 取第一行到当前行
rows between current row and 2 following -- 取当前行与后两行之间
rows between current row and unbounded following --取当前行与最后一行之间
rows between 2 preceding and 1 following --取当前行前2行与当前行后1行之间,包括当前行

当前行的概念,生成表格时,生成到哪一行,该行就是当前行

使用

使用如下 学生表

学号班级成绩
0001186
0002195
0003289
0004183
0005286
0006392
0007386
0008188
需求:按班级分组排名,并统计每个班级累计总成绩
-- 函数名(列) OVER(选项)
select *,
rank() over (PARTITION BY cid ORDER BY score asc) as ranking,
sum(score) over (rows between unbounded preceding and current row) as score_total
from class

结果如下:

学号班级成绩rankingscore_total
0004183183
00011862169
00081883257
00021954352
0005286186
00032892175
0007386186
00063922178
rank dense_rank row_number 的区别

在表中向一班中添加多一名也靠了86分的同学,如下

学号班级成绩
0001186
0002195
0003289
0004183
0005286
0006392
0007386
0008188
0009186
select *,
rank() over (PARTITION BY cid ORDER BY score asc) as ranking,
dense_rank() over (PARTITION BY cid ORDER BY score asc) as d_ranking,
row_number() over (PARTITION BY cid ORDER BY score asc) as r_ranking,
sum(score) over (rows between unbounded preceding and current row) as score_total
from class

结果如下:(省略2,3班方便观察)

学号班级成绩rankingd_rankingr_rankingscore_total
000418311183
0001186222169
0009186333255
0008188334343
0002195545438
结论:
rank 连续两个3排名后直接到5,跳过了4
rank 连续两个3排名后到4,没有跳号(正常人排名)
row_number 1-5连续排名,不分先后,相当于加一个行号

通常窗口函数会搭配聚合函数使用,常用聚合函数有

常用聚合函数
(1row_number() over(partition byorder by …)增加一列,类似与增加伪列
(2rank() over(partition byorder by …)
(3dense_rank() over(partition byorder by …)
rank(): 跳跃排序,如果有两个第一级时,接下来就是第三级。
dense_rank(): 连续排序,如果有两个第一级时,接下来仍然是第二级。
(4count() over(partition byorder by …)
(5max() over(partition byorder by …)
(6min() over(partition byorder by …)
(7sum() over(partition byorder by …)
(8avg() over(partition byorder by …)
(9first_value() over(partition byorder by …)
(10last_value() over(partition byorder by …)
与函数的功能一致,只是是开窗函数
(11lag() over(partition byorder by …)
(12lead() over(partition byorder by …)
lag 和lead 可以 获取结果集中,按一定排序所排列的当前行的上下相邻若干offset 的某个行的某个列(不用结果集的自关联);
lag ,lead 分别是向前,向后;
lag 和lead 有三个参数,第一个参数是列名,第二个参数是偏移的offset,第三个参数是 超出记录窗口时的默认值