SQL Server 开发系列(第五期):窗口函数——轻松实现排名、累计、移动计算
上一期我们深入了 JOIN 和子查询。这一期,我们来到 SQL Server 开发中最强大也最优雅的工具——窗口函数。你是否曾为“分组内排序后取前 N 条”写过复杂的子查询?为“计算累计销售额”写过自连接?为“找每个分类的最新产品”绞尽脑汁?窗口函数让这些需求变得简单、高效、可读。从 SQL Server 2005 引入到现在,它已成为每个开发人员必备的技能。
一、什么是窗口函数?
窗口函数在不改变行数的前提下,对每一行执行基于“窗口”(一组行)的计算。
核心语法:
函数名(列) OVER (PARTITION BY 分组列 ORDER BY 排序列 窗口范围)
与传统 GROUP BY 的区别:
- GROUP BY:分组后每组返回一行,原始数据被压缩
- 窗口函数:分组后每行都保留,并附加计算结果
-- GROUP BY:返回每组一行
SELECT CustomerID, SUM(Amount) AS Total
FROM Orders
GROUP BY CustomerID;
-- 结果:100 个客户 → 100 行
-- 窗口函数:每行都附加总额
SELECT OrderID, CustomerID, Amount,
SUM(Amount) OVER (PARTITION BY CustomerID) AS CustomerTotal
FROM Orders;
-- 结果:1000 个订单 → 1000 行,每行都有该客户的总额
二、窗口函数三大类别
2.1 排名函数
| 函数 | 行为 | 示例 |
|---|---|---|
ROW_NUMBER() | 连续排名,相同值不同序号 | 1,2,3,4 |
RANK() | 相同值同排名,下一个跳过 | 1,2,2,4 |
DENSE_RANK() | 相同值同排名,下一个连续 | 1,2,2,3 |
NTILE(N) | 分到 N 个桶,返回桶号 | 1,1,2,2,3,3 |
-- 实战:员工薪资排名
SELECT
EmployeeName,
Department,
Salary,
ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum,
RANK() OVER (ORDER BY Salary DESC) AS Rank,
DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRank,
NTILE(4) OVER (ORDER BY Salary DESC) AS Quartile
FROM Employees;
| EmployeeName | Salary | RowNum | Rank | DenseRank | Quartile |
|---|---|---|---|---|---|
| 张三 | 10000 | 1 | 1 | 1 | 1 |
| 李四 | 9000 | 2 | 2 | 2 | 1 |
| 王五 | 9000 | 3 | 2 | 2 | 2 |
| 赵六 | 8000 | 4 | 4 | 3 | 2 |
使用场景:
ROW_NUMBER():分页、去重(每组取第一条)RANK():成绩排名(有并列时保留名次间隙)DENSE_RANK():奖学金排名(并列不占后续名额)NTILE():百分位数、数据分片
2.2 聚合函数(作为窗口函数)
所有聚合函数都可以用作窗口函数:SUM、AVG、COUNT、MIN、MAX
SELECT
OrderID,
CustomerID,
Amount,
SUM(Amount) OVER (PARTITION BY CustomerID) AS CustomerTotal,
AVG(Amount) OVER (PARTITION BY CustomerID) AS CustomerAvg,
COUNT(*) OVER (PARTITION BY CustomerID) AS CustomerOrderCount,
MIN(Amount) OVER (PARTITION BY CustomerID) AS CustomerMinAmount,
MAX(Amount) OVER (PARTITION BY CustomerID) AS CustomerMaxAmount
FROM Orders;
2.3 偏移函数
| 函数 | 行为 | 说明 |
|---|---|---|
LAG(列, offset, default) | 访问当前行之前的行 | 环比、同比 |
LEAD(列, offset, default) | 访问当前行之后的行 | 未来值对比 |
FIRST_VALUE(列) | 窗口内第一行的值 | 基准值 |
LAST_VALUE(列) | 窗口内最后一行的值 | 注意默认范围问题 |
-- 实战:计算环比增长率
SELECT
OrderDate,
Amount,
LAG(Amount, 1, 0) OVER (ORDER BY OrderDate) AS PrevDayAmount,
CASE
WHEN LAG(Amount, 1, 0) OVER (ORDER BY OrderDate) = 0 THEN NULL
ELSE (Amount - LAG(Amount) OVER (ORDER BY OrderDate))
/ LAG(Amount) OVER (ORDER BY OrderDate) * 100
END AS GrowthRate
FROM DailySales
ORDER BY OrderDate;
三、窗口范围(ROWS/RANGE)精讲
3.1 默认窗口范围
-- 默认范围:从分组第一行到当前行
SUM(Amount) OVER (PARTITION BY CustomerID ORDER BY OrderDate)
-- 等价于
SUM(Amount) OVER (PARTITION BY CustomerID ORDER BY OrderDate
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
3.2 自定义窗口范围
| 范围定义 | 含义 |
|---|---|
UNBOUNDED PRECEDING | 分组第一行 |
N PRECEDING | 当前行前 N 行 |
CURRENT ROW | 当前行 |
N FOLLOWING | 当前行后 N 行 |
UNBOUNDED FOLLOWING | 分组最后一行 |
-- 滚动窗口:最近 3 天移动平均
SELECT
OrderDate,
Amount,
AVG(Amount) OVER (ORDER BY OrderDate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg3Day
FROM DailySales;
-- 累计到当前(默认)
SELECT
OrderDate,
Amount,
SUM(Amount) OVER (ORDER BY OrderDate) AS RunningTotal
FROM DailySales;
-- 分组内累计百分比
SELECT
ProductID,
Amount,
SUM(Amount) OVER (PARTITION BY ProductID ORDER BY OrderDate) AS RunningTotal,
SUM(Amount) OVER (PARTITION BY ProductID) AS Total,
SUM(Amount) OVER (PARTITION BY ProductID ORDER BY OrderDate)
/ SUM(Amount) OVER (PARTITION BY ProductID) * 100 AS RunningPercent
FROM Sales;
3.3 ROWS vs RANGE
| 选项 | 行为 | 性能 |
|---|---|---|
ROWS | 基于物理行数 | 更快 |
RANGE | 基于逻辑值(相同 ORDER BY 值视为同一行) | 较慢 |
-- 假设 OrderDate 有多行相同
-- ROWS:精确取前 3 行
SUM(Amount) OVER (ORDER BY OrderDate ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
-- RANGE:取 OrderDate <= 当前日期 + 2 天的所有行(日期差)
SUM(Amount) OVER (ORDER BY OrderDate RANGE BETWEEN 2 PRECEDING AND CURRENT ROW)
四、实战场景
4.1 分页查询(替代 OFFSET FETCH)
-- 每页 20 条,取第 3 页
WITH OrderedOrders AS (
SELECT
OrderID, OrderDate, Amount,
ROW_NUMBER() OVER (ORDER BY OrderDate DESC) AS RowNum
FROM Orders
)
SELECT * FROM OrderedOrders
WHERE RowNum BETWEEN 41 AND 60;
4.2 分组内取 Top N
-- 每个分类下销售额最高的前 3 个产品
WITH RankedProducts AS (
SELECT
CategoryID,
ProductName,
SalesAmount,
ROW_NUMBER() OVER (PARTITION BY CategoryID ORDER BY SalesAmount DESC) AS rn
FROM Products
)
SELECT * FROM RankedProducts WHERE rn <= 3;
4.3 去重(保留每组第一条)
-- 删除重复的客户记录(保留 ID 最小的)
WITH DupCustomers AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Email ORDER BY CustomerID) AS rn
FROM Customers
)
DELETE FROM DupCustomers WHERE rn > 1;
4.4 累计值计算
-- 累计销售额(按日期)
SELECT
OrderDate,
SUM(Amount) AS DailyAmount,
SUM(SUM(Amount)) OVER (ORDER BY OrderDate) AS RunningTotal
FROM Orders
GROUP BY OrderDate
ORDER BY OrderDate;
4.5 同比环比计算
-- 月环比、同比
WITH MonthlySales AS (
SELECT
YEAR(OrderDate) AS Year,
MONTH(OrderDate) AS Month,
SUM(Amount) AS Amount
FROM Orders
GROUP BY YEAR(OrderDate), MONTH(OrderDate)
)
SELECT
Year,
Month,
Amount,
LAG(Amount, 1) OVER (ORDER BY Year, Month) AS PrevMonthAmount,
(Amount - LAG(Amount) OVER (ORDER BY Year, Month))
/ LAG(Amount) OVER (ORDER BY Year, Month) * 100 AS MonthOverMonth,
LAG(Amount, 12) OVER (ORDER BY Year, Month) AS PrevYearAmount,
(Amount - LAG(Amount, 12) OVER (ORDER BY Year, Month))
/ LAG(Amount, 12) OVER (ORDER BY Year, Month) * 100 AS YearOverYear
FROM MonthlySales
ORDER BY Year, Month;
4.6 FIRST_VALUE 和 LAST_VALUE
-- 每个订单的第一笔和最后一笔明细
SELECT
OrderID,
ProductID,
Quantity,
FIRST_VALUE(ProductID) OVER (PARTITION BY OrderID ORDER BY Quantity DESC) AS MostProduct,
LAST_VALUE(ProductID) OVER (PARTITION BY OrderID
ORDER BY Quantity DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LeastProduct
FROM OrderDetails;
五、性能优化要点
5.1 窗口函数的性能特征
| 操作 | 性能 | 说明 |
|---|---|---|
ROW_NUMBER() | 快 | 只需排序 |
RANK()/DENSE_RANK() | 快 | 同上 |
SUM()/AVG() 默认窗口 | 快 | 一次扫描 |
SUM() 自定义窗口 | 中 | 需要回溯 |
LAG()/LEAD() | 快 | 访问前后行 |
FIRST_VALUE()/LAST_VALUE() | 中 | 需扫描窗口 |
| 多窗口函数 | 慢 | 可能多次排序 |
5.2 减少多次排序
-- ❌ 差:两个不同的 ORDER BY 导致两次排序
SELECT
ROW_NUMBER() OVER (PARTITION BY CategoryID ORDER BY Price) AS rn1,
ROW_NUMBER() OVER (PARTITION BY CategoryID ORDER BY Sales) AS rn2
FROM Products;
-- ✅ 好:使用相同的排序
SELECT
ROW_NUMBER() OVER (PARTITION BY CategoryID ORDER BY Price) AS rn1,
RANK() OVER (PARTITION BY CategoryID ORDER BY Price) AS rn2 -- 同排序
FROM Products;
5.3 窗口函数 vs 自连接
-- ❌ 慢:自连接计算累计
SELECT a.OrderID, a.Amount, SUM(b.Amount) AS RunningTotal
FROM Orders a
JOIN Orders b ON a.OrderDate >= b.OrderDate
GROUP BY a.OrderID, a.Amount;
-- ✅ 快:窗口函数(1 次扫描)
SELECT OrderID, Amount,
SUM(Amount) OVER (ORDER BY OrderDate) AS RunningTotal
FROM Orders;
六、常见错误与陷阱
6.1 忘记指定 ORDER BY
-- ❌ 错误:没有 ORDER BY,窗口默认是整个分区
SUM(Amount) OVER (PARTITION BY CustomerID) -- 正确,是分组总和
ROW_NUMBER() OVER (PARTITION BY CustomerID) -- 错误!结果不确定
-- ✅ 正确:排名必须有 ORDER BY
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate)
6.2 LAST_VALUE 的默认范围问题
-- ❌ 错误:LAST_VALUE 默认范围是到当前行,不是到最后一行
SELECT
OrderID,
Amount,
LAST_VALUE(Amount) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS LastAmount
FROM Orders;
-- 结果:每行的 LastAmount 等于当前行 Amount(因为当前行是窗口最后一行)
-- ✅ 正确:明确指定范围到分组末尾
SELECT
OrderID,
Amount,
LAST_VALUE(Amount) OVER (PARTITION BY CustomerID ORDER BY OrderDate
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastAmount
FROM Orders;
6.3 在 WHERE 中使用窗口函数
-- ❌ 错误:窗口函数不能直接在 WHERE 中使用
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY Amount) AS rn
FROM Orders
) t
WHERE rn <= 10; -- 正确:使用子查询包裹
-- 或使用 CTE
WITH Ranked AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY Amount) AS rn FROM Orders
)
SELECT * FROM Ranked WHERE rn <= 10;
七、完整实战案例
场景:电商平台客户价值分析
-- 需求:
-- 1. 每个客户的总消费额、订单数、平均客单价
-- 2. 每个客户按时间排名的订单序号
-- 3. 每个客户的累计消费额(按时间)
-- 4. 每个客户上一单的金额(用于复购分析)
-- 5. 每个客户消费额排名(全局)
-- 6. 每个客户消费额在其所在城市的排名
WITH CustomerStats AS (
SELECT
c.CustomerID,
c.CustomerName,
c.City,
o.OrderID,
o.OrderDate,
o.Amount,
-- 客户维度统计
SUM(o.Amount) OVER (PARTITION BY c.CustomerID) AS CustomerTotalAmount,
COUNT(o.OrderID) OVER (PARTITION BY c.CustomerID) AS CustomerOrderCount,
AVG(o.Amount) OVER (PARTITION BY c.CustomerID) AS CustomerAvgAmount,
-- 订单维度排名
ROW_NUMBER() OVER (PARTITION BY c.CustomerID ORDER BY o.OrderDate) AS OrderSeq,
-- 累计消费
SUM(o.Amount) OVER (PARTITION BY c.CustomerID ORDER BY o.OrderDate) AS RunningTotal,
-- 上一单金额
LAG(o.Amount, 1, 0) OVER (PARTITION BY c.CustomerID ORDER BY o.OrderDate) AS PrevAmount,
-- 全局排名
RANK() OVER (ORDER BY o.Amount DESC) AS GlobalRank,
-- 城市内排名
RANK() OVER (PARTITION BY c.City ORDER BY o.Amount DESC) AS CityRank
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
)
SELECT
CustomerID,
CustomerName,
City,
OrderID,
OrderDate,
Amount,
OrderSeq,
RunningTotal,
PrevAmount,
CASE WHEN PrevAmount > 0 THEN (Amount - PrevAmount) / PrevAmount * 100 ELSE NULL END AS GrowthRate,
GlobalRank,
CityRank,
CustomerTotalAmount,
CustomerOrderCount,
CustomerAvgAmount
FROM CustomerStats
ORDER BY CustomerID, OrderDate;
八、核心总结
| 知识点 | 核心要点 |
|---|---|
| 窗口函数语法 | 函数 OVER (PARTITION BY 分组 ORDER BY 排序 窗口范围) |
| ROW_NUMBER | 唯一序号,适合分页、去重 |
| RANK vs DENSE_RANK | RANK 跳过占位,DENSE_RANK 连续 |
| 累计求和 | SUM OVER (ORDER BY) 默认到当前行 |
| 移动平均 | ROWS BETWEEN N PRECEDING AND CURRENT ROW |
| LAG/LEAD | 访问前后行,适合环比同比 |
| FIRST/LAST_VALUE | 注意 LAST_VALUE 的默认范围 |
| 性能 | 窗口函数通常优于自连接和相关子查询 |
一句话记住本期内容:
窗口函数让你在“不减少行数”的前提下做“分组计算”——ROW_NUMBER 去重分页,SUM OVER 累计求和,LAG 算环比,一个 OVER 胜过千行子查询。
动手练习
-- 问题1:写出查询,计算每个产品在每个月的销售额,以及该产品截止到当月的累计销售额
-- 表:Sales (SaleID, ProductID, SaleDate, Amount)
-- 问题2:找出每个分类下销售额排名前 2 的产品
-- 问题3:计算每个客户每次购买距离上一次购买的天数(复购间隔)
-- 问题4:如何用窗口函数高效分页(第 11-20 条)?
点击查看参考答案
问题1:
WITH MonthlySales AS (
SELECT
ProductID,
YEAR(SaleDate) AS Year,
MONTH(SaleDate) AS Month,
SUM(Amount) AS MonthlyAmount
FROM Sales
GROUP BY ProductID, YEAR(SaleDate), MONTH(SaleDate)
)
SELECT
ProductID,
Year,
Month,
MonthlyAmount,
SUM(MonthlyAmount) OVER (PARTITION BY ProductID ORDER BY Year, Month) AS RunningTotal
FROM MonthlySales;
问题2:
WITH Ranked AS (
SELECT
CategoryID,
ProductID,
SalesAmount,
ROW_NUMBER() OVER (PARTITION BY CategoryID ORDER BY SalesAmount DESC) AS rn
FROM Products
)
SELECT * FROM Ranked WHERE rn <= 2;
问题3:
SELECT
CustomerID,
OrderID,
OrderDate,
LAG(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS PrevOrderDate,
DATEDIFF(day, LAG(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate), OrderDate) AS DaysSinceLastOrder
FROM Orders;
问题4:
WITH Paginated AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY OrderDate) AS RowNum
FROM Orders
)
SELECT * FROM Paginated WHERE RowNum BETWEEN 11 AND 20;
下一期预告
存储过程与函数——模块化开发的正确姿势
- 存储过程 vs 函数:什么时候用哪个?
- 参数化与重编译问题
- 表值函数 vs 标量函数的性能差异
- 递归 CTE 的使用场景
- 实战:构建可维护的数据访问层
📌 本文代码已在 SQL Server 2019+ 验证。窗口函数是现代 SQL 的核心技能,建议在实际开发中尽量使用,替代复杂的自连接和相关子查询。
本系列持续更新中,点击关注不错过第六期。