SQL Server 开发系列(第六期):存储过程与函数——模块化开发的正确姿势
上一期我们学习了窗口函数,让复杂计算变得优雅。这一期,我们进入模块化开发的核心——存储过程和函数。你是否遇到过:标量函数导致查询慢 10 倍?存储过程突然变慢,重启后恢复?多语句表值函数被多次调用?这些问题的根源,都藏在存储过程和函数的使用方式里。这一期,我们从“会用”进阶到“用好”。
一、存储过程 vs 函数:核心区别
| 维度 | 存储过程 | 标量函数 | 内联表值函数 | 多语句表值函数 |
|---|---|---|---|---|
| 返回类型 | 结果集 + 输出参数 | 单个值 | 表 | 表 |
| 在 SELECT 中使用 | ❌ | ✅ | ✅ | ✅ |
| 事务控制 | ✅ 可 COMMIT/ROLLBACK | ❌ | ❌ | ❌ |
| 副作用(UPDATE/INSERT) | ✅ | ❌ | ❌ | ❌ |
| 性能 | 高 | ⚠️ 可能差 | ✅ 高 | ⚠️ 可能差 |
| 计划缓存重用 | ✅ | ✅ | ✅ | ❌ 每次重编译 |
| 典型场景 | 业务流程 | 简单计算 | 参数化视图 | 复杂逻辑后返回表 |
-- 存储过程:可执行任意操作
CREATE PROC sp_TransferMoney @From INT, @To INT, @Amount DECIMAL
AS
BEGIN
BEGIN TRANSACTION
UPDATE Accounts SET Balance -= @Amount WHERE ID = @From
UPDATE Accounts SET Balance += @Amount WHERE ID = @To
COMMIT
END
-- 标量函数:只能返回一个值,不能有副作用
CREATE FUNCTION dbo.fn_GetAge(@BirthDate DATE)
RETURNS INT
AS
BEGIN
RETURN DATEDIFF(YEAR, @BirthDate, GETDATE())
END
-- 使用:SELECT dbo.fn_GetAge('1990-01-01')
-- 内联表值函数:性能最好,相当于参数化视图
CREATE FUNCTION dbo.fn_GetOrdersByCustomer(@CustomerID INT)
RETURNS TABLE
AS
RETURN
(
SELECT * FROM Orders WHERE CustomerID = @CustomerID
)
-- 使用:SELECT * FROM dbo.fn_GetOrdersByCustomer(100)
-- 多语句表值函数:可以有多条语句,但性能差
CREATE FUNCTION dbo.fn_GetComplexReport(@Date DATE)
RETURNS @Result TABLE (ID INT, Total DECIMAL)
AS
BEGIN
INSERT @Result SELECT OrderID, SUM(Amount) FROM Orders WHERE OrderDate = @Date GROUP BY OrderID
IF @@ROWCOUNT = 0
INSERT @Result VALUES (0, 0)
RETURN
END
二、存储过程深度优化
2.1 参数嗅探(Parameter Sniffing)—— 最常见性能问题
问题:第一次执行时传入的参数决定了执行计划,后续即使参数不同也使用相同计划。
-- 假设 Orders 表有 5000 万行
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
SELECT * FROM Orders WHERE OrderDate BETWEEN @StartDate AND @EndDate
END
-- 第一次执行:查询 2024-01-01 到 2024-01-02(2天,1000行)
-- 优化器生成 Nested Loops + Index Seek(适合少量数据)
-- 第二次执行:查询 2020-01-01 到 2024-01-01(4年,4000万行)
-- 仍然使用 Nested Loops → 性能灾难!
解决方案:
-- 方案1:使用本地变量(强制编译时未知)
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
DECLARE @LocalStart DATE = @StartDate, @LocalEnd DATE = @EndDate
SELECT * FROM Orders WHERE OrderDate BETWEEN @LocalStart AND @LocalEnd
END
-- 缺点:执行计划基于平均分布,可能不是最优
-- 方案2:OPTION (RECOMPILE)(每次重新编译)
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
SELECT * FROM Orders WHERE OrderDate BETWEEN @StartDate AND @EndDate
OPTION (RECOMPILE)
END
-- 优点:每次都得到最优计划
-- 缺点:编译开销(适合查询时间 > 编译时间的场景)
-- 方案3:OPTION (OPTIMIZE FOR UNKNOWN)
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
SELECT * FROM Orders WHERE OrderDate BETWEEN @StartDate AND @EndDate
OPTION (OPTIMIZE FOR UNKNOWN)
END
-- 优化器使用平均分布生成计划
-- 方案4:OPTION (OPTIMIZE FOR @StartDate = '2024-01-01')
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
SELECT * FROM Orders WHERE OrderDate BETWEEN @StartDate AND @EndDate
OPTION (OPTIMIZE FOR (@StartDate = '2024-01-01'))
END
-- 强制使用特定值的计划
2.2 重编译的原因与监控
-- 查看存储过程重编译次数
SELECT
OBJECT_NAME(object_id) AS ProcName,
cacheobjtype,
usecounts,
refcounts
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE objtype = 'Proc'
AND text LIKE '%sp_GetOrders%'
-- 查看重编译事件
SELECT * FROM sys.trace_events WHERE name LIKE '%recompile%'
-- Event 164: SQL:StmtRecompile
触发重编译的常见操作:
- 临时表(
#temp)创建或结构变化 - 表变量(
@table)不触发重编译(好坏参半) OPTION (RECOMPILE)提示- 统计信息更新
SET选项变化(如ANSI_NULLS)
2.3 存储过程最佳实践模板
CREATE PROC [dbo].[usp_GetOrderDetails]
@OrderID INT = NULL,
@CustomerID INT = NULL,
@StartDate DATE = NULL,
@EndDate DATE = NULL,
@PageSize INT = 20,
@PageIndex INT = 1
AS
BEGIN
SET NOCOUNT ON; -- 减少网络流量
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; -- 明确隔离级别
-- 参数验证
IF @PageSize < 1 SET @PageSize = 20;
IF @PageIndex < 1 SET @PageIndex = 1;
-- 使用本地变量避免参数嗅探(针对范围查询)
DECLARE @LocalStart DATE = @StartDate, @LocalEnd DATE = @EndDate
-- 分页查询
SELECT OrderID, OrderDate, CustomerID, Amount
FROM Orders
WHERE (@OrderID IS NULL OR OrderID = @OrderID)
AND (@CustomerID IS NULL OR CustomerID = @CustomerID)
AND (@LocalStart IS NULL OR OrderDate >= @LocalStart)
AND (@LocalEnd IS NULL OR OrderDate < @LocalEnd)
ORDER BY OrderDate DESC
OFFSET (@PageIndex - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY
-- 返回总行数(可选,第二个结果集)
IF @PageIndex = 1
BEGIN
SELECT COUNT(*) AS TotalCount
FROM Orders
WHERE (@OrderID IS NULL OR OrderID = @OrderID)
AND (@CustomerID IS NULL OR CustomerID = @CustomerID)
AND (@LocalStart IS NULL OR OrderDate >= @LocalStart)
AND (@LocalEnd IS NULL OR OrderDate < @LocalEnd)
END
END
三、函数的性能陷阱
3.1 标量函数(Scalar Function)—— 性能杀手
问题:对标量函数的每行调用,就像隐式游标,逐行执行。
-- ❌ 坏:标量函数
CREATE FUNCTION dbo.GetCustomerLevel(@CustomerID INT)
RETURNS VARCHAR(20)
AS
BEGIN
DECLARE @Total DECIMAL
SELECT @Total = SUM(Amount) FROM Orders WHERE CustomerID = @CustomerID
RETURN CASE WHEN @Total > 10000 THEN 'Gold' WHEN @Total > 5000 THEN 'Silver' ELSE 'Bronze' END
END
-- 查询:调用 1000 次函数 → 1000 次子查询
SELECT OrderID, dbo.GetCustomerLevel(CustomerID) FROM Orders -- 极慢!
解决方案:
-- 方案1:改为内联表值函数(推荐)
CREATE FUNCTION dbo.GetCustomerLevelInline(@CustomerID INT)
RETURNS TABLE
AS
RETURN
(
SELECT
CASE
WHEN SUM(Amount) > 10000 THEN 'Gold'
WHEN SUM(Amount) > 5000 THEN 'Silver'
ELSE 'Bronze'
END AS CustomerLevel
FROM Orders
WHERE CustomerID = @CustomerID
)
-- 使用 CROSS APPLY
SELECT o.OrderID, l.CustomerLevel
FROM Orders o
CROSS APPLY dbo.GetCustomerLevelInline(o.CustomerID) l
-- 方案2:使用窗口函数(最佳)
SELECT
OrderID,
CASE
WHEN SUM(Amount) OVER (PARTITION BY CustomerID) > 10000 THEN 'Gold'
WHEN SUM(Amount) OVER (PARTITION BY CustomerID) > 5000 THEN 'Silver'
ELSE 'Bronze'
END AS CustomerLevel
FROM Orders
3.2 多语句表值函数(MSTVF)
问题:优化器无法预估行数,总是估算为 1 行,导致 JOIN 算法错误。
-- ❌ 坏:多语句表值函数
CREATE FUNCTION GetRecentOrders(@Days INT)
RETURNS @Orders TABLE (OrderID INT, OrderDate DATE)
AS
BEGIN
INSERT @Orders
SELECT OrderID, OrderDate FROM Orders WHERE OrderDate > DATEADD(day, -@Days, GETDATE())
RETURN
END
-- 查询时,优化器认为函数返回 1 行,可能选择 Nested Loops,实际返回 10 万行 → 性能灾难
SELECT * FROM GetRecentOrders(30) g
JOIN Customers c ON g.CustomerID = c.CustomerID
解决方案:
-- ✅ 好:改为内联表值函数
CREATE FUNCTION GetRecentOrdersInline(@Days INT)
RETURNS TABLE
AS
RETURN
(
SELECT OrderID, CustomerID, OrderDate
FROM Orders
WHERE OrderDate > DATEADD(day, -@Days, GETDATE())
)
-- 优化器能准确估算行数
四、SET NOCOUNT ON 的重要性
-- ❌ 坏:每次 DML 都返回影响行数
CREATE PROC sp_UpdateOrders
AS
BEGIN
UPDATE Orders SET Status = 'Processed' WHERE Status = 'Pending'
-- 返回 (5000 rows affected) 到客户端
UPDATE OrderDetails SET Processed = 1 WHERE OrderID IN (SELECT OrderID FROM Orders WHERE Status = 'Processed')
-- 再次返回行数
END
-- 网络流量增加,客户端处理额外结果集
-- ✅ 好:禁用影响行数消息
CREATE PROC sp_UpdateOrders
AS
BEGIN
SET NOCOUNT ON;
UPDATE Orders SET Status = 'Processed' WHERE Status = 'Pending'
UPDATE OrderDetails SET Processed = 1 WHERE OrderID IN (SELECT OrderID FROM Orders WHERE Status = 'Processed')
END
五、递归 CTE(Common Table Expression)
5.1 基础用法
-- 查询组织结构树
WITH OrgCTE AS (
-- 锚点成员:根节点
SELECT EmployeeID, ManagerID, EmployeeName, 0 AS Level
FROM Employees
WHERE ManagerID IS NULL
UNION ALL
-- 递归成员
SELECT e.EmployeeID, e.ManagerID, e.EmployeeName, Level + 1
FROM Employees e
INNER JOIN OrgCTE o ON e.ManagerID = o.EmployeeID
)
SELECT * FROM OrgCTE
OPTION (MAXRECURSION 100); -- 默认 100,可设置 0 表示无限制
5.2 递归深度限制与性能
-- 查询物料清单(BOM)展开
WITH BOMCTE AS (
SELECT
ProductID,
ComponentID,
Quantity,
1 AS Level,
CAST(ComponentID AS VARCHAR(MAX)) AS Path
FROM BOM
WHERE ProductID = @TopProduct
UNION ALL
SELECT
b.ProductID,
b.ComponentID,
b.Quantity,
Level + 1,
Path + '->' + CAST(b.ComponentID AS VARCHAR)
FROM BOM b
INNER JOIN BOMCTE c ON b.ProductID = c.ComponentID
WHERE Level < 20 -- 防止无限递归
)
SELECT * FROM BOMCTE
性能注意事项:
- 递归 CTE 逐层处理,深度过大会很慢
- 确保锚点成员有索引(ManagerID、ParentID)
- 考虑使用临时表 + 循环替代深层递归
六、动态 SQL 在存储过程中的使用
CREATE PROC sp_SearchProducts
@Keyword NVARCHAR(100) = NULL,
@CategoryID INT = NULL,
@MinPrice DECIMAL = NULL,
@MaxPrice DECIMAL = NULL,
@SortColumn NVARCHAR(50) = 'ProductID',
@SortDirection NVARCHAR(4) = 'ASC'
AS
BEGIN
SET NOCOUNT ON;
-- 白名单验证
IF @SortColumn NOT IN ('ProductID', 'ProductName', 'Price', 'CreateDate')
SET @SortColumn = 'ProductID';
IF @SortDirection NOT IN ('ASC', 'DESC')
SET @SortDirection = 'ASC';
DECLARE @sql NVARCHAR(MAX) = N'SELECT ProductID, ProductName, Price FROM Products WHERE 1=1'
DECLARE @params NVARCHAR(MAX) = N'@Keyword NVARCHAR(100), @CategoryID INT, @MinPrice DECIMAL, @MaxPrice DECIMAL'
IF @Keyword IS NOT NULL
SET @sql += N' AND ProductName LIKE @Keyword'
IF @CategoryID IS NOT NULL
SET @sql += N' AND CategoryID = @CategoryID'
IF @MinPrice IS NOT NULL
SET @sql += N' AND Price >= @MinPrice'
IF @MaxPrice IS NOT NULL
SET @sql += N' AND Price <= @MaxPrice'
SET @sql += N' ORDER BY ' + QUOTENAME(@SortColumn) + N' ' + @SortDirection
-- 添加分页
SET @sql += N' OFFSET 0 ROWS FETCH NEXT 100 ROWS ONLY'
EXEC sp_executesql @sql, @params,
@Keyword = '%' + @Keyword + '%', -- 支持模糊查询
@CategoryID = @CategoryID,
@MinPrice = @MinPrice,
@MaxPrice = @MaxPrice
END
七、监控与调试工具
-- 1. 查看存储过程执行统计
SELECT
OBJECT_NAME(object_id) AS ProcName,
cached_time,
execution_count,
total_worker_time / execution_count AS avg_cpu_us,
total_logical_reads / execution_count AS avg_logical_reads
FROM sys.dm_exec_procedure_stats
WHERE database_id = DB_ID()
ORDER BY avg_cpu_us DESC
-- 2. 查看函数使用情况
SELECT
OBJECT_NAME(object_id) AS FunctionName,
execution_count,
total_worker_time,
total_logical_reads
FROM sys.dm_exec_function_stats
WHERE database_id = DB_ID()
ORDER BY total_worker_time DESC
-- 3. 查看当前运行的存储过程
SELECT
r.session_id,
OBJECT_NAME(st.objectid) AS ProcName,
r.cpu_time,
r.total_elapsed_time,
SUBSTRING(st.text, (r.statement_start_offset/2)+1,
((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) AS statement
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) st
WHERE r.command = 'SELECT'
八、核心总结
| 知识点 | 核心要点 |
|---|---|
| 存储过程 vs 函数 | 存储过程适合业务流程,函数适合计算 |
| 内联表值函数 | 性能最好,相当于参数化视图 |
| 标量函数 | 性能杀手,尽量用内联函数或窗口函数替代 |
| 参数嗅探 | 本地变量、OPTION (RECOMPILE) 解决 |
| SET NOCOUNT ON | 减少网络流量,每个存储过程都应该加 |
| 递归 CTE | 深度有限制,注意性能 |
| 动态 SQL | 必须用参数化 + QUOTENAME 防注入 |
一句话记住本期内容:
存储过程用 NOCOUNT ON 和本地变量防嗅探,内联表值函数优于标量函数,动态 SQL 必须参数化。
动手练习
-- 问题1:为什么下面的标量函数会导致查询极慢?如何优化?
CREATE FUNCTION dbo.GetOrderCount(@CustomerID INT)
RETURNS INT
AS
BEGIN
DECLARE @Count INT
SELECT @Count = COUNT(*) FROM Orders WHERE CustomerID = @CustomerID
RETURN @Count
END
SELECT CustomerID, dbo.GetOrderCount(CustomerID) FROM Customers
-- 问题2:这个存储过程有参数嗅探风险吗?如何修复?
CREATE PROC sp_GetSales @Year INT, @Month INT
AS
SELECT SUM(Amount) FROM Sales WHERE YEAR(SaleDate) = @Year AND MONTH(SaleDate) = @Month
-- 问题3:如何用递归 CTE 查询所有下属员工?
-- 表:Employees (EmployeeID, ManagerID, Name)
-- 给定 @ManagerID = 1,返回所有直接和间接下属
点击查看参考答案
问题1:标量函数导致每行执行一次子查询(全表扫描或索引查找)。优化方案:
-- 方案1:窗口函数
SELECT CustomerID, COUNT(*) OVER (PARTITION BY CustomerID) AS OrderCount FROM Orders
-- 方案2:内联表值函数
CREATE FUNCTION GetOrderCountInline(@CustomerID INT) RETURNS TABLE AS RETURN (SELECT COUNT(*) AS OrderCount FROM Orders WHERE CustomerID = @CustomerID)
SELECT c.CustomerID, oc.OrderCount FROM Customers C CROSS APPLY GetOrderCountInline(c.CustomerID) oc
问题2:有风险,函数包裹列导致索引失效且参数嗅探。修复:
CREATE PROC sp_GetSales @Year INT, @Month INT
AS
DECLARE @StartDate DATE = DATEFROMPARTS(@Year, @Month, 1)
DECLARE @EndDate DATE = DATEADD(MONTH, 1, @StartDate)
SELECT SUM(Amount) FROM Sales WHERE SaleDate >= @StartDate AND SaleDate < @EndDate
问题3:
WITH Subordinates AS (
SELECT EmployeeID, ManagerID, Name, 0 AS Level
FROM Employees WHERE EmployeeID = @ManagerID
UNION ALL
SELECT e.EmployeeID, e.ManagerID, e.Name, Level + 1
FROM Employees e INNER JOIN Subordinates s ON e.ManagerID = s.EmployeeID
)
SELECT * FROM Subordinates WHERE Level > 0
下一期预告
触发器与约束——数据完整性的守护者
- AFTER vs INSTEAD OF 触发器的选择
- 触发器的嵌套与递归
- 使用触发器实现审计日志
- 约束 vs 触发器:性能与功能对比
- 替代触发器的现代方案
📌 本文代码已在 SQL Server 2019+ 验证。存储过程和函数是模块化的基础,但使用不当会成为性能瓶颈。
本系列持续更新中,点击关注不错过第七期。