SQL Server 开发系列(第六期):存储过程与函数——模块化开发的正确姿势

0 阅读9分钟

SQL Server 开发系列(第六期):存储过程与函数——模块化开发的正确姿势

上一期我们学习了窗口函数,让复杂计算变得优雅。这一期,我们进入模块化开发的核心——存储过程和函数。你是否遇到过:标量函数导致查询慢 10 倍?存储过程突然变慢,重启后恢复?多语句表值函数被多次调用?这些问题的根源,都藏在存储过程和函数的使用方式里。这一期,我们从“会用”进阶到“用好”。

一、存储过程 vs 函数:核心区别

维度存储过程标量函数内联表值函数多语句表值函数
返回类型结果集 + 输出参数单个值
在 SELECT 中使用
事务控制✅ 可 COMMIT/ROLLBACK
副作用(UPDATE/INSERT)
性能⚠️ 可能差✅ 高⚠️ 可能差
计划缓存重用❌ 每次重编译
典型场景业务流程简单计算参数化视图复杂逻辑后返回表
-- 存储过程:可执行任意操作
CREATE PROC sp_TransferMoney @From INT, @To INT, @Amount DECIMAL
AS
BEGIN
    BEGIN TRANSACTION
    UPDATE Accounts SET Balance -= @Amount WHERE ID = @From
    UPDATE Accounts SET Balance += @Amount WHERE ID = @To
    COMMIT
END

-- 标量函数:只能返回一个值,不能有副作用
CREATE FUNCTION dbo.fn_GetAge(@BirthDate DATE)
RETURNS INT
AS
BEGIN
    RETURN DATEDIFF(YEAR, @BirthDate, GETDATE())
END
-- 使用:SELECT dbo.fn_GetAge('1990-01-01')

-- 内联表值函数:性能最好,相当于参数化视图
CREATE FUNCTION dbo.fn_GetOrdersByCustomer(@CustomerID INT)
RETURNS TABLE
AS
RETURN
(
    SELECT * FROM Orders WHERE CustomerID = @CustomerID
)
-- 使用:SELECT * FROM dbo.fn_GetOrdersByCustomer(100)

-- 多语句表值函数:可以有多条语句,但性能差
CREATE FUNCTION dbo.fn_GetComplexReport(@Date DATE)
RETURNS @Result TABLE (ID INT, Total DECIMAL)
AS
BEGIN
    INSERT @Result SELECT OrderID, SUM(Amount) FROM Orders WHERE OrderDate = @Date GROUP BY OrderID
    IF @@ROWCOUNT = 0
        INSERT @Result VALUES (0, 0)
    RETURN
END

二、存储过程深度优化

2.1 参数嗅探(Parameter Sniffing)—— 最常见性能问题

问题:第一次执行时传入的参数决定了执行计划,后续即使参数不同也使用相同计划。

-- 假设 Orders 表有 5000 万行
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
    SELECT * FROM Orders WHERE OrderDate BETWEEN @StartDate AND @EndDate
END

-- 第一次执行:查询 2024-01-01 到 2024-01-02(2天,1000行)
-- 优化器生成 Nested Loops + Index Seek(适合少量数据)

-- 第二次执行:查询 2020-01-01 到 2024-01-01(4年,4000万行)
-- 仍然使用 Nested Loops → 性能灾难!

解决方案

-- 方案1:使用本地变量(强制编译时未知)
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
    DECLARE @LocalStart DATE = @StartDate, @LocalEnd DATE = @EndDate
    SELECT * FROM Orders WHERE OrderDate BETWEEN @LocalStart AND @LocalEnd
END
-- 缺点:执行计划基于平均分布,可能不是最优

-- 方案2:OPTION (RECOMPILE)(每次重新编译)
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
    SELECT * FROM Orders WHERE OrderDate BETWEEN @StartDate AND @EndDate
    OPTION (RECOMPILE)
END
-- 优点:每次都得到最优计划
-- 缺点:编译开销(适合查询时间 > 编译时间的场景)

-- 方案3:OPTION (OPTIMIZE FOR UNKNOWN)
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
    SELECT * FROM Orders WHERE OrderDate BETWEEN @StartDate AND @EndDate
    OPTION (OPTIMIZE FOR UNKNOWN)
END
-- 优化器使用平均分布生成计划

-- 方案4:OPTION (OPTIMIZE FOR @StartDate = '2024-01-01')
CREATE PROC sp_GetOrders @StartDate DATE, @EndDate DATE
AS
BEGIN
    SELECT * FROM Orders WHERE OrderDate BETWEEN @StartDate AND @EndDate
    OPTION (OPTIMIZE FOR (@StartDate = '2024-01-01'))
END
-- 强制使用特定值的计划

2.2 重编译的原因与监控

-- 查看存储过程重编译次数
SELECT 
    OBJECT_NAME(object_id) AS ProcName,
    cacheobjtype,
    usecounts,
    refcounts
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE objtype = 'Proc'
  AND text LIKE '%sp_GetOrders%'

-- 查看重编译事件
SELECT * FROM sys.trace_events WHERE name LIKE '%recompile%'
-- Event 164: SQL:StmtRecompile

触发重编译的常见操作

  • 临时表(#temp)创建或结构变化
  • 表变量(@table)不触发重编译(好坏参半)
  • OPTION (RECOMPILE) 提示
  • 统计信息更新
  • SET 选项变化(如 ANSI_NULLS

2.3 存储过程最佳实践模板

CREATE PROC [dbo].[usp_GetOrderDetails]
    @OrderID INT = NULL,
    @CustomerID INT = NULL,
    @StartDate DATE = NULL,
    @EndDate DATE = NULL,
    @PageSize INT = 20,
    @PageIndex INT = 1
AS
BEGIN
    SET NOCOUNT ON;  -- 减少网络流量
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- 明确隔离级别
    
    -- 参数验证
    IF @PageSize < 1 SET @PageSize = 20;
    IF @PageIndex < 1 SET @PageIndex = 1;
    
    -- 使用本地变量避免参数嗅探(针对范围查询)
    DECLARE @LocalStart DATE = @StartDate, @LocalEnd DATE = @EndDate
    
    -- 分页查询
    SELECT OrderID, OrderDate, CustomerID, Amount
    FROM Orders
    WHERE (@OrderID IS NULL OR OrderID = @OrderID)
      AND (@CustomerID IS NULL OR CustomerID = @CustomerID)
      AND (@LocalStart IS NULL OR OrderDate >= @LocalStart)
      AND (@LocalEnd IS NULL OR OrderDate < @LocalEnd)
    ORDER BY OrderDate DESC
    OFFSET (@PageIndex - 1) * @PageSize ROWS
    FETCH NEXT @PageSize ROWS ONLY
    
    -- 返回总行数(可选,第二个结果集)
    IF @PageIndex = 1
    BEGIN
        SELECT COUNT(*) AS TotalCount
        FROM Orders
        WHERE (@OrderID IS NULL OR OrderID = @OrderID)
          AND (@CustomerID IS NULL OR CustomerID = @CustomerID)
          AND (@LocalStart IS NULL OR OrderDate >= @LocalStart)
          AND (@LocalEnd IS NULL OR OrderDate < @LocalEnd)
    END
END

三、函数的性能陷阱

3.1 标量函数(Scalar Function)—— 性能杀手

问题:对标量函数的每行调用,就像隐式游标,逐行执行。

-- ❌ 坏:标量函数
CREATE FUNCTION dbo.GetCustomerLevel(@CustomerID INT)
RETURNS VARCHAR(20)
AS
BEGIN
    DECLARE @Total DECIMAL
    SELECT @Total = SUM(Amount) FROM Orders WHERE CustomerID = @CustomerID
    RETURN CASE WHEN @Total > 10000 THEN 'Gold' WHEN @Total > 5000 THEN 'Silver' ELSE 'Bronze' END
END

-- 查询:调用 1000 次函数 → 1000 次子查询
SELECT OrderID, dbo.GetCustomerLevel(CustomerID) FROM Orders  -- 极慢!

解决方案

-- 方案1:改为内联表值函数(推荐)
CREATE FUNCTION dbo.GetCustomerLevelInline(@CustomerID INT)
RETURNS TABLE
AS
RETURN
(
    SELECT 
        CASE 
            WHEN SUM(Amount) > 10000 THEN 'Gold'
            WHEN SUM(Amount) > 5000 THEN 'Silver'
            ELSE 'Bronze'
        END AS CustomerLevel
    FROM Orders
    WHERE CustomerID = @CustomerID
)

-- 使用 CROSS APPLY
SELECT o.OrderID, l.CustomerLevel
FROM Orders o
CROSS APPLY dbo.GetCustomerLevelInline(o.CustomerID) l

-- 方案2:使用窗口函数(最佳)
SELECT 
    OrderID,
    CASE 
        WHEN SUM(Amount) OVER (PARTITION BY CustomerID) > 10000 THEN 'Gold'
        WHEN SUM(Amount) OVER (PARTITION BY CustomerID) > 5000 THEN 'Silver'
        ELSE 'Bronze'
    END AS CustomerLevel
FROM Orders

3.2 多语句表值函数(MSTVF)

问题:优化器无法预估行数,总是估算为 1 行,导致 JOIN 算法错误。

-- ❌ 坏:多语句表值函数
CREATE FUNCTION GetRecentOrders(@Days INT)
RETURNS @Orders TABLE (OrderID INT, OrderDate DATE)
AS
BEGIN
    INSERT @Orders
    SELECT OrderID, OrderDate FROM Orders WHERE OrderDate > DATEADD(day, -@Days, GETDATE())
    RETURN
END

-- 查询时,优化器认为函数返回 1 行,可能选择 Nested Loops,实际返回 10 万行 → 性能灾难
SELECT * FROM GetRecentOrders(30) g
JOIN Customers c ON g.CustomerID = c.CustomerID

解决方案

-- ✅ 好:改为内联表值函数
CREATE FUNCTION GetRecentOrdersInline(@Days INT)
RETURNS TABLE
AS
RETURN
(
    SELECT OrderID, CustomerID, OrderDate
    FROM Orders 
    WHERE OrderDate > DATEADD(day, -@Days, GETDATE())
)
-- 优化器能准确估算行数

四、SET NOCOUNT ON 的重要性

-- ❌ 坏:每次 DML 都返回影响行数
CREATE PROC sp_UpdateOrders
AS
BEGIN
    UPDATE Orders SET Status = 'Processed' WHERE Status = 'Pending'
    -- 返回 (5000 rows affected) 到客户端
    UPDATE OrderDetails SET Processed = 1 WHERE OrderID IN (SELECT OrderID FROM Orders WHERE Status = 'Processed')
    -- 再次返回行数
END
-- 网络流量增加,客户端处理额外结果集

-- ✅ 好:禁用影响行数消息
CREATE PROC sp_UpdateOrders
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE Orders SET Status = 'Processed' WHERE Status = 'Pending'
    UPDATE OrderDetails SET Processed = 1 WHERE OrderID IN (SELECT OrderID FROM Orders WHERE Status = 'Processed')
END

五、递归 CTE(Common Table Expression)

5.1 基础用法

-- 查询组织结构树
WITH OrgCTE AS (
    -- 锚点成员:根节点
    SELECT EmployeeID, ManagerID, EmployeeName, 0 AS Level
    FROM Employees
    WHERE ManagerID IS NULL
    
    UNION ALL
    
    -- 递归成员
    SELECT e.EmployeeID, e.ManagerID, e.EmployeeName, Level + 1
    FROM Employees e
    INNER JOIN OrgCTE o ON e.ManagerID = o.EmployeeID
)
SELECT * FROM OrgCTE
OPTION (MAXRECURSION 100);  -- 默认 100,可设置 0 表示无限制

5.2 递归深度限制与性能

-- 查询物料清单(BOM)展开
WITH BOMCTE AS (
    SELECT 
        ProductID,
        ComponentID,
        Quantity,
        1 AS Level,
        CAST(ComponentID AS VARCHAR(MAX)) AS Path
    FROM BOM
    WHERE ProductID = @TopProduct
    
    UNION ALL
    
    SELECT 
        b.ProductID,
        b.ComponentID,
        b.Quantity,
        Level + 1,
        Path + '->' + CAST(b.ComponentID AS VARCHAR)
    FROM BOM b
    INNER JOIN BOMCTE c ON b.ProductID = c.ComponentID
    WHERE Level < 20  -- 防止无限递归
)
SELECT * FROM BOMCTE

性能注意事项

  • 递归 CTE 逐层处理,深度过大会很慢
  • 确保锚点成员有索引(ManagerID、ParentID)
  • 考虑使用临时表 + 循环替代深层递归

六、动态 SQL 在存储过程中的使用

CREATE PROC sp_SearchProducts
    @Keyword NVARCHAR(100) = NULL,
    @CategoryID INT = NULL,
    @MinPrice DECIMAL = NULL,
    @MaxPrice DECIMAL = NULL,
    @SortColumn NVARCHAR(50) = 'ProductID',
    @SortDirection NVARCHAR(4) = 'ASC'
AS
BEGIN
    SET NOCOUNT ON;
    
    -- 白名单验证
    IF @SortColumn NOT IN ('ProductID', 'ProductName', 'Price', 'CreateDate')
        SET @SortColumn = 'ProductID';
    IF @SortDirection NOT IN ('ASC', 'DESC')
        SET @SortDirection = 'ASC';
    
    DECLARE @sql NVARCHAR(MAX) = N'SELECT ProductID, ProductName, Price FROM Products WHERE 1=1'
    DECLARE @params NVARCHAR(MAX) = N'@Keyword NVARCHAR(100), @CategoryID INT, @MinPrice DECIMAL, @MaxPrice DECIMAL'
    
    IF @Keyword IS NOT NULL
        SET @sql += N' AND ProductName LIKE @Keyword'
    IF @CategoryID IS NOT NULL
        SET @sql += N' AND CategoryID = @CategoryID'
    IF @MinPrice IS NOT NULL
        SET @sql += N' AND Price >= @MinPrice'
    IF @MaxPrice IS NOT NULL
        SET @sql += N' AND Price <= @MaxPrice'
    
    SET @sql += N' ORDER BY ' + QUOTENAME(@SortColumn) + N' ' + @SortDirection
    
    -- 添加分页
    SET @sql += N' OFFSET 0 ROWS FETCH NEXT 100 ROWS ONLY'
    
    EXEC sp_executesql @sql, @params, 
        @Keyword = '%' + @Keyword + '%',  -- 支持模糊查询
        @CategoryID = @CategoryID,
        @MinPrice = @MinPrice,
        @MaxPrice = @MaxPrice
END

七、监控与调试工具

-- 1. 查看存储过程执行统计
SELECT 
    OBJECT_NAME(object_id) AS ProcName,
    cached_time,
    execution_count,
    total_worker_time / execution_count AS avg_cpu_us,
    total_logical_reads / execution_count AS avg_logical_reads
FROM sys.dm_exec_procedure_stats
WHERE database_id = DB_ID()
ORDER BY avg_cpu_us DESC

-- 2. 查看函数使用情况
SELECT 
    OBJECT_NAME(object_id) AS FunctionName,
    execution_count,
    total_worker_time,
    total_logical_reads
FROM sys.dm_exec_function_stats
WHERE database_id = DB_ID()
ORDER BY total_worker_time DESC

-- 3. 查看当前运行的存储过程
SELECT 
    r.session_id,
    OBJECT_NAME(st.objectid) AS ProcName,
    r.cpu_time,
    r.total_elapsed_time,
    SUBSTRING(st.text, (r.statement_start_offset/2)+1,
        ((CASE r.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
         ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) AS statement
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) st
WHERE r.command = 'SELECT'

八、核心总结

知识点核心要点
存储过程 vs 函数存储过程适合业务流程,函数适合计算
内联表值函数性能最好,相当于参数化视图
标量函数性能杀手,尽量用内联函数或窗口函数替代
参数嗅探本地变量、OPTION (RECOMPILE) 解决
SET NOCOUNT ON减少网络流量,每个存储过程都应该加
递归 CTE深度有限制,注意性能
动态 SQL必须用参数化 + QUOTENAME 防注入

一句话记住本期内容

存储过程用 NOCOUNT ON 和本地变量防嗅探,内联表值函数优于标量函数,动态 SQL 必须参数化。

动手练习

-- 问题1:为什么下面的标量函数会导致查询极慢?如何优化?
CREATE FUNCTION dbo.GetOrderCount(@CustomerID INT)
RETURNS INT
AS
BEGIN
    DECLARE @Count INT
    SELECT @Count = COUNT(*) FROM Orders WHERE CustomerID = @CustomerID
    RETURN @Count
END

SELECT CustomerID, dbo.GetOrderCount(CustomerID) FROM Customers

-- 问题2:这个存储过程有参数嗅探风险吗?如何修复?
CREATE PROC sp_GetSales @Year INT, @Month INT
AS
SELECT SUM(Amount) FROM Sales WHERE YEAR(SaleDate) = @Year AND MONTH(SaleDate) = @Month

-- 问题3:如何用递归 CTE 查询所有下属员工?
-- 表:Employees (EmployeeID, ManagerID, Name)
-- 给定 @ManagerID = 1,返回所有直接和间接下属
点击查看参考答案

问题1:标量函数导致每行执行一次子查询(全表扫描或索引查找)。优化方案:

-- 方案1:窗口函数
SELECT CustomerID, COUNT(*) OVER (PARTITION BY CustomerID) AS OrderCount FROM Orders
-- 方案2:内联表值函数
CREATE FUNCTION GetOrderCountInline(@CustomerID INT) RETURNS TABLE AS RETURN (SELECT COUNT(*) AS OrderCount FROM Orders WHERE CustomerID = @CustomerID)
SELECT c.CustomerID, oc.OrderCount FROM Customers C CROSS APPLY GetOrderCountInline(c.CustomerID) oc

问题2:有风险,函数包裹列导致索引失效且参数嗅探。修复:

CREATE PROC sp_GetSales @Year INT, @Month INT
AS
DECLARE @StartDate DATE = DATEFROMPARTS(@Year, @Month, 1)
DECLARE @EndDate DATE = DATEADD(MONTH, 1, @StartDate)
SELECT SUM(Amount) FROM Sales WHERE SaleDate >= @StartDate AND SaleDate < @EndDate

问题3

WITH Subordinates AS (
    SELECT EmployeeID, ManagerID, Name, 0 AS Level
    FROM Employees WHERE EmployeeID = @ManagerID
    UNION ALL
    SELECT e.EmployeeID, e.ManagerID, e.Name, Level + 1
    FROM Employees e INNER JOIN Subordinates s ON e.ManagerID = s.EmployeeID
)
SELECT * FROM Subordinates WHERE Level > 0

下一期预告

触发器与约束——数据完整性的守护者

  • AFTER vs INSTEAD OF 触发器的选择
  • 触发器的嵌套与递归
  • 使用触发器实现审计日志
  • 约束 vs 触发器:性能与功能对比
  • 替代触发器的现代方案

📌 本文代码已在 SQL Server 2019+ 验证。存储过程和函数是模块化的基础,但使用不当会成为性能瓶颈。

本系列持续更新中,点击关注不错过第七期。