Life has no inherent meaning; a kindred spirit is hard to find.
I won't belabor why this chapter matters; if you know, you know.
Table of Contents
1. Architecture and Query Flow
1.1 MySQL Logical Architecture
flowchart TD
A[Client] --> B[Connection Layer]
B --> C[SQL Layer]
C --> D[Storage Engine Layer]
D --> E[File System]
subgraph "Connection Layer"
B1[Connection pool<br/>thread management]
B2[Authentication]
B3[Security checks]
end
subgraph "SQL Layer"
C1[Parser]
C2[Optimizer]
C3[Executor]
C4[Query Cache<br/>removed in MySQL 8.0]
C5[Built-in functions]
end
subgraph "Storage Engine Layer"
D1[InnoDB<br/>default, transactional]
D2[MyISAM<br/>read-heavy workloads]
D3[Memory<br/>in-memory tables]
D4[Archive<br/>archival storage]
end
B --> B1
C --> C1 --> C2 --> C3
D --> D1
D --> D2
1.2 SQL Execution Flow
flowchart LR
A[SQL statement] --> B[Lexing and parsing<br/>build parse tree]
B --> C[Preprocessing<br/>verify tables/columns exist]
C --> D[Query optimizer<br/>build execution plan]
D --> E[Execution engine<br/>calls storage engine]
E --> F[Return results]
D --> D1[Pick the best index]
D --> D2[Decide JOIN order]
D --> D3[Choose access method<br/>full scan vs. index scan]
style D fill:#ff9999
style D1 fill:#ffcc99
style D2 fill:#ffcc99
style D3 fill:#ffcc99
Why does the architecture matter?
- Optimizer choices: in practice, most performance problems trace back to the optimizer picking a poor execution plan
- Storage engine differences: InnoDB supports transactions and row-level locks; MyISAM's table-level locks can be faster for pure reads but cripple concurrency
2. Data Definition Language (DDL)
2.1 Building Tables Well
-- ============================================
-- Best practice: e-commerce orders table
-- ============================================
CREATE TABLE `orders` (
`order_id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'order ID',
`user_id` INT UNSIGNED NOT NULL COMMENT 'user ID',
`order_no` VARCHAR(32) NOT NULL COMMENT 'order number',
`total_amount` DECIMAL(10,2) NOT NULL DEFAULT '0.00' COMMENT 'order total',
`status` TINYINT UNSIGNED NOT NULL DEFAULT '0' COMMENT 'status: 0 pending payment, 1 paid, 2 shipped, 3 completed, 4 cancelled',
`pay_time` DATETIME DEFAULT NULL COMMENT 'payment time',
`created_at` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
`updated_at` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time',
-- Primary key: why BIGINT rather than INT?
-- 1. Signed INT tops out around 2.1 billion; a busy e-commerce system can outgrow it
-- 2. BIGINT costs 8 bytes vs. 4 for INT: a little space traded for headroom
-- Note: created_at appears in both unique keys below because MySQL requires
-- every unique key (including the primary key) of a partitioned table to
-- contain all columns in the partitioning expression. Uniqueness of
-- order_no alone must then be enforced at the application layer, or the
-- table left unpartitioned.
PRIMARY KEY (`order_id`, `created_at`),
-- Unique index: order numbers must be unique, and are a common lookup key
UNIQUE KEY `uk_order_no` (`order_no`, `created_at`),
-- Composite index: leftmost-prefix principle
-- Target query: WHERE user_id=? AND status=? ORDER BY created_at DESC
-- Column order: equality columns first, range/sort columns last
KEY `idx_user_status_created` (`user_id`, `status`, `created_at`),
-- Standalone index: range queries on payment time
KEY `idx_pay_time` (`pay_time`)
) ENGINE=InnoDB
DEFAULT CHARSET=utf8mb4
COLLATE=utf8mb4_unicode_ci
COMMENT='orders table'
-- Partitioning: RANGE on creation year, to speed up queries over historical data
PARTITION BY RANGE (YEAR(created_at)) (
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025),
PARTITION p2025 VALUES LESS THAN (2026),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
Performance comparison:
-- Slow: full table scan (before idx_user_status_created exists)
EXPLAIN SELECT * FROM orders WHERE user_id = 10086;
-- type: ALL, rows: 1000000 (scans 1M rows)
-- Fast: index scan (with idx_user_status_created in place)
EXPLAIN SELECT * FROM orders WHERE user_id = 10086 AND status = 1;
-- type: ref, rows: 10 (touches only 10 rows)
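The same before/after contrast can be reproduced with Python's built-in sqlite3 module. SQLite's EXPLAIN QUERY PLAN is not MySQL's EXPLAIN, and the table here is a toy stand-in, but the shape of the change (full scan becomes an index search once the composite index exists) carries over:

```python
# Sketch: full scan vs. index search, using SQLite as a stand-in for MySQL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY,"
             " user_id INT, status INT, created_at TEXT)")
conn.executemany("INSERT INTO orders (user_id, status, created_at) VALUES (?, ?, ?)",
                 [(i % 1000, i % 5, "2024-01-01") for i in range(10_000)])

# Without an index, the plan is a full table scan.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42 AND status = 1"
).fetchall()[0][3]  # last column is the human-readable plan step

# With the composite index, the plan becomes an index search.
conn.execute("CREATE INDEX idx_user_status_created"
             " ON orders (user_id, status, created_at)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42 AND status = 1"
).fetchall()[0][3]

print(plan_before)  # e.g. "SCAN orders"
print(plan_after)   # e.g. "SEARCH orders USING INDEX idx_user_status_created ..."
```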
2.2 Index Creation Flowchart
flowchart TD
A[Analyze query needs] --> B{Query type?}
B -->|Equality| C[Single-column index]
B -->|Multiple AND conditions| D[Composite index<br/>leftmost prefix]
B -->|Sorting| E[Index includes the sort column]
B -->|Covering query| F[Index includes the SELECT columns]
C --> G[Assess selectivity<br/>cardinality / row count]
D --> G
E --> G
F --> G
G --> H{Selectivity > 0.1?}
H -->|yes| I[Create the index]
H -->|no| J[Consider skipping it<br/>or combining with other columns]
I --> K[Verify with EXPLAIN]
J --> K
K --> L{type = ref/range?}
L -->|yes| M[Optimization succeeded]
L -->|no| N[Adjust column order or columns]
N --> A
3. Data Manipulation Language (DML)
3.1 INSERT Optimization
-- ============================================
-- Scenario: bulk-load 1 million user rows
-- ============================================
-- ❌ Slow: row-by-row inserts (a commit per statement, heavy IO)
INSERT INTO users (name, email) VALUES ('张三', 'zhangsan@example.com');
INSERT INTO users (name, email) VALUES ('李四', 'lisi@example.com');
-- Illustrative timing: ~1000 s at ~1000 disk IOs per second
-- ✅ Fast #1: multi-row INSERT (fewer round trips, fewer commits)
INSERT INTO users (name, email) VALUES
('张三', 'zhangsan@example.com'),
('李四', 'lisi@example.com'),
('王五', 'wangwu@example.com'),
-- e.g. 1000 rows per statement
('赵六', 'zhaoliu@example.com');
-- Illustrative timing: ~10 s
-- ✅ Fast #2: LOAD DATA (fastest)
-- Prepare a CSV file: users.csv
-- 张三,zhangsan@example.com
-- 李四,lisi@example.com
LOAD DATA INFILE '/var/lib/mysql-files/users.csv'
INTO TABLE users
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(name, email);
-- Illustrative timing: ~3 s (faster still with uniqueness checks off)
-- ✅ Fast #3: disable checks for very large loads
SET autocommit=0;
SET unique_checks=0; -- skip unique checks
SET foreign_key_checks=0; -- skip foreign-key checks
ALTER TABLE users DISABLE KEYS; -- MyISAM only (a no-op on InnoDB)
-- ... bulk inserts here ...
ALTER TABLE users ENABLE KEYS;
SET unique_checks=1;
SET foreign_key_checks=1;
COMMIT;
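The per-row vs. batched contrast is easy to demonstrate with Python's sqlite3 module. This is a sketch, not a MySQL benchmark: the table, row counts, and timings are made up for illustration, but the mechanism (fewer commits, fewer statement round trips) is the same one the SQL above exploits:

```python
# Sketch: commit-per-row vs. batched inserts, with SQLite standing in for MySQL.
import sqlite3
import time

rows = [(f"user{i}", f"user{i}@example.com") for i in range(5_000)]

def insert_one_by_one(conn):
    for name, email in rows:
        conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))
        conn.commit()  # worst case: one transaction per row

def insert_batched(conn, batch=1000):
    for i in range(0, len(rows), batch):
        conn.executemany("INSERT INTO users (name, email) VALUES (?, ?)",
                         rows[i:i + batch])
        conn.commit()  # one transaction per 1000 rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
t0 = time.perf_counter(); insert_one_by_one(conn); t_single = time.perf_counter() - t0
conn.execute("DELETE FROM users"); conn.commit()
t0 = time.perf_counter(); insert_batched(conn); t_batch = time.perf_counter() - t0
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(f"per-row: {t_single:.3f}s  batched: {t_batch:.3f}s  rows: {count}")
```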
Batch-insert flowchart:
flowchart LR
A[Prepare data] --> B{Volume?}
B -->|< 10k rows| C[Plain INSERTs<br/>in one transaction]
B -->|10k to 1M| D[Multi-row INSERT<br/>1000 rows per statement]
B -->|> 1M| E[LOAD DATA<br/>or checks disabled]
C --> F[Single commit]
D --> G[Commit per batch<br/>avoid huge transactions]
E --> H[Rebuild indexes<br/>once at the end]
F --> I[Done]
G --> I
H --> I
style E fill:#90EE90
style H fill:#90EE90
3.2 UPDATE and DELETE Optimization
-- ============================================
-- Scenario: mark orders older than 3 months as completed
-- ============================================
-- ❌ Slow: one giant UPDATE (huge transaction, long lock hold time)
UPDATE orders
SET status = 3, updated_at = NOW()
WHERE created_at < DATE_SUB(NOW(), INTERVAL 3 MONTH);
-- Problem: locks ~1M rows, blocks other queries, bloats the undo log
-- ✅ Fast: update in batches (bounded transaction size)
DELIMITER //
CREATE PROCEDURE batch_update_orders()
BEGIN
DECLARE affected_rows INT;
REPEAT
-- touch at most 1000 rows per pass
UPDATE orders
SET status = 3, updated_at = NOW()
WHERE created_at < DATE_SUB(NOW(), INTERVAL 3 MONTH)
AND status != 3
LIMIT 1000;
SET affected_rows = ROW_COUNT();
COMMIT; -- commit promptly to release locks (assumes autocommit is off)
DO SLEEP(0.1); -- breathing room for other queries
UNTIL affected_rows = 0 END REPEAT;
END //
DELIMITER ;
CALL batch_update_orders();
-- ============================================
-- DELETE optimization: soft delete vs. hard delete
-- ============================================
-- ✅ Recommended: soft delete (keep the row, flag it deleted)
ALTER TABLE orders ADD COLUMN is_deleted TINYINT DEFAULT 0;
ALTER TABLE orders ADD COLUMN deleted_at DATETIME DEFAULT NULL;
ALTER TABLE orders ADD KEY idx_is_deleted_created (is_deleted, created_at);
-- deleting becomes an UPDATE (milliseconds)
UPDATE orders
SET is_deleted = 1, deleted_at = NOW()
WHERE order_id = 12345;
-- filter on reads (index-friendly)
SELECT * FROM orders
WHERE is_deleted = 0 -- uses the index
AND user_id = 10086;
-- periodically archive and hard-delete (run off-peak)
CREATE TABLE orders_archive LIKE orders;
-- archive soft-deleted orders older than 3 years
INSERT INTO orders_archive
SELECT * FROM orders
WHERE is_deleted = 1
AND deleted_at < DATE_SUB(NOW(), INTERVAL 3 YEAR);
DELETE FROM orders
WHERE is_deleted = 1
AND deleted_at < DATE_SUB(NOW(), INTERVAL 3 YEAR);
-- delete in batches, same pattern as the stored procedure above
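The batched-delete loop is easy to sketch in Python with sqlite3. SQLite has no `DELETE ... LIMIT` by default, so this stand-in limits via a key subquery; the control flow (delete a batch, commit, stop when a pass affects zero rows) mirrors the stored procedure above. Table and column names are invented for the demo:

```python
# Sketch: delete in bounded batches, committing after each one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY,"
             " is_deleted INT, deleted_at TEXT)")
conn.executemany("INSERT INTO orders (is_deleted, deleted_at) VALUES (?, ?)",
                 [(1, "2020-01-01") if i % 2 == 0 else (0, None)
                  for i in range(10_000)])

BATCH = 1000
batches = 0
while True:
    cur = conn.execute(
        "DELETE FROM orders WHERE order_id IN "
        "(SELECT order_id FROM orders WHERE is_deleted = 1 LIMIT ?)", (BATCH,))
    conn.commit()  # keep each transaction small, as in the SQL version
    batches += 1
    if cur.rowcount == 0:  # a pass that deletes nothing means we are done
        break

remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(batches, remaining)  # 5000 rows deleted: 5 full batches + 1 empty pass
```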
Batch-processing flowchart:
flowchart TD
A[Start update/delete] --> B[Pick a batch size<br/>1000-5000 rows]
B --> C[Run one LIMIT batch]
C --> D{Rows affected?}
D -->|yes| E[Commit the transaction]
E --> F[Sleep 10-100 ms]
F --> C
D -->|no| G[Done]
H[Monitoring] --> I[Watch slow queries]
H --> J[Watch lock waits]
H --> K[Watch the undo tablespace]
style E fill:#90EE90
style F fill:#FFD700
4. Data Query Language (DQL): the Main Event
4.1 Analyzing SELECT Execution Plans
-- ============================================
-- Example schema
-- users: 1 million rows
-- orders: 5 million rows
-- order_items: 20 million rows
-- ============================================
-- Test tables
CREATE TABLE users (
user_id INT PRIMARY KEY AUTO_INCREMENT,
username VARCHAR(50),
email VARCHAR(100),
created_at DATETIME,
KEY idx_created_at (created_at)
);
CREATE TABLE orders (
order_id BIGINT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
order_no VARCHAR(32) NOT NULL,
total_amount DECIMAL(10,2),
status TINYINT,
created_at DATETIME,
KEY idx_user_id (user_id),
KEY idx_created_at (created_at),
KEY idx_status_created (status, created_at)
);
-- Generate test rows with a stored procedure
DELIMITER //
CREATE PROCEDURE generate_data()
BEGIN
DECLARE i INT DEFAULT 1;
WHILE i <= 1000000 DO
INSERT INTO users VALUES (i, CONCAT('user', i), CONCAT('user', i, '@test.com'), DATE_SUB(NOW(), INTERVAL FLOOR(RAND()*365) DAY));
SET i = i + 1;
END WHILE;
END //
DELIMITER ;
4.2 EXPLAIN in Depth
-- ============================================
-- Case 1: single-table query tuning
-- ============================================
-- ❌ Slow: SELECT * plus a function wrapped around the indexed column
EXPLAIN SELECT * FROM users
WHERE YEAR(created_at) = 2024;
-- type: ALL (full table scan)
-- rows: 1000000
-- Extra: Using where
-- Why is it slow?
-- 1. YEAR(created_at) prevents the optimizer from using idx_created_at
-- 2. SELECT * forces a clustered-index lookup for every column
-- ✅ Fast: range predicate plus a covering read
EXPLAIN SELECT user_id FROM users
WHERE created_at >= '2024-01-01'
AND created_at < '2025-01-01';
-- type: range (index range scan)
-- rows: 100000
-- Extra: Using index (covering: an InnoDB secondary index carries the
-- primary key, so user_id needs no row lookup; to also cover username,
-- extend the index to (created_at, username))
-- ============================================
-- Case 2: JOIN optimization (nested-loop join)
-- ============================================
-- ❌ Slow: the big table drives
EXPLAIN SELECT o.*, u.username
FROM orders o
JOIN users u ON o.user_id = u.user_id
WHERE o.created_at > '2024-01-01';
-- the optimizer may pick orders (5M rows) as the driving table
-- ✅ Fast: small side drives, via STRAIGHT_JOIN
EXPLAIN SELECT o.*, u.username
FROM users u
STRAIGHT_JOIN orders o ON o.user_id = u.user_id
WHERE u.created_at > '2024-06-01' -- few recently registered users
AND o.created_at > '2024-01-01';
-- forces users to be the driving table
-- Why it helps (back-of-envelope, using log2 for a B+Tree lookup):
-- nested-loop cost ≈ driving-table rows × per-lookup cost on the inner table
-- small side driving: 1,000 × log2(5M) ≈ 1,000 × 23 ≈ 23,000
-- big side driving: 5M × log2(1M) ≈ 5M × 20 = 100,000,000
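The cost arithmetic above can be checked mechanically. This is the same back-of-envelope model, not the real optimizer's cost function: each inner-table lookup is priced as one B+Tree descent, log2 of the inner table's row count:

```python
# Sketch: nested-loop join cost ≈ driving rows × log2(inner rows).
import math

def nlj_cost(driving_rows: int, inner_rows: int) -> int:
    """Toy cost: one log2-depth index descent per driving row."""
    return round(driving_rows * math.log2(inner_rows))

small_drives = nlj_cost(1_000, 5_000_000)      # filtered users drive orders
large_drives = nlj_cost(5_000_000, 1_000_000)  # orders drive users
print(small_drives, large_drives)
```

The roughly four-orders-of-magnitude gap is why "small table drives" is the rule of thumb.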
JOIN algorithm selection flowchart (note: MySQL has no sort-merge join; before 8.0.18 the index-less fallback was block nested-loop):
flowchart TD
A[Parse JOIN query] --> B{Table sizes?}
B -->|small + large<br/>index available| C[Nested-loop join<br/>small side drives]
B -->|large + large| D[Hash join<br/>MySQL 8.0.18+]
B -->|no usable index<br/>pre-8.0.18| E[Block nested-loop join]
C --> F[Scan driving table<br/>full or via index]
F --> G[Index lookups on inner table<br/>ref/eq_ref]
D --> H[Build hash table<br/>from the smaller input]
H --> I[Probe hash table<br/>with the larger input]
E --> J[Buffer driving rows<br/>in the join buffer]
J --> K[Scan inner table per batch]
G --> L[Return results]
I --> L
K --> L
style C fill:#90EE90
style D fill:#90EE90
4.3 Index Optimization in Practice
-- ============================================
-- Case 3: the leftmost-prefix rule
-- ============================================
-- Index under discussion: idx_status_created (status, created_at)
-- (assume for this case that it is the only index on the table)
-- ✅ Works: uses column 1
EXPLAIN SELECT * FROM orders WHERE status = 1;
-- type: ref, key: idx_status_created
-- ✅ Works: uses columns 1 and 2
EXPLAIN SELECT * FROM orders
WHERE status = 1 AND created_at > '2024-01-01';
-- type: range, key: idx_status_created
-- ❌ Fails: skips column 1
EXPLAIN SELECT * FROM orders WHERE created_at > '2024-01-01';
-- type: ALL, key: NULL (full scan)
-- ❌ Degraded: a range on column 1 means column 2 cannot supply the sort order
EXPLAIN SELECT * FROM orders
WHERE status > 0 ORDER BY created_at;
-- type: range, Extra: Using filesort
-- Partial fix: rewrite the range as an IN list of equalities
EXPLAIN SELECT * FROM orders
WHERE status IN (1, 2, 3) ORDER BY created_at;
-- each status value becomes its own index range; note that MySQL may still
-- report Using filesort here, because rows from the three ranges must be
-- merged to produce a single created_at order
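The leftmost-prefix behavior can be reproduced with sqlite3 as a stand-in (SQLite's planner follows the same composite-index logic; an `amount` column is added so `SELECT *` cannot be served from the index alone):

```python
# Sketch: leftmost-prefix rule on a composite index, SQLite edition.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY,"
             " status INT, created_at TEXT, amount REAL)")
conn.execute("CREATE INDEX idx_status_created ON orders (status, created_at)")

def detail(sql):
    # last column of EXPLAIN QUERY PLAN output is the plan description
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()[0][3]

uses_index  = detail("SELECT * FROM orders WHERE status = 1")
uses_both   = detail("SELECT * FROM orders"
                     " WHERE status = 1 AND created_at > '2024-01-01'")
skips_first = detail("SELECT * FROM orders WHERE created_at > '2024-01-01'")
print(uses_index)   # index search on idx_status_created
print(skips_first)  # falls back to a scan: column 1 was skipped
```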
-- ============================================
-- Case 4: covering indexes
-- ============================================
-- Scenario: only the order number and amount are ever read
-- Add a covering index
ALTER TABLE orders ADD KEY idx_covering (user_id, status, order_no, total_amount);
-- ✅ Fully covered: every referenced column is in the index
EXPLAIN SELECT order_no, total_amount
FROM orders
WHERE user_id = 10086 AND status = 1;
-- Extra: Using index (no lookup back to the row)
-- Contrast: a query that must go back to the row
EXPLAIN SELECT order_no, total_amount, pay_time -- pay_time is not in the index
FROM orders
WHERE user_id = 10086 AND status = 1;
-- Extra: NULL (row lookup needed for pay_time)
-- Rough comparison (illustrative numbers):
-- covered: served from index pages in memory, ~0.1 ms
-- with row lookups: random IO into data pages, ~1-5 ms
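SQLite makes the covered/not-covered distinction visible in its plan text, so the contrast above can be sketched directly (table is a cut-down stand-in for the orders schema):

```python
# Sketch: covering vs. non-covering index reads, SQLite edition.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INT,"
             " status INT, order_no TEXT, total_amount REAL, pay_time TEXT)")
conn.execute("CREATE INDEX idx_covering"
             " ON orders (user_id, status, order_no, total_amount)")

def detail(sql):
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()[0][3]

covered = detail("SELECT order_no, total_amount FROM orders"
                 " WHERE user_id = 10086 AND status = 1")
back_to_row = detail("SELECT order_no, total_amount, pay_time FROM orders"
                     " WHERE user_id = 10086 AND status = 1")
print(covered)      # mentions COVERING INDEX: no row lookup needed
print(back_to_row)  # plain INDEX: pay_time forces a lookup into the row
```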
Index selection flowchart:
flowchart TD
A[Query] --> B{Usable index?}
B -->|no| C[Full table scan<br/>type: ALL]
B -->|yes| D{Index contains all referenced columns?}
D -->|yes| E[Covering index<br/>Using index]
D -->|no| F{Predicate type?}
F -->|Equality| G[ref<br/>constant/equality lookup]
F -->|Range| H[range<br/>index range scan]
F -->|Leftmost prefix only| I[index<br/>full index scan]
F -->|Sort outside index order| J[filesort<br/>extra sort step]
E --> K[Best]
G --> L[Good]
H --> M[Fair]
I --> N[Mediocre]
C --> O[Poor]
J --> P[Needs work]
style E fill:#90EE90
style K fill:#90EE90
style O fill:#ff6666
style P fill:#ff6666
4.4 Pagination Optimization
-- ============================================
-- Case 5: deep pagination
-- ============================================
-- ❌ Slow: LIMIT 1000000, 10
-- why: scans 1,000,010 rows and throws away the first 1,000,000
EXPLAIN SELECT * FROM orders
ORDER BY created_at DESC
LIMIT 1000000, 10;
-- rows: 1000010
-- illustrative timing: 10 s+
-- ✅ Option 1: deferred join
-- idea: fetch just the IDs first, then join for the full rows
EXPLAIN SELECT o.*
FROM orders o
JOIN (
SELECT order_id
FROM orders
ORDER BY created_at DESC
LIMIT 1000000, 10
) tmp ON o.order_id = tmp.order_id;
-- the subquery scans only the index: rows: 1000010, Using index
-- the join then touches just 10 rows
-- illustrative timing: ~0.5 s
-- ✅ Option 2: keyset (bookmark) pagination, driven by the application
-- created_at of the last row on the previous page = '2024-03-15 10:30:00'
EXPLAIN SELECT * FROM orders
WHERE created_at < '2024-03-15 10:30:00'
ORDER BY created_at DESC
LIMIT 10;
-- type: range, rows: 10
-- illustrative timing: ~1 ms
-- ✅ Option 3: covering index + deferred join
EXPLAIN SELECT o.*
FROM orders o
JOIN (
SELECT order_id
FROM orders
WHERE created_at < '2024-03-15 10:30:00'
ORDER BY created_at DESC
LIMIT 10
) tmp ON o.order_id = tmp.order_id;
-- best of both: a covering-index range scan plus precise row lookups
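A runnable keyset-pagination sketch, again with sqlite3 as the stand-in (row-value comparison needs SQLite 3.15+; the `(created_at, order_id)` tuple breaks ties between rows sharing a timestamp, which single-column bookmarks above gloss over):

```python
# Sketch: keyset pagination with a (created_at, order_id) bookmark.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany("INSERT INTO orders (created_at) VALUES (?)",
                 [(f"2024-01-{d:02d}",) for d in range(1, 29) for _ in range(10)])
conn.execute("CREATE INDEX idx_created ON orders (created_at, order_id)")

PAGE = 10

def first_page():
    return conn.execute(
        "SELECT order_id, created_at FROM orders"
        " ORDER BY created_at DESC, order_id DESC LIMIT ?", (PAGE,)).fetchall()

def page_after(last_created, last_id):
    # row-value comparison keeps ties on created_at unambiguous
    return conn.execute(
        "SELECT order_id, created_at FROM orders"
        " WHERE (created_at, order_id) < (?, ?)"
        " ORDER BY created_at DESC, order_id DESC LIMIT ?",
        (last_created, last_id, PAGE)).fetchall()

page1 = first_page()
page2 = page_after(page1[-1][1], page1[-1][0])
print(page1[0][0], page1[-1][0], page2[0][0], page2[-1][0])
```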
-- ============================================
-- Case 6: sort optimization
-- ============================================
-- Index: idx_user_created (user_id, created_at)
-- ✅ Fast: the index is already in the right order
EXPLAIN SELECT * FROM orders
WHERE user_id = 10086
ORDER BY created_at DESC;
-- Extra: NULL (index order used, no filesort; an ascending index can be
-- scanned backward, so a single-direction DESC sort is fine)
-- ❌ Slow: mixed sort directions
EXPLAIN SELECT * FROM orders
WHERE user_id = 10086
ORDER BY created_at DESC, order_id ASC; -- index order is created_at ASC, order_id ASC
-- Extra: Using filesort (a backward scan would yield order_id DESC, not ASC)
-- Fix: a descending index (MySQL 8.0+); its entries sort as created_at DESC
-- with the implicitly appended primary key ascending, matching the query
ALTER TABLE orders ADD KEY idx_user_created_desc (user_id, created_at DESC);
Pagination optimization flowchart:
flowchart TD
A[Paged query] --> B{How deep?}
B -->|< 1000 pages| C[Plain LIMIT<br/>acceptable performance]
B -->|> 1000 pages| D{Continuous sort key available?}
D -->|yes| E[Keyset pagination<br/>WHERE + LIMIT]
D -->|no| F[Deferred join<br/>subquery + JOIN]
C --> G[Query directly]
E --> H[Index seeks to the position]
F --> I[Fewer row lookups]
G --> J[Return results]
H --> J
I --> J
style E fill:#90EE90
style F fill:#90EE90
style H fill:#90EE90
5. Index Design and Optimization
5.1 Choosing an Index Type
-- ============================================
-- B+Tree indexes (the default)
-- ============================================
-- Primary key: the clustered index (rows stored in primary-key order)
CREATE TABLE clustered_demo (
id INT PRIMARY KEY, -- clustered index
name VARCHAR(50)
);
-- clustered-index leaf pages hold the full row
-- Secondary index: non-clustered; leaf entries hold the primary-key value
CREATE INDEX idx_name ON clustered_demo(name);
-- a lookup finds the PK in the secondary index, then fetches the row from the clustered index
-- ============================================
-- Hash indexes (Memory engine / adaptive hash)
-- ============================================
-- Adaptive hash index (maintained automatically by InnoDB)
-- good for: hot, equality-only lookups
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index'; -- ON by default
-- Emulating a hash index on top of B+Tree
-- scenario: looking up very long URLs
CREATE TABLE urls (
url VARCHAR(2083),
url_hash BIGINT UNSIGNED, -- CRC32 (or similar) hash of the URL
INDEX idx_hash (url_hash) -- short key, fast to seek
);
-- compute the hash at insert time
INSERT INTO urls VALUES ('https://...', CRC32('https://...'));
-- filter by hash on lookup
SELECT * FROM urls
WHERE url_hash = CRC32('https://...')
AND url = 'https://...'; -- exact match guards against hash collisions
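The hash-column trick translated into runnable Python: `zlib.crc32` plays the role of SQL's `CRC32()`, producing the short integer key that `url_hash` stores, and the final equality on the full URL handles collisions. The URL itself is made up for the demo:

```python
# Sketch: hash-column lookup with collision guard, SQLite edition.
import sqlite3
import zlib

def crc(s: str) -> int:
    return zlib.crc32(s.encode("utf-8"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (url TEXT, url_hash INTEGER)")
conn.execute("CREATE INDEX idx_hash ON urls (url_hash)")

url = "https://example.com/some/long/path?with=query&strings=here"
conn.execute("INSERT INTO urls VALUES (?, ?)", (url, crc(url)))

# Narrow by the short hash first (indexed), then confirm the full URL.
row = conn.execute("SELECT url FROM urls WHERE url_hash = ? AND url = ?",
                   (crc(url), url)).fetchone()
print(row[0] == url)
```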
-- ============================================
-- Full-text indexes
-- ============================================
-- scenario: searching article content
CREATE TABLE articles (
id INT PRIMARY KEY,
title VARCHAR(200),
content TEXT,
FULLTEXT INDEX ft_content (title, content) WITH PARSER ngram
) ENGINE=InnoDB;
-- CJK-aware search (requires the ngram parser)
SELECT * FROM articles
WHERE MATCH(title, content) AGAINST('数据库优化' IN NATURAL LANGUAGE MODE);
-- vs. LIKE: full-text search uses an inverted index; LIKE '%xxx%' scans the whole table
5.2 Index Design Principles
-- ============================================
-- Principle 1: selective columns first, with a caveat
-- ============================================
-- Suppose status takes only 0/1/2/3 (low selectivity) while created_at is nearly unique (high)
-- Tempting: idx_created_status (created_at, status), if created_at is the filter
-- But for WHERE status=1 AND created_at>'2024-01-01', status first works better
-- Rule of thumb: follow the query shape; equality columns first, range columns last
-- ============================================
-- Principle 2: avoid duplicate and redundant indexes
-- ============================================
-- Duplicate index (wasted space, slower writes)
CREATE TABLE bad_index (
id INT PRIMARY KEY,
a INT,
b INT,
UNIQUE KEY uk_a (a), -- the unique key is already an index
KEY idx_a (a) -- duplicate!
);
-- Redundant index (idx_a_b subsumes idx_a)
CREATE TABLE redundant_index (
id INT PRIMARY KEY,
a INT,
b INT,
KEY idx_a (a),
KEY idx_a_b (a, b) -- redundant: its leftmost prefix already covers idx_a
);
-- drop idx_a
-- Tooling: find redundant indexes
SELECT * FROM sys.schema_redundant_indexes;
-- ============================================
-- Principle 3: prefix indexes for long strings
-- ============================================
-- Email columns: the first 10-15 characters often discriminate well
ALTER TABLE users ADD KEY idx_email_prefix (email(10));
-- Measure how well a prefix discriminates
SELECT
COUNT(DISTINCT LEFT(email, 10)) / COUNT(DISTINCT email) as selectivity_10,
COUNT(DISTINCT LEFT(email, 15)) / COUNT(DISTINCT email) as selectivity_15
FROM users;
-- a prefix length whose selectivity exceeds 0.9 is usually good enough
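The selectivity ratio above is just distinct-prefix count over distinct-value count, so it can be sanity-checked in plain Python. The email pattern is contrived to make the point: a long shared prefix ruins short-prefix selectivity, and it recovers once the distinguishing part of the string fits inside the prefix:

```python
# Sketch: prefix selectivity = COUNT(DISTINCT LEFT(v, n)) / COUNT(DISTINCT v).
emails = [f"customer.{i:04d}@example.com" for i in range(1000)]

def prefix_selectivity(values, n):
    return len({v[:n] for v in values}) / len(set(values))

s10 = prefix_selectivity(emails, 10)  # 10 chars: still inside "customer.0..."
s15 = prefix_selectivity(emails, 15)  # 15 chars: the full 4-digit id is included
print(s10, s15)
```

Here a 10-character prefix is useless while 15 characters are fully selective, which is exactly the decision the SQL query above is meant to inform.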
Index design decision flowchart:
flowchart TD
A[Analyze queries] --> B{Query frequency?}
B -->|frequent| C[Consider an index]
B -->|rare| D[No index]
C --> E{Column selectivity?}
E -->|> 0.1| F[Single-column or composite index]
E -->|< 0.1| G{Combine with other columns?}
G -->|yes| H[Composite index<br/>ordered to match the query]
G -->|no| I[No index]
F --> J{String length?}
J -->|> 20| K[Prefix index]
J -->|<= 20| L[Full-column index]
H --> M{Covers the query?}
K --> M
L --> M
M -->|yes| N[Add covering columns]
M -->|no| O[Standard index]
N --> P[Verify with EXPLAIN]
O --> P
style F fill:#90EE90
style N fill:#90EE90
style I fill:#ffcc99
6. Transactions and Locking
6.1 Isolation Levels
-- ============================================
-- Viewing and setting the isolation level
-- ============================================
-- current level
SELECT @@transaction_isolation; -- MySQL 8.0
-- default: REPEATABLE-READ
-- per-session
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- ============================================
-- The four isolation levels compared
-- ============================================
-- 1. READ UNCOMMITTED
-- problem: dirty reads (sees uncommitted data)
-- used: almost never; only statistics where consistency barely matters
-- 2. READ COMMITTED (the Oracle / SQL Server default)
-- fixes: dirty reads
-- problem: non-repeatable reads (two reads in one transaction can differ)
-- fits: typical web workloads (less lock contention)
-- 3. REPEATABLE READ (the MySQL default)
-- fixes: dirty reads, non-repeatable reads
-- problem: phantoms (InnoDB largely closes this gap with MVCC + gap locks)
-- fits: workloads needing stronger consistency, e.g. money movement
-- 4. SERIALIZABLE
-- fixes: every concurrency anomaly
-- problem: dreadful throughput (every SELECT takes shared locks)
-- fits: rare, extreme-consistency scenarios
-- ============================================
-- MVCC in action
-- ============================================
-- transaction A begins
START TRANSACTION;
SELECT * FROM users WHERE user_id = 1; -- reads snapshot version V1
-- transaction B begins and commits
START TRANSACTION;
UPDATE users SET balance = 1000 WHERE user_id = 1;
COMMIT; -- creates new version V2
-- transaction A reads again (under REPEATABLE READ)
SELECT * FROM users WHERE user_id = 1; -- still sees V1 (repeatable read)
-- under READ COMMITTED it would see V2
COMMIT;
6.2 Locking in Detail
-- ============================================
-- Row locks
-- ============================================
-- schema
CREATE TABLE accounts (
account_id INT PRIMARY KEY,
balance DECIMAL(10,2),
KEY idx_balance (balance)
) ENGINE=InnoDB;
-- transaction A
START TRANSACTION;
SELECT * FROM accounts WHERE account_id = 1 FOR UPDATE; -- exclusive (X) lock
-- or SELECT ... FOR SHARE; -- shared (S) lock (LOCK IN SHARE MODE before 8.0)
-- transaction B (blocks until A commits)
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
-- ============================================
-- Gap locks: how phantoms are prevented
-- ============================================
-- table holds account_id = 1, 5, 10
-- transaction A
START TRANSACTION;
SELECT * FROM accounts WHERE account_id > 3 AND account_id < 8 FOR UPDATE;
-- locks roughly the next-key ranges (1,5] and the gap (5,10)
-- i.e. other transactions cannot insert account_id = 4, 6 or 7
-- transaction B (blocks)
INSERT INTO accounts VALUES (6, 1000); -- waits for A to release the gap lock
-- ============================================
-- Deadlocks: detection and avoidance
-- ============================================
-- classic deadlock:
-- transaction A: update row 1, then row 2
-- transaction B: update row 2, then row 1
-- run concurrently, each waits for the other in a cycle
-- avoidance 1: touch rows in one agreed order (always 1 before 2)
-- avoidance 2: optimistic locking via a version column
ALTER TABLE accounts ADD COLUMN version INT DEFAULT 0;
-- optimistic update
UPDATE accounts
SET balance = balance - 100, version = version + 1
WHERE account_id = 1 AND version = 0;
-- if ROW_COUNT() is 0, a concurrent writer won the race; retry
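The optimistic-lock pattern is easy to demonstrate end to end with sqlite3 as a stand-in: the UPDATE succeeds only if the version it read is still current, and a zero rowcount signals a lost race. The account numbers are invented for the demo:

```python
# Sketch: version-column optimistic locking.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY,"
             " balance REAL, version INT)")
conn.execute("INSERT INTO accounts VALUES (1, 500.0, 0)")

def withdraw(amount):
    bal, ver = conn.execute(
        "SELECT balance, version FROM accounts WHERE account_id = 1").fetchone()
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1"
        " WHERE account_id = 1 AND version = ?", (bal - amount, ver))
    conn.commit()
    return cur.rowcount == 1  # False means a concurrent writer got there first

ok_first = withdraw(100)  # succeeds: version is still 0
# Simulate a stale writer that still believes version is 0.
stale = conn.execute(
    "UPDATE accounts SET balance = 0 WHERE account_id = 1 AND version = 0")
conn.commit()
balance = conn.execute(
    "SELECT balance FROM accounts WHERE account_id = 1").fetchone()[0]
print(ok_first, stale.rowcount, balance)  # True 0 400.0
```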
Lock choice flowchart:
flowchart TD
A[Need data consistency] --> B{Concurrency profile?}
B -->|read-heavy| C[MVCC snapshot reads<br/>no locks]
B -->|mixed read/write| D[Row locks<br/>FOR UPDATE]
B -->|write-heavy| E[Table locks<br/>or a queue]
C --> F[Plain SELECT queries]
D --> G{Range predicate?}
G -->|yes| H[Gap locks<br/>stop phantoms]
G -->|no| I[Record lock<br/>single row]
E --> J[Bulk updates<br/>LOCK TABLES]
F --> K[High-concurrency reads]
H --> L[Range protected]
I --> M[Single row protected]
J --> N[Serialized processing]
style C fill:#90EE90
style F fill:#90EE90
7. Advanced Queries and Optimization Tricks
7.1 Window Functions (MySQL 8.0+)
-- ============================================
-- Scenario: rank users by spend (no self-join)
-- ============================================
-- sample table
CREATE TABLE user_orders (
user_id INT,
order_date DATE,
amount DECIMAL(10,2)
);
-- ❌ Slow: correlated subquery ranking
SELECT a.user_id, a.amount,
(SELECT COUNT(DISTINCT b.amount)
FROM user_orders b
WHERE b.amount > a.amount) + 1 as rank_num
FROM user_orders a;
-- complexity: O(n²)
-- ✅ Fast: window functions
SELECT
user_id,
amount,
RANK() OVER (ORDER BY amount DESC) as rank_num, -- rank with gaps after ties
DENSE_RANK() OVER (ORDER BY amount DESC) as dense_rank, -- rank without gaps
ROW_NUMBER() OVER (ORDER BY amount DESC) as row_num, -- plain row number
NTILE(4) OVER (ORDER BY amount DESC) as quartile -- quartile bucket
FROM user_orders;
-- complexity: O(n log n), one sort over one pass
-- ============================================
-- Scenario: running totals and sliding windows
-- ============================================
SELECT
user_id,
order_date,
amount,
-- running total
SUM(amount) OVER (
PARTITION BY user_id
ORDER BY order_date
ROWS UNBOUNDED PRECEDING
) as cumulative_amount,
-- average of the last 3 orders
AVG(amount) OVER (
PARTITION BY user_id
ORDER BY order_date
ROWS 2 PRECEDING
) as avg_last_3,
-- previous order within the same month
LAG(amount, 1) OVER (
PARTITION BY user_id, MONTH(order_date)
ORDER BY order_date
) as prev_order_amount
FROM user_orders;
-- ============================================
-- Scenario: top N per group
-- ============================================
-- each user's 3 most recent orders
WITH ranked_orders AS (
SELECT
user_id,
order_date,
amount,
ROW_NUMBER() OVER (
PARTITION BY user_id
ORDER BY order_date DESC
) as rn
FROM user_orders
)
SELECT * FROM ranked_orders WHERE rn <= 3;
-- replaces the old user-variable and subquery tricks, and is typically far faster
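The rank-function trio runs unchanged on SQLite (window functions need SQLite 3.25+, which ships with current Python builds), so the tie-handling difference between RANK and DENSE_RANK can be seen concretely. The four rows are invented test data:

```python
# Sketch: RANK vs. DENSE_RANK vs. ROW_NUMBER on tied amounts.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_orders (user_id INT, amount REAL)")
conn.executemany("INSERT INTO user_orders VALUES (?, ?)",
                 [(1, 300.0), (2, 300.0), (3, 200.0), (4, 100.0)])

rows = conn.execute("""
    SELECT user_id,
           RANK()       OVER (ORDER BY amount DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY amount DESC) AS drnk,
           ROW_NUMBER() OVER (ORDER BY amount DESC) AS rn
    FROM user_orders
    ORDER BY amount DESC, user_id
""").fetchall()
for r in rows:
    print(r)
# The 300.0 tie: RANK gives 1,1 then jumps to 3; DENSE_RANK gives 1,1 then 2.
```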
7.2 CTEs (Common Table Expressions)
-- ============================================
-- Recursive CTE: walking an org chart
-- ============================================
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(50),
manager_id INT,
INDEX idx_manager (manager_id)
);
-- all reports, direct and indirect
WITH RECURSIVE subordinates AS (
-- anchor member: the starting point
SELECT employee_id, name, manager_id, 0 as level
FROM employees
WHERE employee_id = 1 -- start from the CEO
UNION ALL
-- recursive member
SELECT e.employee_id, e.name, e.manager_id, s.level + 1
FROM employees e
JOIN subordinates s ON e.manager_id = s.employee_id
)
SELECT * FROM subordinates;
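The same recursive CTE runs verbatim on SQLite, so the traversal can be checked against a tiny hand-built hierarchy (the four employees are invented demo data):

```python
# Sketch: recursive CTE over a toy org chart.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,"
             " name TEXT, manager_id INT)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    (1, "CEO", None),
    (2, "VP Eng", 1),
    (3, "VP Sales", 1),
    (4, "Engineer", 2),   # reports to VP Eng, so level 2 under the CEO
])

rows = conn.execute("""
    WITH RECURSIVE subordinates AS (
        SELECT employee_id, name, manager_id, 0 AS level
        FROM employees WHERE employee_id = 1
        UNION ALL
        SELECT e.employee_id, e.name, e.manager_id, s.level + 1
        FROM employees e
        JOIN subordinates s ON e.manager_id = s.employee_id
    )
    SELECT employee_id, name, level FROM subordinates
    ORDER BY level, employee_id
""").fetchall()
print(rows)
```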
-- ============================================
-- Non-recursive CTEs: taming complex queries
-- ============================================
WITH
-- monthly stats
monthly_stats AS (
SELECT
DATE_FORMAT(created_at, '%Y-%m') as month,
COUNT(*) as order_count,
SUM(total_amount) as total_amount
FROM orders
WHERE created_at >= '2024-01-01'
GROUP BY DATE_FORMAT(created_at, '%Y-%m')
),
-- month-over-month deltas
growth_calc AS (
SELECT
month,
order_count,
total_amount,
LAG(total_amount) OVER (ORDER BY month) as prev_amount,
total_amount - LAG(total_amount) OVER (ORDER BY month) as growth
FROM monthly_stats
)
SELECT
month,
order_count,
total_amount,
ROUND(growth / prev_amount * 100, 2) as growth_rate
FROM growth_calc;
7.3 Query Optimization Checklist
-- ============================================
-- 1. Check the slow query log
-- ============================================
-- turn it on
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1; -- log anything over 1 second
SET GLOBAL log_queries_not_using_indexes = 'ON';
-- inspect it (reading mysql.slow_log requires log_output to include TABLE;
-- the default destination is FILE)
SELECT * FROM mysql.slow_log
WHERE start_time > DATE_SUB(NOW(), INTERVAL 1 DAY)
ORDER BY query_time DESC
LIMIT 10;
-- ============================================
-- 2. Profiling: SHOW PROFILE is deprecated; use Performance Schema
-- ============================================
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES';
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES';
-- recent statement timings (TIMER_WAIT is in picoseconds)
SELECT
EVENT_ID,
TRUNCATE(TIMER_WAIT/1000000000, 6) as duration_ms, -- picoseconds to milliseconds
SQL_TEXT
FROM performance_schema.events_statements_history_long
WHERE SQL_TEXT IS NOT NULL
ORDER BY TIMER_WAIT DESC
LIMIT 5;
-- ============================================
-- 3. Optimizer hints
-- ============================================
-- suggest an index
SELECT * FROM orders
USE INDEX (idx_user_status_created) -- a suggestion
-- FORCE INDEX (idx_user_status_created) -- a hard requirement
WHERE user_id = 10086 AND status = 1;
-- ignore an index
SELECT * FROM orders IGNORE INDEX (idx_status) WHERE status = 1;
-- hint: fix the JOIN order
SELECT /*+ JOIN_ORDER(orders, users) */ *
FROM orders
JOIN users ON orders.user_id = users.user_id;
-- hint: pick the JOIN algorithm (MySQL 8.0.18+)
SELECT /*+ HASH_JOIN(orders, users) */ *
FROM orders
JOIN users ON orders.user_id = users.user_id;
End-to-end SQL tuning flowchart:
flowchart TD
A[Slow query found] --> B[Run EXPLAIN]
B --> C{type?}
C -->|ALL| D[Add or fix an index]
C -->|index| E[Check for a covering index]
C -->|range| F[Check the range isn't too wide]
C -->|ref/eq_ref| G[Already decent<br/>check Extra]
D --> H[Analyze the WHERE clauses]
E --> I{Using index?}
F --> J[Limit rows returned]
G --> K{Using filesort?}
H --> L[Create the right index]
I -->|no| M[Add covering columns]
I -->|yes| N[Already optimal]
J --> O[Pagination fixes<br/>deferred join]
K -->|yes| P[Sort via the index]
L --> Q[Re-verify the plan]
M --> Q
O --> Q
P --> Q
Q --> R{Fast enough?}
R -->|no| S[Restructure the query<br/>CTE or temp table]
R -->|yes| T[Done]
S --> Q
style N fill:#90EE90
style T fill:#90EE90
style D fill:#ff9999
style K fill:#ffcc99
Summary: rules of thumb for SQL optimization
| Rule | What it means | Example |
|---|---|---|
| 1. Touch less data | Query only the columns and rows you need | avoid SELECT *; use LIMIT |
| 2. Compute less | No functions on indexed columns | created_at > '2024-01-01' beats YEAR(created_at)=2024 |
| 3. Use the indexes | Hit an index; avoid row lookups | covering indexes, leftmost prefixes |
| 4. Fewer round trips | Batch operations | multi-row INSERT, stored procedures |
| 5. Less lock contention | Small transactions, committed fast | batched updates, no long transactions |
| 6. Split reads and writes | Primary/replica architecture | reads go to replicas, writes to the primary |