本文已参与「新人创作礼」活动,一起开启掘金创作之路。
- LIMIT 语句
- 隐式转换
- 关联更新,删除
- 混合排序
- EXISTS语句
- 条件下推
- 提前缩小范围
- 中间结果集下推
- 总结
SQL编写顺序
select
distinct <select_list>
from <left_table>
join <right_table>
on <join_condition>
where <where_condition>
group by <group_by_list>
order by <order_by_condition>
limit <limit_number>
SQL执行顺序
from <left_table>
on <join_condition>
join <right_table>
where <where_condition>
group by <group_by_list>
having <having_condition>
select
distinct <select_list>
order by <order_by_condition>
limit <limit_number>
1.FROM :对FROM左边的表和右边的表计算笛卡尔积,产生虚拟表t1
2.ON :对表t1进行ON筛选,只有符合条件的行才会记录在表t2中
3.JOIN :如果指定了OUTER JOIN(如:left join、right join),那么未匹配到的行作为外部行添加到表t3中
4.WHERE :对表t3进行where条件过滤,只有符合条件的记录才会记录在表t4中
5.GROUP BY: 根据group by 子句中的列,对表t4记录进行分组操作,产生表t5
6.HAVING :对表t5进行having过滤,只有符合条件的行才会记录在表t6中
7.SELECT :执行select操作,选择指定的列,产生表t7
8.DISTINCT :对表t7记录进行去重,产生表t8
9.ORDER BY :对表t8记录进行排序,产生表t9
10.LIMIT :取出指定的行,产生表t10,并将结果进行展示
1、LIMIT语句
select *
from tab_name
where column_1='val1'
and column_2='val2'
order by create_time
limit 1000,10;
对于上面简单的语句,一般是给字段查询条件字段和排序字段加索引,这也条件排序能有效利用索引,性能提升。
但当 LIMIT 语句 变成 ”limit 1000000,10“,查询效率还是会慢。
因为数据库不知道第1000000条件记录从什么地方开始,有索引也需要从头计算一次。
解决方案:将上一页的最大值作条件作为查询条件。
select *
from tab_name
where column_1='val1'
and column_2='val2'
and create_time >='2022-05-12 09:45:33'
order by create_time
limit 10;
2、隐式转换
何为隐式转换:即在where语句中条件的值和条件对应的列的数据类型不一致。
隐式类型转换引发的问题:
- 隐式类型转换可能导致索引失效
- 隐式类型转换可能产生非预期的效果
数据准备
CREATE TABLE `convert_test` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`areacode` varchar(12) NOT NULL DEFAULT '',
`bus_date` DATE NOT NULL,
`create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `idx_areacode` (`areacode`),
KEY `idx_bus_date` (`bus_date`),
KEY `idx_create_time` (`create_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='隐式转换测试表';
insert into convert_test(areacode,bus_date) values ('001','2021-02-19');
insert into convert_test(areacode,bus_date) values ('1','2021-02-20');
insert into convert_test(areacode,bus_date) values ('1abc','2021-02-21');
导致索引失效
存在隐式转换
不存在隐式转换
产生非预期的结果
存在隐式转换
id 是bigint类型,按照条件不应该查询到数据
不存在隐式转换
防止隐式类型转换
- 写sql的时候注意类型保持一致
- 使用cast函数进行转换 如:select * from convert_test where areacode=CAST(1 AS char);
3、关联更新,删除
MySQL派生表,物化表,临时表
派生表
派生表,是用于存储子查询产生的结果的临时表,这个子查询特指FROM子句里的子查询。派生表也是临时表
explain select * from t1 inner join (select distinct i1 from t3 where id in (1,2,3)) tmp on tmp.i1 = t1.i1explain 结果可以看到,select 中的子查询,会产生一个派生表,存储子查询的查询结果,然后用 t1 表和派生表(derived2)进行连接操作
物化表
物化表,也是存储子查询产生的结果的临时表,这个子查询特指where子句中查询条件里的子查询。
物化表有两种使用场景:
- 对子查询进行半连接优化时,使用
物化策略- IN 子查询转换为 SUBQUERY、UNCACHEABLE SUBQUERY 的 exists 相关子查询时,把子查询的结果物化,避免对于主查询符合条件的每一条记录,子查询都要执行一次从原表里读取数据
explain select * from t1 where t1.i1 in ( select i1 from t3 where id in (3, 666, 990, 887, 76) and i2 > 16384 )explain 结果可以看到,IN 子查询的结果物化之后(select_type = MATERIALIZED)和 t1 表进行连接操作
临时表
临时表,除了派生表、物化表之外,其它会用到临时表的地方,都是为了用空间换时间的。
- group by
- distinct
- union
MySql的物化表特性,目前仅仅针对查询语句的优化,对于更新或删除语句,需要改写为join
UPDATE operation o
SET status = 'applying'
WHERE o.id IN (SELECT id
FROM (SELECT o.id,
o.status
FROM operation o
WHERE o.group = 123
AND o.status NOT IN ( 'done' )
ORDER BY o.parent,
o.id
LIMIT 1) t);
重新为JOIN之后,子查询的选择模式从dependent subquery 变成 derived,执行速度大大加快
UPDATE operation o
JOIN (SELECT o.id,
o.status
FROM operation o
WHERE o.group = 123
AND o.status NOT IN ( 'done' )
ORDER BY o.parent,
o.id
LIMIT 1) t
ON o.id = t.id
SET status = 'applying'
4、混合排序
MySQL不能利用索引进行混合排序
select *
from my_order o
inner join my_appraise a on a.order_id=o.id
order by a.is_repply ASC
a.appraise_time DESC
limit 0,20
由于is_reply只有0和1两种状态,优化重写如下:
SELECT *
FROM ((SELECT *
FROM my_order o
INNER JOIN my_appraise a
ON a.orderid = o.id
AND is_reply = 0
ORDER BY appraise_time DESC
LIMIT 0, 20)
UNION ALL
(SELECT *
FROM my_order o
INNER JOIN my_appraise a
ON a.orderid = o.id
AND is_reply = 1
ORDER BY appraise_time DESC
LIMIT 0, 20)) t
ORDER BY is_reply ASC,
appraisetime DESC
LIMIT 20;
5、EXISTS语句
MySQL对待EXISTS子句时,仍然采用嵌套子查询的执行方式
SELECT *
FROM my_neighbor n
LEFT JOIN my_neighbor_apply sra
ON n.id = sra.neighbor_id
AND sra.user_id = 'xxx'
WHERE n.topic_status < 4
AND EXISTS(SELECT 1
FROM message_info m
WHERE n.id = m.neighbor_id
AND m.inuser = 'xxx')
AND n.topic_type <> 5
去掉exists更改为join,能够避免嵌套子查询
SELECT *
FROM my_neighbor n
INNER JOIN message_info m
ON n.id = m.neighbor_id
AND m.inuser = 'xxx'
LEFT JOIN my_neighbor_apply sra
ON n.id = sra.neighbor_id
AND sra.user_id = 'xxx'
WHERE n.topic_status < 4
AND n.topic_type <> 5
6、条件下推
外部查询条件不能够下推到复杂的视图或子查询的情况有:
- 聚合子查询;
- 含有LIMIT的子查询;
- UNION 或UNION ALL子查询;
- 输出字段中的子查询;
SELECT *
FROM (SELECT target,
Count(*)
FROM operation
GROUP BY target) t
优化重写
SELECT target,
Count(*)
FROM operation
WHERE target = 'rm-xxxx'
GROUP BY target
7、提前缩小范围
SELECT *
FROM my_order o
LEFT JOIN my_userinfo u ON o.uid = u.uid
LEFT JOIN my_productinfo p ON o.pid = p.pid
WHERE ( o.display = 0 )
AND ( o.ostaus = 1 )
ORDER BY o.selltime DESC
LIMIT 0, 15
由于最后WHERE条件以及排序均针对最左主表,因此可以先对my_order排序提前缩小数据量再做左连接。SQL重写后如下:
SELECT *
FROM (
SELECT *
FROM my_order o
WHERE ( o.display = 0 )
AND ( o.ostaus = 1 )
ORDER BY o.selltime DESC
LIMIT 0, 15
) o
LEFT JOIN my_userinfo u ON o.uid = u.uid
LEFT JOIN my_productinfo p ON o.pid = p.pid
ORDER BY o.selltime DESC
limit 0, 15
子查询物化后(select_type=DERIVED)参与JOIN。虽然估算行扫描仍然为90万,但是利用了索引以及LIMIT 子句后,实际执行时间变得很小
8、中间结果集下推
SELECT a.*,
c.allocated
FROM (
SELECT resourceid
FROM my_distribute d
WHERE isdelete = 0
AND cusmanagercode = '1234567'
ORDER BY salecode limit 20) a
LEFT JOIN
(
SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated
FROM my_resources
GROUP BY resourcesid) c
ON a.resourceid = c.resourcesid
优化重写:
SELECT a.*,
c.allocated
FROM (
SELECT resourceid
FROM my_distribute d
WHERE isdelete = 0
AND cusmanagercode = '1234567'
ORDER BY salecode limit 20) a
LEFT JOIN
(
SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated
FROM my_resources r,
(
SELECT resourceid
FROM my_distribute d
WHERE isdelete = 0
AND cusmanagercode = '1234567'
ORDER BY salecode limit 20) a
WHERE r.resourcesid = a.resourcesid
GROUP BY resourcesid) c
ON a.resourceid = c.resourcesid
\