问题描述

注：文中的数据为线上数据修改得来，去掉部分重要信息，但不影响本文介绍的内容。

前段时间，业务中有个慢查询，具体如下：

select count(distinct order.order_id) from order force index(shop_order_create_time) left join `order_extend` as `extend` on `order`.`order_id` = `extend`.`order_id` where `extend`.`receiver_name` = "张三" and `order`.`shop_id` = 168449734569046909 and `order`.`allot` = 0;

explain 结果如下：

mysql> explain select count(distinct order.order_id) from order force index(shop_order_create_time) left join `order_extend` as `extend` on `order`.`order_id` = `extend`.`order_id` where `extend`.`receiver_name` = "张三" and `order`.`shop_id` = 168449734569046909 and `order`.`allot` = 0;

+----+-------------+-----------------+------------+------+----------------------------+------------------------+---------+--------------------------+--------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys              |          key           | key_len | ref                      | rows   | filtered | Extra       |
+----+-------------+-----------------+------------+------+----------------------------+------------------------+---------+--------------------------+--------+----------+-------------+
|  1 | SIMPLE      |           order | NULL       | ref  | shop_order_create_time     | shop_order_create_time | 8       | const                    | 289384 |    50.00 | Using where |
|  1 | SIMPLE      | order_extend    | NULL       | ref  |       order_id             |       order_id         | 8       | example.order.order_id   |    1   |     5.00 | Using where |
+----+-------------+-----------------+------------+------+----------------------------+------------------------+---------+--------------------------+--------+----------+-------------+

慢查询的原因在这里稍微解释下，将 left join 拆成两个简单的 SQL 来看，where 条件下扫描 order 表会走索引，只需要扫描 289384 行数据，而对于 order_extend 来说，因为 receiver_name 字段没有索引，所以 MYSQL 会进行全表扫描，该表数据有 400 万多条数据，所以 MYSQL 理所当然的认为 order 是小表，被用作驱动表。

注意：上述操作是完全没有问题的，MYSQL 的选择驱动表的操作是完全正确的。

但是上述 SQL 为什么会慢呢，我们之前有讲过，驱动表的行数其实是循环的行数，上述的 SQL 说明我们要循环 289384 次，这不是个小数据。

事实上，order_extend 表中 receiver_name = "张三" 的数据是非常少的，因此我们只需要给 receiver_name 字段添加个索引，在该 where 条件下，order_extend 扫描的行数就会变得只有几十行，order_extend 就会作为驱动表。整个 join 操作循环的次数就降下来了，事实也是如此。

创建索引的语句如下：

alter table order_extend add index idx_receiver_name(receiver_name);

但是该操作带来了另外一个问题，where in 查询超时，explain 发现操作不走索引了。怎么可能！！！

mysql> explain select * from `order_extend` where `order_id` in ('452048700298489164', '658048668602091118', '134025069379300133', '115025059522464298', '357025020159582157', '313025004936486152', '314964165391349174', '149625972108931157', '962583955854210160', '435496258996277158', '894946259343070121', '267305055115077111', '742593358822418192', '295725918128610188', '753725897767159173', '372357059042085115', '732565822543578128', '989725598504909135', '324725576852445129', '220553282521821127');

+----+-------------+--------------+------------+-------+---------------+----------+---------+------+---------+----------+-----------------------+
| id | select_type | table        | partitions | type  | possible_keys | key      | key_len | ref  | rows    | filtered | Extra                 |
+----+-------------+--------------+------------+-------+---------------+----------+---------+------+---------+----------+-----------------------+
|  1 | SIMPLE      | order_extend | NULL       | ALL   | order_id      | NULL     | NULL    | NULL | 4331642 | 100.00   | Using where           |
+----+-------------+--------------+------------+-------+---------------+----------+---------+------+---------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

问题分析

我们预想的情况是上面的查询应该走 order_id 索引，但是实际上没有走。

我们使用 force index(order_id) 来执行一下上述 SQL，结果非常快的出来了。

select * from `order_extend` force index(order_id) where `order_id` in ('452048700298489164', '658048668602091118', '134025069379300133', '115025059522464298', '357025020159582157', '313025004936486152', '314964165391349174', '149625972108931157', '962583955854210160', '435496258996277158', '894946259343070121', '267305055115077111', '742593358822418192', '295725918128610188', '753725897767159173', '372357059042085115', '732565822543578128', '989725598504909135', '324725576852445129', '220553282521821127');

我们对上述 SQL 执行一下 explain 命令，执行结果如下：

MySQL [higo_order]> explain select * from `order_extend` force index(order_id) where `order_id` in ('452048700298489164', '658048668602091118', '134025069379300133', '115025059522464298', '357025020159582157', '313025004936486152', '314964165391349174', '149625972108931157', '962583955854210160', '435496258996277158', '894946259343070121', '267305055115077111', '742593358822418192', '295725918128610188', '753725897767159173', '372357059042085115', '732565822543578128', '989725598504909135', '324725576852445129', '220553282521821127');

+----+-------------+--------------+------------+-------+---------------+----------+---------+------+----------+----------+-----------------------+
| id | select_type | table        | partitions | type  | possible_keys | key      | key_len | ref  | rows     | filtered | Extra                 |
+----+-------------+--------------+------------+-------+---------------+----------+---------+------+----------+----------+-----------------------+
|  1 | SIMPLE      | order_extend | NULL       | range | order_id      | order_id | 8       | NULL | 43290040 |   100.00 | Using index condition |
+----+-------------+--------------+------------+-------+---------------+----------+---------+------+----------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

很显然，这个就是 where in 不走索引的原因！

explain 反映的是 MYSQL 内部优化器的执行过程，MYSQL 认为走 order_id 的索引会扫描 43290040 行数据，而实际上整张表才 4331642 行数据，这说明 MYSQL 对索引要扫描的行数判断是有错误的，而且是整张表数据的十倍，那么 MYSQL 一定会认为全表扫描是最优方案了。

我们知道，选择索引时优化器的工作。

而优化器选择索引的目的，是找到一个最优的执行方案，并用最小的代价去执行语句。在数据库方面，扫描行数是影响执行代价的因素之一。扫描的行数越少，意味着访问磁盘数据的次数越少，消耗的 CPU 资源越少。

当然，扫描行数并不是唯一的判断标准，优化器还会结合是否使用临时表、是否排序等因素进行综合判断。

上述这个问题我们可以明显的知道，MYSQL 选错索引肯定是在判断扫描行数的时候出问题了。

那么 MYSQL 是如何判断扫描行数的呢？

MYSQL 在真正开始执行语句之前，并不能精确地知道满足这个条件的记录有多少条，而只能根据统计信息来估计记录数。

这个统计信息就是索引的“区分度”。显然，一个索引上不同的值越多，这个索引的区分度就越好。而一个索引上不同的值的个数，我们称之为“基数”（cardinality）。也就是说，这个基数越大，索引的区分度越好。

我们可以使用 show index 方法，看到一个索引的技术。我们来看一下 order_extend 表的索引结构信息。

mysql > show index from order_extend;

+--------------+------------+-------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table        | Non_unique | Key_name          | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+-------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| order_extend |          0 | PRIMARY           |            1 | primary_column | A         |       2     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | order_id          |            1 | order_id       | A         |       2     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index2            |            1 | column2        | A         |       2     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index3            |            1 | column3        | A         |       2     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index4            |            1 | column4        | A         |       2     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index5            |            1 | column5        | A         |       2     |     NULL | NULL   | YES  | BTREE      |         |               |
| order_extend |          1 | index6            |            1 | column6        | A         |       2     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index7            |            1 | column7A       | A         |       2     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index7            |            2 | column7B       | A         |       2     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index8            |            1 | column8        | A         |       2     |     NULL | NULL   | YES  | BTREE      |         |               |
| order_extend |          1 | index9            |            1 | column9        | A         |       2     |     NULL | NULL   | YES  | BTREE      |         |               |
| order_extend |          1 | idx_receiver_name |            1 | receiver_name  | A         |      866333 |     NULL | NULL   |      | BTREE      |         |               |
+--------------+------------+-------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
12 rows in set (0.00 sec)

根据我们日常经验来看，order_id 的基数一定特别大，一定存在很多个不同的 order_id，而上面 show index from order_extend 的结果看，order_id 索引的基数为 2 是一定有问题的。

从上面的 show index 结果中我们其实也可以推测出 43290040 的值是怎么得来的。首先 MYSQL 猜测可能使用的索引是 order_id，但是该索引的基数是 2，这就意味着 order_id 只有两种值，因此要全表扫描，然后再去看看其他索引，发现基数也都是 2，上面一共有 10 个 2，因此要十次全表扫描，最后再加上索引 idx_receiver_name 的 866333 个，最终扫描的行数就是 43290040 个，正好是全表行数的十倍。

我们先来看一下 MYSQL 是如何得到索引的基数。

我们先来简单看一下 MYSQL 采样统计的方法。

为什么要采样统计呢？因为把整张表取出来一行行统计，虽然可以得到精确的结果，但是代价太高了，所以只能选择“采样统计”。

采样统计的时候，InnoDB 默认会选择 N 个数据页，统计这些页面上的不同值，得到一个平均值，然后乘以这个索引的页面数，就得到了这个索引的基数。

而数据表是会持续更新的，索引统计信息也不会固定不变。所以，当变更的数据行数超过 1/M 的时候，会自动触发重新做一次索引统计。

在 MYSQL 中，有两种存储索引统计的方式，可以通过设置参数 innodv_stats_persistent 的值来选择：

设置为 on 的时候，表示统计信息会持久化存储。这时，默认的 N 是 20，M 是 10。
设置为 off 的时候，表示统计信息只存储在内存中。这时，默认的 N 是 8，M 是 16。

由于是采样统计，所以不管 N 是 20 还是 8，这个基数都是很容易不准的。

其它注意事项

我们知道，扫描行数会影响 MYSQL 选择索引。但是，扫描行数并不是唯一的判断标准。优化器还会结合是否使用临时表，是否排序等因素进行综合判断。

举个例子，比如说，我们有个普通索引 explain 需要扫描 40000 行，主键索引 explain 要扫描 100000 行，这个时候 MYSQL 可能最终会选择主键索引，因为它认为选择普通索引还需要回表，需要回到主键索引上查出整行数据，这个代价优化器也要算进去的。而如果选择扫描 10 万行，是直接在主键索引上扫描的，没有额外的代价。

优化器会估算这两个选择的代价，从结果看来，优化器认为直接扫描主键索引更快。当然，从执行时间看来，这个选择并不是最优的。

解决方案

所以 MYSQL 选错索引，最终还是归咎于没能准确地判断出扫描行数。

至于是什么原因导致索引基数出错还没有找到，如果找到的话，会及时补充进来。

既然是统计信息不对，那就修正。analyze table order_extend 命令，可以用来重新统计索引信息。我们来看一下执行效果。

mysql > analyze table order_extend;
+----------------------+---------+----------+----------+
| Table                | Op      | Msg_type | Msg_text |
+----------------------+---------+----------+----------+
| example.order_extend | analyze | status   | OK       |
+----------------------+---------+----------+----------+
1 row in set (1.14 sec)

mysql > show index from t_pandora_order_express;
+--------------+------------+-------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table        | Non_unique | Key_name          | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+-------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| order_extend |          0 | PRIMARY           |            1 | primary_column | A         |   4331642   |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | order_id          |            1 | order_id       | A         |   4331642   |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index2            |            1 | column2        | A         |   1201523   |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index3            |            1 | column3        | A         |    8754     |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index4            |            1 | column4        | A         |   267249    |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index5            |            1 | column5        | A         |    59040    |     NULL | NULL   | YES  | BTREE      |         |               |
| order_extend |          1 | index6            |            1 | column6        | A         |   949575    |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index7            |            1 | column7A       | A         |    40604    |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index7            |            2 | column7B       | A         |   260542    |     NULL | NULL   |      | BTREE      |         |               |
| order_extend |          1 | index8            |            1 | column8        | A         |   3200820   |     NULL | NULL   | YES  | BTREE      |         |               |
| order_extend |          1 | index9            |            1 | column9        | A         |   64083     |     NULL | NULL   | YES  | BTREE      |         |               |
| order_extend |          1 | idx_receiver_name |            1 | receiver_name  | A         |   866333    |     NULL | NULL   |      | BTREE      |         |               |
+--------------+------------+-------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
12 rows in set (0.00 sec)

注：上述基数列数据为人工制造，线上数据执行完 analyze 结果一致，主要是为了凸显出 order_id 和主键的基数一致。

可以看到预估的扫描的行数正确了。

所以在我们日常开发中，如果我们发现 explain 的结果预估的 rows 值跟实际情况差距比较大，可以采用这个方法来处理。

需要注意的是，如果开启了 binlog，那么 Analyze Table 的结果也会写入 binlog，我们可以在 analyze 和 table 之间添加关键字 local 取消写入。

基础知识

show index 语句

SHOW INDEX 返回表索引信息。语法结构如下：

SHOW [EXTENDED] {INDEX | INDEXES | KEYS}
    {FROM | IN} tbl_name
    [{FROM | IN} db_name]
    [WHERE expr]

返回结果示例：

mysql> SHOW INDEX FROM City\G
*************************** 1. row ***************************
        Table: city
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: ID
    Collation: A
  Cardinality: 4188
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
      Visible: YES
   Expression: NULL
*************************** 2. row ***************************
        Table: city
   Non_unique: 1
     Key_name: CountryCode
 Seq_in_index: 1
  Column_name: CountryCode
    Collation: A
  Cardinality: 232
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
      Visible: YES
   Expression: NULL

字段解释

字段	含义
Table	表名字
Non_unique	如果索引不能包含重复项，则为0；如果可以，则为1。
Key_name	索引名称。如果索引是主键，则名称始终为 PRIMARY。
Seq_in_index	索引中的列序号，从 1 开始。
Column_name	列名
Collation	列在索引中的排序方式。它可以具有值 A（升序），D（降序）或 NULL（未排序）。
Cardinality	索引中唯一值数量的估计。要更新此数字，请运行 ANALYZE TABLE。基数是抽样统计得来的，因此即使对于较小的表，该值也不一定准确。基数越高，MySQL 在进行联接时使用索引的机会越大。
Sub_part	索引前缀。也就是说，如果仅对列进行部分索引，则为索引字符的数目；如果对整个列进行索引，则为 NULL。
Packed	表示 key 如何被 packed。如果没有则为 NULL。
Null	如果该列可能包含 NULL 值，则包含 YES，否则为 ''。
Index_type	使用的索引方法（BTREE，FULLTEXT，HASH，RTREE）。
Comment	关于在其自己的列中未描述的索引的信息，例如，如果禁用了索引，则为 disabled。
Index_comment	创建索引时，使用 COMMENT 属性为索引提供的任何注释。
Visible	索引是否对优化器可见。
Expression	MySQL 8.0.13 和更高版本支持的功能部分。

总结

在日常开发中，我们可能遇到 MYSQL 不走索引的情况，这个时候如果我们明确的知道，走某个索引会比不走索引更快，比如上面的 SQL，我们可以使用 force index() 语句强制走索引。
另外我们还可以查看一下索引信息。对于由于索引统计信息不准确导致的问题，我们可以用 analyze table 来解决。

参考文档

13.7.7.22 SHOW INDEX Statement
《MYSQL 实战 45 讲》

MYSQL 选错索引问题分析