[MySQL面试题]-索引 15-17. 索引的优缺点15. 索引的优缺点？ 16. 使用索引是否一定提升效率？ 17.

15. 索引的优缺点

优点 Advantages

加速数据查询（查询效率提升）
1. 索引相当于查字典的目录，可以快速定位数据位置，避免全表扫描。
2. 尤其是 B+ 树搜索引对范围查询、排序、LIKE 前缀匹配等性能提升明显。
加速排序操作（Order By）

如果排序字段上有索引，MySQL 可直接利用索引顺序返回结果，避免额外排序开销。
加速分组和聚合操作（group by / count / min / max）

利用索引可以快速提取分组边界或聚合结果。
加速连接操作（JOIN）

被连接的字段建立索引，可大幅减少关联时的匹配成本。
提升唯一性约束的效率

唯一索引（unique）和主键索引（primary key）能够快速判断是否存在重复值。

缺点（Disadvantages）

增加写入和更新成本（insert ｜ update｜ delete）

每次修改表中数据，都需要同步更新对应索引，尤其是存在多个索引时，写入性能受影响比较大。
占用存储空间

每个索引都需要额外存储空间，尤其是组合索引或冗余索引过多时，可能导致磁盘膨胀。
可能导致执行计划选择不当

如果存在多个索引，优化器可能选择次优路径，反而影响性能。
维护成本增加

数据结构（B+树）需要保持平衡和有序，频繁插入会引起页分裂、合并、重排、维护成本较高。

16. 使用索引是否一定提升效率

低选择性字段
1. 比如 gender、status 等字段，取值有限（男/女或 0/1）
2. 使用索引无法有效过滤数据，反而需要回表读取大量数据页，成本更高。
返回大量数据（全表或大部分）
1. 查询返回结果占比接近全表，比如 select * from user。
2. 此时扫描索引再回表，还不如直接全表扫描。
需要频繁维护的场景

表更新频繁（如日志表、交易流水表），索引每次插入/更新/删除都要维护 B+ 树结构，降低写入性能。
LIKE %xxx% 模糊匹配

非前缀匹配无法利用 B+ 树索引，只能回退到全表扫描。
组合索引未满足最左匹配原则

联合索引（a， b， c），如果查询只用到 b，c 则无法使用索引。
类型不一致
1. 查询条件与索引字段类型不一致，会导致索引失效。
2. e.g where id = ‘123’ 中 ‘123’ 是字符串而id是整数

验证

准备数据

-- Create a test table with proper indexes
CREATE TABLE IF NOT EXISTS users (
    id INT PRIMARY KEY,
    `name` VARCHAR(100),
    email VARCHAR(100),
    age INT,
    city VARCHAR(50),
    created_at DATETIME
);

-- Add single-column indexes
CREATE INDEX idx_name ON users(`name`);
CREATE INDEX idx_age ON users(age);

-- Add composite index for multiple columns
CREATE INDEX idx_city_age_email ON users(city, age, email);

-- Insert some sample data
INSERT INTO users VALUES
(1, 'John Doe', 'john@example.com', 25, 'New York', '2023-01-01'),
(2, 'Jane Smith', 'jane@example.com', 30, 'Boston', '2023-01-02'),
(3, 'Bob Johnson', 'bob@example.com', 35, 'Chicago', '2023-01-03'),
(4, 'Alice Williams', 'alice@example.com', 28, 'New York', '2023-01-04'),
(5, 'Charlie Brown', 'charlie@example.com', 40, 'Boston', '2023-01-05');

模糊匹配 idx_name

-- 4. Test LIKE '%xxx%' pattern
-- This won't use the index on 'name'
EXPLAIN SELECT * FROM users WHERE name LIKE '%oh%';

{
  "id": 1,
  "select_type": "SIMPLE",-- 表示一个不包含子查询
  "table": "users",
  "partitions": null,
  "type": "ALL",          -- 全表扫描
  "possible_keys": null,  -- 未使用索引
  "key": null,
  "key_len": null,
  "ref": null,
  "rows": "5",
  "filtered": 20,         -- 过滤后效果（20%， 5行中的1行）
  "Extra": "Using where"
}

-- But LIKE 'xxx%' (prefix match) can use the index
EXPLAIN SELECT * FROM users WHERE name LIKE 'Jo%';
{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "users",
  "partitions": null,
  "type": "range",             -- 使用了索引范围扫描
  "possible_keys": "idx_name", -- 使用了 idx_name 索引
  "key": "idx_name",           -- 实际使用 idx_name 索引
  "key_len": "403",
  "ref": null,
  "rows": 1,                   -- 精确匹配 只需检测1行数据
  "filtered": 100,             -- 过滤后 100% 符合条件
  "Extra": "Using index condition"
}

联合索引 idx_city_age_email

-- 5. Test composite index without leftmost column
-- This won't use the composite index because 'city' is not in the WHERE clause
EXPLAIN SELECT * FROM users WHERE age = 30 AND email = 'jane@example.com';
{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "users",
  "partitions": null,
  "type": "ref",              -- 使用了唯一索引的等值查询
  "possible_keys": "idx_age",
  "key": "idx_age",           -- 使用了 age 字段索引
  "key_len": "5",
  "ref": "const",             -- 使用了常量进行索引查找
  "rows": 1,
  "filtered": 20,             -- 过滤后 20% 符合条件
  "Extra": "Using where"
}

-- This will use the composite index (follows leftmost principle)
EXPLAIN SELECT * FROM users WHERE city = 'Boston' AND age = 30;
{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "users",
  "partitions": null,
  "type": "ref",             -- 等值查找
  "possible_keys": "idx_age,idx_city_age_email",
  "key": "idx_age",          -- 实际使用 idx_age
  "key_len": "5",
  "ref": "const",
  "rows": 1,
  "filtered": 20,           -- WHERE条件过滤后保留20%数据
  "Extra": "Using where"
}
-- 1. 复合索引的第一列（city）未被单独使用
-- 2. 优化器认为单列索引效率更高 （city 可能匹配 10000 行， age 只匹配 100 行）

-- key_len 长度更长会占用更多内存
-- 强制使用复合索引
SELECT * FROM users FORCE INDEX(idx_city_age_email) 
WHERE city = 'Boston' AND age = 30;
-- 提示使用复合索引
SELECT * FROM users USE INDEX(idx_city_age_email)
WHERE city = 'Boston' AND age = 30;

-- This is also fine (uses just the first part of the index)
EXPLAIN SELECT * FROM users WHERE city = 'New York';
{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "users",
  "partitions": null,
  "type": "ref",
  "possible_keys": "idx_city_age_email",
  "key": "idx_city_age_email",      -- 使用了复合索引
  "key_len": "203",
  "ref": "const",
  "rows": "2",
  "filtered": 100,
  "Extra": null
}

差异对比：USE INDEX vs FORCE INDEX

项目	USE INDEX	FORCE INDEX
意图	提示优化器优先使用指定索引（非强制）	强制优化器必须使用指定索引
可选性	优化器仍可能忽略提示（若判断其他索引更优）	除非实在无法使用，否则必须使用指定索引
使用场景	索引较多，优化器有时误判	确认某索引最优，规避错误选择或测试
执行效率	更灵活、智能	可能在某些数据分布下反而性能更差
常见搭配	USE INDEX (idx_name)	FORCE INDEX (idx_name)

类型问题 (不影响)

EXPLAIN SELECT * FROM users WHERE id = '3';
{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "users",
  "partitions": null,
  "type": "const",
  "possible_keys": "PRIMARY",
  "key": "PRIMARY",
  "key_len": "4",
  "ref": "const",
  "rows": 1,
  "filtered": 100,
  "Extra": null
}

EXPLAIN SELECT * FROM users WHERE id = 3;
{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "users",
  "partitions": null,
  "type": "const",
  "possible_keys": "PRIMARY",
  "key": "PRIMARY",
  "key_len": "4",
  "ref": "const",
  "rows": 1,
  "filtered": 100,
  "Extra": null
}

17. 大段文本字段如何建立或优化索引

text/longtext 字段不能直接用于索引。原因在于字段内容太大，不适合存储在 B+树节点中

使用前面几个字段作为索引列。（只能 LIKE ‘关键字%’）

-- 大字段做前缀索引
CREATE INDEX idx_intro_prefix ON documents(intro(100));

- 缺点：
- 不能完全等价于全文索引
- 不支持模糊pp
- 不能覆盖全文内容，适用于前最定位场景

借助外部搜索引擎（Elastisearch）

对于海量文本，多字段检索、中文分词、搜索推荐等复杂场景，应将大文本字段同步至 Elasticsearch 进行索引和全文检索。