MongoDB Index Mechanism

MongoDB indexes

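What is an index

When we query a database without an index, MongoDB has to scan every document in the collection and check it against the query conditions to find the matches. This is not much of a problem when the data set is small, but with a large data set a query can take seconds or even minutes.

An index stores the values of the indexed field in sorted order, so a query can use that order to locate matches quickly (O(log N) time complexity).

How it works internally

When we store data in the database, the underlying storage engine persists each document and records its location; that location can later be used to fetch the document.

For example, suppose the collection contains the following documents: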

> db.test.find()
{ "_id" : ObjectId("5d47f95c4903b485d29ba952"), "name" : "fqr0", "age" : 0 }
{ "_id" : ObjectId("5d47f95c4903b485d29ba953"), "name" : "fqr1", "age" : 1 }
{ "_id" : ObjectId("5d47f95c4903b485d29ba954"), "name" : "fqr2", "age" : 2 }

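The database records the location of each document:

Location	Document
Location 1	{"name":"fqr1", "age":1}
Location 2	{"name":"fqr0", "age":0}
Location 3	{"name":"fqr2", "age":2}

If we now run find({"age":2}), all three documents are scanned. With a large amount of data this becomes slow; to speed the query up, we can add an index on the age field:

db.test.createIndex({"age":1}) // create an ascending index on the age field

After the index is built:

age	Location
0	Location 2
1	Location 1
2	Location 3

Now the query no longer has to scan every document to find those matching {"age":2}.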
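
The lookup described above can be sketched in plain JavaScript. This is only an illustrative model: a sorted array with binary search stands in for the real index structure, and the names documentsByLocation, ageIndex and findByAge are made up for this example.

```javascript
// Illustrative model only: a sorted array plus binary search stands in for
// the index that the storage engine actually maintains.

// Documents keyed by their (hypothetical) storage location.
const documentsByLocation = {
  1: { name: "fqr1", age: 1 },
  2: { name: "fqr0", age: 0 },
  3: { name: "fqr2", age: 2 },
};

// Index on age: entries sorted by age, each pointing at a location.
const ageIndex = [
  { age: 0, location: 2 },
  { age: 1, location: 1 },
  { age: 2, location: 3 },
];

// Binary search for one age value; returns the document or null.
function findByAge(age) {
  let low = 0;
  let high = ageIndex.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    if (ageIndex[mid].age === age) {
      return documentsByLocation[ageIndex[mid].location];
    }
    if (ageIndex[mid].age < age) {
      low = mid + 1;
    } else {
      high = mid - 1;
    }
  }
  return null;
}

console.log(findByAge(2)); // the document stored at location 3
```

Each step halves the remaining range, which is where the O(log N) cost comes from.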

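In fact, every MongoDB document has an _id field, and MongoDB automatically creates an index on it, so documents can be looked up quickly by _id.

Benefits of indexes

Faster searches
Faster updates and deletes, because both operations first find the matching documents and then update or delete them
Faster sorting: sorting on the age field, for example, no longer requires scanning all documents

MongoDB index types

Single-field index

An index on a single field only.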

db.test.createIndex({"age":1})

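1 means ascending and -1 means descending.

The single-field index is the most commonly used type; the _id index that MongoDB creates by default is a single-field index.

Compound index

An index on multiple fields.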

db.test.createIndex({"age":1, "name":1})

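A compound index sorts entries by age first; entries with the same age are then sorted by name.

For example:

age, name	Location
0,fqr0	Location 2
0,fqr1	Location 1
2,fqr1	Location 3

Note: an index built this way supports queries on both fields, such as find({"age":0,"name":"fqr1"}), and also queries on the leading field alone, such as find({"age":0}). However, find({"name":"fqr0"}) cannot make use of it, because name is not a prefix of the index key.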
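
Why the leading field matters can be seen in a small JavaScript model of the table above. It is a sketch, not MongoDB's implementation; compoundIndex, lowerBoundByAge, locationsByAge and locationsByName are hypothetical names.

```javascript
// Illustrative model only, not MongoDB's implementation.
// Entries of an {age: 1, name: 1} index, sorted by age and then by name.
const compoundIndex = [
  { age: 0, name: "fqr0", location: 2 },
  { age: 0, name: "fqr1", location: 1 },
  { age: 2, name: "fqr1", location: 3 },
];

// Binary search for the first position whose age is >= the target.
function lowerBoundByAge(age) {
  let low = 0;
  let high = compoundIndex.length;
  while (low < high) {
    const mid = (low + high) >> 1;
    if (compoundIndex[mid].age < age) {
      low = mid + 1;
    } else {
      high = mid;
    }
  }
  return low;
}

// Query on the leading field: jump to the first match, then read a
// contiguous run of entries.
function locationsByAge(age) {
  const result = [];
  for (
    let i = lowerBoundByAge(age);
    i < compoundIndex.length && compoundIndex[i].age === age;
    i++
  ) {
    result.push(compoundIndex[i].location);
  }
  return result;
}

// Query on name alone: matches are scattered across the index, so every
// entry has to be checked.
function locationsByName(name) {
  return compoundIndex
    .filter((entry) => entry.name === name)
    .map((entry) => entry.location);
}
```

Entries with the same age sit next to each other, while entries with the same name can be anywhere, so only the first kind of query can skip part of the index.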

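With a compound index, it is usually better to put a field with few duplicate values first, which tends to give better performance.

Multikey index

If a field holds an array, an index on that field is a multikey index: the database creates an index entry for each element of the array.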

db.test.createIndex({"field": 1})
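
As a rough illustration of "one entry per array element", the following JavaScript sketch builds such entries for two made-up documents. The tags field, the location numbers and the function buildMultikeyEntries are assumptions for this example, not part of MongoDB.

```javascript
// Hypothetical documents whose "tags" field holds an array.
const docs = [
  { location: 1, tags: ["db", "index"] },
  { location: 2, tags: ["index"] },
];

// Build one index entry per array element, then sort the entries by key
// (and by location when keys are equal).
function buildMultikeyEntries(documents, field) {
  const entries = [];
  for (const doc of documents) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      entries.push({ key, location: doc.location });
    }
  }
  return entries.sort((a, b) => {
    if (a.key < b.key) return -1;
    if (a.key > b.key) return 1;
    return a.location - b.location;
  });
}

const tagEntries = buildMultikeyEntries(docs, "tags");
console.log(tagEntries);
```

The document at location 1 appears twice in the result, once for each of its tags, which is why multikey indexes can grow much larger than the collection itself.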

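Less commonly used indexes

Text index: for example, finding articles by keyword in a blog system
Hashed index: indexes the hash of a field's value
Geospatial index: for example, finding nearby restaurants

Index optimization

MongoDB can profile database operations and record the slow ones. There are three profiling levels:

0: slow operations are not recorded
1: operations that take longer than the threshold are recorded in the system.profile collection
2: all operations are recorded in the system.profile collection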
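
The three levels boil down to a simple decision, sketched here in JavaScript. The function wouldBeProfiled is hypothetical and deliberately simplified: it ignores other profiler settings such as the sampling rate.

```javascript
// Returns true if an operation would be written to system.profile.
// level:  profiling level, 0, 1 or 2
// millis: how long the operation took, in milliseconds
// slowms: slow-operation threshold in milliseconds (100 by default)
function wouldBeProfiled(level, millis, slowms = 100) {
  if (level === 0) return false; // profiler off
  if (level === 2) return true; // record everything
  return millis > slowms; // level 1: only slow operations
}

console.log(wouldBeProfiled(1, 1202)); // true
console.log(wouldBeProfiled(1, 4)); // false
```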

Set the level

> db.setProfilingLevel(1)
{ "was" : 1, "slowms" : 100, "sampleRate" : 1, "ok" : 1 }

Check the level

> db.getProfilingLevel()
1

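Let's try the profiler out.

First, write a large amount of data into the database:

// Insert one million test documents; age is a random integer between 0 and i
for (let i = 0; i < 1000000; i++) { db.test.insertOne({"name": "fqr" + i, "age": Math.floor(i * Math.random())}) }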

Before an index is created, every query is a full collection scan:

> db.test.find({"name":"fqr1234"}).explain()
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.test",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"name" : {
				"$eq" : "fqr1234"
			}
		},
		"winningPlan" : {
			"stage" : "COLLSCAN",   // full collection scan
			"filter" : {
				"name" : {
					"$eq" : "fqr1234"
				}
			},
			"direction" : "forward"
		},
		"rejectedPlans" : [ ]
	},
	"ok" : 1
}

Looking at system.profile, we can see that this operation was recorded, because it took 1202 ms, which exceeds the default threshold of 100 ms:

> db.system.profile.find().sort({$natural:-1}).limit(1).pretty()
{
	"op" : "command",
	"ns" : "test.test",
	"command" : {
		"explain" : {
			"find" : "test",
			"filter" : {
				"name" : "fqr1234"
			}
		},
		"verbosity" : "allPlansExecution",
		"$db" : "test"
	},
	"numYield" : 7834,
	"locks" : {
		"Global" : {
			"acquireCount" : {
				"r" : NumberLong(7835)
			}
		},
		"Database" : {
			"acquireCount" : {
				"r" : NumberLong(7835)
			}
		},
		"Collection" : {
			"acquireCount" : {
				"r" : NumberLong(7835)
			}
		}
	},
	"responseLength" : 862,
	"protocol" : "op_msg",
	"millis" : 1202,
	"ts" : ISODate("2019-08-06T02:14:59.455Z"),
	"client" : "127.0.0.1",
	"appName" : "MongoDB Shell",
	"allUsers" : [ ],
	"user" : ""
}

When tuning query speed, we can use the records in system.profile to decide which fields to index.

Now let's create an index:

> db.test.createIndex({"name":1})
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

Then run the query again:

> db.test.find({"name":"fqr12345"}).explain("allPlansExecution")
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.test",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"name" : {
				"$eq" : "fqr12345"
			}
		},
		"winningPlan" : {
			"stage" : "FETCH",
			"inputStage" : {
				"stage" : "IXSCAN",
				"keyPattern" : {
					"name" : 1
				},
				"indexName" : "name_1",
				"isMultiKey" : false,
				"multiKeyPaths" : {
					"name" : [ ]
				},
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 2,
				"direction" : "forward",
				"indexBounds" : {
					"name" : [
						"[\"fqr12345\", \"fqr12345\"]"
					]
				}
			}
		},
		"rejectedPlans" : [ ]
	},
	"executionStats" : {
		"executionSuccess" : true,
		"nReturned" : 1,
		"executionTimeMillis" : 4,
		"totalKeysExamined" : 1,
		"totalDocsExamined" : 1,
		"executionStages" : {
			"stage" : "FETCH",
			"nReturned" : 1,
			"executionTimeMillisEstimate" : 0,
			"works" : 2,
			"advanced" : 1,
			"needTime" : 0,
			"needYield" : 0,
			"saveState" : 0,
			"restoreState" : 0,
			"isEOF" : 1,
			"invalidates" : 0,
			"docsExamined" : 1,
			"alreadyHasObj" : 0,
			"inputStage" : {
				"stage" : "IXSCAN",
				"nReturned" : 1,
				"executionTimeMillisEstimate" : 0,
				"works" : 2,
				"advanced" : 1,
				"needTime" : 0,
				"needYield" : 0,
				"saveState" : 0,
				"restoreState" : 0,
				"isEOF" : 1,
				"invalidates" : 0,
				"keyPattern" : {
					"name" : 1
				},
				"indexName" : "name_1",
				"isMultiKey" : false,
				"multiKeyPaths" : {
					"name" : [ ]
				},
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 2,
				"direction" : "forward",
				"indexBounds" : {
					"name" : [
						"[\"fqr12345\", \"fqr12345\"]"
					]
				},
				"keysExamined" : 1,
				"seeks" : 1,
				"dupsTested" : 0,
				"dupsDropped" : 0,
				"seenInvalidated" : 0
			}
		},
		"allPlansExecution" : [ ]
	},
	"ok" : 1
}

The query is no longer a full collection scan but an index scan (IXSCAN), and the execution time dropped from 1202 ms to 4 ms.
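
Reading the plan by eye works for one query; for many queries a small helper can do the check. The JavaScript below is a sketch that assumes the explain output has the shape shown in this article (a queryPlanner.winningPlan object with nested inputStage or inputStages); other server versions may nest the plan differently. The helper names collectStages and usesCollectionScan are made up for this example.

```javascript
// Collect the stage names of a plan, outermost stage first.
function collectStages(plan) {
  if (!plan) return [];
  const children =
    plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  return [plan.stage, ...children.flatMap((child) => collectStages(child))];
}

// True if the winning plan contains a full collection scan.
function usesCollectionScan(explainOutput) {
  const stages = collectStages(explainOutput.queryPlanner.winningPlan);
  return stages.includes("COLLSCAN");
}

// Trimmed copies of the two winning plans from the explain outputs above.
const beforeIndex = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN", direction: "forward" } },
};
const afterIndex = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "name_1" },
    },
  },
};

console.log(usesCollectionScan(beforeIndex)); // true
console.log(usesCollectionScan(afterIndex)); // false
```

In the mongo shell, the object returned by explain() could be passed to usesCollectionScan directly, under the same assumption about its shape.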

Common commands

List indexes

> db.test.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.test"
    },
    {
        "v" : 2,
        "key" : {
            "name" : 1
        },
        "name" : "name_1",
        "ns" : "test.test"
    }
]

Specify which index to use

> db.test.find({"name":"fqr12345"}).hint({"name":1}).pretty()
{
    "_id" : ObjectId("5d47f9604903b485d29bd98b"),
    "name" : "fqr12345",
    "age" : 5526
}

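Drop an index

> db.test.dropIndex("name_1")
// Drop all indexes except the _id index (dropIndex("*") is rejected since MongoDB 4.2)
> db.test.dropIndexes()

Notes

Indexes improve query performance but reduce insert performance, because every insert also has to update the indexes
Indexes on numeric fields are generally faster than indexes on string fields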