MongoDB学习笔记(六):索引

108 阅读8分钟
[(140条消息) mongo索引_乾复道的博客-CSDN博客_mongo 索引](https://blog.csdn.net/Long861774/article/details/126005364)

    在查询操作中,如果没有索引,会扫描集合中的每个文档,以选择与查询语句匹配的文档。如果查询条件带有索引,MongoDB 将扫描索引, 通过索引确定要查询的部分文档,而非直接对全部文档进行扫描

     索引可以提升文档的查询速度,但建立索引的过程需要使用计算与存储资源,在已经建立索引的前提下,插入新的文档会引起索引顺序的重排。

    MongoDB 的索引是基于 B-tree [数据结构](MySQL是B+Tree)及对应算法形成的。树索引存储特定字段或字段集的值,按字段值排序。索引条目的排序支持有效的等式匹配和基于范围的查询操作。

    下图所示的过程说明了使用索引选择和排序匹配文档的查询过程。

    MongoDB 在创建集合时,会默认在 _id 字段上创建唯一索引。该索引可防止客户端插入具有相同字段的两个文档,_id 字段上的索引不能被删除。

一、索引操作

索引创建

db.collection.createIndex()

db.<collection>.createIndex(<keys>,<options>)

<keys>文档指定了创建索引的字段
<options>文档定义了创建索引时可以使用的一些参数
  • 创建一个新集合
> db.accountsWithIndex.insertMany([
    {
        name: "alice", balance: 50, currency: [ "GBP", "USD"]
    },
    {
        name: "bob", balance:20, currency: ["AUD","USD"]
    },
    {
        name: "bob", balance: 300, currency:["CNY"]
    }
])

{
        "acknowledged" : true,
        "insertedIds" : [
                ObjectId("63db1ad71ce94226b81725ae"),
                ObjectId("63db1ad71ce94226b81725af"),
                ObjectId("63db1ad71ce94226b81725b0")
        ]
}

单键索引

db.<collection>.create({fieldName:value}) value: 1 为升序,-1 为降序

> db.accountsWithIndex.createIndex({name:1})

{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}

  • 列出集合中已存在的索引
> db.accountsWithIndex.getIndexes()

[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "test.accountsWithIndex"
        },
        {
                "v" : 2,
                "key" : {
                        "name" : 1   # 新索引时正序排列
                },
                "name" : "name_1",   # 新索引的名字
                "ns" : "test.accountsWithIndex"   # 表明索引创建在哪一个集合上,test数据库中的accountsWithIndex集合中
        }
]


复合键索引

name是由小到大的正向排序,balance是由大到小的反向排序

> db.accountsWithIndex.createIndex( { name:1, balance: -1})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 2,
        "numIndexesAfter" : 3,
        "ok" : 1
}

> db.accountsWithIndex.getIndexes()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "test.accountsWithIndex"
        },
        {
                "v" : 2,
                "key" : {
                        "name" : 1
                },
                "name" : "name_1",
                "ns" : "test.accountsWithIndex"
        },
        {
                "v" : 2,
                "key" : {                       # 创建的复合键
                        "name" : 1,
                        "balance" : -1
                },
                "name" : "name_1_balance_-1",
                "ns" : "test.accountsWithIndex"
        }
]

多键索引

为包含数组的字段建立索引,会为数组中的每个元素创建索引键。但是如果集合中包含多个待索引字段是数组,则无法创建复合多键索引

> db.accountsWithIndex.createIndex( { currency:1 })
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 3,
        "numIndexesAfter" : 4,
        "ok" : 1
}

  • 数组字段中的每一个元素,都会在多键索引中创建一个键
"AUD" --> {"bob"}
"CNY" --> {"bob"}
"GBP" --> {"alice"}
"USD" --> {"alice"}
"USD" --> {"bob"}

索引效果

db.<collection>.explain().<method(...)>

可以使用explain()进行分析命令包括aggregate(),count(),distinct(),find(),group(),remove(),upadte()
  • 使用没有创建 索引的字段进行搜索
> db.accountsWithIndex.explain().find({balance:100})
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.accountsWithIndex",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "balance" : {
                                "$eq" : 100
                        }
                },
                "queryHash" : "88DDD986",
                "planCacheKey" : "9238DC63",
                "winningPlan" : {
                        "stage" : "COLLSCAN",    # 将集合中的文档遍历一遍去匹配,意味着最慢,效率低
                        "filter" : {
                                "balance" : {
                                        "$eq" : 100
                                }
                        },
                        "direction" : "forward"
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "e2faa98aab20",
                "port" : 27017,
                "version" : "4.2.23",
                "gitVersion" : "f4e6602d3a4c5b22e9d8bcf0722d0afd0ec01ea2"
        },
        "ok" : 1
}

  • 使用已经创建索引的字段进行搜索
> db.accountsWithIndex.explain().find({ name: "alice"})
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.accountsWithIndex",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "name" : {
                                "$eq" : "alice"
                        }
                },
                "queryHash" : "01AEE5EC",
                "planCacheKey" : "0BE5F32C",
                "winningPlan" : {
                        "stage" : "FETCH",   # 根据索引当中指示的文档存储地址,把对应的文档提取出来
                        "inputStage" : {
                                "stage" : "IXSCAN",      # 意味着创建的索引起到了效果
                                "keyPattern" : {
                                        "name" : 1
                                },
                                "indexName" : "name_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "name" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "name" : [
                                                "[\"alice\", \"alice\"]"
                                        ]
                                }
                        }
                },
                "rejectedPlans" : [
                        {
                                "stage" : "FETCH",
                                "inputStage" : {
                                        "stage" : "IXSCAN",
                                        "keyPattern" : {
                                                "name" : 1,
                                                "balance" : -1
                                        },
                                        "indexName" : "name_1_balance_-1",
                                        "isMultiKey" : false,
                                        "multiKeyPaths" : {
                                                "name" : [ ],
                                                "balance" : [ ]
                                        },
                                        "isUnique" : false,
                                        "isSparse" : false,
                                        "isPartial" : false,
                                        "indexVersion" : 2,
                                        "direction" : "forward",
                                        "indexBounds" : {
                                                "name" : [
                                                        "[\"alice\", \"alice\"]"
                                                ],
                                                "balance" : [
                                                        "[MaxKey, MinKey]"
                                                ]
                                        }
                                }
                        }
                ]
        },
        "serverInfo" : {
                "host" : "e2faa98aab20",
                "port" : 27017,
                "version" : "4.2.23",
                "gitVersion" : "f4e6602d3a4c5b22e9d8bcf0722d0afd0ec01ea2"
        },
        "ok" : 1
}

  • 仅返回创建了索引的字段
> db.accountsWithIndex.explain().find( {name: "alice"},{_id: 0, name:1})
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.accountsWithIndex",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "name" : {
                                "$eq" : "alice"
                        }
                },
                "queryHash" : "2E50AAA9",
                "planCacheKey" : "BAD68206",
                "winningPlan" : {
                        "stage" : "PROJECTION_COVERED",
                        "transformBy" : {
                                "_id" : 0,
                                "name" : 1
                        },
                        "inputStage" : {
                                "stage" : "IXSCAN",     # 效率最快
                                "keyPattern" : {
                                        "name" : 1
                                },
                                "indexName" : "name_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "name" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "name" : [
                                                "[\"alice\", \"alice\"]"
                                        ]
                                }
                        }
                },
                "rejectedPlans" : [
                        {
                                "stage" : "PROJECTION_COVERED",
                                "transformBy" : {
                                        "_id" : 0,
                                        "name" : 1
                                },
                                "inputStage" : {
                                        "stage" : "IXSCAN",
                                        "keyPattern" : {
                                                "name" : 1,
                                                "balance" : -1
                                        },
                                        "indexName" : "name_1_balance_-1",
                                        "isMultiKey" : false,
                                        "multiKeyPaths" : {
                                                "name" : [ ],
                                                "balance" : [ ]
                                        },
                                        "isUnique" : false,
                                        "isSparse" : false,
                                        "isPartial" : false,
                                        "indexVersion" : 2,
                                        "direction" : "forward",
                                        "indexBounds" : {
                                                "name" : [
                                                        "[\"alice\", \"alice\"]"
                                                ],
                                                "balance" : [
                                                        "[MaxKey, MinKey]"
                                                ]
                                        }
                                }
                        }
                ]
        },
        "serverInfo" : {
                "host" : "e2faa98aab20",
                "port" : 27017,
                "version" : "4.2.23",
                "gitVersion" : "f4e6602d3a4c5b22e9d8bcf0722d0afd0ec01ea2"
        },
        "ok" : 1
}

  • 使用已经创建索引的字段进行排序
> db.accountsWithIndex.explain().find().sort( {name:1, balance:-1})
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.accountsWithIndex",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        
                },
                "queryHash" : "DC9EFEDE",
                "planCacheKey" : "DC9EFEDE",
                "winningPlan" : {
                        "stage" : "FETCH",
                        "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                        "name" : 1,
                                        "balance" : -1
                                },
                                "indexName" : "name_1_balance_-1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "name" : [ ],
                                        "balance" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "name" : [
                                                "[MinKey, MaxKey]"
                                        ],
                                        "balance" : [
                                                "[MaxKey, MinKey]"
                                        ]
                                }
                        }
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "e2faa98aab20",
                "port" : 27017,
                "version" : "4.2.23",
                "gitVersion" : "f4e6602d3a4c5b22e9d8bcf0722d0afd0ec01ea2"
        },
        "ok" : 1
}

  • 使用未创建索引的字段进行排序
db.accountsWithIndex.explain().find().sort( {name:1, balance:1})   # 前面索引创建的是1,-1
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.accountsWithIndex",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        
                },
                "queryHash" : "797A24CD",
                "planCacheKey" : "797A24CD",
                "winningPlan" : {
                        "stage" : "SORT",     # 耗费时间和资源的操作
                        "sortPattern" : {
                                "name" : 1,
                                "balance" : 1
                        },
                        "inputStage" : {
                                "stage" : "SORT_KEY_GENERATOR",
                                "inputStage" : {
                                        "stage" : "COLLSCAN",
                                        "direction" : "forward"
                                }
                        }
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "e2faa98aab20",
                "port" : 27017,
                "version" : "4.2.23",
                "gitVersion" : "f4e6602d3a4c5b22e9d8bcf0722d0afd0ec01ea2"
        },
        "ok" : 1
}

索引删除

db.collection.dropIndex()

如果需要更改某些字段上已经创建的索引

必须首先删除原有索引,再重新创建新索引

否则,新索引不会包含原有文档(没有被正确的创建)
  • 列出集合中已存在的索引
> db.accountsWithIndex.getIndexes()
[        {                "v" : 2,                "key" : {                        "_id" : 1                },                "name" : "_id_",                "ns" : "test.accountsWithIndex"        },        {                "v" : 2,                "key" : {                        "name" : 1                },                "name" : "name_1",                "ns" : "test.accountsWithIndex"        },        {                "v" : 2,                "key" : {                        "name" : 1,                        "balance" : -1                },                "name" : "name_1_balance_-1",                "ns" : "test.accountsWithIndex"        },        {                "v" : 2,                "key" : {                        "currency" : 1                },                "name" : "currency_1",                "ns" : "test.accountsWithIndex"        }]

  • 使用索引名称删除索引
> db.accountsWithIndex.dropIndex("name_1")
{ "nIndexesWas" : 4, "ok" : 1 }

  • 再次查看已存在的索引
> db.accountsWithIndex.getIndexes()
[        {                "v" : 2,                "key" : {                        "_id" : 1                },                "name" : "_id_",                "ns" : "test.accountsWithIndex"        },        {                "v" : 2,                "key" : {                        "name" : 1,                        "balance" : -1                },                "name" : "name_1_balance_-1",                "ns" : "test.accountsWithIndex"        },        {                "v" : 2,                "key" : {                        "currency" : 1                },                "name" : "currency_1",                "ns" : "test.accountsWithIndex"        }]

  • 使用索引定义删除索引
> db.accountsWithIndex.dropIndex({"name":1,"balance":-1})
{ "nIndexesWas" : 3, "ok" : 1 }

索引的唯一性

文档主键上创建的默认索引
> db.accountsWithIndex.getIndexes()
[        {                "v" : 2,                "key" : {             # 主键的默认索引                        "_id" : 1                },                "name" : "_id_",                "ns" : "test.accountsWithIndex"        },        {                "v" : 2,                "key" : {                        "currency" : 1                },                "name" : "currency_1",                "ns" : "test.accountsWithIndex"        }]

  • 创建一个具有唯一性的索引
db.accountsWithIndex.createIndex({ balance:1},{unique:true})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 2,
        "numIndexesAfter" : 3,
        "ok" : 1
}

  • 如果已有文档中的某个字段出现了重复值,就不可以在这个字段上创建唯一性索引
> db.accountsWithIndex.createIndex( { name:1},{unique:true})
{
        "ok" : 0,
        "errmsg" : "E11000 duplicate key error collection: test.accountsWithIndex index: name_1 dup key: { name: \"bob\" }",
        "code" : 11000,
        "codeName" : "DuplicateKey",
        "keyPattern" : {
                "name" : 1
        },
        "keyValue" : {
                "name" : "bob"
        }
}
  • 如果新增的文档不包含唯一性索引字段,只有第一篇缺失该字段的文档可以被写入数据库,索引中该文档的键值被默认null
> db.accountsWithIndex.insert( {name: "charlie", lastAccess:new Date()})   # 缺失balance字段
WriteResult({ "nInserted" : 1 })
第一次可以写入

> db.accountsWithIndex.insert( { name: "david", lastAccess: new Date()})
WriteResult({
        "nInserted" : 0,
        "writeError" : {
                "code" : 11000,
                "errmsg" : "E11000 duplicate key error collection: test.accountsWithIndex index: balance_1 dup key: { balance: null }"
        }
})

复合键索引也可以具有唯一性,在这种情况下,不同的文档之间,其所包含的符合键字段值得组合,不可以重复。

  • 先删除掉刚刚创建的唯一性索引

> db.accountsWithIndex.dropIndex("balance_1")
{ "nIndexesWas" : 3, "ok" : 1 }

索引的稀疏性

  • 只将包含索引键字段的文档加入到索引中(即使索引键字段值为null)
> db.accountsWithIndex.createIndex( { balance:1 },{ sparse:true})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 2,
        "numIndexesAfter" : 3,
        "ok" : 1
}

如果同一个索引既具有唯一性,又具有稀疏性,就可以保存多篇确实索引键值的文档了

> db.accountsWithIndex.dropIndex("balance_1")
{ "nIndexesWas" : 3, "ok" : 1 }

> db.accountsWithIndex.createIndex( {balance:1}, {unique:true,sparse:true })
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 2,
        "numIndexesAfter" : 3,
        "ok" : 1
}

> db.accountsWithIndex.insert( {name: "david",lastAccess: new Date()})
WriteResult({ "nInserted" : 1 })

复合键索引也可以具有稀疏性,在这种情况下,只有在缺失复合键所有包含的所有字段的情况下,文档才不会被加入到索引中

索引的生存时间

针对日期字段,或者包含日期元素的数组字段,可以使用设定了生存时间的索引,来自动删除字段值超过生存时间的文档

  • 查看所有文档

> db.accountsWithIndex.find()
{ "_id" : ObjectId("63db23461ce94226b81725b1"), "name" : "alice", "balance" : 50, "currency" : [ "GBP", "USD" ] }
{ "_id" : ObjectId("63db23461ce94226b81725b2"), "name" : "bob", "balance" : 20, "currency" : [ "AUD", "USD" ] }
{ "_id" : ObjectId("63db23461ce94226b81725b3"), "name" : "bob", "balance" : 300, "currency" : [ "CNY" ] }
{ "_id" : ObjectId("63db65971ce94226b81725b4"), "name" : "charlie", "lastAccess" : ISODate("2023-02-02T07:26:15.296Z") }
{ "_id" : ObjectId("63db75621ce94226b81725b6"), "name" : "david", "lastAccess" : ISODate("2023-02-02T08:33:38.541Z") }
  • 在lastAccess字段上创建一个生存时间是二十秒的索引
通过比较文档中的lastAccess字段的时间戳与现在的时间戳进行对比,超过了二十秒,就意味太旧了,数据库会自动删除这篇文档
> db.accountsWithIndex.createIndex( {lastAccess:1}, {expireAfterSeconds:20})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 3,
        "numIndexesAfter" : 4,
        "ok" : 1
}

查看一下
> db.accountsWithIndex.find()
{ "_id" : ObjectId("63db23461ce94226b81725b1"), "name" : "alice", "balance" : 50, "currency" : [ "GBP", "USD" ] }
{ "_id" : ObjectId("63db23461ce94226b81725b2"), "name" : "bob", "balance" : 20, "currency" : [ "AUD", "USD" ] }
{ "_id" : ObjectId("63db23461ce94226b81725b3"), "name" : "bob", "balance" : 300, "currency" : [ "CNY" ] }

发现那两篇已经超过了有效期,所以自动删除了
  • 再插入一篇新的文档
> db.accountsWithIndex.insert( {name:"eddie", lastAccess: new Date()})
WriteResult({ "nInserted" : 1 })

立马查看一下
> db.accountsWithIndex.find()
{ "_id" : ObjectId("63db23461ce94226b81725b1"), "name" : "alice", "balance" : 50, "currency" : [ "GBP", "USD" ] }
{ "_id" : ObjectId("63db23461ce94226b81725b2"), "name" : "bob", "balance" : 20, "currency" : [ "AUD", "USD" ] }
{ "_id" : ObjectId("63db23461ce94226b81725b3"), "name" : "bob", "balance" : 300, "currency" : [ "CNY" ] }
{ "_id" : ObjectId("63db7bd31ce94226b81725b7"), "name" : "eddie", "lastAccess" : ISODate("2023-02-02T09:01:07.895Z") }

过二十秒之后再看
> db.accountsWithIndex.find()
{ "_id" : ObjectId("63db23461ce94226b81725b1"), "name" : "alice", "balance" : 50, "currency" : [ "GBP", "USD" ] }
{ "_id" : ObjectId("63db23461ce94226b81725b2"), "name" : "bob", "balance" : 20, "currency" : [ "AUD", "USD" ] }
{ "_id" : ObjectId("63db23461ce94226b81725b3"), "name" : "bob", "balance" : 300, "currency" : [ "CNY" ] }

复合键索引不具备生存时间特性

当索引键是包含日期元素的数组字段时,数组中最小的日期将被用来计算文档是否过期

数据库使用一个后台线程来检测盒删除过期的文档,删除操作可能有一定的延迟