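Introduction

Earlier posts covered MongoDB's code structure, how requests are handled, and the catalog and storage layers. This post walks through the steps MongoDB takes to create (insert) a record.

First, recall an earlier post in this series, "MongoDB Source Code Study: OpRunner in Mongo". It explained that different commands are executed by different OpRunners, and that insert commands are handled by InsertOpRunner.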

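What happens when a record is created

InsertOpRunner calls receivedInsert, which in turn reaches insertDocuments in write_ops_exec.cpp (the main entry point for creating records).

Creating a record goes through the following steps:

Use the WriteUnitOfWork mechanism so that the steps below form a single atomic operation.
Create the collection if it does not exist.
Assign each record an OplogSlot (which carries a timestamp) so that the oplog stays ordered.
Call the storage layer to write the record.
Write the index entries.
Write the oplog.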
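To make the first step concrete, here is a minimal sketch of a scoped "unit of work" guard. It is a toy model written for this post, not MongoDB's WriteUnitOfWork: ToyRecoveryUnit, ToyWriteUnitOfWork, and insertAll are hypothetical names, and the only idea carried over is that writes are discarded unless commit() is called before the guard goes out of scope.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a storage transaction (not MongoDB's RecoveryUnit).
struct ToyRecoveryUnit {
    std::vector<std::string> committed;  // state visible after commit
    std::vector<std::string> pending;    // writes made inside the open unit of work

    void write(const std::string& doc) { pending.push_back(doc); }
    void commit() {
        committed.insert(committed.end(), pending.begin(), pending.end());
        pending.clear();
    }
    void abort() { pending.clear(); }
};

// RAII guard: leaving the scope without calling commit() discards all pending writes.
class ToyWriteUnitOfWork {
public:
    explicit ToyWriteUnitOfWork(ToyRecoveryUnit& ru) : _ru(ru) {}
    ~ToyWriteUnitOfWork() {
        if (!_committed) _ru.abort();
    }
    void commit() {
        assert(!_committed);  // a unit of work is committed at most once
        _ru.commit();
        _committed = true;
    }

private:
    ToyRecoveryUnit& _ru;
    bool _committed = false;
};

// Inserts all documents or none; `fail` simulates an error partway through.
bool insertAll(ToyRecoveryUnit& ru, const std::vector<std::string>& docs, bool fail) {
    ToyWriteUnitOfWork wuow(ru);
    for (const auto& d : docs) ru.write(d);
    if (fail) return false;  // the guard's destructor rolls the writes back
    wuow.commit();
    return true;
}
```

Calling insertAll with fail set to true leaves committed untouched, which is the all-or-nothing behavior the step list relies on.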

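Create the collection if it does not exist

Before a record is created, the code checks whether the collection exists and creates it if it does not. The logic is fairly simple, so the source is shown directly below; for details, see "MongoDB Source Code Study: Executing the Create Collection Command".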

// This lambda is defined inside insertBatchAndHandleErrors in write_ops_exec.cpp
// and is called once before the records are created.

boost::optional<AutoGetCollection> collection;
auto acquireCollection = [&] {
    while (true) {
        // Construct the AutoGetCollection
        collection.emplace(
            opCtx,
            wholeOp.getNamespace(),
            fixLockModeForSystemDotViewsChanges(wholeOp.getNamespace(), MODE_IX));
        // Stop retrying once the collection exists
        if (*collection)
            break;
        // Otherwise release the lock, create the collection
        // (this ends up calling AutoDB's userCreateNS), and try again
        collection.reset();
        makeCollection(opCtx, wholeOp.getNamespace());
    }
};
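The acquire-or-create loop can be illustrated in isolation. The following is a hedged sketch with hypothetical types (ToyCatalog, ToyCollectionHandle, acquireOrCreate); it does no locking and is only meant to show the "look up, create if missing, look up again" shape of the loop.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>

// Hypothetical in-memory catalog; it only tracks which namespaces exist.
struct ToyCatalog {
    std::set<std::string> collections;
    int createCalls = 0;

    bool exists(const std::string& ns) const { return collections.count(ns) > 0; }
    void create(const std::string& ns) {
        collections.insert(ns);
        ++createCalls;
    }
};

// Hypothetical handle returned once the collection is known to exist.
struct ToyCollectionHandle {
    std::string ns;
};

// Loops until the collection can be acquired, creating it on the first miss.
ToyCollectionHandle acquireOrCreate(ToyCatalog& catalog, const std::string& ns) {
    std::optional<ToyCollectionHandle> handle;
    while (true) {
        if (catalog.exists(ns)) {
            handle.emplace(ToyCollectionHandle{ns});
            break;
        }
        handle.reset();      // nothing acquired on this pass
        catalog.create(ns);  // create the collection, then retry the lookup
    }
    assert(handle.has_value());
    return *handle;
}
```

A second call for the same namespace finds the collection on the first pass, so create is not called again.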

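Assign each record an OplogSlot (which carries a timestamp) so that the oplog stays ordered

The oplog is an important feature of the database. It records change operations (such as insert, update, and delete) and can be used for data recovery, primary-secondary replication, and so on.

For a non-transactional insert, repl::getNextOpTimes is called to generate one OplogSlot per record. Pseudocode: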

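void insertDocuments(OperationContext* opCtx,
                     const CollectionPtr& collection,
                     std::vector<InsertStatement>::iterator begin,
                     std::vector<InsertStatement>::iterator end,
                     bool fromMigrate) {
    // Setup (simplified): batch size, transaction state, replication coordinator
    const size_t batchSize = std::distance(begin, end);
    const bool inTransaction = opCtx->inMultiDocumentTransaction();
    auto replCoord = repl::ReplicationCoordinator::get(opCtx);

    if (!inTransaction && !replCoord->isOplogDisabledFor(opCtx, collection->ns())) {
        // Only outside a transaction, and only when the oplog is enabled:
        // reserve enough OplogSlots for the whole batch up front
        auto oplogSlots = repl::getNextOpTimes(opCtx, batchSize);

        // Hand one slot to each InsertStatement
        auto slot = oplogSlots.begin();
        for (auto it = begin; it != end; it++) {
            it->oplogSlot = *slot++;
        }
    }

    // do insert
}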
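As a rough illustration of why reserving the slots up front keeps things ordered, here is a toy allocator that hands out strictly increasing timestamps in batches. ToySlotAllocator, ToyOplogSlot, ToyInsertStatement, and assignSlots are hypothetical names; a plain counter stands in for the real timestamp, and nothing here reflects how repl::getNextOpTimes is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical slot: just a timestamp drawn from a counter.
struct ToyOplogSlot {
    std::uint64_t timestamp = 0;
};

struct ToyInsertStatement {
    std::string doc;
    ToyOplogSlot oplogSlot;
};

// Hands out strictly increasing timestamps, `count` at a time.
class ToySlotAllocator {
public:
    std::vector<ToyOplogSlot> getNextSlots(std::size_t count) {
        std::vector<ToyOplogSlot> slots(count);
        for (auto& slot : slots) slot.timestamp = ++_last;
        return slots;
    }

private:
    std::uint64_t _last = 0;
};

// Reserves one slot per statement and assigns them in batch order.
void assignSlots(ToySlotAllocator& alloc, std::vector<ToyInsertStatement>& batch) {
    auto slots = alloc.getNextSlots(batch.size());
    assert(slots.size() == batch.size());
    auto slot = slots.begin();
    for (auto& stmt : batch) stmt.oplogSlot = *slot++;
}
```

Because a batch takes a contiguous range from the counter, statements within one batch are ordered, and a later batch always starts after the previous one ends.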

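Call the storage layer to write the record and its index entries

Once the OplogSlots have been assigned, CollectionImpl::insertDocuments writes the data and the index entries (only the overall flow is covered here; the details may follow in a later post).

The flow is:

Get the SnapshotId, which relies on the WriteUnitOfWork mechanism. After the write finishes, the SnapshotId is read again and compared with the first value; an unchanged SnapshotId confirms the operation stayed atomic.
Call insertRecords on the RecordStore to write the data.
Call indexRecords on the IndexCatalog to write the index entries.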

Status CollectionImpl::insertDocuments(OperationContext* opCtx,
                                       const std::vector<InsertStatement>::const_iterator begin,
                                       const std::vector<InsertStatement>::const_iterator end,
                                       OpDebug* opDebug,
                                       bool fromMigrate) const {

    // do some check

    // Read the SnapshotId once before writing
    const SnapshotId sid = opCtx->recoveryUnit()->getSnapshotId();
    // Write the data and the index entries
    Status status = _insertDocuments(opCtx, begin, end, opDebug, fromMigrate);
    if (!status.isOK()) {
        return status;
    }
    // Read the SnapshotId again; it must be unchanged (an optimistic-lock style check)
    invariant(sid == opCtx->recoveryUnit()->getSnapshotId());

    // Register a callback that runs when the unit of work commits
    opCtx->recoveryUnit()->onCommit(
        [this](boost::optional<Timestamp>) { _shared->notifyCappedWaitersIfNeeded(); });

    return Status::OK();
}

Status CollectionImpl::_insertDocuments(OperationContext* opCtx,
                                        const std::vector<InsertStatement>::const_iterator begin,
                                        const std::vector<InsertStatement>::const_iterator end,
                                        OpDebug* opDebug,
                                        bool fromMigrate) const {

    std::vector<Record> records;
    std::vector<Timestamp> timestamps;
    for (auto it = begin; it != end; it++) {
        const auto& doc = it->doc;
        // Simplified: left empty here so that the record store assigns the id
        RecordId recordId;
        records.emplace_back(Record{recordId, RecordData(doc.objdata(), doc.objsize())});
        timestamps.emplace_back(it->oplogSlot.getTimestamp());
    }
    // Write the data
    Status status = _shared->_recordStore->insertRecords(opCtx, &records, timestamps);
    if (!status.isOK())
        return status;

    std::vector<BsonRecord> bsonRecords;
    int recordIndex = 0;
    for (auto it = begin; it != end; it++) {
        // Pair each document with the RecordId it was stored under
        RecordId loc = records[recordIndex++].id;
        BsonRecord bsonRecord = {loc, Timestamp(it->oplogSlot.getTimestamp()), &(it->doc)};
        bsonRecords.push_back(bsonRecord);
    }
    // Write the index entries
    int64_t keysInserted = 0;
    status = _indexCatalog->indexRecords(
        opCtx, {this, CollectionPtr::NoYieldTag{}}, bsonRecords, &keysInserted);

    // Notify the OpObserver, which starts the oplog write
    opCtx->getServiceContext()->getOpObserver()->onInserts(
        opCtx, ns(), uuid(), begin, end, fromMigrate);

    return status;
}
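The "store the records first, then index them by the ids they were stored under" order can be sketched with a toy collection. Everything below is hypothetical (ToyDoc, ToyCollection, a single index on name); it is an in-memory illustration of the two-phase shape, not of the RecordStore or IndexCatalog APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct ToyDoc {
    std::string name;
    int age;
};

class ToyCollection {
public:
    // Phase 1 stores every document and lets the "record store" assign ids.
    // Phase 2 adds one index entry per document, pointing at its id.
    std::vector<std::int64_t> insertDocuments(const std::vector<ToyDoc>& docs) {
        std::vector<std::int64_t> ids;
        ids.reserve(docs.size());
        for (const auto& doc : docs) {
            const std::int64_t id = _nextId++;
            _records.emplace(id, doc);
            ids.push_back(id);
        }
        assert(ids.size() == docs.size());
        for (std::size_t i = 0; i < docs.size(); ++i) {
            _nameIndex.emplace(docs[i].name, ids[i]);
        }
        return ids;
    }

    // Looks a document up through the index; returns nullptr when absent.
    const ToyDoc* findByName(const std::string& name) const {
        auto it = _nameIndex.find(name);
        if (it == _nameIndex.end()) return nullptr;
        return &_records.at(it->second);
    }

private:
    std::int64_t _nextId = 1;
    std::map<std::int64_t, ToyDoc> _records;               // record id -> document
    std::multimap<std::string, std::int64_t> _nameIndex;   // index key -> record id
};
```

The index stores only record ids, which is why the second loop in _insertDocuments needs the ids produced by the first write.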

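Write the oplog

After the data and index entries have been written, OpObserverImpl::onInserts is triggered, and the oplog write is completed there (again, the details are left for a later post).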
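The observer hand-off can be pictured with a small sketch. The interface and types below (ToyOpObserver, ToyOplogWriter, ToyStatement, ToyOplogEntry) are hypothetical and heavily simplified; the one detail borrowed from MongoDB is that insert entries in the oplog use the op type "i".

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified oplog entry.
struct ToyOplogEntry {
    std::string op;    // "i" marks an insert
    std::string ns;    // namespace, e.g. "test.users"
    std::uint64_t ts;  // timestamp taken from the pre-assigned slot
    std::string doc;
};

struct ToyStatement {
    std::string doc;
    std::uint64_t ts;
};

// The write path only knows this interface, not what observers do with the event.
class ToyOpObserver {
public:
    virtual ~ToyOpObserver() = default;
    virtual void onInserts(const std::string& ns, const std::vector<ToyStatement>& stmts) = 0;
};

// One possible observer: appends an oplog entry per inserted document.
class ToyOplogWriter : public ToyOpObserver {
public:
    std::vector<ToyOplogEntry> oplog;

    void onInserts(const std::string& ns, const std::vector<ToyStatement>& stmts) override {
        assert(!ns.empty());
        for (const auto& stmt : stmts) {
            oplog.push_back(ToyOplogEntry{"i", ns, stmt.ts, stmt.doc});
        }
    }
};
```

Keeping the oplog write behind an observer means the insert path does not need to change when other reactions to an insert are added.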

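Summary

This post covered the main flow of creating records and index entries. It ties together several topics from earlier posts, such as collection creation and WriteUnitOfWork, but it also leaves many gaps that later posts will fill in.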

MongoDB Source Code Study: Creating Records and Indexes (insertDocuments)

To be continued