RocketMQ Message Storage


1. Storage Files

RocketMQ uses the file system as its message store.

Storage efficiency: file system > K-V store > relational database

Reliability: relational database > K-V store > file system



CommitLog

Messages of all Topics are stored sequentially (append-only) in the CommitLog.

Location: {ROCKET_HOME}/store/commitlog

Each file defaults to 1 GB. When one file fills up, a new one is created, named after the physical offset of its first message, zero-padded on the left to 20 digits. A message can therefore be located quickly from its physical offset alone.

For example, a second file named 00000000001073741824 means the first message stored in it has physical offset 1073741824, and 1073741824 bytes / 1024 / 1024 / 1024 = 1 GB, which is exactly the size of the previous file (and of every file).
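The naming rule is easy to verify in code. A minimal sketch (helper names are mine, not RocketMQ's):

```java
public class CommitLogFileName {
    // A CommitLog file is named after the physical offset of its first message,
    // zero-padded on the left to 20 digits.
    static String fileName(long startOffset) {
        return String.format("%020d", startOffset);
    }

    // Given any physical offset, the file holding it starts at offset - offset % fileSize.
    static String locateFile(long physicalOffset, long fileSize) {
        return fileName(physicalOffset - physicalOffset % fileSize);
    }

    public static void main(String[] args) {
        long oneGb = 1024L * 1024 * 1024;
        System.out.println(fileName(0L));                  // the first file
        System.out.println(fileName(oneGb));               // the second file
        System.out.println(locateFile(oneGb + 42, oneGb)); // offset 1073741866 lives in the second file
    }
}
```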

A message consists of the following 20 parts:

  1. TotalSize: total length of this message (4 bytes)

  2. MagicCode: magic number, fixed value 0xdaa320a7 (4 bytes)

  3. BodyCRC: CRC checksum of the message body (4 bytes)

  4. QueueID: ID of the consume queue the message belongs to (4 bytes)

  5. Flag: message flag, not processed by RocketMQ, reserved for applications (4 bytes)

  6. QueueOffset: index of the message within its consume queue; multiplying it by 20 gives the byte offset in the ConsumeQueue file (8 bytes)

  7. PhysicalOffset: physical offset of the message within the CommitLog file (8 bytes)

  8. SysFlag: system flag, e.g. whether the message is compressed or transactional (4 bytes)

  9. BornTimestamp: timestamp at which the producer called the send API (8 bytes)

  10. BornHost: IP and port of the message sender (8 bytes)

  11. StoreTimeStamp: millisecond timestamp at which the message was stored (8 bytes)

  12. StoreHostAddress: IP and port of the Broker (8 bytes)

  13. ReConsumeTimes: number of times the message has been retried (4 bytes)

  14. PreparedTransactionOffset: physical offset of the transactional prepared message (8 bytes)

  15. BodyLength: length of the message body (4 bytes)

  16. Body: message body (as many bytes as BodyLength says)

  17. TopicLength: length of the Topic name (1 byte)

  18. Topic: Topic name (as many bytes as TopicLength says)

  19. PropertiesLength: length of the message properties (2 bytes)

  20. Properties: message properties (as many bytes as PropertiesLength says)
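As a quick sanity check of this layout, here is a minimal sketch (my own helper, not RocketMQ code) that reads the first two fields of an entry from a ByteBuffer and validates the magic number:

```java
import java.nio.ByteBuffer;

public class MessageHeaderReader {
    static final int MESSAGE_MAGIC_CODE = 0xdaa320a7;

    // Read TotalSize and MagicCode (the first 4 + 4 bytes of an entry)
    // and fail fast if the magic number does not match.
    static int readTotalSize(ByteBuffer buf) {
        int totalSize = buf.getInt(); // 1. TotalSize (4 bytes)
        int magic = buf.getInt();     // 2. MagicCode (4 bytes)
        if (magic != MESSAGE_MAGIC_CODE) {
            throw new IllegalStateException("bad magic: " + Integer.toHexString(magic));
        }
        return totalSize;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putInt(185).putInt(MESSAGE_MAGIC_CODE);
        buf.flip();
        System.out.println(readTotalSize(buf)); // 185
    }
}
```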


ConsumeQueue

RocketMQ implements consumption as subscription on Topics: a consumer cares about all messages under a given Topic. But messages of the same Topic are stored non-contiguously in the CommitLog, so scanning the CommitLog directly for them would be extremely inefficient. The ConsumeQueue exists to solve exactly this problem; think of it as an "index" file over the messages in the CommitLog.

Location: {ROCKET_HOME}/store/consumequeue

The first directory level is the Topic and the second is the queue: each Topic gets a folder, and each queue of the Topic gets a subfolder. With the default configuration there are 4 queues, so the subfolders are named 0, 1, 2, and 3.

A single file holds 300,000 entries by default, about 6 MB (300,000 × 20 bytes). When one file fills up, the next is created, named after the physical offset of its first entry, zero-padded to 20 digits. When a message lands in the CommitLog, a dedicated thread dispatches it onward to build the ConsumeQueue.

An entry consists of the following 3 parts:

  1. CommitLog Offset: the PhysicalOffset value of the message in the CommitLog (8 bytes)

  2. Size: the TotalSize value of the message in the CommitLog (4 bytes)

  3. Tag HashCode: hash of the message tag, used to further distinguish message categories under a Topic (8 bytes)
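Because every entry is a fixed 20 bytes, locating the entry for a logical queue offset is simple arithmetic. A sketch (helper names are mine):

```java
public class ConsumeQueueLocator {
    static final int ENTRY_SIZE = 20;           // 8 + 4 + 8 bytes per entry
    static final int ENTRIES_PER_FILE = 300_000;

    // Byte position of a logical queue offset across all ConsumeQueue files.
    static long bytePosition(long queueOffset) {
        return queueOffset * ENTRY_SIZE;
    }

    // Starting offset (i.e. the name) of the file that holds the entry.
    static long fileStartOffset(long queueOffset) {
        long pos = bytePosition(queueOffset);
        long fileSize = (long) ENTRIES_PER_FILE * ENTRY_SIZE; // 6,000,000 bytes
        return pos - pos % fileSize;
    }

    public static void main(String[] args) {
        System.out.println(bytePosition(3));        // the 4th entry starts at byte 60
        System.out.println(fileStartOffset(300_000)); // entry 300,000 is the first of the second file
    }
}
```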


Index

The Index provides the ability to look up messages by key or by time range. Files are named after their creation timestamp; a single IndexFile has a fixed size of roughly 400 MB and can hold 20 million index entries. Internally, the Index is designed as a HashMap-like structure laid out on the file system.

Location: {ROCKET_HOME}/store/index
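A sketch of the "HashMap on the file system" idea: a key hashes to a slot, whose byte position is computed from a fixed header and slot size. The 40-byte header and 5,000,000-slot count here are assumptions based on common defaults, and the helper names are mine:

```java
public class IndexSlotLocator {
    static final int HEADER_SIZE = 40;       // assumed IndexHeader size
    static final int SLOT_SIZE = 4;          // each slot stores one int (index position)
    static final int SLOT_NUM = 5_000_000;   // assumed default slot count

    // Non-negative hash of the key (mask out the sign bit).
    static int keyHash(String key) {
        return key.hashCode() & 0x7fffffff;
    }

    // Byte position of the hash slot for a key inside an IndexFile.
    static long slotPosition(String key) {
        int slot = keyHash(key) % SLOT_NUM;
        return HEADER_SIZE + (long) slot * SLOT_SIZE;
    }
}
```

Each slot then points at the latest index entry for that hash, and entries chain backwards to resolve collisions, much like buckets in an in-memory HashMap.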


Let's try an example!

(1) Start NamesrvStartup, BrokerStartup, MyProducer, and MyConsumer

public class MyProducer {

    public static void main(String[] args) throws MQClientException, InterruptedException {
        DefaultMQProducer producer = new DefaultMQProducer("ProducerA");
        producer.setNamesrvAddr("127.0.0.1:9876");
        producer.start();
        for (int i = 0; i < 10; i++) {
            try {
                Message msg = new Message(
                        "Topic-01",
                        ("Store Msg " + i).getBytes(RemotingHelper.DEFAULT_CHARSET));
                producer.send(msg);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        producer.shutdown();
    }
}
public class MyConsumer {

    public static void main(String[] args) throws InterruptedException, MQClientException {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("ConsumerA");
        consumer.setNamesrvAddr("localhost:9876");
        consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_FIRST_OFFSET);
        consumer.subscribe("Topic-01", "*");
        consumer.registerMessageListener(
                (MessageListenerConcurrently) (msgs, context) -> {
            System.out.printf("Receive New Messages: %s%n", msgs);
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        });
        consumer.start();
    }
}

(2) Go into the {ROCKET_HOME}/store/commitlog directory

(3) Run xxd 00000000000000000000 | less to view the contents of the first file

[Figure: annotated xxd dump of the first CommitLog file]

The parts worth attention are marked with circled numbers below; they match the numbering of the message layout above

① TotalSize = 000000b9: the total length of this message is 185 bytes

② MagicCode = daa320a7: magic number, fixed value 0xdaa320a7

④ QueueID = 00000000: the message belongs to queue 0 of the ConsumeQueue

⑥ QueueOffset = 0000000000000000: the message's offset within queue 0 of the ConsumeQueue is 0

⑦ PhysicalOffset = 0000000000000000: the message's offset within the CommitLog file is 0

⑨ BornTimestamp = 0000017e093e326c: the producer called the send API at timestamp 1640832578156

⑮ BodyLength = 0000000b: the message body is 11 bytes long; the body content is

53746f7265204d73672031 = Store Msg 1

⑰ TopicLength = 08: the Topic name is 8 bytes long; the name is

546f7069632d3031 = Topic-01

The first message in the CommitLog ends here, and the second message follows immediately after it
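The hex-to-text decoding used above can be reproduced with a tiny helper (the class name is mine):

```java
public class HexDecoder {
    // Decode a hex string (as seen in xxd output) back into ASCII text.
    static String decode(String hex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hex.length(); i += 2) {
            sb.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("53746f7265204d73672031")); // Store Msg 1
        System.out.println(decode("546f7069632d3031"));       // Topic-01
    }
}
```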

(4) Go into the {ROCKET_HOME}/store/consumequeue/Topic-01/0 directory

(5) Run xxd 00000000000000000000 | less to view the contents of the first file

[Figure: xxd dump of the first ConsumeQueue file]

① CommitLog Offset = 00000000: the PhysicalOffset of the corresponding message in the CommitLog is 0

② Size = 000000b9: the TotalSize of the corresponding message in the CommitLog is 185 bytes

③ Tag HashCode = 00000000: the tag hash of the corresponding message is 0

In addition, the {ROCKET_HOME}/store/config/consumerOffset.json file records the consumption progress of every queue in the ConsumeQueue

{
    "offsetTable":{
        "%RETRY%ConsumerA@ConsumerA":{0:0},
        "Topic-01@ConsumerA":{0:3,1:2,2:2,3:3}
    }
}

Summary


  • RocketMQ uses a hybrid storage layout: all Topics on a single Broker instance share one data file, the CommitLog
  • Data and indexes are stored separately from the Producer's and Consumer's point of view: the Producer sends a message to the Broker, and the Broker persists it to the CommitLog with either synchronous or asynchronous flushing
  • A dedicated background thread in the Broker, ReputMessageService, keeps dispatching requests to asynchronously build the ConsumeQueue and Index data
  • When consuming, the Consumer uses the CommitLog Offset and Size recorded in the ConsumeQueue to locate the corresponding message in the CommitLog

2. Storage Flow

Core Objects

Let's first look at the main objects involved in the storage flow


DefaultMessageStore: the top-level message-storage object, initialized when the Broker starts; it wraps the {ROCKET_HOME}/store directory

public class DefaultMessageStore implements MessageStore {
    // Message store configuration, with 70+ properties
    private final MessageStoreConfig messageStoreConfig;
    // CommitLog
    private final CommitLog commitLog;
    // Map linking Topic, QueueId, and ConsumeQueue
    private final ConcurrentMap<String, ConcurrentMap<Integer, ConsumeQueue>> consumeQueueTable;
}

CommitLog: wraps the {ROCKET_HOME}/store/commitlog directory

public class CommitLog {
    // MappedFileQueue
    protected final MappedFileQueue mappedFileQueue;
    // Used to reach configuration and operations on the DefaultMessageStore
    protected final DefaultMessageStore defaultMessageStore;
    // Map from Topic-QueueId to offset
    // e.g. key = MyTopic-0, value = 1 means queue 0 of Topic MyTopic already holds one message
    protected HashMap<String, Long> topicQueueTable = new HashMap<String, Long>(1024);
    // Lock guarding message writes; the default implementation is a spin lock built from a loop plus CAS
    protected final PutMessageLock putMessageLock;
    // Flush service; asynchronous flush by default
    private final FlushCommitLogService flushCommitLogService;
}
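The comment on putMessageLock mentions a spin lock built from a loop plus CAS. A minimal sketch of that idea (class name is mine, not RocketMQ's):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLockSketch {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    // Spin until the CAS from false -> true succeeds; this is cheap when
    // the critical section (an in-memory append) is very short.
    public void lock() {
        while (!locked.compareAndSet(false, true)) {
            // busy-wait
        }
    }

    public void unlock() {
        locked.compareAndSet(true, false);
    }

    public boolean isLocked() {
        return locked.get();
    }
}
```

Spinning avoids the cost of parking and waking threads, which pays off precisely because appending a message to the mapped buffer takes microseconds.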

MappedFileQueue: wraps all files under the {ROCKET_HOME}/store/commitlog directory

public class MappedFileQueue {
    // Maintains the collection of MappedFiles
    private final CopyOnWriteArrayList<MappedFile> mappedFiles = new CopyOnWriteArrayList<MappedFile>();
}

MappedFile: wraps a single file under the {ROCKET_HOME}/store/commitlog directory

public class MappedFile extends ReferenceResource {
    // Offset of messages already written; never exceeds fileSize
    protected final AtomicInteger wrotePosition = new AtomicInteger(0);
    // Offset of messages already committed
    protected final AtomicInteger committedPosition = new AtomicInteger(0);
    // Offset of messages already flushed to disk
    private final AtomicInteger flushedPosition = new AtomicInteger(0);
    // File size, 1 GB by default
    protected int fileSize;
    // FileChannel
    protected FileChannel fileChannel;
    // File name
    private String fileName;
    // Physical offset encoded in the file name
    private long fileFromOffset;
    // File object
    private File file;
    // Obtained from the FileChannel; the buffer actually read and written,
    // used as an alternative to the writeBuffer below
    private MappedByteBuffer mappedByteBuffer;
    // Obtained from the TransientStorePool; an alternative to the MappedByteBuffer above
    protected ByteBuffer writeBuffer = null;
}

Core Code

Only the code for the key steps is shown below

public class DefaultMessageStore implements MessageStore {
    // Entry point for message storage
    public CompletableFuture<PutMessageResult> asyncPutMessage(MessageExtBrokerInner msg) {
        // As shown above, messages are stored in the CommitLog,
        // so delegate to the same-named method on the CommitLog held by this store
        CompletableFuture<PutMessageResult> putResultFuture = 
            this.commitLog.asyncPutMessage(msg);
    }
}
public class CommitLog {
    public CompletableFuture<PutMessageResult> asyncPutMessage(final MessageExtBrokerInner msg) {
        // Get the latest MappedFile from the MappedFileQueue, i.e. the file with the largest name
        MappedFile mappedFile = this.mappedFileQueue.getLastMappedFile();
        // Acquire the lock
        putMessageLock.lock();

        try {
            // Append the message to memory
            result = mappedFile.appendMessage(msg, this.appendMessageCallback);
            // Flush to disk
            CompletableFuture<PutMessageStatus> flushResultFuture = 
                                                       submitFlushRequest(result, msg);
            // Replicate to the slave
            CompletableFuture<PutMessageStatus> replicaResultFuture = 
                                                     submitReplicaRequest(result, msg);
            // Combine the results of the two steps above and return
            return flushResultFuture.thenCombine(...)

        } finally {
            // Release the lock
            putMessageLock.unlock();
        }
    }
}
Step 1: append the message to memory

This ultimately calls the CommitLog.DefaultAppendMessageCallback#doAppend() method

The four method parameters:

  • fileFromOffset: physical offset of the current latest MappedFile, i.e. its file name

  • byteBuffer: the ByteBuffer of the current latest MappedFile

  • maxBlank: fileSize of the current latest MappedFile minus the wrotePosition already written, i.e. the space remaining in this file

  • msgInner: the message object after internal re-wrapping

public AppendMessageResult doAppend(final long fileFromOffset, final ByteBuffer byteBuffer, final int maxBlank, final MessageExtBrokerInner msgInner) {

    // CommitLog physical offset = start offset of the latest MappedFile + current position inside its ByteBuffer
    long wroteOffset = fileFromOffset + byteBuffer.position();

    int sysflag = msgInner.getSysFlag();
    
    // Build a unique MsgId: UtilAll.bytes2string(broker IP address + CommitLog physical offset)
    int bornHostLength = (sysflag & MessageSysFlag.BORNHOST_V6_FLAG) == 0 ? 4 + 4 : 16 + 4;
    int storeHostLength = (sysflag & MessageSysFlag.STOREHOSTADDRESS_V6_FLAG) == 0 ? 4 + 4 : 16 + 4;
    ByteBuffer bornHostHolder = ByteBuffer.allocate(bornHostLength);
    ByteBuffer storeHostHolder = ByteBuffer.allocate(storeHostLength);
    this.resetByteBuffer(storeHostHolder, storeHostLength);
    String msgId;
    if ((sysflag & MessageSysFlag.STOREHOSTADDRESS_V6_FLAG) == 0) {
        msgId = MessageDecoder.createMessageId(this.msgIdMemory, 
                             msgInner.getStoreHostBytes(storeHostHolder), wroteOffset);
    } else {
        msgId = MessageDecoder.createMessageId(this.msgIdV6Memory,
                             msgInner.getStoreHostBytes(storeHostHolder), wroteOffset);
    }
    
    // Look up the queue offset for this Topic-QueueId in topicQueueTable
    keyBuilder.setLength(0);
    keyBuilder.append(msgInner.getTopic());
    keyBuilder.append('-');
    keyBuilder.append(msgInner.getQueueId());
    String key = keyBuilder.toString();
    Long queueOffset = CommitLog.this.topicQueueTable.get(key);
    if (null == queueOffset) {
        queueOffset = 0L;
        CommitLog.this.topicQueueTable.put(key, queueOffset);
    }

    // Transactional messages need special handling of queueOffset
    final int tranType = MessageSysFlag.getTransactionValue(msgInner.getSysFlag());
    switch (tranType) {
        case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
        case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
            queueOffset = 0L;
            break;
        case MessageSysFlag.TRANSACTION_NOT_TYPE:
        case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
        default:
            break;
    }

    // Serialize the message properties
    final byte[] propertiesData =
        msgInner.getPropertiesString() == null ? null : msgInner.getPropertiesString().getBytes(MessageDecoder.CHARSET_UTF8);

    final int propertiesLength = propertiesData == null ? 0 : propertiesData.length;

    if (propertiesLength > Short.MAX_VALUE) {
        log.warn("putMessage message properties length too long. length={}", propertiesData.length);
        return new AppendMessageResult(AppendMessageStatus.PROPERTIES_SIZE_EXCEEDED);
    }
    
    // Serialize the Topic name
    final byte[] topicData = msgInner.getTopic().getBytes(MessageDecoder.CHARSET_UTF8);
    final int topicLength = topicData.length;

    final int bodyLength = msgInner.getBody() == null ? 0 : msgInner.getBody().length;
    // Compute the total message length
    final int msgLen = calMsgLength(msgInner.getSysFlag(), bodyLength, topicLength, propertiesLength);

    if (msgLen > this.maxMessageSize) {
        CommitLog.log.warn("message size exceeded, msg total size: " + msgLen + ", msg body size: " + bodyLength + ", maxMessageSize: " + this.maxMessageSize);
        return new AppendMessageResult(AppendMessageStatus.MESSAGE_SIZE_EXCEEDED);
    }

    // Check whether the file has enough free space left
    if ((msgLen + END_FILE_MIN_BLANK_LENGTH) > maxBlank) {
        this.resetByteBuffer(this.msgStoreItemMemory, maxBlank);
        this.msgStoreItemMemory.putInt(maxBlank);
        this.msgStoreItemMemory.putInt(CommitLog.BLANK_MAGIC_CODE);
        final long beginTimeMills = CommitLog.this.defaultMessageStore.now();
        byteBuffer.put(this.msgStoreItemMemory.array(), 0, maxBlank);
        return new AppendMessageResult(AppendMessageStatus.END_OF_FILE, wroteOffset, maxBlank, msgId, msgInner.getStoreTimestamp(), queueOffset, CommitLog.this.defaultMessageStore.now() - beginTimeMills);
    }

    // Initialize msgStoreItemMemory
    this.resetByteBuffer(msgStoreItemMemory, msgLen);
    // 1 TOTALSIZE
    this.msgStoreItemMemory.putInt(msgLen);
    // 2 MAGICCODE
    this.msgStoreItemMemory.putInt(CommitLog.MESSAGE_MAGIC_CODE);
    // 3 BODYCRC
    this.msgStoreItemMemory.putInt(msgInner.getBodyCRC());
    // 4 QUEUEID
    this.msgStoreItemMemory.putInt(msgInner.getQueueId());
    // 5 FLAG
    this.msgStoreItemMemory.putInt(msgInner.getFlag());
    // 6 QUEUEOFFSET
    this.msgStoreItemMemory.putLong(queueOffset);
    // 7 PHYSICALOFFSET
    this.msgStoreItemMemory.putLong(fileFromOffset + byteBuffer.position());
    // 8 SYSFLAG
    this.msgStoreItemMemory.putInt(msgInner.getSysFlag());
    // 9 BORNTIMESTAMP
    this.msgStoreItemMemory.putLong(msgInner.getBornTimestamp());
    // 10 BORNHOST
    this.resetByteBuffer(bornHostHolder, bornHostLength);
    this.msgStoreItemMemory.put(msgInner.getBornHostBytes(bornHostHolder));
    // 11 STORETIMESTAMP
    this.msgStoreItemMemory.putLong(msgInner.getStoreTimestamp());
    // 12 STOREHOSTADDRESS
    this.resetByteBuffer(storeHostHolder, storeHostLength);
    this.msgStoreItemMemory.put(msgInner.getStoreHostBytes(storeHostHolder));
    // 13 RECONSUMETIMES
    this.msgStoreItemMemory.putInt(msgInner.getReconsumeTimes());
    // 14 Prepared Transaction Offset
    this.msgStoreItemMemory.putLong(msgInner.getPreparedTransactionOffset());
    // 15 BODY
    this.msgStoreItemMemory.putInt(bodyLength);
    if (bodyLength > 0)
        this.msgStoreItemMemory.put(msgInner.getBody());
    // 16 TOPIC
    this.msgStoreItemMemory.put((byte) topicLength);
    this.msgStoreItemMemory.put(topicData);
    // 17 PROPERTIES
    this.msgStoreItemMemory.putShort((short) propertiesLength);
    if (propertiesLength > 0)
        this.msgStoreItemMemory.put(propertiesData);

    final long beginTimeMills = CommitLog.this.defaultMessageStore.now();
    
    // Finally, write the message into byteBuffer!
    byteBuffer.put(this.msgStoreItemMemory.array(), 0, msgLen);

    AppendMessageResult result = new AppendMessageResult(AppendMessageStatus.PUT_OK, 
                wroteOffset, msgLen, msgId, msgInner.getStoreTimestamp(), queueOffset, 
                CommitLog.this.defaultMessageStore.now() - beginTimeMills);

    switch (tranType) {
        case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
        case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
            break;
        case MessageSysFlag.TRANSACTION_NOT_TYPE:
        case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
            // The next update ConsumeQueue information
            CommitLog.this.topicQueueTable.put(key, ++queueOffset);
            break;
        default:
            break;
    }
    return result;
}

Recap: compute the physical offset in the CommitLog, then write the message in the agreed format into the MappedFile's ByteBuffer
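The calMsgLength() call above sums the 20 fields from the layout in section 1. A sketch of that arithmetic (the helper is mine; it assumes IPv4 addresses, i.e. 8-byte BornHost and StoreHostAddress fields, and the 75-byte properties figure below is inferred from the 185-byte example message):

```java
public class MsgLengthCalc {
    // Total message length = all fixed-size fields plus the three variable-size ones.
    static int calMsgLength(int bodyLength, int topicLength, int propertiesLength) {
        return 4   // 1  TOTALSIZE
             + 4   // 2  MAGICCODE
             + 4   // 3  BODYCRC
             + 4   // 4  QUEUEID
             + 4   // 5  FLAG
             + 8   // 6  QUEUEOFFSET
             + 8   // 7  PHYSICALOFFSET
             + 4   // 8  SYSFLAG
             + 8   // 9  BORNTIMESTAMP
             + 8   // 10 BORNHOST (IPv4 + port)
             + 8   // 11 STORETIMESTAMP
             + 8   // 12 STOREHOSTADDRESS (IPv4 + port)
             + 4   // 13 RECONSUMETIMES
             + 8   // 14 PREPAREDTRANSACTIONOFFSET
             + 4 + bodyLength        // 15-16 BODYLENGTH + BODY
             + 1 + topicLength       // 17-18 TOPICLENGTH + TOPIC
             + 2 + propertiesLength; // 19-20 PROPERTIESLENGTH + PROPERTIES
    }

    public static void main(String[] args) {
        System.out.println(calMsgLength(11, 8, 75)); // 185, matching the xxd example
    }
}
```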

Step 2: flush to disk

When the CommitLog is initialized, it picks a flush strategy based on configuration

if (FlushDiskType.SYNC_FLUSH == defaultMessageStore.getMessageStoreConfig().getFlushDiskType()) {
    // Synchronous flush
    this.flushCommitLogService = new GroupCommitService();
} else {
    // Asynchronous flush
    this.flushCommitLogService = new FlushRealTimeService();
}

The flush thread is started as soon as the Broker starts

public void start() {
    this.flushCommitLogService.start();  // start the flush thread
    if (defaultMessageStore.getMessageStoreConfig().isTransientStorePoolEnable()) {
        this.commitLogService.start();
    }
}

When CommitLog#submitFlushRequest() runs, it also calls wakeup() to actively wake the flush thread

The run() methods of both flush threads follow the same pattern: an endless loop with a fixed sleep, then a flush

if (FlushDiskType.SYNC_FLUSH == this.defaultMessageStore.getMessageStoreConfig()
                                                                .getFlushDiskType()) {
    final GroupCommitService service = (GroupCommitService) this.flushCommitLogService;
    // waitStoreMsgOK == true means the result is returned only after the message is flushed to disk
    if (messageExt.isWaitStoreMsgOK()) {
        GroupCommitRequest request = new GroupCommitRequest(
        result.getWroteOffset() + result.getWroteBytes(),
              this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
        // Put the flush request into the write queue of GroupCommitService
        service.putRequest(request);
        return request.future();
    } else {
        service.wakeup();
        return CompletableFuture.completedFuture(PutMessageStatus.PUT_OK);
    }
}

Synchronous flush

Synchronous flushing is implemented by GroupCommitService

The implementation is quite clever: it maintains two GroupCommitRequest lists, a write queue and a read queue. Requests are first appended to the write queue, then the two list references are swapped, and the requests are drained from the read queue. The benefit is that each queue is locked independently, so reads and writes never block each other, effectively allowing simultaneous reading and writing.

A GroupCommitRequest carries a nextOffset property. In the endless loop of run(), doCommit() and swapRequests() execute every 10 ms by default

class GroupCommitService extends FlushCommitLogService {
    // Write queue
    private volatile List<GroupCommitRequest> requestsWrite = new ArrayList<>();
    // Read queue
    private volatile List<GroupCommitRequest> requestsRead = new ArrayList<>();
    
    public synchronized void putRequest(final GroupCommitRequest request) {
        synchronized (this.requestsWrite) {   // locking here does not block the swapRequests() below
            this.requestsWrite.add(request);
        }
        this.wakeup();
    }
    
    // Swapping the two queues allows reads and writes to proceed at the same time!
    private void swapRequests() {
        List<GroupCommitRequest> tmp = this.requestsWrite;
        this.requestsWrite = this.requestsRead;
        this.requestsRead = tmp;
    }
    
    private void doCommit() {
        synchronized (this.requestsRead) {
            if (!this.requestsRead.isEmpty()) {
                // When the read queue is not empty, iterate over the pending flush requests
                for (GroupCommitRequest req : this.requestsRead) {
                    boolean flushOK = CommitLog.this.mappedFileQueue.getFlushedWhere() >= req.getNextOffset();
                    for (int i = 0; i < 2 && !flushOK; i++) {
                        CommitLog.this.mappedFileQueue.flush(0);  // flush to disk
                        flushOK = CommitLog.this.mappedFileQueue.getFlushedWhere() >= req.getNextOffset();
                    }

                    req.wakeupCustomer(flushOK ? PutMessageStatus.PUT_OK : PutMessageStatus.FLUSH_DISK_TIMEOUT);
                }

                long storeTimestamp = CommitLog.this.mappedFileQueue.getStoreTimestamp();
                if (storeTimestamp > 0) {
                    CommitLog.this.defaultMessageStore.getStoreCheckpoint().setPhysicMsgTimestamp(storeTimestamp);
                }

                this.requestsRead.clear();
            } else {
                CommitLog.this.mappedFileQueue.flush(0);
            }
        }
    }
    
    public void run() {
        while (!this.isStopped()) {
            try {
                this.waitForRunning(10);  // internally calls swapRequests()
                this.doCommit();
            } catch (Exception e) {
                CommitLog.log.warn(this.getServiceName() + " service has exception. ", e);
            }
        }
    }
}

Asynchronous flush

Asynchronous flushing is implemented by FlushRealTimeService

It simply flushes on a fixed schedule

class FlushRealTimeService extends FlushCommitLogService {

    public void run() {
        while (!this.isStopped()) {
            // Flush interval, 500 ms by default
            int interval = CommitLog.this.defaultMessageStore.getMessageStoreConfig().getFlushIntervalCommitLog();
            // Minimum number of pages to flush each time; a page is 4 KB, 4 pages by default
            int flushPhysicQueueLeastPages = CommitLog.this.defaultMessageStore.getMessageStoreConfig().getFlushCommitLogLeastPages();
            try {
                if (flushCommitLogTimed) {
                    Thread.sleep(interval);
                } else {
                    this.waitForRunning(interval);
                }
                CommitLog.this.mappedFileQueue
                                        .flush(flushPhysicQueueLeastPages);  // flush to disk
            } catch (Throwable e) {
                CommitLog.log.warn(this.getServiceName() + " service has exception. ", e);
            }
        }
    }
}

Synchronous or asynchronous, both paths ultimately execute MappedFileQueue#flush()

public boolean flush(final int flushLeastPages) {
    boolean result = true;
    // Find the MappedFile that needs flushing
    MappedFile mappedFile = this.findMappedFileByOffset(this.flushedWhere, 
                                             this.flushedWhere == 0);
    if (mappedFile != null) {
        long tmpTimeStamp = mappedFile.getStoreTimestamp();
        // Flush!
        // Internally calls FileChannel#force() or MappedByteBuffer#force()
        int offset = mappedFile.flush(flushLeastPages);
        // Update the flushed physical offset of the CommitLog
        long where = mappedFile.getFileFromOffset() + offset;
        result = where == this.flushedWhere;
        this.flushedWhere = where;
        if (0 == flushLeastPages) {
            this.storeTimestamp = tmpTimeStamp;
        }
    }

    return result;
}

3. Recovery

RocketMQ first stores every message in full in the CommitLog, and only then asynchronously dispatches tasks to update the ConsumeQueue and Index files. If a message is successfully stored in the CommitLog but the Broker crashes before the dispatch task runs, the CommitLog, ConsumeQueue, and IndexFile end up inconsistent. Without repair, such messages would exist in the CommitLog but, never having been forwarded to a ConsumeQueue, would never be consumed.

Normal shutdown: start from the third-to-last CommitLog file, walk through it verifying each message, take the global physical offset of the last valid message as the ground truth, and update the ConsumeQueue and Index files accordingly

Abnormal shutdown: start from the last CommitLog file, walk through it verifying each message, take the global physical offset of the last valid message as the ground truth, and update the ConsumeQueue and Index files accordingly
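The recovery walk can be sketched as a scan that stops at the first invalid entry. This is a simplified illustration with names of my own; the real implementation also verifies the body CRC and recognizes a dedicated blank magic code at the end of a file:

```java
import java.nio.ByteBuffer;

public class RecoveryScan {
    static final int MESSAGE_MAGIC = 0xdaa320a7;

    // Walk a CommitLog buffer message by message; return the offset just past
    // the last valid message. Each entry starts with TotalSize (4 B) + MagicCode (4 B).
    static int lastValidOffset(ByteBuffer buf) {
        int offset = 0;
        while (offset + 8 <= buf.limit()) {
            int totalSize = buf.getInt(offset);
            int magic = buf.getInt(offset + 4);
            if (magic != MESSAGE_MAGIC || totalSize <= 0
                    || offset + totalSize > buf.limit()) {
                break; // blank area, corruption, or a truncated tail
            }
            offset += totalSize;
        }
        return offset;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.putInt(0, 16); buf.putInt(4, MESSAGE_MAGIC);   // a 16-byte message
        buf.putInt(16, 20); buf.putInt(20, MESSAGE_MAGIC); // a 20-byte message
        System.out.println(lastValidOffset(buf)); // 36
    }
}
```

Everything after the returned offset is discarded, and the ConsumeQueue and Index files are rebuilt up to that point.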


4. Expired File Deletion

Messages cannot be kept on the message server forever without wasting memory and disk. Any file other than the one currently being written is considered expired once it has gone a configured interval without being updated, and may then be deleted; RocketMQ does not check whether all messages in the file have been consumed. The default expiry time per file is 72 hours.
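The expiry rule amounts to a timestamp comparison against the file's last modification time. A sketch (names are mine):

```java
public class ExpiredFileCheck {
    static final long EXPIRE_MILLIS = 72L * 60 * 60 * 1000; // default: 72 hours

    // A non-current file is expired once it has not been modified for 72 hours.
    static boolean isExpired(long lastModifiedMillis, long nowMillis) {
        return nowMillis - lastModifiedMillis >= EXPIRE_MILLIS;
    }

    public static void main(String[] args) {
        // e.g. a file last touched 73 hours ago is eligible for deletion
        System.out.println(isExpired(0L, 73L * 60 * 60 * 1000)); // true
    }
}
```

In practice the deletion task runs periodically, and a file's lastModified timestamp (plus the configured retention) decides whether it is removed.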