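RocketMQ Source Code Walkthrough: Flush-to-Disk Strategies
Preface
RocketMQ has three flush strategies (only two of them actually force data to disk):
- CommitRealTimeService: asynchronous flush with the in-memory byte buffer enabled
- FlushRealTimeService: asynchronous flush without the in-memory byte buffer
- GroupCommitService: synchronous flush
The three are listed in order of decreasing performance.
Code
First, the part where the flush strategies are initialized: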
private final FlushCommitLogService flushCommitLogService;
//If TransientStorePool enabled, we must flush message to FileChannel at fixed periods
private final FlushCommitLogService commitLogService;
if (FlushDiskType.SYNC_FLUSH == defaultMessageStore.getMessageStoreConfig().getFlushDiskType()) {
this.flushCommitLogService = new GroupCommitService();
} else {
this.flushCommitLogService = new FlushRealTimeService();
}
this.commitLogService = new CommitRealTimeService();
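Two FlushCommitLogService instances are initialized here. The first depends on the configured flush disk type; the second is always a CommitRealTimeService, the best-performing strategy mentioned above.
Its role is covered later.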
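The selection logic can be pictured with a small stand-alone model. This is only a sketch: `FlushServiceSelection`, `flushService`, and `activeServices` are hypothetical names, the services are represented by plain strings, and the rule that the commit service only matters when the buffer pool is enabled together with asynchronous flush is an assumption taken from the list in the preface and the comment on the `commitLogService` field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the selection logic; not the RocketMQ classes.
final class FlushServiceSelection {
    enum FlushDiskType { SYNC_FLUSH, ASYNC_FLUSH }

    /** Name of the service that actually forces data to disk for the given type. */
    static String flushService(FlushDiskType type) {
        return type == FlushDiskType.SYNC_FLUSH ? "GroupCommitService" : "FlushRealTimeService";
    }

    /**
     * Services assumed to be doing work for a given configuration.
     * Assumption: the commit service is only relevant for async flush with the pool enabled.
     */
    static List<String> activeServices(FlushDiskType type, boolean transientStorePoolEnabled) {
        List<String> services = new ArrayList<>();
        if (transientStorePoolEnabled && type == FlushDiskType.ASYNC_FLUSH) {
            services.add("CommitRealTimeService"); // buffer -> FileChannel
        }
        services.add(flushService(type));          // FileChannel / mapped buffer -> disk
        return services;
    }
}
```

Under these assumptions, `activeServices(FlushDiskType.ASYNC_FLUSH, true)` lists the commit service first and the flush service second, which matches the two-stage path described later in this post.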
Start with the second and third strategies. FlushRealTimeService comes first; it is essentially a task thread:
// Whether to flush on a fixed timer (sleep) or to wait until woken up
boolean flushCommitLogTimed = CommitLog.this.defaultMessageStore.getMessageStoreConfig().isFlushCommitLogTimed();
// The interval: the sleep time when timed, otherwise the maximum wait time
int interval = CommitLog.this.defaultMessageStore.getMessageStoreConfig().getFlushIntervalCommitLog();
// Minimum number of dirty pages required before a flush is performed
int flushPhysicQueueLeastPages = CommitLog.this.defaultMessageStore.getMessageStoreConfig().getFlushCommitLogLeastPages();
int flushPhysicQueueThoroughInterval =
CommitLog.this.defaultMessageStore.getMessageStoreConfig().getFlushCommitLogThoroughInterval();
boolean printFlushProgress = false;
// Print flush progress
long currentTimeMillis = System.currentTimeMillis();
// Once flushPhysicQueueThoroughInterval has elapsed,
// flush even if fewer than flushPhysicQueueLeastPages pages have been written
if (currentTimeMillis >= (this.lastFlushTimestamp + flushPhysicQueueThoroughInterval)) {
this.lastFlushTimestamp = currentTimeMillis;
flushPhysicQueueLeastPages = 0;
printFlushProgress = (printTimes++ % 10) == 0;
}
try {
// Sleep for a fixed interval, or wait to be woken up
if (flushCommitLogTimed) {
Thread.sleep(interval);
} else {
// Appending a message wakes this thread up via wakeup()
this.waitForRunning(interval);
}
if (printFlushProgress) {
this.printFlushProgress();
}
long begin = System.currentTimeMillis();
// Flush to disk
CommitLog.this.mappedFileQueue.flush(flushPhysicQueueLeastPages);
long storeTimestamp = CommitLog.this.mappedFileQueue.getStoreTimestamp();
if (storeTimestamp > 0) {
CommitLog.this.defaultMessageStore.getStoreCheckpoint().setPhysicMsgTimestamp(storeTimestamp);
}
long past = System.currentTimeMillis() - begin;
} catch (Throwable e) {
this.printFlushProgress();
}
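The interplay between the page threshold and the "thorough" interval in the loop above can be isolated into a pure function. This is a minimal sketch; `FlushThreshold` and `effectiveLeastPages` are hypothetical names, and the numbers in the usage note are example values only.

```java
// Hypothetical helper that isolates the threshold decision made at the top of each loop round.
final class FlushThreshold {
    /**
     * Returns the minimum number of dirty pages required for this round.
     * Once thoroughIntervalMs has elapsed since the last thorough flush, the threshold
     * drops to 0, meaning "flush whatever is there", mirroring flushPhysicQueueLeastPages = 0.
     */
    static int effectiveLeastPages(long nowMs, long lastFlushMs, long thoroughIntervalMs, int leastPages) {
        return nowMs >= lastFlushMs + thoroughIntervalMs ? 0 : leastPages;
    }
}
```

For example, with a threshold of 4 pages and a thorough interval of 10000 ms, a round at 1000 ms after the last thorough flush keeps the threshold at 4, while a round at 10000 ms lowers it to 0 so that a small amount of pending data cannot stay unflushed indefinitely.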
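The comments above cover most of it. Now look at the flush method, which ultimately ends up in MappedFile's flush:
// isAbleToFlush exists for write performance: flush only once at least flushLeastPages * OS_PAGE_SIZE bytes are dirty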
if (this.isAbleToFlush(flushLeastPages)) {
if (this.hold()) {
int value = getReadPosition();
try {
//We only append data to fileChannel or mappedByteBuffer, never both.
// Data was written to either fileChannel or mappedByteBuffer; force whichever was used so cached content reaches the disk.
if (writeBuffer != null || this.fileChannel.position() != 0) {
this.fileChannel.force(false);
} else {
this.mappedByteBuffer.force();
}
} catch (Throwable e) {
}
this.flushedPosition.set(value);
this.release();
} else {
this.flushedPosition.set(getReadPosition());
}
}
return this.getFlushedPosition();
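The page-count check behind isAbleToFlush is not shown above, so here is a sketch of how such a check could look. The class `PageThreshold`, its parameter list, and the 4 KiB page size are assumptions made for illustration; the real method works on the MappedFile's own fields.

```java
// Hypothetical stand-alone version of a "enough dirty pages?" check.
final class PageThreshold {
    static final int OS_PAGE_SIZE = 4 * 1024; // assumed page size of 4 KiB

    /**
     * @param flushedPos bytes already flushed to disk
     * @param readPos    bytes that are safe to flush (written or committed so far)
     * @param fileSize   total size of the file
     * @param leastPages minimum number of dirty pages, or 0 to flush any pending data
     */
    static boolean isAbleToFlush(int flushedPos, int readPos, int fileSize, int leastPages) {
        if (readPos == fileSize) {
            return true; // a full file is always flushed, regardless of the threshold
        }
        if (leastPages > 0) {
            // compare whole pages, so partially filled pages do not count
            return (readPos / OS_PAGE_SIZE) - (flushedPos / OS_PAGE_SIZE) >= leastPages;
        }
        return readPos > flushedPos; // threshold 0: flush as soon as anything is pending
    }
}
```

With a threshold of 4, three dirty pages are not enough and four are; with a threshold of 0, a single pending byte triggers a flush. This is why setting the threshold to 0 after the thorough interval guarantees that leftover data eventually reaches disk.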
Next, GroupCommitService. First, what happens after a message is appended when the synchronous flush strategy is configured:
GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes());
service.putRequest(request);
boolean flushOK = request.waitForFlush(this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
if (!flushOK) {
putMessageResult.setPutMessageStatus(PutMessageStatus.FLUSH_DISK_TIMEOUT);
}
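The blocking handshake in the snippet above (the producer thread waits, the flush thread wakes it up with a result) can be sketched with a latch. This is an illustration under assumptions: `FlushRequestSketch` is a hypothetical class, and using a CountDownLatch is one possible mechanism, not necessarily what a given RocketMQ version does internally.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a flush request that a producer thread can block on.
final class FlushRequestSketch {
    private final long nextOffset;                       // data up to this offset must be on disk
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile boolean flushOK = false;

    FlushRequestSketch(long nextOffset) {
        this.nextOffset = nextOffset;
    }

    long getNextOffset() {
        return nextOffset;
    }

    /** Called by the flush thread once it knows whether the offset was flushed. */
    void wakeup(boolean flushOK) {
        this.flushOK = flushOK;
        done.countDown();
    }

    /** Called by the producer thread; returns false on timeout or failed flush. */
    boolean waitForFlush(long timeoutMs) {
        try {
            // await returns false on timeout; flushOK is then still false
            done.await(timeoutMs, TimeUnit.MILLISECONDS);
            return flushOK;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A request that is never woken up makes `waitForFlush` return false after the timeout, which corresponds to the FLUSH_DISK_TIMEOUT status set in the snippet above.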
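The GroupCommitRequest created in the put path records the offset up to which data must be flushed (the write offset plus the number of bytes written). On to the service code:
// Write queue for incoming requests; swapped with the read queue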
private volatile List<GroupCommitRequest> requestsWrite = new ArrayList<GroupCommitRequest>();
// Read queue for requests being processed; swapped with the write queue
private volatile List<GroupCommitRequest> requestsRead = new ArrayList<GroupCommitRequest>();
// Add the new request to the write queue, then wake up the service thread
public synchronized void putRequest(final GroupCommitRequest request) {
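// Add the request. The method itself is synchronized because requestsWrite and requestsRead keep being swapped, so locking on the list object alone does not provide a stable monitor.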
synchronized (this.requestsWrite) {
this.requestsWrite.add(request);
}
if (hasNotified.compareAndSet(false, true)) {
waitPoint.countDown(); // notify
}
}
// Swap the read and write queues
private void swapRequests() {
List<GroupCommitRequest> tmp = this.requestsWrite;
this.requestsWrite = this.requestsRead;
this.requestsRead = tmp;
}
private void doCommit() {
synchronized (this.requestsRead) {
if (!this.requestsRead.isEmpty()) {
for (GroupCommitRequest req : this.requestsRead) {
// There may be a message in the next file, so a maximum of
// two times the flush
boolean flushOK = false;
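// The data for one request may be spread over two MappedFiles (the file filled up while the N-th message was being written, so a new one was created), hence up to two flush attempts.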
for (int i = 0; i < 2 && !flushOK; i++) {
// The request is satisfied once the flushed offset has reached its offset; otherwise flush again
flushOK = CommitLog.this.mappedFileQueue.getFlushedWhere() >= req.getNextOffset();
if (!flushOK) {
CommitLog.this.mappedFileQueue.flush(0);
}
}
req.wakeupCustomer(flushOK);
}
long storeTimestamp = CommitLog.this.mappedFileQueue.getStoreTimestamp();
if (storeTimestamp > 0) {
CommitLog.this.defaultMessageStore.getStoreCheckpoint().setPhysicMsgTimestamp(storeTimestamp);
}
// Clear the read queue
this.requestsRead.clear();
} else {
// Because of individual messages is set to not sync flush, it
// will come to this process
CommitLog.this.mappedFileQueue.flush(0);
}
}
}
public void run() {
while (!this.isStopped()) {
try {
// Wait up to 10 ms (or until woken up), then commit
this.waitForRunning(10);
this.doCommit();
} catch (Exception e) {
CommitLog.log.warn(this.getServiceName() + " service has exception. ", e);
}
}
// Under normal circumstances shutdown, wait for the arrival of the
// request, and then flush
try {
Thread.sleep(10);
} catch (InterruptedException e) {
}
// Swap the read and write queues
synchronized (this) {
this.swapRequests();
}
// Run one final commit
this.doCommit();
}
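To summarize, first the part that enqueues requests:
- New requests are added to the write queue.
- When the service thread starts a round, the two queues are swapped (the read queue is guaranteed to be empty at this point): the old read queue becomes the write queue and collects new requests, while the old write queue becomes the read queue and is committed as a batch.
- After the batch is committed, the read queue is cleared, and the swap repeats on the next round.
Then the part that actually flushes:
- Iterate over all requests; if a request's offset is beyond the flushed offset, perform a flush.
- Clear the read queue.
- If the read queue is empty, flush directly.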
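The swap-based batching summarized above can be reduced to a small double-buffered queue. This is a simplified sketch, not the RocketMQ class: `SwapQueueSketch` is a hypothetical name, offsets are plain `Long` values instead of request objects, and all locking is done on the queue object itself rather than on the individual lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical double-buffered queue: many producers, one consumer thread.
final class SwapQueueSketch {
    private List<Long> write = new ArrayList<>(); // producers append here
    private List<Long> read = new ArrayList<>();  // the consumer drains this one

    /** Producer side: enqueue an offset that must be flushed. */
    synchronized void put(long offset) {
        write.add(offset);
    }

    /** Exchange the two lists; afterwards producers fill the (empty) old read list. */
    private synchronized void swap() {
        List<Long> tmp = write;
        write = read;
        read = tmp;
    }

    /**
     * Consumer side: swap, then take everything queued so far as one batch.
     * Must only be called from the single consumer thread.
     */
    List<Long> drain() {
        swap();
        List<Long> batch = new ArrayList<>(read);
        read.clear(); // leave an empty list for the next swap
        return batch;
    }
}
```

The point of the design is that producers hold the lock only for a single `add`, while the consumer processes a whole batch without blocking them. A consumer loop would call `drain()` on each round and flush once up to the largest offset in the batch.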
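Finally, CommitRealTimeService. It is counted as a flush strategy, and it is the best-performing one, but it does not actually persist data to disk, which is why a separate CommitRealTimeService has to be created alongside the flush service.
Here is the code: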
int interval = CommitLog.this.defaultMessageStore.getMessageStoreConfig().getCommitIntervalCommitLog();
int commitDataLeastPages = CommitLog.this.defaultMessageStore.getMessageStoreConfig().getCommitCommitLogLeastPages();
int commitDataThoroughInterval =
CommitLog.this.defaultMessageStore.getMessageStoreConfig().getCommitCommitLogThoroughInterval();
long begin = System.currentTimeMillis();
if (begin >= (this.lastCommitTimestamp + commitDataThoroughInterval)) {
this.lastCommitTimestamp = begin;
commitDataLeastPages = 0;
}
try {
// Note: this is a commit, not a flush
boolean result = CommitLog.this.mappedFileQueue.commit(commitDataLeastPages);
long end = System.currentTimeMillis();
if (!result) {
this.lastCommitTimestamp = end; // result = false means some data committed.
//now wake up flush thread.
// result == false means some data was committed, so trigger a flush for it as well
flushCommitLogService.wakeup();
}
this.waitForRunning(interval);
} catch (Throwable e) {
}
I will skip the parts that mirror FlushRealTimeService and go straight to the commit operation, which likewise ends up in MappedFile's commit:
if (writeBuffer == null) {
//no need to commit data to file channel, so just regard wrotePosition as committedPosition.
return this.wrotePosition.get();
}
// For write performance, commit only once at least commitLeastPages * OS_PAGE_SIZE bytes are pending.
if (this.isAbleToCommit(commitLeastPages)) {
if (this.hold()) {
commit0(commitLeastPages);
this.release();
} else {
}
}
// All dirty data has been committed to FileChannel.
// Once the whole file has been committed, return writeBuffer to the pool.
if (writeBuffer != null && this.transientStorePool != null && this.fileSize == this.committedPosition.get()) {
this.transientStorePool.returnBuffer(writeBuffer);
this.writeBuffer = null;
}
return this.committedPosition.get();
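The borrow-and-return cycle of the write buffer can be illustrated with a tiny pool. This is only a sketch of the idea: `BufferPoolSketch` and its methods are hypothetical, and the real TransientStorePool may manage its buffers differently.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical fixed-size pool of off-heap buffers that are reused across files.
final class BufferPoolSketch {
    private final Deque<ByteBuffer> available = new ArrayDeque<>();

    BufferPoolSketch(int poolSize, int bufferSize) {
        for (int i = 0; i < poolSize; i++) {
            available.offer(ByteBuffer.allocateDirect(bufferSize)); // allocate once, up front
        }
    }

    /** Returns a buffer, or null when the pool is exhausted. */
    synchronized ByteBuffer borrowBuffer() {
        return available.pollFirst();
    }

    /** Resets the buffer's position and limit and makes it available again. */
    synchronized void returnBuffer(ByteBuffer buffer) {
        buffer.clear();
        available.offerFirst(buffer);
    }

    synchronized int availableBuffers() {
        return available.size();
    }
}
```

In this model a file borrows one buffer for its lifetime as a write target and hands it back once everything in it has been committed, which is the step shown at the end of the commit method above.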
Next, commit0 (the actual commit implementation, which writes writeBuffer to fileChannel):
int writePos = this.wrotePosition.get();
int lastCommittedPosition = this.committedPosition.get();
if (writePos - this.committedPosition.get() > 0) {
try {
ByteBuffer byteBuffer = writeBuffer.slice();
byteBuffer.position(lastCommittedPosition);
byteBuffer.limit(writePos);
this.fileChannel.position(lastCommittedPosition);
// Write to fileChannel
this.fileChannel.write(byteBuffer);
this.committedPosition.set(writePos);
} catch (Throwable e) {
}
}
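As the code shows, CommitRealTimeService only writes the data in the buffer to the fileChannel; the fileChannel content is then flushed to the file on disk by the flush service that it wakes up.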
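The two-stage path (buffer to FileChannel, then FileChannel to disk) can be reproduced with plain NIO. This is a minimal sketch under assumptions: `CommitThenFlushSketch` and `roundTrip` are hypothetical names, a temporary file stands in for a commit log file, and `duplicate()` is used instead of `slice()` because this sketch advances the buffer's own position while appending.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo of "append to buffer -> commit to channel -> force to disk".
final class CommitThenFlushSketch {
    /** Runs all three steps on a temp file and returns what ended up in the file. */
    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("commitlog-sketch", ".bin");
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer writeBuffer = ByteBuffer.allocateDirect(1024);
                int committedPosition = 0;

                // Step 1, append: data only lands in the off-heap buffer.
                writeBuffer.put(payload);
                int wrotePosition = writeBuffer.position();

                // Step 2, commit: copy the uncommitted range into the channel.
                ByteBuffer view = writeBuffer.duplicate();
                view.position(committedPosition);
                view.limit(wrotePosition);
                channel.position(committedPosition);
                while (view.hasRemaining()) {
                    channel.write(view);
                }
                committedPosition = wrotePosition;

                // Step 3, flush: ask the OS to persist the channel's content.
                channel.force(false);

                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `CommitThenFlushSketch.roundTrip("hello")` returns the same string read back from the file. Steps 2 and 3 correspond to the commit and flush services respectively: the commit alone makes the data visible through the file system, and only the `force` call requests durability.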