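Overview
In the classic master-slave deployment, a RocketMQ broker achieves high availability by replicating the CommitLog from master to slave through HAService (Raft only comes into play in DLedger mode). This section walks through the source code behind that master-slave synchronization.
Overall flow
HAService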
private final AtomicInteger connectionCount = new AtomicInteger(0);
private final List<HAConnection> connectionList = new LinkedList<>();
private final AcceptSocketService acceptSocketService; // on the master, listens for connection requests from slaves
private final DefaultMessageStore defaultMessageStore; // message store
private final WaitNotifyObject waitNotifyObject = new WaitNotifyObject();
private final AtomicLong push2SlaveMaxOffset = new AtomicLong(0); // max offset pushed to the slaves
private final GroupTransferService groupTransferService; // similar in role to the CommitLog's synchronous flush service: blocks the client request until a slave acks
private final HAClient haClient; // slave-side operations
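To make the role of push2SlaveMaxOffset more concrete, here is a minimal sketch of how such a counter could be advanced as slave acks arrive, and how a waiting write could check it. The class and method names (AckOffsetTracker, advance, isReplicated) are made up for illustration and are not RocketMQ APIs; only the AtomicLong field mirrors the one listed above.

```java
import java.util.concurrent.atomic.AtomicLong;

final class AckOffsetTracker {
    // Highest offset acknowledged by any slave so far (mirrors push2SlaveMaxOffset).
    private final AtomicLong push2SlaveMaxOffset = new AtomicLong(0);

    /** Raises the recorded offset to ackOffset if it is larger; returns true if it advanced. */
    boolean advance(long ackOffset) {
        long current = push2SlaveMaxOffset.get();
        while (ackOffset > current) {
            if (push2SlaveMaxOffset.compareAndSet(current, ackOffset)) {
                return true;
            }
            // Another thread won the race; re-read and retry only if we are still ahead.
            current = push2SlaveMaxOffset.get();
        }
        return false;
    }

    /** A write that ends at nextOffset counts as replicated once a slave has acked up to it. */
    boolean isReplicated(long nextOffset) {
        return push2SlaveMaxOffset.get() >= nextOffset;
    }

    long get() {
        return push2SlaveMaxOffset.get();
    }
}
```

The compare-and-set loop keeps the value monotonic even when several connections report acks concurrently, so a stale ack can never move the offset backwards.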
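Next, let's walk through the source code following the flow above.
master step 1: start and listen on haPort (default 10912)
The master starts listening on this port and waits for slave connections in AcceptSocketService:
private final SocketAddress socketAddressListen; // address (port) to listen on
private ServerSocketChannel serverSocketChannel; // server channel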
private Selector selector;
public AcceptSocketService(final int port) {
this.socketAddressListen = new InetSocketAddress(port);
}
/**
* Starts listening to slave connections.
*
* @throws Exception If fails.
*/
public void beginAccept() throws Exception {
// open the ServerSocketChannel and register it with the selector
this.serverSocketChannel = ServerSocketChannel.open();
this.selector = RemotingUtil.openSelector();
this.serverSocketChannel.socket().setReuseAddress(true);
this.serverSocketChannel.socket().bind(this.socketAddressListen);
this.serverSocketChannel.configureBlocking(false);
this.serverSocketChannel.register(this.selector, SelectionKey.OP_ACCEPT);
}
/**
* {@inheritDoc}
*/
@Override
public void run() {
log.info(this.getServiceName() + " service started");
while (!this.isStopped()) {
try {
this.selector.select(1000); // wait up to 1s for connection events
Set<SelectionKey> selected = this.selector.selectedKeys();
if (selected != null) {
for (SelectionKey k : selected) {
if ((k.readyOps() & SelectionKey.OP_ACCEPT) != 0) {
SocketChannel sc = ((ServerSocketChannel) k.channel()).accept();
if (sc != null) {
HAService.log.info("HAService receive new connection, "
+ sc.socket().getRemoteSocketAddress());
try { // wrap the slave's connection in an HAConnection and add it to the list
HAConnection conn = new HAConnection(HAService.this, sc);
conn.start();
HAService.this.addConnection(conn);
} catch (Exception e) {
log.error("new HAConnection exception", e);
sc.close();
}
}
} else {
log.warn("Unexpected ops in select " + k.readyOps());
}
}
selected.clear();
}
} catch (Exception e) {
log.error(this.getServiceName() + " service has exception.", e);
}
}
log.info(this.getServiceName() + " service end");
}
}
slave step 1: start and connect to the master
Slave-side operations are implemented in HAClient. Let's first look at its fields:
private static final int READ_MAX_BUFFER_SIZE = 1024 * 1024 * 4; // max size of the read buffer
private final AtomicReference<String> masterAddress = new AtomicReference<>(); // master address
private final ByteBuffer reportOffset = ByteBuffer.allocate(8); // offset the slave reports to the master when pulling
private SocketChannel socketChannel; // network channel
private Selector selector; // selector
private long lastWriteTimestamp = System.currentTimeMillis(); // timestamp of the last write
private long currentReportedOffset = 0; // replication progress reported by the slave, i.e. the current max CommitLog offset
private int dispatchPosition = 0; // position in the read buffer up to which data has been processed
private ByteBuffer byteBufferRead = ByteBuffer.allocate(READ_MAX_BUFFER_SIZE); // read buffer, 4 MB
private ByteBuffer byteBufferBackup = ByteBuffer.allocate(READ_MAX_BUFFER_SIZE); // backup read buffer, swapped with byteBufferRead
Connecting to the master
private boolean connectMaster() throws ClosedChannelException {
if (null == socketChannel) {
String addr = this.masterAddress.get();
if (addr != null) {
SocketAddress socketAddress = RemotingUtil.string2SocketAddress(addr);
if (socketAddress != null) {
this.socketChannel = RemotingUtil.connect(socketAddress); // connect to the master
if (this.socketChannel != null) {
this.socketChannel.register(this.selector, SelectionKey.OP_READ);
}
}
}
this.currentReportedOffset = HAService.this.defaultMessageStore.getMaxPhyOffset();
this.lastWriteTimestamp = System.currentTimeMillis();
}
return this.socketChannel != null;
}
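slave step 2: report the max offset of its own CommitLog
Note:
This can be viewed from two sides. For the slave, the value it sends is the current max offset of its own CommitLog, which is where the next pull should start. For the master, the same value serves as the ack of the slave's previous pull and as the start offset of the next one.
private boolean reportSlaveMaxOffset(final long maxOffset) { // max offset of the CommitLog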
this.reportOffset.position(0);
this.reportOffset.limit(8);
this.reportOffset.putLong(maxOffset);
this.reportOffset.position(0);
this.reportOffset.limit(8);
for (int i = 0; i < 3 && this.reportOffset.hasRemaining(); i++) {
try {
this.socketChannel.write(this.reportOffset); // write to the channel
} catch (IOException e) {
log.error(this.getServiceName()
+ "reportSlaveMaxOffset this.socketChannel.write exception", e);
return false;
}
}
lastWriteTimestamp = HAService.this.defaultMessageStore.getSystemClock().now();
return !this.reportOffset.hasRemaining();
}
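The report is nothing more than an 8-byte long on the wire. The sketch below shows that framing from both ends: the slave encodes its max offset, and the receiver picks the most recent complete 8-byte report out of whatever bytes have arrived. OffsetReportCodec and its methods are hypothetical names for illustration, and the master-side decoding is an assumption about how such a stream can be read, not a copy of RocketMQ's code.

```java
import java.nio.ByteBuffer;

final class OffsetReportCodec {
    static final int REPORT_SIZE = 8; // one long per report

    /** Slave side: encode the max CommitLog offset as one 8-byte big-endian frame, ready to write. */
    static ByteBuffer encode(long maxOffset) {
        ByteBuffer buf = ByteBuffer.allocate(REPORT_SIZE);
        buf.putLong(maxOffset);
        buf.flip();
        return buf;
    }

    /**
     * Receiver side: 'received' is in write mode, so position() is the number of bytes read so far.
     * Returns the latest complete report, or -1 if fewer than 8 bytes have arrived.
     */
    static long latestReport(ByteBuffer received) {
        int complete = received.position() - (received.position() % REPORT_SIZE);
        if (complete < REPORT_SIZE) {
            return -1L;
        }
        // Absolute read: does not disturb the buffer's position.
        return received.getLong(complete - REPORT_SIZE);
    }
}
```

Because each report supersedes the previous one, the receiver only needs the last complete 8 bytes; a trailing partial report is simply left in the buffer until the rest arrives.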
master step 2: handle the slave's pull requests
@Override
public void run() {
HAConnection.log.info(this.getServiceName() + " service started");
while (!this.isStopped()) {
try {
this.selector.select(1000);
if (-1 == HAConnection.this.slaveRequestOffset) { // no pull request has arrived from the slave yet
Thread.sleep(10);
continue;
}
if (-1 == this.nextTransferFromWhere) { // first transfer on this connection: work out where to start
if (0 == HAConnection.this.slaveRequestOffset) { // the slave has no data: start from the beginning of the master's last CommitLog file
long masterOffset = HAConnection.this.haService.getDefaultMessageStore().getCommitLog().getMaxOffset();
masterOffset =
masterOffset
- (masterOffset % HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig()
.getMappedFileSizeCommitLog());
if (masterOffset < 0) {
masterOffset = 0;
}
this.nextTransferFromWhere = masterOffset;
} else {
this.nextTransferFromWhere = HAConnection.this.slaveRequestOffset;
}
log.info("master transfer data from " + this.nextTransferFromWhere + " to slave[" + HAConnection.this.clientAddr
+ "], and slave request " + HAConnection.this.slaveRequestOffset);
}
if (this.lastWriteOver) { // was the previous write fully sent?
long interval =
HAConnection.this.haService.getDefaultMessageStore().getSystemClock().now() - this.lastWriteTimestamp;
if (interval > HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig()
.getHaSendHeartbeatInterval()) {
// Build Header
this.byteBufferHeader.position(0);
this.byteBufferHeader.limit(headerSize);
this.byteBufferHeader.putLong(this.nextTransferFromWhere);
this.byteBufferHeader.putInt(0);
this.byteBufferHeader.flip();
this.lastWriteOver = this.transferData();
if (!this.lastWriteOver)
continue;
}
} else {
this.lastWriteOver = this.transferData(); // not finished: send the rest of the previous write first
if (!this.lastWriteOver)
continue;
}
// read CommitLog data starting at the current offset
SelectMappedBufferResult selectResult =
HAConnection.this.haService.getDefaultMessageStore().getCommitLogData(this.nextTransferFromWhere);
if (selectResult != null) {
int size = selectResult.getSize();
// cap at the max batch size (configurable)
if (size > HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaTransferBatchSize()) {
size = HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaTransferBatchSize();
}
long thisOffset = this.nextTransferFromWhere;
this.nextTransferFromWhere += size; // advance the offset for the next transfer
selectResult.getByteBuffer().limit(size);
this.selectMappedBufferResult = selectResult;
// Build Header
this.byteBufferHeader.position(0);
this.byteBufferHeader.limit(headerSize);
this.byteBufferHeader.putLong(thisOffset);
this.byteBufferHeader.putInt(size);
this.byteBufferHeader.flip();
this.lastWriteOver = this.transferData();
} else {
HAConnection.this.haService.getWaitNotifyObject().allWaitForRunning(100);
}
} catch (Exception e) {
HAConnection.log.error(this.getServiceName() + " service has exception.", e);
break;
}
}
HAConnection.this.haService.getWaitNotifyObject().removeFromWaitingThreadTable();
if (this.selectMappedBufferResult != null) {
this.selectMappedBufferResult.release();
}
this.makeStop();
readSocketService.makeStop();
haService.removeConnection(HAConnection.this);
SelectionKey sk = this.socketChannel.keyFor(this.selector);
if (sk != null) {
sk.cancel();
}
try {
this.selector.close();
this.socketChannel.close();
} catch (IOException e) {
HAConnection.log.error("", e);
}
HAConnection.log.info(this.getServiceName() + " service end");
}
transferData()
private boolean transferData() throws Exception {
int writeSizeZeroTimes = 0;
// Write Header
while (this.byteBufferHeader.hasRemaining()) {
int writeSize = this.socketChannel.write(this.byteBufferHeader);
if (writeSize > 0) {
writeSizeZeroTimes = 0;
this.lastWriteTimestamp = HAConnection.this.haService.getDefaultMessageStore().getSystemClock().now();
} else if (writeSize == 0) {
if (++writeSizeZeroTimes >= 3) {
break;
}
} else {
throw new Exception("ha master write header error < 0");
}
}
if (null == this.selectMappedBufferResult) {
return !this.byteBufferHeader.hasRemaining();
}
writeSizeZeroTimes = 0;
// Write Body
if (!this.byteBufferHeader.hasRemaining()) {
while (this.selectMappedBufferResult.getByteBuffer().hasRemaining()) {
int writeSize = this.socketChannel.write(this.selectMappedBufferResult.getByteBuffer());
if (writeSize > 0) {
writeSizeZeroTimes = 0;
this.lastWriteTimestamp = HAConnection.this.haService.getDefaultMessageStore().getSystemClock().now();
} else if (writeSize == 0) {
if (++writeSizeZeroTimes >= 3) {
break;
}
} else {
throw new Exception("ha master write body error < 0");
}
}
}
boolean result = !this.byteBufferHeader.hasRemaining() && !this.selectMappedBufferResult.getByteBuffer().hasRemaining();
if (!this.selectMappedBufferResult.getByteBuffer().hasRemaining()) {
this.selectMappedBufferResult.release();
this.selectMappedBufferResult = null;
}
return result;
}
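The header loop and the body loop in transferData() follow the same pattern: keep writing to a non-blocking channel, and give up for this round after three consecutive writes that accept zero bytes. A minimal standalone sketch of that pattern is shown below; BoundedWriter and writeWithRetry are hypothetical names, and the sketch works against any WritableByteChannel rather than RocketMQ's own classes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

final class BoundedWriter {
    private static final int MAX_ZERO_WRITES = 3;

    /**
     * Writes src until it is drained or the channel accepts zero bytes three times in a row.
     * Returns true only if src was written completely; the caller retries later otherwise.
     */
    static boolean writeWithRetry(WritableByteChannel channel, ByteBuffer src) throws IOException {
        int zeroWrites = 0;
        while (src.hasRemaining()) {
            int written = channel.write(src);
            if (written > 0) {
                zeroWrites = 0; // progress was made, reset the counter
            } else if (written == 0) {
                if (++zeroWrites >= MAX_ZERO_WRITES) {
                    break; // socket buffer is probably full, stop for now
                }
            } else {
                throw new IOException("channel write returned " + written);
            }
        }
        return !src.hasRemaining();
    }
}
```

Since the buffer keeps its position between calls, a later call with the same buffer resumes exactly where the previous one stopped, which is what the lastWriteOver flag relies on in the master loop.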
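Flow:
- The master's write loop wakes up at least once per second (selector.select(1000)) to serve the slave
- If the recorded next-transfer offset is -1 and the slave requested offset 0, the transfer starts from the beginning of the master's last CommitLog file; otherwise it starts from the offset the slave requested
- It checks whether the previous write was fully sent; if not, it finishes that one first
- It reads the CommitLog data at the current offset (capped at the configured batch size) and sets the start offset for the next transfer
- It writes the header (start offset and size) and then the data to the channel
- It records whether this transfer completed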
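The arithmetic in the flow above can be sketched as follows. TransferPlanner and its methods are hypothetical helpers written for this example; the 1 GiB file size and the 32 KB batch size used in main are only illustrative values.

```java
import java.nio.ByteBuffer;

final class TransferPlanner {
    static final int HEADER_SIZE = 8 + 4; // 8-byte start offset + 4-byte body size

    /** Start of the CommitLog file that contains masterMaxOffset (used when the slave reports 0). */
    static long initialTransferOffset(long masterMaxOffset, int mappedFileSize) {
        long start = masterMaxOffset - (masterMaxOffset % mappedFileSize);
        return Math.max(start, 0L);
    }

    /** Number of bytes to send in one round: what is available, capped by the batch size. */
    static int batchSize(int availableBytes, int transferBatchSize) {
        return Math.min(availableBytes, transferBatchSize);
    }

    /** Builds the 12-byte header; a size of 0 is what the heartbeat in the master loop sends. */
    static ByteBuffer header(long startOffset, int size) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE);
        buf.putLong(startOffset);
        buf.putInt(size);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        int oneGiB = 1024 * 1024 * 1024;
        long maxOffset = 2L * oneGiB + oneGiB / 2; // master has written 2.5 GiB
        long start = initialTransferOffset(maxOffset, oneGiB);
        int size = batchSize(100_000, 32 * 1024);
        System.out.println("start=" + start + " size=" + size
            + " headerBytes=" + header(start, size).remaining());
    }
}
```

With these inputs a brand-new slave begins at the 2 GiB file boundary instead of offset 0, so it skips older files and only replicates from the master's last CommitLog file onward.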
slave step 3: process the master's data and append it to the CommitLog
Processing data from the master
private boolean processReadEvent() {
int readSizeZeroTimes = 0;
while (this.byteBufferRead.hasRemaining()) {
try {
int readSize = this.socketChannel.read(this.byteBufferRead);
if (readSize > 0) {
readSizeZeroTimes = 0;
boolean result = this.dispatchReadRequest();
if (!result) {
log.error("HAClient, dispatchReadRequest error");
return false;
}
} else if (readSize == 0) {
if (++readSizeZeroTimes >= 3) {
break;
}
} else {
log.info("HAClient, processReadEvent read socket < 0");
return false;
}
} catch (IOException e) {
log.info("HAClient, processReadEvent read socket exception", e);
return false;
}
}
return true;
}
Let's focus on the logic of dispatchReadRequest:
private boolean dispatchReadRequest() {
final int msgHeaderSize = 8 + 4; // phyoffset + size
int readSocketPos = this.byteBufferRead.position();
while (true) {
int diff = this.byteBufferRead.position() - this.dispatchPosition;
if (diff >= msgHeaderSize) { // at least a full header has arrived
long masterPhyOffset = this.byteBufferRead.getLong(this.dispatchPosition);
int bodySize = this.byteBufferRead.getInt(this.dispatchPosition + 8);
long slavePhyOffset = HAService.this.defaultMessageStore.getMaxPhyOffset();
if (slavePhyOffset != 0) {
if (slavePhyOffset != masterPhyOffset) {
log.error("master pushed offset not equal the max phy offset in slave, SLAVE: "
+ slavePhyOffset + " MASTER: " + masterPhyOffset);
return false;
}
}
if (diff >= (msgHeaderSize + bodySize)) {
byte[] bodyData = new byte[bodySize];
this.byteBufferRead.position(this.dispatchPosition + msgHeaderSize);
this.byteBufferRead.get(bodyData);
HAService.this.defaultMessageStore.appendToCommitLog(masterPhyOffset, bodyData);
this.byteBufferRead.position(readSocketPos);
this.dispatchPosition += msgHeaderSize + bodySize;
if (!reportSlaveMaxOffsetPlus()) {
return false;
}
continue;
}
}
if (!this.byteBufferRead.hasRemaining()) {
this.reallocateByteBuffer();
}
break;
}
return true;
}
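Steps:
1. Every 5s (by default) the slave sends a heartbeat that reports the current max offset of its own CommitLog to the master
2. It waits up to 1s at a time for data from the master and then processes it
3. Processing starts once at least 12 bytes (the 8-byte offset plus the 4-byte size) have been read, which means a frame header has arrived
4. It checks whether the offset in the frame equals its own local max offset; if they differ, it returns false and stops
5. It extracts the body and appends it to the CommitLog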
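Steps 3 to 5 amount to decoding length-prefixed frames from a buffer that may end in the middle of a frame. The sketch below isolates that part. HaFrameDecoder and Frame are hypothetical names, the offset comparison from step 4 is deliberately left out, and decoded frames are returned to the caller instead of being appended to a CommitLog.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class HaFrameDecoder {
    static final int HEADER_SIZE = 8 + 4; // physical offset + body size

    static final class Frame {
        final long offset;
        final byte[] body;

        Frame(long offset, byte[] body) {
            this.offset = offset;
            this.body = body;
        }
    }

    // Everything before this index in the read buffer has already been handed out.
    private int dispatchPosition = 0;

    /**
     * Extracts every complete frame between dispatchPosition and buf.position().
     * buf is in write mode (position = bytes received) and its position is left untouched.
     */
    List<Frame> decode(ByteBuffer buf) {
        List<Frame> frames = new ArrayList<>();
        int writePos = buf.position();
        while (writePos - dispatchPosition >= HEADER_SIZE) {
            long offset = buf.getLong(dispatchPosition);
            int bodySize = buf.getInt(dispatchPosition + 8);
            if (writePos - dispatchPosition < HEADER_SIZE + bodySize) {
                break; // body not fully received yet, wait for more bytes
            }
            byte[] body = new byte[bodySize];
            ByteBuffer view = buf.duplicate(); // independent position, shared content
            view.position(dispatchPosition + HEADER_SIZE);
            view.get(body);
            frames.add(new Frame(offset, body));
            dispatchPosition += HEADER_SIZE + bodySize;
        }
        return frames;
    }
}
```

Reading through a duplicate avoids the save-and-restore of the buffer position that the original method does with readSocketPos, at the cost of one small object per frame.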
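Read-write separation
Overview
The client sends a pullMessage request. The broker handles it and returns a result that includes the brokerId it suggests for the next pull of this MessageQueue. The client stores that brokerId in a local map and uses it the next time it pulls from the same MessageQueue.
Overall architecture
The master broker decides whether to suggest a slave
On the broker side, if the gap between the start offset of the requested messages and the master's current maxOffset exceeds the ratio configured in the config multiplied by the memory size, part of the requested data has probably been swapped out of memory. In that case the master tells the client, in the returned result, which broker it suggests for the next request.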
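The decision described here boils down to one comparison. The following sketch spells it out under the assumption that the threshold is given as a percentage of total physical memory; SlaveSuggestion, its parameter names, and the 16 GiB and 40 percent figures in main are illustrative, not values taken from the broker code.

```java
final class SlaveSuggestion {
    /**
     * Returns true when the requested offset lags so far behind the master's max offset
     * that the data is unlikely to still be in memory on the master.
     */
    static boolean suggestPullingFromSlave(long masterMaxOffset,
                                           long pullOffset,
                                           long totalPhysicalMemory,
                                           int memoryRatioPercent) {
        long lag = masterMaxOffset - pullOffset;
        long memoryBudget = (long) (totalPhysicalMemory * (memoryRatioPercent / 100.0));
        return lag > memoryBudget;
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        // 16 GiB machine, 40% budget: about 6.4 GiB of recent CommitLog is assumed hot.
        System.out.println(suggestPullingFromSlave(20 * gib, 10 * gib, 16 * gib, 40)); // lag 10 GiB
        System.out.println(suggestPullingFromSlave(20 * gib, 19 * gib, 16 * gib, 40)); // lag 1 GiB
    }
}
```

A consumer that is far behind would otherwise force the master to read cold data from disk, so steering it to a slave keeps the master's page cache available for fresh writes and up-to-date consumers.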
The client receives the suggested broker from the result and stores it in local memory
public void updatePullFromWhichNode(final MessageQueue mq, final long brokerId) {
//pullFromWhichNodeTable:
//ConcurrentMap<MessageQueue, AtomicLong/* brokerId */>
AtomicLong suggest = this.pullFromWhichNodeTable.get(mq);
if (null == suggest) {
this.pullFromWhichNodeTable.put(mq, new AtomicLong(brokerId));
} else {
suggest.set(brokerId);
}
}
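The update method only covers the write side of the table. A minimal sketch of both sides, including the lookup performed before the next pull, might look like this. PullNodeSelector is a hypothetical class, a plain String stands in for MessageQueue as the key, and falling back to broker id 0 (the master) when no suggestion exists is an assumption of this sketch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

final class PullNodeSelector {
    static final long MASTER_ID = 0L; // assumed id of the master broker

    // queue key -> brokerId suggested by the last pull result
    private final ConcurrentMap<String, AtomicLong> pullFromWhichNodeTable = new ConcurrentHashMap<>();

    /** Records the broker suggested for the next pull of this queue. */
    void update(String queueKey, long brokerId) {
        pullFromWhichNodeTable
            .computeIfAbsent(queueKey, k -> new AtomicLong(brokerId))
            .set(brokerId);
    }

    /** Picks the broker for the next pull: the last suggestion, or the master if there is none. */
    long select(String queueKey) {
        AtomicLong suggest = pullFromWhichNodeTable.get(queueKey);
        return suggest != null ? suggest.get() : MASTER_ID;
    }
}
```

Using computeIfAbsent closes the small window in a get-then-put sequence where two threads could each insert their own AtomicLong for the same queue.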