RocketMQ 4.9.1 Source Code Analysis (HA Module): Master Read/Write Handling


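A simplified view of master/slave synchronization

Viewed at the highest level of abstraction, master/slave synchronization can be divided into three steps:

  1. The master starts
  2. The slave starts
  3. Data is synchronized between master and slave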


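With these steps and RocketMQ's design in mind, we can start by raising a few questions.

Questions

Master

  1. How does the master accept requests from a slave?
  2. When handling a slave request, how does the master decide which data needs to be synchronized?
  3. How does the master make sure the data it sent was synchronized successfully?

Slave

  1. How does a slave obtain the master's routing information?
  2. How does a slave report its offset to the master?
  3. How does a slave process the data synchronized from the master?

Advanced

  1. How are synchronous and asynchronous notification implemented?
  2. How is read/write splitting implemented in RocketMQ?

This article covers the master side first: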

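HA classes

The master/slave code lives under store/src/main/java/org/apache/rocketmq/store/ha/. The two main classes are HAService and HAConnection.

HA class overview

  • HAService: the core implementation class of RocketMQ master/slave synchronization
  • HAService$AcceptSocketService: on the master, listens for client (slave) connections
  • HAService$GroupTransferService: master/slave synchronization notification
  • HAService$HAClient: the client side
  • HAConnection: wraps the channel between master and slave and also carries the master/slave data synchronization logic
  • HAConnection$ReadSocketService: the master's network read implementation
  • HAConnection$WriteSocketService: the master's network write implementation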


Source code

HAService module startup

The HA module starts in store/src/main/java/org/apache/rocketmq/store/ha/HAService#start()

// HAService startup
public void start() throws Exception {
    // master side
    this.acceptSocketService.beginAccept();
    this.acceptSocketService.start();
    this.groupTransferService.start();  // implements synchronous replication mode

    // slave side
    this.haClient.start();
}

  • acceptSocketService.beginAccept(): opens the listening socket that slaves connect to
  • acceptSocketService.start(): handles connection events from slaves

acceptSocketService.beginAccept()

public void beginAccept() throws Exception {
    // create the channel
    this.serverSocketChannel = ServerSocketChannel.open();
    // create the selector
    this.selector = RemotingUtil.openSelector();
    // enable TCP reuseAddress
    this.serverSocketChannel.socket().setReuseAddress(true);
    // bind the listening port, 10912 by default
    this.serverSocketChannel.socket().bind(this.socketAddressListen);
    // switch to non-blocking mode
    this.serverSocketChannel.configureBlocking(false);
    // register for OP_ACCEPT (connection events)
    this.serverSocketChannel.register(this.selector, SelectionKey.OP_ACCEPT);
}

acceptSocketService.start()

@Override
public void run() {
    log.info(this.getServiceName() + " service started");

    while (!this.isStopped()) {
        try {
            // wait up to 1s for slave connection events
            this.selector.select(1000);
            Set<SelectionKey> selected = this.selector.selectedKeys();

            if (selected != null) {
                for (SelectionKey k : selected) {
                    if ((k.readyOps() & SelectionKey.OP_ACCEPT) != 0) {
                        // the channel for the slave's connection
                        SocketChannel sc = ((ServerSocketChannel) k.channel()).accept();

                        if (sc != null) {
                            HAService.log.info("HAService receive new connection, " + sc.socket().getRemoteSocketAddress());

                            try {
                                // create an HAConnection that holds the slave's channel
                                HAConnection conn = new HAConnection(HAService.this, sc);
                                // start the HAConnection
                                conn.start();
                                // store the HAConnection in connectionList
                                HAService.this.addConnection(conn);
                            } catch (Exception e) {
                                log.error("new HAConnection exception", e);
                                sc.close();
                            }
                        }
                    } else {
                        log.warn("Unexpected ops in select " + k.readyOps());
                    }
                }

                selected.clear();
            }
        } catch (Exception e) {
            log.error(this.getServiceName() + " service has exception.", e);
        }
    }

    log.info(this.getServiceName() + " service end");
}

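When a new connection arrives, it is wrapped in an HAConnection object, HAConnection.start() is called, and the connection is then added to the connection list.

All of the logic for handling slave requests and for sending messages to the slave lives in the HAConnection object, so we continue with HAConnection#start().

HAConnection startup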

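public void start() {
    // master side: handles messages received from the slave
    this.readSocketService.start();
    // master side: sends messages to the slave
    this.writeSocketService.start();
}

The work is split between readSocketService and writeSocketService. As the names suggest, one handles read events on the slave's channel and the other handles write events.

  • readSocketService: handles the requests received from the slave
  • writeSocketService: carries the logic for the master synchronizing data to the slave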

How the master handles slave requests

In ReadSocketService#run(), an inner class of store/src/main/java/org/apache/rocketmq/store/ha/HAConnection, the part to focus on is:

while (!this.isStopped()) {
    try {
        // wait up to 1s for a read request
        this.selector.select(1000);
        // handle the read event
        boolean ok = this.processReadEvent();
        if (!ok) {
            HAConnection.log.error("processReadEvent error");
            break;
        }

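        // if the time since the last successful read exceeds the configured threshold, the master/slave connection is considered dead and the loop exits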
        long interval = HAConnection.this.haService.getDefaultMessageStore().getSystemClock().now() - this.lastReadTimestamp;
        if (interval > HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaHousekeepingInterval()) {
            log.warn("ha housekeeping, found this connection[" + HAConnection.this.clientAddr + "] expired, " + interval);
            break;
        }
    } catch (Exception e) {
        HAConnection.log.error(this.getServiceName() + " service has exception.", e);
        break;
    }
}
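The housekeeping check at the end of the loop can be isolated as a small pure function. The following is a minimal sketch, not RocketMQ code: the class name HousekeepingCheck and the 20-second constant are assumptions made for illustration, and the real threshold comes from MessageStoreConfig#getHaHousekeepingInterval().

```java
final class HousekeepingCheck {
    // Assumed value for illustration only; the broker reads the real threshold
    // from MessageStoreConfig#getHaHousekeepingInterval().
    static final long HOUSEKEEPING_INTERVAL_MS = 20_000L;

    /** Returns true when nothing has been read from the slave for longer than the threshold. */
    static boolean isExpired(long nowMs, long lastReadTimestampMs, long thresholdMs) {
        long interval = nowMs - lastReadTimestampMs;
        // strictly greater than, mirroring the comparison in the read loop
        return interval > thresholdMs;
    }

    public static void main(String[] args) {
        long lastRead = 1_000L;
        // exactly at the threshold: the connection is still considered alive
        System.out.println(isExpired(lastRead + 20_000L, lastRead, HOUSEKEEPING_INTERVAL_MS));
        // one millisecond past the threshold: the connection is considered expired
        System.out.println(isExpired(lastRead + 20_001L, lastRead, HOUSEKEEPING_INTERVAL_MS));
    }
}
```

Because lastReadTimestamp is only refreshed when bytes are actually read, a slave that stops reporting its offset is eventually dropped even if the TCP connection itself stays open.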

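The code that handles read events:

Each message sent by the slave contains the offset of the data it wants to pull. Once the master receives this offset, it carries two meanings:

  1. It is the position the slave wants to pull from this time, which the master uses as a reference.
  2. It is also the position the slave has already synchronized up to, so it doubles as an ack packet.

private boolean processReadEvent() {
    int readSizeZeroTimes = 0;

    // if byteBufferRead has no space left
    if (!this.byteBufferRead.hasRemaining()) {
        this.byteBufferRead.flip();
        this.processPosition = 0;
    }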

    while (this.byteBufferRead.hasRemaining()) {
        try {
            int readSize = this.socketChannel.read(this.byteBufferRead);
            if (readSize > 0) {
                readSizeZeroTimes = 0;
                this.lastReadTimestamp = HAConnection.this.haService.getDefaultMessageStore().getSystemClock().now();
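                // process once at least 8 unprocessed bytes are available, because each report (heartbeat) from the slave is an 8-byte offset
                if ((this.byteBufferRead.position() - this.processPosition) >= 8) {
                    // round byteBufferRead.position() down to the nearest multiple of 8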
                    int pos = this.byteBufferRead.position() - (this.byteBufferRead.position() % 8);
                    long readOffset = this.byteBufferRead.getLong(pos - 8);
                    this.processPosition = pos;
                    // update the offset the slave has already pulled
                    HAConnection.this.slaveAckOffset = readOffset;
                    // first pull from this slave
                    if (HAConnection.this.slaveRequestOffset < 0) {
                        HAConnection.this.slaveRequestOffset = readOffset;
                        log.info("slave[" + HAConnection.this.clientAddr + "] request offset " + readOffset);
                    }
                    // signal that the slave has advanced; this updates the push2SlaveMaxOffset field
                    HAConnection.this.haService.notifyTransferSome(HAConnection.this.slaveAckOffset);
                }
            } else if (readSize == 0) {
                if (++readSizeZeroTimes >= 3) {
                    break;
                }
            } else {
                log.error("read socket[" + HAConnection.this.clientAddr + "] < 0");
                return false;
            }
        } catch (IOException e) {
            log.error("processReadEvent exception", e);
            return false;
        }
    }

    return true;
}
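The offset-parsing step in processReadEvent can be tried out in isolation. The sketch below is a minimal illustration, assuming the buffer is still in write mode (position equals the number of bytes received so far); SlaveOffsetParser and latestOffset are hypothetical names, not part of RocketMQ. It shows why only the last complete 8-byte report is read: older reports are superseded by newer ones, and a trailing partial report is left for the next read.

```java
import java.nio.ByteBuffer;

final class SlaveOffsetParser {
    static final int REPORT_SIZE = 8; // each slave report is a single 8-byte offset

    /**
     * Returns the most recent complete offset in the buffer,
     * or -1 if fewer than 8 unprocessed bytes have arrived.
     */
    static long latestOffset(ByteBuffer buffer, int processPosition) {
        if (buffer.position() - processPosition < REPORT_SIZE) {
            return -1L;
        }
        // round the write position down to a multiple of 8 to skip a partial report
        int pos = buffer.position() - (buffer.position() % REPORT_SIZE);
        // absolute get: reads the last complete report without moving the position
        return buffer.getLong(pos - REPORT_SIZE);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        buf.putLong(100L);                // first report
        buf.putLong(250L);                // second report
        buf.put(new byte[] {0, 0, 0, 1}); // first 4 bytes of a third, incomplete report
        System.out.println(latestOffset(buf, 0)); // prints 250
    }
}
```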

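After the master obtains the offset, it updates the push2SlaveMaxOffset field. This field records the position up to which master and slave have synchronized successfully, and it is needed when the master sends data to the slave.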
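A field like push2SlaveMaxOffset should only move forward, even when several connections report offsets concurrently. The article does not show the body of notifyTransferSome, so the following is only a sketch of one common way to get that behavior, a compare-and-set loop on an AtomicLong. AckedOffsetTracker and advanceTo are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

final class AckedOffsetTracker {
    private final AtomicLong push2SlaveMaxOffset = new AtomicLong(0L);

    /**
     * Moves the recorded offset forward to ackedOffset if it is larger.
     * Returns true if this call advanced the value, false if it was stale.
     */
    boolean advanceTo(long ackedOffset) {
        long current = push2SlaveMaxOffset.get();
        while (ackedOffset > current) {
            if (push2SlaveMaxOffset.compareAndSet(current, ackedOffset)) {
                return true; // this is where waiters could be woken up
            }
            // another thread changed the value first; re-read and re-check
            current = push2SlaveMaxOffset.get();
        }
        return false;
    }

    long get() {
        return push2SlaveMaxOffset.get();
    }

    public static void main(String[] args) {
        AckedOffsetTracker tracker = new AckedOffsetTracker();
        System.out.println(tracker.advanceTo(100L)); // true: 0 -> 100
        System.out.println(tracker.advanceTo(50L));  // false: stale ack is ignored
        System.out.println(tracker.get());           // 100
    }
}
```

The loop never lowers the value, so a late or duplicated ack from a slave cannot roll the recorded progress backwards.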

How the master transfers data

The core loop of WriteSocketService#run():

while (!this.isStopped()) {
    try {
        this.selector.select(1000);

        if (-1 == HAConnection.this.slaveRequestOffset) {
            Thread.sleep(10);
            continue;
        }

        // is this the first transfer on this connection?
        if (-1 == this.nextTransferFromWhere) {
            // the slave requested offset 0
            if (0 == HAConnection.this.slaveRequestOffset) {
                long masterOffset = HAConnection.this.haService.getDefaultMessageStore().getCommitLog().getMaxOffset();
                masterOffset = masterOffset - (masterOffset % HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getMappedFileSizeCommitLog());

                if (masterOffset < 0) {
                    masterOffset = 0;
                }

                this.nextTransferFromWhere = masterOffset;
            } else {
                // slaveRequestOffset != 0: start from the offset the slave asked for
                this.nextTransferFromWhere = HAConnection.this.slaveRequestOffset;
            }

            log.info("master transfer data from " + this.nextTransferFromWhere + " to slave[" + HAConnection.this.clientAddr + "], and slave request " + HAConnection.this.slaveRequestOffset);
        }

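        // did the previous transfer finish completely?
        if (this.lastWriteOver) {
            // time elapsed since the last write
            long interval = HAConnection.this.haService.getDefaultMessageStore().getSystemClock().now() - this.lastWriteTimestamp;
            // if the idle time exceeds the HA heartbeat interval, send a heartbeat (header only, body size 0)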
            if (interval > HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaSendHeartbeatInterval()) {
                // Build Header
                this.byteBufferHeader.position(0);
                this.byteBufferHeader.limit(headerSize);
                this.byteBufferHeader.putLong(this.nextTransferFromWhere);
                this.byteBufferHeader.putInt(0);
                this.byteBufferHeader.flip();
                this.lastWriteOver = this.transferData();
                if (!this.lastWriteOver) continue;
            }
        } else {
            // the previous transfer did not finish, so continue sending it
            this.lastWriteOver = this.transferData();
            if (!this.lastWriteOver) continue;
        }

        // fetch commitlog data starting at nextTransferFromWhere
        SelectMappedBufferResult selectResult = HAConnection.this.haService.getDefaultMessageStore().getCommitLogData(this.nextTransferFromWhere);
        if (selectResult != null) {
            int size = selectResult.getSize();
            if (size > HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaTransferBatchSize()) {
                size = HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaTransferBatchSize();
            }

            long thisOffset = this.nextTransferFromWhere;
            this.nextTransferFromWhere += size;

            selectResult.getByteBuffer().limit(size);
            this.selectMappedBufferResult = selectResult;

            // Build Header
            this.byteBufferHeader.position(0);
            this.byteBufferHeader.limit(headerSize);
            this.byteBufferHeader.putLong(thisOffset);
            this.byteBufferHeader.putInt(size);
            this.byteBufferHeader.flip();

            this.lastWriteOver = this.transferData();
        } else {

            HAConnection.this.haService.getWaitNotifyObject().allWaitForRunning(100);
        }
    } catch (Exception e) {

        HAConnection.log.error(this.getServiceName() + " service has exception.", e);
        break;
    }
}
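Two details of the loop above are easy to miss: how the first transfer position is chosen, and how each frame is prefixed by a header made of an 8-byte offset followed by a 4-byte body size (a heartbeat is the same header with size 0). The sketch below restates both with plain values. TransferFrameSketch and its method names are hypothetical, and the 12-byte header size is inferred from the putLong and putInt calls shown above.

```java
import java.nio.ByteBuffer;

final class TransferFrameSketch {
    static final int HEADER_SIZE = 8 + 4; // offset (long) + body size (int)

    /** Chooses where the first transfer on a new connection starts. */
    static long firstTransferOffset(long slaveRequestOffset, long masterMaxOffset, int mappedFileSize) {
        if (slaveRequestOffset != 0) {
            return slaveRequestOffset; // resume from where the slave asked
        }
        // an empty slave starts at the beginning of the master's last commitlog file
        long start = masterMaxOffset - (masterMaxOffset % mappedFileSize);
        return Math.max(start, 0L);
    }

    /** Builds a frame header; bodySize 0 means heartbeat. */
    static ByteBuffer buildHeader(long offset, int bodySize) {
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        header.putLong(offset);
        header.putInt(bodySize);
        header.flip(); // switch to read mode so it can be written to a channel
        return header;
    }

    public static void main(String[] args) {
        int fileSize = 1024; // tiny file size, for illustration only
        long start = firstTransferOffset(0L, 2500L, fileSize);
        System.out.println(start); // prints 2048

        ByteBuffer heartbeat = buildHeader(start, 0);
        System.out.println(heartbeat.remaining()); // prints 12
    }
}
```

Starting an empty slave at the last file boundary means it does not replay the master's full history; it only receives data from the most recent commitlog file onward.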


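Summary

  1. How does the master accept requests from a slave?

Answer: AcceptSocketService accepts the slave's connection and wraps it in an HAConnection. Requests are then handled as OP_READ events in the ReadSocketService class. The master receives the offset sent by the slave, which is the position the slave has currently synchronized to, and saves it.

  2. When handling a slave request, how does the master decide which data needs to be synchronized?

Answer: the slave's request carries the offset it has finished synchronizing, and the master saves that offset in ReadSocketService. Later, when WriteSocketService runs its write loop, it compares that position with the master's own offset and sends any new data to the slave.

  3. How does the master make sure the data it sent was synchronized successfully?

Answer: through an ack mechanism. The offset in each slave -> master request represents the progress the slave has completed. If an error occurs on the slave during synchronization, the data that was not synchronized successfully is sent again by the master the next time.

The slave side and the advanced topics will be covered in follow-up articles.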