RocketMQ Source Code Analysis (3): How the NameServer Receives Messages


I. Introduction

In the last section of the previous article we saw that starting the NameServer creates a Netty server and adds four handlers to it:

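  • encoder/NettyDecoder: encode and decode the messages on the wire
  • IdleStateHandler: handles heartbeats
  • connectionManageHandler: handles connection events
  • serverHandler: handles read/write requests, i.e. broker registration messages and producer/consumer requests for topic information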

This article focuses on serverHandler, the handler that processes incoming messages. Here is the class:

II. The serverHandler processor

    @ChannelHandler.Sharable
    class NettyServerHandler extends SimpleChannelInboundHandler<RemotingCommand> {

        @Override
        protected void channelRead0(ChannelHandlerContext ctx, RemotingCommand msg) {
//            System.out.println("NettyServerHandler triggered ==> " + msg.toString());
            int localPort = RemotingHelper.parseSocketAddressPort(ctx.channel().localAddress());
            NettyRemotingAbstract remotingAbstract = NettyRemotingServer.this.remotingServerTable.get(localPort);
            if (localPort != -1 && remotingAbstract != null) {
                // handle the request
                remotingAbstract.processMessageReceived(ctx, msg);
                return;
            }
            // The related remoting server has been shutdown, so close the connected channel
            RemotingUtil.closeChannel(ctx.channel());
        }
    }

Next, follow into org.apache.rocketmq.remoting.netty.NettyRemotingAbstract#processMessageReceived:

public void processMessageReceived(ChannelHandlerContext ctx, RemotingCommand msg) {
    if (msg != null) {
        switch (msg.getType()) {
            case REQUEST_COMMAND:
                // handle a request command
                processRequestCommand(ctx, msg);
                break;
            // handle a response command
            case RESPONSE_COMMAND:
                processResponseCommand(ctx, msg);
                break;
            default:
                break;
        }
    }
}

What arrives here is usually a request command, so keep following that branch:

public void processRequestCommand(final ChannelHandlerContext ctx, final RemotingCommand cmd) {
    // look up the Pair for this code in processorTable
    final Pair<NettyRequestProcessor, ExecutorService> matched = this.processorTable.get(cmd.getCode());
    // if none is found, fall back to the default pair
    final Pair<NettyRequestProcessor, ExecutorService> pair = null == matched ? this.defaultRequestProcessorPair : matched;
    final int opaque = cmd.getOpaque();

    if (pair == null) {
        String error = " request type " + cmd.getCode() + " not supported";
        final RemotingCommand response =
            RemotingCommand.createResponseCommand(RemotingSysResponseCode.REQUEST_CODE_NOT_SUPPORTED, error);
        response.setOpaque(opaque);
        ctx.writeAndFlush(response);
        log.error(RemotingHelper.parseChannelRemoteAddr(ctx.channel()) + error);
        return;
    }

    /**
     * Use the code to find the processor (i.e. the pair), build a Runnable,
     * and hand it to the thread pool for execution.
     * pair.getObject1() is the processor
     * pair.getObject2() is the thread pool
     */
    Runnable run = buildProcessRequestHandler(ctx, cmd, pair, opaque);

    if (pair.getObject1().rejectRequest()) {
        final RemotingCommand response = RemotingCommand.createResponseCommand(RemotingSysResponseCode.SYSTEM_BUSY,
                "[REJECTREQUEST]system busy, start flow control for a while");
        response.setOpaque(opaque);
        ctx.writeAndFlush(response);
        return;
    }

    try {
        /**
         * Hand the Runnable built above to the thread pool for execution
         */
        final RequestTask requestTask = new RequestTask(run, ctx.channel(), cmd);
        //async execute task, current thread return directly
        pair.getObject2().submit(requestTask);
    } catch (RejectedExecutionException e) {
        if ((System.currentTimeMillis() % 10000) == 0) {
            log.warn(RemotingHelper.parseChannelRemoteAddr(ctx.channel())
                    + ", too many requests and system thread pool busy, RejectedExecutionException "
                    + pair.getObject2().toString()
                    + " request code: " + cmd.getCode());
        }

        if (!cmd.isOnewayRPC()) {
            final RemotingCommand response = RemotingCommand.createResponseCommand(RemotingSysResponseCode.SYSTEM_BUSY,
                    "[OVERLOAD]system busy, start flow control for a while");
            response.setOpaque(opaque);
            ctx.writeAndFlush(response);
        }
    }
}

The method breaks down into the following steps:

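  • Look up the pair for the request code.

Note: a pair bundles a request processor (NettyRequestProcessor) with the thread pool (ExecutorService) that runs it.

  • If no pair is registered for the code, fall back to the default pair.
  • Build an executable Runnable from the pair.
  • Wrap the Runnable in a RequestTask and submit it to the pair's thread pool.

Next we step into buildProcessRequestHandler(), whose core code is:

[image.png: core code of buildProcessRequestHandler()]

From there, pair.getObject1().processRequest(ctx, cmd) leads into DefaultRequestProcessor#processRequest:

[image.png: DefaultRequestProcessor#processRequest]

This method picks a different handler method depending on the request code.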
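To make the dispatch steps above concrete, here is a minimal, self-contained sketch of the same pattern: a table keyed by request code, a default fallback, and hand-off to a thread pool. The names DispatchSketch, Processor and Pair are illustrative assumptions and are not RocketMQ's classes; the sketch also uses CompletableFuture instead of RequestTask to keep it short.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of code-based dispatch; these are not RocketMQ's classes.
class DispatchSketch {
    // Stand-in for NettyRequestProcessor: maps a request to a response string.
    interface Processor {
        String process(int code, String body);
    }

    // Stand-in for Pair<NettyRequestProcessor, ExecutorService>.
    static final class Pair {
        final Processor processor;
        final ExecutorService executor;

        Pair(Processor processor, ExecutorService executor) {
            this.processor = processor;
            this.executor = executor;
        }
    }

    private final Map<Integer, Pair> processorTable = new ConcurrentHashMap<>();
    private volatile Pair defaultPair; // fallback when a code has no entry

    void register(int code, Processor processor, ExecutorService executor) {
        processorTable.put(code, new Pair(processor, executor));
    }

    void registerDefault(Processor processor, ExecutorService executor) {
        defaultPair = new Pair(processor, executor);
    }

    // Look up the pair by code, fall back to the default pair, then run the
    // processor asynchronously on the pair's thread pool.
    // Returns null when no processor can handle the code.
    CompletableFuture<String> dispatch(int code, String body) {
        Pair matched = processorTable.get(code);
        Pair pair = (matched == null) ? defaultPair : matched;
        if (pair == null) {
            return null;
        }
        return CompletableFuture.supplyAsync(() -> pair.processor.process(code, body), pair.executor);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        DispatchSketch server = new DispatchSketch();
        server.registerDefault((code, body) -> "default handler for code " + code, pool);
        server.register(103, (code, body) -> "registered " + body, pool);
        System.out.println(server.dispatch(103, "broker-a").join());
        System.out.println(server.dispatch(999, "").join());
        pool.shutdown();
    }
}
```

Keeping the thread pool next to the processor in one pair lets each request type run on its own pool, so a slow request type does not have to block the others.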

III. Broker startup/shutdown: the NameServer receives registration/unregistration messages

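When a broker starts it is given the NameServer address; once it has started successfully, it notifies the NameServer and registers itself. Here is the relevant branch of the NameServer's DefaultRequestProcessor#processRequest: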

case RequestCode.REGISTER_BROKER:
    // register the broker
    return this.registerBroker(ctx, request);

Stepping into registerBroker(ctx, request), the request parameter looks like this:

RemotingCommand [code=103, language=JAVA, version=413, opaque=96, flag(B)=0, remark=null, extFields={brokerId=0, bodyCrc32=910316528, clusterName=DefaultCluster, brokerAddr=127.0.0.1:10911, enableActingMaster=false, haServerAddr=192.168.192.1:10912, compressed=false, brokerName=broker-a}, serializeTypeCurrentRPC=JSON]

This eventually calls:

RegisterBrokerResult result = this.namesrvController.getRouteInfoManager().registerBroker(
    requestHeader.getClusterName(),
    requestHeader.getBrokerAddr(),
    requestHeader.getBrokerName(),
    requestHeader.getBrokerId(),
    requestHeader.getHaServerAddr(),
    request.getExtFields().get(MixAll.ZONE_NAME),
    requestHeader.getHeartbeatTimeoutMillis(),
    requestHeader.getEnableActingMaster(),
    topicConfigWrapper,
    filterServerList,
    ctx.channel()
);

In other words, the broker's information is registered in RouteInfoManager. Its member variables are:

private final Map<String/* topic */, Map<String, QueueData>> topicQueueTable;
private final Map<String/* brokerName */, BrokerData> brokerAddrTable;
private final Map<String/* clusterName */, Set<String/* brokerName */>> clusterAddrTable;
private final Map<BrokerAddrInfo/* brokerAddr */, BrokerLiveInfo> brokerLiveTable;
private final Map<BrokerAddrInfo/* brokerAddr */, List<String>/* Filter Server */> filterServerTable;
private final Map<String/* topic */, Map<String/*brokerName*/, TopicQueueMappingInfo>> topicQueueMappingInfoTable;

As mentioned earlier, the NameServer is a very simple topic-route registry, and these maps are the key to how it implements that registry.

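  1. topicQueueTable: stores the topic-to-queue relationship. As declared above, the value is a Map from brokerName to QueueData, so one topic can have queues on several brokers. The fields of QueueData are: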

    public class QueueData implements Comparable<QueueData> {
     // name of the broker that hosts the queues
     private String brokerName;
     // number of read and write queues
     private int readQueueNums;
     private int writeQueueNums;
     private int perm;
     private int topicSynFlag;
     ...
    }
    
  2. brokerAddrTable: records broker details. The key is the broker name and the value is the broker's details. The fields of BrokerData are:

    public class BrokerData implements Comparable<BrokerData> {
     // name of the cluster the broker belongs to
     private String cluster;
     // broker name
     private String brokerName;
     // server address for each brokerId; one brokerName can map to several broker servers
     private HashMap<Long, String> brokerAddrs;
    }
    
  3. clusterAddrTable: cluster information, mapping each cluster name to its brokerNames

  4. brokerLiveTable: live broker information. The key is the broker address and the value describes that broker server. The fields of BrokerLiveInfo are:

class BrokerLiveInfo {
    // time of the last heartbeat update
    private long lastUpdateTimestamp;
    private long heartbeatTimeoutMillis;
    private DataVersion dataVersion;
    // the network connection channel, provided by Netty
    private Channel channel;
    // address of the high-availability (HA) service
    private String haServerAddr;
    ...
}

With that in mind, look again at RouteInfoManager#registerBroker: "registration" turns out to be nothing more than put operations on the maps above:

public RegisterBrokerResult registerBroker(
   final String clusterName,
   final String brokerAddr,
   final String brokerName,
   final long brokerId,
   final String haServerAddr,
   final String zoneName,
   final Long timeoutMillis,
   final Boolean enableActingMaster,
   final TopicConfigSerializeWrapper topicConfigWrapper,
   final List<String> filterServerList,
   final Channel channel) {
   RegisterBrokerResult result = new RegisterBrokerResult();
   try {
       this.lock.writeLock().lockInterruptibly();

       //init or update the cluster info: look up the brokerNames for clusterName in the cluster map
       Set<String> brokerNames = ConcurrentHashMapUtils.computeIfAbsent((ConcurrentHashMap<String, Set<String>>) this.clusterAddrTable, clusterName, k -> new HashSet<>());
       brokerNames.add(brokerName);

       boolean registerFirst = false;
       // look up the BrokerData for brokerName
       BrokerData brokerData = this.brokerAddrTable.get(brokerName);
       if (null == brokerData) {
           registerFirst = true;
           brokerData = new BrokerData(clusterName, brokerName, new HashMap<>());
           this.brokerAddrTable.put(brokerName, brokerData);    // put into brokerAddrTable
       }

       boolean isOldVersionBroker = enableActingMaster == null;
       brokerData.setEnableActingMaster(!isOldVersionBroker && enableActingMaster);
       brokerData.setZoneName(zoneName);

       Map<Long, String> brokerAddrsMap = brokerData.getBrokerAddrs();

       boolean isMinBrokerIdChanged = false;
       long prevMinBrokerId = 0;
       if (!brokerAddrsMap.isEmpty()) {
           prevMinBrokerId = Collections.min(brokerAddrsMap.keySet());
       }

       if (brokerId < prevMinBrokerId) {
           isMinBrokerIdChanged = true;
       }
       // When a slave is promoted to master, its old slave record must be removed
       //Switch slave to master: first remove <1, IP:PORT> in namesrv, then add <0, IP:PORT>
       //The same IP:PORT must only have one record in brokerAddrTable
       brokerAddrsMap.entrySet().removeIf(item -> null != brokerAddr && brokerAddr.equals(item.getValue()) && brokerId != item.getKey());

       //If Local brokerId stateVersion bigger than the registering one,
       String oldBrokerAddr = brokerAddrsMap.get(brokerId);
       if (null != oldBrokerAddr && !oldBrokerAddr.equals(brokerAddr)) {
           BrokerLiveInfo oldBrokerInfo = brokerLiveTable.get(new BrokerAddrInfo(clusterName, oldBrokerAddr));

           if (null != oldBrokerInfo) {
               long oldStateVersion = oldBrokerInfo.getDataVersion().getStateVersion();
               long newStateVersion = topicConfigWrapper.getDataVersion().getStateVersion();
               if (oldStateVersion > newStateVersion) {
                   log.warn("Registered Broker conflicts with the existed one, just ignore.: Cluster:{}, BrokerName:{}, BrokerId:{}, " +
                           "Old BrokerAddr:{}, Old Version:{}, New BrokerAddr:{}, New Version:{}.",
                       clusterName, brokerName, brokerId, oldBrokerAddr, oldStateVersion, brokerAddr, newStateVersion);
                   //Remove the rejected brokerAddr from brokerLiveTable.
                   brokerLiveTable.remove(new BrokerAddrInfo(clusterName, brokerAddr));
                   return result;
               }
           }
       }

...
...
...


   return result;
}

That makes the job of this method clear: it wraps the information reported by the broker and puts it into these maps.

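With registration understood, unregistration is easy: it is the reverse operation and removes the broker's entries from the same maps. It is handled by RouteInfoManager#unregisterBroker, which indeed just performs map removals, so we will not analyze it here.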
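As a rough illustration of registration and unregistration as map operations, here is a heavily simplified sketch. RegistrySketch and its reduced value types (plain strings and timestamps instead of BrokerData and BrokerLiveInfo) are assumptions made for this example; only the table names mirror the article.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified registry; value types are reduced for illustration.
class RegistrySketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // clusterName -> brokerNames
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();
    // brokerName -> (brokerId -> brokerAddr)
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>();
    // brokerAddr -> last heartbeat timestamp in milliseconds
    final Map<String, Long> brokerLiveTable = new HashMap<>();

    void registerBroker(String cluster, String brokerName, long brokerId, String addr) {
        lock.writeLock().lock();
        try {
            clusterAddrTable.computeIfAbsent(cluster, k -> new HashSet<>()).add(brokerName);
            Map<Long, String> addrs = brokerAddrTable.computeIfAbsent(brokerName, k -> new HashMap<>());
            // The same address may appear only once: drop it under any other id
            // (for example when a slave is promoted to master).
            addrs.entrySet().removeIf(e -> addr.equals(e.getValue()) && brokerId != e.getKey());
            addrs.put(brokerId, addr);
            brokerLiveTable.put(addr, System.currentTimeMillis());
        } finally {
            lock.writeLock().unlock();
        }
    }

    void unregisterBroker(String cluster, String brokerName, long brokerId, String addr) {
        lock.writeLock().lock();
        try {
            brokerLiveTable.remove(addr);
            Map<Long, String> addrs = brokerAddrTable.get(brokerName);
            if (addrs == null) {
                return;
            }
            addrs.remove(brokerId);
            if (addrs.isEmpty()) {
                // Last address gone: drop the broker name everywhere.
                brokerAddrTable.remove(brokerName);
                Set<String> names = clusterAddrTable.get(cluster);
                if (names != null) {
                    names.remove(brokerName);
                    if (names.isEmpty()) {
                        clusterAddrTable.remove(cluster);
                    }
                }
            }
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

A single write lock around all tables keeps them consistent with each other, which is the same reason the real method takes this.lock.writeLock() before touching any map.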

IV. Fetching topic information

When a producer or consumer starts, it needs to fetch routing information for its topics from the NameServer. The handler examined here is org.apache.rocketmq.namesrv.processor.DefaultRequestProcessor#getAllTopicListFromNameserver:

private RemotingCommand getAllTopicListFromNameserver(ChannelHandlerContext ctx, RemotingCommand request) {
    final RemotingCommand response = RemotingCommand.createResponseCommand(null);
    // check whether listing all topics is enabled
    boolean enableAllTopicList = namesrvController.getNamesrvConfig().isEnableAllTopicList();
    log.warn("getAllTopicListFromNameserver {} enable {}", ctx.channel().remoteAddress(), enableAllTopicList);
    if (enableAllTopicList) {
        // fetch all topics from RouteInfoManager, i.e. read them out of topicQueueTable
        byte[] body = this.namesrvController.getRouteInfoManager().getAllTopicList().encode();
        response.setBody(body);
        response.setCode(ResponseCode.SUCCESS);
        response.setRemark(null);
    } else {
        response.setCode(ResponseCode.SYSTEM_ERROR);
        response.setRemark("disable");
    }

    return response;
}

In short: fetching all topics means taking all the keys of topicQueueTable in RouteInfoManager, which are the topic names.
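The idea that the topic list is simply the key set of the table can be shown in a few lines. TopicListSketch and addQueueData are hypothetical names, the inner value is reduced to a queue count, and the sorting is added only to make the result deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the topic list is a snapshot of the table's keys.
class TopicListSketch {
    // topic -> (brokerName -> number of queues); a reduced stand-in for topicQueueTable
    private final Map<String, Map<String, Integer>> topicQueueTable = new ConcurrentHashMap<>();

    void addQueueData(String topic, String brokerName, int queueNums) {
        topicQueueTable.computeIfAbsent(topic, k -> new ConcurrentHashMap<>()).put(brokerName, queueNums);
    }

    // Copy the keys so callers never see later changes to the table.
    // Sorting is an addition of this sketch, for deterministic output.
    List<String> getAllTopicList() {
        List<String> topics = new ArrayList<>(topicQueueTable.keySet());
        Collections.sort(topics);
        return topics;
    }
}
```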

V. Fetching broker version information

This is handled in this.queryBrokerTopicConfig(ctx, request):

    public RemotingCommand queryBrokerTopicConfig(ChannelHandlerContext ctx,
        RemotingCommand request) throws RemotingCommandException {
       ...
        // Key code: check whether the version has changed
        Boolean changed = this.namesrvController.getRouteInfoManager().isBrokerTopicConfigChanged(clusterName, brokerAddr, dataVersion);
        // Update the last report time to the current time
        this.namesrvController.getRouteInfoManager().updateBrokerInfoUpdateTimestamp(clusterName, brokerAddr);

        DataVersion nameSeverDataVersion = this.namesrvController.getRouteInfoManager().queryBrokerTopicConfig(clusterName, brokerAddr);
        response.setCode(ResponseCode.SUCCESS);
        response.setRemark(null);
        // Return the NameServer's current version
        if (nameSeverDataVersion != null) {
            response.setBody(nameSeverDataVersion.encode());
        }
        responseHeader.setChanged(changed);
        return response;
    }
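The version check can be pictured as a comparison between the version the broker reports and the one the NameServer has stored for it. The sketch below assumes that "changed" means "no stored version, or a stored version that differs from the reported one"; VersionCheckSketch and Version are hypothetical stand-ins, not the real RouteInfoManager or DataVersion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of a version comparison; assumption-based, not RocketMQ code.
class VersionCheckSketch {
    // Reduced stand-in for DataVersion: a timestamp plus a counter.
    static final class Version {
        final long timestamp;
        final long counter;

        Version(long timestamp, long counter) {
            this.timestamp = timestamp;
            this.counter = counter;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Version)) {
                return false;
            }
            Version other = (Version) o;
            return timestamp == other.timestamp && counter == other.counter;
        }

        @Override
        public int hashCode() {
            return Objects.hash(timestamp, counter);
        }
    }

    // brokerAddr -> version last registered by that broker
    private final Map<String, Version> registered = new HashMap<>();

    void register(String brokerAddr, Version version) {
        registered.put(brokerAddr, version);
    }

    // Assumed rule: changed when nothing is stored or the stored version differs.
    boolean isChanged(String brokerAddr, Version reported) {
        Version stored = registered.get(brokerAddr);
        return stored == null || !stored.equals(reported);
    }
}
```

With a check like this, a broker whose topic configuration has not changed can skip re-sending the full configuration and only refresh its report time.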

VI. Checking whether brokers are alive

When we analyzed NamesrvController#start earlier, we mentioned that it starts a scheduled task, DefaultBrokerHeartbeatManager#start:

@Override
public void start() {
    this.scheduledService.scheduleAtFixedRate(this::scanNotActiveBroker, 2000, this.controllerConfig.getScanNotActiveBrokerInterval(), TimeUnit.MILLISECONDS);
}

继续进入


public void scanNotActiveBroker() {
    try {
        // brokerLiveTable holds the live brokers; find the inactive ones and remove them from it
        final Iterator<Map.Entry<BrokerAddrInfo, BrokerLiveInfo>> iterator = this.brokerLiveTable.entrySet().iterator();
        while (iterator.hasNext()) {
            final Map.Entry<BrokerAddrInfo, BrokerLiveInfo> next = iterator.next();
            // time of the last heartbeat
            long last = next.getValue().getLastUpdateTimestamp();
            long timeoutMillis = next.getValue().getHeartbeatTimeoutMillis();
            // decide liveness from the heartbeat time; the timeout is 2 min
            if ((last + timeoutMillis) < System.currentTimeMillis()) {
                final Channel channel = next.getValue().getChannel();
                // remove the entry
                iterator.remove();
                if (channel != null) {
                    // close the channel; this path also removes the broker from the other maps
                    RemotingUtil.closeChannel(channel);
                }
                this.executor.submit(() ->
                    notifyBrokerInActive(next.getKey().getClusterName(), next.getValue().getBrokerName(), next.getKey().getBrokerAddr(), next.getValue().getBrokerId()));
                log.warn("The broker channel {} expired, brokerInfo {}, expired {}ms", next.getValue().getChannel(), next.getKey(), timeoutMillis);
            }
        }
  
    } catch (Exception e) {
        ...
    }
}

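This method iterates over brokerLiveTable and checks each BrokerLiveInfo's most recent report time against its timeout. If the last report is more than 2 minutes old, the broker is presumed dead: it is removed from brokerLiveTable, and RouteInfoManager#onChannelDestroy is then called to remove the broker from the other maps.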
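The expiry scan can be reduced to a small, testable sketch. LivenessSketch, LiveInfo and onHeartbeat are hypothetical names; the current time is passed in as a parameter (rather than read from System.currentTimeMillis()) so the behaviour is easy to check, and the channel handling and notifications of the real method are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of heartbeat-based expiry; not the RocketMQ implementation.
class LivenessSketch {
    static final class LiveInfo {
        final long lastUpdateTimestamp;
        final long heartbeatTimeoutMillis;

        LiveInfo(long lastUpdateTimestamp, long heartbeatTimeoutMillis) {
            this.lastUpdateTimestamp = lastUpdateTimestamp;
            this.heartbeatTimeoutMillis = heartbeatTimeoutMillis;
        }
    }

    // brokerAddr -> last heartbeat and allowed silence
    final Map<String, LiveInfo> brokerLiveTable = new ConcurrentHashMap<>();

    void onHeartbeat(String brokerAddr, long now, long timeoutMillis) {
        brokerLiveTable.put(brokerAddr, new LiveInfo(now, timeoutMillis));
    }

    // Remove every broker whose last heartbeat is older than its timeout
    // and return the removed addresses.
    List<String> scanNotActiveBroker(long now) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, LiveInfo>> it = brokerLiveTable.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, LiveInfo> entry = it.next();
            String brokerAddr = entry.getKey();
            LiveInfo info = entry.getValue();
            if (info.lastUpdateTimestamp + info.heartbeatTimeoutMillis < now) {
                it.remove();
                expired.add(brokerAddr);
            }
        }
        return expired;
    }
}
```

In a running server such a scan would be triggered periodically, for example with ScheduledExecutorService#scheduleAtFixedRate as in the start() method shown earlier.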

VII. Summary

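This article analyzed how the NameServer handles request messages. The NameServer uses Netty for communication; the ChannelHandler that handles requests from brokers, producers and consumers is NettyServerHandler, and the final handling method is DefaultRequestProcessor#processRequest. That method serves many request types; we focused on registering/unregistering brokers, fetching topic routing information, and fetching broker version information.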

All three are ultimately handled in RouteInfoManager, which has several very important map-typed member variables:

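  1. topicQueueTable: stores the topic-to-queue relationship; the value holds QueueData per brokerName, so one topic can have queues on several brokers
  2. brokerAddrTable: records broker details; the key is the broker name and the value is the broker's details
  3. clusterAddrTable: cluster information, mapping each cluster name to its brokerNames
  4. brokerLiveTable: live broker information; the key is the broker address and the value describes that broker server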

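These member variables are why the NameServer is called a registry: registering or unregistering a broker means putting or removing that broker's entries in these maps, and fetching topic routing information means reading broker/messageQueue data from topicQueueTable.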

The NameServer's so-called "registration", "discovery" and "heartbeat" are all operations on these map members of RouteInfoManager.