[RocketMQ | Source Code Analysis] An Introduction to Message Queue Load-Balancing Strategies

Hi everyone, I'm 林师傅!

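Preface

The previous article covered when Consumer message queue load balancing is triggered. This article looks at how the Consumer actually performs it.

[RocketMQ | Source Code Analysis] Analysis of When Message Queue Load Balancing Is Triggered - 掘金 (juejin.cn)

Consumer load-balancing source code analysis

Rebalancing is performed by MQClientInstance#doRebalance, shown below. The method iterates over consumerTable, which maps each consumerGroup to its MQConsumerInner, and calls doRebalance on each entry in turn.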

// org.apache.rocketmq.client.impl.factory.MQClientInstance#doRebalance
public void doRebalance() {
    for (Map.Entry<String/*consumerGroup*/, MQConsumerInner> entry : this.consumerTable.entrySet()) {
        MQConsumerInner impl = entry.getValue();
        if (impl != null) {
            try {
                impl.doRebalance();
            } catch (Throwable e) {
                log.error("doRebalance exception", e);
            }
        }
    }
}

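MQConsumerInner is the interface for consumers. It has three implementations: DefaultMQPullConsumerImpl, DefaultLitePullConsumerImpl, and DefaultMQPushConsumerImpl.

The doRebalance methods of these three implementations follow the same pattern: each one delegates to RebalanceImpl#doRebalance to perform the load balancing.

// Consumer-side rebalancer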
private final RebalanceImpl rebalanceImpl = new RebalancePushImpl(this);

// org.apache.rocketmq.client.impl.consumer.DefaultMQPushConsumerImpl#doRebalance
public void doRebalance() {
    if (!this.pause) {
        this.rebalanceImpl.doRebalance(this.isConsumeOrderly()/* whether consumption is orderly; false by default */);
    }
}

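doRebalance reads the subscription table (subTable), iterates over every topic the current ConsumerGroup subscribes to, and calls rebalanceByTopic for each one. After the loop it removes the MessageQueues whose topics are no longer subscribed.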

public void doRebalance(final boolean isOrder) {
    Map<String/*topic*/, SubscriptionData> subTable = this.getSubscriptionInner();
    if (subTable != null) {
        for (final Map.Entry<String/*topic*/, SubscriptionData> entry : subTable.entrySet()) {
            final String topic = entry.getKey();
            try {
                // Rebalance per topic
                this.rebalanceByTopic(topic, isOrder);
            } catch (Throwable e) {
                if (!topic.startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
                    log.warn("rebalanceByTopic Exception", e);
                }
            }
        }
    }
    // Remove MessageQueues whose topics are no longer subscribed
    this.truncateMessageQueueNotMyTopic();
}
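The final cleanup step can be pictured as a filter over the process-queue table. The following is only a minimal sketch under simplifying assumptions: queues are plain strings mapped to their topic, and `TruncateSketch` and `truncateNotSubscribed` are hypothetical names for illustration, not RocketMQ's actual types or methods.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

class TruncateSketch {
    // Removes every queue whose topic is not in the subscribed set.
    // Returns how many entries were removed.
    static int truncateNotSubscribed(Map<String, String> queueToTopic, Set<String> subscribedTopics) {
        int removed = 0;
        Iterator<Map.Entry<String, String>> it = queueToTopic.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (!subscribedTopics.contains(entry.getValue())) {
                it.remove(); // safe removal while iterating
                removed++;
            }
        }
        return removed;
    }
}
```

Removing through the iterator avoids a ConcurrentModificationException on an ordinary HashMap.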

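The per-topic rebalance method is shown below. RebalanceImpl follows a different procedure depending on the message model.

Broadcasting mode

In broadcasting mode the Consumer needs no load balancing: every consumer consumes all MessageQueues of the topic, so only the current consumer's processQueueTable has to be updated.

Clustering mode

In clustering mode the MessageQueues are allocated according to the Consumer's load-balancing strategy, and the current Consumer's processQueueTable is then updated with the result.

The rough logic in clustering mode is:

Get the topic's MessageQueue set (mqSet) from the topic subscription table (topicSubscribeInfoTable), then pick a Broker at random and fetch from it the clientId list of the current consumerGroup (cidAll). The queues (copied into the list mqAll) and cidAll are both sorted, which guarantees that every consumer client sees the MessageQueue list and the clientId list in the same order when it runs the load balancing.
Get the load-balancing strategy configured for the Consumer (an AllocateMessageQueueStrategy implementation) and run its allocate algorithm to assign queues to the current clientId, producing the allocated MessageQueue collection (allocateResult).
Update the process-queue table (processQueueTable), create a pullRequest for each newly allocated message queue, and dispatch it to PullMessageService.
If processQueueTable changed, call messageQueueChanged, which sends a heartbeat to all brokers so that they update the current subscription.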
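To see why the sorting in the first step matters, here is a small sketch in which two clients receive the same queues and clientIds in different orders and still compute disjoint, complete assignments. It assumes queues and clientIds are plain strings and uses a simple round-robin rule; `SortedViewSketch` is a hypothetical name, not part of RocketMQ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedViewSketch {
    // Each client sorts its own copies first, then takes every n-th queue
    // starting at its own position in the sorted clientId list.
    static List<String> allocate(String self, List<String> mqs, List<String> cids) {
        List<String> mqAll = new ArrayList<String>(mqs);
        List<String> cidAll = new ArrayList<String>(cids);
        Collections.sort(mqAll);
        Collections.sort(cidAll);

        List<String> result = new ArrayList<String>();
        int index = cidAll.indexOf(self);
        if (index < 0) {
            return result; // unknown client: nothing to allocate
        }
        for (int i = index; i < mqAll.size(); i += cidAll.size()) {
            result.add(mqAll.get(i));
        }
        return result;
    }
}
```

Because every client sorts before allocating, no coordination between clients is needed: each one runs the same deterministic function over the same ordered inputs and simply keeps its own share.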

// org.apache.rocketmq.client.impl.consumer.RebalanceImpl#rebalanceByTopic
private void rebalanceByTopic(final String topic, final boolean isOrder) {
    switch (messageModel) {
        // In broadcasting mode no rebalance is needed; only the topic's processQueueTable is updated
        case BROADCASTING: {
            // Get the topic's message queues
            Set<MessageQueue> mqSet = this.topicSubscribeInfoTable.get(topic);
            if (mqSet != null) {
                // Update processQueueTable with all of the topic's queues, create the initial pullRequests and dispatch them to PullMessageService
                boolean changed = this.updateProcessQueueTableInRebalance(topic, mqSet/* all MessageQueues of the topic */, isOrder);
                if (changed) {
                    // Notify the Broker that the subscription has changed
                    this.messageQueueChanged(topic, mqSet, mqSet);
                }
            }
            break;
        }
        // Clustering mode
        case CLUSTERING: {
            // Get the topic's MessageQueues
            Set<MessageQueue> mqSet = this.topicSubscribeInfoTable.get(topic);
            // Fetch from a broker the clientId list of all consumers in this consumerGroup for the topic
            List<String> cidAll = this.mQClientFactory.findConsumerIdList(topic, consumerGroup);

            if (mqSet != null && cidAll != null) {
                List<MessageQueue> mqAll = new ArrayList<MessageQueue>();
                mqAll.addAll(mqSet);

                Collections.sort(mqAll);
                Collections.sort(cidAll);
                // The configured load-balancing strategy
                AllocateMessageQueueStrategy strategy = this.allocateMessageQueueStrategy;

                List<MessageQueue> allocateResult = null;
                try {
                    // Allocate MessageQueues according to the strategy
                    allocateResult = strategy.allocate(this.consumerGroup, this.mQClientFactory.getClientId(), mqAll, cidAll);
                } catch (Throwable e) {
                    return;
                }
                // Result of the rebalance
                Set<MessageQueue> allocateResultSet = new HashSet<MessageQueue>();
                if (allocateResult != null) {
                    allocateResultSet.addAll(allocateResult);
                }
                // Apply the rebalance result to processQueueTable
                boolean changed = this.updateProcessQueueTableInRebalance(topic, allocateResultSet, isOrder);
                if (changed) {
                    this.messageQueueChanged(topic, mqSet, allocateResultSet);
                }
            }
            break;
        }
        default:
            break;
    }
}
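The update of processQueueTable is essentially a set difference between the old and the new allocation. The sketch below illustrates that idea only; it assumes queues are plain strings and models a pull request as the queue name, and `ProcessQueueDiffSketch` is a hypothetical name rather than RocketMQ's updateProcessQueueTableInRebalance.

```java
import java.util.List;
import java.util.Set;

class ProcessQueueDiffSketch {
    // Applies the new allocation to 'current' in place.
    // Newly assigned queues are appended to 'newPullRequests'.
    // Returns true if the set of owned queues changed.
    static boolean update(Set<String> current, Set<String> allocated, List<String> newPullRequests) {
        // Drop queues that are no longer assigned to this consumer
        boolean changed = current.retainAll(allocated);
        for (String mq : allocated) {
            // add() returns true only for queues that were not owned before
            if (current.add(mq)) {
                newPullRequests.add(mq);
                changed = true;
            }
        }
        return changed;
    }
}
```

Returning a boolean mirrors the `changed` flag in the source above: the caller only needs to notify anyone when the owned set actually differs from before.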

Consumer load-balancing classes: source code analysis

AllocateMessageQueueStrategy is the top-level interface for load-balancing strategies in RocketMQ. It declares two methods:

public interface AllocateMessageQueueStrategy {
    // Allocate the list of MessageQueues for the given clientId
    List<MessageQueue> allocate(final String consumerGroup,final String currentCID,
        final List<MessageQueue> mqAll,final List<String> cidAll);

    // Name of the load-balancing algorithm
    String getName();
}

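The class diagram of the load-balancing strategy interface and its implementations is shown below.

As the class diagram shows, RocketMQ provides six implementations of the load-balancing strategy. Let's look at how each of them works.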

AllocateMessageQueueAveragely

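Average allocation strategy, which is also RocketMQ's default load-balancing strategy. It spreads the MessageQueues as evenly as possible across all consumers and gives the leftover queues to the consumers at the front of the list. Allocation is contiguous: the next consumer gets queues only after the previous consumer's share is complete.

AllocateMessageQueueAveragelyByCircle

Circular average allocation strategy. It also spreads the message queues as evenly as possible across all consumers, with the leftover queues going to the consumers at the front of the list, but it hands queues out one at a time in round-robin order.

AllocateMessageQueueByMachineRoom

Machine-room average allocation strategy. The Consumer is bound to the brokers of specific machine rooms, and only the MessageQueues in those rooms take part in load balancing.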

AllocateMachineRoomNearby

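Machine-room-nearby allocation strategy. Consumers load-balance the MessageQueues of their own machine room; the queues of a machine room with no live consumer are shared among all consumers.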

AllocateMessageQueueConsistentHash

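Consistent-hash allocation strategy. It is not recommended, because it makes the assignment of message queues to consumers hard to trace.

AllocateMessageQueueByConfig

Allocates MessageQueues by configuration: each consumer is configured with a fixed list of message queues.

Unless you have special requirements, prefer AllocateMessageQueueAveragely or AllocateMessageQueueAveragelyByCircle. Also note that a MessageQueue is assigned to only one Consumer, so if there are more consumers than MessageQueues, some consumers will receive no messages.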
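The last point can be made concrete. Under the two averaging strategies, the consumers whose position in the sorted clientId list is at or beyond the number of queues end up with nothing. The helper below is a hypothetical illustration of that observation (`IdleConsumerSketch` is not a RocketMQ class); it assumes a non-negative queue count.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class IdleConsumerSketch {
    // Returns the clientIds that receive no queue when there are
    // fewer queues than consumers. Assumes queueCount >= 0.
    static List<String> idleConsumers(int queueCount, List<String> cids) {
        List<String> sorted = new ArrayList<String>(cids);
        Collections.sort(sorted); // same ordering every client uses
        if (queueCount >= sorted.size()) {
            return new ArrayList<String>(); // everyone gets at least one queue
        }
        return new ArrayList<String>(sorted.subList(queueCount, sorted.size()));
    }
}
```

For example, with 2 queues and consumers c1, c2, c3, only c3 is idle. Adding consumers beyond the queue count therefore adds standby capacity, not throughput.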

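AllocateMessageQueueAveragely source code

MessageQueues are allocated evenly, and each Consumer receives a contiguous run of queues: a consumer gets its queues only after the consumers ahead of it have received theirs. For example, with 8 message queues q1,q2,q3,q4,q5,q6,q7,q8 and 3 consumers c1,c2,c3, the average allocation strategy gives c1 and c2 three message queues each and c3 only two. The resulting allocation is: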

c1: q1,q2,q3

c2: q4,q5,q6

c3: q7,q8

public class AllocateMessageQueueAveragely extends AbstractAllocateMessageQueueStrategy {
    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID/* clientId of the current consumer */, List<MessageQueue> mqAll/* all MessageQueues of the topic */,
        List<String> cidAll/* clientIds of all consumers in the current consumerGroup */) {

        List<MessageQueue> result = new ArrayList<MessageQueue>();
        if (!check(consumerGroup, currentCID, mqAll, cidAll)) {
            return result;
        }
        // Position of the current consumer in the clientId list
        int index = cidAll.indexOf(currentCID);
        // Remainder of the MessageQueue count divided by the clientId count
        int mod = mqAll.size() % cidAll.size();
        // Number of queues allocated to the current consumer:
        // - if there are no more MessageQueues than consumers, each consumer gets at most one MessageQueue
        // - if mod > 0 and index < mod, this consumer gets one extra queue: mqAll.size() / cidAll.size() + 1
        // - otherwise (mod == 0 or index >= mod) it gets mqAll.size() / cidAll.size()
        int averageSize =
            mqAll.size() <= cidAll.size() ? 1 : (mod > 0 && index < mod ? mqAll.size() / cidAll.size()
                + 1 : mqAll.size() / cidAll.size());
        // Consumers with index < mod start at index * averageSize; the others must also skip the mod extra queues handed out before them
        int startIndex = (mod > 0 && index < mod) ? index * averageSize : index * averageSize + mod;
        // Number of queues actually allocated; the minimum is taken because some consumers may get no MessageQueue at all
        int range = Math.min(averageSize, mqAll.size() - startIndex);
        for (int i = 0; i < range; i++) {
            result.add(mqAll.get((startIndex + i) % mqAll.size()));
        }
        return result;
    }
}
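To check the example above by hand, the same arithmetic can be run on plain strings. This is a sketch for experimentation, assuming both lists are already sorted; `AveragelySketch` is a hypothetical stand-in that replaces MessageQueue with String and the check method with a simple guard.

```java
import java.util.ArrayList;
import java.util.List;

class AveragelySketch {
    // Same index arithmetic as the strategy above, applied to queue names.
    static List<String> allocate(String currentCID, List<String> mqAll, List<String> cidAll) {
        List<String> result = new ArrayList<String>();
        int index = cidAll.indexOf(currentCID);
        if (index < 0 || mqAll.isEmpty()) {
            return result; // unknown client or nothing to allocate
        }
        int mod = mqAll.size() % cidAll.size();
        int averageSize = mqAll.size() <= cidAll.size() ? 1
            : (mod > 0 && index < mod ? mqAll.size() / cidAll.size() + 1 : mqAll.size() / cidAll.size());
        int startIndex = (mod > 0 && index < mod) ? index * averageSize : index * averageSize + mod;
        // range can be negative for consumers beyond the last queue; the loop then does nothing
        int range = Math.min(averageSize, mqAll.size() - startIndex);
        for (int i = 0; i < range; i++) {
            result.add(mqAll.get((startIndex + i) % mqAll.size()));
        }
        return result;
    }
}
```

With queues q1 to q8 and consumers c1, c2, c3, mod is 2, so c1 and c2 (index 0 and 1) take three queues each starting at offsets 0 and 3, and c3 starts at 2 * 2 + 2 = 6 and takes the remaining two.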

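AllocateMessageQueueAveragelyByCircle source code

Queues are handed out to the consumers in order, round after round, until every message queue has been allocated. Taking the same example of 8 message queues and 3 consumers, AllocateMessageQueueAveragelyByCircle gives c1 and c2 three message queues each and c3 two. The resulting allocation is:

c1: q1,q4,q7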

c2: q2,q5,q8

c3: q3,q6

public class AllocateMessageQueueAveragelyByCircle extends AbstractAllocateMessageQueueStrategy {
    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {

        List<MessageQueue> result = new ArrayList<MessageQueue>();
        if (!check(consumerGroup, currentCID, mqAll, cidAll)) {
            return result;
        }

        int index = cidAll.indexOf(currentCID);
        // Iterate over the message queues
        for (int i = index; i < mqAll.size(); i++) {
            // Take the queue when its position modulo the number of clients equals this client's index
            if (i % cidAll.size() == index) {
                result.add(mqAll.get(i));
            }
        }
        return result;
    }
}
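The round-robin rule is easy to reproduce on plain strings. The sketch below assumes sorted inputs and steps through the list by the number of clients, which selects the same positions as the modulo test above; `CircleSketch` is a hypothetical name for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CircleSketch {
    // Takes every n-th queue starting at this client's index,
    // where n is the number of clients.
    static List<String> allocate(String currentCID, List<String> mqAll, List<String> cidAll) {
        List<String> result = new ArrayList<String>();
        int index = cidAll.indexOf(currentCID);
        if (index < 0) {
            return result; // unknown client
        }
        for (int i = index; i < mqAll.size(); i += cidAll.size()) {
            result.add(mqAll.get(i));
        }
        return result;
    }
}
```

Compared with the contiguous strategy, each consumer's queues are interleaved across the sorted list, which tends to spread one consumer's queues over more brokers when the list is sorted by broker name first.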

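AllocateMessageQueueByMachineRoom source code

During load balancing the Consumer is only allocated MessageQueues on brokers in the machine rooms it is bound to. To use AllocateMessageQueueByMachineRoom, brokerName must follow the format "machineRoomName@brokerName", and the list of machine room names to consume from must be assigned to consumeridcs. Allocation first filters the MessageQueues by machine room name and then distributes the remaining queues evenly.

Continuing the earlier example, suppose there are 8 message queues and 3 consumers, and q8 is not in a machine room the consumers target. After filtering, premqAll holds the queues q1,q2,q3,q4,q5,q6,q7, and the allocation is: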

c1: q1,q2,q7

c2: q3,q4

c3: q5,q6

public class AllocateMessageQueueByMachineRoom extends AbstractAllocateMessageQueueStrategy {
    // Names of the machine rooms this consumer consumes from
    private Set<String> consumeridcs;

    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {

        List<MessageQueue> result = new ArrayList<MessageQueue>();
        if (!check(consumerGroup, currentCID, mqAll, cidAll)) {
            return result;
        }
        // Position of the current clientId in the list
        int currentIndex = cidAll.indexOf(currentCID);
        if (currentIndex < 0) {
            return result;
        }
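        // MessageQueues whose brokerName has the form "machineRoomName@brokerName" and whose machine room is in consumeridcs
        List<MessageQueue> premqAll = new ArrayList<MessageQueue>();
        for (MessageQueue mq : mqAll) {
            String[] temp = mq.getBrokerName().split("@");
            // Keep the queue if brokerName has the form "machineRoomName@brokerName" and the room is one we consume from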
            if (temp.length == 2 && consumeridcs.contains(temp[0])) {
                premqAll.add(mq);
            }
        }

        int mod = premqAll.size() / cidAll.size();
        int rem = premqAll.size() % cidAll.size();
        int startIndex = mod * currentIndex;
        int endIndex = startIndex + mod;
        // The evenly divisible part is allocated contiguously
        for (int i = startIndex; i < endIndex; i++) {
            result.add(premqAll.get(i));
        }
        // The remainder is allocated one queue per consumer, starting from the first consumer
        if (rem > currentIndex) {
            result.add(premqAll.get(currentIndex + mod * cidAll.size()));
        }
        return result;
    }

    public void setConsumeridcs(Set<String> consumeridcs) {
        this.consumeridcs = consumeridcs;
    }
}
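The filter-then-divide logic can be replayed on strings to confirm the example allocation. This is a sketch under stated assumptions: each queue is represented by a name of the form "room@rest", the machine room set is passed as a parameter instead of a field, and `MachineRoomSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class MachineRoomSketch {
    static List<String> allocate(String currentCID, List<String> mqAll, List<String> cidAll,
        Set<String> consumeridcs) {
        List<String> result = new ArrayList<String>();
        int currentIndex = cidAll.indexOf(currentCID);
        if (currentIndex < 0) {
            return result;
        }
        // Keep only queues whose name is "room@rest" with a room we consume from
        List<String> premqAll = new ArrayList<String>();
        for (String mq : mqAll) {
            String[] temp = mq.split("@");
            if (temp.length == 2 && consumeridcs.contains(temp[0])) {
                premqAll.add(mq);
            }
        }
        int mod = premqAll.size() / cidAll.size(); // queues every consumer gets
        int rem = premqAll.size() % cidAll.size(); // leftover queues
        int startIndex = mod * currentIndex;
        for (int i = startIndex; i < startIndex + mod; i++) {
            result.add(premqAll.get(i));
        }
        // Leftovers sit at the tail of the list; consumer k takes the k-th one
        if (rem > currentIndex) {
            result.add(premqAll.get(currentIndex + mod * cidAll.size()));
        }
        return result;
    }
}
```

Note that the leftover queue comes from the tail of the filtered list, which is why c1 ends up with q1, q2 and q7 rather than three contiguous queues.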

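AllocateMachineRoomNearby source code

AllocateMachineRoomNearby is the machine-room-nearby allocation strategy. Using it requires two arguments:

// The strategy object that actually allocates the message queues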
private final AllocateMessageQueueStrategy allocateMessageQueueStrategy;//actual allocate strategy
// Machine room resolver
private final MachineRoomResolver machineRoomResolver;

The machine room resolver is an interface with two methods:

public interface MachineRoomResolver {
    // Resolve the machine room name of a message queue (MessageQueue)
    String brokerDeployIn(MessageQueue messageQueue);
    // Resolve the machine room name of a clientId
    String consumerDeployIn(String clientID);
}

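The main allocation logic of the machine-room-nearby strategy is:

Use the machine room resolver to group the message queues by machine room, and group the consumers by machine room.
Resolve the machine room of the current clientId, look up that room's message queues and clientId list, and allocate that room's queues with the configured allocation strategy.
If a machine room has no live client, its message queues are passed to the configured allocation strategy together with the full clientId list, so that all consumers share them.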

// org.apache.rocketmq.client.consumer.rebalance.AllocateMachineRoomNearby#allocate
public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
    List<String> cidAll) {

    List<MessageQueue> result = new ArrayList<MessageQueue>();

    // Group the message queues by machine room
    Map<String/* machine room name */, List<MessageQueue>> mr2Mq = new TreeMap<String, List<MessageQueue>>();
    for (MessageQueue mq : mqAll) {
        String brokerMachineRoom = machineRoomResolver.brokerDeployIn(mq);
        if (StringUtils.isNoneEmpty(brokerMachineRoom)) {
            if (mr2Mq.get(brokerMachineRoom) == null) {
                mr2Mq.put(brokerMachineRoom, new ArrayList<MessageQueue>());
            }
            mr2Mq.get(brokerMachineRoom).add(mq);
        } 
    }

    // Group the consumers by machine room
    Map<String/* machine room name */, List<String/*clientId*/>> mr2c = new TreeMap<String, List<String>>();
    for (String cid : cidAll) {
        String consumerMachineRoom = machineRoomResolver.consumerDeployIn(cid);
        if (StringUtils.isNoneEmpty(consumerMachineRoom)) {
            if (mr2c.get(consumerMachineRoom) == null) {
                mr2c.put(consumerMachineRoom, new ArrayList<String>());
            }
            mr2c.get(consumerMachineRoom).add(cid);
        } 
    }

    List<MessageQueue> allocateResults = new ArrayList<MessageQueue>();
    // Allocate the queues of the current consumer's own machine room with the configured strategy
    String currentMachineRoom = machineRoomResolver.consumerDeployIn(currentCID);
    List<MessageQueue> mqInThisMachineRoom = mr2Mq.remove(currentMachineRoom);
    List<String> consumerInThisMachineRoom = mr2c.get(currentMachineRoom);
    if (mqInThisMachineRoom != null && !mqInThisMachineRoom.isEmpty()) {
        allocateResults.addAll(allocateMessageQueueStrategy.allocate(consumerGroup, currentCID, mqInThisMachineRoom, consumerInThisMachineRoom));
    }

    // Allocate the queues of the remaining machine rooms that have no live Consumer, again with the configured strategy
    for (Entry<String, List<MessageQueue>> machineRoomEntry : mr2Mq.entrySet()) {
        if (!mr2c.containsKey(machineRoomEntry.getKey())) { // no alive consumer in the corresponding machine room, so all consumers share these queues
            allocateResults.addAll(allocateMessageQueueStrategy.allocate(consumerGroup, currentCID, machineRoomEntry.getValue(), cidAll));
        }
    }

    return allocateResults;
}
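A compact way to see the two phases (own room first, then orphaned rooms) is to replay them on strings. The sketch assumes that queue names and clientIds both look like "room@name", that the resolver is simply the text before '@', and that the inner strategy is a round-robin rule; `NearbySketch` and its helpers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class NearbySketch {
    // Hypothetical resolver: the machine room is the text before '@'
    static String roomOf(String s) {
        int at = s.indexOf('@');
        return at < 0 ? "" : s.substring(0, at);
    }

    // Inner strategy used for illustration: simple round robin
    static List<String> circle(String self, List<String> mqs, List<String> cids) {
        List<String> out = new ArrayList<String>();
        int index = cids.indexOf(self);
        for (int i = index; index >= 0 && i < mqs.size(); i += cids.size()) {
            out.add(mqs.get(i));
        }
        return out;
    }

    static Map<String, List<String>> groupByRoom(List<String> items) {
        Map<String, List<String>> groups = new TreeMap<String, List<String>>();
        for (String item : items) {
            String room = roomOf(item);
            if (room.isEmpty()) {
                continue; // items without a room are ignored
            }
            if (!groups.containsKey(room)) {
                groups.put(room, new ArrayList<String>());
            }
            groups.get(room).add(item);
        }
        return groups;
    }

    static List<String> allocate(String self, List<String> mqAll, List<String> cidAll) {
        Map<String, List<String>> mr2Mq = groupByRoom(mqAll);
        Map<String, List<String>> mr2c = groupByRoom(cidAll);
        List<String> results = new ArrayList<String>();

        // Phase 1: queues in my own room, shared with consumers of that room
        String myRoom = roomOf(self);
        List<String> mine = mr2Mq.remove(myRoom);
        List<String> peers = mr2c.get(myRoom);
        if (mine != null && !mine.isEmpty() && peers != null) {
            results.addAll(circle(self, mine, peers));
        }
        // Phase 2: rooms with no live consumer are shared by everyone
        for (Map.Entry<String, List<String>> e : mr2Mq.entrySet()) {
            if (!mr2c.containsKey(e.getKey())) {
                results.addAll(circle(self, e.getValue(), cidAll));
            }
        }
        return results;
    }
}
```

With queues in rooms A, B and C but consumers only in A and B, each consumer keeps all queues of its own room and the two queues of room C are split between them.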

AllocateMessageQueueConsistentHash source code

The consistent-hash allocation strategy has two parameters that can be set:

public class AllocateMessageQueueConsistentHash extends AbstractAllocateMessageQueueStrategy {
    // Number of virtual nodes per physical node; must be greater than or equal to 0, default 10
    private final int virtualNodeCnt;
    // Custom hash function; the default is MD5Hash
    private final HashFunction customHashFunction;
}

HashFunction is an interface that takes a String key and returns a hash value of type long.

public interface HashFunction {
    long hash(String key);
}

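Allocating message queues with consistent hashing involves the following steps:

Build a consistent-hash router (ConsistentHashRouter) from the clientId list. The router creates the virtual nodes and the hash ring; if no hash function is specified, MD5Hash is used by default.
Iterate over the message queues, hash each MessageQueue, and walk clockwise to the nearest ClientNode. If that ClientNode's clientId equals the current Consumer's clientId, the queue is added to the result.

Internally, the consistent-hash router (ConsistentHashRouter) implements the hash ring with a TreeMap.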

public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
    List<String> cidAll) {

    List<MessageQueue> result = new ArrayList<MessageQueue>();
    if (!check(consumerGroup, currentCID, mqAll, cidAll)) {
        return result;
    }
    // Wrap each clientId in a ClientNode
    Collection<ClientNode> cidNodes = new ArrayList<ClientNode>();
    for (String cid : cidAll) {
        cidNodes.add(new ClientNode(cid));
    }

    final ConsistentHashRouter<ClientNode> router; //for building hash ring
    if (customHashFunction != null) {
        router = new ConsistentHashRouter<ClientNode>(cidNodes, virtualNodeCnt, customHashFunction);
    } else {
        router = new ConsistentHashRouter<ClientNode>(cidNodes, virtualNodeCnt);
    }

    List<MessageQueue> results = new ArrayList<MessageQueue>();
    for (MessageQueue mq : mqAll) {
        // Find the node on the client hash ring that this message queue routes to
        ClientNode clientNode = router.routeNode(mq.toString());
        // If that node belongs to the current clientId, add the queue to the result
        if (clientNode != null && currentCID.equals(clientNode.getKey())) {
            results.add(mq);
        }
    }
    return results;
}
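The TreeMap-based ring can be sketched in a few lines. This is an illustration of the general technique, not RocketMQ's ConsistentHashRouter: the hash below takes the first 8 bytes of an MD5 digest, the virtual node naming scheme (clientId plus "-" plus a counter) is an assumption, and `HashRingSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class HashRingSketch {
    // Stand-in hash: first 8 bytes of the MD5 digest packed into a long
    static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Places virtualNodes points on the ring for every clientId
    static TreeMap<Long, String> buildRing(List<String> cids, int virtualNodes) {
        TreeMap<Long, String> ring = new TreeMap<Long, String>();
        for (String cid : cids) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(cid + "-" + i), cid);
            }
        }
        return ring;
    }

    // Walks clockwise: first node at or after the key's hash, wrapping around
    static String route(TreeMap<Long, String> ring, String key) {
        if (ring.isEmpty()) {
            return null;
        }
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    static List<String> allocate(String self, List<String> mqAll, List<String> cidAll, int virtualNodes) {
        TreeMap<Long, String> ring = buildRing(cidAll, virtualNodes);
        List<String> results = new ArrayList<String>();
        for (String mq : mqAll) {
            if (self.equals(route(ring, mq))) {
                results.add(mq);
            }
        }
        return results;
    }
}
```

The useful property is stability: when a consumer leaves, only the queues it owned move to other consumers, while everyone else keeps what they had. The trade-off, as noted earlier, is that the assignment is not evenly sized and is harder to predict by inspection.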

AllocateMessageQueueByConfig source code

The logic of AllocateMessageQueueByConfig is comparatively simple: the strategy returns whatever MessageQueue list was set on it.

public class AllocateMessageQueueByConfig extends AbstractAllocateMessageQueueStrategy {
    private List<MessageQueue> messageQueueList;

    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {
        // Return the configured message queues
        return this.messageQueueList;
    }
    // Configure the message queues
    public void setMessageQueueList(List<MessageQueue> messageQueueList) {
        this.messageQueueList = messageQueueList;
    }
}
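Because this strategy returns the configured list as-is, keeping the per-consumer lists disjoint is up to whoever writes the configuration. A small check like the following can catch mistakes before deployment. It is a hypothetical helper, not part of RocketMQ, and it assumes the planned configuration is available as a map from clientId to queue names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class StaticConfigCheckSketch {
    // Returns the queues that appear in the configuration of more than one client.
    static Set<String> overlaps(Map<String, List<String>> configByClient) {
        Set<String> seen = new HashSet<String>();
        Set<String> duplicated = new TreeSet<String>();
        for (List<String> queues : configByClient.values()) {
            // De-duplicate within one client so repeats in a single list are not flagged
            for (String queue : new HashSet<String>(queues)) {
                if (!seen.add(queue)) {
                    duplicated.add(queue);
                }
            }
        }
        return duplicated;
    }
}
```

An empty result means no queue is claimed by two consumers; it does not check that every queue of the topic is covered, which would need the full queue list as an extra input.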

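Summary

Starting from MQClientInstance#doRebalance, this article walked through the source code for message queue load balancing. The whole process consists of roughly three steps:

Get the message queue list and the clientId list, and sort both.
Get the allocation strategy, run the load balancing, and obtain the message queues allocated to the current consumer.
Update the process-queue table processQueueTable and send a heartbeat to the brokers to update the subscription.

We also went through the source code of the six load-balancing strategies RocketMQ provides, covering how each one allocates queues and when to use it.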