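Preface
The previous article walked through the producer client. This article starts on the consumer client source code, beginning with PartitionAssignor, to see how partitions are assigned to consumers.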
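Main text
In Kafka, a "topic" can have multiple "partitions". One or more consumers make up a "consumer group", and within a group a partition can be consumed by only one consumer. When several consumers subscribe to a topic, its partitions need to be spread across them as evenly as possible; for example, with 9 partitions and 3 consumers, each consumer gets 3 partitions.
Partition assignment is decided by the consumer client, not by the Kafka broker. When consuming messages we can choose the partitions ourselves, or let a built-in strategy do the assignment. The sections below analyze the built-in assignment strategies.
Abstract class: AbstractPartitionAssignor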
@Override
public GroupAssignment assign(Cluster metadata, GroupSubscription groupSubscription) {
Map<String, Subscription> subscriptions = groupSubscription.groupSubscription();
Set<String> allSubscribedTopics = new HashSet<>();
//Collect all subscribed topics
for (Map.Entry<String, Subscription> subscriptionEntry : subscriptions.entrySet())
allSubscribedTopics.addAll(subscriptionEntry.getValue().topics());
//Look up the partition count of each topic
Map<String, Integer> partitionsPerTopic = new HashMap<>();
for (String topic : allSubscribedTopics) {
Integer numPartitions = metadata.partitionCountForTopic(topic);
if (numPartitions != null && numPartitions > 0)
partitionsPerTopic.put(topic, numPartitions);
else
log.debug("Skipping assignment for topic {} since no metadata is available", topic);
}
//Compute the partitions assigned to each consumer in the group
Map<String, List<TopicPartition>> rawAssignments = assign(partitionsPerTopic, subscriptions);
// this class maintains no user data, so just wrap the results
Map<String, Assignment> assignments = new HashMap<>();
for (Map.Entry<String, List<TopicPartition>> assignmentEntry : rawAssignments.entrySet())
assignments.put(assignmentEntry.getKey(), new Assignment(assignmentEntry.getValue()));
return new GroupAssignment(assignments);
}
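This method, implemented in the abstract class, converts the metadata and subscription data into two mappings: "topic -> partition count" and "consumer -> subscribed topics". From them we can tell which topics each consumer subscribes to and how many partitions each topic has. The actual assignment is then delegated to the subclass implementation of assign(partitionsPerTopic, subscriptions).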
RangeAssignor
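This strategy works topic by topic: [partition count / consumer count] gives the base share per consumer, and the leftover partitions are handed out one each in consumer order.
Take the example in the figure above: 7 partitions / 3 consumers = 2 with 1 left over, so the extra partition goes to "consumer 1". If 2 were left over, the second extra partition would go to "consumer 2".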
@Override
public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
Map<String, Subscription> subscriptions) {
Map<String, List<MemberInfo>> consumersPerTopic = consumersPerTopic(subscriptions);
Map<String, List<TopicPartition>> assignment = new HashMap<>();
//Initialize the map
for (String memberId : subscriptions.keySet())
assignment.put(memberId, new ArrayList<>());
for (Map.Entry<String, List<MemberInfo>> topicEntry : consumersPerTopic.entrySet()) {
String topic = topicEntry.getKey();
List<MemberInfo> consumersForTopic = topicEntry.getValue();
Integer numPartitionsForTopic = partitionsPerTopic.get(topic);
if (numPartitionsForTopic == null)
continue;
//Sort the consumers
Collections.sort(consumersForTopic);
//How many partitions each consumer gets
int numPartitionsPerConsumer = numPartitionsForTopic / consumersForTopic.size();
//How many partitions are left over
int consumersWithExtraPartition = numPartitionsForTopic % consumersForTopic.size();
List<TopicPartition> partitions = AbstractPartitionAssignor.partitions(topic, numPartitionsForTopic);
//Walk the consumers and hand out partitions
for (int i = 0, n = consumersForTopic.size(); i < n; i++) {
int start = numPartitionsPerConsumer * i + Math.min(i, consumersWithExtraPartition);
int length = numPartitionsPerConsumer + (i + 1 > consumersWithExtraPartition ? 0 : 1);
assignment.get(consumersForTopic.get(i).memberId).addAll(partitions.subList(start, start + length));
}
}
return assignment;
}
The source is simple: sort the consumers, then use division and remainder to compute each consumer's range.
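The start/length arithmetic can be tried in isolation. The following is a minimal sketch for a single topic, not Kafka's code: `RangeSketch` and its `assign` method are hypothetical names, and consumers are represented only by their position in sorted order.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeSketch {
    // For one topic: returns the partition indexes given to each consumer,
    // where list position i stands for the i-th consumer in sorted order.
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        int perConsumer = numPartitions / numConsumers; // base share
        int extra = numPartitions % numConsumers;       // first `extra` consumers get one more
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < numConsumers; i++) {
            int start = perConsumer * i + Math.min(i, extra);
            int length = perConsumer + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int p = start; p < start + length; p++) {
                owned.add(p);
            }
            result.add(owned);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(7, 3)); // prints [[0, 1, 2], [3, 4], [5, 6]]
    }
}
```

Because the computation is repeated per topic, the first consumers in sorted order receive the extra partition of every topic, which is why Range can become lopsided when a group subscribes to many topics.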
RoundRobinAssignor
This strategy assigns partitions round-robin; the principle is also simple.
@Override
public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
Map<String, Subscription> subscriptions) {
Map<String, List<TopicPartition>> assignment = new HashMap<>();
List<MemberInfo> memberInfoList = new ArrayList<>();
for (Map.Entry<String, Subscription> memberSubscription : subscriptions.entrySet()) {
assignment.put(memberSubscription.getKey(), new ArrayList<>());
memberInfoList.add(new MemberInfo(memberSubscription.getKey(),
memberSubscription.getValue().groupInstanceId()));
}
//Use a circular iterator to cycle through the consumers
CircularIterator<MemberInfo> assigner = new CircularIterator<>(Utils.sorted(memberInfoList));
for (TopicPartition partition : allPartitionsSorted(partitionsPerTopic, subscriptions)) {
final String topic = partition.topic();
while (!subscriptions.get(assigner.peek().memberId).topics().contains(topic))
assigner.next();
assignment.get(assigner.next().memberId).add(partition);
}
return assignment;
}
private List<TopicPartition> allPartitionsSorted(Map<String, Integer> partitionsPerTopic,
Map<String, Subscription> subscriptions) {
//Sort the topics
SortedSet<String> topics = new TreeSet<>();
for (Subscription subscription : subscriptions.values())
topics.addAll(subscription.topics());
//Create the TopicPartition objects in order
List<TopicPartition> allPartitions = new ArrayList<>();
for (String topic : topics) {
Integer numPartitionsForTopic = partitionsPerTopic.get(topic);
if (numPartitionsForTopic != null)
allPartitions.addAll(AbstractPartitionAssignor.partitions(topic, numPartitionsForTopic));
}
return allPartitions;
}
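The core is a circular iterator over the consumers: each partition goes to the next consumer that subscribes to its topic, until every partition has been assigned.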
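To see how the circular cursor behaves when subscriptions differ, here is a small sketch under simplifying assumptions: `RoundRobinSketch` is a hypothetical class, partitions are plain "topic-index" strings instead of TopicPartition objects, and a modulo cursor stands in for CircularIterator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class RoundRobinSketch {
    // partitionsPerTopic: topic -> partition count; subscriptions: consumer -> subscribed topics.
    static Map<String, List<String>> assign(Map<String, Integer> partitionsPerTopic,
                                            Map<String, Set<String>> subscriptions) {
        List<String> consumers = new ArrayList<>(new TreeSet<>(subscriptions.keySet()));
        Map<String, List<String>> assignment = new TreeMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        int cursor = 0;
        for (String topic : new TreeSet<>(partitionsPerTopic.keySet())) {
            // Skip topics nobody subscribes to, otherwise the cursor would spin forever.
            boolean hasSubscriber = false;
            for (Set<String> topics : subscriptions.values()) {
                if (topics.contains(topic)) hasSubscriber = true;
            }
            if (!hasSubscriber) continue;
            for (int p = 0; p < partitionsPerTopic.get(topic); p++) {
                // Advance past consumers that do not subscribe to this topic.
                while (!subscriptions.get(consumers.get(cursor)).contains(topic)) {
                    cursor = (cursor + 1) % consumers.size();
                }
                assignment.get(consumers.get(cursor)).add(topic + "-" + p);
                cursor = (cursor + 1) % consumers.size();
            }
        }
        return assignment;
    }
}
```

With topics t0 (1 partition), t1 (2 partitions), t2 (3 partitions) and consumers C0 (t0), C1 (t0, t1), C2 (t0, t1, t2), the sketch gives C0 one partition, C1 one, and C2 four: round-robin is only even when all consumers have the same subscriptions.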
StickyAssignor
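StickyAssignor is the sticky strategy. RangeAssignor and RoundRobinAssignor, described above, only address "fairness": each assignment is as fair as possible, but neither takes the previous assignment into account. The sticky strategy records the previous assignment and computes a fair assignment on top of it.
Using the same layout as in the figure: consumer 1: 1, 2, 3; consumer 2: 4, 5; consumer 3: 6, 7
If consumer 1 now goes offline, RangeAssignor produces:
consumer 2: 1, 2, 3, 4; consumer 3: 5, 6, 7
StickyAssignor takes the previous assignment into account and produces, for example:
consumer 2: 4, 5, 1, 2; consumer 3: 6, 7, 3
It assigns fairly while keeping the existing assignment unchanged as far as possible.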
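One way to quantify the difference is to count partitions that change hands between consumers that are both still alive. The helper below is a hypothetical illustration (`MovementSketch` is not part of Kafka); both maps go from partition number to owning consumer.

```java
import java.util.Map;

final class MovementSketch {
    // Counts partitions whose previous owner is still in the group
    // but which now belong to a different consumer.
    static int movedBetweenLiveConsumers(Map<Integer, String> previousOwner,
                                         Map<Integer, String> newOwner) {
        int moved = 0;
        for (Map.Entry<Integer, String> entry : newOwner.entrySet()) {
            String before = previousOwner.get(entry.getKey());
            boolean previousOwnerAlive = before != null && newOwner.containsValue(before);
            if (previousOwnerAlive && !before.equals(entry.getValue())) {
                moved++;
            }
        }
        return moved;
    }
}
```

For the example above, the Range result moves partition 5 from consumer 2 to consumer 3 even though consumer 2 is still alive, so the count is 1; the sticky result only hands out the partitions of the departed consumer, so the count is 0.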
@Override
public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
Map<String, Subscription> subscriptions) {
partitionMovements = new PartitionMovements();
Map<String, List<TopicPartition>> consumerToOwnedPartitions = new HashMap<>();
//This branch is taken when every consumer in the group subscribes to the same topics
if (allSubscriptionsEqual(partitionsPerTopic.keySet(), subscriptions, consumerToOwnedPartitions)) {
log.debug("Detected that all consumers were subscribed to same set of topics, invoking the "
+ "optimized assignment algorithm");
partitionsTransferringOwnership = new HashMap<>();
return constrainedAssign(partitionsPerTopic, consumerToOwnedPartitions);
} else {
log.debug("Detected that all not consumers were subscribed to same set of topics, falling back to the "
+ "general case assignment algorithm");
partitionsTransferringOwnership = null;
return generalAssign(partitionsPerTopic, subscriptions);
}
}
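Many operations trigger a reassignment (adding or removing topic subscriptions, a consumer going offline, changing a topic's partition count), so StickyAssignor splits reassignment into two cases:
- every consumer in the group subscribes to the same list of topics
- the consumers' subscriptions differ
Case one: everyone subscribes to the same topics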
/**
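* Returns true if all consumers have identical subscriptions. Also fills the passed-in consumerToOwnedPartitions with each consumer's previously owned and still-subscribed partitions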
*/
private boolean allSubscriptionsEqual(Set<String> allTopics,
Map<String, Subscription> subscriptions,
Map<String, List<TopicPartition>> consumerToOwnedPartitions) {
Set<String> membersWithOldGeneration = new HashSet<>();
Set<String> membersOfCurrentHighestGeneration = new HashSet<>();
int maxGeneration = DEFAULT_GENERATION;
Set<String> subscribedTopics = new HashSet<>();
for (Map.Entry<String, Subscription> subscriptionEntry : subscriptions.entrySet()) {
String consumer = subscriptionEntry.getKey();
Subscription subscription = subscriptionEntry.getValue();
if (subscribedTopics.isEmpty()) {
subscribedTopics.addAll(subscription.topics());
//Used to check whether all consumers subscribe to the same topics
} else if (!(subscription.topics().size() == subscribedTopics.size()
&& subscribedTopics.containsAll(subscription.topics()))) {
return false;
}
//Get the previous assignment (previous_assignment)
MemberData memberData = memberData(subscription);
List<TopicPartition> ownedPartitions = new ArrayList<>();
consumerToOwnedPartitions.put(consumer, ownedPartitions);
if (memberData.generation.isPresent() && memberData.generation.get() >= maxGeneration
|| !memberData.generation.isPresent() && maxGeneration == DEFAULT_GENERATION) {
// If the current member's generation is higher, all partitions recorded from older generations are invalid
if (memberData.generation.isPresent() && memberData.generation.get() > maxGeneration) {
membersWithOldGeneration.addAll(membersOfCurrentHighestGeneration);
membersOfCurrentHighestGeneration.clear();
maxGeneration = memberData.generation.get();
}
membersOfCurrentHighestGeneration.add(consumer);
for (final TopicPartition tp : memberData.partitions) {
//Filter out topics that are no longer valid
if (allTopics.contains(tp.topic())) {
ownedPartitions.add(tp);
}
}
}
}
for (String consumer : membersWithOldGeneration) {
consumerToOwnedPartitions.get(consumer).clear();
}
return true;
}
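Iterate over the consumers' subscriptions and check whether they all subscribe to the same topics. If they do, compare each consumer's previous assignment with the currently subscribed topics and drop the partitions of topics that are no longer valid. This yields a valid assignment that may not be fair, so constrainedAssign is called to make it fair.
//Get the TopicPartition objects, sorted by topic, then by partition within a topic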
SortedSet<TopicPartition> unassignedPartitions = getTopicPartitions(partitionsPerTopic);
Set<TopicPartition> allRevokedPartitions = new HashSet<>();
// Members below quota
List<String> unfilledMembers = new LinkedList<>();
// Members at max quota
Queue<String> maxCapacityMembers = new LinkedList<>();
// Members at min quota
Queue<String> minCapacityMembers = new LinkedList<>();
int numberOfConsumers = consumerToOwnedPartitions.size();
// minQuota and maxQuota differ by at most 1
int minQuota = (int) Math.floor(((double) unassignedPartitions.size()) / numberOfConsumers);
int maxQuota = (int) Math.ceil(((double) unassignedPartitions.size()) / numberOfConsumers);
// Initialize the map with lists sized to minQuota
Map<String, List<TopicPartition>> assignment = new HashMap<>(
consumerToOwnedPartitions.keySet().stream().collect(Collectors.toMap(c -> c, c -> new ArrayList<>(minQuota))));
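Step one: turn the "topic -> partition count" data into TopicPartition objects. Because all consumers subscribe to the same topics, the minimum each consumer can receive is floor(partition count / consumer count) and the maximum is ceil(partition count / consumer count), which is the minimum plus 1 unless the division is exact.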
for (Map.Entry<String, List<TopicPartition>> consumerEntry : consumerToOwnedPartitions.entrySet()) {
String consumer = consumerEntry.getKey();
List<TopicPartition> ownedPartitions = consumerEntry.getValue();
List<TopicPartition> consumerAssignment = assignment.get(consumer);
int i = 0;
// Keep up to maxQuota partitions; anything beyond that goes into the "revoked" set
for (TopicPartition tp : ownedPartitions) {
if (i < maxQuota) {
consumerAssignment.add(tp);
unassignedPartitions.remove(tp);
} else {
allRevokedPartitions.add(tp);
}
++i;
}
//This consumer is below the average level
if (ownedPartitions.size() < minQuota) {
unfilledMembers.add(consumer);
} else {
//Add it to the matching queue
if (consumerAssignment.size() == minQuota)
minCapacityMembers.add(consumer);
if (consumerAssignment.size() == maxQuota)
maxCapacityMembers.add(consumer);
}
}
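Step two: tally the previous assignment. The main point is to check whether a consumer's previous assignment exceeds the maximum allowed this time; the excess is revoked, and the consumer is then placed in the matching collection.
After this step we have four collections:
- unfilledMembers: consumers below the minimum. These are the focus: each must be brought up to minQuota
- minCapacityMembers: consumers exactly at minQuota. They already hold their base share and are only touched again if partitions are left over at the end (step five)
- maxCapacityMembers: consumers at the allowed maximum; a partition can be taken from them if needed
- unassignedPartitions: partitions not yet assigned, mainly to be handed to the consumers in unfilledMembers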
Collections.sort(unfilledMembers);
Iterator<TopicPartition> unassignedPartitionsIter = unassignedPartitions.iterator();
//Hand partitions to the consumers that are below the average level
while (!unfilledMembers.isEmpty() && !unassignedPartitions.isEmpty()) {
Iterator<String> unfilledConsumerIter = unfilledMembers.iterator();
while (unfilledConsumerIter.hasNext()) {
String consumer = unfilledConsumerIter.next();
List<TopicPartition> consumerAssignment = assignment.get(consumer);
if (unassignedPartitionsIter.hasNext()) {
TopicPartition tp = unassignedPartitionsIter.next();
consumerAssignment.add(tp);
unassignedPartitionsIter.remove();
//If the partition is in the revoked set, it is being moved over from another consumer
if (allRevokedPartitions.contains(tp))
partitionsTransferringOwnership.put(tp, consumer);
} else {
break;
}
if (consumerAssignment.size() == minQuota) {
minCapacityMembers.add(consumer);
unfilledConsumerIter.remove();
}
}
}
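Step three: take partitions from the "unassigned" set and give them to consumers below the minimum, handing out as many unassigned partitions as possible.
//After the remaining partitions are handed out, consumers still below the minimum take partitions from maxCapacityMembers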
for (String consumer : unfilledMembers) {
List<TopicPartition> consumerAssignment = assignment.get(consumer);
int remainingCapacity = minQuota - consumerAssignment.size();
while (remainingCapacity > 0) {
String overloadedConsumer = maxCapacityMembers.poll();
//No max-capacity consumer is left to take from; this should not happen, hence the IllegalStateException
if (overloadedConsumer == null) {
throw new IllegalStateException("Some consumers are under capacity but all partitions have been assigned");
}
TopicPartition swappedPartition = assignment.get(overloadedConsumer).remove(0);
consumerAssignment.add(swappedPartition);
--remainingCapacity;
partitionsTransferringOwnership.put(swappedPartition, consumer);
}
minCapacityMembers.add(consumer);
}
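Step four: after step three some consumers may still be below the minimum; each of them takes partitions from consumers in maxCapacityMembers.
//If partitions are still unassigned, hand them out to consumers in order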
for (TopicPartition unassignedPartition : unassignedPartitions) {
String underCapacityConsumer = minCapacityMembers.poll();
if (underCapacityConsumer == null) {
throw new IllegalStateException("Some partitions are unassigned but all consumers are at maximum capacity");
}
// We can skip the bookkeeping of unassignedPartitions and maxCapacityMembers here since we are at the end
assignment.get(underCapacityConsumer).add(unassignedPartition);
if (allRevokedPartitions.contains(unassignedPartition))
partitionsTransferringOwnership.put(unassignedPartition, underCapacityConsumer);
}
return assignment;
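Step five: after the steps above every consumer has reached the minimum, but some partitions may still be unassigned; they are handed out in order to the consumers at minimum capacity.
(Tracing the code above suggests step five is reached fairly often: whenever partitions remain after every consumer has reached minQuota, for example 7 partitions and two consumers that each already own 2.)
Summary: because all subscriptions are identical, partitions can be handed out without checking who is allowed to own what; the basic logic is to adjust the previous assignment.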
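The overall idea of this case (keep what each consumer already owns within its quota, then top up) can be condensed into a much smaller sketch. This is a simplification, not Kafka's constrainedAssign: `StickySketch` is a hypothetical class, partitions are numbered 1..numPartitions, and the five steps are collapsed into two passes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

final class StickySketch {
    // owned: consumer -> partitions it held before the rebalance.
    static Map<String, List<Integer>> assign(int numPartitions, Map<String, List<Integer>> owned) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        if (owned.isEmpty()) return assignment;

        int minQuota = numPartitions / owned.size();
        int maxSlots = numPartitions % owned.size(); // consumers allowed to hold minQuota + 1
        TreeSet<Integer> unassigned = new TreeSet<>();
        for (int p = 1; p <= numPartitions; p++) unassigned.add(p);

        // Pass 1: every consumer keeps what it already owns, up to its quota.
        Map<String, List<Integer>> sortedOwned = new TreeMap<>(owned);
        for (Map.Entry<String, List<Integer>> entry : sortedOwned.entrySet()) {
            int cap = minQuota;
            if (maxSlots > 0 && entry.getValue().size() > minQuota) {
                cap = minQuota + 1;
                maxSlots--;
            }
            List<Integer> kept = new ArrayList<>();
            for (Integer p : entry.getValue()) {
                if (kept.size() < cap && unassigned.remove(p)) kept.add(p);
            }
            assignment.put(entry.getKey(), kept);
        }

        // Pass 2: each leftover partition goes to the consumer holding the fewest.
        for (Integer p : unassigned) {
            String target = null;
            for (String consumer : assignment.keySet()) {
                if (target == null || assignment.get(consumer).size() < assignment.get(target).size()) {
                    target = consumer;
                }
            }
            assignment.get(target).add(p);
        }
        return assignment;
    }
}
```

For 7 partitions with c2 owning 4, 5 and c3 owning 6, 7, both keep their partitions and 1, 2, 3 are spread so that one consumer ends with four partitions and the other with three. Which consumer receives which leftover partition is a tie-breaking detail and may differ from Kafka's result.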
Case two: subscriptions differ
private Map<String, List<TopicPartition>> generalAssign(Map<String, Integer> partitionsPerTopic,
Map<String, Subscription> subscriptions) {
//The partitions each current consumer was assigned last time (named currentAssignment, but it holds the state before this round)
Map<String, List<TopicPartition>> currentAssignment = new HashMap<>();
//Which consumer each topic partition was assigned to previously
Map<TopicPartition, ConsumerGenerationPair> prevAssignment = new HashMap<>();
prepopulateCurrentAssignments(subscriptions, currentAssignment, prevAssignment);
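This method also starts by loading the previous assignment. prepopulateCurrentAssignments is skipped here, since it is much the same as before.
//Subscriptions differ, so record which consumers each partition can be assigned to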
final Map<TopicPartition, List<String>> partition2AllPotentialConsumers = new HashMap<>();
// Record which partitions each consumer can be assigned
final Map<String, List<TopicPartition>> consumer2AllPotentialPartitions = new HashMap<>();
//Initialize partition2AllPotentialConsumers
for (Entry<String, Integer> entry: partitionsPerTopic.entrySet()) {
for (int i = 0; i < entry.getValue(); ++i)
partition2AllPotentialConsumers.put(new TopicPartition(entry.getKey(), i), new ArrayList<>());
}
for (Entry<String, Subscription> entry: subscriptions.entrySet()) {
String consumerId = entry.getKey();
consumer2AllPotentialPartitions.put(consumerId, new ArrayList<>());
entry.getValue().topics().stream().filter(topic -> partitionsPerTopic.get(topic) != null).forEach(topic -> {
for (int i = 0; i < partitionsPerTopic.get(topic); ++i) {
TopicPartition topicPartition = new TopicPartition(topic, i);
consumer2AllPotentialPartitions.get(consumerId).add(topicPartition);
partition2AllPotentialConsumers.get(topicPartition).add(consumerId);
}
});
// Handles newly added consumers: put the new consumer
if (!currentAssignment.containsKey(consumerId))
currentAssignment.put(consumerId, new ArrayList<>());
}
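Because each consumer subscribes to different topics, two maps are derived from the current subscriptions and partition information:
- which consumers a topic partition can be assigned to: partition2AllPotentialConsumers
- which topic partitions a consumer can be assigned: consumer2AllPotentialPartitions
Newly joined consumers are also added to currentAssignment. currentAssignment holds the previous assignment, and this round's assignment is made by modifying it.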
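A toy version of the two lookup maps makes the later "can this partition move?" checks easier to follow. The sketch below is illustrative only: `PotentialMapsSketch` and its method names are hypothetical, and partitions are "topic-index" strings rather than TopicPartition objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PotentialMapsSketch {
    // partition -> consumers that subscribe to its topic
    static Map<String, List<String>> partitionToConsumers(Map<String, Integer> partitionsPerTopic,
                                                          Map<String, List<String>> subscriptions) {
        Map<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, Integer> topic : partitionsPerTopic.entrySet()) {
            for (int i = 0; i < topic.getValue(); i++) {
                result.put(topic.getKey() + "-" + i, new ArrayList<>());
            }
        }
        Map<String, List<String>> sortedSubscriptions = new TreeMap<>(subscriptions);
        for (Map.Entry<String, List<String>> sub : sortedSubscriptions.entrySet()) {
            for (String topic : sub.getValue()) {
                Integer count = partitionsPerTopic.get(topic);
                if (count == null) continue; // no metadata for this topic
                for (int i = 0; i < count; i++) {
                    result.get(topic + "-" + i).add(sub.getKey());
                }
            }
        }
        return result;
    }

    // consumer -> every partition of every topic it subscribes to
    static Map<String, List<String>> consumerToPartitions(Map<String, Integer> partitionsPerTopic,
                                                          Map<String, List<String>> subscriptions) {
        Map<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> sub : subscriptions.entrySet()) {
            List<String> partitions = new ArrayList<>();
            for (String topic : sub.getValue()) {
                Integer count = partitionsPerTopic.get(topic);
                if (count == null) continue;
                for (int i = 0; i < count; i++) partitions.add(topic + "-" + i);
            }
            result.put(sub.getKey(), partitions);
        }
        return result;
    }
}
```

With topic a (2 partitions), topic b (1 partition), c1 subscribed to a and c2 subscribed to a and b, partition b-0 has a single potential consumer, so it can never be moved during balancing.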
//Build a "partition -> consumer" mapping from the previous assignment
Map<TopicPartition, String> currentPartitionConsumer = new HashMap<>();
for (Map.Entry<String, List<TopicPartition>> entry: currentAssignment.entrySet())
for (TopicPartition topicPartition: entry.getValue())
currentPartitionConsumer.put(topicPartition, entry.getKey());
List<TopicPartition> sortedPartitions = sortPartitions(partition2AllPotentialConsumers);
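Map the previous assignment by partition, then collect and sort the topic partitions that can currently be assigned.
//Start from the sorted partitions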
List<TopicPartition> unassignedPartitions = new ArrayList<>(sortedPartitions);
boolean revocationRequired = false;
for (Iterator<Entry<String, List<TopicPartition>>> it = currentAssignment.entrySet().iterator(); it.hasNext();) {
Map.Entry<String, List<TopicPartition>> entry = it.next();
//If the consumer has left the group, remove it
if (!subscriptions.containsKey(entry.getKey())) {
for (TopicPartition topicPartition: entry.getValue())
currentPartitionConsumer.remove(topicPartition);
it.remove();
} else {
for (Iterator<TopicPartition> partitionIter = entry.getValue().iterator(); partitionIter.hasNext();) {
TopicPartition partition = partitionIter.next();
//If the topic changed and the partition no longer exists, remove it
if (!partition2AllPotentialConsumers.containsKey(partition)) {
partitionIter.remove();
currentPartitionConsumer.remove(partition);
//If this consumer no longer subscribes to the topic, remove it
} else if (!subscriptions.get(entry.getKey()).topics().contains(partition.topic())) {
partitionIter.remove();
revocationRequired = true;
} else
unassignedPartitions.remove(partition);
}
}
}
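Then the previous assignment is cleaned up: keeping the original assignment as the base, entries for topics or partitions that are no longer valid are removed.
That completes a pre-assignment built on the previous one. The steps are much like case one; the difference is that subscriptions differ, so more information has to be derived.
// Sort by number of assigned partitions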
TreeSet<String> sortedCurrentSubscriptions = new TreeSet<>(new SubscriptionComparator(currentAssignment));
sortedCurrentSubscriptions.addAll(currentAssignment.keySet());
balance(currentAssignment, prevAssignment, sortedPartitions, unassignedPartitions, sortedCurrentSubscriptions,
consumer2AllPotentialPartitions, partition2AllPotentialConsumers, currentPartitionConsumer, revocationRequired);
return currentAssignment;
The consumers are sorted by number of assigned partitions, then balance is called to do the balancing.
private void balance(Map<String, List<TopicPartition>> currentAssignment,
Map<TopicPartition, ConsumerGenerationPair> prevAssignment,
List<TopicPartition> sortedPartitions,
List<TopicPartition> unassignedPartitions,
TreeSet<String> sortedCurrentSubscriptions,
Map<String, List<TopicPartition>> consumer2AllPotentialPartitions,
Map<TopicPartition, List<String>> partition2AllPotentialConsumers,
Map<TopicPartition, String> currentPartitionConsumer,
boolean revocationRequired) {
boolean initializing = currentAssignment.get(sortedCurrentSubscriptions.last()).isEmpty();
boolean reassignmentPerformed = false;
//Iterate over the unassigned partitions
for (TopicPartition partition: unassignedPartitions) {
if (partition2AllPotentialConsumers.get(partition).isEmpty())
continue;
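//Consumers are sorted by assigned count, so walk that sorted set to prefer the consumer with the fewest partitions
//All that remains is to check whether that consumer may take this partition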
assignPartition(partition, sortedCurrentSubscriptions, currentAssignment,
consumer2AllPotentialPartitions, currentPartitionConsumer);
}
private void assignPartition(TopicPartition partition,
TreeSet<String> sortedCurrentSubscriptions,
Map<String, List<TopicPartition>> currentAssignment,
Map<String, List<TopicPartition>> consumer2AllPotentialPartitions,
Map<TopicPartition, String> currentPartitionConsumer) {
for (String consumer: sortedCurrentSubscriptions) {
if (consumer2AllPotentialPartitions.get(consumer).contains(partition)) {
sortedCurrentSubscriptions.remove(consumer);
currentAssignment.get(consumer).add(partition);
currentPartitionConsumer.put(partition, consumer);
sortedCurrentSubscriptions.add(consumer);
break;
}
}
}
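initializing looks at the consumer that sorts last by "assigned partition count", i.e. the one holding the most partitions. If its assignment is empty, nothing is currently assigned (the previous assignment has nothing to do with the current state because the topics or consumers changed completely), and in that case the rollback at the end is skipped.
reassignmentPerformed records whether any rebalancing move was made.
First, all unassigned partitions are handed out, regardless of balance.
// Everything is assigned at this point, but possibly unevenly
//This step separates out the partitions that can be reassigned, i.e. those with more than one potential consumer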
Set<TopicPartition> fixedPartitions = new HashSet<>();
for (TopicPartition partition: partition2AllPotentialConsumers.keySet())
//Filter out partitions with only one potential consumer; they cannot be reassigned
if (!canParticipateInReassignment(partition, partition2AllPotentialConsumers))
fixedPartitions.add(partition);
sortedPartitions.removeAll(fixedPartitions);
unassignedPartitions.removeAll(fixedPartitions);
//Whether the partition can be reassigned: it must have 2 or more potential consumers
private boolean canParticipateInReassignment(TopicPartition partition,
Map<TopicPartition, List<String>> partition2AllPotentialConsumers) {
return partition2AllPotentialConsumers.get(partition).size() >= 2;
}
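Next, narrow the set of partitions to rebalance by filtering out partitions that can only go to one consumer, for example when only one consumer subscribes to the topic; such partitions cannot be reassigned.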
Map<String, List<TopicPartition>> fixedAssignments = new HashMap<>();
for (String consumer: consumer2AllPotentialPartitions.keySet())
if (!canParticipateInReassignment(consumer, currentAssignment,
consumer2AllPotentialPartitions, partition2AllPotentialConsumers)) {
sortedCurrentSubscriptions.remove(consumer);
fixedAssignments.put(consumer, currentAssignment.remove(consumer));
}
private boolean canParticipateInReassignment(String consumer,
Map<String, List<TopicPartition>> currentAssignment,
Map<String, List<TopicPartition>> consumer2AllPotentialPartitions,
Map<TopicPartition, List<String>> partition2AllPotentialConsumers) {
List<TopicPartition> currentPartitions = currentAssignment.get(consumer);
int currentAssignmentSize = currentPartitions.size();
int maxAssignmentSize = consumer2AllPotentialPartitions.get(consumer).size();
if (currentAssignmentSize > maxAssignmentSize)
log.error("The consumer {} is assigned more partitions than the maximum possible.", consumer);
//Condition that allows reassignment
if (currentAssignmentSize < maxAssignmentSize)
return true;
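//If the current count equals the maximum possible, this consumer holds every partition it could ever get
//Check whether any of them has other potential consumers; if so, it can still take part in reassignment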
for (TopicPartition partition: currentPartitions)
if (canParticipateInReassignment(partition, partition2AllPotentialConsumers))
return true;
return false;
}
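This step selects the consumers that can take part in reassignment. The condition is: current assigned count < maximum possible count. If the two are equal, fall back to the previous check: whether any of the consumer's partitions can be assigned to more than one consumer.
This is convoluted, but the core idea is to set aside consumers whose assignment cannot change, such as one that holds only partitions of topics nobody else subscribes to.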
Map<String, List<TopicPartition>> preBalanceAssignment = deepCopy(currentAssignment);
Map<TopicPartition, String> preBalancePartitionConsumers = new HashMap<>(currentPartitionConsumer);
// If no topic was unsubscribed, first run with unassignedPartitions as the argument
if (!revocationRequired) {
performReassignments(unassignedPartitions, currentAssignment, prevAssignment, sortedCurrentSubscriptions,
consumer2AllPotentialPartitions, partition2AllPotentialConsumers, currentPartitionConsumer);
}
//Run reassignment over all partitions
reassignmentPerformed = performReassignments(sortedPartitions, currentAssignment, prevAssignment, sortedCurrentSubscriptions,
consumer2AllPotentialPartitions, partition2AllPotentialConsumers, currentPartitionConsumer);
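Copy the original assignment, then call performReassignments to reassign. There is one check here: whether any subscription was revoked. The difference is the first argument passed to performReassignments; the reason becomes clear from that method's source.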
private boolean performReassignments(List<TopicPartition> reassignablePartitions,
Map<String, List<TopicPartition>> currentAssignment,
Map<TopicPartition, ConsumerGenerationPair> prevAssignment,
TreeSet<String> sortedCurrentSubscriptions,
Map<String, List<TopicPartition>> consumer2AllPotentialPartitions,
Map<TopicPartition, List<String>> partition2AllPotentialConsumers,
Map<TopicPartition, String> currentPartitionConsumer) {
boolean reassignmentPerformed = false;
boolean modified;
do {
modified = false;
Iterator<TopicPartition> partitionIterator = reassignablePartitions.iterator();
while (partitionIterator.hasNext() && !isBalanced(currentAssignment, sortedCurrentSubscriptions, consumer2AllPotentialPartitions)) {
TopicPartition partition = partitionIterator.next();
// A partition needs at least two potential consumers to be reassigned
if (partition2AllPotentialConsumers.get(partition).size() <= 1)
log.error("Expected more than one potential consumer for partition '{}'", partition);
// The partition must have an owner
String consumer = currentPartitionConsumer.get(partition);
if (consumer == null)
log.error("Expected partition '{}' to be assigned to a consumer", partition);
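//The partition had an owner in the previous generation, and its current owner holds more than that previous owner's count + 1: move it back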
if (prevAssignment.containsKey(partition) &&
currentAssignment.get(consumer).size() > currentAssignment.get(prevAssignment.get(partition).consumer).size() + 1) {
reassignPartition(partition, currentAssignment, sortedCurrentSubscriptions, currentPartitionConsumer, prevAssignment.get(partition).consumer);
reassignmentPerformed = true;
modified = true;
continue;
}
// The partition has several potential consumers; if its owner holds more than another one's count + 1, it can be moved
for (String otherConsumer: partition2AllPotentialConsumers.get(partition)) {
if (currentAssignment.get(consumer).size() > currentAssignment.get(otherConsumer).size() + 1) {
reassignPartition(partition, currentAssignment, sortedCurrentSubscriptions, currentPartitionConsumer, consumer2AllPotentialPartitions);
reassignmentPerformed = true;
modified = true;
break;
}
}
}
} while (modified);
return reassignmentPerformed;
}
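The topic partitions are iterated to check whether balance has been reached. When does a partition need to move?
Say "partition 1" is currently assigned to "consumer 1" and could go to either "consumer 1" or "consumer 2". If "consumer 1" holds more partitions than "consumer 2" holds plus 1, the partition can be moved to "consumer 2".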
private boolean isBalanced(Map<String, List<TopicPartition>> currentAssignment,
TreeSet<String> sortedCurrentSubscriptions,
Map<String, List<TopicPartition>> allSubscriptions) {
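//By the sort order, if the largest and smallest assigned counts differ by at most 1, all consumers are balanced
//This is the ideal case and unlikely with multiple topics; it is only a quick check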
int min = currentAssignment.get(sortedCurrentSubscriptions.first()).size();
int max = currentAssignment.get(sortedCurrentSubscriptions.last()).size();
if (min >= max - 1)
return true;
// Map each assigned partition to its current owner
final Map<TopicPartition, String> allPartitions = new HashMap<>();
Set<Entry<String, List<TopicPartition>>> assignments = currentAssignment.entrySet();
for (Map.Entry<String, List<TopicPartition>> entry: assignments) {
List<TopicPartition> topicPartitions = entry.getValue();
for (TopicPartition topicPartition: topicPartitions) {
if (allPartitions.containsKey(topicPartition))
log.error("{} is assigned to more than one consumer.", topicPartition);
allPartitions.put(topicPartition, entry.getKey());
}
}
for (String consumer: sortedCurrentSubscriptions) {
List<TopicPartition> consumerPartitions = currentAssignment.get(consumer);
int consumerPartitionCount = consumerPartitions.size();
// Skip the consumer if it already holds every partition it can have
if (consumerPartitionCount == allSubscriptions.get(consumer).size())
continue;
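//Walk every partition this consumer could hold and check whether it holds it
//If not, another consumer holds it; if this consumer has fewer partitions than that one,
//the partition could be taken over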
List<TopicPartition> potentialTopicPartitions = allSubscriptions.get(consumer);
for (TopicPartition topicPartition: potentialTopicPartitions) {
if (!currentAssignment.get(consumer).contains(topicPartition)) {
String otherConsumer = allPartitions.get(topicPartition);
int otherConsumerPartitionCount = currentAssignment.get(otherConsumer).size();
if (consumerPartitionCount < otherConsumerPartitionCount) {
log.debug("{} can be moved from consumer {} to consumer {} for a more balanced assignment.",
topicPartition, otherConsumer, consumer);
return false;
}
}
}
}
return true;
}
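That is the balance check. For each consumer it takes the current assigned count, then walks all partitions the consumer could hold. If one of them is held by another consumer that has more partitions, it could be taken over, so the assignment is not balanced.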
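The check can be reproduced on toy data. The sketch below mirrors the logic of the listing above with plain strings; `BalanceCheckSketch` is a hypothetical class, and `potential` plays the role of consumer2AllPotentialPartitions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BalanceCheckSketch {
    // assignment: consumer -> partitions it holds; potential: consumer -> partitions it could hold.
    static boolean isBalanced(Map<String, List<String>> assignment,
                              Map<String, List<String>> potential) {
        int min = Integer.MAX_VALUE;
        int max = 0;
        Map<String, String> owner = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : assignment.entrySet()) {
            min = Math.min(min, entry.getValue().size());
            max = Math.max(max, entry.getValue().size());
            for (String partition : entry.getValue()) owner.put(partition, entry.getKey());
        }
        // Quick check: counts differ by at most one.
        if (min >= max - 1) return true;

        // Otherwise look for a partition that a less loaded consumer could take over.
        for (Map.Entry<String, List<String>> entry : assignment.entrySet()) {
            int mine = entry.getValue().size();
            for (String partition : potential.getOrDefault(entry.getKey(), List.of())) {
                String other = owner.get(partition);
                if (other != null && !other.equals(entry.getKey())
                        && mine < assignment.get(other).size()) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Note the second outcome this allows: if c1 holds three partitions of topic a and c2 holds the single partition of topic b, and neither subscribes to the other's topic, the counts differ by two but the assignment still counts as balanced because nothing can move.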
//If reassignment happened but the result is no better than before
if (!initializing && reassignmentPerformed && getBalanceScore(currentAssignment) >= getBalanceScore(preBalanceAssignment)) {
deepCopy(preBalanceAssignment, currentAssignment);
currentPartitionConsumer.clear();
currentPartitionConsumer.putAll(preBalancePartitionConsumers);
}
// Restore the assignments that could not change
for (Entry<String, List<TopicPartition>> entry: fixedAssignments.entrySet()) {
String consumer = entry.getKey();
currentAssignment.put(consumer, entry.getValue());
sortedCurrentSubscriptions.add(consumer);
}
fixedAssignments.clear();
If all that maneuvering turns out no better than the original assignment, roll back.
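The listing does not show getBalanceScore. The sketch below assumes the score is the sum, over all pairs of consumers, of the absolute difference in their assigned partition counts, so that 0 means perfectly even and lower is better; that reading is consistent with the `>=` comparison in the rollback condition, but treat it as an assumption. `BalanceScoreSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class BalanceScoreSketch {
    // Assumed definition: sum of |size(i) - size(j)| over all consumer pairs.
    static int balanceScore(Map<String, List<String>> assignment) {
        List<Integer> sizes = new ArrayList<>();
        for (List<String> partitions : assignment.values()) {
            sizes.add(partitions.size());
        }
        int score = 0;
        for (int i = 0; i < sizes.size(); i++) {
            for (int j = i + 1; j < sizes.size(); j++) {
                score += Math.abs(sizes.get(i) - sizes.get(j));
            }
        }
        return score;
    }
}
```

Under this definition, an assignment with sizes 3 and 1 scores 2, while sizes 2 and 2 score 0, so a reassignment that ends with an equal or higher score than the pre-balance copy is discarded.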
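Summary
The code is very long, but most of it converts between mappings. The core idea is:
- Take the previous assignment and prune it against the current subscriptions, removing stale entries; what remains is the valid assignment.
- Balance that valid assignment: if a partition can go to several consumers, it should end up with the one currently holding the fewest; if it is not, the partition is moved.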
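CooperativeStickyAssignor
Newer Kafka versions have two rebalance protocols, COOPERATIVE and EAGER. The three strategies above use EAGER, while CooperativeStickyAssignor uses COOPERATIVE.
In a large deployment with hundreds or thousands of consumers, members come and go and subscriptions change all the time. EAGER then performs a large-scale reassignment every time; even with the sticky strategy this is still slow. The cooperative protocol replaces one global rebalance with a series of small rebalances that eventually converge to a balanced state.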
@Override
protected MemberData memberData(Subscription subscription) {
return new MemberData(subscription.ownedPartitions(), Optional.empty());
}
@Override
public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
Map<String, Subscription> subscriptions) {
Map<String, List<TopicPartition>> assignments = super.assign(partitionsPerTopic, subscriptions);
Map<TopicPartition, String> partitionsTransferringOwnership = super.partitionsTransferringOwnership == null ?
computePartitionsTransferringOwnership(subscriptions, assignments) :
super.partitionsTransferringOwnership;
adjustAssignment(assignments, partitionsTransferringOwnership);
return assignments;
}
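When a rebalance happens, each consumer sends its current assignment, which is stored as ownedPartitions; this memberData method simply reads ownedPartitions.
(One odd thing: why does StickyAssignor recover the previous assignment by parsing binary data, while CooperativeStickyAssignor uses the partitions sent directly by the consumer?)
Judging from the assign code, it still runs the full AbstractStickyAssignor logic, the same way StickyAssignor assigns, and the step that follows removes the reassigned partitions from the result!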
private void adjustAssignment(Map<String, List<TopicPartition>> assignments,
Map<TopicPartition, String> partitionsTransferringOwnership) {
for (Map.Entry<TopicPartition, String> partitionEntry : partitionsTransferringOwnership.entrySet()) {
assignments.get(partitionEntry.getValue()).remove(partitionEntry.getKey());
}
}
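The partitions that were just reassigned are removed: each partition that would change owner is taken out of its new owner's list, so it is not handed over in this round.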
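The effect of adjustAssignment is easier to see on a toy example. The sketch below is a simplified stand-in, not Kafka's code: `CooperativeSketch` is hypothetical, partitions are strings, and `transferring` approximates computePartitionsTransferringOwnership by comparing owned and newly assigned partitions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CooperativeSketch {
    // partition -> new owner, for partitions currently owned by a different consumer
    static Map<String, String> transferring(Map<String, List<String>> owned,
                                            Map<String, List<String>> assigned) {
        Map<String, String> previousOwner = new HashMap<>();
        owned.forEach((consumer, partitions) ->
                partitions.forEach(p -> previousOwner.put(p, consumer)));
        Map<String, String> result = new TreeMap<>();
        assigned.forEach((consumer, partitions) -> partitions.forEach(p -> {
            String before = previousOwner.get(p);
            if (before != null && !before.equals(consumer)) result.put(p, consumer);
        }));
        return result;
    }

    // Removes every transferring partition from its new owner's list.
    static Map<String, List<String>> adjust(Map<String, List<String>> owned,
                                            Map<String, List<String>> assigned) {
        Map<String, List<String>> adjusted = new TreeMap<>();
        assigned.forEach((consumer, partitions) ->
                adjusted.put(consumer, new ArrayList<>(partitions)));
        transferring(owned, assigned).forEach((partition, newOwner) ->
                adjusted.get(newOwner).remove(partition));
        return adjusted;
    }
}
```

If c1 owns p0, p1, p2 and a new consumer c2 joins, the sticky step may give p2 to c2; after the adjustment c1 keeps p0, p1 and c2 gets nothing yet. As far as I understand the cooperative protocol, c1 first revokes p2, and a follow-up rebalance then hands it to c2, so untouched partitions are never interrupted.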
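Summary
Those are the four partition assignment flows. For the last one, CooperativeStickyAssignor, the whole picture only emerges together with the server side, so that gap is left for a later article.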