KafkaConsumer poll Analysis, Part 1: Joining the Consumer Group

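1 Introduction

When KafkaConsumer polls for messages, the overall flow is roughly as follows:

This article analyzes the first step inside updateAssignmentMetadataIfNeeded: calling the poll method of ConsumerCoordinator so that the consumer joins its consumer group.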

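2 Source Code Analysis

The poll method of ConsumerCoordinator ensures that the group's coordinator is known and that this consumer has joined the group. It also drives the periodic offset commit.

2.1 Walking Through the Flow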

public boolean poll(Timer timer) {
        // Invoke the callbacks of offset commit requests that have completed
        invokeCompletedOffsetCommitCallbacks();

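        // If partitions are assigned automatically (Kafka balances partitions across the group's consumers),
        // i.e. the topics were subscribed via subscribe() (see SubscriptionState for the other modes)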
        if (subscriptions.partitionsAutoAssigned()) {
            // Always update the heartbeat last poll time so that the heartbeat thread does not leave the
            // group proactively due to application inactivity even if (say) the coordinator cannot be found.
            // Check that the heartbeat thread is healthy: throw if it has failed, otherwise record the time of this poll call
            pollHeartbeat(timer.currentTimeMs());
            if (coordinatorUnknown() && !ensureCoordinatorReady(timer)) {  // coordinator unknown (or disconnected) and not ready before the timer expired: return false and end this poll
                return false;
            }

            // Check whether the consumer must (re)join the group, e.g. because the subscription or the assigned partitions changed
            if (rejoinNeededOrPending()) {
                
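                if (subscriptions.hasPatternSubscription()) {     // is this a pattern (regex) subscription?
                    // timeToAllowUpdate returns how long to wait before the next cluster metadata update is allowed; 0 means an update can be requested now
                    if (this.metadata.timeToAllowUpdate(time.milliseconds()) == 0) {
                        // set needUpdate to true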
                        this.metadata.requestUpdate();
                    }

                    // Refresh the metadata if allowed, i.e. when either
                    // 1. needUpdate is true, or 2. the wait time until the next cluster metadata update is 0.
                    // The refresh ultimately goes through NetworkClient.poll
                    if (!client.ensureFreshMetadata(timer)) {
                        return false;
                    }
                }

                // Send requests to the GroupCoordinator
                if (!ensureActiveGroup(timer)) {    // make sure the group is active: join the group and receive the assigned partitions
                    return false;
                }
            }
        } else {
            if (metadata.updateRequested() && !client.hasReadyNodes(timer.currentTimeMs())) {
                client.awaitMetadataUpdate(timer);
            }
        }
        maybeAutoCommitOffsetsAsync(timer.currentTimeMs());
        return true;
    }

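The work done in poll can be broken into four steps:

1. If topics were subscribed via subscribe() and the group's coordinator is unknown, ensureCoordinatorReady() locates it: it sends a FindCoordinator request (called the GroupCoordinator request in older versions) and establishes a connection to the GroupCoordinator.

2. rejoinNeededOrPending() decides whether the consumer must (re)join the group. If so, ensureActiveGroup sends the join-group and sync-group requests, joins the group, and obtains the list of TopicPartitions assigned to this consumer. (If the subscription is pattern-based, a metadata refresh is forced first.)

3. If partitions were assigned manually via assign(), none of the coordinator steps are needed; the consumer only has to wait for a metadata update when one was requested and no node connection is ready.

4. In either mode, if auto-commit is enabled and the auto-commit interval has elapsed, offsets are committed asynchronously.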
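
The branching described in the steps above can be modeled with a few booleans. The sketch below is only an illustration: the class, the method, and the action names are hypothetical and not part of Kafka, and it leaves out the heartbeat check and the early `return false` paths.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the branching in ConsumerCoordinator.poll(); none of these names are Kafka API.
class PollFlowSketch {
    /** Returns the ordered list of actions one poll() pass would take in this simplified model. */
    static List<String> steps(boolean autoAssigned, boolean coordinatorKnown,
                              boolean rejoinNeeded, boolean patternSubscription) {
        List<String> steps = new ArrayList<>();
        if (autoAssigned) {                        // subscribe() mode
            if (!coordinatorKnown) {
                steps.add("findCoordinator");      // step 1
            }
            if (rejoinNeeded) {                    // step 2
                if (patternSubscription) {
                    steps.add("refreshMetadata");  // forced refresh for pattern subscriptions
                }
                steps.add("joinGroup");
                steps.add("syncGroup");
            }
        } else {                                   // assign() mode, step 3
            steps.add("awaitMetadataIfRequested");
        }
        steps.add("maybeAutoCommit");              // step 4, reached in both modes
        return steps;
    }
}
```

Calling `PollFlowSketch.steps(true, false, true, true)` walks through every coordinator-related action, while `steps(false, true, false, false)` shows that assign() mode skips them entirely.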

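2.2 ensureCoordinatorReady(): locating the GroupCoordinator

This method picks the least-loaded broker (the one with the fewest outstanding requests), sends it a FindCoordinator request, and establishes a TCP connection to the coordinator named in the response.

  • The call chain is: ensureCoordinatorReady() –> lookupCoordinator() –> sendFindCoordinatorRequest()
  • Once the client receives the server's response, it connects to the GroupCoordinator.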

ensureCoordinatorReady():

   // Make sure the coordinator is ready; returns true if it is
    protected synchronized boolean ensureCoordinatorReady(final Timer timer) {
        if (!coordinatorUnknown())
            return true;

        do {
            final RequestFuture<Void> future = lookupCoordinator();  // look up the GroupCoordinator and connect to it
            client.poll(future, timer);

            if (!future.isDone()) { // lookup did not finish (e.g. timed out): leave the loop; the final return then yields false
                // ran out of time
                break;
            }

            if (future.failed()) {  // the lookup failed
                if (future.isRetriable()) { 
                    log.debug("Coordinator discovery failed, refreshing metadata");
                    client.awaitMetadataUpdate(timer);
                } else
                    throw future.exception();
            } else if (coordinator != null && client.isUnavailable(coordinator)) {
                // we found the coordinator, but the connection has failed, so mark
                // it dead and backoff before retrying discovery
                markCoordinatorUnknown();
                timer.sleep(retryBackoffMs);
            }
        } while (coordinatorUnknown() && timer.notExpired());  // keep trying while the coordinator is unknown and the timer has not expired

        return !coordinatorUnknown();
    }
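
The loop in ensureCoordinatorReady() follows a common pattern: try once, back off on a retriable failure, fail fast on a fatal one, and give up when the deadline passes. The following is a minimal stand-alone sketch of that pattern. `DeadlineRetrySketch`, `Attempt`, and `retryUntilDeadline` are hypothetical names introduced here, and plain `Thread.sleep` stands in for Kafka's `Timer`.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Kafka. It only mirrors the shape of the retry loop.
class DeadlineRetrySketch {
    enum Attempt { SUCCESS, RETRIABLE_FAILURE, FATAL_FAILURE }

    /** Retries until success, a fatal failure, or the deadline; returns true only on success. */
    static boolean retryUntilDeadline(Supplier<Attempt> attempt, long timeoutMs, long backoffMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        do {
            Attempt result = attempt.get();
            if (result == Attempt.SUCCESS) {
                return true;
            }
            if (result == Attempt.FATAL_FAILURE) {
                // corresponds to "throw future.exception()" for non-retriable errors
                throw new IllegalStateException("non-retriable failure");
            }
            // retriable failure: back off, but never sleep past the deadline
            long remaining = deadline - System.currentTimeMillis();
            long sleepMs = Math.max(0L, Math.min(backoffMs, remaining));
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        } while (System.currentTimeMillis() < deadline);
        return false; // ran out of time
    }
}
```

As in the original method, the do-while form guarantees at least one attempt even when the timeout is already zero.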

lookupCoordinator(): picks the least-loaded node and sends it a FindCoordinator request

    protected synchronized RequestFuture<Void> lookupCoordinator() {
        if (findCoordinatorFuture == null) {
            // find a node to ask about the coordinator
            Node node = this.client.leastLoadedNode();  // pick the node with the fewest outstanding requests
            if (node == null) {
                log.debug("No broker available to send FindCoordinator request");
                return RequestFuture.noBrokersAvailable();
            } else
                // send the request and handle the response
                findCoordinatorFuture = sendFindCoordinatorRequest(node);
        }
        return findCoordinatorFuture;
    }

sendFindCoordinatorRequest(): FindCoordinatorResponseHandler handles the coordinator lookup response in a callback

    // Send the FindCoordinator request and handle the response
    private RequestFuture<Void> sendFindCoordinatorRequest(Node node) {
        // initiate the group metadata request
        log.debug("Sending FindCoordinator request to broker {}", node);
        FindCoordinatorRequest.Builder requestBuilder =
                new FindCoordinatorRequest.Builder(FindCoordinatorRequest.CoordinatorType.GROUP, this.groupId);
        return client.send(node, requestBuilder)
                     .compose(new FindCoordinatorResponseHandler()); // compose adapts the FindCoordinatorResponseHandler into a RequestFuture; in effect it supplies the onSuccess() and onFailure() handling for the returned future
    }

    // Callback that handles the FindCoordinator response
    private class FindCoordinatorResponseHandler extends RequestFutureAdapter<ClientResponse, Void> {

        @Override
        public void onSuccess(ClientResponse resp, RequestFuture<Void> future) {
            log.debug("Received FindCoordinator response {}", resp);
            clearFindCoordinatorFuture();

            FindCoordinatorResponse findCoordinatorResponse = (FindCoordinatorResponse) resp.responseBody();
            Errors error = findCoordinatorResponse.error();
            if (error == Errors.NONE) {
                // The GroupCoordinator was found: connect to it and reset the heartbeat session timeout
                synchronized (AbstractCoordinator.this) {
                    // use MAX_VALUE - node.id as the coordinator id to allow separate connections
                    // for the coordinator in the underlying network client layer
                    int coordinatorConnectionId = Integer.MAX_VALUE - findCoordinatorResponse.node().id();

                    AbstractCoordinator.this.coordinator = new Node(
                            coordinatorConnectionId,
                            findCoordinatorResponse.node().host(),
                            findCoordinatorResponse.node().port());
                    log.info("Discovered group coordinator {}", coordinator);
                    client.tryConnect(coordinator);  // initiate the TCP connection
                    heartbeat.resetSessionTimeout(); // reset the heartbeat session timer
                }
                future.complete(null);
            } else if (error == Errors.GROUP_AUTHORIZATION_FAILED) {
                future.raise(new GroupAuthorizationException(groupId));
            } else {
                log.debug("Group coordinator lookup failed: {}", error.message());
                future.raise(error);
            }
        }

        @Override
        public void onFailure(RuntimeException e, RequestFuture<Void> future) {
            clearFindCoordinatorFuture();
            super.onFailure(e, future);
        }
    }
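
The handler derives a separate connection id for the coordinator as `Integer.MAX_VALUE - node.id()`, so that the coordinator connection does not collide with the ordinary connection to the same broker. The arithmetic can be checked in isolation. `CoordinatorConnectionIdSketch` and both method names are made up for this illustration.

```java
// Hypothetical helper illustrating the id mapping used for the coordinator connection.
class CoordinatorConnectionIdSketch {
    /** Maps a broker id to the id used for the dedicated coordinator connection. */
    static int coordinatorConnectionId(int brokerId) {
        return Integer.MAX_VALUE - brokerId;
    }

    /** Inverse mapping: recovers the broker id from a coordinator connection id. */
    static int brokerIdOf(int coordinatorConnectionId) {
        return Integer.MAX_VALUE - coordinatorConnectionId;
    }
}
```

Because broker ids are small non-negative integers in practice, the derived ids sit at the top of the int range and stay distinct from the broker ids themselves.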

2.3 ensureActiveGroup(): sending the join-group and sync-group requests to the GroupCoordinator

  • Call chain of ensureActiveGroup: ensureActiveGroup() –> ensureCoordinatorReady() –> startHeartbeatThreadIfNeeded() –> joinGroupIfNeeded();
  • The key call inside joinGroupIfNeeded() is initiateJoinGroup(), whose call chain is: initiateJoinGroup() –> sendJoinGroupRequest() –> JoinGroupResponseHandler.handle() (on success) –> onJoinLeader()/onJoinFollower() –> sendSyncGroupRequest() –> SyncGroupResponseHandler

The ensureActiveGroup method:

    boolean ensureActiveGroup(final Timer timer) {
        // always ensure that the coordinator is ready because we may have been disconnected
        // when sending heartbeats and does not necessarily require us to rejoin the group.
        if (!ensureCoordinatorReady(timer)) {  // make sure we are connected to the GroupCoordinator
            return false;
        }

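        startHeartbeatThreadIfNeeded();  // start the heartbeat thread (it does not necessarily send a heartbeat right away, only once the conditions are met)
        return joinGroupIfNeeded(timer); // send the JoinGroup request and process the response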
    }

The join-group request is issued in joinGroupIfNeeded():

    boolean joinGroupIfNeeded(final Timer timer) {
        while (rejoinNeededOrPending()) {
            if (!ensureCoordinatorReady(timer)) {
                return false;
            }
            // Run onJoinPrepare, which covers the offset commit and the rebalance listener
            if (needsJoinPrepare) {
                onJoinPrepare(generation.generationId, generation.memberId);
                needsJoinPrepare = false;
            }
            // Build the JoinGroup request and send it
            final RequestFuture<ByteBuffer> future = initiateJoinGroup();
            client.poll(future, timer); // poll the client until the asynchronous request completes or the timer expires
            if (!future.isDone()) {
                // we ran out of time
                return false;
            }

            if (future.succeeded()) {  // the request succeeded: run the completion callback
                // Duplicate the buffer in case `onJoinComplete` does not complete and needs to be retried.
                ByteBuffer memberAssignment = future.value().duplicate();
                onJoinComplete(generation.generationId, generation.memberId, generation.protocol, memberAssignment);

                // We reset the join group future only after the completion callback returns. This ensures
                // that if the callback is woken up, we will retry it on the next joinGroupIfNeeded.
                resetJoinGroupFuture();
                needsJoinPrepare = true;
            } else {
                resetJoinGroupFuture();
                final RuntimeException exception = future.exception();
                if (exception instanceof UnknownMemberIdException ||
                        exception instanceof RebalanceInProgressException ||
                        exception instanceof IllegalGenerationException ||
                        exception instanceof MemberIdRequiredException)
                    continue;
                else if (!future.isRetriable())
                    throw exception;

                timer.sleep(retryBackoffMs);
            }
        }
        return true;
    }

sendJoinGroupRequest() is called from initiateJoinGroup():

    // Send the JoinGroup request and attach a listener
    private synchronized RequestFuture<ByteBuffer> initiateJoinGroup() {
        if (joinFuture == null) {
            // fence off the heartbeat thread explicitly so that it cannot interfere with the join group.
            // Note that this must come after the call to onJoinPrepare since we must be able to continue
            // sending heartbeats if that callback takes some time.
            // the heartbeat thread is paused for the duration of the rebalance
            disableHeartbeatThread();
            // mark this member as rebalancing
            state = MemberState.REBALANCING;
            // send the JoinGroup request
            joinFuture = sendJoinGroupRequest();
            joinFuture.addListener(new RequestFutureListener<ByteBuffer>() {
                @Override
                public void onSuccess(ByteBuffer value) {
                    // handle join completion in the callback so that the callback will be invoked
                    // even if the consumer is woken up before finishing the rebalance
                    synchronized (AbstractCoordinator.this) {
                        log.info("Successfully joined group with generation {}", generation.generationId);
                        state = MemberState.STABLE;  // mark the consumer as stable
                        rejoinNeeded = false;

                        if (heartbeatThread != null)
                            heartbeatThread.enable();
                    }
                }

                @Override
                public void onFailure(RuntimeException e) {
                    // we handle failures below after the request finishes. if the join completes
                    // after having been woken up, the exception is ignored and we will rejoin
                    synchronized (AbstractCoordinator.this) {
                        state = MemberState.UNJOINED;  // mark the consumer as unjoined
                    }
                }
            });
        }
        return joinFuture;
    }
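
The state handling in initiateJoinGroup() amounts to a small state machine: the member goes to REBALANCING with heartbeats paused, then to STABLE (heartbeats resumed) on success or back to UNJOINED on failure. The sketch below is a simplified model of this behavior. `MemberStateSketch` and its methods are illustrative names rather than Kafka classes, and synchronization is omitted.

```java
// Simplified, hypothetical model of the member state handling in initiateJoinGroup().
class MemberStateSketch {
    enum MemberState { UNJOINED, REBALANCING, STABLE }

    private MemberState state = MemberState.UNJOINED;
    private boolean heartbeatEnabled = false;
    private boolean rejoinNeeded = true;

    /** Corresponds to the start of a join: heartbeats are paused while rebalancing. */
    void startJoin() {
        heartbeatEnabled = false;
        state = MemberState.REBALANCING;
    }

    /** Corresponds to the listener's onSuccess: the member is stable and heartbeats resume. */
    void onJoinSuccess() {
        state = MemberState.STABLE;
        rejoinNeeded = false;
        heartbeatEnabled = true;
    }

    /** Corresponds to the listener's onFailure: the member falls back to UNJOINED. */
    void onJoinFailure() {
        state = MemberState.UNJOINED;
    }

    MemberState state() { return state; }
    boolean heartbeatEnabled() { return heartbeatEnabled; }
    boolean rejoinNeeded() { return rejoinNeeded; }
}
```

Note that in this model, as in the excerpt, a failed join leaves `rejoinNeeded` set, which is what makes the surrounding `while (rejoinNeededOrPending())` loop try again.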

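sendJoinGroupRequest() and its response handling are shown below.

    // Send the JoinGroup request; the returned future eventually carries this member's assignment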
    RequestFuture<ByteBuffer> sendJoinGroupRequest() {
        if (coordinatorUnknown())
            return RequestFuture.coordinatorNotAvailable();

        // send a join group request to the coordinator
        log.info("(Re-)joining group");
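        // The consumer builds the JoinGroup request, with its own metadata as the payload.
        // That metadata is what the partition assignor later consumes in its assign method.
        // subscriptions holds each consumer's subscription; since every consumer sends its own, the coordinator can collect the topics subscribed across the whole group.
        // The cluster metadata records per-topic information such as partition counts; it is what allows the partitions of each topic to be distributed among the consumers subscribed to it.
        JoinGroupRequest.Builder requestBuilder = new JoinGroupRequest.Builder(
                groupId,  // consumer group id
                this.sessionTimeoutMs, // session timeout
                this.generation.memberId, // member id of this consumer
                protocolType(),  // protocol type
                metadata())  // member metadata (carries the serialized subscription)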
                .setRebalanceTimeout(this.rebalanceTimeoutMs);

        log.debug("Sending JoinGroup ({}) to coordinator {}", requestBuilder, this.coordinator);

        // Note that we override the request timeout using the rebalance timeout since that is the
        // maximum time that it may block on the coordinator. We add an extra 5 seconds for small delays.

        int joinGroupTimeoutMs = Math.max(rebalanceTimeoutMs, rebalanceTimeoutMs + 5000);
        // Send the JoinGroup request; compose returns a new future with the response handler attached
        return client.send(coordinator, requestBuilder, joinGroupTimeoutMs)
                .compose(new JoinGroupResponseHandler());
    }

    // Handles the JoinGroup response; on success it proceeds to the sync-group step
    private class JoinGroupResponseHandler extends CoordinatorResponseHandler<JoinGroupResponse, ByteBuffer> {
        @Override
        public void handle(JoinGroupResponse joinResponse, RequestFuture<ByteBuffer> future) {
            Errors error = joinResponse.error();
            if (error == Errors.NONE) {
                log.debug("Received successful JoinGroup response: {}", joinResponse);
                sensors.joinLatency.record(response.requestLatencyMs());

                synchronized (AbstractCoordinator.this) {
                    if (state != MemberState.REBALANCING) {
                        // if the consumer was woken up before a rebalance completes, we may have already left
                        // the group. In this case, we do not want to continue with the sync group.
                        future.raise(new UnjoinedGroupException());
                    } else {
                        AbstractCoordinator.this.generation = new Generation(joinResponse.generationId(),
                                joinResponse.memberId(), joinResponse.groupProtocol());
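                        // After a successful JoinGroup, a sync-group is needed to obtain the assigned TopicPartitions.
                        // Having collected all consumers and their subscriptions, the coordinator does not run the assignment algorithm itself; it delegates that to one consumer, the leader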
                        if (joinResponse.isLeader()) {
                            onJoinLeader(joinResponse).chain(future);
                        } else {
                            onJoinFollower().chain(future);
                        }
                    }
                }
            } else if (error == Errors.COORDINATOR_LOAD_IN_PROGRESS) {
                log.debug("Attempt to join group rejected since coordinator {} is loading the group.", coordinator());
                // backoff and retry
                future.raise(error);
            } else if (error == Errors.UNKNOWN_MEMBER_ID) {
                // reset the member id and retry immediately
                resetGeneration();
                log.debug("Attempt to join group failed due to unknown member id.");
                future.raise(Errors.UNKNOWN_MEMBER_ID);
            } else if (error == Errors.COORDINATOR_NOT_AVAILABLE
                    || error == Errors.NOT_COORDINATOR) {
                // re-discover the coordinator and retry with backoff
                markCoordinatorUnknown();
                log.debug("Attempt to join group failed due to obsolete coordinator information: {}", error.message());
                future.raise(error);
            } else if (error == Errors.INCONSISTENT_GROUP_PROTOCOL
                    || error == Errors.INVALID_SESSION_TIMEOUT
                    || error == Errors.INVALID_GROUP_ID
                    || error == Errors.GROUP_AUTHORIZATION_FAILED
                    || error == Errors.GROUP_MAX_SIZE_REACHED) {
                log.error("Attempt to join group failed due to fatal error: {}", error.message());
                if (error == Errors.GROUP_MAX_SIZE_REACHED) {
                    future.raise(new GroupMaxSizeReachedException(groupId));
                } else if (error == Errors.GROUP_AUTHORIZATION_FAILED) {
                    future.raise(new GroupAuthorizationException(groupId));
                } else {
                    future.raise(error);
                }
            } else if (error == Errors.MEMBER_ID_REQUIRED) {
                // Broker requires a concrete member id to be allowed to join the group. Update member id
                // and send another join group request in next cycle.
                synchronized (AbstractCoordinator.this) {
                    AbstractCoordinator.this.generation = new Generation(OffsetCommitRequest.DEFAULT_GENERATION_ID,
                        joinResponse.memberId(), null);
                    AbstractCoordinator.this.rejoinNeeded = true;
                    AbstractCoordinator.this.state = MemberState.UNJOINED;
                }
                future.raise(Errors.MEMBER_ID_REQUIRED);
            } else {
                // unexpected error, throw the exception
                log.error("Attempt to join group failed due to unexpected error: {}", error.message());
                future.raise(new KafkaException("Unexpected error in join group response: " + error.message()));
            }
        }
    }
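
One detail in sendJoinGroupRequest() is easy to misread: `Math.max(rebalanceTimeoutMs, rebalanceTimeoutMs + 5000)` looks redundant, but a plausible reading is that it guards against int overflow when the rebalance timeout is close to `Integer.MAX_VALUE`. The snippet below isolates that expression so the behavior can be checked. `JoinTimeoutSketch` is a hypothetical wrapper, not a Kafka class.

```java
// Hypothetical wrapper around the timeout expression from sendJoinGroupRequest().
class JoinTimeoutSketch {
    /**
     * Adds 5 seconds of slack to the rebalance timeout.
     * If the addition overflows and wraps to a negative value,
     * Math.max falls back to the original timeout.
     */
    static int joinGroupTimeoutMs(int rebalanceTimeoutMs) {
        return Math.max(rebalanceTimeoutMs, rebalanceTimeoutMs + 5000);
    }
}
```

For an ordinary value such as 300000 the result is simply 305000; only for values within 5000 of the int maximum does the `Math.max` branch matter.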

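sendJoinGroupRequest(): sends the join-group request to the GroupCoordinator

On the broker side this is handled by GroupCoordinator's handleJoinGroup method:

  1. If the group.id is new, a GroupMetadata instance is created; the group starts in the Empty state.
  2. When the GroupCoordinator receives a consumer's join-group request, the member list of this new group is still empty (each consumer instance is called a member of the group), so the first member to join is chosen as leader. In other words, for a new consumer group, the first consumer instance to join becomes the leader.
  3. When the GroupCoordinator receives the leader's join-group request, a rebalance is triggered and the group's state changes to PreparingRebalance.
  4. The GroupCoordinator then waits for a certain time. Consumers whose join-group requests arrive within that window are considered alive. The group then moves to the AwaitingSync state (renamed CompletingRebalance in later Kafka versions), and the GroupCoordinator sends its response to every member of the group.
  5. After receiving the GroupCoordinator's response, the consumer that is the group leader is responsible for assigning partitions for the whole group (the range strategy by default; roundrobin is also available). The leader sends the result to the GroupCoordinator via sendSyncGroupRequest(), while follower consumers send an empty assignment.
  6. Once the GroupCoordinator receives the leader's request, it returns the assignment to every consumer instance that has sent a sync-group request, and the group's state changes to Stable. Any sync-group request that arrives later is answered directly with the assignment, since the group is already Stable.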
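
The coordinator-side state changes listed above can be condensed into a toy state machine. This is a deliberately simplified sketch and not the broker's implementation: `GroupStateSketch`, its methods, and the explicit `closeJoinWindow()` call (standing in for the broker's wait period) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the group states described above; not the broker's GroupMetadata.
class GroupStateSketch {
    enum GroupState { EMPTY, PREPARING_REBALANCE, AWAITING_SYNC, STABLE }

    private GroupState state = GroupState.EMPTY;
    private String leaderId;
    private final List<String> members = new ArrayList<>();

    /** A join-group request arrives: the first member becomes leader and a rebalance starts. */
    void handleJoin(String memberId) {
        if (members.isEmpty()) {
            leaderId = memberId;
        }
        if (!members.contains(memberId)) {
            members.add(memberId);
        }
        state = GroupState.PREPARING_REBALANCE;
    }

    /** The wait period for join-group requests is over: responses go out to all members. */
    void closeJoinWindow() {
        if (state == GroupState.PREPARING_REBALANCE) {
            state = GroupState.AWAITING_SYNC;
        }
    }

    /** A sync-group request arrives: only the leader's request completes the rebalance. */
    void handleSync(String memberId) {
        if (state == GroupState.AWAITING_SYNC && memberId.equals(leaderId)) {
            state = GroupState.STABLE;
        }
    }

    GroupState state() { return state; }
    String leaderId() { return leaderId; }
}
```

In this model a follower's sync-group request leaves the group waiting, which matches the description that followers only receive their assignment once the leader's request has arrived.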

The sync-group request is sent as follows:

    // When the consumer is a follower, it fetches its assignment from the GroupCoordinator;
    // the last argument to new SyncGroupRequest.Builder is an empty map
    private RequestFuture<ByteBuffer> onJoinFollower() {
        // send follower's sync group with an empty assignment
        SyncGroupRequest.Builder requestBuilder =
                new SyncGroupRequest.Builder(groupId, generation.generationId, generation.memberId,
                        Collections.<String, ByteBuffer>emptyMap());
        log.debug("Sending follower SyncGroup to coordinator {}: {}", this.coordinator, requestBuilder);
        return sendSyncGroupRequest(requestBuilder);
    }

    // When the consumer is the leader, it computes the assignment for every member of the group and sends the result to the GroupCoordinator
    private RequestFuture<ByteBuffer> onJoinLeader(JoinGroupResponse joinResponse) {
        try {
            // perform the leader synchronization and send back the assignment for the group
            Map<String, ByteBuffer> groupAssignment = performAssignment(joinResponse.leaderId(), joinResponse.groupProtocol(),
                    joinResponse.members());

            SyncGroupRequest.Builder requestBuilder =
                    new SyncGroupRequest.Builder(groupId, generation.generationId, generation.memberId, groupAssignment);
            log.debug("Sending leader SyncGroup to coordinator {}: {}", this.coordinator, requestBuilder);
            // send the sync-group request
            return sendSyncGroupRequest(requestBuilder);
        } catch (RuntimeException e) {
            return RequestFuture.failure(e);
        }
    }

    private RequestFuture<ByteBuffer> sendSyncGroupRequest(SyncGroupRequest.Builder requestBuilder) {
        if (coordinatorUnknown())
            return RequestFuture.coordinatorNotAvailable();
        return client.send(coordinator, requestBuilder)
                .compose(new SyncGroupResponseHandler());
    }

    private class SyncGroupResponseHandler extends CoordinatorResponseHandler<SyncGroupResponse, ByteBuffer> {
        @Override
        public void handle(SyncGroupResponse syncResponse,
                           RequestFuture<ByteBuffer> future) {
            Errors error = syncResponse.error();
            if (error == Errors.NONE) { // sync succeeded
                sensors.syncLatency.record(response.requestLatencyMs());
                future.complete(syncResponse.memberAssignment());
            } else {
                requestRejoin();

                if (error == Errors.GROUP_AUTHORIZATION_FAILED) {
                    future.raise(new GroupAuthorizationException(groupId));
                } else if (error == Errors.REBALANCE_IN_PROGRESS) {
                    log.debug("SyncGroup failed because the group began another rebalance");
                    future.raise(error);
                } else if (error == Errors.UNKNOWN_MEMBER_ID
                        || error == Errors.ILLEGAL_GENERATION) {
                    log.debug("SyncGroup failed: {}", error.message());
                    resetGeneration();
                    future.raise(error);
                } else if (error == Errors.COORDINATOR_NOT_AVAILABLE
                        || error == Errors.NOT_COORDINATOR) {
                    log.debug("SyncGroup failed: {}", error.message());
                    markCoordinatorUnknown();
                    future.raise(error);
                } else {
                    future.raise(new KafkaException("Unexpected error from SyncGroup: " + error.message()));
                }
            }
        }
    }

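Note: if the coordinator were responsible for assigning partitions, a consumer could read its partitions straight from the JoinGroup response. In practice the coordinator does not perform the assignment, so the JoinGroup response it returns carries no assignment.

What the coordinator returns to the leader is the list of all members together with their subscriptions; ordinary consumers (followers) receive no such information.

Because the JoinGroup response does not contain the assigned partitions, the consumer cannot complete the join future directly. It must send a SyncGroup request next, which is what onJoinFollower and onJoinLeader do.

onJoinLeader: unlike onJoinFollower, which sends the sync-group request as soon as the JoinGroup response arrives, the leader first gathers the data needed for the assignment and calls performAssignment() before sending its sync-group request.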

onJoinComplete()

//todo

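3 Summary of the join-group and sync-group flow

A naive design for joining a consumer group would be:

  1. Each consumer sends its subscription to the coordinator
  2. The coordinator collects all consumers and their subscriptions
  3. The coordinator runs the assignment algorithm, i.e. decides which partitions go to which consumer
  4. Once the assignment is settled, the coordinator returns the partitions to the consumers, which then start working on them

However, the coordinator is not responsible for computing the assignment. The actual steps are:

  1. Each consumer sends its subscription to the coordinator
  2. The coordinator collects all consumers and their subscriptions
  3. The coordinator sends the full member list and their subscriptions to the leader consumer
  4. The leader consumer runs the partition assignment algorithm
  5. The leader consumer syncs the assignment back to the coordinator
  6. The coordinator receives the leader's assignment and returns each consumer its partitions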
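
To make the data flow of the six steps concrete, here is a small in-memory sketch: the join response exposes all subscriptions only to the leader, the leader computes an assignment, and the sync response hands each member its own share. Everything in it is hypothetical (class and method names, and the trivial modulo distribution used in place of a real assignor); partitions are plain integers for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory walk-through of steps 3 to 6; not Kafka code.
class LeaderAssignmentFlowSketch {
    /** Step 3: only the leader's join response carries every member's subscription. */
    static Map<String, List<String>> joinResponseMembers(String memberId, String leaderId,
                                                         Map<String, List<String>> allSubscriptions) {
        if (memberId.equals(leaderId)) {
            return allSubscriptions;
        }
        return Collections.<String, List<String>>emptyMap();
    }

    /** Step 4: the leader distributes partitions 0..partitionCount-1 over the sorted members. */
    static Map<String, List<Integer>> leaderAssign(Set<String> memberIds, int partitionCount) {
        List<String> sorted = new ArrayList<>(new TreeSet<>(memberIds));
        Map<String, List<Integer>> result = new TreeMap<>();
        for (String member : sorted) {
            result.put(member, new ArrayList<Integer>());
        }
        for (int p = 0; p < partitionCount; p++) {
            result.get(sorted.get(p % sorted.size())).add(p);
        }
        return result;
    }

    /** Steps 5 and 6: the coordinator stores the leader's result and returns each member its part. */
    static List<Integer> syncResponse(String memberId, Map<String, List<Integer>> groupAssignment) {
        List<Integer> own = groupAssignment.get(memberId);
        return own != null ? own : Collections.<Integer>emptyList();
    }
}
```

The point of the design is visible even in this toy: the coordinator never looks inside the assignment, so the assignment strategy can change on the client side without touching the broker.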

4 The Leader Consumer Performs the Assignment

JoinGroupRequest: the "join group" request

    private final String groupId; // consumer group id
    private final int sessionTimeout; // session timeout
    private final int rebalanceTimeout; // rebalance timeout
    private final String memberId;  // member id of the consumer
    private final String protocolType; // protocol type
    private final List<ProtocolMetadata> groupProtocols; // supported protocols and their metadata

JoinGroupResponse: the "join group" response

    private final int throttleTimeMs;
    private final Errors error;
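    private final int generationId; // generation id
    private final String groupProtocol; // the group protocol selected for the whole group
    private final String memberId; // member id of the consumer
    private final String leaderId; // member id of the leader; the consumer whose memberId equals leaderId is the leader
    private final Map<String, ByteBuffer> members; // all members (member id plus serialized subscription)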

The performAssignment method:

    // Runs on the leader consumer (inside ConsumerCoordinator): performs the partition assignment and returns each consumer's result
    @Override
    protected Map<String, ByteBuffer> performAssignment(String leaderId,
                                                        String assignmentStrategy,
                                                        Map<String, ByteBuffer> allSubscriptions) {
        // Look up the partition assignor that matches the group protocol chosen by the coordinator
        PartitionAssignor assignor = lookupAssignor(assignmentStrategy);
        if (assignor == null)
            throw new IllegalStateException("Coordinator selected invalid assignment protocol: " + assignmentStrategy);

        Set<String> allSubscribedTopics = new HashSet<>();
        // subscriptions is parsed from the subscription metadata of all consumers
        Map<String, Subscription> subscriptions = new HashMap<>();
        for (Map.Entry<String, ByteBuffer> subscriptionEntry : allSubscriptions.entrySet()) {
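            // deserialize the consumer's subscription
            Subscription subscription = ConsumerProtocol.deserializeSubscription(subscriptionEntry.getValue());
            // key: the consumer's member id; value: its subscription (the subscribed topics)
            subscriptions.put(subscriptionEntry.getKey(), subscription);
            // collect every topic subscribed by any consumer; the cluster metadata will then cover all partitions of these topics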
            allSubscribedTopics.addAll(subscription.topics());
        }

        // the leader will begin watching for changes to any of the topics the group is interested in,
        // which ensures that all metadata changes will eventually be seen
        this.subscriptions.groupSubscribe(allSubscribedTopics);
        metadata.setTopics(this.subscriptions.groupSubscription());

        // update metadata (if needed) and keep track of the metadata used for assignment so that
        // we can check after rebalance completion whether anything has changed
        if (!client.ensureFreshMetadata(time.timer(Long.MAX_VALUE))) throw new TimeoutException();

        isLeader = true;

        log.debug("Performing assignment using strategy {} with subscriptions {}", assignor.name(), subscriptions);
        // Assign partitions to all consumers according to the strategy; the result maps each consumer to its assignment
        Map<String, Assignment> assignment = assignor.assign(metadata.fetch(), subscriptions);

        // user-customized assignor may have created some topics that are not in the subscription list
        // and assign their partitions to the members; in this case we would like to update the leader's
        // own metadata with the newly added topics so that it will not trigger a subsequent rebalance
        // when these topics gets updated from metadata refresh.
        //
        // TODO: this is a hack and not something we want to support long-term unless we push regex into the protocol
        //       we may need to modify the PartitionAssignor API to better support this case.
        Set<String> assignedTopics = new HashSet<>();
        for (Assignment assigned : assignment.values()) {
            for (TopicPartition tp : assigned.partitions())
                assignedTopics.add(tp.topic());
        }

        if (!assignedTopics.containsAll(allSubscribedTopics)) {
            Set<String> notAssignedTopics = new HashSet<>(allSubscribedTopics);
            notAssignedTopics.removeAll(assignedTopics);
            log.warn("The following subscribed topics are not assigned to any members: {} ", notAssignedTopics);
        }

        if (!allSubscribedTopics.containsAll(assignedTopics)) {
            Set<String> newlyAddedTopics = new HashSet<>(assignedTopics);
            newlyAddedTopics.removeAll(allSubscribedTopics);
            log.info("The following not-subscribed topics are assigned, and their metadata will be " +
                    "fetched from the brokers: {}", newlyAddedTopics);

            allSubscribedTopics.addAll(assignedTopics);
            this.subscriptions.groupSubscribe(allSubscribedTopics);
            metadata.setTopics(this.subscriptions.groupSubscription());
            if (!client.ensureFreshMetadata(time.timer(Long.MAX_VALUE))) throw new TimeoutException();
        }

        assignmentSnapshot = metadataSnapshot;

        log.debug("Finished assignment for group: {}", assignment);

        Map<String, ByteBuffer> groupAssignment = new HashMap<>();
        for (Map.Entry<String, Assignment> assignmentEntry : assignment.entrySet()) {
            ByteBuffer buffer = ConsumerProtocol.serializeAssignment(assignmentEntry.getValue());
            groupAssignment.put(assignmentEntry.getKey(), buffer);
        }

        return groupAssignment;
    }
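
After the assignor runs, performAssignment compares the set of assigned topics with the set of subscribed topics in both directions: subscribed topics that nobody received are logged as a warning, and assigned topics that nobody subscribed to trigger another metadata refresh. The two set differences can be tried out separately. `AssignedTopicsCheckSketch` and its method names are illustrative only.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper reproducing the two set differences computed in performAssignment.
class AssignedTopicsCheckSketch {
    /** Subscribed topics that no member was assigned (the log.warn case). */
    static Set<String> notAssigned(Set<String> subscribed, Set<String> assigned) {
        Set<String> result = new TreeSet<>(subscribed);
        result.removeAll(assigned);
        return result;
    }

    /** Assigned topics that were never subscribed (the case that forces a metadata refresh). */
    static Set<String> newlyAdded(Set<String> subscribed, Set<String> assigned) {
        Set<String> result = new TreeSet<>(assigned);
        result.removeAll(subscribed);
        return result;
    }
}
```

Both helpers copy their input before calling `removeAll`, just as the original code copies into a new `HashSet`, so the caller's sets are left untouched.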

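How the partition list is obtained:

5 Implementations of the Partition Assignor

AbstractPartitionAssignor implements the assign() method of PartitionAssignor, and additionally declares an abstract assign() overload with different parameter types.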

public abstract class AbstractPartitionAssignor implements PartitionAssignor {

    public abstract Map<String, List<TopicPartition>> assign(
    // number of partitions per topic
    Map<String, Integer> partitionsPerTopic,
    // subscription of each consumer (its subscribed topics)
    Map<String, Subscription> subscriptions);
                                                            
    @Override
    public Map<String, Assignment> assign(
        // cluster metadata
        Cluster metadata,
        // subscriptions of all consumers
        Map<String, Subscription> subscriptions) {
        Set<String> allSubscribedTopics = new HashSet<>();
        for (Map.Entry<String, Subscription> subscriptionEntry : subscriptions.entrySet())
            allSubscribedTopics.addAll(subscriptionEntry.getValue().topics());

        Map<String, Integer> partitionsPerTopic = new HashMap<>();
        for (String topic : allSubscribedTopics) {
            Integer numPartitions = metadata.partitionCountForTopic(topic);
            if (numPartitions != null && numPartitions > 0)
                partitionsPerTopic.put(topic, numPartitions);
            else
                log.debug("Skipping assignment for topic {} since no metadata is available", topic);
        }
        // delegate to the abstract assign overload declared above
        Map<String, List<TopicPartition>> rawAssignments = assign(partitionsPerTopic, subscriptions);

        // this class maintains no user data, so just wrap the results
        Map<String, Assignment> assignments = new HashMap<>();
        for (Map.Entry<String, List<TopicPartition>> assignmentEntry : rawAssignments.entrySet())
            assignments.put(assignmentEntry.getKey(), new Assignment(assignmentEntry.getValue()));
        return assignments;
    }

    // remaining members omitted in this excerpt
}

The three assignor implementations:

  • RangeAssignor
  • RoundRobinAssignor
  • StickyAssignor
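
As a rough illustration of what an implementation of the two-argument assign(partitionsPerTopic, subscriptions) overload might look like, the sketch below follows the commonly described range strategy: per topic, the subscribed consumers are sorted and each receives a contiguous block of partitions, with the first few consumers taking one extra when the division is uneven. It is a simplified stand-in and not the RangeAssignor source. `RangeAssignSketch` is a made-up name, subscriptions are plain topic lists, and partitions are rendered as "topic-N" strings instead of TopicPartition objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified range-style assignment; an assumption-laden sketch, not Kafka's RangeAssignor.
class RangeAssignSketch {
    static Map<String, List<String>> assign(Map<String, Integer> partitionsPerTopic,
                                            Map<String, List<String>> subscriptions) {
        Map<String, List<String>> consumersPerTopic = new TreeMap<>();
        Map<String, List<String>> assignment = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : subscriptions.entrySet()) {
            assignment.put(entry.getKey(), new ArrayList<String>());
            for (String topic : entry.getValue()) {
                consumersPerTopic.computeIfAbsent(topic, t -> new ArrayList<String>()).add(entry.getKey());
            }
        }
        for (Map.Entry<String, List<String>> entry : consumersPerTopic.entrySet()) {
            String topic = entry.getKey();
            Integer numPartitions = partitionsPerTopic.get(topic);
            if (numPartitions == null) {
                continue; // no metadata for this topic: skip it
            }
            List<String> consumers = entry.getValue();
            Collections.sort(consumers);
            int perConsumer = numPartitions / consumers.size();
            int extra = numPartitions % consumers.size();
            for (int i = 0; i < consumers.size(); i++) {
                // the first `extra` consumers each take one additional partition
                int start = perConsumer * i + Math.min(i, extra);
                int length = perConsumer + (i < extra ? 1 : 0);
                for (int p = start; p < start + length; p++) {
                    assignment.get(consumers.get(i)).add(topic + "-" + p);
                }
            }
        }
        return assignment;
    }
}
```

With one topic of three partitions and two subscribed consumers, the first consumer in sort order ends up with two partitions and the second with one, which is the uneven split the range strategy is known for.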

A detailed analysis of each is left to a separate article.