Kafka Source Code Analysis 6: What Conditions Must a Batch Meet Before It Can Be Sent

Feel free to follow github.com/hsfxuebao/j… ; I hope it helps, and if you find it useful, a Star would be much appreciated.

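Kafka Source Code Analysis 5: A First Look at the Sender Thread Flow walked through the overall flow of the sender thread. This article focuses on step two: what conditions must a batch meet before it can be sent? The code is as follows: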

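/**
 * Step two:
 *      Determine which partitions have messages ready to send, and look up the broker
 *      that hosts the leader of each of those partitions.
 *      In other words: which brokers do we need to send messages to?
 */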
RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);

The ready method is as follows:

public ReadyCheckResult ready(Cluster cluster, long nowMs) {
    Set<Node> readyNodes = new HashSet<>();
    long nextReadyCheckDelayMs = Long.MAX_VALUE;
    Set<String> unknownLeaderTopics = new HashSet<>();

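    // free.queued() > 0 means there are threads in the buffer pool's waiters queue.
    // If exhausted is true, the buffer pool has run out of memory and at least one
    // thread is blocked waiting for an allocation.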
    boolean exhausted = this.free.queued() > 0;
    for (Map.Entry<TopicPartition, Deque<RecordBatch>> entry : this.batches.entrySet()) {
        TopicPartition part = entry.getKey();
        Deque<RecordBatch> deque = entry.getValue();
        // Look up which Kafka broker hosts the leader of this partition.
        Node leader = cluster.leaderFor(part);
        synchronized (deque) {
            // No leader broker is known but batches are waiting: record the topic in unknownLeaderTopics.
            if (leader == null && !deque.isEmpty()) {
                // This is a partition for which leader is not known, but messages are available to send.
                // Note that entries are currently not removed from batches when deque is empty.
                unknownLeaderTopics.add(part.topic());
            } else if (!readyNodes.contains(leader) && !muted.contains(part)) {
                // First, peek at the batch at the head of the deque.
                RecordBatch batch = deque.peekFirst();
                // If the batch is not null, check whether it can be sent.
                if (batch != null) {
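                    /**
                     * batch.attempts: number of send attempts made so far
                     * batch.lastAttemptMs: time of the last attempt
                     * retryBackoffMs: interval to wait between retries
                     *
                     * backingOff: true when the batch has already been attempted and the
                     * retry interval has NOT yet elapsed, so it must not be resent yet
                     */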
                    boolean backingOff = batch.attempts > 0 && batch.lastAttemptMs + retryBackoffMs > nowMs;
                    /**
                     * nowMs: the current time
                     * batch.lastAttemptMs: time of the last attempt
                     * waitedTimeMs: how long this batch has been waiting so far
                     */
                    long waitedTimeMs = nowMs - batch.lastAttemptMs;
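                    /**
                     * We analyze this with a concrete scenario: the first send.
                     * Nothing has been sent before, so no retry is involved and
                     * backingOff is false.
                     *
                     * timeToWaitMs = lingerMs
                     * lingerMs defaults to 0. With 0, a batch passes the time check as soon
                     * as the sender thread looks at it, so under light load messages go out
                     * almost one at a time, which is usually inefficient.
                     * It is therefore worth configuring this parameter explicitly.
                     * Suppose we set it to 100 ms:
                     * timeToWaitMs = lingerMs = 100 ms
                     * This is the longest a batch may wait before it has to be sent.
                     */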
                    long timeToWaitMs = backingOff ? retryBackoffMs : lingerMs;
                    /**
                     * timeToWaitMs: the maximum time the batch may wait
                     * waitedTimeMs: how long it has waited so far
                     * timeLeftMs: how much longer it still has to wait
                     */
                    long timeLeftMs = Math.max(timeToWaitMs - waitedTimeMs, 0);
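                    /**
                     * If the deque holds more than one batch, the batch at the head must
                     * already be full, and a full batch can be sent.
                     * It is also possible that the deque holds a single batch that happens
                     * to be full, in which case it can be sent as well.
                     *
                     * full: whether there is a full batch
                     */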
                    boolean full = deque.size() > 1 || batch.isFull();
                    /**
                     * waitedTimeMs: how long the batch has waited so far
                     * timeToWaitMs: the maximum time it may wait
                     * expired: true means the wait time is up and the batch is due to be sent
                     */
                    boolean expired = waitedTimeMs >= timeToWaitMs;
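                    /**
                     * 1) full: a batch is full (whether or not the wait time is up)
                     * 2) expired: the wait time is up (send even if the batch is not full)
                     * 3) exhausted: memory has run out (sending a batch frees its memory)
                     * 4) closed: the accumulator is being closed
                     * 5) flushInProgress(): a flush is in progress
                     */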
                    boolean sendable = full || expired || exhausted || closed || flushInProgress();
                    if (sendable && !backingOff) {
                        // Add the broker hosting this partition's leader to readyNodes.
                        readyNodes.add(leader);
                    } else {
                        // Note that this results in a conservative estimate since an un-sendable partition may have
                        // a leader that will later be found to have sendable data. However, this is good enough
                        // since we'll just wake up and then sleep again for the remaining time.
                        nextReadyCheckDelayMs = Math.min(timeLeftMs, nextReadyCheckDelayMs);
                    }
                }
            }
        }
    }

    return new ReadyCheckResult(readyNodes, nextReadyCheckDelayMs, unknownLeaderTopics);
}
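To make the timing arithmetic in ready() easier to follow, here is a minimal standalone sketch. The class ReadyTimingSketch and its nested BatchState are hypothetical stand-ins written for this article, not Kafka classes; they only mirror the backingOff, waitedTimeMs, timeToWaitMs and timeLeftMs expressions from the method above, and assume lingerMs = 100 and retryBackoffMs = 100.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of the timing logic in ready(); not a Kafka class. */
final class ReadyTimingSketch {

    /** Minimal stand-in for the two RecordBatch fields that the timing logic reads. */
    static final class BatchState {
        final int attempts;        // number of send attempts so far
        final long lastAttemptMs;  // time of the last attempt (assumed: creation time if never sent)

        BatchState(int attempts, long lastAttemptMs) {
            this.attempts = attempts;
            this.lastAttemptMs = lastAttemptMs;
        }
    }

    /** True while an already-attempted batch is still inside its retry backoff window. */
    static boolean backingOff(BatchState b, long nowMs, long retryBackoffMs) {
        return b.attempts > 0 && b.lastAttemptMs + retryBackoffMs > nowMs;
    }

    /** Remaining wait before the batch's time limit is reached (never negative). */
    static long timeLeftMs(BatchState b, long nowMs, long lingerMs, long retryBackoffMs) {
        long waitedTimeMs = nowMs - b.lastAttemptMs;
        long timeToWaitMs = backingOff(b, nowMs, retryBackoffMs) ? retryBackoffMs : lingerMs;
        return Math.max(timeToWaitMs - waitedTimeMs, 0);
    }

    /** Smallest remaining wait across the batches that are not ready yet. */
    static long nextReadyCheckDelayMs(List<BatchState> notReady, long nowMs,
                                      long lingerMs, long retryBackoffMs) {
        long delay = Long.MAX_VALUE;
        for (BatchState b : notReady) {
            delay = Math.min(timeLeftMs(b, nowMs, lingerMs, retryBackoffMs), delay);
        }
        return delay;
    }

    public static void main(String[] args) {
        long nowMs = 1000, lingerMs = 100, retryBackoffMs = 100;
        BatchState fresh = new BatchState(0, 970);    // never sent, waiting for 30 ms
        BatchState retried = new BatchState(1, 940);  // one attempt, 60 ms ago

        System.out.println(timeLeftMs(fresh, nowMs, lingerMs, retryBackoffMs));    // 70
        System.out.println(timeLeftMs(retried, nowMs, lingerMs, retryBackoffMs));  // 40
        System.out.println(nextReadyCheckDelayMs(
                Arrays.asList(fresh, retried), nowMs, lingerMs, retryBackoffMs));  // 40
    }
}
```

The last value is the minimum over all partitions that are not ready, which is why the comment in the source calls the delay a conservative estimate: the sender may wake up, find nothing to send, and sleep again.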

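Whether a batch can be sent is decided by boolean sendable = full || expired || exhausted || closed || flushInProgress(). Note that the leader is added to readyNodes only if sendable is true and the batch is not backing off. sendable is true when any of the following holds:

① full: the deque holds more than one batch (so the batch at the head must already be full), or the current batch itself is full.

② expired: the wait time is up. There are two cases to consider:

Case one: the batch is being retried. While backingOff is true, that is, until retryBackoffMs has elapsed since the last attempt, the batch is held back; once the backoff is over, it is checked against lingerMs like any other batch.
Case two: the batch is being sent for the first time and has waited at least lingerMs (if a batch never fills up, it is still sent once this time limit is reached).

③ exhausted: the buffer pool has run out of memory. Some thread may be blocked in a write because it cannot allocate memory, and it has to wait for a block to be freed before a new batch can be created.

④ closed: the client is being closed, so every batch buffered in memory must be sent right away, that is, all data has to be flushed out to the network.

⑤ flushInProgress(): a flush is in progress, so the accumulated messages are forced out immediately.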
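The five conditions can be condensed into one small predicate. The following is only a sketch: SendableSketch and canSend are hypothetical names invented for this article, and the inputs that ready() reads from the accumulator (deque size, pool state, closed flag, flush counter) are passed in as plain parameters.

```java
/** Hypothetical restatement of the send decision in ready(); not Kafka's API. */
final class SendableSketch {

    /** Returns true if the head batch of a partition may be handed to the sender now. */
    static boolean canSend(int dequeSize, boolean headBatchFull, long waitedTimeMs,
                           boolean backingOff, long lingerMs, long retryBackoffMs,
                           boolean exhausted, boolean closed, boolean flushInProgress) {
        long timeToWaitMs = backingOff ? retryBackoffMs : lingerMs;
        boolean full = dequeSize > 1 || headBatchFull;    // condition 1
        boolean expired = waitedTimeMs >= timeToWaitMs;   // condition 2
        boolean sendable = full || expired || exhausted || closed || flushInProgress;
        // A batch that is still backing off is never sent, whatever else is true.
        return sendable && !backingOff;
    }

    public static void main(String[] args) {
        long lingerMs = 100, retryBackoffMs = 100;
        // One unfilled batch that has waited 30 ms: keep waiting.
        System.out.println(canSend(1, false, 30, false, lingerMs, retryBackoffMs, false, false, false));  // false
        // The same batch after 100 ms: lingerMs has been reached.
        System.out.println(canSend(1, false, 100, false, lingerMs, retryBackoffMs, false, false, false)); // true
        // Two batches in the deque: the head one must be full.
        System.out.println(canSend(2, false, 0, false, lingerMs, retryBackoffMs, false, false, false));   // true
        // A full batch that failed 60 ms ago is still backing off.
        System.out.println(canSend(1, true, 60, true, lingerMs, retryBackoffMs, false, false, false));    // false
        // The buffer pool is exhausted: send early to free memory.
        System.out.println(canSend(1, false, 0, false, lingerMs, retryBackoffMs, true, false, false));    // true
    }
}
```

The fourth case is the one that is easy to overlook: being full is not enough while the retry backoff is still running.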

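Finally, a few words on the producer's retry mechanism:

Retry count (retries): 0 by default in the 0.10.2.0 client analyzed here (configurable)
Retry interval (retry.backoff.ms): 100 ms by default (configurable)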
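As a rough illustration of how these settings are supplied, the sketch below builds a java.util.Properties object using the producer configuration keys linger.ms, batch.size, retries and retry.backoff.ms. The class name ProducerTimingConfigSketch, the broker address and the chosen values are assumptions for this example only, not recommendations.

```java
import java.util.Properties;

/** Hypothetical helper that collects the batching and retry settings discussed above. */
final class ProducerTimingConfigSketch {

    static Properties timingProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("linger.ms", "100");        // wait up to 100 ms for a batch to fill
        props.setProperty("batch.size", "16384");     // batch capacity in bytes
        props.setProperty("retries", "3");            // illustrative retry count
        props.setProperty("retry.backoff.ms", "100"); // wait between retries
        return props;
    }

    public static void main(String[] args) {
        Properties props = timingProps();
        System.out.println("linger.ms = " + props.getProperty("linger.ms"));
        System.out.println("retries = " + props.getProperty("retries"));
    }
}
```

These properties would then be passed to the KafkaProducer constructor together with the key and value serializer settings, which are omitted here.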

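References:

The most detailed annotated Kafka source code (kafka-0.10.2.0-src)

Kafka Internals: An Illustrated Guide to the Design and Implementation of the Kafka Source Code

Kafka Source Code Analysis series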