[Kafka Source Code] The Kafka Producer's Message Sending Process

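The previous article covered how the Kafka producer is instantiated. Once instantiation is complete, the producer can send messages. A message passes through the interceptors, the serializers, and the partitioner before it reaches the record accumulator, and the Sender thread finally sends it to Kafka.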

public class KafkaProducer<K, V> implements Producer<K, V> {

    public Future<RecordMetadata> send(...) {

        // Interceptors: run the record through every configured interceptor and return the processed record
        ProducerRecord<K, V> interceptedRecord = this.interceptors.onSend(...);

        // Serialize the key
        byte[] serializedKey = keySerializer.serialize(...);

        // Serialize the value
        byte[] serializedValue = valueSerializer.serialize(...);

        // Determine the target partition
        int partition = partitioner.partition(...);

        // Append the record to the record accumulator
        RecordAccumulator.RecordAppendResult result = accumulator.append(...);
        if (result.batchIsFull || result.newBatchCreated) {

            // Wake up the Sender thread so it can send the batch
            sender.wakeup();
        }

    }

}

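The Message Record

Before a message can be sent, it has to be assembled into a record. The Kafka producer does not send our object as is; the object is placed in the value field and wrapped in a ProducerRecord.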

public class ProducerRecord<K, V> {
    /**
     * Topic
     */
    private final String topic;
    /**
     * Partition number
     */
    private final Integer partition;
    /**
     * Record headers
     */
    private final Headers headers;
    /**
     * Record key
     */
    private final K key;
    /**
     * Record value
     */
    private final V value;
    /**
     * Timestamp
     */
    private final Long timestamp;
}

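Of these fields, topic must not be null; the ProducerRecord constructor rejects a null topic. value normally carries the payload (Kafka does accept a null value, which compacted topics treat as a delete marker). key and partition determine which partition the message is sent to.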
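
To make the constraints on a record concrete, here is a minimal sketch built around a hypothetical SimpleRecord class. It is not Kafka's ProducerRecord. It assumes the checks the real constructor performs are: a null topic, a negative partition, or a negative timestamp is rejected, while partition, timestamp, and key may be left unset.

```java
/** Hypothetical stand-in for ProducerRecord; all names here are illustrative only. */
final class SimpleRecord {
    final String topic;
    final Integer partition; // null means "let the partitioner decide"
    final Long timestamp;    // null means "assign a timestamp at send time"
    final String key;        // null means "no key"
    final String value;

    SimpleRecord(String topic, Integer partition, Long timestamp, String key, String value) {
        // Assumed validation rules, modeled on the checks described in the lead-in
        if (topic == null) {
            throw new IllegalArgumentException("Topic cannot be null.");
        }
        if (partition != null && partition < 0) {
            throw new IllegalArgumentException("Invalid partition: " + partition);
        }
        if (timestamp != null && timestamp < 0) {
            throw new IllegalArgumentException("Invalid timestamp: " + timestamp);
        }
        this.topic = topic;
        this.partition = partition;
        this.timestamp = timestamp;
        this.key = key;
        this.value = value;
    }

    /** Convenience constructor: only topic and value, everything else unset. */
    SimpleRecord(String topic, String value) {
        this(topic, null, null, null, value);
    }

    public static void main(String[] args) {
        SimpleRecord r = new SimpleRecord("orders", "hello");
        System.out.println(r.topic + " partition=" + r.partition + " key=" + r.key);
    }
}
```

Leaving partition and key as null is what hands the routing decision to the partitioner described later.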

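Interceptors

Once the record is assembled, it first passes through the interceptors. By default the interceptor list is empty. If an interceptor throws an exception while processing a record, the exception is not propagated; it is only logged at warn level.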
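
The "log and continue" behavior can be modeled in a few lines. The sketch below is a simplified model, not Kafka's own interceptor container: InterceptorChain is a hypothetical name, records are reduced to plain strings, and a list of warning messages stands in for the warn-level log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Simplified model of an interceptor chain; names are illustrative only. */
final class InterceptorChain {
    private final List<UnaryOperator<String>> interceptors;
    final List<String> warnings = new ArrayList<>(); // stands in for warn-level logging

    InterceptorChain(List<UnaryOperator<String>> interceptors) {
        this.interceptors = interceptors;
    }

    /** Passes the value through every interceptor; a failing interceptor is skipped. */
    String onSend(String value) {
        String current = value;
        for (UnaryOperator<String> interceptor : interceptors) {
            try {
                current = interceptor.apply(current);
            } catch (RuntimeException e) {
                // Swallow the error and carry on with the last good value
                warnings.add("Error executing interceptor onSend callback: " + e.getMessage());
            }
        }
        return current;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> list = new ArrayList<>();
        list.add(v -> "logan-" + v);
        list.add(v -> { throw new IllegalStateException("boom"); });
        list.add(v -> v + "!");
        InterceptorChain chain = new InterceptorChain(list);
        System.out.println(chain.onSend("hello"));
        System.out.println(chain.warnings);
    }
}
```

The practical consequence is that a buggy interceptor cannot block a send, but it can also fail silently unless the warn log is monitored.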

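Custom Interceptor

As shown in the instantiation step of [Kafka Source Code] Kafka Producer Initialization Process, a custom interceptor only needs to implement the ProducerInterceptor interface.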

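public class CustomProducerInterceptor implements ProducerInterceptor<String, String> {

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // Prefix the message value with "logan-"
        String modifiedValue = "logan-" + record.value();
        return new ProducerRecord<>(record.topic(), record.partition(),
                        record.timestamp(), record.key(), modifiedValue, record.headers());
    }

    // The remaining interface methods are left as no-ops in this example
    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) { }

    @Override
    public void close() { }

    @Override
    public void configure(Map<String, ?> configs) { }
}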

Configuration

Once the custom interceptor is written, it must be registered in the configuration when the producer is defined.

public class LoganProducer {
    private static Properties initConfig() {
        Properties props = new Properties();
        // Register the custom interceptor
        props.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, CustomProducerInterceptor.class.getName());
        return props;
    }
}
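
For reference, the same settings can be written with plain string keys, which makes the underlying property names visible. This is a sketch under assumptions: the broker address localhost:9092 and the com.example package (including the second AuditInterceptor class) are placeholders, and ProducerConfigSketch is a hypothetical class name. The property keys themselves are the standard producer configuration names.

```java
import java.util.Properties;

/** Builds a producer configuration using only java.util.Properties. */
final class ProducerConfigSketch {
    static Properties initConfig() {
        Properties props = new Properties();
        // Assumed broker address; replace with your own cluster
        props.put("bootstrap.servers", "localhost:9092");
        // Serializers are mandatory producer settings
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Comma-separated list of interceptor class names (hypothetical package and classes)
        props.put("interceptor.classes",
                "com.example.CustomProducerInterceptor,com.example.AuditInterceptor");
        return props;
    }

    public static void main(String[] args) {
        Properties props = initConfig();
        System.out.println(props.getProperty("interceptor.classes"));
    }
}
```

The resulting Properties object is what would be passed to the KafkaProducer constructor.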

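Serializers

Messages travel over the network as bytes, so they must be serialized into byte arrays. The key and value serializers are required settings when defining a producer. Kafka ships with a number of serializers ready for use, including ones for byte arrays, long, short, double, and float.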

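public class StringSerializer implements Serializer<String> {

    private String encoding = "UTF8";

    public byte[] serialize(String topic, String data) {
        try {
            // Serialize the string into a byte array; a null string stays null
            return data == null ? null : data.getBytes(encoding);
        } catch (UnsupportedEncodingException e) {
            throw new SerializationException("Unsupported encoding " + encoding);
        }
    }

}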

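StringSerializer converts a string into a byte array using UTF-8 encoding. To write a custom serializer, implement the Serializer interface and register it in the producer configuration, just as with the interceptor.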
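
What "serializing to a byte array" means in practice can be shown without the Kafka library. In the sketch below, SimpleSerializer is a hypothetical, cut-down counterpart of Kafka's Serializer interface (it keeps only the serialize method), and the two implementations are illustrative rather than copies of Kafka's classes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, simplified counterpart of Kafka's Serializer interface. */
interface SimpleSerializer<T> {
    byte[] serialize(String topic, T data);
}

final class SerializerSketch {
    /** Strings become UTF-8 bytes; null stays null. */
    static final SimpleSerializer<String> STRING =
            (topic, data) -> data == null ? null : data.getBytes(StandardCharsets.UTF_8);

    /** Longs become 8 bytes in big-endian order (ByteBuffer's default byte order). */
    static final SimpleSerializer<Long> LONG =
            (topic, data) -> data == null
                    ? null
                    : ByteBuffer.allocate(Long.BYTES).putLong(data).array();

    public static void main(String[] args) {
        System.out.println(Arrays.toString(STRING.serialize("demo", "abc")));
        System.out.println(Arrays.toString(LONG.serialize("demo", 1L)));
    }
}
```

Using StandardCharsets.UTF_8 instead of a charset name avoids the checked UnsupportedEncodingException altogether, which is one reason a custom serializer can be shorter than the built-in one.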

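Partitioner

If the partition field of a ProducerRecord is not null, that partition number is used directly; otherwise the partitioner computes one. By default Kafka uses DefaultPartitioner, which applies the following rules:

If the key is not null, hash it with Utils.murmur2(keyBytes), make the result non-negative, and take it modulo the number of partitions.
If the key is null, cycle through the partitions in round-robin fashion.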

public class DefaultPartitioner implements Partitioner {

    public int partition(...) {

        // Number of partitions of the topic
        int numPartitions = cluster.partitionsForTopic(topic).size();

        if (keyBytes == null) {

            // Per-topic call counter;
            // Kafka uses it to cycle through the topic's partitions
            int nextValue = nextValue(topic);

            List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
            if (availablePartitions.size() > 0) {
                // The topic has available partitions: cycle through those
                int part = Utils.toPositive(nextValue) % availablePartitions.size();
                return availablePartitions.get(part).partition();
            } else {
                // No partition is available: cycle through all partitions
                return Utils.toPositive(nextValue) % numPartitions;
            }

        } else {
            // Hash of the key modulo the number of partitions
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }
    }

}
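
The decision order described above (explicit partition, then key hash, then counter) and the role of toPositive can be sketched in isolation. This is a simplified model with loud assumptions: PartitionChoiceSketch is a hypothetical class, Arrays.hashCode is only a stand-in for murmur2 (so the partition numbers differ from what Kafka would produce), and partition availability is ignored. The sign-bit mask is the same idea as Utils.toPositive.

```java
import java.util.Arrays;

/** Simplified model of partition selection; not Kafka's implementation. */
final class PartitionChoiceSketch {
    /** Clears the sign bit so the result is never negative. */
    static int toPositive(int number) {
        return number & 0x7fffffff;
    }

    /**
     * Picks a partition: an explicit partition wins, then the key hash, then the counter.
     * Arrays.hashCode stands in for murmur2 here.
     */
    static int choosePartition(Integer explicitPartition, byte[] keyBytes,
                               int counter, int numPartitions) {
        if (explicitPartition != null) {
            return explicitPartition;
        }
        if (keyBytes != null) {
            return toPositive(Arrays.hashCode(keyBytes)) % numPartitions;
        }
        return toPositive(counter) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] key = {1, 2, 3};
        System.out.println(choosePartition(2, key, 0, 3));       // explicit partition
        System.out.println(choosePartition(null, key, 0, 3));    // derived from the key
        System.out.println(choosePartition(null, null, 7, 3));   // derived from the counter
        // Math.abs cannot be used instead of the mask: it stays negative for MIN_VALUE
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0);
    }
}
```

The mask matters because a negative operand would make the % result negative, which is not a valid partition index. Because the same key always hashes to the same value, records with equal keys land on the same partition as long as the partition count does not change.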

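DefaultPartitioner uses an AtomicInteger to count how many times a partition has been requested for each topic, and keeps the topic-to-counter mapping in a ConcurrentMap. The counter starts at a random value. Each time the partitions are cycled, getAndIncrement() returns the current count and stores the count incremented by 1.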

public class DefaultPartitioner implements Partitioner {

    /**
     * Topic to call counter
     */
    private final ConcurrentMap<String, AtomicInteger> topicCounterMap =
        new ConcurrentHashMap<>();

    private int nextValue(String topic) {

        AtomicInteger counter = topicCounterMap.get(topic);

        if (null == counter) {

            // No counter for this topic yet: create one with a random initial value
            counter = new AtomicInteger(ThreadLocalRandom.current().nextInt());

            // Insert it; if another thread inserted one first, use the existing counter
            // (ConcurrentMap handles the concurrent access)
            AtomicInteger currentCounter = topicCounterMap.putIfAbsent(topic, counter);
            if (currentCounter != null) {
                counter = currentCounter;
            }
        }

        return counter.getAndIncrement();
    }
}
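
To see the round-robin effect of the counter, the following sketch replays the same idea outside Kafka. RoundRobinSketch is a hypothetical class. It departs from the excerpt in two labeled ways: it uses computeIfAbsent instead of get plus putIfAbsent, and it takes the initial counter value from a supplier so the demo can start at 0 and be reproducible (the excerpt starts at a random value).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Simplified per-topic round-robin counter; not Kafka's implementation. */
final class RoundRobinSketch {
    private final ConcurrentMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();
    private final IntSupplier initialValue;

    RoundRobinSketch(IntSupplier initialValue) {
        this.initialValue = initialValue;
    }

    /** Returns the next partition for the topic, cycling through 0..numPartitions-1. */
    int nextPartition(String topic, int numPartitions) {
        // computeIfAbsent creates the counter atomically on first use
        AtomicInteger counter = counters.computeIfAbsent(
                topic, t -> new AtomicInteger(initialValue.getAsInt()));
        // Mask the sign bit so the modulo result is never negative
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        RoundRobinSketch rr = new RoundRobinSketch(() -> 0); // fixed start for the demo
        for (int i = 0; i < 6; i++) {
            System.out.print(rr.nextPartition("orders", 3) + " ");
        }
        System.out.println();
    }
}
```

A random starting value, as in the excerpt, spreads the first record of different producers across partitions instead of always hitting the same one.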

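RecordAccumulator groups messages into batches, and the Sender thread wraps the batches into requests and sends them to the Kafka server. Both involve a fair amount of detail and will be covered in a separate article.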
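
Until then, a toy model may help explain the `result.batchIsFull || result.newBatchCreated` check in send(). Everything in this sketch is an assumption made for illustration: AccumulatorSketch and AppendResult are hypothetical names, batches are limited by record count (Kafka limits them by size in bytes), and drain() only hints at what a sender thread would do.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of per-partition batching; not Kafka's RecordAccumulator. */
final class AccumulatorSketch {
    static final class AppendResult {
        final boolean batchIsFull;
        final boolean newBatchCreated;

        AppendResult(boolean batchIsFull, boolean newBatchCreated) {
            this.batchIsFull = batchIsFull;
            this.newBatchCreated = newBatchCreated;
        }
    }

    private final int batchCapacity; // records per batch (assumption: count, not bytes)
    private final Map<Integer, Deque<List<String>>> batches = new HashMap<>();

    AccumulatorSketch(int batchCapacity) {
        this.batchCapacity = batchCapacity;
    }

    /** Appends a value to the newest batch of the partition, opening a new batch if needed. */
    synchronized AppendResult append(int partition, String value) {
        Deque<List<String>> queue = batches.computeIfAbsent(partition, p -> new ArrayDeque<>());
        List<String> last = queue.peekLast();
        boolean created = false;
        if (last == null || last.size() >= batchCapacity) {
            last = new ArrayList<>();
            queue.addLast(last);
            created = true;
        }
        last.add(value);
        // A batch is ready when an older batch is waiting or the newest one is full
        boolean full = queue.size() > 1 || last.size() >= batchCapacity;
        return new AppendResult(full, created);
    }

    /** What a sender thread would do: remove and return the oldest batch, or null. */
    synchronized List<String> drain(int partition) {
        Deque<List<String>> queue = batches.get(partition);
        return queue == null ? null : queue.pollFirst();
    }

    public static void main(String[] args) {
        AccumulatorSketch acc = new AccumulatorSketch(2);
        for (String v : new String[] {"a", "b", "c"}) {
            AppendResult r = acc.append(0, v);
            System.out.println(v + " full=" + r.batchIsFull + " new=" + r.newBatchCreated);
        }
        System.out.println(acc.drain(0));
    }
}
```

In this model the caller would wake the sender whenever either flag is true, which mirrors the condition in the send() excerpt at the top of the article.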