Kafka Producer Source Code
A Kafka producer can be started from several entry points: the kafka-console-producer.sh command under the bin directory, or client libraries such as the Java API and Python API.
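For comparison, here is what the Java API entry point looks like: a minimal sketch in which the class name is made up and the broker address and topic name are placeholders matching the console command below.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducerExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // placeholder broker address, matching the console example below
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // block until the broker acknowledges this one record
            producer.send(new ProducerRecord<>("topic_name", "key", "hello")).get();
        }
    }
}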
Here we take the kafka-console-producer.sh command as the entry point and use it to walk through the producer source code.
$ bin/kafka-console-producer.sh --broker-list 127.0.0.1:9092 --topic topic_name
After installing Kafka, this is the command you run from KAFKA_HOME to send messages typed on the command line to a given topic. Here is the code of kafka-console-producer.sh:
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
export KAFKA_HEAP_OPTS="-Xmx512M"
fi
exec $(dirname $0)/kafka-run-class.sh kafka.tools.ConsoleProducer "$@"
The core is the exec on the fourth line. $(dirname $0) resolves the directory containing kafka-console-producer.sh, i.e. KAFKA_HOME/bin, and kafka-run-class.sh in that directory is then executed, which, as the name suggests, runs the kafka.tools.ConsoleProducer class and forwards all of kafka-console-producer.sh's arguments to it ($@ expands to all of the arguments).
Next, let's find the code of kafka.tools.ConsoleProducer:
object ConsoleProducer {
def main(args: Array[String]): Unit = {
try {
// Wrap the arguments passed to kafka-console-producer.sh into a ProducerConfig object
val config = new ProducerConfig(args)
// Instantiate a MessageReader, used to read input from the command line
val reader = Class.forName(config.readerClass).getDeclaredConstructor().newInstance().asInstanceOf[MessageReader]
reader.init(System.in, getReaderProps(config))
// Build a KafkaProducer from the command-line arguments (broker-list, topic, etc.)
val producer = new KafkaProducer[Array[Byte], Array[Byte]](producerProps(config))
......
var record: ProducerRecord[Array[Byte], Array[Byte]] = null
do {
record = reader.readMessage()
if (record != null) {
// Send the message
send(producer, record, config.sync)
}
} while (record != null)
} catch {
......
}
Exit.exit(0)
}
}
main first wraps the incoming arguments into a ProducerConfig object; these arguments are exactly what was appended to the command, i.e. --broker-list 127.0.0.1:9092 --topic topic_name. The wrapping also performs some key validation and value type conversion. Next, a MessageReader is instantiated to listen for command-line input. MessageReader is an interface; the actual instance here is LineMessageReader, which implements readMessage:
override def readMessage() = {
lineNumber += 1
print(">")
(reader.readLine(), parseKey) match {
case (null, _) => null
case (line, true) =>
line.indexOf(keySeparator) match {
case -1 =>
if (ignoreError) new ProducerRecord(topic, line.getBytes(StandardCharsets.UTF_8))
else throw new KafkaException(s"No key found on line $lineNumber: $line")
case n =>
val value = (if (n + keySeparator.size > line.size) "" else line.substring(n + keySeparator.size)).getBytes(StandardCharsets.UTF_8)
new ProducerRecord(topic, line.substring(0, n).getBytes(StandardCharsets.UTF_8), value)
}
case (line, false) =>
new ProducerRecord(topic, line.getBytes(StandardCharsets.UTF_8))
}
}
This is LineMessageReader's readMessage implementation. It first prints a > character, which is why every input line on the console is prefixed with that prompt. The user's input is then parsed (splitting out a key when parseKey is enabled) and wrapped into a ProducerRecord, which mainly carries the topic, the message key and the message value.
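For readers less used to Scala, here is a rough Java equivalent of just that parsing step; the class and method names are made up and this is not the real LineMessageReader, only a sketch of its logic.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LineParseSketch {
    // Illustrative rewrite of LineMessageReader's parsing: split a console line into key and value.
    public static ProducerRecord<byte[], byte[]> parse(String topic, String line,
                                                       boolean parseKey, String keySeparator) {
        if (!parseKey) {
            // key parsing disabled: the whole line becomes the value, the key stays null
            return new ProducerRecord<>(topic, line.getBytes(StandardCharsets.UTF_8));
        }
        int n = line.indexOf(keySeparator);
        if (n < 0) {
            throw new IllegalArgumentException("No key found on line: " + line);
        }
        byte[] key = line.substring(0, n).getBytes(StandardCharsets.UTF_8);
        byte[] value = line.substring(n + keySeparator.length()).getBytes(StandardCharsets.UTF_8);
        return new ProducerRecord<>(topic, key, value);
    }
}

For example, starting the console producer with --property parse.key=true --property key.separator=: and then typing user1:hello yields a record whose key is user1 and whose value is hello.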
Back in kafka.tools.ConsoleProducer's main method, the do-while loop keeps reading user input, wrapping each line into a ProducerRecord, and then calls send to ship it:
private def send(producer: KafkaProducer[Array[Byte], Array[Byte]],
record: ProducerRecord[Array[Byte], Array[Byte]], sync: Boolean): Unit = {
// Ultimately the message is still sent through the KafkaProducer's send method
if (sync)
producer.send(record).get()
else
producer.send(record, new ErrorLoggingCallback(record.topic, record.key, record.value, false))
}
The send helper only decides whether the message is sent synchronously or asynchronously; either way, the record ultimately goes out through KafkaProducer's send method. Back in main, the KafkaProducer itself is constructed by turning the arguments (broker-list and so on) into a set of Properties, combining the user-supplied values with a number of defaults. Here is KafkaProducer's constructor:
KafkaProducer(Map<String, Object> configs,
Serializer<K> keySerializer,
Serializer<V> valueSerializer,
ProducerMetadata metadata,
KafkaClient kafkaClient,
ProducerInterceptors interceptors,
Time time) {
ProducerConfig config = new ProducerConfig(ProducerConfig.addSerializerToConfig(configs, keySerializer,
valueSerializer));
try {
// Collect the user-provided configuration entries
Map<String, Object> userProvidedConfigs = config.originals();
this.producerConfig = config;
this.time = time;
// Use the transactional id supplied by the user, if any; otherwise null
String transactionalId = userProvidedConfigs.containsKey(ProducerConfig.TRANSACTIONAL_ID_CONFIG) ?
(String) userProvidedConfigs.get(ProducerConfig.TRANSACTIONAL_ID_CONFIG) : null;
// Configure the partitioner
this.partitioner = config.getConfiguredInstance(ProducerConfig.PARTITIONER_CLASS_CONFIG, Partitioner.class);
// Configure the backoff between send retries
long retryBackoffMs = config.getLong(ProducerConfig.RETRY_BACKOFF_MS_CONFIG);
// Configure the key and value serializers
if (keySerializer == null) {
this.keySerializer = config.getConfiguredInstance(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
Serializer.class);
this.keySerializer.configure(config.originals(), true);
} else {
config.ignore(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG);
this.keySerializer = keySerializer;
}
if (valueSerializer == null) {
this.valueSerializer = config.getConfiguredInstance(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
Serializer.class);
this.valueSerializer.configure(config.originals(), false);
} else {
config.ignore(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG);
this.valueSerializer = valueSerializer;
}
// Load the interceptors
userProvidedConfigs.put(ProducerConfig.CLIENT_ID_CONFIG, clientId);
ProducerConfig configWithClientId = new ProducerConfig(userProvidedConfigs, false);
List<ProducerInterceptor<K, V>> interceptorList = (List) configWithClientId.getConfiguredInstances(
ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, ProducerInterceptor.class);
if (interceptors != null)
this.interceptors = interceptors;
else
this.interceptors = new ProducerInterceptors<>(interceptorList);
ClusterResourceListeners clusterResourceListeners = configureClusterResourceListeners(keySerializer,
valueSerializer, interceptorList, reporters);
this.maxRequestSize = config.getInt(ProducerConfig.MAX_REQUEST_SIZE_CONFIG);
this.totalMemorySize = config.getLong(ProducerConfig.BUFFER_MEMORY_CONFIG);
this.compressionType = CompressionType.forName(config.getString(ProducerConfig.COMPRESSION_TYPE_CONFIG));
this.maxBlockTimeMs = config.getLong(ProducerConfig.MAX_BLOCK_MS_CONFIG);
this.transactionManager = configureTransactionState(config, logContext, log);
int deliveryTimeoutMs = configureDeliveryTimeout(config, log);
// Create the record accumulator
this.accumulator = new RecordAccumulator(logContext,
config.getInt(ProducerConfig.BATCH_SIZE_CONFIG),
this.compressionType,
lingerMs(config),
retryBackoffMs,
deliveryTimeoutMs,
metrics,
PRODUCER_METRIC_GROUP_NAME,
time,
apiVersions,
transactionManager,
new BufferPool(this.totalMemorySize, config.getInt(ProducerConfig.BATCH_SIZE_CONFIG), metrics, time, PRODUCER_METRIC_GROUP_NAME));
List<InetSocketAddress> addresses = ClientUtils.parseAndValidateAddresses(
config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG),
config.getString(ProducerConfig.CLIENT_DNS_LOOKUP_CONFIG));
if (metadata != null) {
this.metadata = metadata;
} else {
this.metadata = new ProducerMetadata(retryBackoffMs,
config.getLong(ProducerConfig.METADATA_MAX_AGE_CONFIG),
logContext,
clusterResourceListeners,
Time.SYSTEM);
this.metadata.bootstrap(addresses);
}
......
} catch (Throwable t) {
......
}
}
In KafkaProducer's constructor, the producer configuration (config) is built first, and the user-provided entries (userProvidedConfigs) are kept separately, mainly so that user-supplied values take precedence when settings are resolved. The partitioner is then configured, defaulting to org.apache.kafka.clients.producer.internals.DefaultPartitioner, followed by the key and value serializers and the interceptors. Partitioner, serializers and interceptors can all be supplied by the user; when they are not, the defaults are used. There is little to say about the serializers, and an interceptor simply gets one chance to process each record before it is sent, as sketched below; after that, let's talk about the partitioner.
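As a minimal illustration, a custom interceptor could look like the sketch below; the class name is made up, and it would be registered through the interceptor.classes config.

import java.util.Map;
import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class HeaderAddingInterceptor implements ProducerInterceptor<byte[], byte[]> {
    @Override
    public ProducerRecord<byte[], byte[]> onSend(ProducerRecord<byte[], byte[]> record) {
        // called once per record, before it reaches the serializers and the partitioner
        record.headers().add("traced", new byte[] {1});
        return record;
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        // called when the broker acknowledges the record, or when the send fails
    }

    @Override
    public void close() { }

    @Override
    public void configure(Map<String, ?> configs) { }
}

Now back to the partitioner: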
public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
if (keyBytes == null) {
return stickyPartitionCache.partition(topic, cluster);
}
List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
int numPartitions = partitions.size();
// hash the keyBytes to choose a partition
return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
}
This is the partition method of org.apache.kafka.clients.producer.internals.DefaultPartitioner; every record's target partition is computed here. The method first checks whether keyBytes is null. If it is, stickyPartitionCache's partition method is used; otherwise the partition number is computed as Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions. MurmurHash is used because it is computationally cheap and has a low collision rate. In short: if the message carries a key, it is routed to the partition derived from that key, so messages with the same key generally end up in the same partition; if there is no key, a partition is picked from the metadata and cached locally so that subsequent keyless messages reuse it (the metadata is the producer's view of topics, partition counts and so on, fetched from the Kafka cluster and refreshed periodically).
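The "same key, same partition" behaviour can be checked by doing the hash computation by hand. Below is a small sketch using Kafka's own org.apache.kafka.common.utils.Utils helpers; the key and the partition count are made-up example values, since the real count comes from the cluster metadata.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

public class KeyPartitionSketch {
    public static void main(String[] args) {
        int numPartitions = 3;  // example value only
        byte[] keyBytes = "order-42".getBytes(StandardCharsets.UTF_8);
        // same computation as DefaultPartitioner: murmur2 hash, forced positive, modulo the partition count
        int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        // prints the same partition number every run, as long as the key and partition count don't change
        System.out.println(partition);
    }
}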
Back in the KafkaProducer constructor, after the partitioner, interceptors and serializers are set up, a series of further settings follow, such as the maximum request size and the compression type. Then comes something very important: the record accumulator (RecordAccumulator), which collects messages into batches and sends them to the cluster together, reducing network overhead and improving send throughput.
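For reference, the accumulator's batching behaviour is driven by a handful of producer configs. Here is a minimal sketch of setting them; the values are purely illustrative, not recommendations.

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class BatchingConfigSketch {
    public static Properties batchingProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32 * 1024);            // target size of one ProducerBatch, in bytes
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);                    // how long a not-yet-full batch may wait before being sent
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 64L * 1024 * 1024); // total memory available to the accumulator
        return props;
    }
}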
That covers KafkaProducer's constructor; next, its send method.
public Future<RecordMetadata> send(ProducerRecord<K, V> record) {
return send(record, null);
}
public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback) {
// Before a message is sent it first passes through the interceptors
// The interceptor list is empty by default, so nothing is intercepted; a custom interceptor implements ProducerInterceptor, and each record is processed in its onSend method
ProducerRecord<K, V> interceptedRecord = this.interceptors.onSend(record);
return doSend(interceptedRecord, callback);
}
private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
TopicPartition tp = null;
try {
// First make sure the metadata is up to date
ClusterAndWaitTime clusterAndWaitTime;
try {
clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
} catch (KafkaException e) {
if (metadata.isClosed())
throw new KafkaException("Producer closed while send in progress", e);
throw e;
}
long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
Cluster cluster = clusterAndWaitTime.cluster;
byte[] serializedKey;
// Serialize the key and the value with the configured serializers
try {
serializedKey = keySerializer.serialize(record.topic(), record.headers(), record.key());
} catch (ClassCastException cce) {
throw new SerializationException("Can't convert key of class " + record.key().getClass().getName() +
" to class " + producerConfig.getClass(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG).getName() +
" specified in key.serializer", cce);
}
byte[] serializedValue;
try {
serializedValue = valueSerializer.serialize(record.topic(), record.headers(), record.value());
} catch (ClassCastException cce) {
throw new SerializationException("Can't convert value of class " + record.value().getClass().getName() +
" to class " + producerConfig.getClass(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG).getName() +
" specified in value.serializer", cce);
}
// Compute the partition; with no key, the default partitioner falls back to stickyPartitionCache.partition(topic, cluster)
int partition = partition(record, serializedKey, serializedValue, cluster);
// Build a TopicPartition from the topic and the chosen partition
tp = new TopicPartition(record.topic(), partition);
setReadOnly(record.headers());
Header[] headers = record.headers().toArray();
// Estimate the serialized size of the record (key, value and headers), taking possible compression into account
int serializedSize = AbstractRecords.estimateSizeInBytesUpperBound(apiVersions.maxUsableProduceMagic(),
compressionType, serializedKey, serializedValue, headers);
// Make sure the record does not exceed maxRequestSize or totalMemorySize (both configurable)
ensureValidRecordSize(serializedSize);
// Use the record's timestamp if one was provided, otherwise the current time
long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp();
// producer callback will make sure to call both 'callback' and interceptor callback
Callback interceptCallback = new InterceptorCallback<>(callback, this.interceptors, tp);
if (transactionManager != null && transactionManager.isTransactional()) {
transactionManager.failIfNotReadyForSend();
}
// Append the record to the RecordAccumulator
RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey,
serializedValue, headers, interceptCallback, remainingWaitMs, true);
// About append's last argument, abortOnNewBatch:
// if the record does not fit into the last ProducerBatch of this topic-partition's Deque, abort and retry with a new batch (the partitioner may pick a new partition)
if (result.abortForNewBatch) {
int prevPartition = partition;
partitioner.onNewBatch(record.topic(), cluster, prevPartition);
partition = partition(record, serializedKey, serializedValue, cluster);
tp = new TopicPartition(record.topic(), partition);
if (log.isTraceEnabled()) {
log.trace("Retrying append due to new batch creation for topic {} partition {}. The old partition was {}", record.topic(), partition, prevPartition);
}
// producer callback will make sure to call both 'callback' and interceptor callback
interceptCallback = new InterceptorCallback<>(callback, this.interceptors, tp);
result = accumulator.append(tp, timestamp, serializedKey,
serializedValue, headers, interceptCallback, remainingWaitMs, false);
}
// Check whether transactional bookkeeping is needed
if (transactionManager != null && transactionManager.isTransactional())
transactionManager.maybeAddPartitionToTransaction(tp);
// If a ProducerBatch in the accumulator is full, or a new one was created, wake up the sender thread to ship it
if (result.batchIsFull || result.newBatchCreated) {
log.trace("Waking up the sender since topic {} partition {} is either full or getting a new batch", record.topic(), partition);
this.sender.wakeup();
}
return result.future;
}
// catch......
}
The first send overload takes no callback and simply delegates to the second.
The second overload accepts a callback, first runs the record through the interceptors, and finally calls doSend.
doSend is the core sending method. It first checks that the metadata is valid, refreshing it if it is missing or stale. The key and value are then serialized, and the partition is computed from the topic, the serialized key and value, and the cluster information, deciding which partition of the topic this record should go to; the topic and partition number are combined into a TopicPartition. The record (topic-partition, key, value, timestamp and so on) is then appended to the record accumulator (covered in detail below). After the append, doSend checks whether the accumulator has a full batch ready to go; if so it wakes up the sender thread, whose sole job is to send messages to the cluster.
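The future stored in result.future is what send ultimately hands back to the caller; once the batch has been acknowledged it resolves to a RecordMetadata describing where the record ended up. A minimal sketch of consuming it (the helper name is made up):

import java.util.concurrent.Future;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class SendResultSketch {
    public static void sendAndPrint(KafkaProducer<byte[], byte[]> producer,
                                    ProducerRecord<byte[], byte[]> record) throws Exception {
        Future<RecordMetadata> future = producer.send(record);
        RecordMetadata metadata = future.get();   // blocks until the broker acknowledges the batch
        // the metadata reports the partition the partitioner chose and the offset assigned by the broker
        System.out.println(metadata.topic() + "-" + metadata.partition() + "@" + metadata.offset());
    }
}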
Let's look at what the accumulator does during an append:
public RecordAppendResult append(TopicPartition tp,
long timestamp,
byte[] key,
byte[] value,
Header[] headers,
Callback callback,
long maxTimeToBlock,
boolean abortOnNewBatch) throws InterruptedException {
try {
// Get or create the Deque for this topic-partition; each topic-partition combination maps to exactly one Deque
Deque<ProducerBatch> dq = getOrCreateDeque(tp);
synchronized (dq) {
if (closed)
throw new KafkaException("Producer closed while send in progress");
// Try to append to the last ProducerBatch in the Deque
RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
// Append succeeded
if (appendResult != null)
return appendResult;
}
// we don't have an in-progress record batch try to allocate a new batch
if (abortOnNewBatch) {
// Return a result that will cause another call to append.
return new RecordAppendResult(null, false, false, true);
}
byte maxUsableMagic = apiVersions.maxUsableProduceMagic();
int size = Math.max(this.batchSize, AbstractRecords.estimateSizeInBytesUpperBound(maxUsableMagic, compression, key, value, headers));
log.trace("Allocating a new {} byte message buffer for topic {} partition {}", size, tp.topic(), tp.partition());
buffer = free.allocate(size, maxTimeToBlock);
synchronized (dq) {
// Need to check if producer is closed again after grabbing the dequeue lock.
if (closed)
throw new KafkaException("Producer closed while send in progress");
RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
if (appendResult != null) {
// Somebody else found us a batch, return the one we waited for! Hopefully this doesn't happen often...
return appendResult;
}
MemoryRecordsBuilder recordsBuilder = recordsBuilder(buffer, maxUsableMagic);
ProducerBatch batch = new ProducerBatch(tp, recordsBuilder, time.milliseconds());
FutureRecordMetadata future = Objects.requireNonNull(batch.tryAppend(timestamp, key, value, headers,
callback, time.milliseconds()));
dq.addLast(batch);
incomplete.add(batch);
// Don't deallocate this buffer in the finally block as it's being used in the record batch
buffer = null;
return new RecordAppendResult(future, dq.size() > 1 || batch.isFull(), true, false);
}
} finally {
if (buffer != null)
free.deallocate(buffer);
appendsInProgress.decrementAndGet();
}
}
private RecordAppendResult tryAppend(long timestamp, byte[] key, byte[] value, Header[] headers,
Callback callback, Deque<ProducerBatch> deque) {
// Peek at the last ProducerBatch in the Deque
ProducerBatch last = deque.peekLast();
if (last != null) {
// Try to append the record to the last ProducerBatch; returns null if it does not fit
FutureRecordMetadata future = last.tryAppend(timestamp, key, value, headers, callback, time.milliseconds());
// The record does not fit into the last ProducerBatch, so close that batch for further appends
if (future == null)
last.closeForRecordAppends();
// Append succeeded
else
return new RecordAppendResult(future, deque.size() > 1 || last.isFull(), false, false);
}
return null;
}
At the start of append, the TopicPartition is used to get a Deque<ProducerBatch>: in the producer, every partition of a topic has its own Deque, and a ProducerBatch is the object that holds a batch of records. Once the Deque is obtained, the method tries to append the message to it. The tryAppend call implements this: it peeks at the Deque's last ProducerBatch (a ProducerBatch can be thought of as a collection of ProducerRecords) and tries to add the record to it. The append is subject to a size limit: if the message is too large it cannot go into that ProducerBatch, but if the batch can still hold this ProducerRecord the append goes through. As tryAppend shows, when future == null (the batch cannot hold the record), that ProducerBatch is closed for further appends; otherwise a RecordAppendResult is returned. The remainder of append follows the same logic, allocating a buffer, creating a fresh ProducerBatch, appending into it and adding it to the Deque, so it is not repeated here.
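To build intuition for that structure, here is a heavily simplified toy model of the accumulator's per-partition Deques; all names are made up and most of the real logic (buffer pooling, futures, callbacks) is omitted.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.common.TopicPartition;

public class AccumulatorSketch {
    static class Batch {
        int usedBytes;
        final int capacity;
        Batch(int capacity) { this.capacity = capacity; }
        boolean tryAppend(int recordSize) {
            if (usedBytes + recordSize > capacity) return false; // this batch cannot hold the record
            usedBytes += recordSize;
            return true;
        }
    }

    private final Map<TopicPartition, Deque<Batch>> batches = new ConcurrentHashMap<>();
    private final int batchSize;

    AccumulatorSketch(int batchSize) { this.batchSize = batchSize; }

    void append(TopicPartition tp, int recordSize) {
        Deque<Batch> dq = batches.computeIfAbsent(tp, k -> new ArrayDeque<>());
        synchronized (dq) {
            Batch last = dq.peekLast();                              // always try the newest batch first
            if (last == null || !last.tryAppend(recordSize)) {
                // the record did not fit: open a new batch, sized at least as large as the record
                Batch fresh = new Batch(Math.max(batchSize, recordSize));
                fresh.tryAppend(recordSize);
                dq.addLast(fresh);                                   // appends go to the tail ...
            }
        }
    }

    Batch drainOne(TopicPartition tp) {
        Deque<Batch> dq = batches.get(tp);
        if (dq == null) return null;
        synchronized (dq) {
            return dq.pollFirst();                                   // ... while the sender drains from the head
        }
    }
}

The one property the sketch preserves is the asymmetry: records are appended to the tail of a partition's Deque, while the sender later drains batches from its head.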
Back to the last lines of doSend: when result.batchIsFull || result.newBatchCreated holds, it means some Deque contains a full ProducerBatch waiting to be sent, so the sender thread is woken up to send it. Here is the core of the sender thread:
private long sendProducerData(long now) {
// Fetch the metadata
Cluster cluster = metadata.fetch();
// Ask the accumulator which nodes have a Deque whose first ProducerBatch is ready to send
RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);
// If any partition has no known leader, force a metadata refresh
if (!result.unknownLeaderTopics.isEmpty()) {
// The set of topics with unknown leader contains topics with leader election pending as well as
// topics which may have expired. Add the topic again to metadata to ensure it is included
// and request metadata update, since there are messages to send to the topic.
for (String topic : result.unknownLeaderTopics)
this.metadata.add(topic);
log.debug("Requesting metadata update due to unknown leader topics from the batched records: {}",
result.unknownLeaderTopics);
this.metadata.requestUpdate();
}
// Remove nodes that are not ready to receive requests
Iterator<Node> iter = result.readyNodes.iterator();
long notReadyTimeout = Long.MAX_VALUE;
while (iter.hasNext()) {
Node node = iter.next();
if (!this.client.ready(node, now)) {
iter.remove();
notReadyTimeout = Math.min(notReadyTimeout, this.client.pollDelayMs(node, now));
}
}
// create produce requests
Map<Integer, List<ProducerBatch>> batches = this.accumulator.drain(cluster, result.readyNodes, this.maxRequestSize, now);
addToInflightBatches(batches);
if (guaranteeMessageOrder) {
// Mute all the partitions drained
for (List<ProducerBatch> batchList : batches.values()) {
for (ProducerBatch batch : batchList)
this.accumulator.mutePartition(batch.topicPartition);
}
}
accumulator.resetNextBatchExpiryTime();
List<ProducerBatch> expiredInflightBatches = getExpiredInflightBatches(now);
List<ProducerBatch> expiredBatches = this.accumulator.expiredBatches(now);
expiredBatches.addAll(expiredInflightBatches);
// Reset the producer id if an expired batch has previously been sent to the broker. Also update the metrics
// for expired batches. see the documentation of @TransactionState.resetProducerId to understand why
// we need to reset the producer id here.
if (!expiredBatches.isEmpty())
log.trace("Expired {} batches in accumulator", expiredBatches.size());
for (ProducerBatch expiredBatch : expiredBatches) {
String errorMessage = "Expiring " + expiredBatch.recordCount + " record(s) for " + expiredBatch.topicPartition
+ ":" + (now - expiredBatch.createdMs) + " ms has passed since batch creation";
failBatch(expiredBatch, -1, NO_TIMESTAMP, new TimeoutException(errorMessage), false);
if (transactionManager != null && expiredBatch.inRetry()) {
// This ensures that no new batches are drained until the current in flight batches are fully resolved.
transactionManager.markSequenceUnresolved(expiredBatch.topicPartition);
}
}
sensors.updateProduceRequestMetrics(batches);
// If we have any nodes that are ready to send + have sendable data, poll with 0 timeout so this can immediately
// loop and try sending more data. Otherwise, the timeout will be the smaller value between next batch expiry
// time, and the delay time for checking data availability. Note that the nodes may have data that isn't yet
// sendable due to lingering, backing off, etc. This specifically does not include nodes with sendable data
// that aren't ready to send since they would cause busy looping.
long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
pollTimeout = Math.min(pollTimeout, this.accumulator.nextExpiryTimeMs() - now);
pollTimeout = Math.max(pollTimeout, 0);
if (!result.readyNodes.isEmpty()) {
log.trace("Nodes with data ready to send: {}", result.readyNodes);
// if some partitions are already ready to be sent, the select time would be 0;
// otherwise if some partition already has some data accumulated but not ready yet,
// the select time will be the time difference between now and its linger expiry time;
// otherwise the select time will be the time difference between now and the metadata expiry time;
pollTimeout = 0;
}
sendProduceRequests(batches, now);
return pollTimeout;
}
The sender thread first asks the accumulator which ProducerBatches are ready to send. Here is the corresponding ready method:
public ReadyCheckResult ready(Cluster cluster, long nowMs) {
Set<Node> readyNodes = new HashSet<>();
long nextReadyCheckDelayMs = Long.MAX_VALUE;
Set<String> unknownLeaderTopics = new HashSet<>();
boolean exhausted = this.free.queued() > 0;
for (Map.Entry<TopicPartition, Deque<ProducerBatch>> entry : this.batches.entrySet()) {
Deque<ProducerBatch> deque = entry.getValue();
synchronized (deque) {
// Peek at the first ProducerBatch in the Deque
ProducerBatch batch = deque.peekFirst();
if (batch != null) {
TopicPartition part = entry.getKey();
Node leader = cluster.leaderFor(part);
if (leader == null) {
// The leader for this topic-partition is unknown, so add the topic to unknownLeaderTopics
unknownLeaderTopics.add(part.topic());
} else if (!readyNodes.contains(leader) && !isMuted(part, nowMs)) {
long waitedTimeMs = batch.waitedTimeMs(nowMs);
boolean backingOff = batch.attempts() > 0 && waitedTimeMs < retryBackoffMs;
long timeToWaitMs = backingOff ? retryBackoffMs : lingerMs;
boolean full = deque.size() > 1 || batch.isFull();
boolean expired = waitedTimeMs >= timeToWaitMs;
boolean sendable = full || expired || exhausted || closed || flushInProgress();
if (sendable && !backingOff) {
readyNodes.add(leader);
} else {
long timeLeftMs = Math.max(timeToWaitMs - waitedTimeMs, 0);
// Note that this results in a conservative estimate since an un-sendable partition may have
// a leader that will later be found to have sendable data. However, this is good enough
// since we'll just wake up and then sleep again for the remaining time.
nextReadyCheckDelayMs = Math.min(timeLeftMs, nextReadyCheckDelayMs);
}
}
}
}
}
return new ReadyCheckResult(readyNodes, nextReadyCheckDelayMs, unknownLeaderTopics);
}
ready walks over every Deque in the accumulator, peeks at its first ProducerBatch, and checks whether the leader node for that ProducerBatch (in effect, for that partition) is available. If it is, and the batch is sendable, the leader is added to readyNodes; if the leader is unknown, the topic is added to unknownLeaderTopics.
Back in the sender's sendProducerData: for the topics in unknownLeaderTopics whose leader is missing, a metadata refresh is requested, and nodes that are genuinely not ready are then removed. The accumulator's drain call packs all batches to be sent into a Map<Integer, List<ProducerBatch>>, whose key is the broker id and whose value is every ProducerBatch destined for that broker (one node may host several partitions). After some further bookkeeping, such as adding the batches to the in-flight cache (inFlightBatches) and handling expired batches, sendProduceRequests finally ships the data to the cluster.
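The shape of that drained result can be pictured with a small illustrative helper; this is not real Kafka code, and the real value type is ProducerBatch rather than a wildcard.

import java.util.List;
import java.util.Map;

public class DrainResultSketch {
    public static void describe(Map<Integer, List<?>> batchesPerNode) {
        for (Map.Entry<Integer, List<?>> entry : batchesPerNode.entrySet()) {
            int brokerId = entry.getKey();            // key: id of the leader broker (node)
            int batchCount = entry.getValue().size(); // value: every batch whose partition leader lives on that broker
            System.out.println("broker " + brokerId + " receives " + batchCount + " batch(es) in one produce request");
        }
    }
}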
We won't dig further into the send path from here; it mostly involves reading the cluster configuration to decide which machine to send to and pushing the bytes out through KafkaChannel and Selector, which work much like Java NIO. Interested readers can explore that part on their own.
Summary
The overall Kafka producer flow is:
- The message passes through the interceptors
- The message is processed by the serializers
- The partitioner computes the partition number for the message
- The message is handed to the record accumulator to wait for sending
The sending side adds the following logic:
- A message is first appended to the Deque that the accumulator keeps for its partition
- Each append first looks at the last ProducerBatch in the Deque: if the message fits, it is added; if not, a new ProducerBatch is created
- When a ProducerBatch fills up, or a new one had to be created because the message did not fit, the Sender thread is woken up to do the sending
- The Sender first collects the available leader nodes and the unknown-leader topics from the accumulator; for unknown leaders it forces a metadata refresh, and it then drops the nodes that are genuinely not ready
- The Sender always takes the first ProducerBatch from a Deque, so data is appended at the tail and taken from the head
- The Sender assembles everything to be sent into a Map<Integer, List<ProducerBatch>> keyed by broker id