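This post covers usage only, not how partition assignment works; a later post will go into that.
Question
If a consumer subscribes to a new topic, it has no committed offset to look up. In that case Kafka decides where to start consuming based on the consumer client parameter auto.offset.reset.
# The default is latest: start consuming from the end of the partition
auto.offset.reset = "latest"
# Start from the earliest offset still available in the partition (this is 0 only if no old messages have been deleted)
auto.offset.reset = "earliest"
# Throw NoOffsetForPartitionException
auto.offset.reset = "none"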
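The three settings above can be pictured with a small, library-free model. This is only a sketch of the decision, not Kafka's actual code: the class `OffsetResetSketch`, the method `startPosition`, and the parameters `logStart` and `logEnd` are hypothetical names chosen for illustration, and an `IllegalStateException` stands in for the exception the real client throws.

```java
import java.util.OptionalLong;

/** Simplified model of how a starting position is chosen; hypothetical, not Kafka's code. */
class OffsetResetSketch {

    /**
     * @param committed   committed offset of the group for this partition, empty if none exists
     * @param logStart    earliest offset still available in the partition
     * @param logEnd      offset the next produced message would receive
     * @param resetPolicy value of auto.offset.reset: "latest", "earliest" or "none"
     */
    static long startPosition(OptionalLong committed, long logStart, long logEnd, String resetPolicy) {
        // An existing committed offset is used as-is; the policy is not consulted.
        if (committed.isPresent()) {
            return committed.getAsLong();
        }
        // No committed offset (for example, a newly subscribed topic): the policy decides.
        switch (resetPolicy) {
            case "latest":
                return logEnd;
            case "earliest":
                return logStart;
            case "none":
                // Stand-in for the exception raised by the real client in this case.
                throw new IllegalStateException("no committed offset and auto.offset.reset=none");
            default:
                throw new IllegalArgumentException("unknown reset policy: " + resetPolicy);
        }
    }

    public static void main(String[] args) {
        // Assumed partition state: offsets 120..499 are available, the next message gets 500.
        System.out.println(startPosition(OptionalLong.empty(), 120, 500, "latest"));
        System.out.println(startPosition(OptionalLong.empty(), 120, 500, "earliest"));
        System.out.println(startPosition(OptionalLong.of(300), 120, 500, "latest"));
    }
}
```

In this model "earliest" returns `logStart` rather than a fixed 0, which matches the case where older messages have already been removed from the partition.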
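After subscribing to a topic, the consumer pulls messages with the poll() method. Two things have to be settled during this process:
- how the consumer is assigned specific TopicPartitions
- for a given TopicPartition, where the consumer starts consuming
Consuming from a specified offset
The second question can be handled with the seek() method that Kafka provides.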
import java.time.Duration;
import java.util.Collections;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
public class SeekConsumer {
    public static final String brokerList = "****";
    public static final String topic = "kafka_demo_analysis";
    /** Name of the consumer group */
    public static final String groupId = "kafka-learner";

    public static Properties initConfig() {
        Properties properties = new Properties();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokerList);
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, groupId); // consumer group
        properties.put(ConsumerConfig.CLIENT_ID_CONFIG, "0");
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        return properties;
    }

    public static void main(String[] args) {
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(SeekConsumer.initConfig());
        consumer.subscribe(Collections.singleton(topic));
        /*
         * Note: at this point the consumer has only subscribed to the topic and has not been
         * assigned any partitions yet, so calling seek() right away throws an exception.
         */
        //consumer.seek(new TopicPartition(topic, 100), 10);
        Set<TopicPartition> assignment = new HashSet<>();
        // Keep polling until the group coordinator has assigned partitions to this consumer
        while (assignment.isEmpty()) {
            consumer.poll(Duration.ofMillis(1000));
            assignment = consumer.assignment();
        }
        /* Once partitions have been assigned, use seek() to set the starting offset for each of them */
        for (TopicPartition tp : assignment) {
            consumer.seek(tp, 10);
        }
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
            //TODO
        }
    }
}
Rebalance
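A rebalance is the transfer of ownership of a partition from one consumer to another; while it is in progress, consumers cannot read messages. Once a partition has been reassigned to another consumer, the consumption state the previous owner held for it is lost. If offsets were not committed in time, the messages that had already been consumed are consumed a second time. In general, unnecessary rebalances should be avoided.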
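The duplicate consumption described above can be illustrated with a small simulation that does not use the Kafka client at all. It is only a sketch under simplifying assumptions (a single partition, consecutive offsets); the class `RebalanceDuplicateSketch` and the method `reprocessedOffsets` are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one partition changing owner during a rebalance; hypothetical, not Kafka code. */
class RebalanceDuplicateSketch {

    /**
     * Returns the offsets that are processed twice when the partition moves to a new owner.
     *
     * @param lastCommitted  offset stored for the group before the rebalance
     * @param processedUpTo  next offset the old owner would have processed
     * @param commitOnRevoke whether the old owner commits its progress when the partition is revoked
     */
    static List<Long> reprocessedOffsets(long lastCommitted, long processedUpTo, boolean commitOnRevoke) {
        // Committing on revoke brings the stored offset up to the real progress.
        long committed = commitOnRevoke ? processedUpTo : lastCommitted;
        List<Long> duplicates = new ArrayList<>();
        // The new owner resumes from the committed offset, so every offset between
        // that point and the old owner's real progress is delivered again.
        for (long offset = committed; offset < processedUpTo; offset++) {
            duplicates.add(offset);
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Old owner committed 5 but had already processed offsets 5, 6 and 7.
        System.out.println(reprocessedOffsets(5, 8, false)); // prints [5, 6, 7]
        System.out.println(reprocessedOffsets(5, 8, true));  // prints []
    }
}
```

The second call corresponds to committing inside a rebalance callback: the stored offset matches the real progress, so the new owner has nothing to repeat.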
import java.time.Duration;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import dhu.tonghao.kafka.KafkaConsumerAnalysis;
public class RebalanceListenerConsumer {
    public static void main(String[] args) {
        // Offsets waiting to be committed, keyed by partition
        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(KafkaConsumerAnalysis.initConfig());
        consumer.subscribe(Collections.singletonList(KafkaConsumerAnalysis.topic), new ConsumerRebalanceListener() {
            // Called before a rebalance; committing offsets here prevents unnecessary duplicate consumption
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                consumer.commitSync(offsets);
                offsets.clear();
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                //TODO
            }
        });
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
                for (ConsumerRecord<String, String> record : records) {
                    // process the record
                    offsets.put(new TopicPartition(record.topic(), record.partition()),
                            new OffsetAndMetadata(record.offset() + 1));
                }
                consumer.commitAsync(offsets, null);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            consumer.close();
        }
    }
}
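The listing above stores `offset() + 1` rather than the offset of the record itself, because a committed offset names the next message to read, not the last one processed. A minimal, library-free sketch of that bookkeeping follows; the class `CommitOffsetSketch`, the method `markProcessed`, and the use of a plain partition number as the map key are assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy bookkeeping of offsets to commit per partition; hypothetical, not Kafka code. */
class CommitOffsetSketch {

    /** Notes that the record at {@code offset} in {@code partition} has been fully processed. */
    static void markProcessed(Map<Integer, Long> toCommit, int partition, long offset) {
        // The value to commit is the position of the NEXT record to read, hence offset + 1.
        toCommit.put(partition, offset + 1);
    }

    public static void main(String[] args) {
        Map<Integer, Long> toCommit = new HashMap<>();
        markProcessed(toCommit, 0, 41);
        markProcessed(toCommit, 0, 42); // a later record in the same partition overwrites the entry
        markProcessed(toCommit, 1, 7);
        // A consumer resuming partition 0 from this entry starts at 43 and does not re-read 42.
        System.out.println("partition 0 resumes at " + toCommit.get(0));
        System.out.println("partition 1 resumes at " + toCommit.get(1));
    }
}
```

Storing the plain offset instead would make the next owner of the partition read the last processed record one more time.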