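- Requirement: read messages from Kafka with multiple threads and collect them into a list.
- Initial approach: start one thread per partition, have each thread read asynchronously with its own consumer, and merge all the results at the end.
- Problem encountered:
1. Each consumer reads a fixed number of messages; no new messages arrive while reading.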
    // factory, topicList, pool and currentOffsets are defined elsewhere
    CompletableFuture<ArrayList<Object>> future = CompletableFuture.supplyAsync(() -> {
        ArrayList<Object> dataList = new ArrayList<>();
        try (Consumer consumer = factory.createConsumer()) {
            consumer.subscribe(topicList);
            int count = 0;
            // No exit condition: this loop only ends if an exception is thrown
            while (true) {
                // poll(long) is deprecated in newer clients, so a Duration is used here
                ConsumerRecords<Object, Object> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<Object, Object> record : records) {
                    Object data = record.value();
                    // Remember the next offset to commit for this record's partition
                    currentOffsets.put(new TopicPartition(record.topic(), record.partition()),
                            new OffsetAndMetadata(record.offset() + 1, "no metadata"));
                    if (count % 10000 == 0) {
                        consumer.commitAsync(currentOffsets, null);
                    }
                    count++;
                    dataList.add(data);
                }
                consumer.commitAsync();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return dataList;
    }, pool);
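As you can see, `return dataList` is never reached in normal operation. What I want is for each consumer to read the messages of its partition and return a list of data, and then to merge the lists from all consumers at the end. But how should the while loop be controlled?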
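Since no new messages arrive during the read, one possible way to bound the loop is to snapshot each partition's end offset before reading and stop once the read position reaches it. The sketch below shows only that control flow and the final merge. `FakePartition`, `readToEnd` and `readAll` are hypothetical names, and the in-memory list is a stand-in for a real partition. With the real client, the snapshot would presumably come from `consumer.endOffsets(...)` and the progress check from `consumer.position(...)`, most likely after binding each consumer to its partition with `assign()` rather than `subscribe()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BoundedPartitionReader {

    /** Hypothetical in-memory stand-in for one partition's log. */
    static final class FakePartition {
        private final List<String> log;

        FakePartition(List<String> log) {
            this.log = log;
        }

        /** Offset one past the last record, like an end offset. */
        long endOffset() {
            return log.size();
        }

        /** Returns up to max records starting at fromOffset, like a single poll. */
        List<String> poll(long fromOffset, int max) {
            int from = (int) Math.min(fromOffset, log.size());
            int to = Math.min(from + max, log.size());
            return new ArrayList<>(log.subList(from, to));
        }
    }

    /** Reads one partition until the end offset captured at the start. */
    static List<String> readToEnd(FakePartition partition) {
        long end = partition.endOffset(); // snapshot taken once, before reading
        long position = 0;
        List<String> data = new ArrayList<>();
        while (position < end) { // bounded: stops when the snapshot is reached
            List<String> batch = partition.poll(position, 2);
            data.addAll(batch);
            position += batch.size();
        }
        return data;
    }

    /** Runs one task per partition and merges the per-partition lists. */
    static List<String> readAll(List<FakePartition> partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<CompletableFuture<List<String>>> futures = new ArrayList<>();
            for (FakePartition p : partitions) {
                futures.add(CompletableFuture.supplyAsync(() -> readToEnd(p), pool));
            }
            List<String> all = new ArrayList<>();
            for (CompletableFuture<List<String>> f : futures) {
                all.addAll(f.join()); // waits for that partition's task to finish
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<FakePartition> partitions = List.of(
                new FakePartition(List.of("a", "b", "c")),
                new FakePartition(List.of("d", "e")));
        System.out.println(readAll(partitions));
    }
}
```

Because the end offset is captured once, the loop is guaranteed to terminate even if the topic later receives more data, and each future completes with its own list. Joining the futures in order keeps the merged list grouped by partition.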