The Right Way to Integrate Kafka with Spring Boot, Part 1 (The Basics)


Scope:
This article aims only at getting Spring Boot talking to Kafka and consuming messages. It deliberately skips deeper Kafka usage and more involved operations, such as multi-threaded concurrent consumption, consuming from a specified offset, committing offsets manually, and the right approach for very large data volumes.


Versions:
kafka 2.3.0
spring-boot 2.2.2
spring-kafka 2.3.4 (per the official docs, spring-kafka 2.3.x is built on kafka-clients 2.3.1)

1. Relying entirely on Spring Boot auto-configuration

  • Configuration
-- Core dependencies
-- spring-kafka
        <dependency>
            <groupId>org.springframework.kafka</groupId>
            <artifactId>spring-kafka</artifactId>
            <version>2.3.4.RELEASE</version>
        </dependency>
-- spring-boot
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.2.2.RELEASE</version>
    </parent>
### Kafka configuration (application.yml)
spring:
  kafka:
    bootstrap-servers: 192.168.1.5:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: spt1
      enable-auto-commit: true
      auto-commit-interval: 1000
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      # Kafka offset mechanism: the consumed offset of each partition is stored, so even after a crash or a rebalance-triggered partition reassignment, consumption resumes from the correct position
      # latest: if a committed offset exists, start from it; otherwise consume only newly produced data
      # earliest: if a committed offset exists, start from it; otherwise consume from the beginning
      # none: if a committed offset exists, start from it; if any partition lacks a committed offset, throw an exception
      auto-offset-reset: latest

  • Producer

    @Component
    public class MyProducer {

        private final KafkaTemplate<String, String> kafkaTemplate;

        @Autowired
        public MyProducer(KafkaTemplate<String, String> kafkaTemplate) {
            this.kafkaTemplate = kafkaTemplate;
        }

        public void send(String msg, String topic) {
            kafkaTemplate.send(topic, msg);
        }

    }
    
  • Consumer

    @Component
    public class MyConsumer {

        @KafkaListener(topics = "someTopic")
        public void processMessage(String content) {
            // ...
        }

    }
    
  • Characteristics

    Configuration is minimal, which makes this approach good for quickly standing up simple consumption tasks, but it leaves little room for customization.
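
To verify the round trip end to end, the two beans can be wired together in a runner. A minimal sketch (the MessageLoopback class name is an illustrative assumption; "someTopic" reuses the topic from the consumer above):

    @Component
    public class MessageLoopback implements CommandLineRunner {

        private final MyProducer producer;

        public MessageLoopback(MyProducer producer) {
            this.producer = producer;
        }

        @Override
        public void run(String... args) {
            // Send one test message; MyConsumer's @KafkaListener on "someTopic" should receive it shortly after startup
            producer.send("hello from auto-configuration", "someTopic");
        }
    }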

2. Registering the Kafka beans manually instead of using Spring Boot auto-configuration

Advantages:

  • Multi-threaded producers
  • Multi-threaded consumers
  • Substantially higher Kafka throughput

Configuration file (application.properties)

spring.profiles.active=dev
# Kafka configuration for the test environment
spring.kafka.test.bootstrap-servers=192.168.1.5:9092
spring.kafka.test.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.test.producer.value-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.test.consumer.group-id=spt1
spring.kafka.test.consumer.enable-auto-commit=true
spring.kafka.test.consumer.auto-commit-interval=1000
spring.kafka.test.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.test.consumer.value-deserializer=org.apache.kafka.common.serialization.StringDeserializer
# Kafka offset mechanism: the consumed offset of each partition is stored, so even after a crash or a rebalance-triggered partition reassignment, consumption resumes from the correct position
# latest: if a committed offset exists, start from it; otherwise consume only newly produced data
# earliest: if a committed offset exists, start from it; otherwise consume from the beginning
# none: if a committed offset exists, start from it; if any partition lacks a committed offset, throw an exception
spring.kafka.test.consumer.auto-offset-reset=latest


Core configuration class

@Configuration
@Data
@Profile({"prod","dev"})
public class KafkaEnvBeanConfiguration {

    @Value("${spring.kafka.test.bootstrap-servers}")
    private String bootstrapServers;

    @Value("${spring.kafka.test.consumer.key-deserializer}")
    private String consumerDk;

    @Value("${spring.kafka.test.consumer.value-deserializer}")
    private String consumerDv;

    @Value("${spring.kafka.test.producer.key-serializer}")
    private String producerDk;

    @Value("${spring.kafka.test.producer.value-serializer}")
    private String producerDv;

    // Concurrent listener container factory: multi-threaded consumption
    @Bean("concurrentKafkaListenerContainerFactory")
    KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>>
    kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setConcurrency(3);
        factory.getContainerProperties().setPollTimeout(3000);
        return factory;
    }

    // Concurrent listener container factory: multi-threaded consumption, receiving messages in batches
    @Bean("concurrentKafkaListenerContainerBatchFactory")
    KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>>
    kafkaListenerContainerBatchFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setConcurrency(3);
        factory.setBatchListener(true);
        factory.getContainerProperties().setPollTimeout(3000);
        return factory;
    }

    // Consumer factory
    public ConsumerFactory<String, Object> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }
    
    // Consumer configuration
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,consumerDk);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,consumerDv);
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 50);
        return props;
    }

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfigs());
    }
	
    // Producer configuration
    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,bootstrapServers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, producerDk);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, producerDv);
        return props;
    }


    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<String, String>(producerFactory());
    }

    // KafkaAdmin picks up every NewTopic bean in the context and creates the topics on the broker at startup
    @Bean
    public KafkaAdmin admin() {
        Map<String, Object> configs = new HashMap<>();
        configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        return new KafkaAdmin(configs);
    }
    
    @Bean
    public NewTopic topic1() {
        // compact() creates "thing1" as a log-compacted topic
        return TopicBuilder.name("thing1")
                .partitions(10)
                .replicas(1)
                .compact()
                .build();
    }


    @Bean
    public NewTopic topic2() {
        return TopicBuilder.name("thing2")
                .partitions(10)
                .replicas(1)
                .build();
    }

    @Bean
    public NewTopic topic3() {
        return TopicBuilder.name("thing3")
                .partitions(10)
                .replicas(1)
                .build();
    }


  • Consumer class (demonstrating the different consumption modes)
@Component
@Profile({"prod","dev"})
public class TestListener {

    // Multi-threaded consumption; the concurrency set here overrides the factory's original value.
    // autoStartup = "false" means this listener must be started programmatically (see the sketch after this class)
    @KafkaListener(id = "id_1", topics = "test", containerFactory = "concurrentKafkaListenerContainerFactory", concurrency = "${listen.concurrency:3}", groupId = "multiThread", autoStartup = "false")
    public void listenTestConcurrent(ConsumerRecord<String, String> consumerRecord) {
        System.out.println(consumerRecord.value());
    }

    // Single-threaded consumption. Per the official docs, the concurrent container delegates to 1..n
    // single-threaded containers, so a concurrency of 1 is effectively equivalent to the original minimal configuration
    @KafkaListener(id = "id_2", topics = "test", containerFactory = "concurrentKafkaListenerContainerFactory", concurrency = "${listen.concurrency:1}", groupId = "single")
    public void listenTestSingle(ConsumerRecord<String, String> consumerRecord) {
        System.out.println(consumerRecord.value());
    }

    // Batch consumption (listener ids must be unique across the application)
    @KafkaListener(id = "id_4", topics = "test", containerFactory = "concurrentKafkaListenerContainerBatchFactory", groupId = "batch")
    public void listenTestBatch(List<ConsumerRecord<?, ?>> records) {
        System.out.println(records.size());
    }



    // Partition-assigned batch consumption: same group, different partitions
    @KafkaListener(id = "id_3", topicPartitions = { @TopicPartition(topic = "thing3", partitions = "0") }, groupId = "fq", containerFactory = "concurrentKafkaListenerContainerBatchFactory")
    public void listenTestSinglePartitionOne(List<ConsumerRecord<String, String>> consumerRecords) {
        System.out.println(consumerRecords.size());
    }

    // Partition-assigned batch consumption: same group, different partitions
    @KafkaListener(id = "id_6", topicPartitions = { @TopicPartition(topic = "thing3", partitions = "1") }, groupId = "fq", containerFactory = "concurrentKafkaListenerContainerBatchFactory")
    public void listenTestSinglePartitionTwo(List<ConsumerRecord<String, String>> consumerRecords) {
        System.out.println(consumerRecords.size());
    }
}
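
The first listener above is declared with autoStartup = "false", so it stays idle until started by hand. A minimal sketch of starting it by its listener id (the ListenerStarter class name is an illustrative assumption; KafkaListenerEndpointRegistry is the standard spring-kafka registry bean):

    @Component
    public class ListenerStarter {

        @Autowired
        private KafkaListenerEndpointRegistry registry;

        // Starts the container that was registered under the @KafkaListener id "id_1"
        public void startConcurrentListener() {
            registry.getListenerContainer("id_1").start();
        }
    }
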
  • Producer

    KafkaTemplate is itself a wrapper around KafkaProducer, and KafkaProducer's implementation is thread-safe, so the producer behaves much the same in single-threaded and multi-threaded use. Starting with version 2.3, spring-kafka added a setProducerPerThread(true|false) method to DefaultKafkaProducerFactory, which keeps a separate KafkaProducer instance in each thread's ThreadLocal; this mitigates the latency that one thread's flush() call would otherwise impose on the other threads sharing a single instance. Note that when this switch is on, you must explicitly call closeThreadBoundProducer() to release the instance held in the ThreadLocal.

    The producer configuration, changed to per-thread mode:

        @Bean
        public ProducerFactory<String, String> producerFactory() {
            DefaultKafkaProducerFactory<String, String> factory = new DefaultKafkaProducerFactory<>(producerConfigs());
            factory.setProducerPerThread(true);
            return factory;
        }

        public Map<String, Object> producerConfigs() {
            Map<String, Object> props = new HashMap<>();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, producerDk);
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, producerDv);
            return props;
        }
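
Because closeThreadBoundProducer() must be called explicitly, a per-thread sender could look roughly like the following sketch (the PerThreadSender class, the thread pool, and reusing topic "thing3" are illustrative assumptions, not code from the original):

    @Component
    public class PerThreadSender {

        private final KafkaTemplate<String, String> kafkaTemplate;
        private final ProducerFactory<String, String> producerFactory;
        private final ExecutorService pool = Executors.newFixedThreadPool(4);

        public PerThreadSender(KafkaTemplate<String, String> kafkaTemplate,
                               ProducerFactory<String, String> producerFactory) {
            this.kafkaTemplate = kafkaTemplate;
            this.producerFactory = producerFactory;
        }

        public void sendAsync(String msg) {
            pool.submit(() -> {
                try {
                    kafkaTemplate.send("thing3", msg);
                    kafkaTemplate.flush(); // flushes only this thread's own producer
                } finally {
                    // Mandatory cleanup when producerPerThread is enabled,
                    // otherwise the ThreadLocal-bound producer is never released
                    if (producerFactory instanceof DefaultKafkaProducerFactory) {
                        ((DefaultKafkaProducerFactory<String, String>) producerFactory).closeThreadBoundProducer();
                    }
                }
            });
        }
    }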
    

A quickly written producer; just inject the KafkaTemplate instance:

@Component
@Profile({"prod","dev"})
public class KafkaSender implements CommandLineRunner {
    @Autowired
    private KafkaTemplate<String,String> kafkaTemplate;

    @Override
    public void run(String... args) throws Exception {
        // Demo only: sends "Hello World" in a tight infinite loop and prints the offset of each acknowledged record
        for (;;) {
            ListenableFuture<SendResult<String, String>> send = kafkaTemplate.send("thing3", "Hello World");
            send.addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
                @Override
                public void onSuccess(SendResult<String, String> result) {
                    RecordMetadata recordMetadata = result.getRecordMetadata();
                    System.out.println(recordMetadata.offset());
                }
                @Override
                public void onFailure(Throwable ex) {
                    System.out.println("fail");
                }
            });
        }
    }
}

Conclusion

This article stayed at the integration level, covering coarse-grained ways to hook Kafka into Spring Boot; it did not touch concrete application scenarios or Kafka's many finer points. Follow-up posts will explore those in more depth as time permits.