Apache Kafka in Practice: A Distributed Messaging System from Beginner to Expert
Table of Contents
- 1. Introduction to Kafka
- 2. Kafka Core Architecture
- 3. Environment Setup and Configuration
- 4. Producers in Practice
- 5. Consumers in Practice
- 6. Message Reliability Guarantees
- 7. Partitions and Load Balancing
- 8. Stream Processing with Kafka Streams
- 9. Spring Kafka Integration
- 10. Production Scenarios
- 11. Performance Tuning Best Practices
- 12. Monitoring and Operations
- 13. Summary
1. Introduction to Kafka
Apache Kafka is a distributed streaming platform originally developed and open-sourced by LinkedIn. Known for high throughput, low latency, high availability, and scalability, it is one of the most widely used messaging systems today.
1.1 Core Features
- High throughput: up to millions of messages per second on a single machine
- Low latency: millisecond-level delivery
- Durability: messages are persisted to disk
- Distributed: supports clustering and horizontal scaling
- Fault tolerance: replication keeps data available
- Ordering guarantee: strict message order within a partition
- Replayable consumption: messages can be re-read from any offset
1.2 Core Concepts
Kafka core concepts
Producer
    │
    │ sends messages
    ▼
┌─────────────────────────────────────────┐
│            Kafka Cluster                │
│                                         │
│  Topic: user-events                     │
│  ┌────────────────────────────────┐     │
│  │ Partition 0 [msg1, msg2, ...]  │     │
│  ├────────────────────────────────┤     │
│  │ Partition 1 [msg3, msg4, ...]  │     │
│  ├────────────────────────────────┤     │
│  │ Partition 2 [msg5, msg6, ...]  │     │
│  └────────────────────────────────┘     │
│                                         │
└─────────────────────────────────────────┘
    │
    │ consumes messages
    ▼
Consumer Group
├─ Consumer 1 (reads P0)
├─ Consumer 2 (reads P1)
└─ Consumer 3 (reads P2)
Key terms:
- Topic: a named message stream/queue
- Partition: a physical shard of a topic
- Offset: a message's position within its partition
- Broker: a Kafka server node
- Consumer Group: a set of consumers that share a topic's partitions
- Replication: partition copies that keep data highly available
1.3 Typical Use Cases
- Message queue: decoupling systems, asynchronous processing, absorbing traffic spikes
- Log aggregation: collecting scattered logs for centralized storage and analysis
- User behavior tracking: collecting clickstream data for real-time analysis
- Stream processing: real-time computation and data pipelines
- Event sourcing: the backbone of event-driven architectures
- Metrics and monitoring: collecting system metrics for real-time alerting
2. Kafka Core Architecture
2.1 Overall Architecture
Kafka cluster architecture
ZooKeeper Cluster
┌──────────────────┐
│ Metadata │
│ - Topics │
│ - Partitions │
│ - Brokers │
│ - Controller │
└────────┬─────────┘
│
┌────────────────────┼────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Broker 1│ │ Broker 2│ │ Broker 3│
│ │ │ │ │ │
│ Topic A │ │ Topic A │ │ Topic B │
│ P0(L) │◀───────▶│ P0(F) │◀───────▶│ P0(L) │
│ P1(F) │ │ P1(L) │ │ P1(F) │
│ │ │ │ │ │
└─────────┘ └─────────┘ └─────────┘
▲ ▲ ▲
│ │ │
│ │ │
Producers Producers Consumers
L = Leader replica
F = Follower replica
2.2 Message Storage Layout
Partition storage layout
Topic: user-events
├── Partition 0
│ ├── 00000000000000000000.log (Segment 1)
│ ├── 00000000000000000000.index
│ ├── 00000000000000100000.log (Segment 2)
│ ├── 00000000000000100000.index
│ └── ...
├── Partition 1
│ ├── 00000000000000000000.log
│ ├── 00000000000000000000.index
│ └── ...
└── Partition 2
└── ...
Segment file layout:
┌─────────────────────────────────────────┐
│ Offset: 0 | Key | Value | Timestamp │
├─────────────────────────────────────────┤
│ Offset: 1 | Key | Value | Timestamp │
├─────────────────────────────────────────┤
│ Offset: 2 | Key | Value | Timestamp │
├─────────────────────────────────────────┤
│ ... │
└─────────────────────────────────────────┘
Key properties:
1. Sequential appends, which are fast even on spinning disks
2. Zero-copy transfer (sendfile) avoids copying data between kernel and user space
3. Messages are immutable; the log is append-only
4. Sparse index files make offset lookups fast (see the consumer sketch below)
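Offset-based lookup is visible from the client side as well: a consumer can jump to any retained offset and replay from there. A minimal sketch (topic, partition, and target offset are illustrative):
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class SeekExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("user-events", 0);
            consumer.assign(Collections.singletonList(tp)); // manual assignment, no group
            consumer.seek(tp, 100L);                        // jump straight to offset 100
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r ->
                System.out.printf("offset=%d value=%s%n", r.offset(), r.value()));
        }
    }
}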
2.3 Replication
Leader-follower replication
Topic: orders, Partition: 0, Replication factor: 3
Broker 1 (Leader) Broker 2 (Follower) Broker 3 (Follower)
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Offset: 0 │ │ Offset: 0 │ │ Offset: 0 │
│ Offset: 1 │───────▶│ Offset: 1 │─────▶│ Offset: 1 │
│ Offset: 2 │ │ Offset: 2 │ │ Offset: 2 │
│ Offset: 3 (NEW) │ │ │ │ │
└──────────────────┘ └──────────────────┘ └──────────────────┘
│ │ │
│ │ │
HW=2 (high watermark)            ISR (In-Sync Replicas)
Flow:
1. The producer writes to the leader
2. The leader appends the message to its local log
3. Followers fetch new data from the leader
4. Each follower appends to its local log and acknowledges
5. When all ISR members have acknowledged, the leader advances the HW
6. Consumers can only read messages below the HW
Benefits:
- No data loss once a message is committed
- Automatic failover when the leader fails
- Consumers may fetch from the nearest replica (KIP-392, Kafka 2.4+; see the snippet below)
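Follower fetching is opt-in on the consumer side: if brokers are configured with broker.rack, a consumer that sets a matching client.rack is served by the closest in-sync replica. A minimal sketch on top of the consumer configuration from section 5.1 ("rack-1" is a placeholder value):
// Assumes brokers set broker.rack; "rack-1" is illustrative
props.put(ConsumerConfig.CLIENT_RACK_CONFIG, "rack-1"); // fetch from the nearest in-sync replica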
3. Environment Setup and Configuration
3.1 Quick Start with Docker
# docker-compose.yml
version: '3'
services:
zookeeper:
image: confluentinc/cp-zookeeper:7.5.0
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000
ports:
- "2181:2181"
kafka:
image: confluentinc/cp-kafka:7.5.0
depends_on:
- zookeeper
ports:
- "9092:9092"
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
# Start the stack
docker-compose up -d
# Create a topic
docker exec -it kafka kafka-topics --create \
--topic user-events \
--bootstrap-server localhost:9092 \
--partitions 3 \
--replication-factor 1
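To smoke-test the setup end to end, the console tools shipped in the image can produce and consume a few messages (tool names as found on the Confluent image's PATH):
# Produce test messages (Ctrl+C to exit)
docker exec -it kafka kafka-console-producer \
--topic user-events \
--bootstrap-server localhost:9092
# Read them back from the beginning
docker exec -it kafka kafka-console-consumer \
--topic user-events \
--bootstrap-server localhost:9092 \
--from-beginning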
3.2 Maven Dependencies
<!-- pom.xml -->
<properties>
<kafka.version>3.6.0</kafka.version>
<spring.kafka.version>3.0.12</spring.kafka.version>
<jackson.version>2.15.3</jackson.version>
</properties>
<dependencies>
<!-- Kafka client -->
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>${kafka.version}</version>
</dependency>
<!-- Kafka Streams -->
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>${kafka.version}</version>
</dependency>
<!-- Spring Kafka (optional) -->
<dependency>
<groupId>org.springframework.kafka</groupId>
<artifactId>spring-kafka</artifactId>
<version>${spring.kafka.version}</version>
</dependency>
<!-- JSON serialization -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>${jackson.version}</version>
</dependency>
<!-- Logging -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>2.0.9</version>
</dependency>
</dependencies>
4. Producers in Practice
4.1 A Basic Producer
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;
/**
 * Basic Kafka producer examples
*/
public class BasicProducerExample {
private static final String BOOTSTRAP_SERVERS = "localhost:9092";
private static final String TOPIC = "user-events";
/**
 * Build the producer configuration
*/
public static Properties createProducerConfig() {
Properties props = new Properties();
        // Required settings
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
        // Throughput settings
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
        // Reliability settings
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.RETRIES_CONFIG, 3);
        // Keep in-flight requests <= 5 (and enable idempotence, see 6.1)
        // so retries cannot reorder messages within a partition
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
return props;
}
/**
 * Send messages synchronously
*/
public static void sendSync() {
try (KafkaProducer<String, String> producer =
new KafkaProducer<>(createProducerConfig())) {
for (int i = 0; i < 10; i++) {
String key = "user_" + i;
String value = "event_data_" + i;
ProducerRecord<String, String> record =
new ProducerRecord<>(TOPIC, key, value);
try {
                    // Synchronous send: block until the broker responds
RecordMetadata metadata = producer.send(record).get();
System.out.printf(
"消息发送成功 - Topic: %s, Partition: %d, Offset: %d\n",
metadata.topic(),
metadata.partition(),
metadata.offset()
);
} catch (Exception e) {
System.err.println("消息发送失败: " + e.getMessage());
}
}
}
}
/**
 * Send messages asynchronously (recommended)
*/
public static void sendAsync() {
try (KafkaProducer<String, String> producer =
new KafkaProducer<>(createProducerConfig())) {
for (int i = 0; i < 10; i++) {
String key = "user_" + i;
String value = "event_data_" + i;
ProducerRecord<String, String> record =
new ProducerRecord<>(TOPIC, key, value);
                // Asynchronous send with a completion callback
producer.send(record, new Callback() {
@Override
public void onCompletion(RecordMetadata metadata,
Exception exception) {
if (exception == null) {
System.out.printf(
"消息发送成功 - Partition: %d, Offset: %d\n",
metadata.partition(),
metadata.offset()
);
} else {
System.err.println("消息发送失败: " +
exception.getMessage());
}
}
});
}
            // Make sure all buffered messages are sent
producer.flush();
}
}
/**
 * Send a message to a specific partition
*/
public static void sendToPartition() {
try (KafkaProducer<String, String> producer =
new KafkaProducer<>(createProducerConfig())) {
            // Send to partition 0 explicitly
ProducerRecord<String, String> record =
new ProducerRecord<>(TOPIC, 0, "key1", "value1");
producer.send(record, (metadata, exception) -> {
if (exception == null) {
System.out.println("消息已发送到分区: " +
metadata.partition());
}
});
producer.flush();
}
}
public static void main(String[] args) {
System.out.println("=== 同步发送 ===");
sendSync();
System.out.println("\n=== 异步发送 ===");
sendAsync();
System.out.println("\n=== 发送到指定分区 ===");
sendToPartition();
}
}
4.2 A Custom Serializer
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.Serializer;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Map;
import java.util.Properties;
/**
 * User event entity
*/
class UserEvent {
private String userId;
private String eventType;
private String page;
private long timestamp;
public UserEvent() {}
public UserEvent(String userId, String eventType,
String page, long timestamp) {
this.userId = userId;
this.eventType = eventType;
this.page = page;
this.timestamp = timestamp;
}
// Getters and Setters
public String getUserId() { return userId; }
public void setUserId(String userId) { this.userId = userId; }
public String getEventType() { return eventType; }
public void setEventType(String eventType) { this.eventType = eventType; }
public String getPage() { return page; }
public void setPage(String page) { this.page = page; }
public long getTimestamp() { return timestamp; }
public void setTimestamp(long timestamp) { this.timestamp = timestamp; }
@Override
public String toString() {
return String.format(
"UserEvent{userId='%s', eventType='%s', page='%s', timestamp=%d}",
userId, eventType, page, timestamp
);
}
}
/**
 * Custom JSON serializer
*/
class JsonSerializer<T> implements Serializer<T> {
private final ObjectMapper objectMapper = new ObjectMapper();
@Override
public void configure(Map<String, ?> configs, boolean isKey) {
        // No extra configuration needed
}
@Override
public byte[] serialize(String topic, T data) {
if (data == null) {
return null;
}
try {
return objectMapper.writeValueAsBytes(data);
} catch (Exception e) {
throw new RuntimeException("JSON 序列化失败", e);
}
}
@Override
public void close() {
        // Nothing to clean up
}
}
/**
 * Using the custom serializer
*/
public class CustomSerializerProducer {
public static void sendUserEvents() {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
JsonSerializer.class.getName());
try (KafkaProducer<String, UserEvent> producer =
new KafkaProducer<>(props)) {
for (int i = 0; i < 10; i++) {
UserEvent event = new UserEvent(
"user_" + (i % 5),
"page_view",
"/home",
System.currentTimeMillis()
);
ProducerRecord<String, UserEvent> record =
new ProducerRecord<>("user-events", event.getUserId(), event);
producer.send(record, (metadata, exception) -> {
if (exception == null) {
System.out.println("事件已发送: " + event);
} else {
System.err.println("发送失败: " + exception.getMessage());
}
});
}
producer.flush();
}
}
public static void main(String[] args) {
sendUserEvents();
}
}
4.3 Partitioning Strategies
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.List;
import java.util.Map;
import java.util.Properties;
/**
 * Custom partitioner
 * Strategy: VIP users go to a fixed partition; everyone else is hashed
*/
public class CustomPartitioner implements Partitioner {
    private static final int VIP_PARTITION = 0;
@Override
public int partition(String topic, Object key, byte[] keyBytes,
Object value, byte[] valueBytes, Cluster cluster) {
List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
int numPartitions = partitions.size();
if (keyBytes == null) {
            // No key: pick a random partition
return (int) (Math.random() * numPartitions);
}
String keyString = new String(keyBytes);
        // Route VIP users to the fixed partition
if (keyString.startsWith("vip_")) {
            return VIP_PARTITION;
}
        // Hash all other keys across the partitions
return Math.abs(keyString.hashCode()) % numPartitions;
}
@Override
public void close() {}
@Override
public void configure(Map<String, ?> configs) {}
}
/**
 * Using the custom partitioner
*/
class CustomPartitionerExample {
public static void main(String[] args) {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
        // Register the custom partitioner
props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG,
CustomPartitioner.class.getName());
try (KafkaProducer<String, String> producer =
new KafkaProducer<>(props)) {
            // VIP user: always lands on partition 0
producer.send(new ProducerRecord<>("user-events",
"vip_user_001", "VIP event"));
            // Regular user: assigned by hash
producer.send(new ProducerRecord<>("user-events",
"user_001", "Normal event"));
producer.flush();
}
}
}
5. Consumers in Practice
5.1 A Basic Consumer
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
/**
 * Basic Kafka consumer examples
*/
public class BasicConsumerExample {
private static final String BOOTSTRAP_SERVERS = "localhost:9092";
private static final String TOPIC = "user-events";
private static final String GROUP_ID = "user-event-consumer-group";
/**
 * Build the consumer configuration
*/
public static Properties createConsumerConfig() {
Properties props = new Properties();
        // Required settings
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS);
props.put(ConsumerConfig.GROUP_ID_CONFIG, GROUP_ID);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
        // Auto-commit settings
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
        // Where to start when there is no committed offset
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        // Fetch tuning
props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024);
props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);
return props;
}
/**
 * Consume with auto-committed offsets
*/
public static void consumeAutoCommit() {
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(createConsumerConfig())) {
            // Subscribe to the topic
consumer.subscribe(Collections.singletonList(TOPIC));
System.out.println("开始消费消息...");
while (true) {
                // Poll for records
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
System.out.printf(
"收到消息 - Topic: %s, Partition: %d, Offset: %d, " +
"Key: %s, Value: %s\n",
record.topic(),
record.partition(),
record.offset(),
record.key(),
record.value()
);
}
}
}
}
/**
 * Commit offsets manually (recommended)
*/
public static void consumeManualCommit() {
Properties props = createConsumerConfig();
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(props)) {
consumer.subscribe(Collections.singletonList(TOPIC));
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    try {
                        // Process the record
                        processRecord(record);
                        // Commit exactly this record's offset; a bare commitSync()
                        // here would commit the position of the entire polled batch
                        consumer.commitSync(Collections.singletonMap(
                            new TopicPartition(record.topic(), record.partition()),
                            new OffsetAndMetadata(record.offset() + 1)));
                        System.out.printf("Processed and committed - Offset: %d\n",
                            record.offset());
                    } catch (Exception e) {
                        System.err.println("Processing failed: " + e.getMessage());
                        // Offset not committed: stop so the batch is re-polled from here
                        break;
                    }
}
}
}
}
/**
 * Commit offsets in batches
*/
public static void consumeBatchCommit() {
Properties props = createConsumerConfig();
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(props)) {
consumer.subscribe(Collections.singletonList(TOPIC));
            int messageCount = 0;
            final int BATCH_SIZE = 100;
            Map<TopicPartition, OffsetAndMetadata> pending = new HashMap<>();
            while (true) {
                ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    processRecord(record);
                    // Track this record's offset for the next batch commit
                    pending.put(
                        new TopicPartition(record.topic(), record.partition()),
                        new OffsetAndMetadata(record.offset() + 1));
                    messageCount++;
                    // Commit once for every 100 processed records
                    if (messageCount % BATCH_SIZE == 0) {
                        consumer.commitSync(pending);
                        pending.clear();
                        System.out.println("Committed after " + messageCount + " records");
                    }
                }
}
}
}
/**
 * Subscribe to multiple topics
*/
public static void consumeMultipleTopics() {
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(createConsumerConfig())) {
            // Subscribe to several topics at once
consumer.subscribe(
java.util.Arrays.asList("user-events", "order-events")
);
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
System.out.printf("Topic: %s, Message: %s\n",
record.topic(), record.value());
}
}
}
}
private static void processRecord(ConsumerRecord<String, String> record) {
        // Simulated business logic
        System.out.printf("Processing: %s\n", record.value());
}
public static void main(String[] args) {
        // Pick one of the variants to run
consumeManualCommit();
}
}
5.2 Consumer Groups and Load Balancing
/**
 * Consumer group example
 * Shows how multiple consumers cooperate to consume a topic
*/
public class ConsumerGroupExample {
/**
 * How a consumer group divides the work
*
 * Topic: user-events (3 partitions)
 * Consumer Group: group-1 (3 consumers)
*
 * Case 1: consumers == partitions (ideal)
* ┌──────────────┐ ┌──────────────┐
* │ Partition 0 │───▶│ Consumer 1 │
* ├──────────────┤ ├──────────────┤
* │ Partition 1 │───▶│ Consumer 2 │
* ├──────────────┤ ├──────────────┤
* │ Partition 2 │───▶│ Consumer 3 │
* └──────────────┘ └──────────────┘
*
 * Case 2: consumers < partitions
* ┌──────────────┐ ┌──────────────┐
* │ Partition 0 │───▶│ Consumer 1 │
* ├──────────────┤ │ │
* │ Partition 1 │───▶│ │
* ├──────────────┤ ├──────────────┤
* │ Partition 2 │───▶│ Consumer 2 │
* └──────────────┘ └──────────────┘
*
 * Case 3: consumers > partitions (wasteful)
* ┌──────────────┐ ┌──────────────┐
* │ Partition 0 │───▶│ Consumer 1 │
* ├──────────────┤ ├──────────────┤
* │ Partition 1 │───▶│ Consumer 2 │
* ├──────────────┤ ├──────────────┤
* │ Partition 2 │───▶│ Consumer 3 │
* └──────────────┘ ├──────────────┤
 *                      │ Consumer 4   │ (idle)
* └──────────────┘
*/
/**
 * Start several consumer instances in one group
*/
public static void startConsumerGroup(int consumerCount) {
for (int i = 0; i < consumerCount; i++) {
final int consumerId = i;
new Thread(() -> {
Properties props = BasicConsumerExample.createConsumerConfig();
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(props)) {
consumer.subscribe(Collections.singletonList("user-events"));
System.out.println("消费者 " + consumerId + " 启动");
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
System.out.printf(
"[消费者 %d] Partition: %d, Offset: %d, " +
"Message: %s\n",
consumerId,
record.partition(),
record.offset(),
record.value()
);
}
}
}
}).start();
}
}
/**
 * Rebalance listener
*/
static class RebalanceListener implements ConsumerRebalanceListener {
@Override
public void onPartitionsRevoked(
java.util.Collection<org.apache.kafka.common.TopicPartition> partitions) {
System.out.println("分区被撤销: " + partitions);
// 可以在这里提交 offset
}
@Override
public void onPartitionsAssigned(
java.util.Collection<org.apache.kafka.common.TopicPartition> partitions) {
System.out.println("分配到新分区: " + partitions);
// 可以在这里重置消费位置
}
}
/**
 * Consuming with a rebalance listener
*/
public static void consumeWithRebalanceListener() {
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(
BasicConsumerExample.createConsumerConfig())) {
consumer.subscribe(
Collections.singletonList("user-events"),
new RebalanceListener()
);
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
System.out.println("处理消息: " + record.value());
}
}
}
}
public static void main(String[] args) {
        // Start three consumers that form one group
startConsumerGroup(3);
}
}
5.3 Exactly-Once Semantics
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
/**
 * Exactly-once consumption
*/
public class ExactlyOnceConsumer {
/**
 * Option 1: manual offset management + idempotent processing
*/
public static void exactlyOnceWithIdempotent() {
Properties props = BasicConsumerExample.createConsumerConfig();
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(props)) {
consumer.subscribe(Collections.singletonList("user-events"));
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
Map<TopicPartition, OffsetAndMetadata> offsets =
new HashMap<>();
for (ConsumerRecord<String, String> record : records) {
try {
                        // 1. Idempotency check (skip already-processed messages)
if (isMessageProcessed(record.key(), record.offset())) {
System.out.println("消息已处理,跳过: " +
record.offset());
continue;
}
                        // 2. Process the message
processMessage(record);
                        // 3. Record it as processed (same transaction as the business write)
saveProcessedMessage(record.key(), record.offset());
                        // 4. Track the offset to commit
offsets.put(
new TopicPartition(record.topic(), record.partition()),
new OffsetAndMetadata(record.offset() + 1)
);
} catch (Exception e) {
System.err.println("处理失败,不提交 offset: " +
e.getMessage());
break;
}
}
                // 5. Commit the tracked offsets in one call
if (!offsets.isEmpty()) {
consumer.commitSync(offsets);
System.out.println("已提交 offset: " + offsets.size());
}
}
}
}
/**
 * Option 2: transactional consume-process-produce (Kafka 0.11+)
*/
public static void exactlyOnceWithTransaction() {
        // Transactional consumer settings
Properties consumerProps = BasicConsumerExample.createConsumerConfig();
consumerProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
consumerProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG,
"read_committed");
        // Transactional producer settings
Properties producerProps = BasicProducerExample.createProducerConfig();
producerProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG,
"my-transactional-id");
producerProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(consumerProps);
KafkaProducer<String, String> producer =
new KafkaProducer<>(producerProps)) {
            // Initialize transactions (also fences zombie producers)
producer.initTransactions();
consumer.subscribe(Collections.singletonList("input-topic"));
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
if (records.isEmpty()) {
continue;
}
                // Begin a transaction
producer.beginTransaction();
try {
Map<TopicPartition, OffsetAndMetadata> offsets =
new HashMap<>();
for (ConsumerRecord<String, String> record : records) {
                        // Process and forward the message
                        String processedValue = transformMessage(record);
producer.send(new ProducerRecord<>(
"output-topic",
record.key(),
processedValue
));
offsets.put(
new TopicPartition(record.topic(), record.partition()),
new OffsetAndMetadata(record.offset() + 1)
);
}
                    // Commit the consumer offsets inside the transaction
producer.sendOffsetsToTransaction(
offsets,
consumer.groupMetadata()
);
                    // Commit the transaction
producer.commitTransaction();
System.out.println("事务提交成功");
} catch (Exception e) {
                    // Abort the transaction
producer.abortTransaction();
System.err.println("事务回滚: " + e.getMessage());
}
}
}
}
private static boolean isMessageProcessed(String key, long offset) {
        // Check a database or cache to see whether this message was already handled
return false;
}
private static void processMessage(ConsumerRecord<String, String> record) {
        // Business logic
        System.out.println("Processing: " + record.value());
}
    private static String transformMessage(ConsumerRecord<String, String> record) {
        // Business logic that returns the transformed value
        return record.value().toUpperCase();
}
private static void saveProcessedMessage(String key, long offset) {
        // Persist the processed-message record to the database
}
}
6. Message Reliability Guarantees
6.1 Producer Reliability
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;
/**
 * Producer reliability configuration
 */
public class ReliableProducer {
/**
 * High-reliability settings
*/
public static Properties createReliableConfig() {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
        // 1. Acknowledgment level
        // all/-1: wait for every ISR replica (safest, slowest)
        // 1: wait for the leader only (balanced)
        // 0: don't wait at all (fastest, may lose messages)
props.put(ProducerConfig.ACKS_CONFIG, "all");
        // 2. Retries
props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 100);
        // 3. Idempotence (prevents duplicates on retry)
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        // 4. Max in-flight requests (up to 5 is safe with idempotence)
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
        // 5. Timeouts
props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);
props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);
return props;
}
/**
 * Sending with reliability guarantees
*/
public static void sendReliably() {
try (KafkaProducer<String, String> producer =
new KafkaProducer<>(createReliableConfig())) {
String topic = "important-events";
for (int i = 0; i < 10; i++) {
ProducerRecord<String, String> record =
new ProducerRecord<>(topic, "key_" + i, "value_" + i);
try {
                    // Synchronous send: wait for the acknowledgment
RecordMetadata metadata = producer.send(record).get();
System.out.printf(
"消息发送成功 - Partition: %d, Offset: %d, " +
"Timestamp: %d\n",
metadata.partition(),
metadata.offset(),
metadata.timestamp()
);
} catch (Exception e) {
                    // Send failed: run the compensation path
handleSendFailure(record, e);
}
}
}
}
private static void handleSendFailure(
ProducerRecord<String, String> record,
Exception exception) {
System.err.println("消息发送失败,记录到失败队列: " +
exception.getMessage());
        // Options:
        // 1. Write to a local file
        // 2. Store in a database
        // 3. Send to a dead-letter queue
        // 4. Trigger an alert
}
}
6.2 Message-Loss Scenarios
Message-loss scenarios and remedies
Scenario 1: the producer's send fails
┌──────────┐         X         ┌──────────┐
│ Producer │ ────────────────▶ │ Broker   │
└──────────┘ network failure/timeout └──────────┘
Remedies:
- Set acks=all
- Enable retries
- Send synchronously, or check the async callback
- Record messages that failed to send
Scenario 2: the broker crashes before persisting
┌──────────┐                   ┌──────────┐
│ Producer │ ─────────────────▶│ Leader   │
└──────────┘                   │(in memory)│
                               │    X     │
                               └──────────┘
                                 crashes
Remedies:
- Use a replication factor >= 3
- Set min.insync.replicas >= 2 (runtime sketch below)
- Use acks=all so every ISR replica must confirm
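min.insync.replicas can also be changed per topic on a running cluster; a sketch using AdminClient (the topic name is illustrative):
import org.apache.kafka.clients.admin.*;
import org.apache.kafka.common.config.ConfigResource;
import java.util.Collections;
import java.util.Properties;

public class MinIsrConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic =
                new ConfigResource(ConfigResource.Type.TOPIC, "important-events");
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry("min.insync.replicas", "2"),
                AlterConfigOp.OpType.SET);
            // Update just this one config entry, leaving the rest untouched
            admin.incrementalAlterConfigs(
                Collections.singletonMap(topic, Collections.singletonList(op)))
                 .all().get();
        }
    }
}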
Scenario 3: the consumer crashes after processing but before committing the offset
┌──────────┐                   ┌──────────┐
│ Broker   │ ─────────────────▶│ Consumer │
└──────────┘                   │processing│
                               │    X     │
                               └──────────┘
                                 crashes
Remedies:
- Process first, then commit
- Use transactions
- Make processing idempotent
Scenario 4: the consumer commits the offset, then processing fails
┌──────────┐                   ┌──────────┐
│ Broker   │ ─────────────────▶│ Consumer │
└──────────┘                   │commit OK │
                               │proc fails│
                               └──────────┘
Remedies:
- Process first, then commit
- Keep the business write and the offset commit in one transaction (sketched below)
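When the business write goes to a relational database, storing the offset in that same database makes "process" and "commit" atomic. A rough sketch: offsetDao and db are hypothetical helpers, and on assignment the consumer seeks to the stored position instead of relying on __consumer_offsets:
// Sketch only: offsetDao and db are hypothetical; both writes share one DB transaction
consumer.subscribe(Collections.singletonList("user-events"),
    new ConsumerRebalanceListener() {
        @Override
        public void onPartitionsRevoked(Collection<TopicPartition> partitions) {}
        @Override
        public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
            for (TopicPartition tp : partitions) {
                consumer.seek(tp, offsetDao.load(tp)); // resume from the DB position
            }
        }
    });
while (true) {
    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
        db.inTransaction(tx -> {
            tx.applyBusinessWrite(record.value());             // business effect
            offsetDao.save(tx,
                new TopicPartition(record.topic(), record.partition()),
                record.offset() + 1);                          // offset in the same tx
        });
    }
}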
7. Partitions and Load Balancing
7.1 Partition Management
import org.apache.kafka.clients.admin.*;
import org.apache.kafka.common.TopicPartitionInfo;
import org.apache.kafka.common.config.TopicConfig;
import java.util.*;
import java.util.concurrent.ExecutionException;
/**
 * Topic and partition management
*/
public class TopicPartitionManager {
private static final String BOOTSTRAP_SERVERS = "localhost:9092";
/**
 * Create an AdminClient
*/
public static AdminClient createAdminClient() {
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
BOOTSTRAP_SERVERS);
return AdminClient.create(props);
}
/**
 * Create a topic
*/
public static void createTopic(String topicName,
int numPartitions,
short replicationFactor)
throws ExecutionException, InterruptedException {
try (AdminClient adminClient = createAdminClient()) {
            // Topic-level configuration
Map<String, String> configs = new HashMap<>();
configs.put(TopicConfig.RETENTION_MS_CONFIG,
                String.valueOf(7 * 24 * 60 * 60 * 1000L)); // 7 days
configs.put(TopicConfig.COMPRESSION_TYPE_CONFIG, "snappy");
configs.put(TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, "2");
NewTopic newTopic = new NewTopic(
topicName,
numPartitions,
replicationFactor
).configs(configs);
CreateTopicsResult result =
adminClient.createTopics(Collections.singleton(newTopic));
result.all().get();
System.out.println("Topic 创建成功: " + topicName);
}
}
/**
 * List all topics
*/
public static void listTopics()
throws ExecutionException, InterruptedException {
try (AdminClient adminClient = createAdminClient()) {
ListTopicsResult topics = adminClient.listTopics();
Set<String> topicNames = topics.names().get();
System.out.println("Topics:");
for (String name : topicNames) {
System.out.println(" - " + name);
}
}
}
/**
 * Describe a topic
*/
public static void describeTopicDetail(String topicName)
throws ExecutionException, InterruptedException {
try (AdminClient adminClient = createAdminClient()) {
DescribeTopicsResult result = adminClient.describeTopics(
Collections.singleton(topicName)
);
Map<String, TopicDescription> descriptions = result.all().get();
TopicDescription description = descriptions.get(topicName);
System.out.println("\nTopic: " + description.name());
System.out.println("Is Internal: " + description.isInternal());
System.out.println("Partitions: " +
description.partitions().size());
for (TopicPartitionInfo partition : description.partitions()) {
System.out.printf(
" Partition %d: Leader=%d, Replicas=%s, ISR=%s\n",
partition.partition(),
partition.leader().id(),
partition.replicas(),
partition.isr()
);
}
}
}
/**
 * Increase the partition count
*/
public static void increasePartitions(String topicName,
int newPartitionCount)
throws ExecutionException, InterruptedException {
try (AdminClient adminClient = createAdminClient()) {
Map<String, NewPartitions> newPartitions = new HashMap<>();
newPartitions.put(topicName,
NewPartitions.increaseTo(newPartitionCount));
CreatePartitionsResult result =
adminClient.createPartitions(newPartitions);
result.all().get();
System.out.println("分区数已增加到: " + newPartitionCount);
}
}
/**
 * Delete a topic
*/
public static void deleteTopic(String topicName)
throws ExecutionException, InterruptedException {
try (AdminClient adminClient = createAdminClient()) {
DeleteTopicsResult result = adminClient.deleteTopics(
Collections.singleton(topicName)
);
result.all().get();
System.out.println("Topic 已删除: " + topicName);
}
}
public static void main(String[] args) throws Exception {
        // Create a topic
createTopic("test-topic", 3, (short) 1);
        // List all topics
listTopics();
        // Describe a topic
describeTopicDetail("user-events");
        // Increase partitions
// increasePartitions("test-topic", 5);
        // Delete a topic
// deleteTopic("test-topic");
}
}
7.2 Partition Assignment Strategies
Partition assignment strategies
1. Range (default)
Assigns contiguous ranges of partitions
Topic: T1 (7 partitions), 3 consumers
┌────────────────────────────────────┐
│ Consumer 1: P0, P1, P2             │
├────────────────────────────────────┤
│ Consumer 2: P3, P4                 │
├────────────────────────────────────┤
│ Consumer 3: P5, P6                 │
└────────────────────────────────────┘
2. RoundRobin
Assigns partitions one by one in turn
Topic: T1 (7 partitions), 3 consumers
┌────────────────────────────────────┐
│ Consumer 1: P0, P3, P6             │
├────────────────────────────────────┤
│ Consumer 2: P1, P4                 │
├────────────────────────────────────┤
│ Consumer 3: P2, P5                 │
└────────────────────────────────────┘
3. Sticky
Keeps existing assignments where possible to cut rebalance cost
Initial assignment:
Consumer 1: P0, P1
Consumer 2: P2, P3
Consumer 3: P4, P5
After Consumer 2 dies:
Consumer 1: P0, P1, P2
Consumer 3: P4, P5, P3
4. CooperativeSticky (recommended)
Incremental rebalancing that avoids stopping every consumer
import org.apache.kafka.clients.consumer.*;
import java.time.Duration;
import java.util.Arrays;
import java.util.Collections;
import java.util.Properties;
/**
 * Configuring the partition assignment strategy
 */
public class PartitionAssignmentStrategy {
public static void configureStrategy() {
Properties props = BasicConsumerExample.createConsumerConfig();
        // Set the assignment strategy
props.put(
ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
Arrays.asList(
                // CooperativeSticky is the recommended choice
"org.apache.kafka.clients.consumer.CooperativeStickyAssignor",
                // Or list several strategies in priority order (useful for rolling upgrades)
"org.apache.kafka.clients.consumer.StickyAssignor",
"org.apache.kafka.clients.consumer.RoundRobinAssignor",
"org.apache.kafka.clients.consumer.RangeAssignor"
)
);
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(props)) {
consumer.subscribe(Collections.singletonList("user-events"));
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
System.out.printf("Partition: %d, Message: %s\n",
record.partition(), record.value());
}
}
}
}
}
8. Stream Processing with Kafka Streams
8.1 Kafka Streams Basics
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.*;
import java.time.Duration;
import java.util.Arrays;
import java.util.Map;
import java.util.Properties;
/**
 * Kafka Streams examples
*/
public class KafkaStreamsExample {
/**
 * Build the Streams configuration
*/
public static Properties createStreamsConfig() {
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG,
"streams-wordcount-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,
Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
Serdes.String().getClass());
props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000);
return props;
}
/**
 * Example 1: word count
*/
public static void wordCount() {
StreamsBuilder builder = new StreamsBuilder();
        // Input stream
KStream<String, String> textLines =
builder.stream("input-topic");
        // Pipeline:
        // 1. Lowercase each line
        // 2. Split it into words
        // 3. Group by word and count
        // 4. Write the counts to an output topic
KTable<String, Long> wordCounts = textLines
.flatMapValues(value ->
Arrays.asList(value.toLowerCase().split("\\W+")))
.groupBy((key, word) -> word)
.count();
        // Write to the output topic
wordCounts.toStream()
.to("wordcount-output",
Produced.with(Serdes.String(), Serdes.Long()));
        // Start the topology
KafkaStreams streams =
new KafkaStreams(builder.build(), createStreamsConfig());
streams.start();
        // Close cleanly on shutdown
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
}
/**
 * Example 2: real-time user behavior stats
*/
public static void userBehaviorAnalysis() {
StreamsBuilder builder = new StreamsBuilder();
        // Stream of user events
KStream<String, String> userEvents =
builder.stream("user-events");
        // Count per user over 5-minute tumbling windows
KTable<Windowed<String>, Long> windowedCounts = userEvents
.groupByKey()
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
.count();
        // Emit the windowed counts
windowedCounts.toStream()
.map((windowedKey, count) -> {
String userId = windowedKey.key();
long windowStart = windowedKey.window().start();
long windowEnd = windowedKey.window().end();
String result = String.format(
"User: %s, Window: [%d, %d], Count: %d",
userId, windowStart, windowEnd, count
);
return KeyValue.pair(userId, result);
})
.to("user-stats-output");
KafkaStreams streams =
new KafkaStreams(builder.build(), createStreamsConfig());
streams.start();
}
/**
 * Example 3: stream-table join
*/
public static void streamJoin() {
StreamsBuilder builder = new StreamsBuilder();
        // Order stream
KStream<String, String> orders =
builder.stream("orders");
        // User table
KTable<String, String> users =
builder.table("users");
        // Join: enrich each order with its user record
KStream<String, String> enrichedOrders = orders
.join(users,
(order, user) -> "Order: " + order + ", User: " + user);
enrichedOrders.to("enriched-orders");
KafkaStreams streams =
new KafkaStreams(builder.build(), createStreamsConfig());
streams.start();
}
/**
 * Example 4: filter and transform
*/
public static void filterAndTransform() {
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> events =
builder.stream("all-events");
        // Keep only the important events
KStream<String, String> importantEvents = events
.filter((key, value) ->
value.contains("IMPORTANT") || value.contains("ERROR"))
.mapValues(value -> value.toUpperCase());
        // Split into branches; Named.as() sets the prefix used in the result map's keys
        Map<String, KStream<String, String>> branches = events
            .split(Named.as("events-"))
            .branch((key, value) -> value.contains("ERROR"),
                Branched.as("errors"))
            .branch((key, value) -> value.contains("WARNING"),
                Branched.as("warnings"))
            .defaultBranch(Branched.as("normal"));
        // Route each branch to its own topic (keys carry the "events-" prefix)
        branches.get("events-errors").to("error-events");
        branches.get("events-warnings").to("warning-events");
        branches.get("events-normal").to("normal-events");
KafkaStreams streams =
new KafkaStreams(builder.build(), createStreamsConfig());
streams.start();
}
public static void main(String[] args) {
        // Run the word-count example
wordCount();
        // Or run one of the other examples
// userBehaviorAnalysis();
// streamJoin();
// filterAndTransform();
}
}
9. Spring Kafka Integration
9.1 Spring Boot Configuration
# application.yml
spring:
kafka:
bootstrap-servers: localhost:9092
    # Producer settings
producer:
key-serializer: org.apache.kafka.common.serialization.StringSerializer
value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
acks: all
retries: 3
batch-size: 16384
linger-ms: 10
    # Consumer settings
consumer:
group-id: my-consumer-group
key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
auto-offset-reset: earliest
enable-auto-commit: false
properties:
spring.json.trusted.packages: "*"
    # Listener settings
listener:
ack-mode: manual_immediate
concurrency: 3
9.2 Using Spring Kafka
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.kafka.support.SendResult;
import org.springframework.stereotype.Service;
import org.springframework.web.bind.annotation.*;
import java.util.concurrent.CompletableFuture;
/**
 * Spring Kafka application
*/
@SpringBootApplication
public class SpringKafkaApplication {
public static void main(String[] args) {
SpringApplication.run(SpringKafkaApplication.class, args);
}
}
/**
 * Kafka producer service
*/
@Service
class KafkaProducerService {
private final KafkaTemplate<String, Object> kafkaTemplate;
public KafkaProducerService(KafkaTemplate<String, Object> kafkaTemplate) {
this.kafkaTemplate = kafkaTemplate;
}
/**
 * Send a message
*/
public void sendMessage(String topic, String key, Object message) {
CompletableFuture<SendResult<String, Object>> future =
kafkaTemplate.send(topic, key, message);
future.whenComplete((result, ex) -> {
if (ex == null) {
System.out.printf(
"消息发送成功 - Topic: %s, Partition: %d, Offset: %d\n",
result.getRecordMetadata().topic(),
result.getRecordMetadata().partition(),
result.getRecordMetadata().offset()
);
} else {
System.err.println("消息发送失败: " + ex.getMessage());
}
});
}
/**
 * Send a batch of events
*/
public void sendBatch(String topic, java.util.List<UserEvent> events) {
for (UserEvent event : events) {
sendMessage(topic, event.getUserId(), event);
}
kafkaTemplate.flush();
}
}
/**
 * Kafka consumer service
*/
@Service
class KafkaConsumerService {
/**
 * Listener with automatic acknowledgment
*/
@KafkaListener(topics = "user-events", groupId = "group-1")
public void listenAutoAck(String message) {
System.out.println("收到消息: " + message);
// 处理消息
}
/**
 * Listener with manual acknowledgment
*/
@KafkaListener(
topics = "important-events",
groupId = "group-2",
containerFactory = "kafkaListenerContainerFactory"
)
public void listenManualAck(String message,
Acknowledgment acknowledgment) {
try {
System.out.println("处理重要消息: " + message);
// 业务处理
// 手动确认
acknowledgment.acknowledge();
} catch (Exception e) {
System.err.println("处理失败: " + e.getMessage());
// 不确认,下次重新消费
}
}
/**
 * Listener that receives the full ConsumerRecord
*/
@KafkaListener(topics = "detailed-events", groupId = "group-3")
public void listenWithDetails(
org.apache.kafka.clients.consumer.ConsumerRecord<String, String> record) {
System.out.printf(
"Topic: %s, Partition: %d, Offset: %d, " +
"Key: %s, Value: %s, Timestamp: %d\n",
record.topic(),
record.partition(),
record.offset(),
record.key(),
record.value(),
record.timestamp()
);
}
/**
 * Listen on multiple topics
*/
@KafkaListener(
topics = {"topic1", "topic2", "topic3"},
groupId = "multi-topic-group"
)
public void listenMultipleTopics(String message,
@org.springframework.messaging.handler.annotation.Header(
org.springframework.kafka.support.KafkaHeaders.RECEIVED_TOPIC
) String topic) {
System.out.printf("从 Topic %s 收到消息: %s\n", topic, message);
}
}
/**
* REST Controller
*/
@RestController
@RequestMapping("/api/kafka")
class KafkaController {
private final KafkaProducerService producerService;
public KafkaController(KafkaProducerService producerService) {
this.producerService = producerService;
}
/**
 * Send-message API
*/
@PostMapping("/send")
public String sendMessage(
@RequestParam String topic,
@RequestParam String key,
@RequestBody String message) {
producerService.sendMessage(topic, key, message);
return "Message sent successfully";
}
/**
 * Send a user event
*/
@PostMapping("/event")
public String sendUserEvent(@RequestBody UserEvent event) {
producerService.sendMessage("user-events", event.getUserId(), event);
return "Event sent successfully";
}
}
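The manual-ack listener above refers to a kafkaListenerContainerFactory bean. Spring Boot already builds one from the application.yml listener settings; if you need to declare it yourself, a minimal sketch looks like this:
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.ContainerProperties;

@Configuration
class KafkaListenerConfig {
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String>
            kafkaListenerContainerFactory(ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Mirrors ack-mode: manual_immediate and concurrency: 3 from application.yml
        factory.getContainerProperties()
               .setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
        factory.setConcurrency(3);
        return factory;
    }
}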
10. Production Scenarios
10.1 A Log Collection System
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.kafka.support.serializer.JsonDeserializer;
import java.time.Duration;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
/**
 * Production scenario: a distributed log collection system
 * Pipeline: applications -> Kafka -> log processor -> Elasticsearch
 */
public class LogCollectionSystem {
/**
 * Log entry entity
*/
static class LogEntry {
private String application;
private String level;
private String message;
private String threadName;
private long timestamp;
private Map<String, String> metadata;
// Getters and Setters...
public String getApplication() { return application; }
public void setApplication(String application) {
this.application = application;
}
public String getLevel() { return level; }
public void setLevel(String level) { this.level = level; }
public String getMessage() { return message; }
public void setMessage(String message) { this.message = message; }
public String getThreadName() { return threadName; }
public void setThreadName(String threadName) {
this.threadName = threadName;
}
public long getTimestamp() { return timestamp; }
public void setTimestamp(long timestamp) { this.timestamp = timestamp; }
public Map<String, String> getMetadata() { return metadata; }
public void setMetadata(Map<String, String> metadata) {
this.metadata = metadata;
}
}
/**
 * Log producer (application side)
*/
static class LogProducer {
private final KafkaProducer<String, LogEntry> producer;
private final String topic = "application-logs";
public LogProducer() {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
"localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                JsonSerializer.class.getName()); // the custom serializer from section 4.2
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32768);
props.put(ProducerConfig.LINGER_MS_CONFIG, 50);
this.producer = new KafkaProducer<>(props);
}
public void logInfo(String app, String message) {
sendLog(app, "INFO", message);
}
public void logError(String app, String message) {
sendLog(app, "ERROR", message);
}
private void sendLog(String app, String level, String message) {
LogEntry log = new LogEntry();
log.setApplication(app);
log.setLevel(level);
log.setMessage(message);
log.setThreadName(Thread.currentThread().getName());
log.setTimestamp(System.currentTimeMillis());
producer.send(new ProducerRecord<>(topic, app, log),
(metadata, exception) -> {
if (exception != null) {
System.err.println("日志发送失败: " +
exception.getMessage());
}
});
}
public void close() {
producer.close();
}
}
/**
 * Log consumer (processing side)
*/
static class LogConsumer {
private final KafkaConsumer<String, LogEntry> consumer;
public LogConsumer() {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
"localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "log-processor-group");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
JsonDeserializer.class.getName());
            props.put(JsonDeserializer.TRUSTED_PACKAGES, "*");
            // The custom JsonSerializer (section 4.2) writes no type headers,
            // so give the deserializer a default target type
            props.put(JsonDeserializer.VALUE_DEFAULT_TYPE, LogEntry.class.getName());
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
this.consumer = new KafkaConsumer<>(props);
}
public void start() {
consumer.subscribe(Collections.singletonList("application-logs"));
while (true) {
ConsumerRecords<String, LogEntry> records =
consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, LogEntry> record : records) {
processLog(record.value());
}
}
}
private void processLog(LogEntry log) {
            // 1. Parse the log entry
            // 2. Strip sensitive fields
            // 3. Write to Elasticsearch
            // 4. Alert if the level is ERROR
System.out.printf(
"[%s] %s - %s: %s\n",
new java.util.Date(log.getTimestamp()),
log.getApplication(),
log.getLevel(),
log.getMessage()
);
if ("ERROR".equals(log.getLevel())) {
sendAlert(log);
}
}
private void sendAlert(LogEntry log) {
System.out.println("发送告警: " + log.getMessage());
// 发送告警通知
}
}
}
10.2 Decoupling an Order System
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
import java.math.BigDecimal;
import java.util.List;
/**
 * Production scenario: decoupling an e-commerce order system
 * Flow: create order -> publish to Kafka -> multiple downstream services consume
 */
public class OrderSystemDecoupling {
/**
 * Order entity
*/
static class Order {
private String orderId;
private String userId;
private List<OrderItem> items;
private BigDecimal totalAmount;
private String status;
private long createTime;
// Getters and Setters...
public String getOrderId() { return orderId; }
public void setOrderId(String orderId) { this.orderId = orderId; }
public String getUserId() { return userId; }
public void setUserId(String userId) { this.userId = userId; }
public List<OrderItem> getItems() { return items; }
public void setItems(List<OrderItem> items) { this.items = items; }
public BigDecimal getTotalAmount() { return totalAmount; }
public void setTotalAmount(BigDecimal totalAmount) {
this.totalAmount = totalAmount;
}
public String getStatus() { return status; }
public void setStatus(String status) { this.status = status; }
public long getCreateTime() { return createTime; }
public void setCreateTime(long createTime) {
this.createTime = createTime;
}
}
static class OrderItem {
private String productId;
private int quantity;
private BigDecimal price;
// Getters and Setters...
public String getProductId() { return productId; }
public void setProductId(String productId) {
this.productId = productId;
}
public int getQuantity() { return quantity; }
public void setQuantity(int quantity) { this.quantity = quantity; }
public BigDecimal getPrice() { return price; }
public void setPrice(BigDecimal price) { this.price = price; }
}
/**
 * Order service (producer)
*/
@Service
static class OrderService {
private final KafkaTemplate<String, Order> kafkaTemplate;
public OrderService(KafkaTemplate<String, Order> kafkaTemplate) {
this.kafkaTemplate = kafkaTemplate;
}
public String createOrder(Order order) {
            // 1. Persist the order
saveOrderToDatabase(order);
            // 2. Publish an order-created event to Kafka
kafkaTemplate.send("order-created", order.getOrderId(), order);
System.out.println("订单创建成功: " + order.getOrderId());
return order.getOrderId();
}
private void saveOrderToDatabase(Order order) {
            // Save to the database
            System.out.println("Order persisted: " + order.getOrderId());
}
}
/**
 * Inventory service (consumer)
*/
@Service
static class InventoryService {
@KafkaListener(topics = "order-created", groupId = "inventory-service")
public void handleOrderCreated(Order order) {
System.out.println("库存服务处理订单: " + order.getOrderId());
// 扣减库存
for (OrderItem item : order.getItems()) {
deductInventory(item.getProductId(), item.getQuantity());
}
}
private void deductInventory(String productId, int quantity) {
System.out.printf("扣减库存 - 商品: %s, 数量: %d\n",
productId, quantity);
}
}
/**
 * Loyalty points service (consumer)
*/
@Service
static class PointsService {
@KafkaListener(topics = "order-created", groupId = "points-service")
public void handleOrderCreated(Order order) {
System.out.println("积分服务处理订单: " + order.getOrderId());
// 计算并增加积分
int points = calculatePoints(order.getTotalAmount());
addPoints(order.getUserId(), points);
}
private int calculatePoints(BigDecimal amount) {
            return amount.intValue() / 10; // 1 point per 10 yuan
}
private void addPoints(String userId, int points) {
System.out.printf("增加积分 - 用户: %s, 积分: %d\n",
userId, points);
}
}
/**
 * Notification service (consumer)
*/
@Service
static class NotificationService {
@KafkaListener(topics = "order-created",
groupId = "notification-service")
public void handleOrderCreated(Order order) {
System.out.println("通知服务处理订单: " + order.getOrderId());
// 发送订单确认通知
sendOrderConfirmation(order.getUserId(), order.getOrderId());
}
private void sendOrderConfirmation(String userId, String orderId) {
System.out.printf("发送通知 - 用户: %s, 订单: %s\n",
userId, orderId);
}
}
}
11. Performance Tuning Best Practices
11.1 Producer Tuning
Producer performance tuning
1. Batching
┌─────────────────────────────────┐
│ batch.size = 16384 (16KB)       │
│ linger.ms = 10                  │
│                                 │
│ Message 1 ─┐                    │
│ Message 2  ├─▶ Batch ─▶ Send    │
│ Message 3 ─┘                    │
└─────────────────────────────────┘
2. Compression
compression.type = snappy (a solid default)
- none: no compression
- gzip: high ratio, high CPU cost
- snappy: balances ratio and CPU
- lz4: best raw performance
- zstd: Kafka 2.1+, best overall trade-off
3. Asynchronous sends
producer.send(record, callback)
4. Buffer sizing
buffer.memory = 33554432 (32MB)
Rough comparison:
┌──────────────────────┬─────────┬─────────┬─────────┐
│ Configuration        │ TPS     │ Latency │ CPU     │
├──────────────────────┼─────────┼─────────┼─────────┤
│ Sync, no compression │ 10K     │ 50ms    │ Low     │
├──────────────────────┼─────────┼─────────┼─────────┤
│ Async, no compression│ 50K     │ 10ms    │ Low     │
├──────────────────────┼─────────┼─────────┼─────────┤
│ Async+batch+snappy   │ 100K+   │ 15ms    │ Med     │
└──────────────────────┴─────────┴─────────┴─────────┘
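Putting the four knobs together, a throughput-oriented producer configuration might look like the sketch below (reusing the imports from section 4.1; the values are starting points to benchmark, not universal answers):
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);        // 64KB batches
props.put(ProducerConfig.LINGER_MS_CONFIG, 20);            // wait up to 20ms to fill a batch
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");  // cheap, fast compression
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864);  // 64MB send buffer
try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
    // Always send asynchronously and check failures in the callback
    producer.send(new ProducerRecord<>("user-events", "key", "value"), (md, ex) -> {
        if (ex != null) System.err.println("Send failed: " + ex.getMessage());
    });
}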
11.2 Consumer Tuning
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.*;
import java.util.concurrent.*;
/**
 * Consumer performance tuning
 */
public class ConsumerOptimization {
/**
 * Tuned configuration
*/
public static Properties createOptimizedConfig() {
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "optimized-group");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
StringDeserializer.class.getName());
        // 1. Pull more records per poll
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1000);
        // 2. Fetch larger payloads per request
props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 10240); // 10KB
props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 52428800); // 50MB
props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1048576); // 1MB
        // 3. Shorten the fetch wait
props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 100);
        // 4. Heartbeats and session timeout
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 3000);
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 30000);
        // 5. Disable auto-commit; commit manually in batches
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
return props;
}
/**
 * Multi-threaded processing
*/
public static void multiThreadConsume() {
Properties props = createOptimizedConfig();
try (KafkaConsumer<String, String> consumer =
new KafkaConsumer<>(props)) {
consumer.subscribe(Collections.singletonList("user-events"));
            // Thread pool that processes records in parallel
ExecutorService executor = Executors.newFixedThreadPool(10);
while (true) {
ConsumerRecords<String, String> records =
consumer.poll(Duration.ofMillis(100));
if (records.isEmpty()) {
continue;
}
                // Hand each record to the pool
List<Future<?>> futures = new ArrayList<>();
for (ConsumerRecord<String, String> record : records) {
Future<?> future = executor.submit(() ->
processRecord(record)
);
futures.add(future);
}
                // Wait for the whole batch to finish
for (Future<?> future : futures) {
try {
future.get();
} catch (Exception e) {
System.err.println("处理失败: " + e.getMessage());
}
}
                // Commit the batch's offsets once everything succeeded
consumer.commitSync();
}
}
}
private static void processRecord(ConsumerRecord<String, String> record) {
        // Business logic
        System.out.println("Processing: " + record.value());
}
}
11.3 Cluster Tuning
Broker cluster tuning (a live-config check is sketched below)
1. Replication
- replication.factor = 3 (recommended)
- min.insync.replicas = 2
2. Log settings
- log.segment.bytes = 1073741824 (1GB)
- log.retention.hours = 168 (7 days)
- log.cleanup.policy = delete
3. Network settings
- num.network.threads = 8
- num.io.threads = 16
- socket.send.buffer.bytes = 102400
- socket.receive.buffer.bytes = 102400
4. Partition sizing
- num.partitions ≈ 2 x CPU cores (rule of thumb)
- partitions = target throughput / per-partition throughput
5. Disks
- Prefer SSDs
- RAID 10
- Clean up old log segments regularly
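To check what a running broker actually uses, AdminClient can read its live configuration (broker id "1" matches the docker-compose file in section 3.1):
import org.apache.kafka.clients.admin.*;
import org.apache.kafka.common.config.ConfigResource;
import java.util.Collections;
import java.util.Properties;

public class BrokerConfigCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "1");
            Config config = admin.describeConfigs(Collections.singleton(broker))
                                 .all().get().get(broker);
            // Print only the settings discussed above
            for (ConfigEntry entry : config.entries()) {
                if (entry.name().startsWith("log.") || entry.name().startsWith("num.")) {
                    System.out.println(entry.name() + " = " + entry.value());
                }
            }
        }
    }
}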
12. Monitoring and Operations
12.1 Key Metrics in Code
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import java.util.Map;
/**
 * Kafka monitoring
*/
public class KafkaMonitoring {
/**
 * Read producer metrics
*/
public static void monitorProducer(KafkaProducer<String, String> producer) {
Map<MetricName, ? extends Metric> metrics = producer.metrics();
for (Map.Entry<MetricName, ? extends Metric> entry : metrics.entrySet()) {
MetricName name = entry.getKey();
Metric metric = entry.getValue();
            // Metrics worth watching
if (name.name().equals("record-send-rate") ||
name.name().equals("record-error-rate") ||
name.name().equals("request-latency-avg") ||
name.name().equals("buffer-available-bytes")) {
System.out.printf("%s: %.2f\n",
name.name(), metric.metricValue());
}
}
}
/**
 * Read consumer metrics
*/
public static void monitorConsumer(KafkaConsumer<String, String> consumer) {
Map<MetricName, ? extends Metric> metrics = consumer.metrics();
for (Map.Entry<MetricName, ? extends Metric> entry : metrics.entrySet()) {
MetricName name = entry.getKey();
Metric metric = entry.getValue();
            // Metrics worth watching
if (name.name().equals("records-lag-max") ||
name.name().equals("records-consumed-rate") ||
name.name().equals("fetch-latency-avg")) {
System.out.printf("%s: %.2f\n",
name.name(), metric.metricValue());
}
}
}
}
12.2 Metric Reference
Key metrics
Producer:
├─ record-send-rate: send rate (records/sec)
├─ record-error-rate: error rate
├─ request-latency-avg: average request latency (ms)
├─ buffer-available-bytes: free buffer space
├─ batch-size-avg: average batch size
└─ compression-rate-avg: compression ratio
Consumer:
├─ records-lag-max: maximum consumer lag (records)
├─ records-consumed-rate: consumption rate
├─ fetch-latency-avg: fetch latency
├─ commit-latency-avg: commit latency
└─ assigned-partitions: number of assigned partitions
Broker:
├─ BytesInPerSec: inbound byte rate
├─ BytesOutPerSec: outbound byte rate
├─ MessagesInPerSec: message rate
├─ UnderReplicatedPartitions: partitions missing replicas
├─ ActiveControllerCount: active controllers (should be exactly 1)
└─ OfflinePartitionsCount: offline partitions
Suggested alert thresholds:
┌──────────────────────────┬──────────────┐
│ Metric                   │ Threshold    │
├──────────────────────────┼──────────────┤
│ records-lag-max          │ > 10000      │
│ record-error-rate        │ > 0.01       │
│ UnderReplicatedPartitions│ > 0          │
│ OfflinePartitionsCount   │ > 0          │
└──────────────────────────┴──────────────┘
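Lag can also be computed from outside the consumer, which is how most alerting pipelines do it: compare each partition's committed offset against its log-end offset. A sketch using AdminClient (the group ID comes from section 5.1):
import org.apache.kafka.clients.admin.*;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class ConsumerLagCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Offsets the group has committed
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets("user-event-consumer-group")
                     .partitionsToOffsetAndMetadata().get();
            // Latest (log-end) offsets for the same partitions
            Map<TopicPartition, OffsetSpec> request = new HashMap<>();
            committed.keySet().forEach(tp -> request.put(tp, OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> ends =
                admin.listOffsets(request).all().get();
            committed.forEach((tp, om) -> {
                long lag = ends.get(tp).offset() - om.offset();
                System.out.printf("%s lag=%d%n", tp, lag);
            });
        }
    }
}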
13. Summary
13.1 Kafka's Core Strengths
Kafka's core strengths
1. High throughput ★★★★★
- Millions of messages per second per machine
- Sequential disk writes
- Zero-copy transfer
- Batching and compression
2. Low latency ★★★★★
- Millisecond-level delivery
- Fast, durable persistence
- Parallelism across partitions
3. Scalability ★★★★★
- Horizontal scaling
- Partitions can be added online
- Replication
4. High availability ★★★★★
- Data replicas
- Automatic failover
- The ISR mechanism
5. Message reliability ★★★★★
- Acknowledgment (acks) levels
- Transaction support
- Idempotence
13.2 Best Practices Recap
- Producer best practices
  - Send asynchronously with callbacks
  - Enable batching and compression
  - Configure retries and idempotence
  - Choose acks deliberately
- Consumer best practices
  - Commit offsets manually
  - Size the consumer count to the partition count
  - Process with multiple threads where needed
  - Make message handling idempotent
- Topic design
  - Plan partition counts up front
  - Pick an appropriate replication factor
  - Configure retention to match the data's lifetime
  - Follow a naming convention
- Operations
  - Monitor the key metrics
  - Clean up old log segments regularly
  - Back up important configuration
  - Stay on a stable, supported version