Kafka MirrorMaker 2 Cluster Sync in Practice


This article shares my experience using Kafka MirrorMaker 2 to perform a full (existing-data) and incremental sync between two clusters, along with the problems I ran into. Corrections are welcome.

The article focuses on hands-on steps; for the theory, see this excellent write-up: cloud.tencent.com/developer/a…

1. Background

Due to a business split, a new Kafka cluster had to be built and the workload moved onto it. To preserve user data, the existing data had to be migrated, and the migration had to be fast and cause minimal disruption to the business.

2. Choosing the approach

Kafka's MirrorMaker 2 adds many features over MM1, such as ACL sync, consumer group offset sync, dynamic discovery of new topics and groups, and high replication throughput, so MirrorMaker 2 was chosen for both the full and the incremental sync.

During testing I found that the Kafka 2.7 we were running could not keep topic names identical after migration: with the default DefaultReplicationPolicy, a topic foo replicated from cluster A to cluster B is created on B as A.foo. Kafka 3.0's MirrorMaker 2 added support for one-way, prefix-free replication (IdentityReplicationPolicy), which keeps foo as foo on both sides; the trade-off is that bidirectional sync must stay disabled, because MM2 can then no longer tell replicated records from native ones and would loop them back.

3. Configuration

config/connect-mirror-maker.properties

# see org.apache.kafka.clients.consumer.ConsumerConfig for more details

# Sample MirrorMaker 2.0 top-level configuration file
# Run with ./bin/connect-mirror-maker.sh connect-mirror-maker.properties

# specify any number of cluster aliases
clusters = A,B
#replication.policy.separator=""
#source.cluster.alias=""
#target.cluster.alias=""
# connection information for each cluster
# Comma-separated host:port pairs for each cluster
# e.g. "A_host1:9092, A_host2:9092, A_host3:9092"
A.bootstrap.servers = xxxx:9092
B.bootstrap.servers = yyyy:9092
A.security.protocol=SASL_PLAINTEXT
A.sasl.mechanism=SCRAM-SHA-512
A.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
 username="xxx" password="xxx";

B.security.protocol=SASL_PLAINTEXT
B.sasl.mechanism=SCRAM-SHA-512
B.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
 username="xxx" password="xxx";

# enable and configure individual replication flows
# set the replication flow direction
A->B.enabled = true


#A.producer.enable.idempotence = true
#B.producer.enable.idempotence = true


# regex which defines which topics get replicated, e.g. "foo-.*"
#A->B.topics = hadoopLogCollection,t_biz_act_mmetric

# topics to replicate; regex supported
A->B.topics = xxxx,xxxx
# topics to exclude; regex supported
A->B.topics.exclude = xxxx

#B->A.enabled = true
#B->A.topics = .*

# Setting replication factor of newly created remote topics
replication.factor=3

############################# Internal Topic Settings  #############################
# The replication factor for mm2 internal topics "heartbeats", "B.checkpoints.internal" and
# "mm2-offset-syncs.B.internal"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability such as 3.
sync.topic.configs.enabled=true
# how often topic configs are synced
sync.topic.configs.interval.seconds=60
checkpoints.topic.replication.factor=2
heartbeats.topic.replication.factor=2
offset-syncs.topic.replication.factor=2
#offset-syncs.topic.location = target

# number of tasks (worker threads) used for replication
tasks.max = 5

# The replication factor for connect internal topics "mm2-configs.B.internal", "mm2-offsets.B.internal" and
# "mm2-status.B.internal"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability such as 3.
offset.storage.replication.factor=2
status.storage.replication.factor=2
config.storage.replication.factor=2

# customize as needed
# replication.policy.separator = _
sync.topic.acls.enabled = true
emit.heartbeats.interval.seconds = 5

# enable dynamic discovery of topics and consumer groups, and set the refresh interval
refresh.topics.enabled = true
refresh.topics.interval.seconds = 60
refresh.groups.enabled = true
refresh.groups.interval.seconds = 60

# enable consumer group offset sync and set its interval --- note: offsets are only synced for idle consumer groups (no active members)
sync.group.offsets.enabled = true
sync.group.offsets.interval.seconds = 5

# naming policy for replicated topics; Kafka 3.0 ships two policies: the default prefixes names with the source cluster alias, while IdentityReplicationPolicy keeps names unchanged --- with the identity policy, bidirectional sync must not be enabled
replication.policy.class = org.apache.kafka.connect.mirror.IdentityReplicationPolicy

4. Startup command

bin/connect-mirror-maker.sh config/connect-mirror-maker.properties --clusters B
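
As I understand the Kafka docs, the --clusters B flag tells this MM2 node that cluster B is the nearby one, so it only produces to B; running the MM2 process close to the target cluster and pointing --clusters at it is the recommended setup (consume remotely, produce locally).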

5. Runtime monitoring

There was a lot of existing data, and to avoid putting too much pressure on the clusters, the sync user was throttled to 250 MB/s. That makes the sync take quite a while, so its overall progress needed monitoring. I used jmx_exporter to expose the metrics, paired with Prometheus and Grafana.
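
For reference, such a throttle can be applied with Kafka's quota mechanism via kafka-configs.sh. This is a sketch: mm2user is a placeholder for whatever principal MM2 authenticates as, and 262144000 bytes is 250 MB:

# limit what the MM2 principal may consume from the source cluster (250 MB/s)
bin/kafka-configs.sh --bootstrap-server xxxx:9092 --alter \
  --add-config 'consumer_byte_rate=262144000' \
  --entity-type users --entity-name mm2user
# limit what it may produce to the target cluster (250 MB/s)
bin/kafka-configs.sh --bootstrap-server yyyy:9092 --alter \
  --add-config 'producer_byte_rate=262144000' \
  --entity-type users --entity-name mm2user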

export JMX_PORT="xxxx"
export KAFKA_OPTS="-javaagent:jmx/jmx_prometheus_javaagent-0.12.0.jar=xxxx:jmx/kafka-mm2.yml"

kafka-mm2.yml

lowercaseOutputName: true

rules:
- pattern: 'kafka.connect.mirror<type=MirrorCheckpointConnector, source=(.+), target=(.+), group=(.+), topic=(.+), partition=([0-9]+)><>([a-z-]+)'
  name: kafka_mirror_checkpointConnector_$6
  type: GAUGE
  labels:
    group: $3
    topic: "$4"
    partition: "$5"
- pattern: 'kafka.connect.mirror<type=MirrorSourceConnector, target=(.+), topic=(.+), partition=([0-9]+)><>([a-z-]+)'
  name: kafka_mirror_sourceConnector_$4
  type: GAUGE
  labels:
    topic: "$2"
    partition: "$3"
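
Once MM2 is restarted with the agent attached, the exported metrics can be spot-checked before wiring up Prometheus (xxxx is the agent port set in KAFKA_OPTS above):

curl -s http://localhost:xxxx/metrics | grep kafka_mirror_sourceConnector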

The individual metrics are described in the official documentation: kafka.apache.org/documentati…


6. User and ACL sync

MirrorMaker 2 does not sync users (credentials) over to the target cluster; this has to be handled separately, for example by generating a script from the user data kept in your database.
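
As an illustration, if the clusters use SCRAM as ours do, users can be recreated on the target with kafka-configs.sh (user name and password below are placeholders; doing this over --bootstrap-server requires Kafka 2.7+, older brokers need --zookeeper):

bin/kafka-configs.sh --bootstrap-server yyyy:9092 --alter \
  --add-config 'SCRAM-SHA-512=[password=someSecret]' \
  --entity-type users --entity-name someUser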

Also note that sync.topic.acls.enabled = true does not sync WRITE ACLs or consumer-group ACLs. For WRITE, the official docs explicitly state this is to prevent the anomalies of users and MM2 both writing to the replicated topic (essentially, the fear is data getting clobbered). This part you have to implement yourself.

My implementation is as follows:

/**
 * Sync the WRITE ACLs of the given topics in batch.
 *
 * @param topics     topics whose ACLs should be synced
 * @param sourceProp config for the source AdminClient
 * @param targetProp config for the target AdminClient
 * @param dryRun     dry-run flag; defaults to true, i.e. ACLs are not actually applied
 */
private void syncAcl(List<String> topics, HashMap<String, Object> sourceProp, HashMap<String, Object> targetProp, boolean dryRun) {
    logger.info("=============================开始同步Topic{}的用户写权限===============================", topics);
    AdminClient sourceAdmin = AdminClient.create(sourceProp);
    AdminClient targetAdmin = AdminClient.create(targetProp);
    try {
        Collection<AclBinding> aclBindings = sourceAdmin.describeAcls(new AclBindingFilter(new ResourcePatternFilter(ResourceType.TOPIC, null, PatternType.ANY), AccessControlEntryFilter.ANY))
                                                        .values()
                                                        .get();
        aclBindings.forEach(x -> logger.debug("ACL before filtering: {}", x));
        List<AclBinding> bindings = aclBindings.stream()
                                               .filter(x -> x.pattern()
                                                             .resourceType() == ResourceType.TOPIC)
                                               .filter(x -> x.pattern()
                                                             .patternType() == PatternType.LITERAL)
                                               .filter(this::shouldReplicateAcl)
                                               .filter(x -> certainUser(x.pattern()
                                                                         .name(), topics))
                                               .collect(Collectors.toList());
        bindings.forEach(x -> logger.info("ACL after filtering: {}", x));
        if (!dryRun) {
            updateTopicAcls(targetAdmin, bindings);
        }
    } catch (Exception e) {
        logger.error("权限同步失败", e);
    }
    logger.info("=============================同步Topic{}的所有用户写权限===============================", topics);
}

private void updateTopicAcls(AdminClient targetAdmin, List<AclBinding> bindings) {
    logger.trace("Syncing {} topic ACL bindings.", bindings.size());
    targetAdmin.createAcls(bindings)
               .values()
               .forEach((k, v) -> v.whenComplete((x, e) -> {
                   if (e != null) {
                       logger.warn("Could not sync ACL of topic {}.", k.pattern()
                                                                       .name(), e);
                   }
               }));
}

boolean shouldReplicateAcl(AclBinding aclBinding) {
    return aclBinding.entry().permissionType() == AclPermissionType.ALLOW
            && aclBinding.entry().operation() == AclOperation.WRITE;
}

/**
 * Sync consumer-group ACLs.
 *
 * @param sourceProp config for the source AdminClient
 * @param targetProp config for the target AdminClient
 * @param dryRun     dry-run flag; defaults to true, i.e. ACLs are not actually applied
 */
private void syncGroupAcl(HashMap<String, Object> sourceProp, HashMap<String, Object> targetProp, boolean dryRun) {
    logger.info("=============================开始同步非Topic用户权限===============================");
    AdminClient sourceAdmin = AdminClient.create(sourceProp);
    AdminClient targetAdmin = AdminClient.create(targetProp);
    try {
        Collection<AclBinding> aclBindings = sourceAdmin.describeAcls(new AclBindingFilter(new ResourcePatternFilter(ResourceType.GROUP, null, PatternType.ANY), AccessControlEntryFilter.ANY))
                                                        .values()
                                                        .get();
        List<AclBinding> bindings = aclBindings.stream()
                                               .filter(x -> x.pattern()
                                                             .resourceType() == ResourceType.GROUP)
                                               .filter(x -> x.pattern()
                                                             .patternType() == PatternType.LITERAL)
                                               .filter(this::certainUser)
                                               .collect(Collectors.toList());
        bindings.forEach(x -> logger.info("ACL after filtering: {}", x));
        if (!dryRun) {
            updateTopicAcls(targetAdmin, bindings);
        }
        logger.info("=============================同步非Topic级的所有用户写权限,合计{}条===============================", bindings.size());
    } catch (Exception e) {
        logger.error("权限同步失败", e);
    }
}

7. Confirming sync completion

How do you confirm the sync is complete? How can a producer be sure, at switch-over time, that MM2 has finished replicating its data from cluster A?

Unlike MM1, MM2's source tasks use the low-level consumer API and assign partitions directly rather than joining a consumer group, so replication progress cannot be read off a consumer lag.

In theory the offset-syncs topic stores the matching upstream and downstream offsets, but in my tests it held no data; the project schedule was tight, so I did not dig into why. If you know the reason, please share.

So I had to take a different route.

I used Grafana to chart the replication delay via the record-age-ms metric, i.e. how old a record is at the moment it gets replicated. From this you can tell roughly how long it takes for written data to be picked up and copied over. Paired with a small verification endpoint that users can call to confirm the sync has caught up, users can switch over on their own with confidence.
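
For example, with the jmx_exporter rules above, a Grafana panel can chart the worst replication age per topic with a PromQL query like the following (the exact metric name is my assumption: jmx_exporter rewrites the hyphens in record-age-ms to underscores):

max by (topic) (kafka_mirror_sourceConnector_record_age_ms)

The check behind the verification endpoint looks like this: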

/**
 * Check whether the last record on the source and target clusters is identical.
 *
 * @param oldConsumer consumer for the source cluster; not part of any group
 * @param newConsumer consumer for the target cluster; not part of any group
 * @param testTopic   topic to verify
 * @return whether the check passed
 * @throws InterruptedException
 */
private boolean checkLastData(KafkaConsumer<String, String> oldConsumer, KafkaConsumer<String, String> newConsumer, String testTopic) throws InterruptedException {
    logger.info("===================开始检查{}两边最后一条offset====================", testTopic);

    int count = 0;

    do {
        try {
            count++;
            // fetch the partition list
            List<TopicPartition> topicPartitions = oldConsumer.partitionsFor(testTopic)
                                                              .stream()
                                                              .map(partitionInfo -> new TopicPartition(partitionInfo.topic(), partitionInfo.partition()))
                                                              .collect(Collectors.toList());

            oldConsumer.assign(topicPartitions);
            newConsumer.assign(topicPartitions);
            Map<TopicPartition, Long> oldTopicOffsets = oldConsumer.endOffsets(topicPartitions);
            Map<TopicPartition, Long> newTopicOffsets = newConsumer.endOffsets(topicPartitions);
            for (Map.Entry<TopicPartition, Long> entry : oldTopicOffsets.entrySet()) {
                TopicPartition topicPartition = entry.getKey();
                // endOffsets() returns the next offset to be written, so the last record sits at endOffset - 1
                long oldOffset = entry.getValue() - 1;
                long newOffset = newTopicOffsets.get(topicPartition) - 1;
                oldOffset = oldOffset < 0 ? 0 : oldOffset;
                newOffset = newOffset < 0 ? 0 : newOffset;

                oldConsumer.seek(topicPartition, oldOffset);
                newConsumer.seek(topicPartition, newOffset);
                ConsumerRecords<String, String> oldRecords = oldConsumer.poll(Duration.ofSeconds(2L));
                ConsumerRecords<String, String> newRecords = newConsumer.poll(Duration.ofSeconds(2L));
                List<ConsumerRecord<String, String>> oldRecord = oldRecords.records(topicPartition);
                List<ConsumerRecord<String, String>> newRecord = newRecords.records(topicPartition);
                if (oldRecord.size() == 0 && newRecord.size() == 0) {
                    logger.info("{} has no data on either cluster; moving on to the next partition", topicPartition);
                    continue;
                }
                if (!(oldRecord.size() > 0 && newRecord.size() > 0)) {
                    throw new RuntimeException(topicPartition + " mismatch: one cluster returned data and the other did not");
                }
                RecordPojo oldLastRecord = new RecordPojo(oldRecord.get(oldRecord.size() - 1));
                RecordPojo newLastRecord = new RecordPojo(newRecord.get(newRecord.size() - 1));
                logger.info("{}取出来老集群与新集群最后一条数据为:\n{}\n{}", topicPartition, oldLastRecord, newLastRecord);
                if (!oldLastRecord.equals(newLastRecord)) {
                    throw new RuntimeException(topicPartition + "数据没有对上");
                }
            }
            logger.info("topic{}同步check成功", testTopic);
            return true;
        } catch (Exception e) {
            logger.error("数据对数异常", e);
            Thread.sleep(Long.parseLong(configUtil.getValue("sleep.time", "2000")));
        }
    } while (count < 6);

    return false;
}
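
The RecordPojo referenced above is not shown in the original code; a minimal sketch might look like the following. Comparing key, value, and timestamp is my assumption about what "the same record" means here; with IdentityReplicationPolicy the offsets should normally line up too, but compaction or pre-existing data on the target can shift them, so the offset is left out of equals().

import java.util.Objects;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Hypothetical sketch of RecordPojo: captures just the fields used to decide
// whether the last record on both clusters is "the same" record.
public class RecordPojo {
    private final String key;
    private final String value;
    private final long timestamp;

    public RecordPojo(ConsumerRecord<String, String> record) {
        this.key = record.key();
        this.value = record.value();
        this.timestamp = record.timestamp();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof RecordPojo)) return false;
        RecordPojo that = (RecordPojo) o;
        return timestamp == that.timestamp
                && Objects.equals(key, that.key)
                && Objects.equals(value, that.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(key, value, timestamp);
    }

    @Override
    public String toString() {
        return "RecordPojo{key=" + key + ", value=" + value + ", timestamp=" + timestamp + "}";
    }
}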

Later I also found that the topic mm2-offsets.A.internal stores the Connect source offsets:

# mm2-offset-syncs.B.internal: lives on the source cluster by default (movable via offset-syncs.topic.location, source or target); stores the matching upstream/downstream offsets produced during replication; needs decoding to read
# mm2-offsets.A.internal: lives on the target cluster by default (OFFSET_STORAGE_TOPIC_CONFIG); stores the connector's own consumption offsets
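
The latter is an ordinary Connect offset-storage topic, so its JSON contents can be inspected directly with the console consumer (assuming the topic name above; the exact key/value layout is an internal detail and may vary between versions):

bin/kafka-console-consumer.sh --bootstrap-server yyyy:9092 \
  --topic mm2-offsets.A.internal --from-beginning \
  --property print.key=true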

Reference: cwiki.apache.org/confluence/…