ClickHouse Series: Recovering Lost ZooKeeper Metadata

Background

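ZooKeeper plays a critical role in ClickHouse: the metadata of ReplicatedMergeTree tables, replication state, coordination between cluster nodes, and distributed DDL execution all depend on it. When the metadata of a ReplicatedMergeTree table is lost, recovery is tricky. So how should lost ZooKeeper metadata for replicated tables be handled in a production environment?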

Symptoms

The ClickHouse log fills up with exceptions like the following:

2024.10.10 09:08:58.364703 [ 2485725 ] {} <Error> xxx.yyy(90e2203b-4fcf-4e67-a13b-cdd717731b49): void DB::StorageReplicatedMergeTree::queueUpdatingTask(): Code: 999. Coordination::Exception: Can't get data for node /clickhouse/tables/01-01/yyy/replicas/cluster01-01-01/log_pointer: node doesn't exist (No node). (KEEPER_EXCEPTION), Stack trace (when copying this message, always include the lines below):

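The logs of the ingestion services writing to this table showed errors as well, and data could no longer be inserted. The error message indicates that metadata is missing in ZooKeeper. Having run into this kind of problem before, I was glad to be running a relatively recent ClickHouse version (23.3), which supports the SYSTEM RESTORE REPLICA command used below.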
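Before attempting a repair, it can help to confirm what is actually missing. The queries below are a minimal sketch using the system.zookeeper and system.replicas tables. The paths are taken from the log line above, so adjust them to your own table. Note that system.zookeeper requires a filter on path, and the query fails with a "No node" error if that path itself no longer exists, which is also a useful signal.

```sql
-- List the child nodes that still exist under the replica path from the log.
-- If log_pointer is absent from the result, the replica metadata is incomplete.
SELECT name
FROM system.zookeeper
WHERE path = '/clickhouse/tables/01-01/yyy/replicas/cluster01-01-01';

-- Show the local view of the replica: its ZooKeeper paths and readonly flag.
SELECT database, table, is_readonly, zookeeper_path, replica_path
FROM system.replicas
WHERE database = 'xxx' AND table = 'yyy';
```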

Resolution

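Connect to ClickHouse from the command line. This step matters: the main ClickHouse service must be running normally, and everything below assumes that it is. The recovery command itself is simple and comes straight from the official documentation (clickhouse.com/docs/en/sql…):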

RESTORE REPLICA
Restores a replica if data is [possibly] present but Zookeeper metadata is lost.

Works only on readonly ReplicatedMergeTree tables.

One may execute query after:

ZooKeeper root / loss.
Replicas path /replicas loss.
Individual replica path /replicas/replica_name/ loss.
Replica attaches locally found parts and sends info about them to Zookeeper. Parts present on a replica before metadata loss are not re-fetched from other ones if not being outdated (so replica restoration does not mean re-downloading all data over the network).

Note
Parts in all states are moved to detached/ folder. Parts active before data loss (committed) are attached.

Because this command works only on ReplicatedMergeTree tables that are in readonly state, run the following commands in order:

-- Make sure the table is in readonly state
DETACH TABLE xxx.yyy;
ATTACH TABLE xxx.yyy;
-- Restore the ZooKeeper metadata
SYSTEM RESTORE REPLICA xxx.yyy;
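To check the result, one option is to look at the replica's state in system.replicas again. This is a sketch rather than part of the original procedure. The column selection is only a suggestion.

```sql
-- After a successful restore, is_readonly should be 0 and the
-- replication queue should drain over time.
SELECT database, table, is_readonly, queue_size, absolute_delay
FROM system.replicas
WHERE database = 'xxx' AND table = 'yyy';
```

If is_readonly is still 1, the restore did not take effect and the ClickHouse server log is the next place to look.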

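Barring surprises, this resolves the problem. Learning a little every day always pays off, all the more so when it is something you enjoy doing.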