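Purpose
Verify ClickHouse tiered (multi-disk) storage by implementing a 1-volume, N-disk scheme. This scheme stores data on different disks, but it does not ensure that the data is evenly distributed among them.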
Test environment
A cluster with 1 shard, 1 replica, and 1 znode; path is the default /var/lib/clickhouse
Procedure
Prepare
mkdir -p /data/disk01 # dummy mount point
chown -R clickhouse:clickhouse /data/
mkdir -p /etc/clickhouse-server/config.d/
touch /etc/clickhouse-server/config.d/storage.xml
Write the configuration file
<yandex>
<storage_configuration>
<disks> <!-- list of disks -->
<disk0> <!-- if the disk is named default, the path tag can be omitted -->
<keep_free_space_bytes>1024</keep_free_space_bytes>
<path>/var/lib/clickhouse/</path> <!-- must end with / -->
</disk0>
<!--
alternatively, using the name default:
<default>
<keep_free_space_bytes>1024</keep_free_space_bytes>
</default>
-->
<disk1>
<path>/data/disk01/</path>
</disk1>
</disks>
<policies>
<jbod_1> <!-- storage policy name -->
<volumes>
<jbod_volume_1> <!-- volume name -->
<disk>disk0</disk> <!-- the value must match a tag name defined under <disks> -->
<disk>disk1</disk>
</jbod_volume_1>
</volumes>
</jbod_1>
</policies>
</storage_configuration>
</yandex>
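The comment on <path> says the value must end with a slash, which is easy to miss when adding more disks. A minimal sketch of a pre-restart check follows. `check_disk_paths` is a hypothetical helper, not a ClickHouse tool, and it assumes each <path>...</path> element sits on a single line, as in the file above.

```shell
# Hypothetical helper (not part of ClickHouse): verify that every <path> value
# in a storage config ends with "/". Assumes one <path>...</path> per line.
check_disk_paths() {
    config_file="$1"
    if [ ! -r "$config_file" ]; then
        echo "cannot read $config_file"
        return 2
    fi
    # Print the text between <path> and </path>, then keep only values
    # that do not end with a slash.
    bad=$(sed -n 's:.*<path>\(.*\)</path>.*:\1:p' "$config_file" | grep -v '/$' || true)
    if [ -n "$bad" ]; then
        echo "missing trailing slash: $bad"
        return 1
    fi
    echo "ok"
}
```

For the file created earlier, this would be called as `check_disk_paths /etc/clickhouse-server/config.d/storage.xml`.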
Restart the service and verify
New policy
$:) select * from system.storage_policies;
SELECT *
FROM system.storage_policies
┌─policy_name─┬─volume_name───┬─volume_priority─┬─disks─────────────┬─max_data_part_size─┬─move_factor─┐
│ default │ default │ 1 │ ['default'] │ 0 │ 0 │
│ jbod_1 │ jbod_volume_1 │ 1 │ ['disk0','disk1'] │ 0 │ 0.1 │
└─────────────┴───────────────┴─────────────────┴───────────────────┴────────────────────┴─────────────┘
2 rows in set. Elapsed: 0.002 sec.
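The policy row only lists disk names. To confirm that the disks themselves were registered with the expected paths, system.disks can be queried as well. The sketch below wraps that query in a shell function; `show_disks`, `DISKS_SQL`, and the `CH_CLIENT` override are hypothetical names introduced here, and it assumes clickhouse-client can reach the server with default connection settings.

```shell
# Client binary; overridable, e.g. to add connection options (assumption).
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# List every configured disk with its path and space figures (in bytes).
DISKS_SQL="SELECT name, path, free_space, total_space FROM system.disks"

# Hypothetical wrapper: run the query through clickhouse-client.
show_disks() {
    $CH_CLIENT --query "$DISKS_SQL"
}
```

With the configuration above, the result should include disk0 and disk1 next to the built-in default disk.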
After the table is created with the jbod_1 policy, its data is stored under two paths
$:) CREATE TABLE IF NOT EXISTS test_01.ping (time DateTime, agentId String) ENGINE = MergeTree() PARTITION BY toYYYYMMDD(time) ORDER BY (time, agentId) SETTINGS index_granularity=8192, storage_policy='jbod_1';
CREATE TABLE IF NOT EXISTS test_01.ping
(
`time` DateTime,
`agentId` String
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(time)
ORDER BY (time, agentId)
SETTINGS index_granularity = 8192, storage_policy = 'jbod_1'
Ok.
0 rows in set. Elapsed: 0.021 sec.
$:) select name, data_paths from system.tables where name='ping'
SELECT
name,
data_paths
FROM system.tables
WHERE name = 'ping'
┌─name─┬─data_paths───────────────────────────────────────────────────────────────────┐
│ ping │ ['/var/lib/clickhouse/data/test_01/ping/','/data/disk01/data/test_01/ping/'] │
└──────┴──────────────────────────────────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.013 sec.
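The write step itself is not shown in these notes. One way to produce several parts is to issue many small INSERTs, since each INSERT creates its own part. The sketch below is an assumption about how that could be done: `insert_rows` and the `agent-N` values are hypothetical, and because it uses now(), the partitions will be named after the current date rather than 19700101.

```shell
# Client binary; overridable, e.g. to add connection options (assumption).
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# Hypothetical helper: issue N single-row INSERTs into test_01.ping.
# Each INSERT creates a separate part, which the policy places on one of the disks.
insert_rows() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        $CH_CLIENT --query "INSERT INTO test_01.ping VALUES (now(), 'agent-$i')"
        i=$((i + 1))
    done
}
```

For example, `insert_rows 14` would issue 14 separate INSERTs.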
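After writing data to the table continuously, check /var/lib/clickhouse/data/test_01/ping/ and /data/disk01/data/test_01/ping/: data has been written to both, which completes the test. Note the difference between the two directories (marked with # here)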
root@$:~ # ll /data/disk01/data/test_01/ping/
total 36
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_1_10_2
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_12_12_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_14_14_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_1_5_1
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_2_2_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_4_4_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_7_7_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_9_9_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:37 detached # here
root@$:~ # ll /var/lib/clickhouse/data/test_01/ping/
total 44
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_10_10_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_1_1_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_11_11_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_1_14_3
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_13_13_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_3_3_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_5_5_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_6_6_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:45 19700101_8_8_0
drwxr-x--- 2 clickhouse clickhouse 4096 May 27 19:37 detached
-rw-r----- 1 clickhouse clickhouse 1 May 27 19:37 format_version.txt # here
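The two listings can be compared more directly. The sketch below shows two options: counting part directories on each path, and asking system.parts for a per-disk summary. `count_parts`, `parts_per_disk`, and `CH_CLIENT` are hypothetical names. Part directories are named partition_minblock_maxblock_level, so the glob `*_*_*_*` matches them and skips detached and format_version.txt. The two counts can differ, because the directory listing still contains parts that were already merged into larger ones (for example 19700101_1_1_0 is covered by 19700101_1_14_3), while the query keeps only active parts.

```shell
# Client binary; overridable, e.g. to add connection options (assumption).
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# Count part directories (names like 19700101_1_10_2) in one table directory.
count_parts() {
    table_dir="$1"
    n=0
    for entry in "$table_dir"/*_*_*_*; do
        # If nothing matches, the pattern stays literal and fails the -d test.
        if [ -d "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Per-disk summary of active parts for the test table.
PARTS_PER_DISK_SQL="SELECT disk_name, count() AS parts, sum(rows) AS total_rows, sum(bytes_on_disk) AS bytes FROM system.parts WHERE database = 'test_01' AND table = 'ping' AND active GROUP BY disk_name"

# Hypothetical wrapper: run the summary through clickhouse-client.
parts_per_disk() {
    $CH_CLIENT --query "$PARTS_PER_DISK_SQL"
}
```

For the directories above, `count_parts /data/disk01/data/test_01/ping` and `count_parts /var/lib/clickhouse/data/test_01/ping` give the on-disk counts, and `parts_per_disk` gives the server's view.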