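Introduction to ClickHouse
ClickHouse is an open-source OLAP database from the Russian company Yandex, abbreviated CH (some people write CK). It is widely regarded as one of the fastest OLAP databases available, with performance reported to far exceed that of Vertica, Sybase IQ, and similar systems.
CH has the following characteristics:
Columnar storage, which gives a high data compression ratio.
Vectorized execution with multi-core parallelism; every SQL query tries to use as much of the CPU as it can.
Built on a shared-nothing architecture, with support for distributed deployments.
Supports replication. Replication is asynchronous and multi-master (coordinated through ZooKeeper), not classic primary/secondary.
Compatible with most SQL syntax; its dialect is particularly close to MySQL's.
Data is written in batches and becomes queryable in near real time.
No transaction support; not suited to frequent updates of existing rows.
Wide tables are encouraged, but avoid routinely selecting every column of a row.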
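To make the last point concrete, here is a rough back-of-the-envelope sketch of why selecting fewer columns matters in a columnar store. The table shape (50 columns of 8 bytes each, one million rows) and the function name `bytes_scanned` are made up for illustration, and compression is ignored entirely.

```python
from typing import Dict, Iterable


def bytes_scanned(n_rows: int, column_widths: Dict[str, int],
                  selected: Iterable[str], layout: str) -> int:
    """Estimate uncompressed bytes read from storage for a full scan."""
    if layout == "row":
        # A row store reads whole rows, so every column is paid for.
        return n_rows * sum(column_widths.values())
    if layout == "column":
        # A column store reads only the columns the query names.
        return n_rows * sum(column_widths[name] for name in selected)
    raise ValueError(f"unknown layout: {layout!r}")


# Hypothetical wide table: 50 fixed-width 8-byte columns, 1,000,000 rows.
widths = {f"c{i}": 8 for i in range(50)}
rows = 1_000_000

narrow = bytes_scanned(rows, widths, ["c0", "c1"], "column")
full = bytes_scanned(rows, widths, widths.keys(), "column")
row_store = bytes_scanned(rows, widths, ["c0", "c1"], "row")

print(narrow)     # 16000000
print(full)       # 400000000
print(row_store)  # 400000000
```

Under these assumptions, a query that names two columns reads 25 times less data than one that selects all fifty, which is the reason behind the advice against querying every column.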
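In short, consider CH if your workload looks like this:
Massive data volumes, where you still want to keep per-node storage consumption low.
Wide tables, where many related columns are merged into a single table for business convenience.
SQL-based querying, which improves the applicability and portability of your programs.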
Deploying ClickHouse
For speed and convenience, this walkthrough deploys with Docker containers.
Single-node deployment
Run ck-server
# Publish 8123 (HTTP interface) and 9000 (native TCP protocol) on the default bridge network
docker run -d --name ck-server \
--ulimit nofile=262144:262144 \
--volume=/data/adhoc/clickhouse/database:/var/lib/clickhouse \
-p 8123:8123 \
-p 9000:9000 \
yandex/clickhouse-server:21.8.6.15
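Once the container is up, the HTTP interface on port 8123 offers a quick way to confirm the server is reachable. The sketch below uses only the Python standard library. It assumes the server is reachable at `localhost:8123` and that the default user has no password; the helper names `build_query_url`, `run_query`, and `ping` are invented for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(host: str, port: int, query: str) -> str:
    """Build a GET URL that passes the SQL text in the `query` parameter."""
    return f"http://{host}:{port}/?{urlencode({'query': query})}"


def run_query(host: str, port: int, query: str, timeout: float = 5.0) -> str:
    """Send a read-only query over HTTP and return the raw response text."""
    with urlopen(build_query_url(host, port, query), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def ping(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the /ping health endpoint answers with 'Ok.'."""
    with urlopen(f"http://{host}:{port}/ping", timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip() == "Ok."


# Example usage against a running server (assumed to be on localhost):
#   ping("localhost", 8123)
#   run_query("localhost", 8123, "SELECT version()")
```

Nothing here touches the network until `ping` or `run_query` is called, so the helpers can be imported safely before the container has started.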
Run ck-client
docker run -it --name ck-client \
--link ck-server \
yandex/clickhouse-client:21.8.6.15 --host ck-server
Cluster deployment
Edit the configuration: create the metrika.xml configuration file
<yandex>
<clickhouse_remote_servers>
<perftest_3shards_1replicas>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>11.51.197.5</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>11.51.197.6</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>11.51.197.7</host>
<port>9000</port>
</replica>
</shard>
</perftest_3shards_1replicas>
</clickhouse_remote_servers>
<!-- ZooKeeper configuration -->
<zookeeper-servers>
<node index="1">
<host>11.51.197.5</host>
<port>2181</port>
</node>
</zookeeper-servers>
<networks>
<ip>::/0</ip>
</networks>
<clickhouse_compression>
<case>
<min_part_size>10000000000</min_part_size>
<min_part_size_ratio>0.01</min_part_size_ratio>
<method>lz4</method>
</case>
</clickhouse_compression>
<!-- Everything above is identical on every node; the settings below must be set per node to that node's IP or domain name -->
<macros>
<replica>11.51.197.7</replica>
</macros>
</yandex>
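One easy mistake in a file like this is nesting `<internal_replication>` inside `<replica>` instead of directly under `<shard>`. The sketch below is one possible sanity check, not part of ClickHouse itself: it parses the XML with the standard library and reports, per shard, whether the flag sits at shard level and which replicas are listed. The function name `describe_cluster` and the trimmed two-shard sample are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def describe_cluster(xml_text: str, cluster: str) -> List[Dict[str, object]]:
    """List each shard's replicas and whether internal_replication is set on the shard."""
    root = ET.fromstring(xml_text)
    cluster_node = root.find(f"clickhouse_remote_servers/{cluster}")
    if cluster_node is None:
        raise KeyError(f"cluster {cluster!r} not found")
    shards = []
    for shard in cluster_node.findall("shard"):
        # findtext with a bare tag name only looks at direct children of <shard>.
        flag = (shard.findtext("internal_replication") or "").strip()
        replicas = [
            f"{replica.findtext('host')}:{replica.findtext('port')}"
            for replica in shard.findall("replica")
        ]
        shards.append({"internal_replication": flag == "true", "replicas": replicas})
    return shards


# Trimmed sample: the second shard has the flag misplaced inside <replica>.
SAMPLE = """<yandex><clickhouse_remote_servers><perftest_3shards_1replicas>
<shard><internal_replication>true</internal_replication>
<replica><host>11.51.197.5</host><port>9000</port></replica></shard>
<shard><replica><internal_replication>true</internal_replication>
<host>11.51.197.6</host><port>9000</port></replica></shard>
</perftest_3shards_1replicas></clickhouse_remote_servers></yandex>"""

report = describe_cluster(SAMPLE, "perftest_3shards_1replicas")
for index, shard in enumerate(report, start=1):
    print(index, shard["internal_replication"], shard["replicas"])
```

To check a real file, read it with `open(path).read()` and pass the text in; a shard that reports `False` has the flag missing or in the wrong place.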
docker-compose file
version: '3'
services:
  zk:
    image: zookeeper
    restart: always
    container_name: zk
    #volumes:
    #  - ./config:/conf
    #  - ./data:/data
    #  - ./logs:/datalog
    # Publish 2181 on the host; port mappings and host networking are mutually exclusive
    ports:
      - "2181:2181"
    #network_mode: "host"
  # NOTE: ck01-ck03 mount the same host data path. That only works if each
  # service runs on a different host; on a single host, give each node its own directory.
  ck01:
    image: yandex/clickhouse-server:21.8.6.15
    container_name: ck01
    hostname: ck01
    restart: "no"
    depends_on:
      - zk
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/clickhouse/database:/var/lib/clickhouse:rw
      - /data/adhoc/clickhouse/metrika.xml:/etc/metrika.xml
    #network_mode: "host"
  ck02:
    image: yandex/clickhouse-server:21.8.6.15
    container_name: ck02
    hostname: ck02
    restart: "no"
    depends_on:
      - zk
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/clickhouse/database:/var/lib/clickhouse:rw
      - /data/adhoc/clickhouse/metrika.xml:/etc/metrika.xml
    #network_mode: "host"
  ck03:
    image: yandex/clickhouse-server:21.8.6.15
    container_name: ck03
    hostname: ck03
    restart: "no"
    depends_on:
      - zk
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/clickhouse/database:/var/lib/clickhouse:rw
      - /data/adhoc/clickhouse/metrika.xml:/etc/metrika.xml
    #network_mode: "host"
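The three ClickHouse services differ only by name, so the file can also be generated rather than copied by hand. The following is a minimal sketch of that idea. It emits JSON, which is a subset of YAML and so should be accepted by `docker-compose -f`. The per-node data directory (`<data_root>/<name>`) is an assumption introduced here so that nodes on one host do not share storage, the `deploy` resource limits are left out for brevity, and the function names are made up.

```python
import json
from typing import Dict, List

IMAGE = "yandex/clickhouse-server:21.8.6.15"


def clickhouse_service(name: str, data_root: str, config_path: str) -> Dict[str, object]:
    """Build one ClickHouse service entry; only the name-derived fields vary."""
    return {
        "image": IMAGE,
        "container_name": name,
        "hostname": name,
        "restart": "no",
        "depends_on": ["zk"],
        "ulimits": {"nofile": {"soft": 262144, "hard": 262144}},
        "volumes": [
            # Assumption: one data directory per node under data_root.
            f"{data_root}/{name}:/var/lib/clickhouse:rw",
            f"{config_path}:/etc/metrika.xml",
        ],
    }


def build_compose(nodes: List[str], data_root: str, config_path: str) -> Dict[str, object]:
    """Assemble the ZooKeeper service plus one ClickHouse service per node name."""
    services: Dict[str, object] = {
        "zk": {
            "image": "zookeeper",
            "restart": "always",
            "container_name": "zk",
            "ports": ["2181:2181"],
        }
    }
    for name in nodes:
        services[name] = clickhouse_service(name, data_root, config_path)
    return {"version": "3", "services": services}


compose = build_compose(
    ["ck01", "ck02", "ck03"],
    "/data/adhoc/clickhouse/database",
    "/data/adhoc/clickhouse/metrika.xml",
)
print(json.dumps(compose, indent=2))
```

Adding a fourth node then means adding one name to the list, at the cost of keeping a small script next to the deployment files.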