一、环境
系统:CentOS 7.9.2009 Clickhouse:24
二、安装
1. 切阿里云源
cp -r /etc/yum.repos.d /etc/yum.repos.d.bak
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache
2. 安装
yum install -y yum-utils
yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
yum install -y clickhouse-server clickhouse-client
其它版本
wget https://packages.clickhouse.com/rpm/lts/clickhouse-common-static-21.8.15.7-2.x86_64.rpm
wget https://packages.clickhouse.com/rpm/lts/clickhouse-server-21.8.15.7-2.noarch.rpm
wget https://packages.clickhouse.com/rpm/lts/clickhouse-client-21.8.15.7-2.noarch.rpm
3. 配置
核心配置文件 /etc/clickhouse-server/config.xml
<listen_host> # 监听端口
4.启动服务
/etc/init.d/clickhouse-server start
# or
service clickhouse-server start
# 停止Clickhouse
sudo service clickhouse-server stop
# 重启Clickhouse
sudo service clickhouse-server restart
# 查看服务状态
service clickhouse-server status
#设置开机自启
systemctl enable clickhouse-server
#查看开机自启是否成功
systemctl is-enabled clickhouse-server
三、 连接
使用clickhouse-client
默认情况下,使用 default
用户并不携带密码连接到 localhost:9000
。
还可以使用 --host
参数连接到指定服务器。
使用第三方工具,如 DBeaver 24.3.0
默认情况下,使用 default
用户并不携带密码连接到 localhost:8123
。
四、建表与查询
1. SQL文档
文档地址:clickhouse.com/docs/zh/sql…
2.导入
2.0 建表
CREATE TABLE `default`.game_order (
id UInt64 CODEC(Delta, ZSTD),
gid UInt32 CODEC(ZSTD),
sub_gid UInt32 CODEC(ZSTD),
transaction_id FixedString(100),
uid UInt32 CODEC(ZSTD),
bet_amount Decimal64(2) CODEC(ZSTD),
win_amount Decimal64(2) CODEC(ZSTD),
transfer_amount Decimal64(2) CODEC(ZSTD),
bet_time UInt32 CODEC(Delta, ZSTD),
settlement_time UInt32 CODEC(ZSTD),
valid_bet Decimal64(2),
round_id FixedString(100) CODEC(ZSTD),
game_type UInt8,
bet_status UInt8,
settle_status UInt32 CODEC(ZSTD),
extend String CODEC(ZSTD),
create_time UInt32 CODEC(Delta, ZSTD),
update_time UInt32 CODEC(Delta, ZSTD),
tax_amount Decimal64(2),
PRIMARY KEY id
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(toDate(create_time))
ORDER BY (id)
SETTINGS index_granularity = 8192
2.1 clickhouse-client
clickhouse-client 客户端 TCP连接导入 走9000端口
csv 导入 带表头
cat csv文件 | clickhouse-client --database default --query="INSERT INTO table_name FORMAT CSVWithNames"
csv 导入 不带表头、全字段
cat csv文件 | clickhouse-client --database default --query="INSERT INTO table_name FORMAT CSV"
sql文件导入
未测试成功,内存占用过大,效率过低,不推荐
2.2 HTTP导入
HTTP导入 走8123端口
支持客户端 DBeaver
有可视化的,直接选择下一步下一步完成即可
2.3 导入效率对比
导入效率对比
记录行数:85537725
clickhouse-client | DBeaver | |
---|---|---|
CSV | 1:50 | 16:34 |
2.4 体积压缩对比
体积对比
sql | csv | clickhouse | Mysql | |
---|---|---|---|---|
体积 | 20.4G | 18.1G | 6.6G | 12.21G |
压缩率 | - | - | 32.35% | 59.85% |
备注:Mysql的索引体积为23.49 G
2.5 查询效率对比
总记录数:85537725
SELECT COUNT(*) from `default`.game_order;
Mysql | Clickhouse | |
---|---|---|
执行时间 | 72.937 s | 0.005 s |
SELECT COUNT(*) from game_order_241211 WHERE uid=998077;
# mysql中,uid为索引字段
Mysql | Clickhouse | |
---|---|---|
执行时间 | 0.018 s | 0.196 s |
SELECT COUNT(*) from game_order_241211 WHERE sub_gid=990597;
# mysql条件字段无索引
Mysql | Clickhouse | |
---|---|---|
执行时间 | 99.391 s | 0.171 s |
五、PHP连接
1.走http方式(推荐,语法兼容性高)
加载包
composer require the-tinderbox/clickhouse-builder
使用
$server = new \Tinderbox\Clickhouse\Server('127.0.0.1', '8123', 'default', 'user', 'pass');
$serverProvider = (new \Tinderbox\Clickhouse\ServerProvider())->addServer($server);
$client = new \Tinderbox\Clickhouse\Client($serverProvider);
$builder = new Builder($client);
方法同TP ORM查询并且更非富 如:
$builder->select('column')->from('table', 'alias')->get();
文档地址:packagist.org/packages/th…
2.走TCP方式(不推荐,语法兼容性低)
2.1 环境
- 系统:CentOS8 (最低要求,要不然编译不过)
- PHP:7.4
2.2 安装
2.2.1 依赖
yum install lz4-devel
2.2.2 编译安装
$ git clone --recursive --depth=1 https://github.com/kolya7k/clickhouse-php.git
$ cd clickhouse-php
$ phpize
$ ./configure --with-php-config=/usr/local/php/bin/php-config #替换成自己的PHP安装地址
$ make -j 8 #替换成自己的CPU核数
$ make install
2.2.3 加入php.ini
echo "extension=/usr/local/php/lib/php/extensions/no-debug-non-zts-20200930/clickhouse.so" >> /usr/local/php/etc/php.ini
2.2.4 验证
# php -m | grep "clickhouse"
clickhouse
显示有 clickhouse 结果,代表添加成功
2.3 使用
案例 php test.php
<?php
require_once "secret.inc.php"; // CLICKHOUSE_* defines
$time = microtime(true);
$ch = new ClickHouse(CLICKHOUSE_HOST, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD, CLICKHOUSE_DATABASE, CLICKHOUSE_PORT);
$query = "SELECT
toUInt8(1),
toUInt8(-1),
toUInt16(1),
toUInt16(-1),
toUInt32(1),
toUInt32(-1),
toUInt64(1),
toUInt64(-1),
toUInt64(9223372036854775807),
toUInt64(18446744073709551615),
toInt8(1),
toInt8(-1),
toInt16(1),
toInt16(-1),
toInt32(1),
toInt32(-1),
toInt64(1),
toInt64(-1),
toInt64(9223372036854775807),
toInt64(-9223372036854775808),
toInt64(18446744073709551615),
toFloat64(1 / 3),
toFloat32(1 / 3),
toFixedString('test', 8),
toString('test'),
toDate('2021-01-01'),
toDateTime('2021-01-01 00:00:00'),
toInt8OrNull('123'),
toInt8OrNull('123qwe123'),
1 == 1,
NULL,
toDecimal128(123456789.123, 3)
";
$result = $ch->query($query) or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);
$row = $result->fetch_assoc();
var_dump($row);
/*
$result = $ch->query("SELECT * FROM bottle.stats_actions LIMIT 1000000") or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);
var_dump($result->num_rows);
$total = 0;
while ($row = $result->fetch_assoc())
$total++;
*/
$ch->query("CREATE TABLE IF NOT EXISTS test (
id UInt64,
name String,
key FixedString(5),
nullable_date Nullable(Date),
nullable_fixed Nullable(FixedString(15))
) ENGINE = Memory") or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);
// Index-based data, columns names in separated array
$ch->insert("test",
array(
array(1, "a", "aa", "2020-01-01", "test1"),
array(2, "b", "bb", NULL, "test2"),
array(3, "c", "cc", "2020-01-03", NULL)
),
array("id", "name", "key", "nullable_date", "nullable_fixed")
) or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);
// Associative array data, columns names inside data array, slower than index-based
$ch->insert("test",
array(
array('id' => 4, 'name' => "d", 'key' => "dd", 'nullable_date' => "2020-01-04", 'nullable_fixed' => NULL),
array('id' => 5, 'name' => "e", 'key' => "ee", 'nullable_date' => NULL, 'nullable_fixed' => "test5"),
array('id' => 6, 'name' => "f", 'key' => "ff", 'nullable_date' => "2020-01-06", 'nullable_fixed' => "test6")
)
) or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);
$result = $ch->query("SELECT * FROM test") or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);
while ($row = $result->fetch_assoc())
var_dump($row);
$ch->query("DROP TABLE test") or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);
echo "Memory: ".memory_get_usage()."\n";
2.4 加载ORM包使用
六、数据冷热处理方案
6.1 大数据表处理,非实时同步
- 旧数据用CSV方式全量导入
- 新数据定时增量导入
6.2 中型数据表,频繁更新的 同步
- 实时增量插入数据(有并发风险,推荐走队列)
- 使用物化引擎(推荐,Mysql需要5.7+)
6.3 小型数据表,频繁更新或者不更新 同步
- 使用Mysql引擎(注意:clickhouse不存数据,直接去MYSQL库读取缓存)
注意:不管用clickhouse表用什么引擎,都可以联表查询
七、注意事项
- clickhouse 语法类型介于强类型与弱类型之间,建议聚合时,不同类型间先转为同类型再操作,要不然会发生莫名其妙的结果
- MergeTree引擎类,查询时需要添加
FINAL
修饰符来合并结果集