ClickHouse 安装与使用(单机部署版)

67 阅读4分钟

一、环境

系统:CentOS 7.9.2009 Clickhouse:24

二、安装

1. 切阿里云源

cp -r /etc/yum.repos.d /etc/yum.repos.d.bak
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache

2. 安装

yum install -y yum-utils
yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
yum install -y clickhouse-server clickhouse-client

其它版本

wget https://packages.clickhouse.com/rpm/lts/clickhouse-common-static-21.8.15.7-2.x86_64.rpm
wget https://packages.clickhouse.com/rpm/lts/clickhouse-server-21.8.15.7-2.noarch.rpm
wget https://packages.clickhouse.com/rpm/lts/clickhouse-client-21.8.15.7-2.noarch.rpm

3. 配置

核心配置文件 /etc/clickhouse-server/config.xml

<listen_host> # 监听端口

4.启动服务

/etc/init.d/clickhouse-server start
# or
service clickhouse-server start

# 停止Clickhouse 
sudo service clickhouse-server stop 

# 重启Clickhouse 
sudo service clickhouse-server restart 

# 查看服务状态 
service clickhouse-server status 

#设置开机自启 
systemctl enable clickhouse-server 

#查看开机自启是否成功 
systemctl is-enabled clickhouse-server
  

三、 连接

使用clickhouse-client

默认情况下,使用 default 用户并不携带密码连接到 localhost:9000 。 还可以使用 --host 参数连接到指定服务器。

使用第三方工具,如 DBeaver 24.3.0

默认情况下,使用 default 用户并不携带密码连接到 localhost:8123

四、建表与查询

1. SQL文档

文档地址:clickhouse.com/docs/zh/sql…

2.导入

2.0 建表

CREATE TABLE `default`.game_order (
    id UInt64 CODEC(Delta, ZSTD),
    gid UInt32 CODEC(ZSTD),
    sub_gid UInt32 CODEC(ZSTD),
    transaction_id FixedString(100),
    uid UInt32 CODEC(ZSTD),
    bet_amount Decimal64(2) CODEC(ZSTD),
    win_amount Decimal64(2) CODEC(ZSTD),
    transfer_amount Decimal64(2) CODEC(ZSTD),
    bet_time UInt32 CODEC(Delta, ZSTD),
    settlement_time UInt32 CODEC(ZSTD),
    valid_bet Decimal64(2),
    round_id FixedString(100) CODEC(ZSTD),
    game_type UInt8,
    bet_status UInt8,
    settle_status UInt32 CODEC(ZSTD),
    extend String CODEC(ZSTD),
    create_time UInt32 CODEC(Delta, ZSTD),
    update_time UInt32 CODEC(Delta, ZSTD),
    tax_amount Decimal64(2),
    PRIMARY KEY id
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(toDate(create_time))
ORDER BY (id)
SETTINGS index_granularity = 8192

2.1 clickhouse-client

clickhouse-client 客户端 TCP连接导入 走9000端口
csv 导入 带表头

cat csv文件 | clickhouse-client  --database default --query="INSERT INTO table_name FORMAT CSVWithNames"

csv 导入 不带表头、全字段

cat csv文件 | clickhouse-client  --database default --query="INSERT INTO table_name FORMAT CSV"

sql文件导入
未测试成功,内存占用过大,效率过低,不推荐

2.2 HTTP导入

HTTP导入 走8123端口
支持客户端 DBeaver
有可视化的,直接选择下一步下一步完成即可

2.3 导入效率对比

导入效率对比
记录行数:85537725

clickhouse-clientDBeaver
CSV1:5016:34

2.4 体积压缩对比

体积对比

sqlcsvclickhouseMysql
体积20.4G18.1G6.6G12.21G
压缩率--32.35%59.85%

备注:Mysql的索引体积为23.49 G

2.5 查询效率对比

总记录数:85537725

SELECT COUNT(*) from `default`.game_order;
MysqlClickhouse
执行时间72.937 s0.005 s
SELECT COUNT(*) from game_order_241211 WHERE uid=998077;
# mysql中,uid为索引字段
MysqlClickhouse
执行时间0.018 s0.196 s
SELECT COUNT(*) from game_order_241211 WHERE sub_gid=990597;
# mysql条件字段无索引
MysqlClickhouse
执行时间99.391 s0.171 s

五、PHP连接

1.走http方式(推荐,语法兼容性高)

加载包

composer require the-tinderbox/clickhouse-builder

使用

$server = new \Tinderbox\Clickhouse\Server('127.0.0.1', '8123', 'default', 'user', 'pass');
$serverProvider = (new \Tinderbox\Clickhouse\ServerProvider())->addServer($server);

$client = new \Tinderbox\Clickhouse\Client($serverProvider);
$builder = new Builder($client);

方法同TP ORM查询并且更非富 如:

$builder->select('column')->from('table', 'alias')->get();

文档地址:packagist.org/packages/th…

2.走TCP方式(不推荐,语法兼容性低)

2.1 环境

  • 系统:CentOS8 (最低要求,要不然编译不过)
  • PHP:7.4

2.2 安装

2.2.1 依赖
yum install lz4-devel
2.2.2 编译安装
$ git clone --recursive --depth=1 https://github.com/kolya7k/clickhouse-php.git
$ cd clickhouse-php 
$ phpize
$ ./configure  --with-php-config=/usr/local/php/bin/php-config   #替换成自己的PHP安装地址
$ make -j 8 #替换成自己的CPU核数
$ make install
2.2.3 加入php.ini
echo "extension=/usr/local/php/lib/php/extensions/no-debug-non-zts-20200930/clickhouse.so" >> /usr/local/php/etc/php.ini
2.2.4 验证
# php -m | grep "clickhouse"
clickhouse

显示有 clickhouse 结果,代表添加成功

2.3 使用

案例 php test.php

<?php

	require_once "secret.inc.php";		// CLICKHOUSE_* defines

	$time = microtime(true);

	$ch = new ClickHouse(CLICKHOUSE_HOST, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD, CLICKHOUSE_DATABASE, CLICKHOUSE_PORT);

	$query = "SELECT
		toUInt8(1),
		toUInt8(-1),
		toUInt16(1),
		toUInt16(-1),
		toUInt32(1),
		toUInt32(-1),
		toUInt64(1),
		toUInt64(-1),
		toUInt64(9223372036854775807),
		toUInt64(18446744073709551615),
		toInt8(1),
		toInt8(-1),
		toInt16(1),
		toInt16(-1),
		toInt32(1),
		toInt32(-1),
		toInt64(1),
		toInt64(-1),
		toInt64(9223372036854775807),
		toInt64(-9223372036854775808),
		toInt64(18446744073709551615),
		toFloat64(1 / 3),
		toFloat32(1 / 3),
		toFixedString('test', 8),
		toString('test'),
		toDate('2021-01-01'),
		toDateTime('2021-01-01 00:00:00'),
		toInt8OrNull('123'),
		toInt8OrNull('123qwe123'),
		1 == 1,
		NULL,
		toDecimal128(123456789.123, 3)
	";

	$result = $ch->query($query) or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);

	$row = $result->fetch_assoc();
	var_dump($row);

/*
	$result = $ch->query("SELECT * FROM bottle.stats_actions LIMIT 1000000") or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);

	var_dump($result->num_rows);

	$total = 0;
	while ($row = $result->fetch_assoc())
		$total++;
*/

	$ch->query("CREATE TABLE IF NOT EXISTS test (
		id UInt64,
		name String,
		key FixedString(5),
		nullable_date Nullable(Date),
		nullable_fixed Nullable(FixedString(15))
	) ENGINE = Memory") or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);

	// Index-based data, columns names in separated array
	$ch->insert("test",
		array(
			array(1, "a", "aa", "2020-01-01", "test1"),
			array(2, "b", "bb", NULL, "test2"),
			array(3, "c", "cc", "2020-01-03", NULL)
		),
		array("id", "name", "key", "nullable_date", "nullable_fixed")
	) or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);

	// Associative array data, columns names inside data array, slower than index-based
	$ch->insert("test",
		array(
			array('id' => 4, 'name' => "d", 'key' => "dd", 'nullable_date' => "2020-01-04",	'nullable_fixed' => NULL),
			array('id' => 5, 'name' => "e", 'key' => "ee", 'nullable_date' => NULL,		'nullable_fixed' => "test5"),
			array('id' => 6, 'name' => "f", 'key' => "ff", 'nullable_date' => "2020-01-06",	'nullable_fixed' => "test6")
		)
	) or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);

	$result = $ch->query("SELECT * FROM test") or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);
	while ($row = $result->fetch_assoc())
		var_dump($row);

	$ch->query("DROP TABLE test") or trigger_error("Failed to run query: ".$ch->error." (".$ch->errno.")", E_USER_WARNING);

	echo "Memory: ".memory_get_usage()."\n";

2.4 加载ORM包使用

六、数据冷热处理方案

6.1 大数据表处理,非实时同步

  1. 旧数据用CSV方式全量导入
  2. 新数据定时增量导入

6.2 中型数据表,频繁更新的 同步

  1. 实时增量插入数据(有并发风险,推荐走队列)
  2. 使用物化引擎(推荐,Mysql需要5.7+)

6.3 小型数据表,频繁更新或者不更新 同步

  1. 使用Mysql引擎(注意:clickhouse不存数据,直接去MYSQL库读取缓存)

注意:不管用clickhouse表用什么引擎,都可以联表查询

七、注意事项

  • clickhouse 语法类型介于强类型与弱类型之间,建议聚合时,不同类型间先转为同类型再操作,要不然会发生莫名其妙的结果
  • MergeTree引擎类,查询时需要添加 FINAL 修饰符来合并结果集