ByConity业务应用ELT测试

231 阅读14分钟

一、ByConity 介绍

ByConity 是字节跳动开源的新一代云原生数据仓库,基于Clickhouse 进行优化和扩展,采用了存算分离架构,

成熟的OLAP技术,资源弹性调整与隔离、数据查询写入、数据一致性、查询调度 方面均大幅提升。

在交互式分析、实时数据看板、实时数仓的等查询场景表现优异。

更是推出了BSP模式来应对大数据量,复杂业务场景的数据计算需求。

技术架构:

ByConity 大体上可以分为 3 层:服务接入层,计算层 和 存储层。

服务接入层:将用户的查询解析,生成高效执行计划,选取合适的计算节点下发计算任务,支持水平扩展。

计算层:计算层由一个或者多个计算组构成,计算组之间物理隔离,计算组由多个worker 可以动态扩缩容。

存储层:存储层由元数据和数据构成,元数据存在分布式kv中,数据存储在分布式文件系统HDFS 或 S3中,

与计算节点分离,可以降低计算节点负载时的对存储的影响。ByConity 采用列式存储格式,减少不必要的数据 IO 提高查询性能,并对数据进行高效压缩,降低存储成本。除此之外,对于连续存储的列式数据,ByConity 通过向量化执行技术,进一步提升查询性能。

功能特性:

  • 弹性扩缩容与读写分离:ByConity 采用存储计算分离的架构,可以按业务场景进行动态扩缩容,成本可以控。

  • 多租户隔离与资源共享:用户可以为查询 SQL 指定计算组,可以对不同业务进行针对性配置资源。提升了计算组的资源利用率。

  • 查询优化器 与查询调度:具有 基于成本、基于规则与基于数据依赖的优化能力。基于策励与资源利用率的调度。

  • 数据湖与ELT场景支持: 支持通过外部形式访问Hive 与 Hudi。对常规ELT业务场景做了支持。

  • 与Clickhouse 的 SQL 兼容: 可以不修改业务SQL就可以迁移过来。

部署可以参考:

ByConity在对象存储上的部署

了解ByConity 的MPP模式与BSP模式:

我们今天主要对ByConity 的ELT场景进行一个简单的测试。

ByConity 在ELT场景推出了BSP模式,可以参考:ByConity 技术详解之 ELT

简单了解一下MPP与BSP

MPP 架构:

  • 是一种分布式的计算架构,它将任务分配为多个子任务,并行地分配到多个独立的节点上进行并行处理,各节点独立处理完成后,汇总到一个或多节点上。适用于处理大规模的数据密集型任务,如海量数据的分析、复杂的 SQL 查询等

  • MPP优点:

    • 水平扩张能力强,并发处理能力强。
  • MPP缺点:

    • 单节点硬件资源的上限会影响查询数据量。

    • 单节点执行慢会影响整体任务的执行。

    • 出现问题,要从头开始执行。

BSP 架构:

  • 是一个基于整体同步并行计算模型,它将任务拆分为DAG图任务,采用多阶段执行查询,并且进行阶段调度,只有上一个阶段全部执行完,才能执行下一个阶段。

  • BSP优点:

    • 可以用比较小的资源处理更大数据量的计算(处理大于机器内存的数据),
    • 具有task 级别的容错,可以重用中间落盘的计算数据,减少试错成本。
    • 执行更加稳定,支持长时间任务。
  • 缺点:

    • 相比MPP具有通信开销,任务等待同步开销。
    • 小任务中间开销大。

如何开启BSP模式:

执行SQL时候,在SQL尾部添加

  • bsp_mode = 1,来打开BSP模式,
  • distributed_max_parallel_size 可以设置算子并行度,并行度设置越大,单个算子需要的内存资源就越小,大家可以自行根据自己的需求合理设置。
SETTINGS bsp_mode = 1, 
distributed_max_parallel_size = 20

ByConity 使用 ELT 相关参数:

在 query settings 中通过以下参数使用 ELT 能力:

二、常规业务场景下 MPP模式与BSP模式的测试

测试集群配置:

服务名资源服务名资源
Worker:4 * 8core 64GServer:1 * 16core 64G
Daemon Manager:1 * 4core 16GResource Manager:1 * 4core 16G
TSO:1 * 4core 4GFoundationDB3 * 4core 16G
CPU 型号Intel(R) Xeon(R) Platinum 8457C
ByConity版本:v1.0.1 hotfi

测试数据集:

表名表描述数据条数
test_base_fast事件表详情表145亿条
common_base_tag_info用户画像/标签表1289万条

连接到测试集群:

1、连接到ByConity

clickhouse client --port 9000 --host=xxx --user=xxx --password=xxx; 

测试SQL类型:

测试SQL比较长,大家可以直接跳到后面看测试结果。

1、留存查询

SQL语句【大数据量下的查询】:

WITH RankedSales AS
    (
        SELECT
            pd_device_id,
            gameid,
            SUM(playtime) AS total_playtime,
            pd_is_new_user,
            ck_date,
            pd_kind
        FROM test.test_base_fast
        WHERE pd_kind NOT IN ('pandora_http_biz_code', 'mwc_performance', 'pandora_http_request_error', 'ts_action_click', 'app_detail')
        GROUP BY
            pd_device_id,
            gameid,
            pd_is_new_user,
            ck_date,
            pd_kind
    )
SELECT
    sum(retention_result[12]) AS retention_result_12,
    sum(retention_result[11]) AS retention_result_11,
    sum(retention_result[10]) AS retention_result_10,
    sum(retention_result[9]) AS retention_result_9,
    sum(retention_result[8]) AS retention_result_8,
    sum(retention_result[7]) AS retention_result_7,
    sum(retention_result[6]) AS retention_result_6,
    sum(retention_result[5]) AS retention_result_5,
    sum(retention_result[4]) AS retention_result_4,
    sum(retention_result[3]) AS retention_result_3,
    sum(retention_result[2]) AS retention_result_2,
    sum(retention_result[1]) AS retention_result_1,
    pd_is_new_user
FROM
(
    SELECT
        pd_device_id,
        pd_is_new_user,
        retention((pd_kind = 'play_game') AND (ck_date = '2024-12-06'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-07'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-08'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-09'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-10'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-11'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-12'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-13'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-14'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-15'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-16'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-17')) AS retention_result
    FROM
    (
        SELECT DISTINCT
            pd_kind,
            pd_device_id,
            ck_date,
            pd_is_new_user
        FROM RankedSales AS pd_tmp_table
    )
    GROUP BY
        pd_device_id,
        pd_is_new_user
)
WHERE (retention_result[1]) = 1
GROUP BY pd_is_new_user
SETTINGS bsp_mode = 1, distributed_max_parallel_size = 20

SQL语句【小数据量下的查询】:

WITH RankedSales AS
    (
        SELECT
            pd_device_id,
            gameid,
            SUM(playtime) AS total_playtime,
            pd_is_new_user,
            ck_date,
            pd_kind
        FROM test.test_base_fast
        WHERE pd_kind NOT IN ('pandora_http_biz_code', 'mwc_performance', 'pandora_http_request_error', 'ts_action_click', 'app_detail')
        GROUP BY
            pd_device_id,
            gameid,
            pd_is_new_user,
            ck_date,
            pd_kind
    )
SELECT
    sum(retention_result[12]) AS retention_result_12,
    sum(retention_result[11]) AS retention_result_11,
    sum(retention_result[10]) AS retention_result_10,
    sum(retention_result[9]) AS retention_result_9,
    sum(retention_result[8]) AS retention_result_8,
    sum(retention_result[7]) AS retention_result_7,
    sum(retention_result[6]) AS retention_result_6,
    sum(retention_result[5]) AS retention_result_5,
    sum(retention_result[4]) AS retention_result_4,
    sum(retention_result[3]) AS retention_result_3,
    sum(retention_result[2]) AS retention_result_2,
    sum(retention_result[1]) AS retention_result_1,
    pd_is_new_user
FROM
(
    SELECT
        pd_device_id,
        pd_is_new_user,
        retention((pd_kind = 'play_game') AND (ck_date = '2024-12-06'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-07'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-08'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-09'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-10'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-11'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-12'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-13'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-14'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-15'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-16'), (pd_kind = 'mwc_performance') AND (ck_date = '2024-12-17')) AS retention_result
    FROM
    (
        SELECT DISTINCT
            pd_kind,
            pd_device_id,
            ck_date,
            pd_is_new_user
        FROM RankedSales AS pd_tmp_table
    )
    -- 这里我们进行过滤,将 需要用的 play_game 与 mwc_performance 过滤出来,再进行计算
    WHERE pd_kind in ('play_game', 'mwc_performance')  
    GROUP BY
        pd_device_id,
        pd_is_new_user
)
WHERE (retention_result[1]) = 1
GROUP BY pd_is_new_user
SETTINGS bsp_mode = 1, distributed_max_parallel_size = 4

2、转化查询

SQL语句【大数据量下的查询】:

SELECT
    sum(multiIf(ck_pd_level >= 1, 1, 0)) AS ck_pd_level_number_1,
    sum(multiIf(ck_pd_level >= 2, 1, 0)) AS ck_pd_level_number_2,
    sum(multiIf(ck_pd_level >= 3, 1, 0)) AS ck_pd_level_number_3,
    sum(multiIf(ck_pd_level >= 4, 1, 0)) AS ck_pd_level_number_4,
    sum(multiIf(ck_pd_level >= 5, 1, 0)) AS ck_pd_level_number_5
FROM
(
    SELECT
        pd_device_id,
        windowFunnel(3600)(toTimeZone(ck_datetime_timestamp, 'Asia/Shanghai'), 
        kind = 'app_time', kind = 'app_detail', kind = 'play_game', 
        kind = 'mwc_performance', kind = 'mwc_performance') AS ck_pd_level
    FROM test.test_base_fast
    WHERE pd_device_id IS NOT NULL
    GROUP BY pd_device_id
)
SETTINGS bsp_mode = 1, distributed_max_parallel_size = 8

SQL语句【小数据量下的查询】:

SELECT
    sum(multiIf(ck_pd_level >= 1, 1, 0)) AS ck_pd_level_number_1,
    sum(multiIf(ck_pd_level >= 2, 1, 0)) AS ck_pd_level_number_2,
    sum(multiIf(ck_pd_level >= 3, 1, 0)) AS ck_pd_level_number_3
FROM
(
    SELECT
        pd_device_id,
        windowFunnel(3600)(toTimeZone(ck_datetime_timestamp, 'Asia/Shanghai'), 
        kind = 'app_time', kind = 'app_detail', kind = 'play_game') AS ck_pd_level
    FROM test.test_base_fast
    WHERE (pd_device_id IS NOT NULL) AND (kind IN ('app_time', 'app_detail', 'play_game')) AND (ck_date IN ('2024-12-14'))
    GROUP BY pd_device_id
)
SETTINGS bsp_mode = 1,
         distributed_max_parallel_size = 20;

3、计算圈选用户的指标

SQL语句【大数据量下的查询】:

WITH
    tab_info AS
    (
        SELECT
            tag_str_val,
            groupArray(tag_id) AS tag_array
        FROM test.common_base_tag_info
        WHERE tag_str_val != ''
        GROUP BY tag_str_val
    ),
    kind_table AS
    (
        SELECT
            pd_uid,
            playtime,
            kind
        FROM test.test_base_fast
        LEFT JOIN tab_info ON tab_info.tag_str_val = test.test_base_fast.pd_uid
    ),
    palytime_table AS
    (
        SELECT
            pd_uid,
            sum(playtime) AS play_time_sum,
            countDistinct(pd_uid),
            play_time_sum / count(1) AS avg_time
        FROM kind_table
        WHERE kind IN ('app_time', 'play_game', 'mwc_performance')
        GROUP BY pd_uid
        HAVING play_time_sum > 200000
    )
SELECT
    sum(play_time_sum)/1000 AS plays_time_sum,
    countDistinct(pd_uid) AS pd_uid_count,
    plays_time_sum / countDistinct(pd_uid)/1000 AS avg_play_time
FROM palytime_table
SETTINGS bsp_mode = 1, distributed_max_parallel_size = 20

SQL语句【小数据量下的查询】:

WITH
    tab_info AS
    (
        SELECT
            tag_str_val,
            groupArray(tag_id) AS tag_array
        FROM test.common_base_tag_info
        WHERE tag_str_val != ''
        GROUP BY tag_str_val
    ),
    kind_table AS
    (
        SELECT
            pd_uid,
            playtime,
            kind
        FROM test.test_base_fast
        LEFT JOIN tab_info ON tab_info.tag_str_val = test.test_base_fast.pd_uid
    ),
    palytime_table AS
    (
        SELECT
            pd_uid,
            sum(playtime) AS play_time_sum,
            countDistinct(pd_uid),
            play_time_sum / count(1) AS avg_time
        FROM kind_table
        WHERE kind IN ('app_time', 'play_game', 'mwc_performance')
        and ck_date IN ('2024-12-14')
        GROUP BY pd_uid
        HAVING play_time_sum > 200000
    )
SELECT
    sum(play_time_sum)/1000 AS plays_time_sum,
    countDistinct(pd_uid) AS pd_uid_count,
    plays_time_sum / countDistinct(pd_uid)/1000 AS avg_play_time
FROM palytime_table
SETTINGS bsp_mode = 1, distributed_max_parallel_size = 20
4、大量数据聚合并插入新表中

SQL语句【大数据量下的查询】:

(5亿条数据19个列做分组聚合, 聚合结果为2.3亿条)

INSERT INTO test.test_common_base_realtime_flight_data (ck_date, pd_kind, pd_device_id, pd_app_version_code, pd_is_new_user, pd_biz_fresh_install, show_categoryid, channelid, launchtype, isfirstplay, crashtype, pagename, gameid, ck_pandora_new_ab_group_array, ck_pandora_ab_group_array, ck_pandora_switch_ab_group_array, playtime, show_param1, pv, ssp_unit_id, icon_type, reqid, result, rechargequota, voucherquota, price, ssp_price, ssp_unit_id_ecpm, ssp_233_income, realtime_ecpm_ratio, duration, gpackagename)
SELECT
    ck_date,
    pd_kind,
    pd_device_id,
    pd_app_version_code,
    pd_is_new_user,
    pd_biz_fresh_install,
    show_categoryid,
    channelid,
    launchtype,
    isfirstplay,
    crashtype,
    pagename,
    arrayDistinct(groupArray(ifNull(gameid, 0))) AS gameid,
    ck_pandora_new_ab_group_array,
    ck_pandora_ab_group_array,
    ck_pandora_switch_ab_group_array,
    sum(if(playtime IS NOT NULL, if((playtime > 2000) AND (playtime < 21600000), playtime, 0), NULL)) AS playtime,
    sum(show_param1) AS show_param1,
    count(*) AS pv,
    ssp_unit_id,
    icon_type,
    reqid,
    result,
    sum(rechargequota) AS rechargequota,
    sum(voucherquota) AS voucherquota,
    sum(price) AS price,
    sum(if(ssp_price_float IS NOT NULL, if((ssp_price_float >= 0) AND (ssp_price_float <= 50000), ssp_price_float, 0), 0)) AS ssp_price,
    sum(if(ssp_unit_id_ecpm IS NOT NULL, if((ssp_unit_id_ecpm > 0) AND (ssp_unit_id_ecpm <= 500), ssp_unit_id_ecpm, 0), 0)) AS ssp_unit_id_ecpm,
    sum(if(ssp_233_income IS NOT NULL, if((ssp_233_income > 0) AND (ssp_233_income <= 0.5), ssp_233_income, 0), 0)) AS ssp_233_income,
    sum(if(realtime_ecpm_ratio IS NOT NULL, if((realtime_ecpm_ratio > 0) AND (realtime_ecpm_ratio < 500), realtime_ecpm_ratio, 0), 0)) AS realtime_ecpm_ratio,
    sum(if(duration IS NOT NULL, if((duration > 2000) AND (duration < 21600000), duration, 0), 0)) AS duration,
    arrayDistinct(groupArray(ifNull(gpackagename, ''))) AS gpackagename
FROM test.test_base_fast
WHERE (ck_date IN ('2024-12-14')) AND (pd_kind IN ('app_time', 'item_web_click', 'play_game', 'event_start_launch_game', 'crash', 'item_click', 'feed_item_click', 'event_community_article_use_time', 'event_community_main_topic_show', 'event_launch_game_success', 'event_new_ad_show_success', 'event_the_third_pay_success', 'event_ad_free_privilege_pay_success', 'event_le_coin_recharge_succeeded', 'event_intermodal_sdk_le_coin_recharge_result'))
GROUP BY
    ck_date,
    pd_kind,
    pd_device_id,
    pd_app_version_code,
    pd_is_new_user,
    pd_biz_fresh_install,
    show_categoryid,
    channelid,
    launchtype,
    isfirstplay,
    crashtype,
    pagename,
    ck_pandora_new_ab_group_array,
    ck_pandora_ab_group_array,
    ck_pandora_switch_ab_group_array,
    ssp_unit_id,
    icon_type,
    reqid,
    result
SETTINGS bsp_mode = 1, distributed_max_parallel_size = 16

测试结果:

1、留存查询
  • 大数据量下的查询测试结果:
模式执行时间s数据处理总条数/总大小每秒处理行/数据大小
MPP模式内存不足
BSP模式20 并发315.338 sec7.10 billion rows / 535.00 GB22.51 million rows/s / 1.70 GB/s

BSP模式 资源消耗监控信息

  • 小型数据量的查询测试结果:
模式执行时间数据处理总行/总大小每秒处理行/数据大小
MPP模式16.926 sec386.15 million rows / 25.20 GB22.81 million rows/s / 1.49 GB/s
BSP模式20 并发24.508 sec514.64 million rows / 33.58 GB21.00 million rows/s / 1.37 GB/s
BSP模式4 并发17.193 sec514.82 million rows / 33.59 GB29.94 million rows/s / 1.95 GB/s

MPP模式 与 BSP模式的资源消耗 (蓝色框 MPP模式,红色框 BSP模式)

  • 总结:

1、MPP模式出现内存不足的报错时,可以尝试使用BSP模式。

2、BSP模式可以通过拆分任务,降低对资源的消耗,从而达成小资源跑大任务的目的。

3、BSP模式合理调整任务并行数,也可以解决MPP的计算时间。

4、BSP消耗的内存要小于MPP


2、 转化分析
  • 大数据量下的查询测试结果:
模式执行时间数据总处理行/大小每秒处理行/数据大小
MPP模式106.687 sec10.96 billion rows / 632.60 GB102.70 million rows/s / 5.93 GB/s
BSP模式20 并发106.23 sec14.62 billion rows / 844.25 GB137.65 million rows/s / 7.95 GB/s.
BSP模式4 并发104.148 sec.14.65 billion rows / 846.02 GB140.70 million rows/s / 8.12 GB/s

MPP模式 与 BSP模式的资源消耗 (蓝色框 MPP模式,红色框 BSP模式)

  • 小数据量下的查询测试结果:
模式执行时间数据总处理行/ 总大小每秒处理行/数据大小
MPP模式5.269 se1.67 billion rows / 57.42 GB316.11 million rows/s / 10.90 GB/s
BSP模式20 并发6.448 sec2.22 billion rows / 76.57 GB344.51 million rows/s / 11.88 GB/s
BSP模式4 并发5.349 sec2.22 billion rows / 76.57 GB415.28 million rows/s / 14.32 GB/s

MPP模式 与 BSP模式的资源消耗 (蓝色框 MPP模式,红色框 BSP模式)

  • 总结:

1、转换分析场景 MPP与BSP的查询速度接近


3、计算圈选用户的指标
  • 大数据量下的查询测试结果:
模式执行时间数据总处理行/ 总大小每秒处理行/数据大小
MPP模式64.789 sec15.41 billion rows / 620.37 GB237.83 million rows/s / 9.58 GB/s
BSP模式20 并发117.395 sec.15.41 billion rows / 620.63 GB131.30 million rows/s / 5.29 GB/s
BSP 模式 4 并发64.407 sec15.42 billion rows / 620.77 GB239.40 million rows/s / 9.64 GB/s

MPP模式 与 BSP模式的资源消耗 (蓝色框 MPP模式,红色框 BSP模式)

  • 总结:

1、并不是任务并发越高越好,合适的并发可以让BSP的执行时间接近MPP


4、大量数据聚合并插入新表中

(5亿条数据19个列做分组聚合, 聚合结果为2.3亿条,并插入另一个表中)

  • 大数据量下的查询测试结果:
模式执行时间数据总处理行/ 总大小每秒处理行/数据大小
MPP模式内存不足
BSP模式20 并发1699.801 sec488.49 million rows / 565.08 GB287.38 thousand rows/s / 332.44 MB/s
BSP模式40 并发1746.204488.49 million rows / 565.08 GB279.74 thousand rows/s / 323.61 MB/s

BSP模式 20并行任务

  • 总结:

1、对于大型复杂的数仓聚合场景,BSP可以完成任务,MPP则无法应付这种场景

三、应用总结:

1、BSP模式更适合大数据量的计算,可以通过分解并行任务,降低单个任务的内存需求,能够有效保障任务执行的稳定性,更适合ELT,数仓等长任务场景。

2、BSP的任务并行度,并不是设置的越高越好,设置的太高,内存占用低,执行时间长。设置的太低,内存不足,无法完成计算,需要多次尝试获取最佳值。

3、MPP 模式相对BSP模式,可以读取更少的数据完成查询,在短时任务的运行上更具优势。通过排序键、分区、索引等手段可以进一步减少需要操作的数据量,达到更快的查询速度。