ByConity ELT 的体验之旅

159 阅读3分钟

在实际业务场景中,企业通常会根据不同的需求和产品特性分别构建实时数据仓库(Real-time Data Warehouse)和离线数据仓库(Offline Data Warehouse)。实时数仓的核心优势在于能够实现数据的快速入库,并且一旦数据被加载到系统中,就能立即进行分析,提供低延迟的数据洞察。而离线数仓则更侧重于支持复杂查询任务的稳定执行,同时要求有优秀的内存管理能力来处理大规模的数据集。

ByConity引入了一种称为Bulk Synchronous Parallel (BSP)模式的新功能,这一创新不仅提升了系统的容错性,允许在task级别上进行错误恢复,还实现了更加精细的任务调度机制。相较于MPP,BSP可能在某些情况下提供了更高的灵活性,尤其是在需要精确控制任务间交互的情况下。

通过引入BSP模式,ByConity致力于将传统的ETL流程(即Extract, Transform, Load)中的数据转换环节直接集成进平台内部。这样一来,用户可以享受到从数据接入、数据加工直至最终分析的一站式服务体验。

ByConity社区在一周年的时候提出BSP的工作之后,我就一直对这个功能很期待,也对我个人后续的工作有所启发,这次也正好借社区的体验一下。

测试环境

image.png

因为做cpu优化较多,这里特地要了一下cpu的型号,方便后续线下性能测试做进一步的对比:

d230b3c615c8a54dddcd4c8c40e0db36.jpg

数据集:TPCDS-1TB

测试流程

这里我整理了一个测试的脚本,完整的跑完TPCDS的99个SQL。每个SQL跑三次,记录下查询耗时,将MPP的结果和BSP的结果进行对比

# 创建日志文件来记录耗时
LOG_FILE="execution_times.log"
echo "SQL File, Time 1 (ms), Time 2 (ms), Time 3 (ms)" > $LOG_FILE

# 循环遍历 SQL 文件并执行
for ((i=$START_NUM; i<=$END_NUM; i++)); do
    # 格式化文件编号为两位数
    FILE_NUM=$(printf "%02d" $i)
    SQL_FILE="${FILE_NUM}.sql"

    # 检查文件是否存在
    if [ ! -f "$SQL_FILE" ]; then
        echo "File $SQL_FILE does not exist. Skipping..."
        continue
    fi
    echo "Start execute the $SQL_FILE..."

    # 初始化时间数组
    TIMES=()

    for ((j=1; j<=$EXECUTIONS; j++)); do
        # 记录开始时间
        START_TIME=$(date +%s.%N)

        # 执行 SQL 文件
        clickhouse client --port $PORT -d $DATABASE --multiquery < $SQL_FILE  > /dev/null 2>&1

        # 记录结束时间
        END_TIME=$(date +%s.%N)

        # 计算耗时(以毫秒为单位)
        ELAPSED_TIME=$(echo "scale=3; ($END_TIME - $START_TIME) * 1000" | bc)

        # 将耗时添加到数组中
        TIMES+=("$ELAPSED_TIME")
    done

    # 记录耗时到日志文件
    echo "$SQL_FILE, ${TIMES[0]}, ${TIMES[1]}, ${TIMES[2]}" >> $LOG_FILE
done

echo "Execution completed. Check $LOG_FILE for timing details."

通过开关bsp_mode = 1可以控制执行模式。默认bsp模式是关闭状态,第一轮测试我们先用mpp模式来测试一下ByConity的性能表现:

SQL File, Time 1 (ms), Time 2 (ms), Time 3 (ms)
01.sql, 2500.417235000, 740.928270000, 748.862854000
02.sql, 5684.105272000, 4112.837026000, 4114.081477000
03.sql, 13663.636658000, 1352.766360000, 1345.474030000
04.sql, 57313.197289000, 56479.017025000, 56780.650707000
05.sql, 7091.974125000, 1893.200936000, 1906.786671000
06.sql, 676.565873000, 628.883112000, 627.184614000
07.sql, 27245.450581000, 1405.223999000, 1401.063053000
08.sql, 3136.510212000, 897.744280000, 885.744084000
09.sql, 26050.702879000, 20137.338260000, 20135.895216000
10.sql, 598.898387000, 579.964113000, 586.561807000
11.sql, 35065.900974000, 36280.781710000, 35118.692726000
12.sql, 321.999170000, 280.052752000, 279.179310000
13.sql, 16680.277731000, 8356.482468000, 9274.254234000
14.sql, 48200.840253000, 32061.014278000, 32036.357957000
15.sql, 833.055095000, 812.941883000, 815.898660000
16.sql, 15796.652983000, 1666.280626000, 1656.412677000
17.sql, 3386.908611000, 3408.060309000, 3361.758404000
18.sql, 6169.164768000, 1083.924399000, 1069.319139000
19.sql, 1344.175624000, 868.651252000, 858.782186000
20.sql, 330.173017000, 316.371376000, 315.890560000
21.sql, 212.421237000, 193.637222000, 195.864209000
22.sql, 3129.260372000, 3101.200425000, 3108.546019000
23.sql, 65557.232779000, 66138.563086000, 60963.935310000
24.sql, 82751.868197000, 44671.792286000, 44690.845093000
25.sql, 8965.225003000, 2705.733237000, 2731.016437000
26.sql, 10686.814294000, 659.515003000, 650.898823000
27.sql, 12399.255246000, 1442.244891000, 1368.909192000
28.sql, 40626.715580000, 13384.962368000, 13402.759085000
29.sql, 14114.979109000, 2883.055353000, 2853.068223000
30.sql, 2063.347478000, 520.281904000, 509.875105000
31.sql, 8688.910161000, 3019.013709000, 3082.307454000
32.sql, 210.020309000, 164.222858000, 165.223693000
33.sql, 6289.966591000, 823.148774000, 842.899082000
34.sql, 5117.687548000, 2153.117965000, 2198.254187000
35.sql, 2693.072014000, 2673.821479000, 2683.144098000
36.sql, 11939.336363000, 1073.810272000, 1062.518232000
37.sql, 495.314805000, 464.935323000, 471.237599000
38.sql, 12322.080144000, 12490.622366000, 12532.220487000
39.sql, 1132.253501000, 1039.626539000, 1099.546727000
40.sql, 972.943419000, 583.638358000, 587.086361000
41.sql, 134.245786000, 130.778370000, 130.875383000
42.sql, 380.865131000, 375.654269000, 383.434725000
43.sql, 3379.108348000, 3251.953053000, 3245.598081000
44.sql, 18685.521714000, 2334.813128000, 2341.997262000
45.sql, 915.332408000, 878.502493000, 859.795414000
46.sql, 3177.641372000, 3136.025993000, 3150.580043000
47.sql, 11609.751725000, 11353.503033000, 11458.342197000
48.sql, 5595.841693000, 5787.241086000, 5667.469774000
49.sql, 7317.552930000, 1208.849251000, 1174.010077000
50.sql, 3494.156798000, 3428.563442000, 3434.261943000
51.sql, 15843.361274000, 16659.372657000, 16821.447917000
52.sql, 393.870314000, 386.926845000, 400.376744000
53.sql, 908.235498000, 903.865270000, 908.247767000
54.sql, 6204.004771000, 2650.722724000, 2671.216048000
55.sql, 372.331130000, 368.516284000, 373.530255000
56.sql, 4506.461852000, 779.692293000, 781.802262000
57.sql, 10066.929804000, 9862.378469000, 9866.496327000
58.sql, 660.396760000, 654.991440000, 651.146011000
59.sql, 6953.868901000, 6834.725211000, 6824.593703000
60.sql, 1307.181758000, 901.548828000, 922.496505000
61.sql, 1458.323381000, 1350.249955000, 1347.974119000
62.sql, 6197.203773000, 2243.225064000, 2230.614091000
63.sql, 888.172977000, 889.989969000, 886.241186000
64.sql, 28210.991356000, 6647.572140000, 5622.200290000
65.sql, 6673.980898000, 7225.716856000, 6747.398444000
66.sql, 12265.408138000, 1405.136443000, 1422.536582000
67.sql, 69651.678875000, 70368.248963000, 69426.461034000
68.sql, 76899.306189000, 3578.698381000, 3557.638454000
69.sql, 976.799987000, 832.500641000, 842.894378000
70.sql, 11540.984201000, 4439.065095000, 4476.331550000
71.sql, 2966.646143000, 814.902046000, 805.085165000
72.sql, 12874.799024000, 10610.950123000, 10634.905894000
73.sql, 2084.069396000, 2088.949367000, 2067.472098000
74.sql, 19148.432190000, 19172.953789000, 19109.045509000
75.sql, 50763.672160000, 31934.888646000, 32352.643441000
76.sql, 30501.735149000, 2957.278268000, 2951.074677000
77.sql, 7191.921021000, 1137.286131000, 1103.082557000
78.sql, 46670.698119000, 46100.776684000, 36461.773545000
79.sql, 53901.977722000, 4821.301949000, 4546.112628000
80.sql, 16741.823639000, 5042.225487000, 2191.789492000
81.sql, 1822.843299000, 765.879847000, 779.053110000
82.sql, 16502.587213000, 956.885797000, 948.975353000
83.sql, 1961.360390000, 615.315087000, 608.643247000
84.sql, 1412.588023000, 392.027797000, 374.822517000
85.sql, 11472.411646000, 3901.538067000, 3950.645119000
86.sql, 1627.505837000, 1424.079341000, 1443.756288000
87.sql, 12569.177245000, 12652.031655000, 12681.586801000
88.sql, 10508.045348000, 5102.712999000, 5094.397391000
89.sql, 5319.608600000, 1509.646864000, 1506.996456000
90.sql, 5843.208930000, 726.093508000, 720.111817000
91.sql, 285.486470000, 241.264828000, 238.374335000
92.sql, 261.837246000, 165.526041000, 158.327533000
93.sql, 19569.216061000, 2987.161975000, 2978.897082000
94.sql, 11832.125811000, 996.486809000, 990.362105000
95.sql, 1246.484201000, 1250.263064000, 1233.122815000
96.sql, 2232.953358000, 2209.995074000, 2204.039183000
97.sql, 36844.506700000, 37434.347727000, 36652.348167000
98.sql, 828.595516000, 661.760082000, 650.544832000
99.sql, 9068.254419000, 4088.992554000, 4102.461222000

这里可以看到,ByConity在MPP模式下性能表现还是很不错的。这里有个两个问题,

问题1: q23和q78会因为内存问题OOM,这个是符合预期的,后续我们需要用BSP来解决问题

问题2: q58和q72会因为语法问题报错,联系社区工作人员之后修改解决,感谢支持 Code: 207. DB::Exception: Received from localhost:9010. DB::Exception: Identifier item_id is ambiguous. SQLSTATE: 42000.

那继续进行BSP模式测试,这次所有SQL都完整的完成了执行,包含MPP下OOM的Q23和Q78

SQL File, Time 1 (ms), Time 2 (ms), Time 3 (ms)
01.sql, 26722.031596000, 26492.893537000, 26446.361956000
02.sql, 31065.377459000, 31147.889539000, 30993.427781000
03.sql, 10278.226295000, 10266.399854000, 10366.373874000
04.sql, 138282.770139000, 123085.811137000, 106433.360729000
05.sql, 17760.646261000, 14901.309146000, 14918.992985000
06.sql, 31533.417102000, 31400.945812000, 31312.090135000
07.sql, 15843.670964000, 10500.574634000, 10402.000032000
08.sql, 23009.647530000, 22662.262988000, 22555.920426000
09.sql, 29600.148896000, 20810.320432000, 20847.361783000
10.sql, 29049.560720000, 28942.565147000, 29024.406006000
11.sql, 55818.604251000, 56477.392441000, 56233.032348000
12.sql, 13318.778156000, 13701.742569000, 13616.521443000
13.sql, 42777.006882000, 35753.250187000, 35209.959702000
14.sql, 75893.365030000, 63944.528364000, 63941.490294000
15.sql, 14125.856492000, 13870.746401000, 13797.039160000
16.sql, 36910.312133000, 28091.914105000, 27803.665052000
17.sql, 13313.996398000, 13118.177079000, 13287.699459000
18.sql, 26807.929604000, 24963.689395000, 24860.940334000
19.sql, 17852.355546000, 17363.481630000, 17478.566336000
20.sql, 13544.595523000, 13426.667319000, 13715.701160000
21.sql, 10915.450114000, 11103.168840000, 11201.933655000
22.sql, 11177.809594000, 11199.218434000, 11170.620869000
23.sql, 95525.067332000, 74690.923636000, 65435.519655000
24.sql, 41636.368499000, 41776.413677000, 41717.704777000
25.sql, 13017.293090000, 12705.788314000, 12799.757906000
26.sql, 10649.612521000, 10202.927975000, 10299.757852000
27.sql, 10581.454456000, 10443.283052000, 10548.158721000
28.sql, 35783.265476000, 13844.859154000, 13897.076117000
29.sql, 10607.731902000, 10491.013468000, 10491.133896000
30.sql, 21725.926681000, 21762.764809000, 21778.087844000
31.sql, 24299.720290000, 21010.188163000, 21309.682188000
32.sql, 5389.201218000, 5369.533164000, 5368.888436000
33.sql, 23725.273792000, 21529.862940000, 21529.260250000
34.sql, 25608.728194000, 15713.747300000, 15704.771956000
35.sql, 22296.397332000, 22482.317801000, 22504.225759000
36.sql, 14932.192386000, 13246.874438000, 13337.078140000
37.sql, 9858.560769000, 9856.357304000, 9754.014575000
38.sql, 24381.511811000, 24113.737009000, 23514.338521000
39.sql, 14799.839324000, 14992.682438000, 14892.679677000
40.sql, 11488.520830000, 11363.936801000, 11458.776777000
41.sql, 13596.089633000, 13689.824944000, 13692.038170000
42.sql, 9925.183129000, 9918.596750000, 9725.406951000
43.sql, 13190.962023000, 13223.525714000, 13211.963894000
44.sql, 44126.567086000, 18069.764273000, 17865.943197000
45.sql, 13650.256459000, 13215.767153000, 13322.678498000
46.sql, 20621.480942000, 20608.508017000, 20608.494047000
47.sql, 16946.395711000, 16929.917376000, 17096.739385000
48.sql, 25504.501428000, 24718.439220000, 24742.584333000
49.sql, 12703.927522000, 7682.757147000, 7710.857133000
50.sql, 12962.462304000, 12958.872020000, 13045.328567000
51.sql, 56483.476684000, 55917.182683000, 55741.020917000
52.sql, 10138.772807000, 10122.555045000, 10120.122952000
53.sql, 10078.068967000, 10271.387885000, 10169.703197000
54.sql, 36698.242916000, 30630.877662000, 30748.780449000
55.sql, 10116.005668000, 10115.576753000, 10120.710484000
56.sql, 19118.662504000, 19186.645778000, 19206.072488000
57.sql, 16433.860463000, 16697.133893000, 16648.739447000
58.sql, 32963.895128000, 33055.430963000, 33161.468104000
59.sql, 28898.450211000, 29639.837288000, 28918.387203000
60.sql, 31186.907274000, 30657.261138000, 30996.820533000
61.sql, 13986.140110000, 14073.000494000, 14178.580338000
62.sql, 14450.303806000, 13234.386282000, 13424.198712000
63.sql, 10186.707278000, 10067.448863000, 10171.885742000
64.sql, 44363.126450000, 42477.211247000, 41962.867209000
65.sql, 30552.027869000, 31407.086590000, 30193.627328000
66.sql, 15445.771725000, 10926.975135000, 10847.065053000
67.sql, 38308.111802000, 37123.872824000, 36988.787588000
68.sql, 46749.746920000, 20917.471457000, 20847.972963000
69.sql, 28227.077216000, 28142.590601000, 28132.271631000
70.sql, 30205.081997000, 29695.569115000, 29513.607330000
71.sql, 13841.887903000, 13542.863171000, 13540.206828000
72.sql, 64267.889347000, 32334.672386000, 32379.371700000
73.sql, 15612.919999000, 15823.905673000, 15709.334894000
74.sql, 48687.331535000, 40072.875919000, 39937.582034000
75.sql, 24661.008459000, 24207.868836000, 23994.381694000
76.sql, 15223.124105000, 11184.239498000, 11096.187986000
77.sql, 17154.878475000, 16789.658421000, 16908.968220000
78.sql, 57174.191802000, 50462.589161000, 50606.435908000
79.sql, 44621.636090000, 18097.898596000, 18237.533171000
80.sql, 14138.912932000, 14071.173317000, 13900.648541000
81.sql, 21891.091714000, 21981.555024000, 21947.079610000
82.sql, 9768.186183000, 9851.307172000, 9970.844719000
83.sql, 35617.526827000, 35694.611641000, 35697.596243000
84.sql, 21871.068119000, 21354.852896000, 21370.744626000
85.sql, 23873.552047000, 23944.963596000, 23817.038381000
86.sql, 13217.648130000, 13299.050867000, 13399.703009000
87.sql, 23639.510148000, 21016.517114000, 23447.402300000
88.sql, 19099.515640000, 14254.041606000, 14241.940389000
89.sql, 13650.938837000, 13865.859245000, 13973.062387000
90.sql, 8428.838873000, 6955.946790000, 6945.333944000
91.sql, 22865.669305000, 22710.952329000, 22727.951913000
92.sql, 5397.735836000, 5356.107248000, 5359.618246000
93.sql, 30624.605215000, 14209.039581000, 14225.397273000
94.sql, 20322.301796000, 19024.578683000, 18927.594829000
95.sql, 35509.364709000, 35096.358035000, 35081.657804000
96.sql, 7485.227887000, 7488.326103000, 7507.468006000
97.sql, 41720.915746000, 42062.217960000, 42043.485966000
98.sql, 13553.314248000, 13484.398158000, 13384.501815000
99.sql, 14710.196019000, 14997.864624000, 14834.412881000

测试结果分析

从上面的查询结果看BSP通过shuffle spill的方式,稳健的跑出了对应的查询,但是和纯粹内存计算的MPP比,性能上也就失色了不少。这也对应了本文开头提到的离线和实时的两种负载情况。用户可以根据对应的场景选择合适的测试方式。

建议

  • 自适应并行度 当前bsp模式需要手动设置distributed_max_parallel_size 这对用户来说还是有不小的心智成本的
  • shuffle service 可以考虑引入对应的服务,进一步提升BSP模型下的执行效率

最后感谢ByConity的社区同学,提供了一个很好的测试环境,完结撒花~~~