High Performance TiDB Course Study Plan: Week 2


Assignment

Points: 300

Description:

Benchmark TiDB with sysbench, go-ycsb, and go-tpc, and produce a test report.

The test report must include the following:

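* Machine configuration of the deployment environment (CPU, memory, disk specifications and models) and the topology (which nodes TiDB and TiKV are deployed on)
* The tuned TiDB and TiKV configuration
* Test output
* Monitoring screenshots of key metrics
    * QPS and duration in TiDB Query Summary
    * CPU and QPS of each server under Cluster in the TiKV Details panel
    * gRPC QPS and duration in the TiKV Details panel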

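Output: write up your analysis of the TiDB cluster load under this configuration, topology, and workload, and identify where you think TiDB's performance bottleneck lies (naming the approximate module is enough).

Deadline: next Tuesday (8.25), 24:00:00 (late submissions receive no credit)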

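Study materials

Study process

This week's study went fairly smoothly. I had already set up a TiDB 3.0 cluster with the operator, so this time I used TiDB Operator again to set up a 4.0 cluster.

Environment configuration

For setting up the TiDB cluster, see the official documentation: Get Started with TiDB Operator on Kubernetes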

| Host | Configuration (CPU / memory / disk) | OS | CPU model |
|------|------|------|------|
| IDC1 H1 | 32 cores / 128G / 4562G | CentOS Linux release 7.4.1708 (Core) | Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz |
| IDC1 H2 | 32 cores / 128G / 4562G | CentOS Linux release 7.4.1708 (Core) | Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz |
| IDC2 H1 | 32 cores / 128G / 4562G | CentOS Linux release 7.4.1708 (Core) | Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz |
| IDC2 H2 | 32 cores / 128G / 4562G | CentOS Linux release 7.4.1708 (Core) | Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz |
| IDC3 H1 | 32 cores / 128G / 4562G | CentOS Linux release 7.4.1708 (Core) | Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz |
| IDC3 H2 | 32 cores / 128G / 4562G | CentOS Linux release 7.4.1708 (Core) | Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz |

Operator setup

| Component | Version |
|------|------|
| Operator | v1.1.2 |
| tidb image | pingcap/tidb:v4.0.4 |
| tikv image | pingcap/tikv:v4.0.4 |
| pd image | pingcap/pd:v4.0.4 |

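Component configuration

| Component | Replicas | CPU | Memory | Topology |
|------|------|------|------|------|
| tidb | 1 | 28 | 100G | IDC1 H2 |
| tikv | 3 | 28 | 100G | IDC1 H1, IDC2 H1, IDC3 H1 |
| pd | 1 | 28 | 60G | IDC2 H2 |
| sysbench | | | | Mac computer |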

Config

tidb

      compatible-kill-query: true
      log:
        expensive-threshold: 10000
        file:
          max-backups: 300
          max-days: 30
          max-size: 300
        level: info
        query-log-max-len: 30000
        slow-threshold: 1000
      mem-quota-query: 209715200
      oom-action: log
      pessimistic-txn:
        enable: true
        max-retry-count: 256
      tikv-client:
        commit-timeout: 41s
        grpc-connection-count: 16
        grpc-keepalive-time: 10
        grpc-keepalive-timeout: 3
      token-limit: 10000

tikv

      raftstore:
        apply-pool-size: 2
        raft-base-tick-interval: 2s
        store-pool-size: 2
      readpool:
        coprocessor:
          high-concurrency: 6
          low-concurrency: 6
          normal-concurrency: 6
        storage:
          high-concurrency: 6
          low-concurrency: 6
          max-tasks-per-worker-high: 24000
          max-tasks-per-worker-low: 24000
          max-tasks-per-worker-normal: 24000
          normal-concurrency: 6
      rocksdb:
        defaultcf:
          dynamic-level-bytes: true
          level0-file-num-compaction-trigger: 4
          level0-slowdown-writes-trigger: 40
          level0-stop-writes-trigger: 60
        max-background-jobs: 8
        max-sub-compactions: 3
        writecf:
          dynamic-level-bytes: true
          level0-file-num-compaction-trigger: 4
          level0-slowdown-writes-trigger: 40
          level0-stop-writes-trigger: 60
      server:
        grpc-compression-type: none
        grpc-concurrency: 4
        grpc-keepalive-time: 12s
      storage:
        block-cache:
          capacity: "3.0"
        scheduler-concurrency: 2048000

Prepare data

The download is rather slow, so downloading through a proxy is recommended.

# Install sysbench
brew install sysbench

pod_ip=${xx} # set this to the TiDB pod IP
# Tune TiDB for the data import (run the SQL statements below in the mysql client)
mysql -P4000 -uroot -h$pod_ip
create database sbtest;
set global tidb_disable_txn_auto_retry = off;
set global tidb_txn_mode="optimistic";

# Prepare the sysbench config
mkdir sysbench
cd sysbench
cat > config << EOF
mysql-host=$pod_ip
mysql-port=4000
mysql-user=root
mysql-password=
mysql-db=sbtest
time=600
threads=16
report-interval=10
db-driver=mysql
EOF

# Load the data
sysbench --config-file=config oltp_point_select --tables=32 --table-size=1000000 prepare

After a little over two hours, the data load finally completed.

sysbench stress test

# Switch back to pessimistic transactions
set global tidb_txn_mode="pessimistic" ;

Point select test command

sysbench --config-file=config oltp_point_select --threads=16 --tables=32 --table-size=5000000 run --time=120
SQL statistics:
    queries performed:
        read:                            101778
        write:                           0
        other:                           0
        total:                           101778
    transactions:                        101778 (848.06 per sec.)
    queries:                             101778 (848.06 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          120.0110s
    total number of events:              101778

Latency (ms):
         min:                                    7.60
         avg:                                   18.86
         max:                                 1083.14
         95th percentile:                       19.29
         sum:                              1919977.27

Threads fairness:
    events (avg/stddev):           6361.1250/129.63
    execution time (avg/stddev):   119.9986/0.00
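
As a sanity check on the numbers above, Little's law for a closed-loop benchmark says throughput is roughly the number of client threads divided by the mean latency. The sketch below applies it to the 16 threads and 18.86 ms average latency reported above; `expected_tps` is a hypothetical helper name, not a sysbench feature.

```shell
# Little's law for a closed loop: throughput ~= concurrency / mean latency.
# expected_tps is a hypothetical helper, not part of sysbench.
expected_tps() {
  # $1 = client threads, $2 = average latency in milliseconds
  awk -v threads="$1" -v avg_ms="$2" \
    'BEGIN { printf "%.1f\n", threads * 1000 / avg_ms }'
}

# 16 threads at 18.86 ms average latency (values from the output above)
expected_tps 16 18.86
```

The result, about 848.4 per second, is close to the reported 848.06 transactions per second. Little's law holds for any closed loop, so the match only confirms that the output is self-consistent: with 16 threads fixed, throughput can rise only if per-request latency falls or the thread count is raised.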

Update index test

sysbench --config-file=config oltp_update_index --threads=16 --tables=32 --table-size=5000000 run --time=120
SQL statistics:
    queries performed:
        read:                            0
        write:                           1
        other:                           116131
        total:                           116132
    transactions:                        116132 (967.65 per sec.)
    queries:                             116132 (967.65 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          120.0135s
    total number of events:              116132

Latency (ms):
         min:                                   10.24
         avg:                                   16.53
         max:                                  625.93
         95th percentile:                       17.63
         sum:                              1919998.24

Threads fairness:
    events (avg/stddev):           7258.2500/178.74
    execution time (avg/stddev):   119.9999/0.00

Read-only test

sysbench --config-file=config oltp_read_only --threads=16 --tables=32 --table-size=5000000 run --time=120
SQL statistics:
    queries performed:
        read:                            72548
        write:                           0
        other:                           10364
        total:                           82912
    transactions:                        5182   (43.08 per sec.)
    queries:                             82912  (689.28 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          120.2874s
    total number of events:              5182

Latency (ms):
         min:                                  153.68
         avg:                                  371.17
         max:                                 2499.93
         95th percentile:                      909.80
         sum:                              1923412.99

Threads fairness:
    events (avg/stddev):           323.8750/4.85
    execution time (avg/stddev):   120.2133/0.06
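
The read-only transaction rate looks far lower than the point select rate, but each read-only transaction bundles several statements. A small sketch, using a hypothetical `per_query_ms` helper, divides the reported totals to get statements per transaction and the average time per statement.

```shell
# per_query_ms is a hypothetical helper, not part of sysbench.
per_query_ms() {
  # $1 = total queries, $2 = total transactions, $3 = average transaction latency (ms)
  awk -v queries="$1" -v txns="$2" -v avg_ms="$3" 'BEGIN {
    per_txn = queries / txns
    printf "%.1f queries/txn, %.1f ms/query\n", per_txn, avg_ms / per_txn
  }'
}

# Totals from the read-only output above
per_query_ms 82912 5182 371.17
```

The 16 statements per transaction agree with the counters in the output above (72548 / 5182 = 14 reads and 10364 / 5182 = 2 "other" statements per transaction). The resulting figure of about 23.2 ms per statement is in the same range as the 18.86 ms point select average.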

go-ycsb test

git clone https://github.com/pingcap/go-ycsb
cd go-ycsb
make

Run

host=
./bin/go-ycsb load mysql -P workloads/workloada -p recordcount=10000000 -p mysql.host=$host -p mysql.port=4000 --threads 16
...
INSERT - Takes(s): 24240.0, Count: 9981062, OPS: 411.8, Avg(us): 38654, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 723000, 99.99th(us): 1154000
INSERT - Takes(s): 24250.0, Count: 9983405, OPS: 411.7, Avg(us): 38651, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 723000, 99.99th(us): 1154000
INSERT - Takes(s): 24260.0, Count: 9985757, OPS: 411.6, Avg(us): 38648, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 723000, 99.99th(us): 1154000
INSERT - Takes(s): 24270.0, Count: 9987497, OPS: 411.5, Avg(us): 38647, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 723000, 99.99th(us): 1154000
INSERT - Takes(s): 24280.0, Count: 9989750, OPS: 411.4, Avg(us): 38644, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 723000, 99.99th(us): 1154000
INSERT - Takes(s): 24290.0, Count: 9991808, OPS: 411.4, Avg(us): 38643, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 722000, 99.99th(us): 1154000
INSERT - Takes(s): 24300.0, Count: 9993949, OPS: 411.3, Avg(us): 38640, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 722000, 99.99th(us): 1154000
INSERT - Takes(s): 24310.0, Count: 9996145, OPS: 411.2, Avg(us): 38638, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 722000, 99.99th(us): 1154000
INSERT - Takes(s): 24320.0, Count: 9998363, OPS: 411.1, Avg(us): 38635, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 722000, 99.99th(us): 1154000
INSERT - Takes(s): 24330.0, Count: 9999734, OPS: 411.0, Avg(us): 38633, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 722000, 99.99th(us): 1154000
INSERT - Takes(s): 24340.0, Count: 10000000, OPS: 410.8, Avg(us): 38633, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 722000, 99.99th(us): 1154000
INSERT - Takes(s): 24350.0, Count: 10000000, OPS: 410.7, Avg(us): 38633, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 722000, 99.99th(us): 1154000
Run finished, takes 6h45m55.595477627s
INSERT - Takes(s): 24355.6, Count: 10000000, OPS: 410.6, Avg(us): 38633, Min(us): 11143, Max(us): 8813451, 99th(us): 448000, 99.9th(us): 722000, 99.99th(us): 1154000
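
In the output above, the OPS column tracks the cumulative Count divided by Takes(s), so the final figure can be cross-checked directly. The following sketch recomputes it from summary lines read on stdin; `ycsb_recompute_ops` is a hypothetical helper name, and the parsing assumes the comma-separated `Key: value` layout shown above.

```shell
# ycsb_recompute_ops is a hypothetical helper, not part of go-ycsb.
# It reads go-ycsb summary lines on stdin and prints Count / Takes(s) for the last one.
ycsb_recompute_ops() {
  awk '
    /Takes\(s\): / {
      n = split($0, fields, ", ")
      for (i = 1; i <= n; i++) {
        if (fields[i] ~ /Takes\(s\): /) { takes = fields[i]; sub(/.*Takes\(s\): /, "", takes) }
        if (fields[i] ~ /^Count: /)     { count = fields[i]; sub(/^Count: /, "", count) }
      }
    }
    END { if (takes + 0 > 0) printf "%.1f\n", count / takes }
  '
}

# Final summary line from the load above (trailing latency fields omitted)
printf '%s\n' 'INSERT - Takes(s): 24355.6, Count: 10000000, OPS: 410.6' | ycsb_recompute_ops
```

This gives 410.6, matching the reported OPS. The 24355.6 seconds also agree with the "Run finished, takes 6h45m55.595477627s" line.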

go-tpc test

curl --proto '=https' --tlsv1.2 -sSf https://raw.githubusercontent.com/pingcap/go-tpc/master/install.sh | sh

Prepare data

go-tpc -H 10.48.80.33 tpcc --warehouses 4 --parts 4 prepare

# Output
....
begin to check warehouse 4 at condition 3.3.2.2
begin to check warehouse 4 at condition 3.3.2.9
Finished

Run

go-tpc -H 10.48.80.33 tpcc --warehouses 4 run

# Output
Finished
DELIVERY - Takes(s): 3302.8, Count: 10945, TPM: 198.8, Sum(ms): 5672222, Avg(ms): 518, 95th(ms): 1000, 99th(ms): 1500, 99.9th(ms): 2000
NEW_ORDER - Takes(s): 3303.0, Count: 124460, TPM: 2260.8, Sum(ms): 26087451, Avg(ms): 209, 95th(ms): 512, 99th(ms): 512, 99.9th(ms): 1000
NEW_ORDER_ERR - Takes(s): 15.1, Count: 11, TPM: 43.8, Sum(ms): 318799, Avg(ms): 28981, 95th(ms): 16000, 99th(ms): 16000, 99.9th(ms): 16000
ORDER_STATUS - Takes(s): 3302.9, Count: 10987, TPM: 199.6, Sum(ms): 634410, Avg(ms): 57, 95th(ms): 80, 99th(ms): 96, 99.9th(ms): 512
PAYMENT - Takes(s): 3303.1, Count: 119181, TPM: 2164.9, Sum(ms): 19114865, Avg(ms): 160, 95th(ms): 512, 99th(ms): 512, 99.9th(ms): 1000
PAYMENT_ERR - Takes(s): 0.0, Count: 4, TPM: 423520.9, Sum(ms): 121224, Avg(ms): 30306, 95th(ms): 16000, 99th(ms): 16000, 99.9th(ms): 16000
STOCK_LEVEL - Takes(s): 3303.0, Count: 11206, TPM: 203.6, Sum(ms): 715604, Avg(ms): 63, 95th(ms): 80, 99th(ms): 96, 99.9th(ms): 512
STOCK_LEVEL_ERR - Takes(s): 0.0, Count: 1, TPM: 130504.9, Sum(ms): 30153, Avg(ms): 30153, 95th(ms): 16000, 99th(ms): 16000, 99.9th(ms): 16000
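
TPC-C throughput is conventionally reported as tpmC, the number of New-Order transactions completed per minute. Assuming the line layout shown above, this sketch pulls the TPM value from the `NEW_ORDER` line; `tpmc_from_output` is a hypothetical helper name, not a go-tpc feature.

```shell
# tpmc_from_output is a hypothetical helper, not part of go-tpc.
# It reads go-tpc summary lines on stdin and prints the TPM field of the NEW_ORDER line.
tpmc_from_output() {
  awk '
    /^NEW_ORDER - / {              # does not match NEW_ORDER_ERR
      n = split($0, fields, ", ")
      for (i = 1; i <= n; i++) {
        if (fields[i] ~ /^TPM: /) { tpm = fields[i]; sub(/^TPM: /, "", tpm) }
      }
    }
    END { print tpm }
  '
}

# Two lines from the run above (trailing latency fields omitted)
printf '%s\n' \
  'NEW_ORDER - Takes(s): 3303.0, Count: 124460, TPM: 2260.8, Sum(ms): 26087451' \
  'NEW_ORDER_ERR - Takes(s): 15.1, Count: 11, TPM: 43.8, Sum(ms): 318799' \
  | tpmc_from_output
```

This prints 2260.8 for the run above, which is consistent with 124460 New-Order transactions over 3303.0 seconds (about 2261 per minute).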

Monitoring screenshots

QPS and duration in TiDB Query Summary

CPU and QPS of each server under Cluster in the TiKV Details panel

gRPC QPS and duration in the TiKV Details panel

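Performance analysis

I am still learning this area; different scenarios probably call for different configurations:

  1. Data import: use optimistic transactions;
  2. Write-heavy, read-light workloads: the bottleneck may be in the Raft thread pools (raftstore `store-pool-size` and `apply-pool-size`, both set to 2 above) and in RocksDB compaction.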