目标
通过预研找到契合团队技术现状的足以支撑业务下一阶段可能达到的大规模数据(2000W)管理的解决方案,并建立原型框架验证解决方案。
技术调研主要过程
测试单表2000W数据的性能,从而确定是否需要进行分库分表
| SQL(ms) | Mybatis(ms) | 备注 | |
|---|---|---|---|
| 计数 | 166163 | 164596 | SELECT COUNT(*) from result_data |
| 分页查询 | 12 | 1066 | SELECT * from result_data LIMIT 10; |
| 单一属性查询 | 143912 | 145203 | SELECT (result_data_id) from result_data where remark =‘我是第480001006产生’ |
调研对比分库分表相关开源技术
a) MyCat,阿里在Cobar调整的,有了mycat2, Stars,5861
b) Cobar,基本停滞了,Cobar is a proxy for sharding databases and tables,compatible with MySQL protocal and MySQL SQL grama,underlying storage only support MySQL for support foreground business more simple,stable,efficient and safety。Stars,2555
c) Sharding JDBC,当当,6881,非常活跃
d) Sharding Proxy,当当,6881,非常活跃
e) TDDL ,淘宝搞的,1069,停滞
f) Atlas,360产品,Star 3909,不活跃
g) DBProxy,美团,Star 1714,基本停滞
h) HBase,重型,与目前的技术体系差别大,需要专业运维
分布式数据库(下一步发展重点)
a) DRDS,DRDS 是阿里巴巴集团自主研发的分布式数据库中间件产品,专注于解决单机关系型数据库扩展性问题,具备轻量(无状态)、灵活、稳定、高效等特性,经历历届双十一核心交易业务和各类行业业务的考验。云产品,无法离线使用
b) TiDB,最后了解到的,相关信息需要补充,star 17598,非常活跃,
i. iDB (The pronunciation is: /'taɪdiːbi:/ tai-D-B, etymology: titanium) is an open-source distributed scalable Hybrid Transactional and Analytical Processing (HTAP) database. It features horizontal scalability, strong consistency, and high availability. TiDB is MySQL compatible and serves as a one-stop data warehouse for both OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing) workloads.
ii. 生产级需要使用ansible技术,这也意味着必须采用kubernetes集群
iii. 应用场景
- 数据量大,单机保存不下
- 不希望做 Sharding 或者懒得做 Sharding
- 访问模式上没有明显的热点
- 需要事务、需要强一致、需要灾备
c) 主要应用案例
i. 北京银行:事务性,性能,以及扩展性,无侵入性
ii. 头条:如今(数据截至发文),TiDB 支撑着今日头条 OLTP 系统里数据流量最大、QPS 最高的场景:集群容量 几十T,数据量日增 5 亿,日常 QPS 均值在 12W,峰值 20W 左右。
对比分析各项技术
确定sharding proxy,sharding jdbc为突破点
a) 简单
b) 性能由一点损失
Sharding proxy
基本理论
搭建Sharding proxy集群
PS:建立四台虚拟机组成集群,三台数据库,一台proxy,OS均为windows server 2008
- 安装mysql 到三台数据库节点
- 下载3.0.0版本的release包
- 修改配置文件
a) Config-sharding.yaml
#
# Here you can configure the rules for the proxy.
# This example is configuration of sharding rule.
#
# If you want to use sharding, please refer to this file;
# if you want to use master-slave, please refer to the config-master_slave.yaml.
#
######################################################################################################
schemaName: dim_research
dataSources:
ds_0:
url: jdbc:mysql://10.0.4.212:3306/dim_research?serverTimezone=UTC&useSSL=false&characterEncoding=utf8
username: root
password: ABC.123456.ZwSe2
autoCommit: true
connectionTimeout: 30000
idleTimeout: 60000
maxLifetime: 1800000
maximumPoolSize: 50
ds_1:
url: jdbc:mysql://10.0.4.207:3306/dim_research?serverTimezone=UTC&useSSL=false&characterEncoding=utf8
username: root
password: ABC.123456.ZwSe2
autoCommit: true
connectionTimeout: 30000
idleTimeout: 60000
maxLifetime: 1800000
maximumPoolSize: 50
ds_2:
url: jdbc:mysql://10.0.4.213:3306/dim_research?serverTimezone=UTC&useSSL=false&characterEncoding=utf8
username: root
password: ABC.123456.ZwSe2
autoCommit: true
connectionTimeout: 30000
idleTimeout: 60000
maxLifetime: 1800000
maximumPoolSize: 50
shardingRule:
tables:
result_data:
actualDataNodes: ds0.result_data_0,ds1.result_data_1,ds2.result_data_2
tableStrategy:
inline:
shardingColumn: result_data_id
algorithmExpression: result_data_${result_data_id % 3}
keyGeneratorColumnName: result_data_id
bindingTables:
- result_data
defaultDatabaseStrategy:
inline:
shardingColumn: user_id
algorithmExpression: ds_${user_id % 3}
defaultTableStrategy:
none:
defaultKeyGeneratorClassName: io.shardingsphere.core.keygen.DefaultKeyGenerator
b) server.yaml
#
# If you want to configure orchestration, authorization and proxy properties, please refer to this file.
#
######################################################################################################
# orchestration:
# name: orchestration_ds
# overwrite: true
# registry:
# serverLists: localhost:2181
# namespace: orchestration
authentication:
username: root
password: root
props:
max.connections.size.per.query: 1
acceptor.size: 16
executor.size: 16
proxy.transaction.enabled: false
proxy.opentracing.enabled: false
sql.show: false
4.启动,运行 bin/start.exe
先后启动遇到的问题
- 官网配置手册参数错误问题(这个太不应该了)
a) 使用本文的配置 - 遇到无法执行SQL,无法找到query_cache_size的环境变量
a) 无法兼容Mysql8,换用mysql 5.5 - 插入数据中文是乱码问题
a) 要在url上加上characterEncoding=utf8 - 插入重复数据的问题
a) sharding.jdbc.config.sharding.tables.result_data.actual-data-nodes这行的配置是关键,如果写成了ds${0…2}.result_data会导致数据被重复存储了3份
Sharding Sidecar
需要比较强的基础设施,这里就不介绍了。下面是官网地址。
shardingsphere.apache.org/document/cu…
Sharding JDBC搭建
采用的SpringBoot+Mybatis+Druid的框架,也没什么特殊的直接上代码吧。
目录结果
关键的依赖pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.zw.se2.research</groupId>
<artifactId>mybatisDemo</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.0.0.RELEASE</version>
</parent>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<java.version>1.8</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.38</version>
</dependency>
<!-- https://mvnrepository.com/artifact/io.shardingsphere/sharding-jdbc-core -->
<dependency>
<groupId>io.shardingsphere</groupId>
<artifactId>sharding-jdbc-spring-boot-starter</artifactId>
<version>3.1.0</version>
</dependency>
<!-- for spring namespace -->
<dependency>
<groupId>io.shardingsphere</groupId>
<artifactId>sharding-jdbc-spring-namespace</artifactId>
<version>3.1.0</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.datatype</groupId>
<artifactId>jackson-datatype-joda</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-parameter-names</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid</artifactId>
<version>1.1.10</version>
</dependency>
<dependency>
<groupId>org.mybatis.spring.boot</groupId>
<artifactId>mybatis-spring-boot-starter</artifactId>
<version>1.3.1</version>
</dependency>
<!--mapper-->
<dependency>
<groupId>tk.mybatis</groupId>
<artifactId>mapper-spring-boot-starter</artifactId>
<version>2.0.0</version>
</dependency>
<!--pagehelper-->
<dependency>
<groupId>com.github.pagehelper</groupId>
<artifactId>pagehelper-spring-boot-starter</artifactId>
<version>1.2.3</version>
<exclusions>
<exclusion>
<groupId>org.mybatis.spring.boot</groupId>
<artifactId>mybatis-spring-boot-starter</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-configuration-processor</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid-spring-boot-starter</artifactId>
<version>1.1.0</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-aop</artifactId>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.3</version>
</dependency>
<dependency>
<groupId>net.sf.jacob-project</groupId>
<artifactId>jacob</artifactId>
<version>1.14.3</version>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
<repositories>
<repository>
<id>spring-snapshots</id>
<url>https://repo.spring.io/snapshot</url>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
<repository>
<id>spring-milestones</id>
<url>https://repo.spring.io/milestone</url>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>spring-snapshots</id>
<url>https://repo.spring.io/snapshot</url>
<snapshots>
<enabled>true</enabled>
</snapshots>
</pluginRepository>
<pluginRepository>
<id>spring-milestones</id>
<url>https://repo.spring.io/milestone</url>
<snapshots>
<enabled>false</enabled>
</snapshots>
</pluginRepository>
</pluginRepositories>
</project>
关键配置application-dev.properties
server.port=9010
debug=true
logging.level.root=info
logging.level.tk.mybatis.springboot.mapper=trace
logging.config=ext/conf/logback.xml
sharding.jdbc.datasource.names=ds0,ds1,ds2
sharding.jdbc.datasource.ds0.driver-class-name=com.mysql.jdbc.Driver
sharding.jdbc.datasource.ds0.type=com.alibaba.druid.pool.DruidDataSource
sharding.jdbc.datasource.ds0.url=jdbc:mysql://10.0.4.207:3306/dim_research?characterEncoding=utf8
sharding.jdbc.datasource.ds0.username=root
sharding.jdbc.datasource.ds0.password=ABC.123456.ZwSe2
sharding.jdbc.datasource.ds1.driver-class-name=com.mysql.jdbc.Driver
sharding.jdbc.datasource.ds1.type=com.alibaba.druid.pool.DruidDataSource
sharding.jdbc.datasource.ds1.url=jdbc:mysql://10.0.4.212:3306/dim_research?characterEncoding=utf8
sharding.jdbc.datasource.ds1.username=root
sharding.jdbc.datasource.ds1.password=ABC.123456.ZwSe2
sharding.jdbc.datasource.ds2.driver-class-name=com.mysql.jdbc.Driver
sharding.jdbc.datasource.ds2.type=com.alibaba.druid.pool.DruidDataSource
sharding.jdbc.datasource.ds2.url=jdbc:mysql://10.0.4.213:3306/dim_research?characterEncoding=utf8
sharding.jdbc.datasource.ds2.username=root
sharding.jdbc.datasource.ds2.password=ABC.123456.ZwSe2
sharding.jdbc.config.sharding.tables.result_data.actual-data-nodes=ds0.result_data_0,ds1.result_data_1,ds2.result_data_2
sharding.jdbc.config.sharding.tables.result_data.table-strategy.inline.sharding-column=result_data_id
sharding.jdbc.config.sharding.tables.result_data.table-strategy.inline.algorithm-expression=result_data_$->{result_data_id % 3}
sharding.jdbc.config.sharding.tables.result_data.key-generator-column-name=result_data_id
sharding.jdbc.config.sharding.binding-tables=result_data
sharding.jdbc.config.sharding.default-database-strategy.inline.sharding-column=user_id
sharding.jdbc.config.sharding.default-database-strategy.inline.algorithm-expression=ds$->{user_id % 3}
具体的工程还是参考本人的github
github.com/zhaoenweiex…
测试
测试环境
版本mysql 5.5,未添加索引
数据库节点:虚拟机,4核8G
测试结果
PS:Proxy 插入速度极为缓慢,比单表批量插入性能要低30倍,接近单个插入速度,猜测由于采用id分表的原因,批量插入的模式变为单个插入,性能大幅降低。
- 100W数据下测试
| 操作 | 单表(214) | Proxy(214) | Sharding-jdbc(207,212,213) |
|---|---|---|---|
| Count | 1230 | 3600 | 665 |
| 特殊属性查询 | 1680 | 4900 | 1475 |
| 100分页查询 | 61 | 157 | 48 |
| 批量插入 | 113000 | 3610000 | 686000 |
2.2000W测试
| 操作 | 单表(214) | Proxy(214) | Sharding-jdbc(207,212,213) |
|---|---|---|---|
| Count | 126000 | 53987 | 3555 |
| 特殊属性查询 | 126000 | 9183 | 8760 |
| 100分页查询 | 350 | 307 | 107 |
| 批量插入 | 2189000 | 41599000 | 7237031 |
3.6000W数据的测试
| 操作 | 单表(214) | Proxy(214) | Sharding-jdbc(207,212,213) |
|---|---|---|---|
| Count | 480816 | 258757 | 258757 |
| 特殊属性查询 | 511972 | 252089 | 219035 |
| 100分页查询 | 307 | 270 | 213 |
| 批量插入 | 4331000 | 72819989 | 12278000 |