ShardingSphere: Sharding-JDBC


Sharding-JDBC uses the adapter pattern to rewrap JDBC's core objects (DataSource, Connection, Statement, PreparedStatement, ResultSet), providing data sharding, read/write splitting, and other features behind the standard JDBC API.

JDBC → ShardingSphere
DataSource → ShardingSphereDataSource
Connection → ShardingSphereConnection
Statement → ShardingSphereStatement
PreparedStatement → ShardingSpherePreparedStatement
ResultSet → ShardingSphereResultSet
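
Before diving into the source, here is a hypothetical, stripped-down sketch of the adapter idea (none of the names below are real ShardingSphere classes): the facade fronts a map of physical DataSources and opens the real connection for whichever shard the routed SQL needs.

import java.sql.Connection;
import java.sql.SQLException;
import java.util.Map;
import javax.sql.DataSource;

// Hypothetical sketch of the adapter idea, not real ShardingSphere code.
final class ShardingFacadeSketch {

    // logical name -> physical data source, e.g. "ds_0", "ds_1"
    private final Map<String, DataSource> dataSourceMap;

    ShardingFacadeSketch(final Map<String, DataSource> dataSourceMap) {
        this.dataSourceMap = dataSourceMap;
    }

    // The real ShardingSphereConnection does this lazily per routed unit.
    Connection physicalConnection(final String dataSourceName) throws SQLException {
        return dataSourceMap.get(dataSourceName).getConnection();
    }
}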

Let's start with the creation of ShardingSphereDataSource:

new ShardingSphereDataSource(Strings.isNullOrEmpty(schemaName) ? DefaultSchema.LOGIC_NAME : schemaName, getModeConfiguration(modeConfig), dataSourceMap, configs, props);

ShardingSphereDataSource has two member fields:

private final String schemaName;

private final ContextManager contextManager;

public ShardingSphereDataSource(final String schemaName, final ModeConfiguration modeConfig, final Map<String, DataSource> dataSourceMap,
                                final Collection<RuleConfiguration> ruleConfigs, final Properties props) throws SQLException {
    this.schemaName = schemaName;
    contextManager = createContextManager(schemaName, modeConfig, dataSourceMap, ruleConfigs, props);
}

Apart from handing out Connections, the heart of this class is the ContextManager object:

private ContextManager createContextManager(final String schemaName, final ModeConfiguration modeConfig, final Map<String, DataSource> dataSourceMap,
                                            final Collection<RuleConfiguration> ruleConfigs, final Properties props) throws SQLException {
    // Build the mode: 1. Memory, 2. Standalone
    ShardingSphereMode mode = ModeBuilderEngine.build(modeConfig);
    // Store the per-schema map of physical data sources
    Map<String, Map<String, DataSource>> dataSourcesMap = Collections.singletonMap(schemaName, dataSourceMap);
    // Store the schema rule configurations
    Map<String, Collection<RuleConfiguration>> schemaRuleConfigs = Collections.singletonMap(
            schemaName, ruleConfigs.stream().filter(each -> each instanceof SchemaRuleConfiguration).collect(Collectors.toList()));
    Collection<RuleConfiguration> globalRuleConfigs = ruleConfigs.stream().filter(each -> each instanceof GlobalRuleConfiguration).collect(Collectors.toList());
    // Obtain the ContextManagerBuilder implementation via SPI
    ContextManagerBuilder builder = TypedSPIRegistry.getRegisteredService(ContextManagerBuilder.class, modeConfig.getType(), new Properties());
    return builder.build(mode, dataSourcesMap, schemaRuleConfigs, globalRuleConfigs, props, modeConfig.isOverwrite());
}
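
The TypedSPIRegistry lookup above is built on Java's SPI mechanism. A conceptual sketch, assuming a minimal typed-SPI interface of our own (the real TypedSPI interface and registry differ in detail): implementations are listed in META-INF/services and matched by the type string they report, e.g. "Memory" resolving to MemoryContextManagerBuilder.

import java.util.ServiceLoader;

// TypedSketchSPI is our own stand-in, not ShardingSphere's TypedSPI.
interface TypedSketchSPI {
    String getType();
}

final class TypedSpiLookupSketch {

    static <T extends TypedSketchSPI> T byType(final Class<T> spiClass, final String type) {
        // ServiceLoader reads the META-INF/services entries for spiClass.
        for (T implementation : ServiceLoader.load(spiClass)) {
            if (implementation.getType().equalsIgnoreCase(type)) {
                return implementation;
            }
        }
        throw new IllegalStateException("No implementation registered for type: " + type);
    }
}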

The build method of MemoryContextManagerBuilder:

    @Override
    public ContextManager build(final ShardingSphereMode mode, final Map<String, Map<String, DataSource>> dataSourcesMap, 
                                final Map<String, Collection<RuleConfiguration>> schemaRuleConfigs, final Collection<RuleConfiguration> globalRuleConfigs, 
                                final Properties props, final boolean isOverwrite) throws SQLException {
        // Build the MetaDataContexts
        MetaDataContexts metaDataContexts = new MetaDataContextsBuilder(dataSourcesMap, schemaRuleConfigs, globalRuleConfigs, props).build(null);
        // Build the TransactionContexts
        TransactionContexts transactionContexts = createTransactionContexts(metaDataContexts);
        ContextManager result = new MemoryContextManager();
        result.init(metaDataContexts, transactionContexts);
        return result;
    }

The build method of MetaDataContextsBuilder populates everything MetaDataContexts needs, including the configuration of the real databases, the sharding configuration, and so on.

public MetaDataContexts build(final DistMetaDataPersistService persistService) throws SQLException {
    Map<String, ShardingSphereMetaData> metaDataMap = new HashMap<>(schemaRuleConfigs.size(), 1);
    Map<String, ShardingSphereMetaData> actualMetaDataMap = new HashMap<>(schemaRuleConfigs.size(), 1);
    for (String each : schemaRuleConfigs.keySet()) {
        Map<String, DataSource> dataSourceMap = dataSources.get(each);
        Collection<RuleConfiguration> ruleConfigs = schemaRuleConfigs.get(each);
        DatabaseType databaseType = DatabaseTypeRecognizer.getDatabaseType(dataSourceMap.values());
        Collection<ShardingSphereRule> rules = ShardingSphereRulesBuilder.buildSchemaRules(new ShardingSphereRulesBuilderMaterials(each, ruleConfigs, databaseType, dataSourceMap, props));
        Map<TableMetaData, TableMetaData> tableMetaData = SchemaBuilder.build(new SchemaBuilderMaterials(databaseType, dataSourceMap, rules, props));
        ShardingSphereRuleMetaData ruleMetaData = new ShardingSphereRuleMetaData(ruleConfigs, rules);
        ShardingSphereResource resource = buildResource(databaseType, dataSourceMap);
        ShardingSphereSchema actualSchema = new ShardingSphereSchema(tableMetaData.keySet().stream().filter(Objects::nonNull).collect(Collectors.toMap(TableMetaData::getName, v -> v)));
        actualMetaDataMap.put(each, new ShardingSphereMetaData(each, resource, ruleMetaData, actualSchema));
        metaDataMap.put(each, new ShardingSphereMetaData(each, resource, ruleMetaData, buildSchema(tableMetaData)));
    }
    OptimizeContextFactory optimizeContextFactory = new OptimizeContextFactory(actualMetaDataMap);
    return new MetaDataContexts(persistService, metaDataMap, buildGlobalSchemaMetaData(metaDataMap), executorEngine, props, optimizeContextFactory);
}

The init method of MemoryContextManager:

public final class MemoryContextManager implements ContextManager {
    
    private volatile MetaDataContexts metaDataContexts = new MetaDataContexts(null);
    
    private volatile TransactionContexts transactionContexts = new TransactionContexts();
    
    @Override
    public void init(final MetaDataContexts metaDataContexts, final TransactionContexts transactionContexts) {
        this.metaDataContexts = metaDataContexts;
        this.transactionContexts = transactionContexts;
    }

MetaDataContexts is the metadata part of the context. Metadata is data that describes data. From a database's point of view, any data about the database itself is metadata: column names, database names, user names, table names, and the information about database objects kept in the data dictionary all count. ShardingSphere's core features, such as data sharding and encryption/decryption, are built on top of database metadata, whether to generate routes or to work out which columns to encrypt and decrypt. Metadata is therefore the core on which a ShardingSphere system runs, as it is for any storage-related middleware or component. With metadata injected, the system gains something like a nervous system and can perform targeted operations on databases, tables, and columns: data sharding, data encryption, SQL rewriting, and so on.
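
To make "metadata" concrete, here is a plain-JDBC sketch that enumerates the tables and columns such features depend on. This is illustrative only, not how ShardingSphere itself loads metadata; dataSource is any configured javax.sql.DataSource.

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

final class PlainJdbcMetaDataSketch {

    // List every table and its columns through the standard JDBC API.
    static void printMetaData(final DataSource dataSource) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            DatabaseMetaData metaData = connection.getMetaData();
            try (ResultSet tables = metaData.getTables(null, null, "%", new String[]{"TABLE"})) {
                while (tables.next()) {
                    String tableName = tables.getString("TABLE_NAME");
                    try (ResultSet columns = metaData.getColumns(null, null, tableName, "%")) {
                        while (columns.next()) {
                            System.out.println(tableName + "." + columns.getString("COLUMN_NAME"));
                        }
                    }
                }
            }
        }
    }
}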

Member fields of MetaDataContexts:

private final DistMetaDataPersistService distMetaDataPersistService;

// database metadata, keyed by schema name
private final Map<String, ShardingSphereMetaData> metaDataMap;
// global rule metadata
private final ShardingSphereRuleMetaData globalRuleMetaData;

private final ExecutorEngine executorEngine;

private final OptimizeContextFactory optimizeContextFactory;

private final ConfigurationProperties props;

private final StateContext stateContext;

Member fields of ShardingSphereMetaData:

private final String name;

private final ShardingSphereResource resource;

private final ShardingSphereRuleMetaData ruleMetaData;

private final ShardingSphereSchema schema;

Member fields of ShardingSphereSchema:

private final Map<String, TableMetaData> tables;

Member fields of TableMetaData:

private final String name;

private final Map<String, ColumnMetaData> columns;

private final Map<String, IndexMetaData> indexes;

As for how ShardingSphere loads this metadata, we first need to understand the types and levels of metadata involved. Metadata in ShardingSphere revolves around ShardingSphereMetaData, and the most central piece is ShardingSphereSchema. It is the database metadata and also the top-level object of the data source metadata; the structure of database metadata in ShardingSphere is shown in the figure below. At each level, the upper layer is assembled from the data of the layer below it, so we will analyze the layers one by one from the bottom up.

[Figure: hierarchy of ShardingSphere database metadata, from ColumnMetaData and IndexMetaData up through TableMetaData to ShardingSphereSchema]

ColumnMetaData and IndexMetaData are the basic building blocks of TableMetaData. Below we cover the structure and loading of each in turn (see the assembly sketch after the two classes). Their main structures are:

public final class ColumnMetaData {
    // column name
    private final String name;
    // JDBC data type code
    private final int dataType;
    // whether the column is part of the primary key
    private final boolean primaryKey;
    // whether the value is auto-generated
    private final boolean generated;
    // whether the column is case-sensitive
    private final boolean caseSensitive;
}

public final class IndexMetaData {
    
    private final String name;
}
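
A bottom-up assembly sketch matching the layering above. The constructor shapes are assumptions read off the field lists and may not match the library's exact signatures:

import java.sql.Types;
import java.util.LinkedHashMap;
import java.util.Map;

final class MetaDataAssemblySketch {

    // Assumption: constructors take the fields in declaration order, as
    // listed above. Columns and indexes roll up into a TableMetaData;
    // tables roll up into the ShardingSphereSchema.tables map.
    static Map<String, TableMetaData> buildTables() {
        Map<String, ColumnMetaData> columns = new LinkedHashMap<>();
        columns.put("order_id", new ColumnMetaData("order_id", Types.BIGINT, true, true, true));
        columns.put("user_id", new ColumnMetaData("user_id", Types.BIGINT, false, false, true));

        Map<String, IndexMetaData> indexes = new LinkedHashMap<>();
        indexes.put("idx_user_id", new IndexMetaData("idx_user_id"));

        // columns + indexes -> table; tables -> ShardingSphereSchema.tables
        Map<String, TableMetaData> tables = new LinkedHashMap<>();
        tables.put("t_order", new TableMetaData("t_order", columns, indexes));
        return tables;
    }
}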

Analysis of metadata loading optimization

Metadata is the core of the system and indispensable, but loading it all at startup inevitably increases the load and slows startup. The loading process therefore needs optimizing, and so far the exploration has gone in two directions:

1. Replace native JDBC driver loading with SQL queries

Before 5.0.0-beta, metadata was loaded natively through the JDBC driver. Starting with 5.0.0-beta, ShardingSphere gradually switched to database dialects, loading metadata via SQL queries on multiple threads, which further speeds up loading. For the dialect loaders, see the implementations of org.apache.shardingsphere.infra.metadata.schema.builder.spi.DialectTableMetaDataLoader.
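
For flavor, here is a sketch of what the SQL-query style looks like against MySQL's information_schema. The query and the helper are illustrative assumptions, not the loader's actual code:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

final class MySQLColumnLoaderSketch {

    // One set-based query replaces many per-table DatabaseMetaData round trips.
    private static final String COLUMN_QUERY =
            "SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, COLUMN_KEY, EXTRA "
            + "FROM information_schema.columns WHERE TABLE_SCHEMA = ?";

    static void loadColumns(final DataSource dataSource, final String schema) throws SQLException {
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(COLUMN_QUERY)) {
            statement.setString(1, schema);
            try (ResultSet resultSet = statement.executeQuery()) {
                while (resultSet.next()) {
                    // Group rows by TABLE_NAME and assemble ColumnMetaData here.
                    System.out.println(resultSet.getString("TABLE_NAME") + "." + resultSet.getString("COLUMN_NAME"));
                }
            }
        }
    }
}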

2. Reduce the number of times metadata is loaded

For resources shared across the system, the principle is to load once and reuse everywhere. This is of course a trade-off between space and time, so optimization is ongoing to reduce duplicate metadata loads and improve overall efficiency.
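
"Load once, reuse everywhere" boils down to memoization. A generic sketch (hypothetical, not the project's code):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Expensive metadata is computed at most once per schema and served from
// the cache on every later request.
final class MetaDataCacheSketch<V> {

    private final Map<String, V> cache = new ConcurrentHashMap<>();

    V get(final String schemaName, final Function<String, V> loader) {
        return cache.computeIfAbsent(schemaName, loader);
    }
}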

Obtaining a ShardingSphereConnection from ShardingSphereDataSource:

@Override
public Connection getConnection() {
    return DriverStateContext.getConnection(schemaName, getDataSourceMap(), contextManager, TransactionTypeHolder.get());
}

DriverStateContext's getConnection:

public static Connection getConnection(final String schemaName, final Map<String, DataSource> dataSourceMap, final ContextManager contextManager, final TransactionType transactionType) {
    return STATES.get(contextManager.getMetaDataContexts().getStateContext().getCurrentState()).getConnection(schemaName, dataSourceMap, contextManager, transactionType);
}
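
STATES here maps the cluster's current state to a DriverState handler, so the lookup-then-delegate above is a state-dispatch pattern. A conceptual sketch with hypothetical names:

import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout: each state maps to a handler, and callers
// delegate to whichever handler matches the current state.
interface DriverStateSketch {
    String onGetConnection();
}

final class StateDispatchSketch {

    private static final Map<String, DriverStateSketch> STATES = new HashMap<>();

    static {
        STATES.put("OK", () -> "hand out a normal ShardingSphereConnection");
        STATES.put("CIRCUIT_BREAK", () -> "short-circuit instead of touching the databases");
    }

    static DriverStateSketch forState(final String currentState) {
        return STATES.get(currentState);
    }
}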

OKDriverState's getConnection:

@Override
public Connection getConnection(final String schemaName, final Map<String, DataSource> dataSourceMap, final ContextManager contextManager, final TransactionType transactionType) {
    return new ShardingSphereConnection(schemaName, dataSourceMap, contextManager, TransactionTypeHolder.get());
}

Obtaining a ShardingSphereStatement from ShardingSphereConnection:

@Override
public Statement createStatement() {
    return new ShardingSphereStatement(this);
}

ShardingSphereStatement's member fields and constructor:

@Getter
private final ShardingSphereConnection connection;

private final MetaDataContexts metaDataContexts;

private final List<Statement> statements;

private final StatementOption statementOption;

private final DriverJDBCExecutor driverJDBCExecutor;

private final RawExecutor rawExecutor;

@Getter(AccessLevel.PROTECTED)
private final FederateExecutor federateExecutor;

private final KernelProcessor kernelProcessor;

private boolean returnGeneratedKeys;

private ExecutionContext executionContext;

private ResultSet currentResultSet;

public ShardingSphereStatement(final ShardingSphereConnection connection, final int resultSetType, final int resultSetConcurrency, final int resultSetHoldability) {
    super(Statement.class);
    this.connection = connection;
    metaDataContexts = connection.getContextManager().getMetaDataContexts();
    statements = new LinkedList<>();
    statementOption = new StatementOption(resultSetType, resultSetConcurrency, resultSetHoldability);
    JDBCExecutor jdbcExecutor = new JDBCExecutor(metaDataContexts.getExecutorEngine(), connection.isHoldTransaction());
    driverJDBCExecutor = new DriverJDBCExecutor(connection.getSchemaName(), metaDataContexts, jdbcExecutor);
    rawExecutor = new RawExecutor(metaDataContexts.getExecutorEngine(), connection.isHoldTransaction(), metaDataContexts.getProps());
    // TODO Consider FederateRawExecutor
    federateExecutor = new FederateJDBCExecutor(connection.getSchemaName(), metaDataContexts.getOptimizeContextFactory(), metaDataContexts.getProps(), jdbcExecutor);
    kernelProcessor = new KernelProcessor();
}

As you can see, much of what ShardingSphere's JDBC wrapper does is introduce configuration and context classes that record the real data sources, sharding rules, and so on, which feed the later routing, rewriting, execution, and merging phases.

Summary

Tracing the code involves a large number of configuration classes; they carry no behavior of their own, but they are the entities that support the rest of the pipeline. This article mainly looked at the structure of the metadata in ShardingSphere-JDBC, plus the execution flow: first build the DataSource, obtain a Connection from it, obtain a Statement from the Connection, then execute SQL through the Statement, which passes through the parser, router, rewriter, executor, and merger before the ResultSet is returned to the user. Follow-up articles will dig into those five core engines.
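
Because every object in the mapping table at the top of this article is a JDBC adapter, that whole flow stays behind the standard JDBC API. A closing usage sketch (the table and column names are placeholders):

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

final class UsageSketch {

    // dataSource is a configured ShardingSphereDataSource; everything below
    // is plain JDBC, with parsing, routing, rewriting, execution, and
    // merging hidden inside the adapters.
    static void query(final DataSource dataSource) throws SQLException {
        try (Connection connection = dataSource.getConnection();
             Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery("SELECT order_id FROM t_order WHERE user_id = 10")) {
            while (resultSet.next()) {
                System.out.println(resultSet.getLong("order_id"));
            }
        }
    }
}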