ShardingJDBC Source Code Reading (4): Routing (Part 1)


Preface

This chapter analyzes one of ShardingJDBC's core steps: routing.

I. DataNodeRouter

Back in DataNodeRouter#executeRoute: once createRouteContext returns, SQL parsing is complete, and the RouteContext it builds contains the SQLStatementContext, params (the parameter list), and new RouteResult() (an empty route result).

public final class DataNodeRouter {
    // Holds data source and table structure metadata
    private final ShardingSphereMetaData metaData;
    // Configuration properties, passed to each decorator
    private final ConfigurationProperties properties;
    // BaseRule-to-RouteDecorator mapping, injected by BasePrepareEngine
    private final Map<BaseRule, RouteDecorator> decorators = new LinkedHashMap<>();
    
    private RouteContext executeRoute(final String sql, final List<Object> parameters, final boolean useCache) {
        // Parse
        RouteContext result = createRouteContext(sql, parameters, useCache);
        // Route
        for (Entry<BaseRule, RouteDecorator> entry : decorators.entrySet()) {
            result = entry.getValue().decorate(result, metaData, entry.getKey(), properties);
        }
        return result;
    }
}

decorators holds the BaseRule-to-RouteDecorator mappings that are registered when BasePrepareEngine#registerRouteDecorator runs.

Here we only discuss ShardingRule and its ShardingRouteDecorator. If the ShardingRule also contains a MasterSlaveRule configuration, MasterSlaveRouteDecorator is invoked as well to decorate the RouteContext a second time.
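The loop in executeRoute is an ordered decorator chain. The following is a minimal sketch of that idea, assuming a hypothetical DecoratorChainSketch class that uses a plain String as the "context" instead of ShardingSphere's RouteContext. It only illustrates why a LinkedHashMap is used: decorators run in registration order, and each one receives the output of the previous one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the decorator loop; rule names and the String
// context are illustrative and are not ShardingSphere types.
final class DecoratorChainSketch {
    private final Map<String, UnaryOperator<String>> decorators = new LinkedHashMap<>();

    void register(final String ruleName, final UnaryOperator<String> decorator) {
        decorators.put(ruleName, decorator);
    }

    String route(final String initialContext) {
        String result = initialContext;
        // LinkedHashMap iterates in insertion order, so registration order is execution order.
        for (Map.Entry<String, UnaryOperator<String>> entry : decorators.entrySet()) {
            result = entry.getValue().apply(result);
        }
        return result;
    }
}
```

Registering a "sharding" decorator first and a "masterSlave" decorator second means the second one always sees the sharding result, which matches the two-step decoration described above.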

II. ShardingRouteDecorator

public final class ShardingRouteDecorator implements RouteDecorator<ShardingRule> {
    
    @Override
    public RouteContext decorate(final RouteContext routeContext, final ShardingSphereMetaData metaData, final ShardingRule shardingRule, final ConfigurationProperties properties) {
        // SQL context: the SQL is parsed by SQLParserEngine and the context is created in DataNodeRouter
        SQLStatementContext sqlStatementContext = routeContext.getSqlStatementContext();
        // Parameter list
        List<Object> parameters = routeContext.getParameters();
        // Validate the SQL
        ShardingStatementValidatorFactory.newInstance(
                sqlStatementContext.getSqlStatement()).ifPresent(validator -> validator.validate(shardingRule, sqlStatementContext.getSqlStatement(), parameters));
        // Create ShardingConditions, which hold the RouteValues used for routing
        ShardingConditions shardingConditions = getShardingConditions(parameters, sqlStatementContext, metaData.getSchema(), shardingRule);
        // Merge shardingConditions
        boolean needMergeShardingValues = isNeedMergeShardingValues(sqlStatementContext, shardingRule);
        if (sqlStatementContext.getSqlStatement() instanceof DMLStatement && needMergeShardingValues) {
            checkSubqueryShardingValues(sqlStatementContext, shardingRule, shardingConditions);
            mergeShardingConditions(shardingConditions);
        }
        // Route
        ShardingRouteEngine shardingRouteEngine = ShardingRouteEngineFactory.newInstance(shardingRule, metaData, sqlStatementContext, shardingConditions, properties);
        RouteResult routeResult = shardingRouteEngine.route(shardingRule);
        if (needMergeShardingValues) {
            Preconditions.checkState(1 == routeResult.getRouteUnits().size(), "Must have one sharding with subquery.");
        }
        // Put the route result into the context
        return new RouteContext(sqlStatementContext, parameters, routeResult);
    }
}

1. SQL validation

First, ShardingStatementValidatorFactory creates a validator based on the type of the SQL statement.

public final class ShardingStatementValidatorFactory {
    public static Optional<ShardingStatementValidator> newInstance(final SQLStatement sqlStatement) {
        if (sqlStatement instanceof InsertStatement) {
            return Optional.of(new ShardingInsertStatementValidator());
        }
        if (sqlStatement instanceof UpdateStatement) {
            return Optional.of(new ShardingUpdateStatementValidator());
        }
        return Optional.empty();
    }
}

There is no validator for SelectStatement. For InsertStatement, updating a sharding key in ON DUPLICATE KEY UPDATE is forbidden; for UpdateStatement, updating a sharding key is forbidden.

2. Creating ShardingConditions

Back in ShardingRouteDecorator, the getShardingConditions method then creates the ShardingConditions.

 private ShardingConditions getShardingConditions(final List<Object> parameters, 
                                                     final SQLStatementContext sqlStatementContext, final SchemaMetaData schemaMetaData, final ShardingRule shardingRule) {
    if (sqlStatementContext.getSqlStatement() instanceof DMLStatement) {
        if (sqlStatementContext instanceof InsertStatementContext) {
            return new ShardingConditions(new InsertClauseShardingConditionEngine(shardingRule).createShardingConditions((InsertStatementContext) sqlStatementContext, parameters));
        }
        // SELECT statements take this path: the WHERE clause is filtered down to the tables and columns that need a sharding strategy
        return new ShardingConditions(new WhereClauseShardingConditionEngine(shardingRule, schemaMetaData).createShardingConditions(sqlStatementContext, parameters));
    }
    return new ShardingConditions(Collections.emptyList());
}

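ShardingConditions is important: it holds the RouteValues that the doSharding method of every ShardingStrategy needs. A RouteValue is also the precursor of the ShardingValue that each ShardingAlgorithm consumes. The main purpose of ShardingConditions is to determine which tables, columns, and column values in the current SQL statement need a sharding strategy applied.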

public final class ShardingConditions {
    private final List<ShardingCondition> conditions;
}
public class ShardingCondition {
    private final List<RouteValue> routeValues = new LinkedList<>();
}
public interface RouteValue {
    String getColumnName();
    String getTableName();
}
// ListRouteValue, a RouteValue implementation that handles = and IN
public final class ListRouteValue<T extends Comparable<?>> implements RouteValue {
    private final String columnName;
    private final String tableName;
    // An IN clause yields multiple values here; an = clause yields a single value
    private final Collection<T> values;
}
// RangeRouteValue, a RouteValue implementation that handles BETWEEN ... AND, < and >
public final class RangeRouteValue<T extends Comparable<?>> implements RouteValue {
    private final String columnName;
    private final String tableName;
    // A range such as [0,1], [2, 3), or (1, +infinity)
    private final Range<T> valueRange;
}

createShardingConditions, the only public method of WhereClauseShardingConditionEngine, creates the collection of ShardingCondition objects.

public final class WhereClauseShardingConditionEngine {
  private final ShardingRule shardingRule;
  private final SchemaMetaData schemaMetaData;
  public List<ShardingCondition> createShardingConditions(final SQLStatementContext sqlStatementContext, final List<Object> parameters) {
      if (!(sqlStatementContext instanceof WhereAvailable)) {
          return Collections.emptyList();
      }
      List<ShardingCondition> result = new ArrayList<>();
      Optional<WhereSegment> whereSegment = ((WhereAvailable) sqlStatementContext).getWhere();
      if (whereSegment.isPresent()) {
          // A WHERE segment exists, so create the ShardingCondition collection
          result.addAll(createShardingConditions(sqlStatementContext, whereSegment.get().getAndPredicates(), parameters));
      }
      return result;
  }
}

Step into the private createShardingConditions overload, which loops over all AND predicates.

private Collection<ShardingCondition> createShardingConditions(final SQLStatementContext sqlStatementContext, final Collection<AndPredicate> andPredicates, final List<Object> parameters) {
    Collection<ShardingCondition> result = new LinkedList<>();
    // Loop over every AND predicate
    for (AndPredicate each : andPredicates) {
        // Build the mapping from column to its RouteValue collection
        Map<Column, Collection<RouteValue>> routeValueMap = createRouteValueMap(sqlStatementContext, each, parameters);
        if (routeValueMap.isEmpty()) {
            return Collections.emptyList();
        }
        // Create a single ShardingCondition
        result.add(createShardingCondition(routeValueMap));
    }
    return result;
}

createRouteValueMap loops over the PredicateSegments inside a single AndPredicate.

 private Map<Column, Collection<RouteValue>> createRouteValueMap(final SQLStatementContext sqlStatementContext, final AndPredicate andPredicate, final List<Object> parameters) {
    Map<Column, Collection<RouteValue>> result = new HashMap<>();
    for (PredicateSegment each : andPredicate.getPredicates()) {
        // Resolve the table name from the table that owns the ColumnSegment (possibly aliased) and the schema metadata
        Optional<String> tableName = sqlStatementContext.getTablesContext().findTableName(each.getColumn(), schemaMetaData);
        // Check whether the column is a sharding key of that table in ShardingRule
        if (!tableName.isPresent() || !shardingRule.isShardingColumn(each.getColumn().getIdentifier().getValue(), tableName.get())) {
            continue;
        }
        // Build the Column used as the key of the returned Map
        Column column = new Column(each.getColumn().getIdentifier().getValue(), tableName.get());
        // Build the RouteValue stored in the value collection of the returned Map
        Optional<RouteValue> routeValue = ConditionValueGeneratorFactory.generate(each.getRightValue(), column, parameters);
        if (!routeValue.isPresent()) {
            continue;
        }
        if (!result.containsKey(column)) {
            result.put(column, new LinkedList<>());
        }
        // Add to the returned Map
        result.get(column).add(routeValue.get());
    }
    return result;
}
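The filtering done by createRouteValueMap can be sketched with plain collections. This is a simplified model, not ShardingSphere code: RouteValueMapSketch and its nested Predicate class are hypothetical, a predicate is reduced to a column name plus its values, and a Set of column names stands in for ShardingRule#isShardingColumn.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model: only predicates on sharding columns survive, grouped by column.
final class RouteValueMapSketch {
    static final class Predicate {
        final String column;
        final List<Object> values;

        Predicate(final String column, final Object... values) {
            this.column = column;
            this.values = Arrays.asList(values);
        }
    }

    static Map<String, List<List<Object>>> createRouteValueMap(final List<Predicate> andPredicate, final Set<String> shardingColumns) {
        Map<String, List<List<Object>>> result = new LinkedHashMap<>();
        for (Predicate each : andPredicate) {
            // Predicates on non-sharding columns never influence routing.
            if (!shardingColumns.contains(each.column)) {
                continue;
            }
            result.computeIfAbsent(each.column, key -> new ArrayList<>()).add(each.values);
        }
        return result;
    }
}
```

With user_id and order_id as sharding columns, a predicate on a column such as status is dropped, so only two entries remain in the map.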

Even so, it is still a little hard to see from ShardingRouteDecorator#getShardingConditions how a collection of ShardingConditions is built from a SQLStatement.

Here are a few examples.

Example 1: select * from t_order where user_id in (1,2) and order_id = 534695469004599297 and status in (1,2). It contains only 1 AndPredicate, and this AndPredicate contains 3 PredicateSegments.

Example 2: select * from t_order where user_id = 2 and order_id = 534695469004599297 or user_id = 3 and order_id = 534695469004599297 or user_id = 4 and order_id = 534695469004599297. It contains 3 AndPredicates, each with 2 PredicateSegments, and ends up producing 3 ShardingConditions.

Example 3: select * from t_order where (user_id = 2 or user_id = 3 or user_id = 4) and (order_id = 534695469004599297 or order_id = 534695469021376512). It contains 6 AndPredicates, each with 2 PredicateSegments, and ends up producing 6 ShardingConditions (the diagram is omitted because it is too large). There are 6 AndPredicates because, during parsing, org.apache.shardingsphere.sql.parser.mysql.visitor.impl.MySQLDMLVisitor#visitWhereClause puts every AndPredicate inside the OrPredicate into the final AndPredicate collection.

@Override
public ASTNode visitWhereClause(final WhereClauseContext ctx) {
    WhereSegment result = new WhereSegment(ctx.getStart().getStartIndex(), ctx.getStop().getStopIndex());
    ASTNode segment = visit(ctx.expr());
    if (segment instanceof OrPredicateSegment) {
        // Put the andPredicates of the OrPredicateSegment into result
        result.getAndPredicates().addAll(((OrPredicateSegment) segment).getAndPredicates());
    } else if (segment instanceof PredicateSegment) {
        AndPredicate andPredicate = new AndPredicate();
        andPredicate.getPredicates().add((PredicateSegment) segment);
        result.getAndPredicates().add(andPredicate);
    }
    return result;
}

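The three examples show that, in general, the number of AndPredicates equals the number of resulting ShardingConditions. How AndPredicates are built follows directly from the meaning of the SQL: each one represents a phrase joined by AND (because AND has higher precedence than OR, the conditions it joins are grouped and evaluated together first).

In other words, one ShardingCondition corresponds to one AND-joined phrase; one ShardingCondition contains several predicates, that is, several RouteValues; and one RouteValue corresponds to one table + column + value.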
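The count of 6 in Example 3 comes from distributing AND over OR. The sketch below shows that expansion with a hypothetical DnfSketch class in which each inner list is one AND group and the outer list is OR-connected; it is an illustration of the idea, not the parser's actual visitor code, and predicates are plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Each inner list is one AND group; the outer list is OR-connected.
final class DnfSketch {
    // A single predicate such as "user_id = 2" is one AND group with one element.
    static List<List<String>> atom(final String predicate) {
        return Collections.singletonList(Collections.singletonList(predicate));
    }

    // OR concatenates the AND groups of both sides.
    static List<List<String>> or(final List<List<String>> left, final List<List<String>> right) {
        List<List<String>> result = new ArrayList<>(left);
        result.addAll(right);
        return result;
    }

    // AND distributes over OR: every left group is paired with every right group.
    static List<List<String>> and(final List<List<String>> left, final List<List<String>> right) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> eachLeft : left) {
            for (List<String> eachRight : right) {
                List<String> merged = new ArrayList<>(eachLeft);
                merged.addAll(eachRight);
                result.add(merged);
            }
        }
        return result;
    }
}
```

Three OR-connected user_id predicates combined with two OR-connected order_id predicates give 3 x 2 = 6 AND groups, each holding 2 predicates.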

3. Merging shardingConditions

Back in ShardingRouteDecorator#decorate once more: we now have the ShardingConditions (we know which values of which columns of which tables need a routing decision).

public final class ShardingRouteDecorator implements RouteDecorator<ShardingRule> {
    
    @Override
    public RouteContext decorate(final RouteContext routeContext, final ShardingSphereMetaData metaData, final ShardingRule shardingRule, final ConfigurationProperties properties) {
        // SQL context: the SQL is parsed by SQLParserEngine and the context is created in DataNodeRouter
        SQLStatementContext sqlStatementContext = routeContext.getSqlStatementContext();
        // Parameter list
        List<Object> parameters = routeContext.getParameters();
        // Create ShardingConditions, which hold the RouteValues used for routing
        ShardingConditions shardingConditions = getShardingConditions(parameters, sqlStatementContext, metaData.getSchema(), shardingRule);
        // Merge shardingConditions
        boolean needMergeShardingValues = isNeedMergeShardingValues(sqlStatementContext, shardingRule);
        if (sqlStatementContext.getSqlStatement() instanceof DMLStatement && needMergeShardingValues) {
            checkSubqueryShardingValues(sqlStatementContext, shardingRule, shardingConditions);
            mergeShardingConditions(shardingConditions);
        }
        // Route
        ...
    }
}

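Statements that contain subqueries need their ShardingConditions merged. However, in version 4.1.0, which I am reading, subqueries seem to have a bug: the related code is marked FIXME and commented out, so isContainsSubquery always returns false.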

private boolean isNeedMergeShardingValues(final SQLStatementContext sqlStatementContext, final ShardingRule shardingRule) {
        return sqlStatementContext instanceof SelectStatementContext && ((SelectStatementContext) sqlStatementContext).isContainsSubquery() 
                && !shardingRule.getShardingLogicTableNames(sqlStatementContext.getTablesContext().getTableNames()).isEmpty();
    }

That said, I rarely write subqueries in everyday SQL because they tend to perform poorly, so this logic is skipped here.

4. Routing

public final class ShardingRouteDecorator implements RouteDecorator<ShardingRule> {
    @Override
    public RouteContext decorate(final RouteContext routeContext, final ShardingSphereMetaData metaData, final ShardingRule shardingRule, final ConfigurationProperties properties) {
        ...
        // Route
        ShardingRouteEngine shardingRouteEngine = ShardingRouteEngineFactory.newInstance(shardingRule, metaData, sqlStatementContext, shardingConditions, properties);
        RouteResult routeResult = shardingRouteEngine.route(shardingRule);
        // Put the route result into the context
        return new RouteContext(sqlStatementContext, parameters, routeResult);
    }
}

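Next, routing is examined in depth in four parts. Some of the source code is rewritten to make it easier to read (nested method calls and very long ternary expressions make the original hard to follow).

  • RouteResult: the route result. Learn what useful information must be collected before SQL rewriting.
  • ShardingRouteEngine: the route engine. Learn the different ways routing is handled and how a route engine produces a RouteResult.
  • ShardingRouteEngineFactory: the route engine factory. Learn which engine handles which kind of SQL, when broadcast is used, and when all databases and tables are scanned.
  • ShardingStrategy: the sharding strategy. Learn how ShardingStrategyConfiguration affects routing at runtime.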

4-1. RouteResult

RouteResult represents the route result and is the product of a route engine.

public final class RouteResult {
    // DataNode
    private final Collection<Collection<DataNode>> originalDataNodes = new LinkedList<>();
    // RouteUnit
    private final Collection<RouteUnit> routeUnits = new LinkedHashSet<>();
}

One RouteResult contains multiple RouteUnits, and each RouteUnit is the route result for a single data source.

public final class RouteUnit {
    // dataSource
    private final RouteMapper dataSourceMapper;
    // table
    private final Collection<RouteMapper> tableMappers;
}

RouteMapper represents the mapping between a logic name and an actual name. SQL rewriting is only possible with this mapping.

public final class RouteMapper {
    private final String logicName;
    private final String actualName;
}
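To see why a logicName-to-actualName pair is enough for rewriting, consider the following deliberately naive sketch. RouteMapperSketch is a hypothetical class, and plain whole-word text replacement is an assumption made for illustration only; an actual rewrite step would operate on parsed tokens rather than raw text.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Naive illustration: replace each logic table name with its actual name.
final class RouteMapperSketch {
    static String rewrite(final String logicSql, final Map<String, String> tableMappers) {
        String result = logicSql;
        for (Map.Entry<String, String> entry : tableMappers.entrySet()) {
            // \b keeps t_order from matching inside t_order_item or order_id.
            String pattern = "\\b" + Pattern.quote(entry.getKey()) + "\\b";
            result = result.replaceAll(pattern, Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }
}
```

Given the mapping t_order -> t_order_1, the logic SQL select * from t_order where order_id = 1 becomes select * from t_order_1 where order_id = 1, while the column name order_id is left untouched.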

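In addition, RouteResult holds multiple DataNodes. A DataNode represents an actual data node; each DataNode corresponds to one actual data source name and one actual table name.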

public final class DataNode {
    // Actual data source name
    private final String dataSourceName;
    // Actual table name
    private final String tableName;
}

4-2. ShardingRouteEngine

ShardingRouteEngine has a single route method, which takes a set of inputs and produces a RouteResult.

public interface ShardingRouteEngine {
    RouteResult route(ShardingRule shardingRule);
}

ShardingDatabaseBroadcastRoutingEngine

ShardingDatabaseBroadcastRoutingEngine broadcasts to data sources: the RouteResult it returns contains all data sources.

public final class ShardingDatabaseBroadcastRoutingEngine implements ShardingRouteEngine {
    @Override
    public RouteResult route(final ShardingRule shardingRule) {
        RouteResult result = new RouteResult();
        for (String each : shardingRule.getShardingDataSourceNames().getDataSourceNames()) {
            result.getRouteUnits().add(new RouteUnit(new RouteMapper(each, each), Collections.emptyList()));
        }
        return result;
    }
}

ShardingTableBroadcastRoutingEngine

ShardingTableBroadcastRoutingEngine broadcasts to tables.

@RequiredArgsConstructor
public final class ShardingTableBroadcastRoutingEngine implements ShardingRouteEngine {
    private final SchemaMetaData schemaMetaData;
    private final SQLStatementContext sqlStatementContext;
    @Override
    public RouteResult route(final ShardingRule shardingRule) {
        RouteResult result = new RouteResult();
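        // Find all logic table names through sqlStatementContext
        Collection<String> logicTableNames = getLogicTableNames();
        // Loop over each logic table (this is what table broadcast routing means)
        for (String each : logicTableNames) {
            // Use ShardingRule and the logic table name to find all matching data sources and assemble the RouteUnit collection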
            Collection<RouteUnit> routeUnits = getAllRouteUnits(shardingRule, each);
            result.getRouteUnits().addAll(routeUnits);
        }
        return result;
    }
}

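First, all logic table names are obtained from the SQLStatementContext. There is a branch here: for a DROP INDEX statement, the logic table names are resolved from the SQL and SchemaMetaData; for any other statement, they are taken directly from the table context of the SQL context.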

private Collection<String> getLogicTableNames() {
    // DROP INDEX statement
    if (sqlStatementContext.getSqlStatement() instanceof DropIndexStatement 
            && !((DropIndexStatement) sqlStatementContext.getSqlStatement()).getIndexes().isEmpty()) {
        return getTableNamesFromMetaData((DropIndexStatement) sqlStatementContext.getSqlStatement());
    }
    // Any other statement
    else {
        return sqlStatementContext.getTablesContext().getTableNames();
    }
}

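Next, the logic table names are looped over to assemble RouteUnits. The key point: a logicTableName yields the matching TableRule from ShardingRule, and a TableRule yields all of its actual DataNodes. Many later ShardingRouteEngines determine their RouteUnits the same way.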

private Collection<RouteUnit> getAllRouteUnits(final ShardingRule shardingRule, final String logicTableName) {
    Collection<RouteUnit> result = new LinkedList<>();
    // Get the TableRule from ShardingRule by logic table name
    TableRule tableRule = shardingRule.getTableRule(logicTableName);
    for (DataNode each : tableRule.getActualDataNodes()) {
        // Data source mapper
        RouteMapper dataSourceMapper = new RouteMapper(each.getDataSourceName(), each.getDataSourceName());
        // Table mapper
        RouteMapper tableMapper = new RouteMapper(logicTableName, each.getTableName());
        // unit
        RouteUnit routeUnit = new RouteUnit(dataSourceMapper, Collections.singletonList(tableMapper));
        result.add(routeUnit);
    }
    return result;
}

Let's look at how ShardingRule resolves a TableRule from a logic table name. This logic is central and shows up repeatedly later.

public class ShardingRule implements BaseRule {
    // Holds all data source names
    private final ShardingDataSourceNames shardingDataSourceNames;
    // TableRules that have sharding rules configured
    private final Collection<TableRule> tableRules;
    // Broadcast tables
    private final Collection<String> broadcastTables;
    
    public TableRule getTableRule(final String logicTableName) {
        // Prefer a TableRule that has sharding rules configured
        Optional<TableRule> tableRule = findTableRule(logicTableName);
        if (tableRule.isPresent()) {
            return tableRule.get();
        }
        // For a broadcast table, create a new TableRule
        // whose data sources are all names returned by shardingDataSourceNames.getDataSourceNames
        if (isBroadcastTable(logicTableName)) {
            return new TableRule(shardingDataSourceNames.getDataSourceNames(), logicTableName);
        }
        // If a default data source name exists, create a new TableRule
        // that uses the default data source name
        if (!Strings.isNullOrEmpty(shardingDataSourceNames.getDefaultDataSourceName())) {
            return new TableRule(shardingDataSourceNames.getDefaultDataSourceName(), logicTableName);
        }
        // If none of the conditions above holds, throw an exception
        throw new ShardingSphereConfigurationException("Cannot find table rule and default data source with logic table: '%s'", logicTableName);
    }
    // Match the logic table name against the TableRules that have sharding rules configured
    public Optional<TableRule> findTableRule(final String logicTableName) {
        return tableRules.stream().filter(each -> each.getLogicTable().equalsIgnoreCase(logicTableName)).findFirst();
    }
    // Check whether the logic table name is a broadcast table
    public boolean isBroadcastTable(final String logicTableName) {
        return broadcastTables.stream().anyMatch(each -> each.equalsIgnoreCase(logicTableName));
    }
}
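The lookup order in getTableRule (sharded table, then broadcast table, then default data source, then failure) can be modeled in a few lines. This is a simplified sketch with hypothetical names: TableRuleLookupSketch represents data nodes as plain "dataSource.table" strings and matches names case-sensitively, unlike the equalsIgnoreCase matching in the excerpt above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the fallback order; not ShardingSphere's TableRule.
final class TableRuleLookupSketch {
    private final Map<String, List<String>> shardedTables; // logic table -> actual data nodes
    private final Set<String> broadcastTables;
    private final List<String> allDataSources;
    private final String defaultDataSource; // may be null

    TableRuleLookupSketch(final Map<String, List<String>> shardedTables, final Set<String> broadcastTables,
                          final List<String> allDataSources, final String defaultDataSource) {
        this.shardedTables = shardedTables;
        this.broadcastTables = broadcastTables;
        this.allDataSources = allDataSources;
        this.defaultDataSource = defaultDataSource;
    }

    List<String> resolveDataNodes(final String logicTable) {
        // 1. An explicitly configured sharding rule wins.
        List<String> sharded = shardedTables.get(logicTable);
        if (sharded != null) {
            return sharded;
        }
        // 2. A broadcast table exists under the same name on every data source.
        if (broadcastTables.contains(logicTable)) {
            List<String> result = new ArrayList<>();
            for (String each : allDataSources) {
                result.add(each + "." + logicTable);
            }
            return result;
        }
        // 3. Otherwise fall back to the default data source, if one is configured.
        if (defaultDataSource != null && !defaultDataSource.isEmpty()) {
            return Collections.singletonList(defaultDataSource + "." + logicTable);
        }
        throw new IllegalStateException("Cannot find table rule and default data source with logic table: " + logicTable);
    }
}
```

The order matters: a table that is both configured for sharding and listed as a broadcast table resolves through its sharding rule, because that check comes first.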

ShardingMasterInstanceBroadcastRoutingEngine

ShardingMasterInstanceBroadcastRoutingEngine routes by database instance; for master-slave data sources it routes only to the master instance.

@RequiredArgsConstructor
public final class ShardingMasterInstanceBroadcastRoutingEngine implements ShardingRouteEngine {
    private final DataSourceMetas dataSourceMetas;
    @Override
    public RouteResult route(final ShardingRule shardingRule) {
        RouteResult result = new RouteResult();
        // Get all data source names
        Collection<String> dataSourceNames = shardingRule.getShardingDataSourceNames().getDataSourceNames();
        // Loop over the data source names
        for (String each : dataSourceNames) {
            // Get one dataSourceName per database instance (an identical host:port counts as the same instance)
            Collection<String> allInstanceDataSourceNames = dataSourceMetas.getAllInstanceDataSourceNames();
            // Only data sources in allInstanceDataSourceNames can be added as a RouteUnit
            if (allInstanceDataSourceNames.contains(each)) {
                // Look up the MasterSlaveRule; it is returned if each is one of its master or slave data sources
                Optional<MasterSlaveRule> masterSlaveRule = shardingRule.findMasterSlaveRule(each);
                // Add when there is no MasterSlaveRule or each is the master data source (in other words, slave data sources are never added)
                if (!masterSlaveRule.isPresent() || masterSlaveRule.get().getMasterDataSourceName().equals(each)) {
                    result.getRouteUnits().add(new RouteUnit(new RouteMapper(each, each), Collections.emptyList()));
                }
            }
        }
        return result;
    }
}

Pay particular attention to the dataSourceMetas.getAllInstanceDataSourceNames method.

public final class DataSourceMetas {
    
    private final Map<String, DataSourceMetaData> dataSourceMetaDataMap;
    public Collection<String> getAllInstanceDataSourceNames() {
        Collection<String> result = new LinkedList<>();
        for (Entry<String, DataSourceMetaData> entry : dataSourceMetaDataMap.entrySet()) {
            // Add the entry only if the result does not already contain a data source on the same instance
            if (!isExisted(entry.getKey(), result)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
    
    private boolean isExisted(final String dataSourceName, final Collection<String> existedDataSourceNames) {
        return existedDataSourceNames.stream()
        .anyMatch(each -> isInSameDatabaseInstance(dataSourceMetaDataMap.get(dataSourceName), dataSourceMetaDataMap.get(each)));
    }
    // Check whether two data sources are on the same database instance
    private boolean isInSameDatabaseInstance(final DataSourceMetaData sample, final DataSourceMetaData target) {
        return sample instanceof MemorizedDataSourceMetaData
                // H2
                ? Objects.equals(target.getSchema(), sample.getSchema())
                // MySQL: an identical host and port mean the same instance
                : target.getHostName().equals(sample.getHostName()) && target.getPort() == sample.getPort();
    }
}
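The net effect of getAllInstanceDataSourceNames is "keep the first data source name seen for each database instance". The sketch below reproduces that effect under simplifying assumptions: InstanceDedupSketch is a hypothetical class, each data source is described only by a "host:port" string, and a seen-set replaces the pairwise comparison used in the excerpt above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Keeps one data source name per database instance, in iteration order.
final class InstanceDedupSketch {
    // dataSourceToInstance maps a data source name to its "host:port".
    static List<String> allInstanceDataSourceNames(final Map<String, String> dataSourceToInstance) {
        List<String> result = new ArrayList<>();
        Set<String> seenInstances = new HashSet<>();
        for (Map.Entry<String, String> entry : dataSourceToInstance.entrySet()) {
            // Set#add returns false when this instance is already represented.
            if (seenInstances.add(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

If ds0 and ds1 are two schemas on the same MySQL instance and ds2 lives on another host, only ds0 and ds2 are returned, so an instance-level statement is sent to each instance once.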

ShardingUnicastRoutingEngine

ShardingUnicastRoutingEngine literally means unicast routing, but in practice it routes to a random data source.

@RequiredArgsConstructor
public final class ShardingUnicastRoutingEngine implements ShardingRouteEngine {
  private final Collection<String> logicTables;
  @Override
  public RouteResult route(final ShardingRule shardingRule) {
      RouteResult result = new RouteResult();
      // Pick a random data source from all data sources
      String dataSourceName = shardingRule.getShardingDataSourceNames().getRandomDataSourceName();
      RouteMapper dataSourceMapper = new RouteMapper(dataSourceName, dataSourceName);
      // If all tables are broadcast tables, every logic table goes into the RouteUnit's tableMappers, and the RouteUnit's dataSourceMapper is the random data source above
      if (shardingRule.isAllBroadcastTables(logicTables)) {
          List<RouteMapper> tableMappers = new ArrayList<>(logicTables.size());
          for (String each : logicTables) {
              tableMappers.add(new RouteMapper(each, each));
          }
          result.getRouteUnits().add(new RouteUnit(dataSourceMapper, tableMappers));
      }
      // If the logic table collection is empty, the RouteUnit's tableMappers is empty and its dataSourceMapper is the random data source above
      else if (logicTables.isEmpty()) {
          result.getRouteUnits().add(new RouteUnit(dataSourceMapper, Collections.emptyList()));
      }
      // If there is exactly 1 logic table, the RouteUnit's dataSourceMapper is the random data source above
      else if (1 == logicTables.size()) {
          String logicTableName = logicTables.iterator().next();
          // If no TableRule is found for the logic table, tableMappers is empty
          if (!shardingRule.findTableRule(logicTableName).isPresent()) {
              result.getRouteUnits().add(new RouteUnit(dataSourceMapper, Collections.emptyList()));
              return result;
          }
          // If the logic table has a TableRule, find the DataNode through it
          DataNode dataNode = shardingRule.getDataNode(logicTableName);
          RouteMapper tableMapper = new RouteMapper(logicTableName, dataNode.getTableName());
          result.getRouteUnits().add(new RouteUnit(dataSourceMapper, Collections.singletonList(tableMapper)));
      }
      // More than 1 logic table
      else {
          List<RouteMapper> tableMappers = new ArrayList<>(logicTables.size());
          // Intersection of data sources
          Set<String> availableDatasourceNames = null;
          boolean first = true;
          for (String each : logicTables) {
              // Find the TableRule by logic table name
              TableRule tableRule = shardingRule.getTableRule(each);
              // Take the first DataNode of the TableRule
              DataNode dataNode = tableRule.getActualDataNodes().get(0);
              // Add to tableMappers
              tableMappers.add(new RouteMapper(each, dataNode.getTableName()));
              // Intersect the data sources
              Set<String> currentDataSourceNames = new HashSet<>(tableRule.getActualDatasourceNames().size());
              for (DataNode eachDataNode : tableRule.getActualDataNodes()) {
                  currentDataSourceNames.add(eachDataNode.getDataSourceName());
              }
              if (first) {
                  availableDatasourceNames = currentDataSourceNames;
                  first = false;
              } else {
                  availableDatasourceNames = Sets.intersection(availableDatasourceNames, currentDataSourceNames);
              }
          }
          // If the data sources have no intersection, throw an exception
          if (availableDatasourceNames.isEmpty()) {
              throw new ShardingSphereConfigurationException("Cannot find actual datasource intersection for logic tables: %s", logicTables);
          }
          // Pick a random data source from the intersection
          dataSourceName = shardingRule.getShardingDataSourceNames().getRandomDataSourceName(availableDatasourceNames);
          // Use the random data source as the RouteUnit's dataSourceMapper
          result.getRouteUnits().add(new RouteUnit(new RouteMapper(dataSourceName, dataSourceName), tableMappers));
      }
      return result;
  }
}

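To summarize the ShardingUnicastRoutingEngine#route method above:

  • If all tables are broadcast tables, a random data source is chosen and every table is put into the RouteUnit as a tableMapper.
  • If there are 0 logic tables, a random data source is chosen and an empty collection is used as the RouteUnit's tableMappers.
  • If there is 1 logic table, a random data source is chosen, and the RouteUnit's tableMappers depend on whether the logic table name matches a TableRule.
  • If there is more than 1 logic table, a random data source is chosen from the intersection of the tables' data sources, and every table is put into the RouteUnit as a tableMapper. An empty intersection raises an error.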
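The multi-table case boils down to a set intersection followed by a random pick. Here is a minimal sketch of that step, assuming a hypothetical DataSourceIntersectionSketch class that works on plain sets of data source names; it uses Set#retainAll from the JDK rather than Guava's Sets.intersection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Intersects the data sources of several tables, then picks one at random.
final class DataSourceIntersectionSketch {
    static Set<String> intersect(final List<Set<String>> dataSourcesPerTable) {
        Set<String> result = null;
        for (Set<String> each : dataSourcesPerTable) {
            if (result == null) {
                // Copy the first set so the caller's collection is not modified.
                result = new TreeSet<>(each);
            } else {
                result.retainAll(each);
            }
        }
        if (result == null || result.isEmpty()) {
            throw new IllegalStateException("Cannot find actual datasource intersection");
        }
        return result;
    }

    static String pickRandom(final Set<String> candidates, final Random random) {
        List<String> ordered = new ArrayList<>(candidates);
        return ordered.get(random.nextInt(ordered.size()));
    }
}
```

If one table lives on ds0 and ds1 and another on ds1 and ds2, only ds1 can serve a statement that touches both, so it is the only candidate for the random pick.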

ShardingDataSourceGroupBroadcastRoutingEngine

ShardingDataSourceGroupBroadcastRoutingEngine broadcasts to data source groups.

public final class ShardingDataSourceGroupBroadcastRoutingEngine implements ShardingRouteEngine {
    
    @Override
    public RouteResult route(final ShardingRule shardingRule) {
        RouteResult result = new RouteResult();
        // Data source groups: the (deduplicated) actual data source names of each tableRule, collected together
        Collection<Set<String>> dataSourceGroup = getDataSourceGroup(shardingRule);
        // Determine the broadcast data source groups
        Collection<Set<String>> broadcastDataSourceGroup = getBroadcastDataSourceGroup(dataSourceGroup);
        // Pick one random data source from each group
        for (Set<String> each : broadcastDataSourceGroup) {
            String dataSourceName = getRandomDataSourceName(each);
            result.getRouteUnits().add(new RouteUnit(new RouteMapper(dataSourceName, dataSourceName), Collections.emptyList()));
        }
        return result;
    }
}

First, the data source groups are collected.

private Collection<Set<String>> getDataSourceGroup(final ShardingRule shardingRule) {
    Collection<Set<String>> result = new LinkedList<>();
    // Loop over all TableRules
    for (TableRule each : shardingRule.getTableRules()) {
        // Actual data source name -> DataNodes (actual data source name & actual table name)
        Map<String, List<DataNode>> dataNodeGroup = each.getDataNodeGroups();
        result.add(dataNodeGroup.keySet());
    }
    // Add the default data source name to the result
    if (null != shardingRule.getShardingDataSourceNames().getDefaultDataSourceName()) {
        result.add(Sets.newHashSet(shardingRule.getShardingDataSourceNames().getDefaultDataSourceName()));
    }
    return result;
}

Then the broadcast data source groups are derived.

private Collection<Set<String>> getBroadcastDataSourceGroup(final Collection<Set<String>> dataSourceGroup) {
  Collection<Set<String>> result = new LinkedList<>();
  // Loop over all groups
  for (Set<String> each : dataSourceGroup) {
      // Build the new result from the previous result and each
      result = getCandidateDataSourceGroup(result, each);
  }
  return result;
}
private Collection<Set<String>> getCandidateDataSourceGroup(final Collection<Set<String>> dataSourceSetGroup, final Set<String> compareSet) {
  Collection<Set<String>> result = new LinkedList<>();
  Set<String> intersectionSet;
  // If the current result dataSourceSetGroup is empty, return compareSet as the only element
  if (dataSourceSetGroup.isEmpty()) {
      result.add(compareSet);
      return result;
  }
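  boolean hasIntersection = false; // Flag: whether any intersection exists
  // Loop over every group in the current result dataSourceSetGroup
  for (Set<String> each : dataSourceSetGroup) {
      // Intersect the group in the result with the compared group
      intersectionSet = Sets.intersection(each, compareSet);
      // If the intersection is not empty, add it to the result
      if (!intersectionSet.isEmpty()) {
          result.add(intersectionSet);
          hasIntersection = true;
      } else {
          // If the intersection is empty, keep the original group each
          result.add(each);
      }
  }
  // If no group in the original result intersects compareSet, add compareSet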
  if (!hasIntersection) {
      result.add(compareSet);
  }
  return result;
}
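A worked example makes the group narrowing easier to follow. The sketch below re-implements the two methods above with JDK collections only (Set#retainAll instead of Guava's Sets.intersection); BroadcastGroupSketch is a hypothetical class name, and the empty-input case is handled by the same loop rather than by an early return.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Narrows overlapping data source groups and keeps disjoint groups apart.
final class BroadcastGroupSketch {
    static List<Set<String>> broadcastGroups(final List<Set<String>> dataSourceGroups) {
        List<Set<String>> result = new ArrayList<>();
        for (Set<String> each : dataSourceGroups) {
            result = candidateGroups(result, each);
        }
        return result;
    }

    static List<Set<String>> candidateGroups(final List<Set<String>> current, final Set<String> compareSet) {
        List<Set<String>> result = new ArrayList<>();
        boolean hasIntersection = false;
        for (Set<String> each : current) {
            Set<String> intersection = new LinkedHashSet<>(each);
            intersection.retainAll(compareSet);
            if (intersection.isEmpty()) {
                // Disjoint from compareSet: keep the existing group unchanged.
                result.add(each);
            } else {
                // Overlapping: narrow the group to the shared data sources.
                result.add(intersection);
                hasIntersection = true;
            }
        }
        // compareSet overlaps nothing (or current is empty): it forms its own group.
        if (!hasIntersection) {
            result.add(compareSet);
        }
        return result;
    }
}
```

For the input groups {ds0, ds1}, {ds1, ds2}, {ds3}, the result evolves as [{ds0, ds1}], then [{ds1}], then [{ds1}, {ds3}]. One data source from each final group is enough to reach every table's data.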

Finally, for each group in the resulting collection, any one data source in the group is chosen and added as a RouteUnit.

public final class ShardingDataSourceGroupBroadcastRoutingEngine implements ShardingRouteEngine {
    
    @Override
    public RouteResult route(final ShardingRule shardingRule) {
        RouteResult result = new RouteResult();
        ...
        // Pick one random data source from each group
        for (Set<String> each : broadcastDataSourceGroup) {
            // Random data source within the group
            String dataSourceName = getRandomDataSourceName(each);
            // Add as a RouteUnit
            result.getRouteUnits().add(new RouteUnit(new RouteMapper(dataSourceName, dataSourceName), Collections.emptyList()));
        }
        return result;
    }
}

ShardingDefaultDatabaseRoutingEngine

ShardingDefaultDatabaseRoutingEngine routes to the default data source.

@RequiredArgsConstructor
public final class ShardingDefaultDatabaseRoutingEngine implements ShardingRouteEngine {
    
    private final Collection<String> logicTables;
    
    @Override
    public RouteResult route(final ShardingRule shardingRule) {
        RouteResult result = new RouteResult();
        // Put every table into the RouteUnit as tableMappers
        List<RouteMapper> routingTables = new ArrayList<>(logicTables.size());
        for (String each : logicTables) {
            routingTables.add(new RouteMapper(each, each));
        }
        // Use the default data source as the RouteUnit's dataSourceMapper
        String dataSourceName = shardingRule.getShardingDataSourceNames().getDefaultDataSourceName();
        result.getRouteUnits().add(new RouteUnit(new RouteMapper(dataSourceName, dataSourceName), routingTables));
        return result;
    }
}

ShardingIgnoreRoutingEngine

ShardingIgnoreRoutingEngine returns an empty RouteResult.

public final class ShardingIgnoreRoutingEngine implements ShardingRouteEngine {
    @Override
    public RouteResult route(final ShardingRule shardingRule) {
        return new RouteResult();
    }
}

ShardingStandardRoutingEngine

ShardingStandardRoutingEngine is the single-table route engine for tables that have a TableRule configured.

@RequiredArgsConstructor
public final class ShardingStandardRoutingEngine implements ShardingRouteEngine {
    // logicTableName is a table in the SQL for which a tableRule can be found
    private final String logicTableName;
    // SQL statement context
    private final SQLStatementContext sqlStatementContext;
    // Built by ShardingRouteDecorator.getShardingConditions; holds the RouteValues parsed from the whereSegment
    private final ShardingConditions shardingConditions;
    // Configuration
    private final ConfigurationProperties properties;
    // Actual DataNodes
    private final Collection<Collection<DataNode>> originalDataNodes = new LinkedList<>();
    @Override
    public RouteResult route(final ShardingRule shardingRule) {
        // INSERT, UPDATE, and DELETE do not support multi-table operations
        if (isDMLForModify(sqlStatementContext) && 1 != ((TableAvailable) sqlStatementContext).getAllTables().size()) {
            throw new ShardingSphereException("Cannot support Multiple-Table for '%s'.", sqlStatementContext.getSqlStatement());
        }
        // Find the table rule by logic table name
        TableRule tableRule = shardingRule.getTableRule(logicTableName);
        // Find the DataNodes
        Collection<DataNode> dataNodes = getDataNodes(shardingRule, tableRule);
        // Build the RouteResult from all DataNodes
        return generateRouteResult(dataNodes);
    }
}

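The route method first checks the SQL type: an INSERT, UPDATE, or DELETE that touches multiple tables throws an exception immediately. It then finds the TableRule by logic table name.

Focus on the getDataNodes method: depending on the database and table sharding strategies, routing takes different paths.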

private Collection<DataNode> getDataNodes(final ShardingRule shardingRule, final TableRule tableRule) {
  // Both the database and the table sharding strategy of the TableRule are Hint
  if (isRoutingByHint(shardingRule, tableRule)) {
      return routeByHint(shardingRule, tableRule);
  }
  // Neither the database nor the table sharding strategy is Hint
  if (isRoutingByShardingConditions(shardingRule, tableRule)) {
      // The path to focus on
      return routeByShardingConditions(shardingRule, tableRule);
  }
  // One of the two strategies is Hint and the other is not
  return routeByMixedConditions(shardingRule, tableRule);
}

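Setting Hint sharding aside for now, step into routeByShardingConditions. Note that if shardingConditions is an empty collection, route0 is called with two empty lists, which results in routing to every database and table.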

private Collection<DataNode> routeByShardingConditions(final ShardingRule shardingRule, final TableRule tableRule) {
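    // shardingConditions was computed in ShardingRouteDecorator.getShardingConditions.
    // If no column in the where clause matches a sharding strategy, it is empty and the SQL is routed to all databases and tables
    if (shardingConditions.getConditions().isEmpty()) {
        // route0 is a shared method covered below; given empty lists it simply returns routes for all databases and tables
        return route0(shardingRule, tableRule, Collections.emptyList(), Collections.emptyList());
    }
    // Otherwise take this path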
    else {
        return routeByShardingConditionsWithCondition(shardingRule, tableRule);
    }
}

Next, step into routeByShardingConditionsWithCondition.

private Collection<DataNode> routeByShardingConditionsWithCondition(final ShardingRule shardingRule, final TableRule tableRule) {
  Collection<DataNode> result = new LinkedList<>();
  for (ShardingCondition each : shardingConditions.getConditions()) {
      // Database-sharding RouteValues: the conditions whose columns match the database ShardingStrategy's columns
      List<RouteValue> databaseShardingValues = getShardingValuesFromShardingConditions(shardingRule, shardingRule.getDatabaseShardingStrategy(tableRule).getShardingColumns(), each);
      // Table-sharding RouteValues: the conditions whose columns match the table ShardingStrategy's columns
      List<RouteValue> tableShardingValues = getShardingValuesFromShardingConditions(shardingRule, shardingRule.getTableShardingStrategy(tableRule).getShardingColumns(), each);
      // Core routing logic: RouteValue + ShardingRule + TableRule -> DataNode
      Collection<DataNode> dataNodes = route0(shardingRule, tableRule, databaseShardingValues, tableShardingValues);
      result.addAll(dataNodes);
      originalDataNodes.add(dataNodes);
  }
  return result;
}

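getShardingValuesFromShardingConditions is called twice, once for the database-sharding RouteValues and once for the table-sharding ones. It walks every RouteValue in the ShardingCondition and keeps only those covered by a sharding strategy, that is, those whose column appears in the ShardingStrategy's ShardingColumns.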

private List<RouteValue> getShardingValuesFromShardingConditions(final ShardingRule shardingRule, final Collection<String> shardingColumns, final ShardingCondition shardingCondition) {
  List<RouteValue> result = new ArrayList<>(shardingColumns.size());
  for (RouteValue each : shardingCondition.getRouteValues()) {
      Optional<BindingTableRule> bindingTableRule = shardingRule.findBindingTableRule(logicTableName);
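      // If the RouteValue's table is the current logic table, or the current logic table belongs to a binding table group,
      // and the given shardingColumns (from the TableRule's ShardingStrategy or the default ShardingStrategy) contain the column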
      if ((logicTableName.equals(each.getTableName()) || bindingTableRule.isPresent() && bindingTableRule.get().hasLogicTable(logicTableName))
              && shardingColumns.contains(each.getColumnName())) {
          result.add(each);
      }
  }
  return result;
}
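
The filtering step can be illustrated in isolation. The following is a minimal sketch, not the ShardingSphere implementation: SimpleRouteValue is a hypothetical stand-in for RouteValue, the table and column names are made up, and the binding-table branch of the condition is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class RouteValueFilterSketch {

    // Hypothetical stand-in for RouteValue: only the table and column names matter for filtering.
    static final class SimpleRouteValue {
        final String tableName;
        final String columnName;

        SimpleRouteValue(final String tableName, final String columnName) {
            this.tableName = tableName;
            this.columnName = columnName;
        }
    }

    // Keep the values that target the logic table and whose column is one of the sharding columns.
    static List<SimpleRouteValue> filter(final String logicTableName, final Collection<String> shardingColumns,
                                         final List<SimpleRouteValue> routeValues) {
        List<SimpleRouteValue> result = new ArrayList<>(shardingColumns.size());
        for (SimpleRouteValue each : routeValues) {
            if (logicTableName.equals(each.tableName) && shardingColumns.contains(each.columnName)) {
                result.add(each);
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        // Assumed condition: where user_id = ? and order_id = ?
        List<SimpleRouteValue> routeValues = Arrays.asList(
                new SimpleRouteValue("t_order", "user_id"),
                new SimpleRouteValue("t_order", "order_id"));
        // Assumed configuration: user_id shards databases, order_id shards tables.
        List<SimpleRouteValue> databaseValues = filter("t_order", Collections.singletonList("user_id"), routeValues);
        List<SimpleRouteValue> tableValues = filter("t_order", Collections.singletonList("order_id"), routeValues);
        System.out.println(databaseValues.get(0).columnName);
        System.out.println(tableValues.get(0).columnName);
    }
}
```

Under these assumptions the same ShardingCondition yields one value for database routing (user_id) and one for table routing (order_id), which is why the method is called once per strategy.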

Next comes the core route0 method, which routes data sources first and then tables.

private Collection<DataNode> route0(final ShardingRule shardingRule, 
            final TableRule tableRule, 
            final List<RouteValue> databaseShardingValues, 
            final List<RouteValue> tableShardingValues) {
    // Data source routing
    Collection<String> routedDataSources = routeDataSources(shardingRule, tableRule, databaseShardingValues);
    Collection<DataNode> result = new LinkedList<>();
    for (String each : routedDataSources) {
        // Table routing: each table + data source pair becomes a DataNode
        Collection<DataNode> dataNodes = routeTables(shardingRule, tableRule, each, tableShardingValues);
        result.addAll(dataNodes);
    }
    return result;
}

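routeDataSources performs data source routing. If databaseShardingValues is empty it returns all data sources, which matches the case above where an empty shardingConditions goes straight into route0. Otherwise it obtains the database sharding strategy for the TableRule and calls the strategy's doSharding method (the individual strategies are covered later; for now we follow the routing flow to the end), which yields the set of data source names.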

private Collection<String> routeDataSources(final ShardingRule shardingRule, final TableRule tableRule, final List<RouteValue> databaseShardingValues) {
    // No RouteValue: return all data sources
    if (databaseShardingValues.isEmpty()) {
        return tableRule.getActualDatasourceNames();
    }
    // Get the database sharding strategy
    ShardingStrategy databaseShardingStrategy = shardingRule.getDatabaseShardingStrategy(tableRule);
    // Run the sharding algorithm to get the data source names
    Collection<String> dataSources = databaseShardingStrategy.doSharding(tableRule.getActualDatasourceNames(), databaseShardingValues, this.properties);
    // Copy into a LinkedHashSet
    Collection<String> result = new LinkedHashSet<>(dataSources);
    // The routing result must not be empty
    Preconditions.checkState(!result.isEmpty(), "no database route info");
    // The routing result must not fall outside the set of actual data sources
    Preconditions.checkState(tableRule.getActualDatasourceNames().containsAll(result), 
            "Some routed data sources do not belong to configured data sources. routed data sources: `%s`, configured data sources: `%s`", result, tableRule.getActualDatasourceNames());
    return result;
}

routeTables does two things: it has the ShardingStrategy run the sharding algorithm to get the set of tables, and it combines each table with the data source into a DataNode.

private Collection<DataNode> routeTables(final ShardingRule shardingRule, final TableRule tableRule, final String routedDataSource, final List<RouteValue> tableShardingValues) {
    // All actual tables under this data source
    Collection<String> availableTargetTables = tableRule.getActualTableNames(routedDataSource);
    Collection<String> routedTables;
    // No RouteValue: return all tables under the current data source
    if (tableShardingValues.isEmpty()) {
        routedTables = new LinkedHashSet<>(availableTargetTables);
    }
    // Otherwise run the sharding algorithm to get the tables
    else {
        ShardingStrategy tableShardingStrategy = shardingRule.getTableShardingStrategy(tableRule);
        Collection<String> tables = tableShardingStrategy.doSharding(availableTargetTables, tableShardingValues, this.properties);
        routedTables = new LinkedHashSet<>(tables);
    }
    // The routing result must not be empty
    Preconditions.checkState(!routedTables.isEmpty(), "no table route info");
    // Assemble the DataNodes
    Collection<DataNode> result = new LinkedList<>();
    for (String each : routedTables) {
        result.add(new DataNode(routedDataSource, each));
    }
    return result;
}
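
To make the two-stage flow concrete, here is a minimal sketch of the route0 idea with plain collections. Everything in it is an assumption for illustration: two data sources ds_0 and ds_1, each holding t_order_0 and t_order_1, with user_id % 2 choosing the data source and order_id % 2 choosing the table. A null argument plays the role of an empty RouteValue list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class Route0Sketch {

    // Hypothetical layout: data source name -> actual tables of the logic table t_order.
    static final Map<String, List<String>> ACTUAL_TABLES = new LinkedHashMap<>();

    static {
        ACTUAL_TABLES.put("ds_0", Arrays.asList("t_order_0", "t_order_1"));
        ACTUAL_TABLES.put("ds_1", Arrays.asList("t_order_0", "t_order_1"));
    }

    // Data source routing: without a sharding value every data source is returned.
    static Collection<String> routeDataSources(final Long userId) {
        if (userId == null) {
            return new ArrayList<>(ACTUAL_TABLES.keySet());
        }
        return Collections.singletonList("ds_" + userId % 2);
    }

    // Table routing inside one data source: without a sharding value every table is returned.
    static Collection<String> routeTables(final String dataSource, final Long orderId) {
        if (orderId == null) {
            return ACTUAL_TABLES.get(dataSource);
        }
        return Collections.singletonList("t_order_" + orderId % 2);
    }

    // Route data sources first, then tables, and pair them up as "dataSource.table" nodes.
    static List<String> route0(final Long userId, final Long orderId) {
        List<String> result = new ArrayList<>();
        for (String dataSource : routeDataSources(userId)) {
            for (String table : routeTables(dataSource, orderId)) {
                result.add(dataSource + "." + table);
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        System.out.println(route0(3L, 10L));    // [ds_1.t_order_0]
        System.out.println(route0(3L, null));   // [ds_1.t_order_0, ds_1.t_order_1]
        System.out.println(route0(null, null)); // all four nodes
    }
}
```

The last call corresponds to the empty shardingConditions case discussed earlier: with no values for either strategy, every data source and every table under it is returned.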

That completes ShardingStandardRoutingEngine#getDataNodes. Back in ShardingStandardRoutingEngine#route, the last step converts the DataNodes into a RouteResult.

public RouteResult route(final ShardingRule shardingRule) {
  // insert, update and delete do not support multi-table operations
  if (isDMLForModify(sqlStatementContext) && 1 != ((TableAvailable) sqlStatementContext).getAllTables().size()) {
      throw new ShardingSphereException("Cannot support Multiple-Table for '%s'.", sqlStatementContext.getSqlStatement());
  }
  // Look up the table rule by logic table name
  TableRule tableRule = shardingRule.getTableRule(logicTableName);
  // Find the DataNodes
  Collection<DataNode> dataNodes = getDataNodes(shardingRule, tableRule);
  // Build the RouteResult from all DataNodes
  return generateRouteResult(dataNodes);
}

// Convert each DataNode into a RouteUnit and add it to the RouteResult
private RouteResult generateRouteResult(final Collection<DataNode> routedDataNodes) {
  RouteResult result = new RouteResult();
  result.getOriginalDataNodes().addAll(originalDataNodes);
  for (DataNode each : routedDataNodes) {
      RouteMapper dataSourceMapper = new RouteMapper(each.getDataSourceName(), each.getDataSourceName());
      RouteMapper tableMapper = new RouteMapper(logicTableName, each.getTableName());
      RouteUnit unit = new RouteUnit(dataSourceMapper, Collections.singletonList(tableMapper));
      result.getRouteUnits().add(unit);
  }
  return result;
}

ShardingComplexRoutingEngine

ShardingComplexRoutingEngine is the complex routing engine. Unlike ShardingStandardRoutingEngine, which handles a single logic table, ShardingComplexRoutingEngine handles multiple logic tables.

@RequiredArgsConstructor
public final class ShardingComplexRoutingEngine implements ShardingRouteEngine {
    
  private final Collection<String> logicTables;

  private final SQLStatementContext sqlStatementContext;

  private final ShardingConditions shardingConditions;

  private final ConfigurationProperties properties;

  @Override
  public RouteResult route(final ShardingRule shardingRule) {
      Collection<RouteResult> result = new ArrayList<>(logicTables.size());
      // Binding table names, possibly spanning several binding groups
      Collection<String> bindingTableNames = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
      for (String each : logicTables) {
          // Only logic tables that have a TableRule go through routing
          Optional<TableRule> tableRule = shardingRule.findTableRule(each);
          if (tableRule.isPresent()) {
              // If no binding group seen so far already covers this table, run ShardingStandardRoutingEngine#route for it
              if (!bindingTableNames.contains(each)) {
                  result.add(new ShardingStandardRoutingEngine(tableRule.get().getLogicTable(), sqlStatementContext, shardingConditions, properties).route(shardingRule));
              }
              // Add every table of the same binding group to bindingTableNames so that ShardingStandardRoutingEngine is not invoked again for them; this also deduplicates result
              shardingRule.findBindingTableRule(each).ifPresent(bindingTableRule -> bindingTableNames.addAll(
                  bindingTableRule.getTableRules().stream().map(TableRule::getLogicTable).collect(Collectors.toList())));
          }
      }
      // Empty routing result: throw an exception
      if (result.isEmpty()) {
          throw new ShardingSphereException("Cannot find table rule and default data source with logic tables: '%s'", logicTables);
      }
      // Exactly one routing result: return it directly
      if (1 == result.size()) {
          return result.iterator().next();
      }
      // Otherwise run ShardingCartesianRoutingEngine
      return new ShardingCartesianRoutingEngine(result).route(shardingRule);
  }
}

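As the route method above shows, although ShardingComplexRoutingEngine handles multiple logic tables, it still delegates each logic table to ShardingStandardRoutingEngine for routing. If the set of routing results is empty, it throws an exception; if it contains exactly one result, that result is returned directly; if it contains more than one, ShardingCartesianRoutingEngine's route method is invoked.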
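
The binding-table deduplication is the least obvious part of that loop. The sketch below isolates it under simplifying assumptions: every logic table is assumed to have a TableRule, binding groups are given as plain lists of names, and tablesToRoute is a hypothetical helper that only reports which tables would trigger a standard routing pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class BindingDedupSketch {

    // Returns the logic tables that would each trigger one standard routing pass.
    static List<String> tablesToRoute(final List<String> logicTables, final List<List<String>> bindingGroups) {
        List<String> routed = new ArrayList<>();
        // Case-insensitive set, mirroring the TreeSet used by the engine.
        Set<String> bindingTableNames = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String each : logicTables) {
            // Route the table only if no earlier table of the same binding group was routed.
            if (!bindingTableNames.contains(each)) {
                routed.add(each);
            }
            // Remember every table bound to this one so the rest of the group is skipped.
            for (List<String> group : bindingGroups) {
                if (group.stream().anyMatch(table -> table.equalsIgnoreCase(each))) {
                    bindingTableNames.addAll(group);
                }
            }
        }
        return routed;
    }

    public static void main(final String[] args) {
        List<String> logicTables = Arrays.asList("t_order", "t_order_item");
        List<List<String>> bound = Collections.singletonList(Arrays.asList("t_order", "t_order_item"));
        // With a binding group only the first table is routed.
        System.out.println(tablesToRoute(logicTables, bound));
        // Without binding both tables are routed, which leads to the Cartesian engine.
        System.out.println(tablesToRoute(logicTables, Collections.emptyList()));
    }
}
```

With the binding group configured, only one routing pass runs and its single result is returned directly; without it, two results are produced and have to be combined.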

ShardingCartesianRoutingEngine

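ShardingCartesianRoutingEngine is the Cartesian product routing engine. It is constructed from a collection of RouteResults and takes the Cartesian product of the routing results produced by the upstream routing engine (ShardingStandardRoutingEngine) to build a new routing result.

Currently, ShardingCartesianRoutingEngine is only entered when ShardingComplexRoutingEngine ends up with more than one routing result.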

@RequiredArgsConstructor
public final class ShardingCartesianRoutingEngine implements ShardingRouteEngine {
    
  private final Collection<RouteResult> routeResults;

  @Override
  public RouteResult route(final ShardingRule shardingRule) {
      RouteResult result = new RouteResult();
      // Intersect the data sources, then build a map of data source -> logic tables
      Map<String, Set<String>> dataSourceLogicTablesMap = getDataSourceLogicTablesMap();
      // Iterate over each data source -> logic tables entry
      for (Entry<String, Set<String>> entry : dataSourceLogicTablesMap.entrySet()) {
          // Each Set holds the actual tables of one logic table
          List<Set<String>> actualTableGroups = getActualTableGroups(entry.getKey(), entry.getValue());
          // Convert actualTableGroups into List<Set<RouteMapper>>
          List<Set<RouteMapper>> routingTableGroups = toRoutingTableGroups(entry.getKey(), actualTableGroups);
          // Take the Cartesian product of the routingTableGroups elements
          Set<List<RouteMapper>> cartesianProduct = Sets.cartesianProduct(routingTableGroups);
          // Convert to RouteUnits
          Collection<RouteUnit> routeUnits = getRouteUnits(entry.getKey(), cartesianProduct);
          result.getRouteUnits().addAll(routeUnits);
      }
      return result;
  }
}

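Example: select * from t_order a inner join t_order_item b on a.order_id = b.order_id where a.user_id = 2

When t_order and t_order_item are not configured as binding tables, this query goes through ShardingCartesianRoutingEngine's route method.

It first enters ShardingComplexRoutingEngine, which runs ShardingStandardRoutingEngine once for each of the logic tables t_order and t_order_item, producing two RouteResults that are passed to ShardingCartesianRoutingEngine.

ShardingCartesianRoutingEngine then takes the Cartesian product of the actual tables and produces the final RouteResult.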
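
The product step for a single data source can be sketched in plain Java. This is an illustration only: the engine itself uses Guava's Sets.cartesianProduct over RouteMapper sets, whereas the sketch works on plain table names, and the actual table names and the data source ds_0 are assumed rather than taken from a real configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CartesianSketch {

    // Cartesian product over groups of actual table names, one group per logic table.
    static List<List<String>> cartesianProduct(final List<List<String>> groups) {
        List<List<String>> result = new ArrayList<>();
        // Start with a single empty combination and extend it group by group.
        result.add(Collections.emptyList());
        for (List<String> group : groups) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String table : group) {
                    List<String> combined = new ArrayList<>(prefix);
                    combined.add(table);
                    next.add(combined);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(final String[] args) {
        // Assumed routing outcome inside ds_0: two actual tables for each logic table.
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("t_order_0", "t_order_1"),
                Arrays.asList("t_order_item_0", "t_order_item_1"));
        // Each combination corresponds to one route unit, i.e. one actual SQL to execute.
        for (List<String> combination : cartesianProduct(groups)) {
            System.out.println("ds_0 -> " + combination);
        }
    }
}
```

Under these assumptions the join fans out into four combinations in ds_0. Configuring t_order and t_order_item as binding tables avoids this fan-out, since only one routing pass is then needed.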

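Summary

  • ShardingRouteDecorator is the core class for routing. Its key steps are:

    • getShardingConditions: parse the where clause from the SQL context and put the resulting RouteValues into ShardingConditions
    • Obtain the routing engine
    • Execute the routing engine
  • ShardingRouteEngine has several implementations. The ones to focus on are ShardingStandardRoutingEngine, ShardingComplexRoutingEngine, and ShardingCartesianRoutingEngine:

    • ShardingStandardRoutingEngine: the routing engine for a single logic table.
    • ShardingComplexRoutingEngine: the routing engine for multiple logic tables. It does no routing itself and delegates to ShardingStandardRoutingEngine and ShardingCartesianRoutingEngine.
    • ShardingCartesianRoutingEngine: constructed from the multiple RouteResults produced by upstream engines, it produces their Cartesian-product routing result. It is only constructed by ShardingComplexRoutingEngine; there is currently no other entry point.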

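The next chapter continues with the routing source code:

  • ShardingRouteEngineFactory, the routing engine factory: which engine handles which kind of SQL, when broadcasting happens, and when all databases and tables are scanned.
  • ShardingStrategy, the sharding strategy: how ShardingStrategyConfiguration affects routing at runtime.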