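I. How the JobGraph Is Generated: A Source Code Walkthrough
0. The conclusion first
The conversion from StreamGraph to JobGraph also happens on the client. It does four main things:
- Groups the StreamNodes into operator chains (e.g. "Source->Map", "KeyBy->Window").
- Converts each chain into a JobVertex.
- Converts the StreamEdges recorded in transitiveOutEdges into JobEdges.
- Creates an IntermediateDataSet between each JobVertex and JobEdge to connect them.
Preface:
- The JobGraph is built on top of the StreamGraph. If you are not familiar with that step, read the previous article first: Flink-Graph-2, StreamGraph generation source code.
- The walkthrough below is like a set of nesting dolls: one call delegates to the next until we reach the class that actually does the work. It is fairly involved, so read it a few times together with the worked example at the end.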
1. Start again from StreamExecutionEnvironment
When env.execute() is called, the JobGraph is generated from the StreamGraph that has already been built.
// 1. Entry point of env.execute(); it calls execute(String jobName) below
public JobExecutionResult execute() throws Exception {
return execute(getJobName());
}
// 2. Get the StreamGraph, then call execute(StreamGraph streamGraph)
public JobExecutionResult execute(String jobName) throws Exception {
Preconditions.checkNotNull(jobName, "Streaming Job name should not be null.");
return execute(getStreamGraph(jobName));
}
// 3. Generates the JobGraph from the StreamGraph by calling executeAsync; some try-catch code is omitted for readability
public JobExecutionResult execute(StreamGraph streamGraph) throws Exception {
final JobClient jobClient = executeAsync(streamGraph);
final JobExecutionResult jobExecutionResult;
if (configuration.getBoolean(DeploymentOptions.ATTACHED)) {
jobExecutionResult = jobClient.getJobExecutionResult().get();
} else {
jobExecutionResult = new DetachedJobExecutionResult(jobClient.getJobID());
}
jobListeners.forEach(jobListener -> jobListener.onJobExecuted(jobExecutionResult, null));
return jobExecutionResult;
}
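// 4. The method that is finally called: it checks that the StreamGraph is not null, then the executor obtained from executorFactory generates the JobGraph from the StreamGraph inside its execute method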
public JobClient executeAsync(StreamGraph streamGraph) throws Exception {
checkNotNull(streamGraph, "StreamGraph cannot be null.");
checkNotNull(configuration.get(DeploymentOptions.TARGET), "No execution.target specified in your configuration file.");
// Pick the factory that matches the deployment target
final PipelineExecutorFactory executorFactory =
executorServiceLoader.getExecutorFactory(configuration);
checkNotNull(
executorFactory,
"Cannot find compatible factory for specified execution.target (=%s)",
configuration.get(DeploymentOptions.TARGET));
// Key point: pick the matching executor and submit the job
CompletableFuture<JobClient> jobClientFuture = executorFactory
.getExecutor(configuration)
.execute(streamGraph, configuration, userClassloader);
JobClient jobClient = jobClientFuture.get();
jobListeners.forEach(jobListener -> jobListener.onJobSubmitted(jobClient, null));
return jobClient;
}
Note that executorFactory has the type PipelineExecutorFactory.
This is a factory interface; its implementations are shown in the figure.
public interface PipelineExecutorFactory {
String getName();
boolean isCompatibleWith(final Configuration configuration);
PipelineExecutor getExecutor(final Configuration configuration);
}
Take one of them, YarnJobClusterExecutorFactory, as an example. Its code is as follows:
public class YarnJobClusterExecutorFactory implements PipelineExecutorFactory {
@Override
public String getName() {
return YarnJobClusterExecutor.NAME;
}
@Override
public boolean isCompatibleWith(@Nonnull final Configuration configuration) {
return YarnJobClusterExecutor.NAME.equalsIgnoreCase(configuration.get(DeploymentOptions.TARGET));
}
@Override
public PipelineExecutor getExecutor(@Nonnull final Configuration configuration) {
try {
// Key point
return new YarnJobClusterExecutor();
} catch (NoClassDefFoundError e) {
throw new IllegalStateException(YarnDeploymentTarget.ERROR_MESSAGE);
}
}
}
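The lookup above boils down to "ask each factory whether it is compatible with the configured target". Here is a minimal sketch of that idea using toy types only: ToyExecutorFactory, ToyFactory and ToyFactorySelector are made-up names, the configuration is a plain Map, and the target values "local" and "yarn-per-job" are used purely as illustrations. Failing on zero or multiple matches is a choice of this sketch, not a statement about how Flink's service loader behaves.

```java
import java.util.List;
import java.util.Map;

/** Toy stand-in for a factory interface; not a Flink type. */
interface ToyExecutorFactory {
    String getName();
    boolean isCompatibleWith(Map<String, String> configuration);
}

/** A factory that is compatible when execution.target equals its name. */
final class ToyFactory implements ToyExecutorFactory {
    private final String name;

    ToyFactory(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public boolean isCompatibleWith(Map<String, String> configuration) {
        // Same idea as the excerpt: case-insensitive match on the target.
        // equalsIgnoreCase(null) is false, so a missing key never matches.
        return name.equalsIgnoreCase(configuration.get("execution.target"));
    }
}

final class ToyFactorySelector {
    /** Returns the single compatible factory, or throws if there is none or more than one. */
    static ToyExecutorFactory select(
            List<ToyExecutorFactory> candidates, Map<String, String> configuration) {
        ToyExecutorFactory match = null;
        for (ToyExecutorFactory candidate : candidates) {
            if (candidate.isCompatibleWith(configuration)) {
                if (match != null) {
                    throw new IllegalStateException("Multiple compatible factories found");
                }
                match = candidate;
            }
        }
        if (match == null) {
            throw new IllegalStateException("No compatible factory found");
        }
        return match;
    }
}
```

With candidates "local" and "yarn-per-job" and a configuration whose execution.target is "YARN-PER-JOB", select() returns the second factory, because the comparison ignores case.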
As you can see, getExecutor() creates a YarnJobClusterExecutor. Its code is as follows:
public class YarnJobClusterExecutor extends AbstractJobClusterExecutor<ApplicationId, YarnClusterClientFactory> {
public static final String NAME = YarnDeploymentTarget.PER_JOB.getName();
public YarnJobClusterExecutor() {
// Key point
super(new YarnClusterClientFactory()); // calls the constructor of the parent class AbstractJobClusterExecutor
}
}
So the class that finally does the work at this layer is AbstractJobClusterExecutor.
2. Next, AbstractJobClusterExecutor
public class AbstractJobClusterExecutor<ClusterID, ClientFactory extends ClusterClientFactory<ClusterID>> implements PipelineExecutor {
private static final Logger LOG = LoggerFactory.getLogger(AbstractJobClusterExecutor.class);
private final ClientFactory clusterClientFactory;
public AbstractJobClusterExecutor(@Nonnull final ClientFactory clusterClientFactory) {
this.clusterClientFactory = checkNotNull(clusterClientFactory);
}
// Key point
@Override
public CompletableFuture<JobClient> execute(@Nonnull final Pipeline pipeline, @Nonnull final Configuration configuration, @Nonnull final ClassLoader userCodeClassloader) throws Exception {
// Pipeline is an interface and StreamGraph implements it; the StreamGraph is passed in from StreamExecutionEnvironment.executeAsync
// 1. Call PipelineExecutorUtils.getJobGraph to convert the StreamGraph into a JobGraph
final JobGraph jobGraph = PipelineExecutorUtils.getJobGraph(pipeline, configuration);
/* 2. Cluster descriptor: creates and starts the YarnClient; holds YARN and Flink configuration and environment information */
try (final ClusterDescriptor<ClusterID> clusterDescriptor = clusterClientFactory.createClusterDescriptor(configuration)) {
final ExecutionConfigAccessor configAccessor = ExecutionConfigAccessor.fromConfiguration(configuration);
/* 2.1 Cluster-specific resources: JobManager memory, TaskManager memory, slots per TaskManager */
final ClusterSpecification clusterSpecification = clusterClientFactory.getClusterSpecification(configuration);
final ClusterClientProvider<ClusterID> clusterClientProvider = clusterDescriptor
.deployJobCluster(clusterSpecification, jobGraph, configAccessor.getDetachedMode());
LOG.info("Job has been submitted with JobID " + jobGraph.getJobID());
return CompletableFuture.completedFuture(
new ClusterClientJobClientAdapter<>(clusterClientProvider, jobGraph.getJobID(), userCodeClassloader));
}
}
}
OK, so the method that actually produces the JobGraph here is PipelineExecutorUtils.getJobGraph. Let's continue:
public class PipelineExecutorUtils {
/**
* Creates the {@link JobGraph} corresponding to the provided {@link Pipeline}.
*
* @param pipeline the pipeline whose job graph we are computing
* @param configuration the configuration with the necessary information such as jars and
* classpaths to be included, the parallelism of the job and potential
* savepoint settings used to bootstrap its state.
* @return the corresponding {@link JobGraph}.
*/
public static JobGraph getJobGraph(@Nonnull final Pipeline pipeline, @Nonnull final Configuration configuration) throws MalformedURLException {
// 1. Check again that the pipeline (the StreamGraph) and the configuration are not null; otherwise an exception is thrown
checkNotNull(pipeline);
checkNotNull(configuration);
final ExecutionConfigAccessor executionConfigAccessor = ExecutionConfigAccessor.fromConfiguration(configuration);
// 2. Delegate to FlinkPipelineTranslationUtil.getJobGraph, which returns the JobGraph
final JobGraph jobGraph = FlinkPipelineTranslationUtil
.getJobGraph(pipeline, configuration, executionConfigAccessor.getParallelism());
// 3. The rest is configuration: set the jars, classpaths, job ID and savepoint settings on the JobGraph
configuration
.getOptional(PipelineOptionsInternal.PIPELINE_FIXED_JOB_ID)
.ifPresent(strJobID -> jobGraph.setJobID(JobID.fromHexString(strJobID)));
jobGraph.addJars(executionConfigAccessor.getJars());
jobGraph.setClasspaths(executionConfigAccessor.getClasspaths());
jobGraph.setSavepointRestoreSettings(executionConfigAccessor.getSavepointRestoreSettings());
// 4. Return the JobGraph
return jobGraph;
}
}
Next, FlinkPipelineTranslationUtil.getJobGraph():
public final class FlinkPipelineTranslationUtil {
public static JobGraph getJobGraph(
Pipeline pipeline,
Configuration optimizerConfiguration,
int defaultParallelism) {
// FlinkPipelineTranslator is an interface; its implementations, shown in the figure above, are StreamGraphTranslator and PlanTranslator
FlinkPipelineTranslator pipelineTranslator = getPipelineTranslator(pipeline);
// The implementation then calls translateToJobGraph to convert the StreamGraph into a JobGraph
return pipelineTranslator.translateToJobGraph(pipeline,
optimizerConfiguration,
defaultParallelism);
}
// ...
}
Next, StreamGraphTranslator:
public class StreamGraphTranslator implements FlinkPipelineTranslator {
private static final Logger LOG = LoggerFactory.getLogger(StreamGraphTranslator.class);
@Override
public JobGraph translateToJobGraph(
Pipeline pipeline,
Configuration optimizerConfiguration,
int defaultParallelism) {
checkArgument(pipeline instanceof StreamGraph,
"Given pipeline is not a DataStream StreamGraph.");
StreamGraph streamGraph = (StreamGraph) pipeline;
// Call getJobGraph on the StreamGraph
return streamGraph.getJobGraph(null);
}
// ...
}
getJobGraph() in StreamGraph in turn calls createJobGraph() in StreamingJobGraphGenerator:
// 1. StreamGraph
public JobGraph getJobGraph(@Nullable JobID jobID) {
return StreamingJobGraphGenerator.createJobGraph(this, jobID);
}
// 2. StreamingJobGraphGenerator
public static JobGraph createJobGraph(StreamGraph streamGraph, @Nullable JobID jobID) {
return new StreamingJobGraphGenerator(streamGraph, jobID).createJobGraph();
}
At last we have found the class that really does the work: StreamingJobGraphGenerator.
3. The real worker: StreamingJobGraphGenerator
(1) Fields
private final StreamGraph streamGraph;
// id -> JobVertex
private final Map<Integer, JobVertex> jobVertices;
private final JobGraph jobGraph;
// ids of the JobVertices that have already been built
private final Collection<Integer> builtVertices;
// physical edges (edges inside a chain are excluded), in creation order
private final List<StreamEdge> physicalEdgesInOrder;
// chain information, used at deployment time to build the OperatorChain: startNodeId -> (currentNodeId -> StreamConfig)
private final Map<Integer, Map<Integer, StreamConfig>> chainedConfigs;
// configuration of every node: id -> StreamConfig
private final Map<Integer, StreamConfig> vertexConfigs;
// name of every node: id -> chainedName
private final Map<Integer, String> chainedNames;
private final Map<Integer, ResourceSpec> chainedMinResources;
private final Map<Integer, ResourceSpec> chainedPreferredResources;
private final Map<Integer, InputOutputFormatContainer> chainedInputOutputFormats;
private final StreamGraphHasher defaultStreamGraphHasher;
private final List<StreamGraphHasher> legacyStreamGraphHashers;
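Of these fields, chainedConfigs is the one whose shape is easiest to misread, so here is a small sketch of the nested-map pattern it uses. ToyChainedConfigs is a made-up class and a String stands in for StreamConfig; only the computeIfAbsent-then-put idiom mirrors what createChain() does for non-head nodes later in this article.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy version of the chainedConfigs field: chain head id -> (chained node id -> config). */
final class ToyChainedConfigs {
    // A String stands in for StreamConfig in this sketch.
    final Map<Integer, Map<Integer, String>> chainedConfigs = new HashMap<>();

    /** Records the config of a non-head node under the head of the chain it belongs to. */
    void register(int startNodeId, int nodeId, String config) {
        // Create the inner map the first time a chain head is seen, then add the node.
        chainedConfigs.computeIfAbsent(startNodeId, k -> new HashMap<>()).put(nodeId, config);
    }
}
```

Registering node 2 under head 1 leaves exactly one outer key, 1, whose inner map has the single key 2. The chain head itself is not stored in the inner map, because only child nodes of a chain are registered this way.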
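(2) Core logic
<1> createJobGraph()
The member variables of StreamingJobGraphGenerator all exist to help build the final JobGraph. createJobGraph() first generates a unique hash id for every node. As long as a node does not change between submissions (including its parallelism and its upstream and downstream nodes), this id stays the same, which is mainly needed for failure recovery.
StreamNode.id cannot be used for this purpose, because it comes from a static counter that starts at 1, so the same job may end up with different ids. The two jobs in the example below are identical, yet the source ids differ.
// Example 1: A.id=1 B.id=2
DataStream<String> A = ...
DataStream<String> B = ...
A.union(B).print();
// Example 2: A.id=2 B.id=1
DataStream<String> B = ...
DataStream<String> A = ...
A.union(B).print();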
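To make the point about counter-based ids concrete, here is a toy model. ToyNode and ToyIdDemo are made-up classes, and structuralHash() is a deliberately naive hash over the node name and its inputs; it is not the algorithm Flink's StreamGraphHasher uses. The only thing it is meant to show is that an id taken from a static counter depends on declaration order, while a value derived from the structure does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy node whose id comes from a static counter, like the ids described above. */
final class ToyNode {
    private static int nextId = 1; // process-wide counter, starts at 1

    final int id;
    final String name;
    final List<ToyNode> inputs = new ArrayList<>();

    ToyNode(String name, ToyNode... inputs) {
        this.id = nextId++;
        this.name = name;
        for (ToyNode input : inputs) {
            this.inputs.add(input);
        }
    }

    /** Simulates starting a fresh program. */
    static void resetCounter() {
        nextId = 1;
    }

    /** Naive hash built from the name and the inputs only, never from id. */
    int structuralHash() {
        int h = name.hashCode();
        for (ToyNode input : inputs) {
            h = 31 * h + input.structuralHash();
        }
        return h;
    }
}

final class ToyIdDemo {
    /** Declares source A before or after source B; returns {A.id, A.structuralHash()}. */
    static int[] build(boolean declareAFirst) {
        ToyNode.resetCounter();
        ToyNode a;
        ToyNode b;
        if (declareAFirst) {
            a = new ToyNode("A");
            b = new ToyNode("B");
        } else {
            b = new ToyNode("B");
            a = new ToyNode("A");
        }
        new ToyNode("Print", a, b); // same downstream node in both variants
        return new int[] {a.id, a.structuralHash()};
    }
}
```

Calling build(true) and build(false) gives A the ids 1 and 2 respectively, while the structural hash of A is identical in both runs.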
private JobGraph createJobGraph() {
preValidate();
// 0. Configuration
// In streaming mode the schedule mode is EAGER: all vertices are started together
jobGraph.setScheduleMode(streamGraph.getScheduleMode());
jobGraph.enableApproximateLocalRecovery(streamGraph.getCheckpointConfig().isApproximateLocalRecoveryEnabled());
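// Traverse the StreamGraph breadth-first and generate a hash id for every StreamNode
// As long as the submitted topology does not change, the generated hashes are the same every time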
Map<Integer, byte[]> hashes = defaultStreamGraphHasher.traverseStreamGraphAndGenerateHashes(streamGraph);
List<Map<Integer, byte[]>> legacyHashes = new ArrayList<>(legacyStreamGraphHashers.size());
for (StreamGraphHasher hasher : legacyStreamGraphHashers) {
legacyHashes.add(hasher.traverseStreamGraphAndGenerateHashes(streamGraph));
}
// 1. The most important call: build the operator chains, create the JobVertices, JobEdges and so on, chaining as many nodes together as possible
setChaining(hashes, legacyHashes);
// 2. Serialize the in-edges of each JobVertex into its StreamConfig (the out-edges were already written during setChaining)
setPhysicalEdges();
// 3. Assign each JobVertex to its SlotSharingGroup by group name, and set the CoLocationGroup for iteration heads and tails
setSlotSharingAndCoLocation();
// 4. Managed memory and checkpoint configuration
setManagedMemoryFraction(
Collections.unmodifiableMap(jobVertices),
Collections.unmodifiableMap(vertexConfigs),
Collections.unmodifiableMap(chainedConfigs),
id -> streamGraph.getStreamNode(id).getManagedMemoryOperatorScopeUseCaseWeights(),
id -> streamGraph.getStreamNode(id).getManagedMemorySlotScopeUseCases());
configureCheckpointing();
jobGraph.setSavepointRestoreSettings(streamGraph.getSavepointRestoreSettings());
JobGraphUtils.addUserArtifactEntries(streamGraph.getUserArtifacts(), jobGraph);
// set the ExecutionConfig last when it has been finalized
try {
// 5. Serialize the ExecutionConfig of the StreamGraph into the JobGraph configuration
jobGraph.setExecutionConfig(streamGraph.getExecutionConfig());
}
catch (IOException e) {
throw new IllegalConfigurationException("Could not serialize the ExecutionConfig." +
"This indicates that non-serializable types (like custom serializers) were registered");
}
return jobGraph;
}
<2> setChaining()
private void setChaining(Map<Integer, byte[]> hashes, List<Map<Integer, byte[]>> legacyHashes) {
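// 1. Build the chain entry points. Rule: a chain starts at a source operator or at an operator that cannot be chained to its upstream operator.
// OperatorChainInfo stores the start node of the chain, its priority and related information.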
final Map<Integer, OperatorChainInfo> chainEntryPoints = buildChainedInputsAndGetHeadInputs(hashes, legacyHashes);
final Collection<OperatorChainInfo> initialEntryPoints = new ArrayList<>(chainEntryPoints.values());
// 2. Build the node chains (operator chains) starting from each entry point
for (OperatorChainInfo info : initialEntryPoints) {
// Build the node chain and return the physical out-edges of the current node; when startNodeId != currentNodeId, currentNode is a child node inside the chain
createChain(
info.getStartNodeId(),
1, // operators start at position 1 because 0 is for chained source inputs
info,
chainEntryPoints);
}
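}
<3> The recursive createChain() called by setChaining()
The method that actually creates the operator chains is createChain().
1° The recursive code of createChain() (the key part)
It has 9 steps; see "2° Walkthrough with an example" at the end for how they play out.
// Build the node chain and return the physical out-edges of the current node
// When startNodeId != currentNodeId, currentNode is a child node inside the chain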
private List<StreamEdge> createChain(
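final Integer currentNodeId, // id of the operator node being processed
final int chainIndex, // position of the operator within the current chain
final OperatorChainInfo chainInfo, // metadata of the chain
final Map<Integer, OperatorChainInfo> chainEntryPoints) // map of chain entry points
{
// 1. Get the start node
Integer startNodeId = chainInfo.getStartNodeId();
// Avoid building the chain for the same start node twice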
if (!builtVertices.contains(startNodeId)) {
/* Key point: transitiveOutEdges holds two kinds of edges. The process is convoluted; see the walkthrough below.
* 1. The external edges returned by the recursive calls for the chained children
* 2. The non-chainable out-edges of the current node
* Take the topology: Source --> Edge1 --> Map --> Edge2 --> KeyBy --> Edge3 --> Window --> Edge4 --> Sink
* Chaining yields:
* Chain1: [Source + Map]
* Chain2: [KeyBy + Window]
* Chain3: [Sink]
* The final transitiveOutEdges of each chain are then:
* chain1 (head: Source): [Edge2]
* chain2 (head: KeyBy): [Edge4]
* chain3 (head: Sink): []
* */
List<StreamEdge> transitiveOutEdges = new ArrayList<StreamEdge>(); // transitive out-edges: all external output edges of the current chain; connect() below turns them into physical connections
List<StreamEdge> chainableOutputs = new ArrayList<StreamEdge>(); // out-edges that can be chained
List<StreamEdge> nonChainableOutputs = new ArrayList<StreamEdge>(); // out-edges that cannot be chained
StreamNode currentNode = streamGraph.getStreamNode(currentNodeId);
// 2. Split the out-edges of the current node into chainable and non-chainable
for (StreamEdge outEdge : currentNode.getOutEdges()) {
// Check whether the current node and the downstream operator can be chained
if (isChainable(outEdge, streamGraph)) {
chainableOutputs.add(outEdge);
} else {
nonChainableOutputs.add(outEdge);
}
}
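// 3.1 Extend the current chain recursively: for each chainable out-edge, extend the chain and increment the position index
for (StreamEdge chainable : chainableOutputs) {
// You can think of this as walking a linked list
transitiveOutEdges.addAll(
createChain(chainable.getTargetId(), // id of the downstream operator
chainIndex + 1, // position in the chain + 1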
chainInfo,
chainEntryPoints));
}
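// 3.2 Start new chains: for each non-chainable out-edge, end the current chain there and make the target operator the head of a new chain
for (StreamEdge nonChainable : nonChainableOutputs) {
transitiveOutEdges.add(nonChainable); // record the non-chainable out-edge
// Create the new chain
createChain(
nonChainable.getTargetId(), // id of the downstream operator of this edge
1, // this downstream operator is the head of the new chain
chainEntryPoints.computeIfAbsent( // make sure each operator is the head of at most one chain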
nonChainable.getTargetId(),
(k) -> chainInfo.newChain(nonChainable.getTargetId())),
chainEntryPoints);
}
// 4. Generate the display name of the chain starting at the current node, e.g. "Source -> Map"
// In the example above, chainedNames is a map that holds (id of Source, "Source -> Map")
chainedNames.put(currentNodeId, createChainedName(currentNodeId, chainableOutputs, Optional.ofNullable(chainEntryPoints.get(currentNodeId))));
chainedMinResources.put(currentNodeId, createChainedMinResources(currentNodeId, chainableOutputs));
chainedPreferredResources.put(currentNodeId, createChainedPreferredResources(currentNodeId, chainableOutputs));
// 5. Register the current operator in the chain and produce the key metadata (id of Source, "Source -> Map")
OperatorID currentOperatorId = chainInfo.addNodeToChain(currentNodeId, chainedNames.get(currentNodeId));
if (currentNode.getInputFormat() != null) {
getOrCreateFormatContainer(startNodeId).addInputFormat(currentOperatorId, currentNode.getInputFormat());
}
if (currentNode.getOutputFormat() != null) {
getOrCreateFormatContainer(startNodeId).addOutputFormat(currentOperatorId, currentNode.getOutputFormat());
}
// Create the JobVertex from this metadata; in the example above, createJobVertex below creates one JobVertex for [Source + Map]
// 6.1 If the current node is the chain head, create the JobVertex and return its StreamConfig; otherwise create an empty StreamConfig
StreamConfig config = currentNodeId.equals(startNodeId)
? createJobVertex(startNodeId, chainInfo)
: new StreamConfig(new Configuration());
// 6.2 Fill in the StreamConfig, which mostly means serializing the StreamNode's settings into it
setVertexConfig(currentNodeId, config, chainableOutputs, nonChainableOutputs, chainInfo.getChainedSources());
// 7. If this is the chain head, mark it as chain start (a node that is not chained with anything is also marked as chain start)
if (currentNodeId.equals(startNodeId)) {
// Settings for the chain head
config.setChainStart();
config.setChainIndex(chainIndex);
config.setOperatorName(streamGraph.getStreamNode(currentNodeId).getOperatorName());
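// 7.1 Connect this chain head to all of its out-edges: the physical connections
for (StreamEdge edge : transitiveOutEdges) {
// Build a JobEdge from the StreamEdge and create an IntermediateDataSet that links the JobVertex and the JobEdge
// In effect this connects the JobVertices of two different chains, e.g. [Source->Map] --> [KeyBy->Window] --> [Sink]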
connect(startNodeId, edge);
}
// 7.2 Write the physical out-edges into the config; they are needed at deployment time
config.setOutEdgesInOrder(transitiveOutEdges);
// 7.3 Write the StreamConfigs of all child nodes in the chain into the CHAINED_TASK_CONFIG entry of the head node
config.setTransitiveChainedTaskConfigs(chainedConfigs.get(startNodeId));
} else {
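// 7.4 Otherwise this is a child node inside a chain: keep track of its config so that the config of every operator in the chain is recorded
// E.g. when the Map level runs, startNode is Source and chainedConfigs has no such key yet, so a new map is created
chainedConfigs.computeIfAbsent(startNodeId, k -> new HashMap<Integer, StreamConfig>());
// Configure the Map node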
config.setChainIndex(chainIndex);
StreamNode node = streamGraph.getStreamNode(currentNodeId);
config.setOperatorName(node.getOperatorName());
// Add the StreamConfig of the current node (Map) to the config map of its chain (Source->Map)
chainedConfigs.get(startNodeId).put(currentNodeId, config);
}
config.setOperatorID(currentOperatorId);
// 8. Check whether the current node is the end of this chain
if (chainableOutputs.isEmpty()) {
config.setChainEnd();
}
// 9. Return the out-edges that leave the chain
return transitiveOutEdges;
} else {
return new ArrayList<>();
}
}
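《1》 isChainable()
Purpose: decide whether the current operator can be chained with the operator at the other end of an out-edge.
The rules are as follows; all of them must hold:
- The upstream and downstream operators are in the same slot sharing group
- The upstream and downstream operators satisfy the chaining strategy requirements
- The partitioner of the edge between them is a ForwardPartitioner
- The shuffle mode of the edge is not BATCH
- The upstream and downstream operators have the same parallelism
- Chaining is enabled globally
In addition, isChainable() itself requires the downstream operator to have exactly one in-edge.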
public static boolean isChainable(StreamEdge edge, StreamGraph streamGraph) {
// Get the downstream StreamNode of the out-edge
StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);
// Condition: the downstream node has exactly one in-edge, and isChainableInput holds
return downStreamVertex.getInEdges().size() == 1
&& isChainableInput(edge, streamGraph);
}
private static boolean isChainableInput(StreamEdge edge, StreamGraph streamGraph) {
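// upstream StreamNode of the edge
StreamNode upStreamVertex = streamGraph.getSourceVertex(edge);
// downstream StreamNode of the edge
StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);
/* The rules are as follows; all of them must hold:
* 1. The upstream and downstream operators are in the same slot sharing group
* 2. The upstream and downstream operators satisfy the chaining strategy requirements
* 3. The partitioner of the edge between them is a ForwardPartitioner
* 4. The shuffle mode of the edge is not BATCH
* 5. The upstream and downstream operators have the same parallelism
* 6. Chaining is enabled
* */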
if (!(upStreamVertex.isSameSlotSharingGroup(downStreamVertex)
&& areOperatorsChainable(upStreamVertex, downStreamVertex, streamGraph)
&& (edge.getPartitioner() instanceof ForwardPartitioner)
&& edge.getShuffleMode() != ShuffleMode.BATCH
&& upStreamVertex.getParallelism() == downStreamVertex.getParallelism()
&& streamGraph.isChainingEnabled())) {
return false;
}
// ... the rest of the method is omitted in this excerpt
}
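The rule set is easier to remember as one boolean expression. The sketch below restates the conditions listed above over toy types: ToyOp and ToyChainRule are made-up names, and the allowsChaining flag is a simplification that stands in for whatever areOperatorsChainable() checks. It is a restatement of the listed conditions, not a copy of Flink's implementation.

```java
/** Toy operator description; the fields are assumptions made for this sketch. */
final class ToyOp {
    final String slotSharingGroup;
    final int parallelism;
    final boolean allowsChaining; // stands in for the chaining-strategy check
    final int inEdgeCount;

    ToyOp(String slotSharingGroup, int parallelism, boolean allowsChaining, int inEdgeCount) {
        this.slotSharingGroup = slotSharingGroup;
        this.parallelism = parallelism;
        this.allowsChaining = allowsChaining;
        this.inEdgeCount = inEdgeCount;
    }
}

final class ToyChainRule {
    /** True only if every condition from the list above holds. */
    static boolean isChainable(
            ToyOp upstream,
            ToyOp downstream,
            boolean forwardPartitioner,
            boolean batchShuffle,
            boolean chainingEnabled) {
        return downstream.inEdgeCount == 1
                && upstream.slotSharingGroup.equals(downstream.slotSharingGroup)
                && upstream.allowsChaining
                && downstream.allowsChaining
                && forwardPartitioner
                && !batchShuffle
                && upstream.parallelism == downstream.parallelism
                && chainingEnabled;
    }
}
```

In the running example, Source and Map pass every check, while the edge from Map to KeyBy fails because its partitioner is not a forward partitioner; that single failed condition is what splits the topology into separate chains.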
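《2》 createChainedName()
Purpose: return the display name of the chain that starts at the given node; for an operator with nothing chained downstream it returns just the operator name. Examples: "Source -> Map" and "Map".
The result is passed to addNodeToChain.
// How createChain calls it: chainedNames.put(currentNodeId, createChainedName(currentNodeId, chainableOutputs, Optional.ofNullable(chainEntryPoints.get(currentNodeId))));
// E.g. if Source and Map can be chained, it returns "Source -> Map"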
private String createChainedName(Integer vertexID, List<StreamEdge> chainedOutputs, Optional<OperatorChainInfo> operatorChainInfo) {
final String operatorName = nameWithChainedSourcesInfo(
streamGraph.getStreamNode(vertexID).getOperatorName(),
operatorChainInfo.map(chain -> chain.getChainedSources().values()).orElse(Collections.emptyList()));
if (chainedOutputs.size() > 1) {
List<String> outputChainedNames = new ArrayList<>();
for (StreamEdge chainable : chainedOutputs) {
outputChainedNames.add(chainedNames.get(chainable.getTargetId()));
}
return operatorName + " -> (" + StringUtils.join(outputChainedNames, ", ") + ")";
} else if (chainedOutputs.size() == 1) {
return operatorName + " -> " + chainedNames.get(chainedOutputs.get(0).getTargetId());
} else {
return operatorName;
}
}
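The three branches are worth seeing in isolation. The sketch below reproduces only the string-building logic: ToyChainedName is a made-up class, the downstream names are passed in as a list instead of being looked up in chainedNames, and String.join replaces StringUtils.join.

```java
import java.util.List;

final class ToyChainedName {
    /**
     * Builds a chain display name from an operator name and the already
     * computed names of the chains hanging off its chainable out-edges.
     */
    static String chainedName(String operatorName, List<String> chainedOutputNames) {
        if (chainedOutputNames.size() > 1) {
            // Several chained outputs: list them in parentheses.
            return operatorName + " -> (" + String.join(", ", chainedOutputNames) + ")";
        } else if (chainedOutputNames.size() == 1) {
            return operatorName + " -> " + chainedOutputNames.get(0);
        } else {
            // Nothing chained downstream: the name is just the operator name.
            return operatorName;
        }
    }
}
```

Because createChain() recurses into the chainable outputs before it computes the name of the current node, the downstream names already exist when this logic runs: "Map" is computed first, then "Source -> Map".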
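《3》 addNodeToChain()
Purpose: use the chain name returned by createChainedName to register the operator in the chain; createJobVertex uses the result later.
The result is passed to createJobVertex.
// How createChain calls it: chainInfo.addNodeToChain(SourceID, "Source -> Map")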
private OperatorID addNodeToChain(int currentNodeId, String operatorName) {
// 1. Initialize the list of operator hashes for this chain
List<Tuple2<byte[], byte[]>> operatorHashes =
chainedOperatorHashes.computeIfAbsent(startNodeId, k -> new ArrayList<>());
// 2. Get the primary hash of the current operator
byte[] primaryHashBytes = hashes.get(currentNodeId);
// 3. Handle legacy hashes (compatibility with older versions)
for (Map<Integer, byte[]> legacyHash : legacyHashes) {
operatorHashes.add(new Tuple2<>(primaryHashBytes, legacyHash.get(currentNodeId)));
}
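// 4. Register the OperatorCoordinator provider if the operator has one; the provider is collected here and used later to create the coordinator
streamGraph
.getStreamNode(currentNodeId) // id of the Source operator
.getCoordinatorProvider(operatorName, new OperatorID(getHash(currentNodeId))) // operatorName is e.g. "Source -> Map"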
.map(coordinatorProviders::add);
// 5. Return the OperatorID (based on the primary hash)
return new OperatorID(primaryHashBytes);
}
《4》 createJobVertex()
Purpose: turn an operator chain into a physical JobVertex based on the metadata registered in addNodeToChain; a chain such as Source->Map becomes one physical vertex.
private StreamConfig createJobVertex(
Integer streamNodeId,
OperatorChainInfo chainInfo) {
// 1. Get the StreamNode, e.g. Source or Map
JobVertex jobVertex;
StreamNode streamNode = streamGraph.getStreamNode(streamNodeId);
byte[] hash = chainInfo.getHash(streamNodeId); // hash of the StreamNode
if (hash == null) {
throw new IllegalStateException("Cannot find node hash. " +
"Did you generate them before calling this method?");
}
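JobVertexID jobVertexId = new JobVertexID(hash); // create the vertex id
// 2. Convert the hashes of the operators in the chain into a list of OperatorIDPair
// Get the operator hashes of the chain: f0 is the operator's current hash (used to build the OperatorID); f1 is the legacy hash (for version compatibility)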
List<Tuple2<byte[], byte[]>> chainedOperators = chainInfo.getChainedOperatorHashes(streamNodeId);
List<OperatorIDPair> operatorIDPairs = new ArrayList<>();
if (chainedOperators != null) {
for (Tuple2<byte[], byte[]> chainedOperator : chainedOperators) {
OperatorID userDefinedOperatorID = chainedOperator.f1 == null ? null : new OperatorID(chainedOperator.f1);
operatorIDPairs.add(OperatorIDPair.of(new OperatorID(chainedOperator.f0), userDefinedOperatorID));
}
}
// 3. Create the physical JobVertex; e.g. Source->Map becomes one physical vertex
if (chainedInputOutputFormats.containsKey(streamNodeId)) {
jobVertex = new InputOutputFormatVertex(
chainedNames.get(streamNodeId), // chain name (e.g. "Source -> Map")
jobVertexId,
operatorIDPairs);
chainedInputOutputFormats
.get(streamNodeId)
.write(new TaskConfig(jobVertex.getConfiguration()));
} else {
jobVertex = new JobVertex(
chainedNames.get(streamNodeId), // chain name (e.g. "Source -> Map")
jobVertexId,
operatorIDPairs);
}
// 4. Register the operator coordinators, which coordinate state for failure recovery
for (OperatorCoordinator.Provider coordinatorProvider : chainInfo.getCoordinatorProviders()) {
try {
jobVertex.addOperatorCoordinator(new SerializedValue<>(coordinatorProvider));
} catch (IOException e) {
throw new FlinkRuntimeException(String.format(
"Coordinator Provider for node %s is not serializable.", chainedNames.get(streamNodeId)), e);
}
}
// 5. Resource and parallelism settings
jobVertex.setResources(chainedMinResources.get(streamNodeId), chainedPreferredResources.get(streamNodeId));
jobVertex.setInvokableClass(streamNode.getJobVertexClass());
int parallelism = streamNode.getParallelism();
if (parallelism > 0) {
jobVertex.setParallelism(parallelism);
} else {
parallelism = jobVertex.getParallelism();
}
jobVertex.setMaxParallelism(streamNode.getMaxParallelism());
if (LOG.isDebugEnabled()) {
LOG.debug("Parallelism set: {} for {}", parallelism, streamNodeId);
}
// TODO: inherit InputDependencyConstraint from the head operator
jobVertex.setInputDependencyConstraint(streamGraph.getExecutionConfig().getDefaultInputDependencyConstraint());
// 6. Add the JobVertex to jobVertices and jobGraph
jobVertices.put(streamNodeId, jobVertex);
builtVertices.add(streamNodeId);
jobGraph.addVertex(jobVertex);
// 7. Return the StreamConfig of the JobVertex that was just created
return new StreamConfig(jobVertex.getConfiguration());
}
《5》 connect()
Purpose: connect the JobVertices of two chains with a JobEdge, e.g. Chain1[Source->Map] --JobEdge-> Chain2[KeyBy->Window]
private void connect(Integer headOfChain, StreamEdge edge) {
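// Suppose connect is called for the Source chain: headOfChain is the id of the Source operator and edge is Edge2, the out-edge between Map and KeyBy
// 1. Record the order of the physical edges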
physicalEdgesInOrder.add(edge);
// 2. Get the JobVertex of the chain head and of the downstream operator
// id of the downstream operator
Integer downStreamVertexID = edge.getTargetId();
JobVertex headVertex = jobVertices.get(headOfChain); // upstream vertex, e.g. the JobVertex for "Source->Map"
JobVertex downStreamVertex = jobVertices.get(downStreamVertexID); // downstream vertex, e.g. the JobVertex headed by KeyBy
StreamConfig downStreamConfig = new StreamConfig(downStreamVertex.getConfiguration());
downStreamConfig.setNumberOfNetworkInputs(downStreamConfig.getNumberOfNetworkInputs() + 1); // number of network inputs of KeyBy + 1
// 3. Get the partitioner of the out-edge; for Edge2 this is a hash partitioner (KeyGroupStreamPartitioner in Flink)
StreamPartitioner<?> partitioner = edge.getPartitioner();
// 4. Determine the result partition type; it is almost always pipelined
ResultPartitionType resultPartitionType;
switch (edge.getShuffleMode()) {
case PIPELINED:
resultPartitionType = ResultPartitionType.PIPELINED_BOUNDED;
break;
case BATCH:
resultPartitionType = ResultPartitionType.BLOCKING;
break;
case UNDEFINED:
resultPartitionType = determineResultPartitionType(partitioner);
break;
default:
throw new UnsupportedOperationException("Data exchange mode " +
edge.getShuffleMode() + " is not supported yet.");
}
checkAndResetBufferTimeout(resultPartitionType, edge);
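// 5. Create the JobEdge and connect the JobVertices. This is not an edge inside a chain but a physical edge between chains, e.g. Chain1[Source->Map] --JobEdge-> Chain2[KeyBy->Window]
JobEdge jobEdge;
// Pointwise distribution: e.g. ForwardPartitioner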
if (isPointwisePartitioner(partitioner)) {
jobEdge = downStreamVertex.connectNewDataSetAsInput(
headVertex,
DistributionPattern.POINTWISE,
resultPartitionType);
} else { // All-to-all distribution: e.g. hash partitioning (KeyGroupStreamPartitioner) or RebalancePartitioner; the example above takes this branch
jobEdge = downStreamVertex.connectNewDataSetAsInput(
headVertex,
DistributionPattern.ALL_TO_ALL,
resultPartitionType);
}
// set strategy name so that web interface can show it.
jobEdge.setShipStrategyName(partitioner.toString());
jobEdge.setDownstreamSubtaskStateMapper(partitioner.getDownstreamSubtaskStateMapper());
jobEdge.setUpstreamSubtaskStateMapper(partitioner.getUpstreamSubtaskStateMapper());
if (LOG.isDebugEnabled()) {
LOG.debug("CONNECTED: {} - {} -> {}", partitioner.getClass().getSimpleName(),
headOfChain, downStreamVertexID);
}
}
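The pointwise versus all-to-all decision in connect() can be summarized in a few lines. The sketch below uses made-up enums (ToyPartitioner, ToyDistributionPattern) rather than Flink's partitioner classes. Treating the rescale partitioner as pointwise alongside forward is an assumption of this sketch; the text above only names ForwardPartitioner as a pointwise example.

```java
/** Toy stand-ins for partitioner types; not Flink classes. */
enum ToyPartitioner { FORWARD, RESCALE, KEY_GROUP, REBALANCE, BROADCAST }

/** Toy stand-in for the distribution pattern of a job edge. */
enum ToyDistributionPattern { POINTWISE, ALL_TO_ALL }

final class ToyConnect {
    /**
     * Picks the distribution pattern for an edge between two chains.
     * Assumption of this sketch: FORWARD and RESCALE are pointwise,
     * everything else connects every upstream subtask to every downstream subtask.
     */
    static ToyDistributionPattern patternFor(ToyPartitioner partitioner) {
        boolean pointwise =
                partitioner == ToyPartitioner.FORWARD || partitioner == ToyPartitioner.RESCALE;
        return pointwise ? ToyDistributionPattern.POINTWISE : ToyDistributionPattern.ALL_TO_ALL;
    }
}
```

For Edge2 in the running example (Map to KeyBy, hash partitioned by key), patternFor(ToyPartitioner.KEY_GROUP) yields ALL_TO_ALL, which matches the branch the walkthrough says is taken.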
2° Walkthrough with an example
Take the topology: Source --> Edge1 --> Map --> Edge2 --> KeyBy --> Edge3 --> Window --> Edge4 --> Sink
- The Source operator comes first
  - Find the start node: startNode is the Source operator, and currentNode is the StreamNode of the Source operator
  - Split the out-edges: chainableOutputs is [Edge1], nonChainableOutputs is []
  - Recursive calls:
    - Chainable: √ taken. Call createChain(Map,2,...,....); when the Map level returns, addAll its result, [Edge2], to transitiveOutEdges
    - Non-chainable, start a new chain: × not taken
  - Generate the chain display name from chainableOutputs: createChainedName() returns "Source->Map"
  - Register the chain metadata: chainInfo.addNodeToChain() registers "Source->Map" in chainInfo
  - Check whether currentNode is startNode, to create the JobVertex
    - Yes: √ taken. Create the JobVertex from the chainInfo metadata, i.e. merge "Source->Map" into a single JobVertex
    - No: × not taken
  - Check whether currentNode is startNode, to connect() two JobVertices
    - Yes: √ taken. Call connect() for every edge in transitiveOutEdges, which is the [Edge2] returned by the Map level. This connects the JobVertex of startNodeId (the merged "Source->Map" JobVertex) with the JobVertex of the edge's target operator (the merged "KeyBy->Window" JobVertex)
    - No: × not taken
  - Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
    - Empty: × not taken
    - Not empty: √ taken, nothing to do
  - Return the transitiveOutEdges of this level: [Edge2]
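- The Map operator is reached through the call createChain(Map,2,...,....) made at the Source level
  - Find the start node: startNode is the Source operator, and currentNode is the StreamNode of the Map operator
  - Split the out-edges: chainableOutputs is [], nonChainableOutputs is [Edge2]
  - Recursive calls:
    - Chainable: × not taken
    - Non-chainable, start a new chain: √ taken. transitiveOutEdges.add(Edge2), then call createChain(KeyBy,1,...,....) recursively to start a new chain
  - Generate the chain display name from chainableOutputs: the chainableOutputs of Map is [], so createChainedName() returns "Map"
  - Register the chain metadata: chainInfo.addNodeToChain() registers "Map" in chainInfo
  - Check whether currentNode is startNode, to create the JobVertex
    - Yes: × not taken
    - No: √ taken. A new, empty StreamConfig is created. Note: no JobVertex is built for the Map operator here, because it is a node inside a chain, not the startNode
  - Check whether currentNode is startNode, to connect() two JobVertices
    - Yes: × not taken
    - No: √ taken. The StreamConfig of the Map node is added to chainedConfigs, which is a Map<startNodeId,Map<currentNodeId,StreamConfig>>
  - Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
    - Empty: √ taken. The current node is the end of the chain Source->Map, so config.setChainEnd() is called
    - Not empty: × not taken
  - Return the transitiveOutEdges of this level: [Edge2]
This explains how Source->Map is chained: the transitiveOutEdges that the Map level returns to the Source level is [Edge2].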
- The KeyBy operator is reached through the call createChain(KeyBy,1,...,....) made at the Map level
  - Find the start node: startNode is the KeyBy operator, and currentNode is the StreamNode of the KeyBy operator
  - Split the out-edges: chainableOutputs is [Edge3], nonChainableOutputs is []
  - Recursive calls:
    - Chainable: √ taken. Call createChain(Window,2,...,...); when the Window level returns, addAll its result, [Edge4], to transitiveOutEdges
    - Non-chainable, start a new chain: × not taken
  - Generate the chain display name from chainableOutputs: the chainableOutputs of KeyBy is [Edge3], so createChainedName() returns "KeyBy->Window"
  - Register the chain metadata: chainInfo.addNodeToChain() registers "KeyBy->Window" in chainInfo
  - Check whether currentNode is startNode, to create the JobVertex
    - Yes: √ taken. Create the JobVertex from the chainInfo metadata, i.e. merge "KeyBy->Window" into a single JobVertex
    - No: × not taken
  - Check whether currentNode is startNode, to connect() two JobVertices
    - Yes: √ taken. Call connect() for every edge in transitiveOutEdges, which is the [Edge4] returned by the Window level. This connects the JobVertex of startNodeId (the merged "KeyBy->Window" JobVertex) with the JobVertex of the edge's target operator (the "Sink" JobVertex)
    - No: × not taken
  - Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
    - Empty: × not taken
    - Not empty: √ taken, nothing to do
  - Return the transitiveOutEdges of this level: [Edge4]
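- The Window operator is reached through the call createChain(Window,2,...,...) made at the KeyBy level
  - Find the start node: startNode is the KeyBy operator, and currentNode is the StreamNode of the Window operator
  - Split the out-edges: chainableOutputs is [], nonChainableOutputs is [Edge4]
  - Recursive calls:
    - Chainable: × not taken
    - Non-chainable, start a new chain: √ taken. transitiveOutEdges.add(Edge4), then call createChain(Sink,1,...,...) recursively to start a new chain
  - Generate the chain display name from chainableOutputs: the chainableOutputs of Window is [], so createChainedName() returns "Window"
  - Register the chain metadata: chainInfo.addNodeToChain() registers "Window" in chainInfo
  - Check whether currentNode is startNode, to create the JobVertex
    - Yes: × not taken
    - No: √ taken. A new, empty StreamConfig is created. Having seen Map, you can tell that Window is handled the same way: no JobVertex is created for either of them; instead each is merged into the JobVertex created at its startNode
  - Check whether currentNode is startNode, to connect() two JobVertices
    - Yes: × not taken
    - No: √ taken. The StreamConfig of the Window node is added to chainedConfigs, which is a Map<startNodeId,Map<currentNodeId,StreamConfig>>
  - Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
    - Empty: √ taken. The current node is the end of the chain KeyBy->Window, so config.setChainEnd() is called
    - Not empty: × not taken
  - Return the transitiveOutEdges of this level: [Edge4]
This explains how KeyBy->Window is chained: the transitiveOutEdges that the Window level returns to the KeyBy level is [Edge4].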
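- The Sink operator is reached through the call createChain(Sink,1,...,...) made at the Window level
  - Find the start node: startNode is the Sink operator, and currentNode is the StreamNode of the Sink operator
  - Split the out-edges: chainableOutputs is [], nonChainableOutputs is []
  - Recursive calls: neither branch is taken, because Sink is the last operator
    - Chainable: × not taken
    - Non-chainable, start a new chain: × not taken
  - Generate the chain display name from chainableOutputs: createChainedName() returns "Sink"
  - Register the chain metadata: chainInfo.addNodeToChain() registers "Sink" in chainInfo
  - Check whether currentNode is startNode, to create the JobVertex
    - Yes: √ taken. Create the JobVertex from the chainInfo metadata, i.e. a JobVertex for "Sink"
    - No: × not taken
  - Check whether currentNode is startNode, to connect() two JobVertices
    - Yes: √ taken, but transitiveOutEdges is empty at this point, so connect() is never called
    - No: × not taken
  - Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
    - Empty: √ taken. The current node is the end of the chain Sink, so config.setChainEnd() is called
    - Not empty: × not taken
  - Return the transitiveOutEdges of this level: []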
OK, at this point all chaining and connecting work is done. We end up with 3 JobVertices and 2 transitiveOutEdges:
JobVertex1: [Source->Map] | transitiveOutEdges: Edge2
JobVertex2: [KeyBy->Window] | transitiveOutEdges: Edge4
JobVertex3: [Sink] | no transitiveOutEdges
In connect(), each JobVertex is linked to its downstream JobVertex, which is found through transitiveOutEdges.
So the final JobGraph is: JobVertex1 -Edge2-> JobVertex2 -Edge4-> JobVertex3
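The trace above can be replayed with a toy model of the recursion. ToyChaining is a made-up class: nodes are plain integers, every edge carries a precomputed chainable flag in place of isChainable(), and "creating a JobVertex" is reduced to recording which edges leave each chain. It keeps the same order of operations as createChain() (chainable edges first, then non-chainable ones, then naming, then the head-only work), but it is a sketch for following the control flow, not Flink code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the createChain() recursion. */
final class ToyChaining {
    static final class Edge {
        final int id;
        final int target;
        final boolean chainable; // stands in for isChainable(edge, streamGraph)

        Edge(int id, int target, boolean chainable) {
            this.id = id;
            this.target = target;
            this.chainable = chainable;
        }
    }

    final Map<Integer, String> nodeNames = new HashMap<>();
    final Map<Integer, List<Edge>> outEdges = new HashMap<>();
    final Map<Integer, String> chainedNames = new HashMap<>();
    final Set<Integer> builtVertices = new HashSet<>();
    /** chain head id -> ids of the edges leaving that chain (the future JobEdges). */
    final Map<Integer, List<Integer>> jobEdges = new LinkedHashMap<>();

    void addNode(int id, String name) {
        nodeNames.put(id, name);
        outEdges.put(id, new ArrayList<>());
    }

    void addEdge(int edgeId, int source, int target, boolean chainable) {
        outEdges.get(source).add(new Edge(edgeId, target, chainable));
    }

    List<Edge> createChain(int startNodeId, int currentNodeId) {
        if (builtVertices.contains(startNodeId)) {
            return new ArrayList<>(); // this chain was already built
        }
        List<Edge> transitiveOutEdges = new ArrayList<>();
        List<Edge> chainable = new ArrayList<>();
        List<Edge> nonChainable = new ArrayList<>();
        for (Edge e : outEdges.get(currentNodeId)) {
            (e.chainable ? chainable : nonChainable).add(e);
        }
        for (Edge e : chainable) { // extend the current chain
            transitiveOutEdges.addAll(createChain(startNodeId, e.target));
        }
        for (Edge e : nonChainable) { // the edge leaves the chain; its target heads a new chain
            transitiveOutEdges.add(e);
            createChain(e.target, e.target);
        }
        List<String> childNames = new ArrayList<>();
        for (Edge e : chainable) {
            childNames.add(chainedNames.get(e.target));
        }
        String name = nodeNames.get(currentNodeId);
        if (childNames.size() == 1) {
            name += " -> " + childNames.get(0);
        } else if (childNames.size() > 1) {
            name += " -> (" + String.join(", ", childNames) + ")";
        }
        chainedNames.put(currentNodeId, name);
        if (currentNodeId == startNodeId) { // head of the chain: "create the vertex" and connect
            builtVertices.add(startNodeId);
            List<Integer> edgeIds = new ArrayList<>();
            for (Edge e : transitiveOutEdges) {
                edgeIds.add(e.id);
            }
            jobEdges.put(startNodeId, edgeIds);
        }
        return transitiveOutEdges;
    }

    /** Builds the five-operator example: ids 1..5 are Source, Map, KeyBy, Window, Sink. */
    static ToyChaining example() {
        ToyChaining g = new ToyChaining();
        String[] names = {"Source", "Map", "KeyBy", "Window", "Sink"};
        for (int i = 0; i < names.length; i++) {
            g.addNode(i + 1, names[i]);
        }
        g.addEdge(1, 1, 2, true);  // Edge1: Source -> Map, chainable
        g.addEdge(2, 2, 3, false); // Edge2: Map -> KeyBy, not chainable
        g.addEdge(3, 3, 4, true);  // Edge3: KeyBy -> Window, chainable
        g.addEdge(4, 4, 5, false); // Edge4: Window -> Sink, not chainable
        g.createChain(1, 1);
        return g;
    }
}
```

After ToyChaining.example(), builtVertices holds the three chain heads 1, 3 and 5, and jobEdges maps head 1 to [2], head 3 to [4] and head 5 to an empty list. The insertion order of jobEdges is Sink, KeyBy, Source, which reflects that the deepest chain is finished first, so the downstream vertex already exists by the time an upstream chain head connects to it.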