Flink-Graph-3. JobGraph Generation Source Code

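I. How the JobGraph is generated (source code)

0. Conclusion first

The StreamGraph is also converted into the JobGraph on the client. The conversion does the following:

Split the StreamNodes into operator chains (e.g. "Source->Map", "KeyBy->Window").
Convert each chain into a JobVertex.
Convert the StreamEdges recorded in transitiveOutEdges into JobEdges.
Create an IntermediateDataSet between JobEdge and JobVertex to connect them.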
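To make the last two points concrete, here is a minimal sketch of how the three kinds of objects could reference each other. It is a toy model, not Flink's classes: `Vertex`, `DataSet` and `Edge` are hypothetical stand-ins for JobVertex, IntermediateDataSet and JobEdge, and only the method name `connectNewDataSetAsInput` mirrors the call that appears later in connect().

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Flink classes): how a vertex, a data set and an edge reference each other. */
final class ToyJobGraphModel {

    static final class Vertex {
        final String name;
        final List<DataSet> produced = new ArrayList<>(); // data sets written by this vertex
        final List<Edge> inputs = new ArrayList<>();      // edges this vertex reads from

        Vertex(String name) {
            this.name = name;
        }

        /** Creates a new data set on the upstream vertex and consumes it through a new edge. */
        Edge connectNewDataSetAsInput(Vertex upstream) {
            DataSet dataSet = new DataSet(upstream);
            upstream.produced.add(dataSet);
            Edge edge = new Edge(dataSet, this);
            dataSet.consumers.add(edge);
            inputs.add(edge);
            return edge;
        }
    }

    static final class DataSet {
        final Vertex producer;
        final List<Edge> consumers = new ArrayList<>();

        DataSet(Vertex producer) {
            this.producer = producer;
        }
    }

    static final class Edge {
        final DataSet source;
        final Vertex target;

        Edge(DataSet source, Vertex target) {
            this.source = source;
            this.target = target;
        }
    }

    public static void main(String[] args) {
        Vertex head = new Vertex("Source -> Map");
        Vertex tail = new Vertex("KeyBy -> Window");
        Edge edge = tail.connectNewDataSetAsInput(head);
        System.out.println(edge.source.producer.name + " => " + edge.target.name);
    }
}
```

The point of the sketch is the direction of the references: the downstream vertex initiates the connection, the data set belongs to the upstream vertex, and the edge sits between the data set and the downstream vertex.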

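Preface:

The JobGraph is built on top of the StreamGraph. If you are not familiar with that, read Flink-Graph-2. StreamGraph generation source code first.
The walkthrough below is like a set of nesting dolls: one call leads to the next until we reach the code that does the real work. It is fairly involved, so read createChain() several times alongside the worked example at the end.

1. Start from StreamExecutionEnvironment again

When env.execute() is called, the JobGraph is generated from the StreamGraph that has already been built.

// 1. Entry point of env.execute(); it calls execute(String jobName) below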
public JobExecutionResult execute() throws Exception {
	return execute(getJobName());
}

// 2. Get the StreamGraph, then call execute(StreamGraph streamGraph)
public JobExecutionResult execute(String jobName) throws Exception {
	Preconditions.checkNotNull(jobName, "Streaming Job name should not be null.");

	return execute(getStreamGraph(jobName));
}

// 3. From here the JobGraph is generated from the StreamGraph via executeAsync(); some try-catch code is omitted for readability
public JobExecutionResult execute(StreamGraph streamGraph) throws Exception {
	final JobClient jobClient = executeAsync(streamGraph);

	final JobExecutionResult jobExecutionResult;

	if (configuration.getBoolean(DeploymentOptions.ATTACHED)) {
		jobExecutionResult = jobClient.getJobExecutionResult().get();
	} else {
		jobExecutionResult = new DetachedJobExecutionResult(jobClient.getJobID());
	}

	jobListeners.forEach(jobListener -> jobListener.onJobExecuted(jobExecutionResult, null));

	return jobExecutionResult;
}

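// 4. The method that is finally called: it checks that the StreamGraph is not null, then the executor obtained from executorFactory generates the JobGraph from the StreamGraph in its execute() method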
public JobClient executeAsync(StreamGraph streamGraph) throws Exception {
	checkNotNull(streamGraph, "StreamGraph cannot be null.");
	checkNotNull(configuration.get(DeploymentOptions.TARGET), "No execution.target specified in your configuration file.");
	
	// Pick the factory that matches the deployment target
	final PipelineExecutorFactory executorFactory =
		executorServiceLoader.getExecutorFactory(configuration);

	checkNotNull(
		executorFactory,
		"Cannot find compatible factory for specified execution.target (=%s)",
		configuration.get(DeploymentOptions.TARGET));

	// Key point: pick the matching executor and submit the job
	CompletableFuture<JobClient> jobClientFuture = executorFactory
		.getExecutor(configuration)
		.execute(streamGraph, configuration, userClassloader);

	JobClient jobClient = jobClientFuture.get();
	jobListeners.forEach(jobListener -> jobListener.onJobSubmitted(jobClient, null));
	return jobClient;
}

executorFactory has the type PipelineExecutorFactory, a factory interface with several implementation classes:

public interface PipelineExecutorFactory {
	String getName();
	boolean isCompatibleWith(final Configuration configuration);
	PipelineExecutor getExecutor(final Configuration configuration);
}

Take one of them, YarnJobClusterExecutorFactory. Its code is:

public class YarnJobClusterExecutorFactory implements PipelineExecutorFactory {

	@Override
	public String getName() {
		return YarnJobClusterExecutor.NAME;
	}

	@Override
	public boolean isCompatibleWith(@Nonnull final Configuration configuration) {
		return YarnJobClusterExecutor.NAME.equalsIgnoreCase(configuration.get(DeploymentOptions.TARGET));
	}
       
	@Override
	public PipelineExecutor getExecutor(@Nonnull final Configuration configuration) {
		try {
			// Key point
			return new YarnJobClusterExecutor();
		} catch (NoClassDefFoundError e) {
			throw new IllegalStateException(YarnDeploymentTarget.ERROR_MESSAGE);
		}
	}
}

getExecutor() creates a YarnJobClusterExecutor, whose code is:

public class YarnJobClusterExecutor extends AbstractJobClusterExecutor<ApplicationId, YarnClusterClientFactory> {
	public static final String NAME = YarnDeploymentTarget.PER_JOB.getName();
	public YarnJobClusterExecutor() {
		// Key point
		super(new YarnClusterClientFactory()); // calls the constructor of the parent class AbstractJobClusterExecutor
	}
}

So the one doing the work at this layer is AbstractJobClusterExecutor.
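The lookup that selects a factory by execution target can be pictured roughly as follows. This is a simplified sketch, assuming the factories are available as a plain list: `ToyFactory` and this `getExecutorFactory` are hypothetical stand-ins, the configuration is reduced to a `Map`, and the target names are only illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class ToyExecutorLookup {

    /** Hypothetical stand-in for PipelineExecutorFactory, reduced to two methods. */
    interface ToyFactory {
        String getName();

        default boolean isCompatibleWith(Map<String, String> configuration) {
            // Same idea as the YARN factory above: compare the name with execution.target.
            return getName().equalsIgnoreCase(configuration.get("execution.target"));
        }
    }

    static ToyFactory named(String name) {
        return () -> name;
    }

    /** Returns the single factory that accepts the configuration, or fails loudly. */
    static ToyFactory getExecutorFactory(List<ToyFactory> factories, Map<String, String> configuration) {
        ToyFactory match = null;
        for (ToyFactory factory : factories) {
            if (factory.isCompatibleWith(configuration)) {
                if (match != null) {
                    throw new IllegalStateException("Multiple compatible factories found");
                }
                match = factory;
            }
        }
        if (match == null) {
            throw new IllegalStateException(
                    "No compatible factory for execution.target=" + configuration.get("execution.target"));
        }
        return match;
    }

    public static void main(String[] args) {
        List<ToyFactory> factories = Arrays.asList(named("local"), named("remote"), named("yarn-per-job"));
        Map<String, String> configuration = Collections.singletonMap("execution.target", "yarn-per-job");
        System.out.println(getExecutorFactory(factories, configuration).getName());
    }
}
```

Each factory decides for itself whether it is compatible, so supporting a new deployment target only requires adding another factory to the list.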

2. Next, `AbstractJobClusterExecutor`

public class AbstractJobClusterExecutor<ClusterID, ClientFactory extends ClusterClientFactory<ClusterID>> implements PipelineExecutor {

	private static final Logger LOG = LoggerFactory.getLogger(AbstractJobClusterExecutor.class);

	private final ClientFactory clusterClientFactory;

	public AbstractJobClusterExecutor(@Nonnull final ClientFactory clusterClientFactory) {
		this.clusterClientFactory = checkNotNull(clusterClientFactory);
	}
	// Key point
	@Override
	public CompletableFuture<JobClient> execute(@Nonnull final Pipeline pipeline, @Nonnull final Configuration configuration, @Nonnull final ClassLoader userCodeClassloader) throws Exception {
		// Pipeline is an interface and StreamGraph implements it; the StreamGraph is passed in from executeAsync() in StreamExecutionEnvironment
		// 1. Call PipelineExecutorUtils.getJobGraph to convert the StreamGraph into a JobGraph
		final JobGraph jobGraph = PipelineExecutorUtils.getJobGraph(pipeline, configuration);

		/* 2. Cluster descriptor: creates and starts the YarnClient; holds YARN and Flink configuration and environment information */
		try (final ClusterDescriptor<ClusterID> clusterDescriptor = clusterClientFactory.createClusterDescriptor(configuration)) {
			final ExecutionConfigAccessor configAccessor = ExecutionConfigAccessor.fromConfiguration(configuration);

			/* 2.1 Cluster resource specification: JobManager memory, TaskManager memory, number of slots per TaskManager */
			final ClusterSpecification clusterSpecification = clusterClientFactory.getClusterSpecification(configuration);

			final ClusterClientProvider<ClusterID> clusterClientProvider = clusterDescriptor
					.deployJobCluster(clusterSpecification, jobGraph, configAccessor.getDetachedMode());
			LOG.info("Job has been submitted with JobID " + jobGraph.getJobID());

			return CompletableFuture.completedFuture(
					new ClusterClientJobClientAdapter<>(clusterClientProvider, jobGraph.getJobID(), userCodeClassloader));
		}
	}
}

OK, so the method that actually produces the JobGraph is PipelineExecutorUtils.getJobGraph. Moving on:

public class PipelineExecutorUtils {

	/**
	 * Creates the {@link JobGraph} corresponding to the provided {@link Pipeline}.
	 *
	 * @param pipeline the pipeline whose job graph we are computing
	 * @param configuration the configuration with the necessary information such as jars and
	 *                         classpaths to be included, the parallelism of the job and potential
	 *                         savepoint settings used to bootstrap its state.
	 * @return the corresponding {@link JobGraph}.
	 */
	public static JobGraph getJobGraph(@Nonnull final Pipeline pipeline, @Nonnull final Configuration configuration) throws MalformedURLException {
		// 1. Check again that the pipeline (the StreamGraph) and the configuration are not null; otherwise an exception is thrown
		checkNotNull(pipeline);
		checkNotNull(configuration);

		final ExecutionConfigAccessor executionConfigAccessor = ExecutionConfigAccessor.fromConfiguration(configuration);
		// 2. Delegate to FlinkPipelineTranslationUtil.getJobGraph, which returns the JobGraph
		final JobGraph jobGraph = FlinkPipelineTranslationUtil
				.getJobGraph(pipeline, configuration, executionConfigAccessor.getParallelism());
		
		// 3. The rest is configuration: set the fixed job ID, jars, classpaths and savepoint restore settings on the JobGraph
		configuration
				.getOptional(PipelineOptionsInternal.PIPELINE_FIXED_JOB_ID)
				.ifPresent(strJobID -> jobGraph.setJobID(JobID.fromHexString(strJobID)));

		jobGraph.addJars(executionConfigAccessor.getJars());
		jobGraph.setClasspaths(executionConfigAccessor.getClasspaths());
		jobGraph.setSavepointRestoreSettings(executionConfigAccessor.getSavepointRestoreSettings());
		// 4. Return the JobGraph
		return jobGraph;
	}
}

Next, FlinkPipelineTranslationUtil.getJobGraph():

public final class FlinkPipelineTranslationUtil {

	public static JobGraph getJobGraph(
			Pipeline pipeline,
			Configuration optimizerConfiguration,
			int defaultParallelism) {

		// FlinkPipelineTranslator is an interface; its implementations include StreamGraphTranslator and PlanTranslator

		FlinkPipelineTranslator pipelineTranslator = getPipelineTranslator(pipeline);
		// The chosen implementation then calls translateToJobGraph to turn the StreamGraph into a JobGraph
		return pipelineTranslator.translateToJobGraph(pipeline,
				optimizerConfiguration,
				defaultParallelism);
	}

	// ...
}

Next, StreamGraphTranslator:

public class StreamGraphTranslator implements FlinkPipelineTranslator {

	private static final Logger LOG = LoggerFactory.getLogger(StreamGraphTranslator.class);

	@Override
	public JobGraph translateToJobGraph(
			Pipeline pipeline,
			Configuration optimizerConfiguration,
			int defaultParallelism) {
		checkArgument(pipeline instanceof StreamGraph,
				"Given pipeline is not a DataStream StreamGraph.");

		StreamGraph streamGraph = (StreamGraph) pipeline;
		// Call getJobGraph on the StreamGraph
		return streamGraph.getJobGraph(null);
	}
	// ...
}

getJobGraph() in StreamGraph in turn calls createJobGraph() in StreamingJobGraphGenerator:

// 1.StreamGraph
public JobGraph getJobGraph(@Nullable JobID jobID) {  
    return StreamingJobGraphGenerator.createJobGraph(this, jobID);  
}

// 2.StreamingJobGraphGenerator
public static JobGraph createJobGraph(StreamGraph streamGraph, @Nullable JobID jobID) {     
   return new StreamingJobGraphGenerator(streamGraph, jobID).createJobGraph();
}

Good, we have finally found the one doing the real work: StreamingJobGraphGenerator.

3. The one doing the real work: `StreamingJobGraphGenerator`

(1) Relevant fields

private final StreamGraph streamGraph;  
// id -> JobVertex  
private final Map<Integer, JobVertex> jobVertices;  
private final JobGraph jobGraph;  
// ids of the JobVertex instances that have already been built
private final Collection<Integer> builtVertices;
// physical edges (edges inside a chain are excluded), in creation order
private final List<StreamEdge> physicalEdgesInOrder;
// chain information, used at deployment time to build the OperatorChain: startNodeId -> (currentNodeId -> StreamConfig)
private final Map<Integer, Map<Integer, StreamConfig>> chainedConfigs;
// configuration of every node: id -> StreamConfig
private final Map<Integer, StreamConfig> vertexConfigs;
// name of every node: id -> chainedName
private final Map<Integer, String> chainedNames;  
  
private final Map<Integer, ResourceSpec> chainedMinResources;  
private final Map<Integer, ResourceSpec> chainedPreferredResources;  
  
private final Map<Integer, InputOutputFormatContainer> chainedInputOutputFormats;  
  
private final StreamGraphHasher defaultStreamGraphHasher;  
private final List<StreamGraphHasher> legacyStreamGraphHashers;

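(2) Core logic

<1> `createJobGraph()`

All member fields of StreamingJobGraphGenerator exist to help build the final JobGraph. createJobGraph() first generates a unique hash id for every node. As long as a node is unchanged across submissions (including its parallelism and its upstream and downstream connections), this id stays the same. It is mainly used for failure recovery.

StreamNode.id cannot be used for this purpose, because it comes from a static counter that starts at 1, so the same job can end up with different ids. The two jobs below are identical, yet the ids of their sources differ.

// Example 1: A.id=1  B.id=2
DataStream<String> A = ...
DataStream<String> B = ...
A.union(B).print();
// Example 2: A.id=2  B.id=1
DataStream<String> B = ...
DataStream<String> A = ...
A.union(B).print();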
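The difference between a counter-based id and a structure-based hash can be reproduced with a small toy model. This is only a sketch of the idea, not Flink's hashing algorithm: `Node`, `structuralHash()` and `resetCounter()` are hypothetical names, and the hash here simply combines the operator name, the parallelism and the hashes of the inputs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

final class ToyNodeIds {

    // Process-wide counter, comparable to a static id generator.
    private static int idCounter = 0;

    static void resetCounter() {
        idCounter = 0; // simulates submitting the program again in a fresh JVM
    }

    static final class Node {
        final int id = ++idCounter; // depends on the order in which nodes are created
        final String operatorName;
        final int parallelism;
        final List<Node> inputs;

        Node(String operatorName, int parallelism, Node... inputs) {
            this.operatorName = operatorName;
            this.parallelism = parallelism;
            this.inputs = Arrays.asList(inputs);
        }

        /** Hash derived only from the node's own properties and its upstream structure. */
        int structuralHash() {
            int hash = Objects.hash(operatorName, parallelism);
            for (Node input : inputs) {
                hash = 31 * hash + input.structuralHash();
            }
            return hash;
        }
    }

    public static void main(String[] args) {
        // Program 1: A is declared before B.
        resetCounter();
        Node a1 = new Node("sourceA", 1);
        Node b1 = new Node("sourceB", 1);
        Node print1 = new Node("print", 1, a1, b1);

        // Program 2: the same job, but B is declared before A.
        resetCounter();
        Node b2 = new Node("sourceB", 1);
        Node a2 = new Node("sourceA", 1);
        Node print2 = new Node("print", 1, a2, b2);

        System.out.println("counter ids of A: " + a1.id + " vs " + a2.id);
        System.out.println("structural hashes equal: "
                + (print1.structuralHash() == print2.structuralHash()));
    }
}
```

The counter ids of A differ between the two programs (1 and 2), while the structural hashes of the whole pipeline are equal, which is the property needed to match state to operators across submissions.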

	private JobGraph createJobGraph() {
		preValidate();
		// 0. Configuration
		// In streaming mode the schedule mode starts all vertices together: EAGER
		jobGraph.setScheduleMode(streamGraph.getScheduleMode());
		jobGraph.enableApproximateLocalRecovery(streamGraph.getCheckpointConfig().isApproximateLocalRecoveryEnabled());

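		// Traverse the StreamGraph breadth-first and generate an operator hash id for every StreamNode
		// As long as the submitted topology is unchanged, the generated hashes are identical every time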
		Map<Integer, byte[]> hashes = defaultStreamGraphHasher.traverseStreamGraphAndGenerateHashes(streamGraph);

		List<Map<Integer, byte[]>> legacyHashes = new ArrayList<>(legacyStreamGraphHashers.size());
		for (StreamGraphHasher hasher : legacyStreamGraphHashers) {
			legacyHashes.add(hasher.traverseStreamGraphAndGenerateHashes(streamGraph));
		}

		// 1. The most important call: build the operator chains, create JobVertex, JobEdge, etc., and chain as many nodes together as possible
		setChaining(hashes, legacyHashes);

		// 2. Serialize the in-edges of each JobVertex into its StreamConfig (the out-edges were already written during setChaining)
		setPhysicalEdges();

		// 3. Assign each JobVertex to its SlotSharingGroup by group name, and set the CoLocationGroup for iteration heads and tails
		setSlotSharingAndCoLocation();
		// 4. Managed memory and checkpoint configuration
		setManagedMemoryFraction(
			Collections.unmodifiableMap(jobVertices),
			Collections.unmodifiableMap(vertexConfigs),
			Collections.unmodifiableMap(chainedConfigs),
			id -> streamGraph.getStreamNode(id).getManagedMemoryOperatorScopeUseCaseWeights(),
			id -> streamGraph.getStreamNode(id).getManagedMemorySlotScopeUseCases());

		configureCheckpointing();

		jobGraph.setSavepointRestoreSettings(streamGraph.getSavepointRestoreSettings());

		JobGraphUtils.addUserArtifactEntries(streamGraph.getUserArtifacts(), jobGraph);

		// set the ExecutionConfig last when it has been finalized
		try {
			// 5. Serialize the ExecutionConfig of the StreamGraph into the JobGraph configuration
			jobGraph.setExecutionConfig(streamGraph.getExecutionConfig());
		}
		catch (IOException e) {
			throw new IllegalConfigurationException("Could not serialize the ExecutionConfig." +
					"This indicates that non-serializable types (like custom serializers) were registered");
		}

		return jobGraph;
	}

<2> The call to `setChaining()`

private void setChaining(Map<Integer, byte[]> hashes, List<Map<Integer, byte[]>> legacyHashes) {
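	// 1. Determine the chain entry points. Rule: a chain starts at a source operator or at an operator that cannot be chained to its upstream. OperatorChainInfo holds the start node, the operator hashes and other chain metadata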
	final Map<Integer, OperatorChainInfo> chainEntryPoints = buildChainedInputsAndGetHeadInputs(hashes, legacyHashes);
	final Collection<OperatorChainInfo> initialEntryPoints = new ArrayList<>(chainEntryPoints.values());


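	// 2. Build the node chains (operator chains) from the entry points
	for (OperatorChainInfo info : initialEntryPoints) {
		// Build the node chain; createChain returns the physical out-edges. startNodeId != currentNodeId means currentNode is a child node inside the chain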
		createChain(
				info.getStartNodeId(),
				1,  // operators start at position 1 because 0 is for chained source inputs
				info,
				chainEntryPoints);
	}
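}

<3> The recursive `createChain()` called by `setChaining()`

The operator chains are actually created by createChain().

1° The recursive code of `createChain()` (the key part)

It has 9 steps, which the worked example in 2° at the end walks through.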

// Build the node chains and return the physical out-edges of the current node
// startNodeId != currentNodeId means currentNode is a child node inside the chain
private List<StreamEdge> createChain(
		final Integer currentNodeId, 			// id of the operator node being processed
		final int chainIndex, 					// position of this operator within the current chain
		final OperatorChainInfo chainInfo,		// metadata of the chain
		final Map<Integer, OperatorChainInfo> chainEntryPoints) // map of chain entry points
{
	// 1. Get the start node of the chain
	Integer startNodeId = chainInfo.getStartNodeId();
	// Avoid building the chain for the same start node twice
	if (!builtVertices.contains(startNodeId)) {
		/* Key point: transitiveOutEdges collects two kinds of edges (the flow is subtle, see the worked example below)
		 *  1. the external edges returned by the recursive calls for the chained children
		 *  2. the non-chainable out-edges of the current node
		 * Take the topology: Source --> Edge1 --> Map --> Edge2 --> KeyBy --> Edge3 --> Window --> Edge4 --> Sink
		 * Chaining yields:
		 * 	Chain1: [Source + Map]
		 * 	Chain2: [KeyBy + Window]
		 * 	Chain3: [Sink]
		 * The resulting transitiveOutEdges per chain are:
		 * chain1 (starting at Source): [Edge2]
		 * chain2 (starting at KeyBy): [Edge4]
		 * chain3 (starting at Sink): []
		 * */
		List<StreamEdge> transitiveOutEdges = new ArrayList<StreamEdge>(); // transitive out-edges: all external output edges of the current chain; connect() below turns them into physical connections

		List<StreamEdge> chainableOutputs = new ArrayList<StreamEdge>(); // out-edges that can be chained

		List<StreamEdge> nonChainableOutputs = new ArrayList<StreamEdge>(); // out-edges that cannot be chained

		StreamNode currentNode = streamGraph.getStreamNode(currentNodeId);

		// 2. Split the out-edges of the current node into chainable and non-chainable
		for (StreamEdge outEdge : currentNode.getOutEdges()) {
			// Can the current node and the downstream operator be chained?
			if (isChainable(outEdge, streamGraph)) {
				chainableOutputs.add(outEdge);
			} else {
				nonChainableOutputs.add(outEdge);
			}
		}

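		// 3.1 Extend the current chain recursively: follow each chainable out-edge and increment the position index
		for (StreamEdge chainable : chainableOutputs) {
			// Think of this as walking a linked list
			transitiveOutEdges.addAll(
					createChain(chainable.getTargetId(),  // id of the downstream operator
						chainIndex + 1, 				  // chain position + 1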
						chainInfo,
						chainEntryPoints));
		}

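		// 3.2 Start new chains: each non-chainable out-edge ends the current chain and makes its target operator the start of a new chain
		for (StreamEdge nonChainable : nonChainableOutputs) {
			transitiveOutEdges.add(nonChainable); // record the non-chainable out-edge
			// start a new chain
			createChain(
					nonChainable.getTargetId(), // downstream operator of this out-edge
					1, 							// that operator is the start of the new chain
					chainEntryPoints.computeIfAbsent( // ensures each operator starts at most one chain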
						nonChainable.getTargetId(),
						(k) -> chainInfo.newChain(nonChainable.getTargetId())),
					chainEntryPoints);
		}

		// 4. Build the display name of the chain from the current node onward, e.g. "Source -> Map"
		// In the example above, chainedNames is a map that stores (id of the Source operator, "Source -> Map")
		chainedNames.put(currentNodeId, createChainedName(currentNodeId, chainableOutputs, Optional.ofNullable(chainEntryPoints.get(currentNodeId))));
		chainedMinResources.put(currentNodeId, createChainedMinResources(currentNodeId, chainableOutputs));
		chainedPreferredResources.put(currentNodeId, createChainedPreferredResources(currentNodeId, chainableOutputs));

		// 5. Register the current operator in the chain, producing its key metadata, e.g. (id of the Source operator, "Source -> Map")
		OperatorID currentOperatorId = chainInfo.addNodeToChain(currentNodeId, chainedNames.get(currentNodeId));

		if (currentNode.getInputFormat() != null) {
			getOrCreateFormatContainer(startNodeId).addInputFormat(currentOperatorId, currentNode.getInputFormat());
		}

		if (currentNode.getOutputFormat() != null) {
			getOrCreateFormatContainer(startNodeId).addOutputFormat(currentOperatorId, currentNode.getOutputFormat());
		}

		// Create the JobVertex from that metadata; in the example above, createJobVertex creates one JobVertex for [Source + Map]
		// 6.1 If the current node is the start node, create the JobVertex and return its StreamConfig; otherwise create an empty StreamConfig
		StreamConfig config = currentNodeId.equals(startNodeId)
				? createJobVertex(startNodeId, chainInfo)
				: new StreamConfig(new Configuration());

		// 6.2 Fill in the StreamConfig, essentially serializing the settings of the StreamNode into it
		setVertexConfig(currentNodeId, config, chainableOutputs, nonChainableOutputs, chainInfo.getChainedSources());

		// 7. If this is the start node of the chain, mark it as chain start (a node that is not chained with anything is also marked as chain start)
		if (currentNodeId.equals(startNodeId)) {
			// settings for the start node
			config.setChainStart();
			config.setChainIndex(chainIndex);
			config.setOperatorName(streamGraph.getStreamNode(currentNodeId).getOperatorName());

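			// 7.1 Connect this start node with all transitive out-edges (the physical connections)
			for (StreamEdge edge : transitiveOutEdges) {
				// Build a JobEdge from the StreamEdge and create the IntermediateDataSet that links JobVertex and JobEdge
				// In effect this connects the JobVertex of two different chains, e.g. [Source->Map] --> [KeyBy->Window] --> [Sink]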
				connect(startNodeId, edge);
			}

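			// 7.2 Write the physical out-edges into the config; they are used at deployment time
			config.setOutEdgesInOrder(transitiveOutEdges);
			// 7.3 Write the StreamConfig of every child node in the chain into the CHAINED_TASK_CONFIG of the headOfChain node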
			config.setTransitiveChainedTaskConfigs(chainedConfigs.get(startNodeId));

		} else {
			// 7.4 Otherwise this is a child node of the chain: store its config so that every operator of the chain is recorded
			// e.g. when the Map level runs, startNode is Source; if chainedConfigs has no such key yet, a map is created
			chainedConfigs.computeIfAbsent(startNodeId, k -> new HashMap<Integer, StreamConfig>());
			// configure the Map node
			config.setChainIndex(chainIndex);
			StreamNode node = streamGraph.getStreamNode(currentNodeId);
			config.setOperatorName(node.getOperatorName());
			// add the StreamConfig of the current node (Map) to the config map of its chain (Source->Map)
			chainedConfigs.get(startNodeId).put(currentNodeId, config);
		}

		config.setOperatorID(currentOperatorId);
		// 8. Check whether the current node is the end of this chain
		if (chainableOutputs.isEmpty()) {
			config.setChainEnd();
		}
		// 9. Return the out-edges that leave the chain
		return transitiveOutEdges;

	} else {
		return new ArrayList<>();
	}
}

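《1》 `isChainable()`, called by createChain()

Purpose: decide whether the current operator can be chained with the operator at the other end of an out-edge.

All of the following rules must hold:

The downstream operator has exactly one in-edge
The upstream and downstream operators are in the same slot sharing group
The upstream and downstream operators satisfy the chaining strategy requirements
The partitioner of the edge between them is a ForwardPartitioner
The shuffle mode of the edge is not BATCH
The upstream and downstream operators have the same parallelism
Chaining is enabled globally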
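The rules above amount to one conjunction of conditions. The sketch below expresses them over a deliberately simplified model: `Op`, `Edge` and their fields are hypothetical, and the chaining strategy check is reduced to a single boolean per operator, which is an assumption made for brevity.

```java
final class ToyChainRules {

    /** Hypothetical, simplified view of an operator. */
    static final class Op {
        final String slotSharingGroup;
        final int parallelism;
        final boolean allowsChaining; // stands in for the chaining strategy check

        Op(String slotSharingGroup, int parallelism, boolean allowsChaining) {
            this.slotSharingGroup = slotSharingGroup;
            this.parallelism = parallelism;
            this.allowsChaining = allowsChaining;
        }
    }

    /** Hypothetical, simplified view of an edge between two operators. */
    static final class Edge {
        final Op source;
        final Op target;
        final boolean forwardPartitioner;
        final boolean batchShuffle;
        final int targetInEdgeCount;

        Edge(Op source, Op target, boolean forwardPartitioner, boolean batchShuffle, int targetInEdgeCount) {
            this.source = source;
            this.target = target;
            this.forwardPartitioner = forwardPartitioner;
            this.batchShuffle = batchShuffle;
            this.targetInEdgeCount = targetInEdgeCount;
        }
    }

    /** True only if every rule holds; a single failing rule breaks the chain at this edge. */
    static boolean isChainable(Edge edge, boolean chainingEnabled) {
        return edge.targetInEdgeCount == 1
                && edge.source.slotSharingGroup.equals(edge.target.slotSharingGroup)
                && edge.source.allowsChaining
                && edge.target.allowsChaining
                && edge.forwardPartitioner
                && !edge.batchShuffle
                && edge.source.parallelism == edge.target.parallelism
                && chainingEnabled;
    }

    public static void main(String[] args) {
        Op source = new Op("default", 2, true);
        Op map = new Op("default", 2, true);
        Op window = new Op("default", 4, true);
        // Forward edge, same parallelism, single input: chainable.
        System.out.println(isChainable(new Edge(source, map, true, false, 1), true));
        // Not a forward edge and different parallelism: not chainable.
        System.out.println(isChainable(new Edge(map, window, false, false, 1), true));
    }
}
```

Running `main` prints `true` for the Source to Map edge and `false` for the Map to Window edge.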

public static boolean isChainable(StreamEdge edge, StreamGraph streamGraph) {
	// The StreamNode downstream of the out-edge
	StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);
	// Conditions: the downstream node has exactly one in-edge, and isChainableInput holds
	return downStreamVertex.getInEdges().size() == 1
			&& isChainableInput(edge, streamGraph);
}

private static boolean isChainableInput(StreamEdge edge, StreamGraph streamGraph) {
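	// StreamNode upstream of the edge
	StreamNode upStreamVertex = streamGraph.getSourceVertex(edge);
	// StreamNode downstream of the edge
	StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);
	/* All of the following rules must hold
	*  1. the upstream and downstream operators are in the same slot sharing group
	*  2. the upstream and downstream operators satisfy the chaining strategy requirements
	*  3. the partitioner of the edge between them is a ForwardPartitioner
	*  4. the shuffle mode of the edge is not BATCH
	*  5. the upstream and downstream operators have the same parallelism
	*  6. chaining is enabled
	* */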
	if (!(upStreamVertex.isSameSlotSharingGroup(downStreamVertex)
		&& areOperatorsChainable(upStreamVertex, downStreamVertex, streamGraph)
		&& (edge.getPartitioner() instanceof ForwardPartitioner)
		&& edge.getShuffleMode() != ShuffleMode.BATCH
		&& upStreamVertex.getParallelism() == downStreamVertex.getParallelism()
		&& streamGraph.isChainingEnabled())) {

		return false;
	}

	// ... (the remaining checks of the method are omitted in this excerpt)
	return true;
}

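《2》 `createChainedName()`, called by createChain()

Purpose: return the name of the chain; for an operator that cannot be chained any further, return just the operator name. For example "Source->Map" and "Map".

The result is passed to addNodeToChain.

// How createChain calls it: chainedNames.put(currentNodeId, createChainedName(currentNodeId, chainableOutputs, Optional.ofNullable(chainEntryPoints.get(currentNodeId))));
// e.g. if Source->Map can be chained, it returns "Source->Map"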
private String createChainedName(Integer vertexID, List<StreamEdge> chainedOutputs, Optional<OperatorChainInfo> operatorChainInfo) {
	final String operatorName = nameWithChainedSourcesInfo(
		streamGraph.getStreamNode(vertexID).getOperatorName(),
		operatorChainInfo.map(chain -> chain.getChainedSources().values()).orElse(Collections.emptyList()));
	if (chainedOutputs.size() > 1) {
		List<String> outputChainedNames = new ArrayList<>();
		for (StreamEdge chainable : chainedOutputs) {
			outputChainedNames.add(chainedNames.get(chainable.getTargetId()));
		}
		return operatorName + " -> (" + StringUtils.join(outputChainedNames, ", ") + ")";
	} else if (chainedOutputs.size() == 1) {
		return operatorName + " -> " + chainedNames.get(chainedOutputs.get(0).getTargetId());
	} else {
		return operatorName;
	}
}
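The naming rule can be tried out in isolation with a small recursive sketch. It is a simplification under stated assumptions: nodes are plain strings, the hypothetical `chainableOutputs` map replaces the StreamEdge lists, and the name of a downstream node is computed by recursion instead of being looked up in chainedNames.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ToyChainedName {

    /** chainableOutputs maps a node to the downstream nodes it can be chained with. */
    static String chainedName(String node, Map<String, List<String>> chainableOutputs) {
        List<String> outputs = chainableOutputs.getOrDefault(node, Collections.emptyList());
        if (outputs.size() > 1) {
            // Several chained outputs: group their names in parentheses.
            List<String> names = new ArrayList<>();
            for (String output : outputs) {
                names.add(chainedName(output, chainableOutputs));
            }
            return node + " -> (" + String.join(", ", names) + ")";
        } else if (outputs.size() == 1) {
            return node + " -> " + chainedName(outputs.get(0), chainableOutputs);
        } else {
            // End of the chain: just the operator name.
            return node;
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> chainableOutputs = new HashMap<>();
        chainableOutputs.put("Source", Arrays.asList("Map"));
        chainableOutputs.put("Map", Arrays.asList("Filter", "Sink"));
        System.out.println(chainedName("Source", chainableOutputs));
    }
}
```

For this input the sketch prints `Source -> Map -> (Filter, Sink)`, which shows all three branches of the method at once.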

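《3》 `addNodeToChain()`, called by createChain()

Purpose: register the current node in the chain under the name returned by createChainedName; createJobVertex uses this later.

The result is used by createJobVertex.

// How createChain calls it: chainInfo.addNodeToChain(SourceID, "Source -> Map")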
private OperatorID addNodeToChain(int currentNodeId, String operatorName) {
	// 1. Get or create the list of operator hashes of this chain
	List<Tuple2<byte[], byte[]>> operatorHashes =
			chainedOperatorHashes.computeIfAbsent(startNodeId, k -> new ArrayList<>());
	// 2. Get the primary hash of the current operator
	byte[] primaryHashBytes = hashes.get(currentNodeId);
	// 3. Pair it with each legacy hash (compatibility with older versions)
	for (Map<Integer, byte[]> legacyHash : legacyHashes) {
		operatorHashes.add(new Tuple2<>(primaryHashBytes, legacyHash.get(currentNodeId)));
	}
	// 4. Register the OperatorCoordinator provider of the operator, if it has one
	streamGraph
			.getStreamNode(currentNodeId) // e.g. the id of the Source operator
			.getCoordinatorProvider(operatorName, new OperatorID(getHash(currentNodeId))) // operatorName is e.g. "Source -> Map"
			.map(coordinatorProviders::add);
	// 5. Return the OperatorID (based on the primary hash)
	return new OperatorID(primaryHashBytes);
}

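《4》 `createJobVertex()`, called by createChain()

Purpose: turn an operator chain into a physical node, a JobVertex, based on the metadata registered in addNodeToChain. A chain such as Source->Map becomes one physical node.

// Turn the operator chain into a JobVertex, based on the metadata registered in addNodeToChain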
private StreamConfig createJobVertex(
		Integer streamNodeId,
		OperatorChainInfo chainInfo) {
	// 1. Get the StreamNode, e.g. Source or Map
	JobVertex jobVertex;
	StreamNode streamNode = streamGraph.getStreamNode(streamNodeId);

	byte[] hash = chainInfo.getHash(streamNodeId); // hash of the StreamNode

	if (hash == null) {
		throw new IllegalStateException("Cannot find node hash. " +
				"Did you generate them before calling this method?");
	}

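	JobVertexID jobVertexId = new JobVertexID(hash); // derive the vertex id from the hash
	// 2. Convert the hashes of the chained operators into a list of OperatorIDPair
	// Get the operator hashes of the chain: f0 is the current hash of the operator (used to generate the OperatorID); f1 is its legacy hash (for version compatibility)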
	List<Tuple2<byte[], byte[]>> chainedOperators = chainInfo.getChainedOperatorHashes(streamNodeId);
	List<OperatorIDPair> operatorIDPairs = new ArrayList<>();
	if (chainedOperators != null) {
		for (Tuple2<byte[], byte[]> chainedOperator : chainedOperators) {
			OperatorID userDefinedOperatorID = chainedOperator.f1 == null ? null : new OperatorID(chainedOperator.f1);
			operatorIDPairs.add(OperatorIDPair.of(new OperatorID(chainedOperator.f0), userDefinedOperatorID));
		}
	}

	// 3. Create the physical node, the JobVertex; e.g. Source->Map becomes a single JobVertex
	if (chainedInputOutputFormats.containsKey(streamNodeId)) {
		jobVertex = new InputOutputFormatVertex(
				chainedNames.get(streamNodeId), // chain name (e.g. "Source -> Map")
				jobVertexId,
				operatorIDPairs);

		chainedInputOutputFormats
			.get(streamNodeId)
			.write(new TaskConfig(jobVertex.getConfiguration()));
	} else {
		jobVertex = new JobVertex(
				chainedNames.get(streamNodeId), // chain name (e.g. "Source -> Map")
				jobVertexId,
				operatorIDPairs);
	}

	// 4. Register the operator coordinators of the chain on the JobVertex
	for (OperatorCoordinator.Provider coordinatorProvider : chainInfo.getCoordinatorProviders()) {
		try {
			jobVertex.addOperatorCoordinator(new SerializedValue<>(coordinatorProvider));
		} catch (IOException e) {
			throw new FlinkRuntimeException(String.format(
					"Coordinator Provider for node %s is not serializable.", chainedNames.get(streamNodeId)), e);
		}
	}
	// 5. Resources, invokable class and parallelism
	jobVertex.setResources(chainedMinResources.get(streamNodeId), chainedPreferredResources.get(streamNodeId));

	jobVertex.setInvokableClass(streamNode.getJobVertexClass());

	int parallelism = streamNode.getParallelism();

	if (parallelism > 0) {
		jobVertex.setParallelism(parallelism);
	} else {
		parallelism = jobVertex.getParallelism();
	}

	jobVertex.setMaxParallelism(streamNode.getMaxParallelism());

	if (LOG.isDebugEnabled()) {
		LOG.debug("Parallelism set: {} for {}", parallelism, streamNodeId);
	}

	// TODO: inherit InputDependencyConstraint from the head operator
	jobVertex.setInputDependencyConstraint(streamGraph.getExecutionConfig().getDefaultInputDependencyConstraint());
	// 6. Add the JobVertex to jobVertices and jobGraph
	jobVertices.put(streamNodeId, jobVertex);
	builtVertices.add(streamNodeId);
	jobGraph.addVertex(jobVertex);
	// 7. Return the StreamConfig of the JobVertex just created
	return new StreamConfig(jobVertex.getConfiguration());
}

《5》 `connect()`, called by createChain()

Purpose: connect the JobVertex of two chains with a JobEdge, e.g. Chain1[Source->Map] --JobEdge-> Chain2[KeyBy->Window]

private void connect(Integer headOfChain, StreamEdge edge)  {
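	// Suppose connect is called for the Source chain: headOfChain is the id of the Source operator, and edge is Edge2, the out-edge between Map and KeyBy
	// 1. Record the order of the physical edges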
	physicalEdgesInOrder.add(edge);

	// 2. Get the JobVertex of the chain head and of the downstream operator
	// id of the downstream operator
	Integer downStreamVertexID = edge.getTargetId();

	JobVertex headVertex = jobVertices.get(headOfChain); // upstream vertex, e.g. the merged JobVertex "Source->Map"
	JobVertex downStreamVertex = jobVertices.get(downStreamVertexID); // downstream vertex, e.g. the JobVertex whose chain starts at KeyBy

	StreamConfig downStreamConfig = new StreamConfig(downStreamVertex.getConfiguration());

	downStreamConfig.setNumberOfNetworkInputs(downStreamConfig.getNumberOfNetworkInputs() + 1); // one more network input for the downstream vertex (KeyBy)

	// 3. Get the partitioner of the out-edge; for Edge2 this is a hash-based partitioner (KeyGroupStreamPartitioner for keyBy)
	StreamPartitioner<?> partitioner = edge.getPartitioner();

	// 4. Determine the result partition type; it is almost always a PIPELINED type
	ResultPartitionType resultPartitionType;
	switch (edge.getShuffleMode()) {
		case PIPELINED:
			resultPartitionType = ResultPartitionType.PIPELINED_BOUNDED;
			break;
		case BATCH:
			resultPartitionType = ResultPartitionType.BLOCKING;
			break;
		case UNDEFINED:
			resultPartitionType = determineResultPartitionType(partitioner);
			break;
		default:
			throw new UnsupportedOperationException("Data exchange mode " +
				edge.getShuffleMode() + " is not supported yet.");
	}

	checkAndResetBufferTimeout(resultPartitionType, edge);
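	// 5. Create the JobEdge and connect the two JobVertex instances. This is not an edge inside a chain but a physical edge between chains, e.g. Chain1[Source->Map] --JobEdge-> Chain2[KeyBy->Window]
	JobEdge jobEdge;
	// Pointwise distribution, e.g. ForwardPartitioner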
	if (isPointwisePartitioner(partitioner)) {
		jobEdge = downStreamVertex.connectNewDataSetAsInput(
			headVertex,
			DistributionPattern.POINTWISE,
			resultPartitionType);
	} else { // All-to-all distribution, e.g. KeyGroupStreamPartitioner or RebalancePartitioner; the example above takes this branch
		jobEdge = downStreamVertex.connectNewDataSetAsInput(
			headVertex,
			DistributionPattern.ALL_TO_ALL,
			resultPartitionType);
	}
	// set strategy name so that web interface can show it.
	jobEdge.setShipStrategyName(partitioner.toString());
	jobEdge.setDownstreamSubtaskStateMapper(partitioner.getDownstreamSubtaskStateMapper());
	jobEdge.setUpstreamSubtaskStateMapper(partitioner.getUpstreamSubtaskStateMapper());

	if (LOG.isDebugEnabled()) {
		LOG.debug("CONNECTED: {} - {} -> {}", partitioner.getClass().getSimpleName(),
			headOfChain, downStreamVertexID);
	}
}
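What the two distribution patterns mean for the subtasks can be illustrated with a toy calculation. This is a sketch under simplifying assumptions: `Pattern` and `inputsPerDownstreamSubtask` are hypothetical names, and the pointwise case only handles equal parallelism on both sides, whereas real pointwise connections also cover unequal parallelism.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyDistributionPattern {

    enum Pattern { POINTWISE, ALL_TO_ALL }

    /** For each downstream subtask, lists the upstream subtasks it reads from. */
    static List<List<Integer>> inputsPerDownstreamSubtask(
            Pattern pattern, int upstreamParallelism, int downstreamParallelism) {
        List<List<Integer>> result = new ArrayList<>();
        for (int down = 0; down < downstreamParallelism; down++) {
            List<Integer> upstreamSubtasks = new ArrayList<>();
            if (pattern == Pattern.ALL_TO_ALL) {
                // Every downstream subtask reads from every upstream subtask.
                for (int up = 0; up < upstreamParallelism; up++) {
                    upstreamSubtasks.add(up);
                }
            } else {
                // Simplification: only the one-to-one case is modeled here.
                if (upstreamParallelism != downstreamParallelism) {
                    throw new IllegalArgumentException("this sketch only models equal parallelism");
                }
                upstreamSubtasks.add(down);
            }
            result.add(upstreamSubtasks);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(inputsPerDownstreamSubtask(Pattern.POINTWISE, 2, 2));
        System.out.println(inputsPerDownstreamSubtask(Pattern.ALL_TO_ALL, 2, 3));
    }
}
```

The first call prints `[[0], [1]]` and the second prints `[[0, 1], [0, 1], [0, 1]]`: a keyed exchange needs the second shape because any upstream subtask may hold records for any key.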

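2° Worked example

Take the topology: Source --> Edge1 --> Map --> Edge2 --> KeyBy --> Edge3 --> Window --> Edge4 --> Sink

The Source operator comes first
1. Find the startNode: it is the Source operator; currentNode is the StreamNode of the Source operator
2. Split the out-edges: chainableOutputs is [Edge1], nonChainableOutputs is []
3. Recursive calls:
  - Chainable, extend the chain: √ taken. createChain(Map,2,...,....) is called; when the Map level returns, its result [Edge2] is added to transitiveOutEdges via addAll
  - Not chainable, start a new chain: × not taken
4. Build the display name of the chain from chainableOutputs: createChainedName() returns "Source->Map"
5. Register the chain metadata: chainInfo.addNodeToChain() registers "Source->Map" in chainInfo
6. Check whether currentNode is the startNode, to create the JobVertex
  - Yes: √ taken. The JobVertex is created from the chainInfo metadata, i.e. "Source->Map" is merged into a single JobVertex
  - No: × not taken
7. Check whether currentNode is the startNode, to connect() two JobVertex instances
  - Yes: √ taken. connect() is called for every edge in transitiveOutEdges, which is the [Edge2] returned by the Map level. It links the JobVertex of startNodeId (the merged "Source->Map") to the JobVertex of the target operator of the edge (the merged "KeyBy->Window")
  - No: × not taken
8. Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
  - Empty: × not taken
  - Not empty: √ taken, nothing to do
9. Return the transitiveOutEdges of this level: [Edge2]

The Map operator is reached through the createChain(Map,2,...,....) call made at the Source level
1. Find the startNode: it is the Source operator; currentNode is the StreamNode of the Map operator
2. Split the out-edges: chainableOutputs is [], nonChainableOutputs is [Edge2]
3. Recursive calls:
  - Chainable, extend the chain: × not taken
  - Not chainable, start a new chain: √ taken. transitiveOutEdges.add(Edge2), then createChain(KeyBy,1,...,....) is called recursively to start a new chain
4. Build the display name of the chain from chainableOutputs: the chainableOutputs of Map is [], so createChainedName() returns "Map"
5. Register the chain metadata: chainInfo.addNodeToChain() registers "Map" in chainInfo
6. Check whether currentNode is the startNode, to create the JobVertex
  - Yes: × not taken
  - No: √ taken. A new empty StreamConfig is created. Note that no JobVertex is built for the Map operator, because it is a node inside the chain and not the startNode
7. Check whether currentNode is the startNode, to connect() two JobVertex instances
  - Yes: × not taken
  - No: √ taken. The StreamConfig of the Map node is added to chainedConfigs, which is a Map<startNodeId,Map<currentNodeId,StreamConfig>>
8. Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
  - Empty: √ taken. The current node is the end of the chain Source->Map, so config.setChainEnd() is called
  - Not empty: × not taken
9. Return the transitiveOutEdges of this level: [Edge2]

This explains the chaining of Source->Map: the transitiveOutEdges that the Map level returns to the Source level is [Edge2].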

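The KeyBy operator is reached through the createChain(KeyBy,1,...,....) call made at the Map level
1. Find the startNode: it is the KeyBy operator; currentNode is the StreamNode of the KeyBy operator
2. Split the out-edges: chainableOutputs is [Edge3], nonChainableOutputs is []
3. Recursive calls:
  - Chainable, extend the chain: √ taken. createChain(Window,2,...,...) is called; when the Window level returns, its result [Edge4] is added to transitiveOutEdges via addAll
  - Not chainable, start a new chain: × not taken
4. Build the display name of the chain from chainableOutputs: the chainableOutputs of KeyBy is [Edge3], so createChainedName() returns "KeyBy->Window"
5. Register the chain metadata: chainInfo.addNodeToChain() registers "KeyBy->Window" in chainInfo
6. Check whether currentNode is the startNode, to create the JobVertex
  - Yes: √ taken. The JobVertex is created from the chainInfo metadata, i.e. "KeyBy->Window" is merged into a single JobVertex
  - No: × not taken
7. Check whether currentNode is the startNode, to connect() two JobVertex instances
  - Yes: √ taken. connect() is called for every edge in transitiveOutEdges, which is the [Edge4] returned by the Window level. It links the JobVertex of startNodeId (the merged "KeyBy->Window") to the JobVertex of the target operator of the edge (the "Sink" JobVertex)
  - No: × not taken
8. Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
  - Empty: × not taken
  - Not empty: √ taken, nothing to do
9. Return the transitiveOutEdges of this level: [Edge4]

The Window operator is reached through the createChain(Window,2,...,...) call made at the KeyBy level
1. Find the startNode: it is the KeyBy operator; currentNode is the StreamNode of the Window operator
2. Split the out-edges: chainableOutputs is [], nonChainableOutputs is [Edge4]
3. Recursive calls:
  - Chainable, extend the chain: × not taken
  - Not chainable, start a new chain: √ taken. transitiveOutEdges.add(Edge4), then createChain(Sink,1,...,...) is called recursively to start a new chain
4. Build the display name of the chain from chainableOutputs: the chainableOutputs of Window is [], so createChainedName() returns "Window"
5. Register the chain metadata: chainInfo.addNodeToChain() registers "Window" in chainInfo
6. Check whether currentNode is the startNode, to create the JobVertex
  - Yes: × not taken
  - No: √ taken. A new empty StreamConfig is created. As with Map, no JobVertex is created for Window itself; each of them is merged into the JobVertex created at its startNode
7. Check whether currentNode is the startNode, to connect() two JobVertex instances
  - Yes: × not taken
  - No: √ taken. The StreamConfig of the Window node is added to chainedConfigs, which is a Map<startNodeId,Map<currentNodeId,StreamConfig>>
8. Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
  - Empty: √ taken. The current node is the end of the chain KeyBy->Window, so config.setChainEnd() is called
  - Not empty: × not taken
9. Return the transitiveOutEdges of this level: [Edge4]

This explains the chaining of KeyBy->Window: the transitiveOutEdges that the Window level returns to the KeyBy level is [Edge4].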

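The Sink operator is reached through the createChain(Sink,1,...,...) call made at the Window level
1. Find the startNode: it is the Sink operator; currentNode is the StreamNode of the Sink operator
2. Split the out-edges: chainableOutputs is [], nonChainableOutputs is []
3. Recursive calls: neither branch is taken, because Sink is the last operator
  - Chainable, extend the chain: × not taken
  - Not chainable, start a new chain: × not taken
4. Build the display name of the chain from chainableOutputs: createChainedName() returns "Sink"
5. Register the chain metadata: chainInfo.addNodeToChain() registers "Sink" in chainInfo
6. Check whether currentNode is the startNode, to create the JobVertex
  - Yes: √ taken. The JobVertex is created from the chainInfo metadata, i.e. "Sink" becomes a JobVertex
  - No: × not taken
7. Check whether currentNode is the startNode, to connect() two JobVertex instances
  - Yes: √ taken, but there are no transitiveOutEdges at this point, so connect() is never called
  - No: × not taken
8. Check whether the current node is the end of this chain, i.e. whether chainableOutputs is empty
  - Empty: √ taken. The current node is the end of the chain Sink, so config.setChainEnd() is called
  - Not empty: × not taken
9. Return the transitiveOutEdges of this level: []

OK, all chaining work is now done. We end up with 3 JobVertex instances and 2 transitiveOutEdges:

JobVertex1: [Source->Map]		| transitiveOutEdges: Edge2
JobVertex2: [KeyBy->Window]	| transitiveOutEdges: Edge4
JobVertex3: [Sink]		| no transitiveOutEdges

In connect(), each JobVertex is linked to its downstream JobVertex, which is found through transitiveOutEdges. The final JobGraph is therefore: JobVertex1-Edge2->JobVertex2-Edge4->JobVertex3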
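The walkthrough can be checked with a stripped-down simulation of the recursion. It is a toy sketch, not the Flink implementation: nodes are strings, each edge carries a precomputed `chainable` flag instead of calling isChainable(), and the hypothetical `jobVertices` and `jobEdges` collections only record names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ToyCreateChain {

    static final class Edge {
        final String name, source, target;
        final boolean chainable; // precomputed here; Flink decides this in isChainable()

        Edge(String name, String source, String target, boolean chainable) {
            this.name = name;
            this.source = source;
            this.target = target;
            this.chainable = chainable;
        }
    }

    private final Map<String, List<Edge>> outEdges = new HashMap<>();
    private final Set<String> builtVertices = new HashSet<>();
    private final Map<String, String> chainedNames = new HashMap<>();
    final Map<String, String> jobVertices = new LinkedHashMap<>(); // start node -> chain name
    final List<String> jobEdges = new ArrayList<>();

    void addEdge(String name, String source, String target, boolean chainable) {
        outEdges.computeIfAbsent(source, k -> new ArrayList<>()).add(new Edge(name, source, target, chainable));
    }

    List<Edge> createChain(String current, String start) {
        if (builtVertices.contains(start)) {
            return new ArrayList<>();
        }
        List<Edge> transitiveOutEdges = new ArrayList<>();
        List<Edge> chainable = new ArrayList<>();
        List<Edge> nonChainable = new ArrayList<>();
        for (Edge edge : outEdges.getOrDefault(current, Collections.emptyList())) {
            if (edge.chainable) {
                chainable.add(edge);
            } else {
                nonChainable.add(edge);
            }
        }
        for (Edge edge : chainable) {      // extend the current chain
            transitiveOutEdges.addAll(createChain(edge.target, start));
        }
        for (Edge edge : nonChainable) {   // end this chain, start a new one at the target
            transitiveOutEdges.add(edge);
            createChain(edge.target, edge.target);
        }
        List<String> downstreamNames = new ArrayList<>();
        for (Edge edge : chainable) {
            downstreamNames.add(chainedNames.get(edge.target));
        }
        String name = downstreamNames.isEmpty() ? current
                : downstreamNames.size() == 1 ? current + " -> " + downstreamNames.get(0)
                : current + " -> (" + String.join(", ", downstreamNames) + ")";
        chainedNames.put(current, name);
        if (current.equals(start)) {       // only the head of a chain becomes a vertex
            jobVertices.put(start, name);
            builtVertices.add(start);
            for (Edge edge : transitiveOutEdges) {
                jobEdges.add(start + " --" + edge.name + "--> " + edge.target);
            }
        }
        return transitiveOutEdges;
    }

    public static void main(String[] args) {
        ToyCreateChain graph = new ToyCreateChain();
        graph.addEdge("Edge1", "Source", "Map", true);
        graph.addEdge("Edge2", "Map", "KeyBy", false);
        graph.addEdge("Edge3", "KeyBy", "Window", true);
        graph.addEdge("Edge4", "Window", "Sink", false);
        graph.createChain("Source", "Source");
        System.out.println(graph.jobVertices.values());
        System.out.println(graph.jobEdges);
    }
}
```

The output is `[Sink, KeyBy -> Window, Source -> Map]` followed by `[KeyBy --Edge4--> Sink, Source --Edge2--> KeyBy]`. The order shows that vertices are created from the end of the pipeline backwards, which is why the downstream vertex already exists whenever a chain head connects its out-edges.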
