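Flink from Getting Started to Giving Up, Source Code Analysis Series - Chapter 1: Flink Components and the Logical Plan

This article draws on many blog posts found online. Most of them are based on version 1.1.0 and are badly out of date, so this series makes many corrections. Feedback is welcome.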

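Overview and Background

Flink is a compute framework that has been called "the 4th G", the fourth generation of big data engines. The characteristics of each generation and its representative project are listed below: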

1st generation	2nd generation	3rd generation	4th generation
Batch	Batch, Interactive	Batch, Interactive, Near-Real-Time, Iterative processing	Hybrid, Interactive, Real-Time Streaming, Native Iterative processing
-	DAG Dataflows	RDD	Cyclic Dataflows
Hadoop MapReduce	Tez	Spark	Flink

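This article mainly covers Flink's core components and how the logical plan is generated.

Reference code branch: flink-1.7.1. This series will run to roughly 7 to 10 chapters; the 3 extra chapters are reserved for the new features of Blink, which Alibaba has open-sourced.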

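Core Components

Only the components used in on-YARN mode are covered here.

Flink on YARN supports two deployment types:

One cluster per job
One cluster shared by multiple jobs

We start with the one-cluster-per-job architecture. In this mode a normally running Flink program has the following components:

Job CLI: the client process in non-detached mode. It fetches the running status of the YARN Application Master and prints logs to the terminal.

JobManager [JM]: generates the job's runtime plan (the ExecutionGraph), generates the physical plan, and schedules the job.

TaskManager [TM]: executes the tasks dispatched to it, reports heartbeats and status, and manages resources.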

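Tips:

A Flink YARN session can be started in two modes: detached mode and client mode.

Pass -d to select detached mode. In this mode the client is no longer part of the YARN cluster once the Flink YARN session has started. To stop the Flink YARN application, use the yarn application -kill command.

The overall architecture is roughly as shown in the figure below:

The following sections use one job submission to describe what each Flink component does and how the components cooperate.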

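Job Submission Flow

In one-cluster-per-job mode, each job starts its own JM and, according to the arguments the user passes, the corresponding number of TMs. Each TM runs in one YARN container.

A typical Flink on YARN submission command:

./bin/flink run -m yarn-cluster -yn 2 -j flink-demo-1.0.0-with-dependencies.jar -ytm 1024 -yst 4 -yjm 1024 --yarnname flink_demo

On receiving such a command, Flink first uses the CLI to load the Flink configuration and parse the command-line arguments.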

Configuration Loading

CliFrontend.java is the entry point for submitting Flink jobs.

//CliFrontend line144
// 1. find the configuration directory
final String configurationDirectory = getConfigurationDirectoryFromEnv();

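This loads the configuration from the conf directory. Older versions tried to load every YAML file in that directory, with no restriction on file names; in the 1.7.1 branch GlobalConfiguration reads only flink-conf.yaml.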
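To make the loading step more concrete, here is a minimal sketch of reading flat key: value settings such as those found in a Flink configuration file. It is not Flink's GlobalConfiguration code: FlatYamlConfigSketch and parse are hypothetical names, nested YAML is deliberately not handled, and the sample values are made up.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Flink source: parses flat "key: value" lines.
class FlatYamlConfigSketch {

    static Map<String, String> parse(List<String> lines) {
        Map<String, String> config = new LinkedHashMap<>();
        for (String line : lines) {
            // Drop everything after a '#' comment marker.
            String content = line.split("#", 2)[0];
            if (content.trim().isEmpty()) {
                continue;
            }
            // Split on the first ": " only, so values may contain colons.
            String[] kv = content.split(": ", 2);
            if (kv.length != 2) {
                continue; // skip lines that are not "key: value"
            }
            String key = kv[0].trim();
            String value = kv[1].trim();
            if (!key.isEmpty() && !value.isEmpty()) {
                config.put(key, value);
            }
        }
        return config;
    }

    public static void main(String[] args) {
        Map<String, String> config = parse(Arrays.asList(
                "# sample entries, values are made up",
                "jobmanager.rpc.address: localhost",
                "taskmanager.numberOfTaskSlots: 4  # inline comment",
                "not a valid line"));
        // prints {jobmanager.rpc.address=localhost, taskmanager.numberOfTaskSlots=4}
        System.out.println(config);
    }
}
```

Malformed lines are skipped rather than rejected here; that is a choice made for the sketch, not a statement about how Flink behaves.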

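Argument Parsing

The first step in parsing the command line is to route the user's command (the action); a run action is then handed to the run method.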

//CliFrontend line1119
try {
    final CliFrontend cli = new CliFrontend(
        configuration,
        customCommandLines);

    SecurityUtils.install(new SecurityConfiguration(cli.configuration));
    int retCode = SecurityUtils.getInstalledContext()
            .runSecured(() -> cli.parseParameters(args));
    System.exit(retCode);
}
catch (Throwable t) {
    final Throwable strippedThrowable = ExceptionUtils.stripException(t, UndeclaredThrowableException.class);
    LOG.error("Fatal error while running command line interface.", strippedThrowable);
    strippedThrowable.printStackTrace();
    System.exit(31);
}

//CliFrontend line1046
try {
    // do action
    switch (action) {
        case ACTION_RUN:
            run(params);
            return 0;
        case ACTION_LIST:
            list(params);
            return 0;
        case ACTION_INFO:
            info(params);
            return 0;
        case ACTION_CANCEL:
            cancel(params);
            return 0;
        case ACTION_STOP:
            stop(params);
            return 0;
        case ACTION_SAVEPOINT:
            savepoint(params);
            return 0;
        case ACTION_MODIFY:
            modify(params);
            return 0;
        case "-h":
        case "--help":
            CliFrontendParser.printHelp(customCommandLines);
            return 0;
        case "-v":
        case "--version":
            String version = EnvironmentInformation.getVersion();
            String commitID = EnvironmentInformation.getRevisionInformation().commitId;
            System.out.print("Version: " + version);
            System.out.println(commitID.equals(EnvironmentInformation.UNKNOWN) ? "" : ", Commit ID: " + commitID);
            return 0;
        default:
            System.out.printf("\"%s\" is not a valid action.\n", action);
            System.out.println();
            System.out.println("Valid actions are \"run\", \"list\", \"info\", \"savepoint\", \"stop\", or \"cancel\".");
            System.out.println();
            System.out.println("Specify the version option (-v or --version) to print Flink version.");
            System.out.println();
            System.out.println("Specify the help option (-h or --help) to get help on the command.");
            return 1;
    }
} catch (CliArgsException ce) {
    return handleArgException(ce);
} catch (ProgramParametrizationException ppe) {
    return handleParametrizationException(ppe);
} catch (ProgramMissingJobException pmje) {
    return handleMissingJobException();
} catch (Exception e) {
    return handleError(e);
}

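Next comes the program parameter setup. Flink wraps the jar path and the argument configuration in a PackagedProgram:

//CliFrontend line201
final PackagedProgram program;

Building the Flink Cluster

Resolving the Cluster Type

Once the arguments are available, the next step is to build and deploy the cluster. Flink resolves the cluster mode through two different CustomCommandLine implementations that parse the command-line arguments: FlinkYarnSessionCli and DefaultCLI.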

//CliFrontend line1187
final String flinkYarnSessionCLI = "org.apache.flink.yarn.cli.FlinkYarnSessionCli";
try {
    customCommandLines.add(
        loadCustomCommandLine(flinkYarnSessionCLI,
            configuration,
            configurationDirectory,
            "y",
            "yarn"));
} catch (NoClassDefFoundError | Exception e) {
    LOG.warn("Could not load CLI class {}.", flinkYarnSessionCLI, e);
}

customCommandLines.add(new DefaultCLI(configuration));

return customCommandLines;

...

//line210: this is where the CLI type is decided
final CustomCommandLine<?> customCommandLine = getActiveCustomCommandLine(commandLine);

So when does a command resolve to a YARN cluster and when to Standalone? Because FlinkYarnSessionCli is added to customCommandLines first, the following logic is evaluated first:

//FlinkYarnSessionCli line422
@Override
public boolean isActive(CommandLine commandLine) {
    String jobManagerOption = commandLine.getOptionValue(addressOption.getOpt(), null);
    boolean yarnJobManager = ID.equals(jobManagerOption);
    boolean yarnAppId = commandLine.hasOption(applicationId.getOpt());
    return yarnJobManager || yarnAppId || (isYarnPropertiesFileMode(commandLine) && yarnApplicationIdFromYarnProperties != null);
}

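As the code shows, YARN cluster mode is used if the user passes -m with the value yarn-cluster, supplies an application ID, or has a YARN properties file that records an application ID. Otherwise the command is handled as a Standalone cluster.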
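The selection rule can be sketched as "the first handler that reports itself active wins". The code below is a simplified model, not Flink's implementation: CommandLineHandler, YarnHandler, DefaultHandler, and select are hypothetical names, the argument handling is reduced to plain string matching, the properties-file case is left out, and the fallback handler is assumed to be always active.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "first active command line wins"; not Flink source.
class ActiveCommandLineSketch {

    // Stand-in for Flink's CustomCommandLine.
    interface CommandLineHandler {
        boolean isActive(List<String> args);
        String name();
    }

    // Active for "-m yarn-cluster" or when "-yid" is present (simplified).
    static class YarnHandler implements CommandLineHandler {
        public boolean isActive(List<String> args) {
            int m = args.indexOf("-m");
            boolean yarnJobManager = m >= 0 && m + 1 < args.size()
                    && "yarn-cluster".equals(args.get(m + 1));
            boolean yarnAppId = args.contains("-yid");
            return yarnJobManager || yarnAppId;
        }
        public String name() { return "yarn"; }
    }

    // Fallback handler; assumed here to accept every command line.
    static class DefaultHandler implements CommandLineHandler {
        public boolean isActive(List<String> args) { return true; }
        public String name() { return "default"; }
    }

    // Registration order matters: the YARN handler is checked first.
    static List<CommandLineHandler> handlers() {
        List<CommandLineHandler> handlers = new ArrayList<>();
        handlers.add(new YarnHandler());
        handlers.add(new DefaultHandler());
        return handlers;
    }

    static CommandLineHandler select(List<CommandLineHandler> handlers, List<String> args) {
        for (CommandLineHandler handler : handlers) {
            if (handler.isActive(args)) {
                return handler;
            }
        }
        throw new IllegalStateException("No command line handler is active");
    }

    public static void main(String[] args) {
        List<CommandLineHandler> handlers = handlers();
        // prints "yarn"
        System.out.println(select(handlers, Arrays.asList("run", "-m", "yarn-cluster", "-yn", "2")).name());
        // prints "default"
        System.out.println(select(handlers, Arrays.asList("run", "-m", "host:8081")).name());
    }
}
```

Note that passing -m alone is not enough in this model: only the value yarn-cluster activates the YARN handler, which matches the ID.equals(jobManagerOption) check in the excerpt.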

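Cluster Deployment

Flink uses YarnClusterDescriptor to describe the deployment configuration of the YARN cluster; the corresponding configuration file is flink-conf.yaml. The following logic triggers the cluster deployment: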

//YarnClusterDescriptor line39
public class YarnClusterDescriptor extends AbstractYarnClusterDescriptor {

    public YarnClusterDescriptor(
            Configuration flinkConfiguration,
            YarnConfiguration yarnConfiguration,
            String configurationDirectory,
            YarnClient yarnClient,
            boolean sharedYarnClient) {
        super(
            flinkConfiguration,
            yarnConfiguration,
            configurationDirectory,
            yarnClient,
            sharedYarnClient);
    }
    // ...
}

//AbstractYarnClusterDescriptor 471
protected ClusterClient<ApplicationId> deployInternal(
        ClusterSpecification clusterSpecification,
        String applicationName,
        String yarnClusterEntrypoint,
        @Nullable JobGraph jobGraph,
        boolean detached) throws Exception {
    // ...
}

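The rough process:

Check whether the YARN cluster's queue resources can satisfy the request
Set up the AM context, the launch command, and the submission context
Submit the AM context through the YARN client
Wrap the YARN client and related configuration in a YarnClusterClient and return it

The main class that actually runs inside the AM is YarnApplicationMasterRunner. Its run method does the following:

Starts the JobManager ActorSystem
Starts the Flink UI
Starts YarnFlinkResourceManager, which talks to YARN's ResourceManager and manages YARN resources
Starts the actor system supervisor process

At this point the JobManager is up and running.

A Flink cluster has now been built. The steps below walk through the flow:

The Flink CLI parses the local environment configuration and starts the ApplicationMaster
The JobManager is started inside the ApplicationMaster
YarnFlinkResourceManager is started inside the ApplicationMaster
YarnFlinkResourceManager sends a registration message to the JobManager
Once registration succeeds, the JobManager sends a registration-success message back to YarnFlinkResourceManager
Knowing that it is registered, YarnFlinkResourceManager requests from the ResourceManager as many containers as there are TaskManagers
A TaskManager is started in each container
Each TaskManager registers itself with the JobManager

Next comes submitting and running the program.

After the program is submitted in CliFrontend, the following logic is triggered: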

//ClusterClient 39
public JobSubmissionResult run(PackagedProgram prog, int parallelism)
        throws ProgramInvocationException, ProgramMissingJobException {
    Thread.currentThread().setContextClassLoader(prog.getUserCodeClassLoader());
    if (prog.isUsingProgramEntryPoint()) {
        final JobWithJars jobWithJars;
        if (hasUserJarsInClassPath(prog.getAllLibraries())) {
            jobWithJars = prog.getPlanWithoutJars();
        } else {
            jobWithJars = prog.getPlanWithJars();
        }
        return run(jobWithJars, parallelism, prog.getSavepointSettings());
    }
    else if (prog.isUsingInteractiveMode()) {
        log.info("Starting program in interactive mode (detached: {})", isDetached());

        final List<URL> libraries;
        if (hasUserJarsInClassPath(prog.getAllLibraries())) {
            libraries = Collections.emptyList();
        } else {
            libraries = prog.getAllLibraries();
        }

        ContextEnvironmentFactory factory = new ContextEnvironmentFactory(this, libraries,
                prog.getClasspaths(), prog.getUserCodeClassLoader(), parallelism, isDetached(),
                prog.getSavepointSettings());
        ContextEnvironment.setAsContext(factory);

        try {
            // invoke main method
            prog.invokeInteractiveModeForExecution();
            if (lastJobExecutionResult == null && factory.getLastEnvCreated() == null) {
                throw new ProgramMissingJobException("The program didn't contain a Flink job.");
            }
            if (isDetached()) {
                // in detached mode, we execute the whole user code to extract the Flink job, afterwards we run it here
                return ((DetachedEnvironment) factory.getLastEnvCreated()).finalizeExecute();
            }
            else {
                // in blocking mode, we execute all Flink jobs contained in the user code and then return here
                return this.lastJobExecutionResult;
            }
        }
        finally {
            ContextEnvironment.unsetContext();
        }
    }
    else {
        throw new ProgramInvocationException("PackagedProgram does not have a valid invocation mode.");
    }
}

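Note the call prog.invokeInteractiveModeForExecution(). This is the core logic by which the client generates the initial logical plan, and it is described in detail below.

Client-Side Logical Plan

As mentioned above, prog.invokeInteractiveModeForExecution() triggers generation of the client-side logical plan. How does that work? The call merely invokes the main function of the user's jar; the plan is actually built as the user code executes. For example, suppose the user wrote the following Flink code: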

object FlinkDemo extends App with Logging {
  override def main(args: Array[String]): Unit = {
    val properties = new Properties
    properties.setProperty("bootstrap.servers", DemoConfig.kafkaBrokerList)
    properties.setProperty("zookeeper.connect", "host01:2181,host02:2181,host03:2181/kafka08")
    properties.setProperty("group.id", "flink-demo")

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.enableCheckpointing(5000L, CheckpointingMode.EXACTLY_ONCE) // checkpoint every 5 seconds

    val stream = env.addSource(new FlinkKafkaConsumer08[String]("log.waimai_e", new SimpleStringSchema, properties)).setParallelism(2)
    val counts = stream.name("log.waimai_e").map(toPoiIdTuple(_)).filter(_._2 != null)
      .keyBy(0)
      .timeWindow(Time.seconds(5))
      .sum(1)

    counts.addSink(sendToKafka(_))
    env.execute()
  }
}

Note the line val env = StreamExecutionEnvironment.getExecutionEnvironment. It obtains the client-side environment configuration, and it first goes through this logic:

//StreamExecutionEnvironment 1256
public static StreamExecutionEnvironment getExecutionEnvironment() {
    if (contextEnvironmentFactory != null) {
        return contextEnvironmentFactory.createExecutionEnvironment();
    }

    // because the streaming project depends on "flink-clients" (and not the other way around)
    // we currently need to intercept the data set environment and create a dependent stream env.
    // this should be fixed once we rework the project dependencies
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // ...
}

ExecutionEnvironment.getExecutionEnvironment() obtains the environment as follows:

//ExecutionEnvironment line1137
public static ExecutionEnvironment getExecutionEnvironment() {
    return contextEnvironmentFactory == null ?
            createLocalEnvironment() : contextEnvironmentFactory.createExecutionEnvironment();
}

Here contextEnvironmentFactory is a static member. It was already initialized earlier by ContextEnvironment.setAsContext(factory), and it carries the following environment information:

//ContextEnvironmentFactory line51
public ContextEnvironmentFactory(ClusterClient client, List<URL> jarFilesToAttach,
        List<URL> classpathsToAttach, ClassLoader userCodeClassLoader, int defaultParallelism,
        boolean isDetached, String savepointPath)
{
    this.client = client;
    this.jarFilesToAttach = jarFilesToAttach;
    this.classpathsToAttach = classpathsToAttach;
    this.userCodeClassLoader = userCodeClassLoader;
    this.defaultParallelism = defaultParallelism;
    this.isDetached = isDetached;
    this.savepointPath = savepointPath;
}

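The client here is the YarnClusterClient created above. The other fields are self-explanatory, so they are not discussed further.

After executing val env = StreamExecutionEnvironment.getExecutionEnvironment, the user gets a StreamContextEnvironment. It wraps some streaming execution settings (buffer timeout and so on) and also keeps a reference to the ContextEnvironment mentioned above.

At this point the execution environment information needed for streaming is fully set up.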
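The mechanism described here, a static factory that the client installs before calling the user's main method and removes afterwards, can be sketched in a few lines. This is a simplified model rather than Flink code: ContextFactorySketch, Environment, EnvironmentFactory, userMain, and runThroughClient are hypothetical names, and the "cluster" environment only carries two illustrative fields.

```java
// Hypothetical sketch of a static context factory; not Flink source.
class ContextFactorySketch {

    interface Environment {
        String describe();
    }

    interface EnvironmentFactory {
        Environment create();
    }

    static class LocalEnvironment implements Environment {
        public String describe() { return "local"; }
    }

    static class ClusterEnvironment implements Environment {
        private final String clusterName;
        private final int parallelism;

        ClusterEnvironment(String clusterName, int parallelism) {
            this.clusterName = clusterName;
            this.parallelism = parallelism;
        }

        public String describe() {
            return "cluster:" + clusterName + ", parallelism=" + parallelism;
        }
    }

    // Static slot that the client fills in before invoking user code.
    private static EnvironmentFactory contextFactory;

    static void setAsContext(EnvironmentFactory factory) { contextFactory = factory; }

    static void unsetContext() { contextFactory = null; }

    // What user code calls: falls back to a local environment when no context is set.
    static Environment getExecutionEnvironment() {
        return contextFactory == null ? new LocalEnvironment() : contextFactory.create();
    }

    // Stands in for the main method of the user's jar.
    static String userMain() {
        return getExecutionEnvironment().describe();
    }

    // Stands in for the client: install the context, run user code, always clean up.
    static String runThroughClient(String clusterName, int parallelism) {
        setAsContext(() -> new ClusterEnvironment(clusterName, parallelism));
        try {
            return userMain();
        } finally {
            unsetContext();
        }
    }

    public static void main(String[] args) {
        System.out.println(userMain());                       // prints "local"
        System.out.println(runThroughClient("yarn-demo", 2)); // prints "cluster:yarn-demo, parallelism=2"
        System.out.println(userMain());                       // prints "local"
    }
}
```

The user code is identical in both cases; only the static slot decides which environment it receives, which is why the same jar can run locally or against a cluster.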

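Generating the Initial Logical Plan: StreamGraph

Next the user code reaches DataStream<String> stream = env.addSource(consumer);. This actually creates a DataStream abstraction. DataStream is the core abstraction of Flink streaming, and all subsequent operator transformations are performed on a DataStream. The addSource call above triggers the following logic: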

public <OUT> DataStreamSource<OUT> addSource(SourceFunction<OUT> function, String sourceName, TypeInformation<OUT> typeInfo) {
    if (typeInfo == null) {
        if (function instanceof ResultTypeQueryable) {
            typeInfo = ((ResultTypeQueryable<OUT>) function).getProducedType();
        } else {
            try {
                typeInfo = TypeExtractor.createTypeInfo(
                        SourceFunction.class,
                        function.getClass(), 0, null, null);
            } catch (final InvalidTypesException e) {
                typeInfo = (TypeInformation<OUT>) new MissingTypeInfo(sourceName, e);
            }
        }
    }

    boolean isParallel = function instanceof ParallelSourceFunction;

    clean(function);

    StreamSource<OUT, ?> sourceOperator;
    if (function instanceof StoppableFunction) {
        sourceOperator = new StoppableStreamSource<>(cast2StoppableSourceFunction(function));
    } else {
        sourceOperator = new StreamSource<>(function);
    }

    return new DataStreamSource<>(this, typeInfo, sourceOperator, isParallel, sourceName);
}

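A brief summary of the logic above:

Obtain the output type information (TypeInformation) of the data source
Create the StreamSource sourceOperator
Create a DataStreamSource (which wraps sourceOperator) and return it
Add the StreamTransformation to the operator list transformations (only transform operations add an operator; everything else merely layers another transformation wrapper for the time being)
Subsequent operations are then applied on the DataStream

DataStreamSource is a DataStream (a data stream abstraction), and StreamSource is a StreamOperator (an operator abstraction). In Flink, a DataStream wraps one data stream transformation, and a StreamOperator wraps one function interface, such as map, filter, or reduce. Operators are covered in a separate section: the lifecycle of Flink operators.

As you can see, a series of operations (map, filter, and so on) can be applied to a DataStream. Let's look at what happens for a typical operation such as map: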

//DataStream 583
public <R> SingleOutputStreamOperator<R> map(MapFunction<T, R> mapper) {
    TypeInformation<R> outType = TypeExtractor.getMapReturnTypes(clean(mapper), getType(),
            Utils.getCallLocationName(), true);

    return transform("Map", outType, new StreamMap<>(clean(mapper)));
}

A map operation triggers one transform call. So what does transform do?

//DataStream line1175
@PublicEvolving
public <R> SingleOutputStreamOperator<R> transform(String operatorName, TypeInformation<R> outTypeInfo, OneInputStreamOperator<T, R> operator) {
    // read the output type of the input Transform to coax out errors about MissingTypeInfo
    transformation.getOutputType();

    OneInputTransformation<T, R> resultTransform = new OneInputTransformation<>(
            this.transformation,
            operatorName,
            operator,
            outTypeInfo,
            environment.getParallelism());

    @SuppressWarnings({ "unchecked", "rawtypes" })
    SingleOutputStreamOperator<R> returnStream = new SingleOutputStreamOperator(environment, resultTransform);

    getExecutionEnvironment().addOperator(resultTransform);

    return returnStream;
}

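This step creates a StreamTransformation and wraps it, as a member variable, in another DataStream that is returned. StreamTransformation is Flink's core abstraction for a data stream transformation. Only streams that need a transform produce a new DataStream operator, as explained later. Note the line getExecutionEnvironment().addOperator(resultTransform): this is where Flink keeps track of the transformation: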

//StreamExecutionEnvironment line 1576
@Internal
public void addOperator(StreamTransformation<?> transformation) {
    Preconditions.checkNotNull(transformation, "transformation must not be null.");
    this.transformations.add(transformation);
}

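So a user's chain of operations such as map and join is really a series of transformations on DataStream, and Flink keeps track of these StreamTransformation objects. Only at the very end, when the user calls env.execute(), does construction of the StreamGraph really begin.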
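The lazy pattern described here (record transformations as the user chains calls, build the graph only on execute) can be illustrated with a small model. It is a sketch under simplifying assumptions, not Flink's classes: LazyPipelineSketch, Env, Stream, and Transformation are hypothetical, each transformation has at most one input, and the "graph" is just a list of edges.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of lazily recorded transformations; not Flink source.
class LazyPipelineSketch {

    // One recorded step; input is null for a source.
    static class Transformation {
        final String name;
        final Transformation input;

        Transformation(String name, Transformation input) {
            this.name = name;
            this.input = input;
        }
    }

    static class Env {
        private final List<Transformation> transformations = new ArrayList<>();

        // Creating a source only returns a stream; nothing is registered yet.
        Stream addSource(String name) {
            return new Stream(this, new Transformation(name, null));
        }

        void addOperator(Transformation transformation) {
            transformations.add(transformation);
        }

        int pending() {
            return transformations.size();
        }

        // Only here is the recorded chain turned into edges; sources are reached via input links.
        List<String> execute() {
            List<String> edges = new ArrayList<>();
            for (Transformation t : transformations) {
                if (t.input != null) {
                    edges.add(t.input.name + " -> " + t.name);
                }
            }
            transformations.clear();
            return edges;
        }
    }

    static class Stream {
        final Env env;
        final Transformation transformation;

        Stream(Env env, Transformation transformation) {
            this.env = env;
            this.transformation = transformation;
        }

        // Wraps the current transformation in a new one and registers it with the environment.
        Stream transform(String operatorName) {
            Transformation result = new Transformation(operatorName, transformation);
            env.addOperator(result);
            return new Stream(env, result);
        }
    }

    public static void main(String[] args) {
        Env env = new Env();
        env.addSource("source").transform("map").transform("filter");
        System.out.println(env.pending()); // prints 2: nothing has run yet
        System.out.println(env.execute()); // prints [source -> map, map -> filter]
    }
}
```

Nothing is computed while the chain is being declared; the environment only accumulates descriptions of work, which is what lets a later step analyze and optimize the whole pipeline at once.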

Calling env.execute() triggers the following logic:

//StreamContextEnvironment line32
public JobExecutionResult execute(String jobName) throws Exception {
    Preconditions.checkNotNull(jobName, "Streaming Job name should not be null.");

    StreamGraph streamGraph = this.getStreamGraph();
    streamGraph.setJobName(jobName);

    transformations.clear();

    // execute the programs
    if (ctx instanceof DetachedEnvironment) {
        LOG.warn("Job was executed in detached mode, the results will be available on completion.");
        ((DetachedEnvironment) ctx).setDetachedPlan(streamGraph);
        return DetachedEnvironment.DetachedJobExecutionResult.INSTANCE;
    } else {
        return ctx.getClient().runBlocking(streamGraph, ctx.getJars(), ctx.getClasspaths(), ctx.getUserCodeClassLoader(), ctx.getSavepointPath());
    }
}

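This code does two things:

First, it uses StreamGraphGenerator to produce the StreamGraph
Then it uses the client to run the StreamGraph

So what does StreamGraphGenerator do?

StreamGraphGenerator uses the transformations saved when operators were added to create the nodes of the StreamGraph and to connect them. Stream-routing operations such as union, select, and split do not add edges; they only create virtual nodes or add a selector to the upstream node.

Here each StreamTransformation is converted into a StreamNode, which holds the operator's information, as shown in the figure below.

At this point the StreamGraph, a DAG made of StreamNodes, has been generated.

When it is submitted to the client, however, Flink optimizes it further:

The StreamGraph is converted into a JobGraph, a step performed by StreamingJobGraphGenerator. Why is this conversion needed? Mainly because some operators can be chained. Here StreamNodes are converted into JobVertex instances; the main work is merging chainable operators (this optimization is enabled by default) and setting resources, the restart strategy, and so on, finally producing a JobGraph that can be submitted to the JobManager.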
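To give a feel for the chaining step, the sketch below merges consecutive nodes of a linear pipeline into one vertex when two simplified conditions hold: the node's input edge is a forward (one-to-one) connection and the parallelism matches the previous node. These two conditions are an assumption chosen for illustration; the real StreamingJobGraphGenerator checks more than this, and ChainingSketch, Node, and chain are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of operator chaining on a linear pipeline; not Flink source.
class ChainingSketch {

    static class Node {
        final String name;
        final int parallelism;
        final boolean forwardInput; // true if data arrives one-to-one from the previous node

        Node(String name, int parallelism, boolean forwardInput) {
            this.name = name;
            this.parallelism = parallelism;
            this.forwardInput = forwardInput;
        }
    }

    // Returns the vertices, each listing the names of the nodes chained into it.
    static List<List<String>> chain(List<Node> pipeline) {
        List<List<String>> vertices = new ArrayList<>();
        Node previous = null;
        for (Node node : pipeline) {
            boolean chainable = previous != null
                    && node.forwardInput
                    && node.parallelism == previous.parallelism;
            if (!chainable) {
                vertices.add(new ArrayList<>()); // start a new vertex
            }
            vertices.get(vertices.size() - 1).add(node.name);
            previous = node;
        }
        return vertices;
    }

    public static void main(String[] args) {
        // Parallelism values are made up; the window follows a keyBy, so its input is not forward.
        List<Node> pipeline = Arrays.asList(
                new Node("source", 2, false),
                new Node("map", 2, true),
                new Node("filter", 2, true),
                new Node("window", 2, false),
                new Node("sink", 2, true));
        // prints [[source, map, filter], [window, sink]]
        System.out.println(chain(pipeline));
    }
}
```

Five nodes collapse into two vertices here, so records pass between source, map, and filter as plain method calls instead of crossing a network or thread boundary, which is the main payoff of chaining.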

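Tips:

JobVertex: after optimization, several StreamNodes that meet the conditions may be chained together into one JobVertex, so a JobVertex contains one or more operators. The input of a JobVertex is a JobEdge, and its output is an IntermediateDataSet.
IntermediateDataSet: the output of a JobVertex, that is, the data set produced by its operators. Its producer is a JobVertex and its consumer is a JobEdge.
JobEdge: a data transfer channel in the job graph. Its source is an IntermediateDataSet and its target is a JobVertex, so data flows through a JobEdge from an IntermediateDataSet to the target JobVertex.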