DolphinScheduler Task Execution Flow

Recently I have been integrating DolphinScheduler into our company's system and wanted to understand how Flink tasks actually get scheduled and launched, so I did some debugging locally. This article walks through what I found.

Starting DolphinScheduler

To start DolphinScheduler, I simply ran it locally.

Just run the StandaloneServer class in the dolphinscheduler-standalone-server module.

Since I need to run Flink tasks, I made the following configuration changes.

Resource storage configuration: here I pointed it at HDFS, which means HDFS needs to be running locally first:

./sbin/start-dfs.sh

# resource storage type: HDFS, S3, NONE
resource.storage.type=HDFS

# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions. "/dolphinscheduler" is recommended
resource.storage.upload.base.path=/dolphinscheduler

# whether to startup kerberos
hadoop.security.authentication.startup.state=false

# resource view suffixs
#resource.view.suffixs=txt,log,sh,bat,conf,cfg,py,java,sql,xml,hql,properties,json,yml,yaml,ini,js

# if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
resource.hdfs.root.user=lizu

# if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
resource.hdfs.fs.defaultFS=hdfs://0.0.0.0:9000


# use sudo or not, if set true, executing user is tenant user and deploy user needs sudo permissions; if set false, executing user is the deploy user and doesn't need sudo permissions
sudo.enable=true

Environment variables: I simply reused my local machine's environment settings. When DolphinScheduler runs a task, it checks whether an environment variable file exists under the resources directory and, if it does, sources that file before executing the task:

# JAVA_HOME, will use it to start DolphinScheduler server
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_281.jdk/Contents/Home
export PATH=$PATH:$JAVA_HOME/bin
export M2_HOME=/Users/lizu/app/apache-maven-3.6.3
export PATH=$PATH:$M2_HOME/bin
export SCALA_HOME=/Users/lizu/app/scala-2.11.12
#export SCALA_HOME=/Users/lizu/app/scala-2.12.13
export PATH=$PATH:$SCALA_HOME/bin
#export SPARK_HOME=/Users/lizu/app/spark-2.4.3-bin-hadoop2.7
export SPARK_HOME=/Users/lizu/app/spark-3.0.2-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export FLINK_HOME=/Users/lizu/app/flink-1.13.6
export PATH=$PATH:$FLINK_HOME/bin
export PATH=/Library/Frameworks/Python.framework/Versions/3.6/bin:$PATH
export HADOOP_HOME=/Users/lizu/app/hadoop-2.7.6
export PATH=$PATH:$HADOOP_HOME/bin
#export HIVE_HOME=/Users/lizu/app/apache-hive-1.2.1-bin
export HIVE_HOME=/Users/lizu/app/hive-2.3.4
export PATH=$PATH:$HIVE_HOME/bin
#export NODE_HOME=/Users/lizu/app/node-v14.16.0
export NODE_HOME=/Users/lizu/app/node-v16.17.0
export PATH=$PATH:$NODE_HOME/bin
export PATH=$PATH:/Users/lizu/app/go/bin
export PATH=$PATH:/Users/lizu/app/gradle-6.3/bin
alias python="/usr/local/bin/python3"
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/pyspark:$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH

Then start the frontend, and the UI can be accessed at http://127.0.0.1:5173/.

Configuring the Flink task

Start a local cluster

To run Flink tasks, a Flink environment has to be prepared in advance. I chose local mode and started it directly on my machine.

./start-cluster.sh

Once it is up, the Flink web UI is available (http://localhost:8081 by default).

Then the Flink task can be configured in DolphinScheduler. This mainly involves the following steps.

Upload the resource file

You can simply use SocketWindowWordCount.jar from the official Flink examples.

Configure the workflow

Here you mainly configure the main class and the resource uploaded above. Since SocketWindowWordCount needs arguments, they are set in the main program arguments field (in this case --hostname localhost --port 9999).

Start the run

Finally, start the workflow to test it. The socket port has to be opened first:

nc -l 9999

In the web UI you can then see our Flink task running.

Source code behind Flink task execution

With the steps above, the Flink task is now running in DolphinScheduler. Finally, let's look at some of the details of task execution by walking through the source code.

In DolphinScheduler, tasks are dispatched by the Master to a Worker for execution, and the task's running status is reported back to the Master in real time. The interaction between the two is implemented with Netty.

How the Worker node receives tasks

When the Worker starts, it launches an RPC server and client, as well as a WorkerManagerThread that pulls tasks to execute from a queue:

@PostConstruct
public void run() {
	this.workerRpcServer.start();
	this.workerRpcClient.start();
	this.taskPluginManager.loadPlugin();

	this.workerRegistryClient.setRegistryStoppable(this);
	this.workerRegistryClient.start();

	this.workerManagerThread.start();

	this.messageRetryRunner.start();

	/*
	 * registry hooks, which are called before the process exits
	 */
	Runtime.getRuntime().addShutdownHook(new Thread(() -> {
		if (!ServerLifeCycleManager.isStopped()) {
			close("WorkerServer shutdown hook");
		}
	}));
}

WorkerRpcServer registers quite a few processors; for now we only care about TaskDispatchProcessor:

this.nettyRemotingServer.registerProcessor(CommandType.TASK_DISPATCH_REQUEST, taskDispatchProcessor);

TaskDispatchProcessor receives the task dispatch message sent by the Master and puts the task into the worker's waitSubmitQueue:

final String workflowMasterAddress = taskDispatchCommand.getMessageSenderAddress();
logger.info("Receive task dispatch request, command: {}", taskDispatchCommand);

// ... some code omitted

WorkerDelayTaskExecuteRunnable workerTaskExecuteRunnable = WorkerTaskExecuteRunnableFactoryBuilder
		.createWorkerDelayTaskExecuteRunnableFactory(
				taskExecutionContext,
				workerConfig,
				workflowMasterAddress,
				workerMessageSender,
				alertClientService,
				taskPluginManager,
				storageOperate)
		.createWorkerTaskExecuteRunnable();

// submit task to manager
boolean offer = workerManager.offer(workerTaskExecuteRunnable);
if (!offer) {
	logger.warn("submit task to wait queue error, queue is full, current queue size is {}, will send a task reject message to master", workerManager.getWaitSubmitQueueSize());
	workerMessageSender.sendMessageWithRetry(taskExecutionContext, workflowMasterAddress, CommandType.TASK_REJECT);
} else {
	logger.info("Submit task to wait queue success, current queue size is {}", workerManager.getWaitSubmitQueueSize());
}
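A side note: judging by the names WorkerDelayTaskExecuteRunnable and waitSubmitQueue, the queue appears to be a DelayQueue, so a task only becomes visible to the consumer once its dispatch delay has expired. Below is a minimal standalone sketch of that behaviour (DelayedTask here is purely illustrative, not a DolphinScheduler class):

import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class DelayQueueDemo {

	// illustrative element type: becomes visible to take() only after its delay expires
	static class DelayedTask implements Delayed {
		final String name;
		final long readyAtMillis;

		DelayedTask(String name, long delayMillis) {
			this.name = name;
			this.readyAtMillis = System.currentTimeMillis() + delayMillis;
		}

		@Override
		public long getDelay(TimeUnit unit) {
			return unit.convert(readyAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
		}

		@Override
		public int compareTo(Delayed other) {
			return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
		}
	}

	public static void main(String[] args) throws InterruptedException {
		DelayQueue<DelayedTask> waitSubmitQueue = new DelayQueue<>();
		waitSubmitQueue.offer(new DelayedTask("task-1", 2000)); // dispatched with a 2s delay
		DelayedTask task = waitSubmitQueue.take();              // blocks until the delay expires
		System.out.println("submit " + task.name + " to the worker thread pool");
	}
}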

WorkerTaskExecuteRunnable executes the task

The WorkerManagerThread keeps taking tasks that need to run from the waitSubmitQueue and submits them to a thread pool:

@Override
public void run() {
	Thread.currentThread().setName("Worker-Execute-Manager-Thread");
	while (!ServerLifeCycleManager.isStopped()) {
		try {
			if (!ServerLifeCycleManager.isRunning()) {
				Thread.sleep(Constants.SLEEP_TIME_MILLIS);
			}
			if (this.getThreadPoolQueueSize() <= workerExecThreads) {
				final WorkerDelayTaskExecuteRunnable workerDelayTaskExecuteRunnable = waitSubmitQueue.take();
				workerExecService.submit(workerDelayTaskExecuteRunnable);
			} else {
				WorkerServerMetrics.incWorkerOverloadCount();
				logger.info("Exec queue is full, waiting submit queue {}, waiting exec queue size {}",
						this.getWaitSubmitQueueSize(), this.getThreadPoolQueueSize());
				ThreadUtils.sleep(Constants.SLEEP_TIME_MILLIS);
			}
		} catch (Exception e) {
			logger.error("An unexpected interrupt is happened, "
					+ "the exception will be ignored and this thread will continue to run", e);
		}
	}
}

// worker thread pool
workerExecService = new WorkerExecService(
		ThreadUtils.newDaemonFixedThreadExecutor("Worker-Execute-Thread", workerConfig.getExecThreads()),
		taskExecuteThreadMap);

The actual execution logic lives in WorkerTaskExecuteRunnable:

@Override
public void run() {
	try {
		// set the thread name to make sure the log be written to the task log file
		Thread.currentThread().setName(taskExecutionContext.getTaskLogName());

		LoggerUtils.setWorkflowAndTaskInstanceIDMDC(taskExecutionContext.getProcessInstanceId(),
				taskExecutionContext.getTaskInstanceId());
		logger.info("Begin to pulling task");

		initializeTask();

		if (Constants.DRY_RUN_FLAG_YES == taskExecutionContext.getDryRun()) {
			taskExecutionContext.setCurrentExecutionStatus(TaskExecutionStatus.SUCCESS);
			taskExecutionContext.setEndTime(new Date());
			TaskExecutionContextCacheManager.removeByTaskInstanceId(taskExecutionContext.getTaskInstanceId());
			workerMessageSender.sendMessageWithRetry(taskExecutionContext, masterAddress,
					CommandType.TASK_EXECUTE_RESULT);
			logger.info(
					"The current execute mode is dry run, will stop the subsequent process and set the taskInstance status to success");
			return;
		}

		beforeExecute();

		TaskCallBack taskCallBack = TaskCallbackImpl.builder().workerMessageSender(workerMessageSender)
				.masterAddress(masterAddress).build();
		executeTask(taskCallBack);

		afterExecute();

	} catch (Throwable ex) {
		logger.error("Task execute failed, due to meet an exception", ex);
		afterThrowing(ex);
	} finally {
		LoggerUtils.removeWorkflowAndTaskInstanceIdMDC();
	}
}

WorkerTaskExecuteRunnable goes through a few main steps:

1. initializeTask: initialize the task, e.g. set the environment file (envFile) and the start time

2. beforeExecute: pre-execution checks; the task status is switched to running

3. executeTask: actually execute the task

4. afterExecute: post-execution handling; notify the Master so it can update the task status

FlinkTask execution logic

Finally, let's look at how a Flink task is actually executed. executeTask calls the handle method of the concrete task implementation:

@Override
public void executeTask(TaskCallBack taskCallBack) throws TaskException {
	if (task == null) {
		throw new TaskException("The task plugin instance is not initialized");
	}
	task.handle(taskCallBack);
}

The concrete implementation for Flink tasks is FlinkTask, which extends AbstractYarnTask:

// todo split handle to submit and track
@Override
public void handle(TaskCallBack taskCallBack) throws TaskException {
	try {
		// SHELL task exit code
		TaskResponse response = shellCommandExecutor.run(buildCommand());
		setExitStatusCode(response.getExitStatusCode());
		// set appIds
		setAppIds(String.join(TaskConstants.COMMA, getApplicationIds()));
		setProcessId(response.getProcessId());
	} catch (InterruptedException ex) {
		Thread.currentThread().interrupt();
		logger.info("The current yarn task has been interrupted", ex);
		setExitStatusCode(TaskConstants.EXIT_CODE_FAILURE);
		throw new TaskException("The current yarn task has been interrupted", ex);
	} catch (Exception e) {
		logger.error("yarn process failure", e);
		exitStatusCode = -1;
		throw new TaskException("Execute task failed", e);
	}
}
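buildCommand() is not shown above; for a Flink task it assembles a flink run command line like the one we will see in the task log later. Here is a simplified, hypothetical sketch of that assembly (the method and parameter names are illustrative, not the real FlinkTask code):

import java.util.ArrayList;
import java.util.List;

public class FlinkCommandSketch {

	// hypothetical helper: joins the pieces into the command string handed to the shell executor
	static String buildCommand(int parallelism, String mainClass, String jarPath, String mainArgs) {
		List<String> args = new ArrayList<>();
		args.add("flink");
		args.add("run");
		args.add("-p");
		args.add(String.valueOf(parallelism));
		args.add("-sae");          // --shutdownOnAttachedExit
		args.add("-c");
		args.add(mainClass);
		args.add(jarPath);
		args.add(mainArgs);
		return String.join(" ", args);
	}

	public static void main(String[] args) {
		// prints the same shape of command that shows up in the task log below
		System.out.println(buildCommand(1,
				"org.apache.flink.streaming.examples.socket.SocketWindowWordCount",
				"flink-example/SocketWindowWordCount.jar",
				"--hostname localhost --port 9999"));
	}
}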

In essence it just hands the command to shellCommandExecutor, which ends up running the Flink task as a shell command:

public TaskResponse run(String execCommand) throws IOException, InterruptedException {
	TaskResponse result = new TaskResponse();
	int taskInstanceId = taskRequest.getTaskInstanceId();
	if (null == TaskExecutionContextCacheManager.getByTaskInstanceId(taskInstanceId)) {
		result.setExitStatusCode(EXIT_CODE_KILL);
		return result;
	}
	if (StringUtils.isEmpty(execCommand)) {
		TaskExecutionContextCacheManager.removeByTaskInstanceId(taskInstanceId);
		return result;
	}

	String commandFilePath = buildCommandFilePath();

	// create command file if not exists
	createCommandFileIfNotExists(execCommand, commandFilePath);

	// build process
	buildProcess(commandFilePath);

	// parse process output
	parseProcessOutput(process);

	int processId = getProcessId(process);

	result.setProcessId(processId);

	// cache processId
	taskRequest.setProcessId(processId);
	boolean updateTaskExecutionContextStatus =
			TaskExecutionContextCacheManager.updateTaskExecutionContext(taskRequest);
	if (Boolean.FALSE.equals(updateTaskExecutionContextStatus)) {
		ProcessUtils.kill(taskRequest);
		result.setExitStatusCode(EXIT_CODE_KILL);
		return result;
	}
	// print process id
	logger.info("process start, process id is: {}", processId);

	// if timeout occurs, exit directly
	long remainTime = getRemainTime();

	// waiting for the run to finish
	boolean status = process.waitFor(remainTime, TimeUnit.SECONDS);

	// if SHELL task exit
	if (status) {

		// SHELL task state
		result.setExitStatusCode(process.exitValue());

	} else {
		logger.error("process has failure, the task timeout configuration value is:{}, ready to kill ...",
				taskRequest.getTaskTimeout());
		ProcessUtils.kill(taskRequest);
		result.setExitStatusCode(EXIT_CODE_FAILURE);
	}
	int exitCode = process.exitValue();
	String exitLogMessage = EXIT_CODE_KILL == exitCode ? "process has killed." : "process has exited.";
	logger.info(exitLogMessage
			+ " execute path:{}, processId:{} ,exitStatusCode:{} ,processWaitForStatus:{} ,processExitValue:{}",
			taskRequest.getExecutePath(), processId, result.getExitStatusCode(), status, exitCode);
	return result;

}
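run() also calls parseProcessOutput(process), which is not shown here. Conceptually it forwards the child process's output into the task instance log, which is where the " -> Job has been submitted with JobID ..." lines below come from. A minimal sketch of that idea (not the actual DolphinScheduler implementation):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ProcessOutputForwarder {

	// read the child process's stdout line by line and forward it into the task log
	static void forward(Process process) {
		Thread reader = new Thread(() -> {
			try (BufferedReader in = new BufferedReader(new InputStreamReader(process.getInputStream()))) {
				String line;
				while ((line = in.readLine()) != null) {
					System.out.println(" -> " + line); // the " -> " prefix seen in the task log
				}
			} catch (IOException e) {
				e.printStackTrace();
			}
		}, "TaskLogOutputForwarder");
		reader.setDaemon(true);
		reader.start();
	}

	public static void main(String[] args) throws Exception {
		Process p = new ProcessBuilder("echo", "Job has been submitted with JobID ...").start();
		forward(p);
		p.waitFor();
		Thread.sleep(200); // give the forwarding thread a moment to flush the last lines
	}
}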

We can see what all of the above does from the execution log:

[INFO] 2023-06-22 23:49:12.520 +0800 - flink task command : flink run -p 1 -sae -c org.apache.flink.streaming.examples.socket.SocketWindowWordCount flink-example/SocketWindowWordCount.jar --hostname localhost --port 9999
[INFO] 2023-06-22 23:49:12.520 +0800 - Begin to create command file:/tmp/dolphinscheduler/exec/process/lizu/9986178674720/9986206731168_3/5/5/5_5.command
[INFO] 2023-06-22 23:49:12.521 +0800 - Success create command file, command: #!/bin/bash
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source /Users/lizu/idea/scheduler/dolphinscheduler/dolphinscheduler-standalone-server/target/classes/dolphinscheduler_env.sh
flink run -p 1 -sae -c org.apache.flink.streaming.examples.socket.SocketWindowWordCount flink-example/SocketWindowWordCount.jar --hostname localhost --port 9999
[INFO] 2023-06-22 23:49:12.526 +0800 - task run command: sudo -u lizu -E bash /tmp/dolphinscheduler/exec/process/lizu/9986178674720/9986206731168_3/5/5/5_5.command
[INFO] 2023-06-22 23:49:12.547 +0800 - process start, process id is: 23572
[INFO] 2023-06-22 23:49:15.560 +0800 -  -> Job has been submitted with JobID 192c0f1b984f2bc10cd2ec6d39525fbb

In other words, the Flink run command and the environment variable setup are written into a script file, which is then executed, for example: sudo -u lizu -E bash /tmp/dolphinscheduler/exec/process/lizu/9986178674720/9986206731168_3/5/5/5_5.command

For a Flink task this launches a CliFrontend process, and the code blocks in process.waitFor until the process exits with a success or failure code.
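Putting these pieces together, here is a simplified, self-contained sketch of the "write the command file, run it as the tenant user, wait for it to finish" flow described above (the tenant name, paths and timeout are placeholders; the real logic lives in createCommandFileIfNotExists and buildProcess):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.concurrent.TimeUnit;

public class CommandFileLaunchSketch {

	public static void main(String[] args) throws IOException, InterruptedException {
		String tenant = "lizu"; // tenant user of the task; used with sudo because sudo.enable=true
		String flinkCommand = "flink run -p 1 -sae -c org.apache.flink.streaming.examples.socket.SocketWindowWordCount "
				+ "flink-example/SocketWindowWordCount.jar --hostname localhost --port 9999";

		// 1. write the .command file: shebang, cd to the execute path, source the env file, run the task
		Path commandFile = Files.createTempFile("task_", ".command");
		Files.write(commandFile, Arrays.asList(
				"#!/bin/bash",
				"BASEDIR=$(cd `dirname $0`; pwd)",
				"cd $BASEDIR",
				"source /path/to/dolphinscheduler_env.sh", // placeholder for the real env file path
				flinkCommand));

		// 2. run it as the tenant user, keeping the environment with -E
		Process process = new ProcessBuilder("sudo", "-u", tenant, "-E", "bash", commandFile.toString())
				.redirectErrorStream(true)
				.start();

		// 3. wait for the task to finish or hit the timeout, like process.waitFor(remainTime, SECONDS) above
		boolean finished = process.waitFor(3600, TimeUnit.SECONDS);
		System.out.println(finished ? "exit code: " + process.exitValue() : "timed out, the task would be killed");
	}
}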

Finally, the afterExecute method notifies the Master of how the task went:

protected void sendTaskResult() {
	taskExecutionContext.setCurrentExecutionStatus(task.getExitStatus());
	taskExecutionContext.setEndTime(new Date());
	taskExecutionContext.setProcessId(task.getProcessId());
	taskExecutionContext.setAppIds(task.getAppIds());
	taskExecutionContext.setVarPool(JSONUtils.toJsonString(task.getParameters().getVarPool()));
	workerMessageSender.sendMessageWithRetry(taskExecutionContext, masterAddress, CommandType.TASK_EXECUTE_RESULT);

	logger.info("Send task execute result to master, the current task status: {}",
			taskExecutionContext.getCurrentExecutionStatus());
}

Related run log

[INFO] 2023-06-22 23:49:15.560 +0800 -  -> Job has been submitted with JobID 192c0f1b984f2bc10cd2ec6d39525fbb
[INFO] 2023-06-23 00:11:35.405 +0800 - process has exited. execute path:/tmp/dolphinscheduler/exec/process/lizu/9986178674720/9986206731168_3/5/5, processId:23572 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1
[INFO] 2023-06-23 00:11:35.411 +0800 - Send task execute result to master, the current task status: TaskExecutionStatus{code=6, desc='failure'}

Summary

That was a brief introduction to how Flink tasks run in DolphinScheduler. If anything here is wrong, please point it out so we can improve together.