The xxl-job Scheduling Center (xxl-job-admin)


Role

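The xxl-job-admin module is a combined front-end and back-end project (front-end JavaScript plus a Spring Boot back end). It is the scheduling center of xxl-job: it manages scheduled jobs and triggers them.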

Startup Flow

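During startup, the scheduling center's back-end service loads the XxlJobAdminConfig configuration class. That class creates an XxlJobScheduler (job scheduler) object and initializes it. Initialization mainly loads configuration and then creates and starts a set of threads that keep running asynchronously in the background, so the periodic housekeeping work does not block anything else. The core code of XxlJobAdminConfig (create an XxlJobScheduler and call its init method) is as follows: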

@Component
public class XxlJobAdminConfig implements InitializingBean, DisposableBean {
   @Override
   public void afterPropertiesSet() throws Exception {
       adminConfig = this;
   
       // create the scheduler and initialize it
       xxlJobScheduler = new XxlJobScheduler();
       xxlJobScheduler.init();
   }
}

Let's now look at the initialization logic of XxlJobScheduler (the job scheduler) in detail. Its init method is as follows:

public class XxlJobScheduler  {
    public void init() throws Exception {
        // init i18n
        initI18n();
    
        // admin trigger pool start
        JobTriggerPoolHelper.toStart();

        // admin registry monitor run
        JobRegistryHelper.getInstance().start();

        // admin fail-monitor run
        JobFailMonitorHelper.getInstance().start();

        // admin lose-monitor run ( depend on JobTriggerPoolHelper)  
        JobCompleteHelper.getInstance().start();

        // admin log report start
        JobLogReportHelper.getInstance().start();

        // start-schedule  ( depend on JobTriggerPoolHelper)
        JobScheduleHelper.getInstance().start();

        logger.info(">>>>>>>>> init xxl-job admin success.");
    }
}

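As the code shows, the init method of XxlJobScheduler has the following 7 parts:
(1) Initialize i18n: set the title of each "executor block strategy"; Simplified Chinese, English and Traditional Chinese are supported.
(2) Create the trigger thread pools: create the fast and slow job trigger pools (including their concrete parameters).
(3) Start the (executor) registry monitor thread: create the registration/removal thread pool and start the registry monitor thread, which removes offline executors and refreshes the address lists of auto-registered executors.
(4) Start the failed-job monitor thread: it handles fail-retry and failure alarms for jobs.
(5) Start the lost-result monitor thread: if a scheduling record has stayed in the "running" state for more than 10 minutes and the corresponding executor's heartbeat registration has lapsed (it is offline), the run is proactively marked as failed.
(6) Start the log report thread: it aggregates run statistics into the xxl_job_log_report table and removes expired logs from xxl_job_log.
(7) Start job scheduling: start the schedule thread (which keeps reading from the database the jobs due within the next 5 seconds and either triggers them immediately or puts them into the time wheel to wait) and the time-wheel thread (which takes jobs from the time wheel and triggers them).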
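Most of the helpers started above share one shape, visible in the listings that follow: a daemon thread loops until a toStop flag is set, does one round of work, then sleeps. The sketch below isolates that shape in plain Java. The class name PeriodicHelperSketch, the round counter and the stop() method are hypothetical and exist only for illustration; the shutdown path (set the flag, interrupt the sleep, join) is an assumption, since the stop side is not shown in this article.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of xxl-job: it only mirrors the
// loop / sleep / stop-flag shape of the admin helper threads.
class PeriodicHelperSketch {
    private volatile boolean toStop = false;
    private final AtomicInteger rounds = new AtomicInteger();
    private Thread worker;

    void start(long sleepMillis) {
        worker = new Thread(() -> {
            while (!toStop) {
                try {
                    rounds.incrementAndGet(); // stand-in for one round of real work
                } catch (Exception e) {
                    if (!toStop) {
                        System.err.println("round failed: " + e);
                    }
                }
                try {
                    TimeUnit.MILLISECONDS.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // woken up by stop(): fall through and re-check the flag
                }
            }
        });
        worker.setDaemon(true); // a daemon thread does not keep the JVM alive
        worker.setName("sketch-helper");
        worker.start();
    }

    // Assumed shutdown path: set the flag, interrupt the sleep, wait for exit.
    void stop() {
        toStop = true;
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int rounds() { return rounds.get(); }

    boolean isAlive() { return worker.isAlive(); }

    public static void main(String[] args) throws Exception {
        PeriodicHelperSketch helper = new PeriodicHelperSketch();
        helper.start(20);
        TimeUnit.MILLISECONDS.sleep(100);
        helper.stop();
        System.out.println("rounds completed: " + helper.rounds());
    }
}
```

Catching exceptions inside the loop is what keeps one failed round from killing the thread; the helpers in this article do the same and only log the error.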

init i18n: initialize internationalization

The code of XxlJobScheduler.initI18n() is as follows:

public class XxlJobScheduler  {
    private void initI18n(){
        for (ExecutorBlockStrategyEnum item:ExecutorBlockStrategyEnum.values()) {
            /**
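             * I18nUtil.getString reads one of the files under resources/i18n/ according to the configuration. That directory holds three properties files, message_en.properties, message_zh_CN.properties and message_zh_TC.properties, for English, Simplified Chinese and Traditional Chinese respectively. The value looked up for each block strategy is assigned to its title. Each of the three files contains three entries whose keys start with jobconf_block.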
             */
            item.setTitle(I18nUtil.getString("jobconf_block_".concat(item.name())));
        }
    }
}

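i18n is short for "internationalization" (its first and last letters, i and n, with 18 standing for the number of letters between them). The initI18n method sets the title of each "executor block strategy".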
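The lookup can be sketched without xxl-job on the classpath. The example below is an illustration under assumptions: BlockStrategy is a hypothetical stand-in for ExecutorBlockStrategyEnum, the constant names and English texts are illustrative, and a Properties object loaded from a string replaces I18nUtil and the real message files. Only the key scheme ("jobconf_block_" plus the enum name) comes from the listing above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class I18nTitleSketch {
    // Hypothetical stand-in for ExecutorBlockStrategyEnum; names are illustrative.
    enum BlockStrategy {
        SERIAL_EXECUTION, DISCARD_LATER, COVER_EARLY;

        private String title;

        String getTitle() { return title; }

        void setTitle(String title) { this.title = title; }
    }

    // Loads messages from text; the real code reads a file under resources/i18n/.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Same key scheme as initI18n(): "jobconf_block_" + enum constant name.
    static void initTitles(Properties messages) {
        for (BlockStrategy item : BlockStrategy.values()) {
            item.setTitle(messages.getProperty("jobconf_block_".concat(item.name())));
        }
    }

    public static void main(String[] args) {
        Properties en = load(
                "jobconf_block_SERIAL_EXECUTION=Serial execution\n"
              + "jobconf_block_DISCARD_LATER=Discard later\n"
              + "jobconf_block_COVER_EARLY=Cover early\n");
        initTitles(en);
        for (BlockStrategy item : BlockStrategy.values()) {
            System.out.println(item.name() + " -> " + item.getTitle());
        }
    }
}
```

Because the titles live on the enum constants themselves, they are set once at startup and every later read sees the configured language.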

admin trigger pool start: create the trigger thread pools

The code of JobTriggerPoolHelper.toStart() is as follows:

public class JobTriggerPoolHelper {
    public static void toStart() {
        helper.start();
    }
    // create the fast and slow trigger thread pools
    public void start(){
        fastTriggerPool = new ThreadPoolExecutor(
                10,
                XxlJobAdminConfig.getAdminConfig().getTriggerPoolFastMax(),
                60L,
                TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(1000),
                new ThreadFactory() {
                    @Override
                    public Thread newThread(Runnable r) {
                        return new Thread(r, "xxl-job, admin JobTriggerPoolHelper-fastTriggerPool-" + r.hashCode());
                    }
                });

        slowTriggerPool = new ThreadPoolExecutor(
                10,
                XxlJobAdminConfig.getAdminConfig().getTriggerPoolSlowMax(),
                60L,
                TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(2000),
                new ThreadFactory() {
                    @Override
                    public Thread newThread(Runnable r) {
                        return new Thread(r, "xxl-job, admin JobTriggerPoolHelper-slowTriggerPool-" + r.hashCode());
                    }
                });
    }
}

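This creates the fast and slow thread pools used to trigger jobs. Each has 10 core threads and a bounded blocking queue, with capacity 1000 for the fast pool and 2000 for the slow pool.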
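With a bounded queue, a ThreadPoolExecutor grows past its core size only after the queue is full, and rejects work once both the queue and the maximum pool size are exhausted. The sketch below makes that visible with small numbers (2 core threads, 4 max, queue of 3) in place of the real 10 / configured max / 1000. The class and method names are hypothetical; the default rejection policy is used because the constructor call in the listing passes none.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class TriggerPoolSketch {
    // Submits blocking tasks until the pool rejects one.
    // Returns {accepted tasks, pool size, queued tasks} at saturation.
    static int[] fill(int core, int max, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        try {
            while (true) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the thread busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // queue full and max threads reached: the default policy rejects
        }
        int[] result = { accepted, pool.getPoolSize(), pool.getQueue().size() };
        release.countDown();
        pool.shutdown();
        return result;
    }

    public static void main(String[] args) {
        int[] r = fill(2, 4, 3);
        System.out.println("accepted=" + r[0] + ", threads=" + r[1] + ", queued=" + r[2]);
    }
}
```

For fill(2, 4, 3) the pool accepts 7 tasks: 2 go to core threads, 3 wait in the queue, and 2 more force the pool up to its maximum of 4 threads. Applied to the fast pool above, extra threads beyond the 10 core ones appear only once 1000 triggers are already queued.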

admin registry monitor run: start the (executor) registry monitor thread

The code of JobRegistryHelper.getInstance().start() is as follows:

public class JobRegistryHelper {
	public void start(){
		// for registry or remove: thread pool that handles registration/removal requests
		registryOrRemoveThreadPool = new ThreadPoolExecutor(...);

		// for monitor: the (executor) registry monitor thread (removes offline executors, refreshes the address lists of auto-registered executors)
		registryMonitorThread = new Thread(new Runnable() {
			@Override
			public void run() {
				while (!toStop) {
					try {
						// auto registry group: load the auto-registered executor groups from xxl_job_group
						List<XxlJobGroup> groupList = XxlJobAdminConfig.getAdminConfig().getXxlJobGroupDao().findByAddressType(0);
						if (groupList!=null && !groupList.isEmpty()) {

							// remove dead address (admin/executor): find the registry records of executors that have gone offline in xxl_job_registry
							List<Integer> ids = XxlJobAdminConfig.getAdminConfig().getXxlJobRegistryDao().findDead(RegistryConfig.DEAD_TIMEOUT, new Date());
							// delete the records of offline executors from xxl_job_registry
							if (ids!=null && ids.size()>0) {
								XxlJobAdminConfig.getAdminConfig().getXxlJobRegistryDao().removeDead(ids);
							}

							// fresh online address (admin/executor)
							// read the address list of each executor from xxl_job_registry; appAddressMap maps appname (executor name) to the list of live executor addresses
							HashMap<String, List<String>> appAddressMap = new HashMap<String, List<String>>();
							List<XxlJobRegistry> list = XxlJobAdminConfig.getAdminConfig().getXxlJobRegistryDao().findAll(RegistryConfig.DEAD_TIMEOUT, new Date());
							if (list != null) {
								for (XxlJobRegistry item: list) {
									if (RegistryConfig.RegistType.EXECUTOR.name().equals(item.getRegistryGroup())) {
										String appname = item.getRegistryKey();
										List<String> registryList = appAddressMap.get(appname);
										if (registryList == null) {
											registryList = new ArrayList<String>();
										}

										if (!registryList.contains(item.getRegistryValue())) {
											registryList.add(item.getRegistryValue());
										}
										appAddressMap.put(appname, registryList);
									}
								}
							}

							// fresh group address: write the address list read from xxl_job_registry for each (auto-registered) executor back to xxl_job_group
							for (XxlJobGroup group: groupList) {
								List<String> registryList = appAddressMap.get(group.getAppname());
								String addressListStr = null;
								if (registryList!=null && !registryList.isEmpty()) {
									Collections.sort(registryList);
									StringBuilder addressListSB = new StringBuilder();
									for (String item:registryList) {
										addressListSB.append(item).append(",");
									}
									addressListStr = addressListSB.toString();
									addressListStr = addressListStr.substring(0, addressListStr.length()-1);
								}
								group.setAddressList(addressListStr);
								group.setUpdateTime(new Date());

								XxlJobAdminConfig.getAdminConfig().getXxlJobGroupDao().update(group);
							}
						}
					} catch (Exception e) {
						if (!toStop) {
							logger.error(">>>>>>>>>>> xxl-job, job registry monitor thread error:{}", e);
						}
					}
					try {
						TimeUnit.SECONDS.sleep(RegistryConfig.BEAT_TIMEOUT);
					} catch (InterruptedException e) {
						if (!toStop) {
							logger.error(">>>>>>>>>>> xxl-job, job registry monitor thread error:{}", e);
						}
					}
				}
				logger.info(">>>>>>>>>>> xxl-job, job registry monitor thread stop");
			}
		});
		registryMonitorThread.setDaemon(true);
		registryMonitorThread.setName("xxl-job, admin JobRegistryMonitorHelper-registryMonitorThread");
		registryMonitorThread.start();
	}
}

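This mainly creates the registry monitor thread. In each round the thread first deletes expired executor records: executors periodically send a heartbeat to the admin scheduling center, which refreshes their registry record (the address is stored in registry_value); if an executor sends no heartbeat for a long time, the scheduling center considers it offline (dead) and deletes its record from xxl_job_registry. The thread then reads the live addresses of each executor from xxl_job_registry and writes them to the address_list field of the auto-registered executors in xxl_job_group, so that address_list always holds the latest addresses of auto-registered executors. Note that for manually registered executors the address is entered by hand when the executor is created, and this thread never updates it.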
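The address refresh step boils down to: keep only EXECUTOR rows, group addresses by appname, de-duplicate, sort, and join with commas. The sketch below shows that transformation on in-memory data. RegistryRow is a hypothetical stand-in for XxlJobRegistry, and a TreeSet is swapped in for the listing's contains() check plus Collections.sort; the resulting string is the same.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class RegistryRefreshSketch {
    // Hypothetical stand-in for one live row of xxl_job_registry.
    static final class RegistryRow {
        final String registryGroup; // e.g. "EXECUTOR" or "ADMIN"
        final String registryKey;   // appname
        final String registryValue; // executor address

        RegistryRow(String registryGroup, String registryKey, String registryValue) {
            this.registryGroup = registryGroup;
            this.registryKey = registryKey;
            this.registryValue = registryValue;
        }
    }

    // Returns appname -> "addr1,addr2,..." (sorted, no duplicates).
    static Map<String, String> addressListByApp(List<RegistryRow> rows) {
        Map<String, TreeSet<String>> byApp = new HashMap<>();
        for (RegistryRow row : rows) {
            if (!"EXECUTOR".equals(row.registryGroup)) {
                continue; // only executor registrations feed address_list
            }
            byApp.computeIfAbsent(row.registryKey, k -> new TreeSet<>())
                 .add(row.registryValue);
        }
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, TreeSet<String>> e : byApp.entrySet()) {
            result.put(e.getKey(), String.join(",", e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<RegistryRow> rows = Arrays.asList(
                new RegistryRow("EXECUTOR", "app-a", "http://10.0.0.2:9999/"),
                new RegistryRow("EXECUTOR", "app-a", "http://10.0.0.1:9999/"),
                new RegistryRow("EXECUTOR", "app-a", "http://10.0.0.2:9999/"),
                new RegistryRow("ADMIN", "other", "http://10.0.0.9:8080/"));
        System.out.println(addressListByApp(rows));
    }
}
```

A group whose appname has no live rows gets no entry in the map, which corresponds to the listing writing a null address list for that group.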

admin fail-monitor run: start the failed-job monitor thread

The code of JobFailMonitorHelper.getInstance().start() is as follows:

public class JobFailMonitorHelper {
	public void start(){
		monitorThread = new Thread(new Runnable() {

			@Override
			public void run() {

				// monitor
				while (!toStop) {
					try {

						// fetch the ids of failed job logs from xxl_job_log, at most 1000
						List<Long> failLogIds = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findFailJobLogIds(1000);
						if (failLogIds!=null && !failLogIds.isEmpty()) {
							for (long failLogId: failLogIds) {

								// lock log: switch the alarm status from default (0) to locked (-1)
								int lockRet = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateAlarmStatus(failLogId, 0, -1);
								if (lockRet < 1) {
									continue;
								}
								XxlJobLog log = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().load(failLogId);
								XxlJobInfo info = XxlJobAdminConfig.getAdminConfig().getXxlJobInfoDao().loadById(log.getJobId());

								// 1. fail retry monitor
								// if the fail-retry count is greater than 0, trigger the job again
								if (log.getExecutorFailRetryCount() > 0) {
									JobTriggerPoolHelper.trigger(log.getJobId(), TriggerTypeEnum.RETRY, (log.getExecutorFailRetryCount()-1), log.getExecutorShardingParam(), log.getExecutorParam(), null);
									String retryMsg = "<br><br><span style=\"color:#F39C12;\" > >>>>>>>>>>>"+ I18nUtil.getString("jobconf_trigger_type_retry") +"<<<<<<<<<<< </span><br>";
									log.setTriggerMsg(log.getTriggerMsg() + retryMsg);
									XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateTriggerInfo(log);
								}

								// 2. fail alarm monitor
								int newAlarmStatus = 0;		// alarm status: 0 = default, -1 = locked, 1 = no alarm needed, 2 = alarm succeeded, 3 = alarm failed
								if (info != null) {
									boolean alarmResult = XxlJobAdminConfig.getAdminConfig().getJobAlarmer().alarm(info, log);
									newAlarmStatus = alarmResult?2:3;
								} else {
									newAlarmStatus = 1;
								}

								// move the locked log to its final alarm status, newAlarmStatus
								XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateAlarmStatus(failLogId, -1, newAlarmStatus);
							}
						}

					} catch (Exception e) {
						if (!toStop) {
							logger.error(">>>>>>>>>>> xxl-job, job fail monitor thread error:{}", e);
						}
					}

                    try {
                        TimeUnit.SECONDS.sleep(10);
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(e.getMessage(), e);
                        }
                    }

                }

				logger.info(">>>>>>>>>>> xxl-job, job fail monitor thread stop");

			}
		});
		monitorThread.setDaemon(true);
		monitorThread.setName("xxl-job, admin JobFailMonitorHelper");
		monitorThread.start();
	}
}

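The thread fetches the logs of failed jobs from xxl_job_log (at most 1000 per round, once every 10 seconds). For each log whose alarm status is still the default (0), it locks the log by switching the status to -1, loads the log and then the job details by jobId, re-triggers the job if the log's fail-retry count is greater than 0, sends a failure alarm for the log, and finally updates the log's alarm status to the result.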
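The call updateAlarmStatus(failLogId, 0, -1) acts as a lock because it only succeeds when the status is still 0, so a log is handled at most once even if several monitors see the same id. The sketch below imitates that compare-and-set with an in-memory map. It assumes the DAO method behaves like a conditional update that returns the number of affected rows (the mapper SQL is not shown in this article); AlarmLockSketch and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

class AlarmLockSketch {
    // logId -> alarm status (0 default, -1 locked, 2 alarm ok, 3 alarm failed)
    private final ConcurrentHashMap<Long, Integer> alarmStatus = new ConcurrentHashMap<>();

    void addFailedLog(long logId) {
        alarmStatus.put(logId, 0);
    }

    // In-memory analogue of a conditional update: changes the status only if
    // it currently equals oldStatus, and reports how many "rows" were changed.
    int updateAlarmStatus(long logId, int oldStatus, int newStatus) {
        return alarmStatus.replace(logId, oldStatus, newStatus) ? 1 : 0;
    }

    // One pass over a failed log; returns false if another pass got there first.
    boolean process(long logId, boolean alarmSent) {
        if (updateAlarmStatus(logId, 0, -1) < 1) {
            return false; // already locked or already handled
        }
        int newStatus = alarmSent ? 2 : 3;
        updateAlarmStatus(logId, -1, newStatus);
        return true;
    }

    int status(long logId) {
        return alarmStatus.get(logId);
    }

    public static void main(String[] args) {
        AlarmLockSketch logs = new AlarmLockSketch();
        logs.addFailedLog(7L);
        System.out.println("first pass handled: " + logs.process(7L, true));
        System.out.println("second pass handled: " + logs.process(7L, true));
        System.out.println("final status: " + logs.status(7L));
    }
}
```

The same idea carries over to a database: the row's own status column serves as the lock, so no separate locking table is needed.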

admin lose-monitor run: start the lost-result monitor thread

The code of JobCompleteHelper.getInstance().start() is as follows:

public class JobCompleteHelper {
   public void start(){

      // for callback: create the callback thread pool
      callbackThreadPool = new ThreadPoolExecutor(...);


      // for monitor
      monitorThread = new Thread(new Runnable() {

         @Override
         public void run() {

            // wait for JobTriggerPoolHelper-init
            try {
               TimeUnit.MILLISECONDS.sleep(50);
            } catch (InterruptedException e) {
               if (!toStop) {
                  logger.error(e.getMessage(), e);
               }
            }

            // monitor
            while (!toStop) {
               try {
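                  // lost-result handling: if a scheduling record has stayed in the "running" state for more than 10 minutes and the executor's heartbeat registration has lapsed (it is offline), proactively mark this run as failed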
                  Date losedTime = DateUtil.addMinutes(new Date(), -10);
                  // todo: this variable should be named losedLogIds, since it holds log ids rather than job ids
                  List<Long> losedJobIds  = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findLostJobIds(losedTime);

                  if (losedJobIds!=null && losedJobIds.size()>0) {
                     for (Long logId: losedJobIds) {

                        XxlJobLog jobLog = new XxlJobLog();
                        jobLog.setId(logId);

                        jobLog.setHandleTime(new Date());
                        jobLog.setHandleCode(ReturnT.FAIL_CODE);
                        jobLog.setHandleMsg( I18nUtil.getString("joblog_lost_fail") );

                        XxlJobCompleter.updateHandleInfoAndFinish(jobLog);
                     }

                  }
               } catch (Exception e) {
                  if (!toStop) {
                     logger.error(">>>>>>>>>>> xxl-job, job fail monitor thread error:{}", e);
                  }
               }

                    try {
                        TimeUnit.SECONDS.sleep(60);
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(e.getMessage(), e);
                        }
                    }

                }

            logger.info(">>>>>>>>>>> xxl-job, JobLosedMonitorHelper stop");

         }
      });
      monitorThread.setDaemon(true);
      monitorThread.setName("xxl-job, admin JobLosedMonitorHelper");
      monitorThread.start();
   }
}

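The lost-result monitor thread runs once a minute. When a scheduling log in xxl_job_log has stayed in the "running" state for more than 10 minutes and the executor address recorded in that log no longer exists in xxl_job_registry (the executor's heartbeat registration has lapsed and it is offline), the thread proactively marks that run as failed and finishes the job.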
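The rule for a lost result combines two conditions, and both must hold. The sketch below expresses it as a plain filter over in-memory rows. It is an illustration under assumptions: LogRow and its fields are hypothetical stand-ins for xxl_job_log columns, and the real check is done by the findLostJobIds query, whose SQL is not shown in this article.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LostJobSketch {
    // Hypothetical stand-in for a row of xxl_job_log.
    static final class LogRow {
        final long id;
        final Instant triggerTime;
        final boolean running;        // triggered, but no result reported yet
        final String executorAddress;

        LogRow(long id, Instant triggerTime, boolean running, String executorAddress) {
            this.id = id;
            this.triggerTime = triggerTime;
            this.running = running;
            this.executorAddress = executorAddress;
        }
    }

    // A log is "lost" when it has been running for at least 10 minutes
    // AND its executor address is no longer registered.
    static List<Long> findLostLogIds(List<LogRow> logs, Set<String> onlineAddresses, Instant now) {
        Instant lostTime = now.minus(Duration.ofMinutes(10));
        List<Long> lostIds = new ArrayList<>();
        for (LogRow log : logs) {
            boolean stuck = log.running && !log.triggerTime.isAfter(lostTime);
            boolean executorOffline = !onlineAddresses.contains(log.executorAddress);
            if (stuck && executorOffline) {
                lostIds.add(log.id);
            }
        }
        return lostIds;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-01T12:00:00Z");
        List<LogRow> logs = Arrays.asList(
                new LogRow(1, Instant.parse("2024-01-01T11:40:00Z"), true, "http://10.0.0.1:9999/"),
                new LogRow(2, Instant.parse("2024-01-01T11:40:00Z"), true, "http://10.0.0.2:9999/"),
                new LogRow(3, Instant.parse("2024-01-01T11:55:00Z"), true, "http://10.0.0.1:9999/"));
        Set<String> online = new HashSet<>(Arrays.asList("http://10.0.0.2:9999/"));
        System.out.println("lost log ids: " + findLostLogIds(logs, online, now));
    }
}
```

Requiring the executor to be offline keeps a long-running job on a healthy executor from being marked as failed just because it takes more than 10 minutes.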

admin log report start: start the log report thread

The code of JobLogReportHelper.getInstance().start() is as follows:

public class JobLogReportHelper {
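    // First, for each of the last three days, count the triggered, running, successful and failed runs and save the counts to the database.
    // Second, based on the configured log retention period, find the logs that have expired and delete them.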
    public void start(){
        logrThread = new Thread(new Runnable() {

            @Override
            public void run() {

                // last clean log time
                long lastCleanLogTime = 0;


                while (!toStop) {

                    // 1、log-report refresh: refresh log report in 3 days
                    try {

                        for (int i = 0; i < 3; i++) {

                            // today
                            Calendar itemDay = Calendar.getInstance();
                            itemDay.add(Calendar.DAY_OF_MONTH, -i);
                            itemDay.set(Calendar.HOUR_OF_DAY, 0);
                            itemDay.set(Calendar.MINUTE, 0);
                            itemDay.set(Calendar.SECOND, 0);
                            itemDay.set(Calendar.MILLISECOND, 0);
                            // 00:00:00:000-23:59:59:999
                            Date todayFrom = itemDay.getTime();

                            itemDay.set(Calendar.HOUR_OF_DAY, 23);
                            itemDay.set(Calendar.MINUTE, 59);
                            itemDay.set(Calendar.SECOND, 59);
                            itemDay.set(Calendar.MILLISECOND, 999);

                            Date todayTo = itemDay.getTime();

                            // refresh log-report every minute
                            XxlJobLogReport xxlJobLogReport = new XxlJobLogReport();
                            xxlJobLogReport.setTriggerDay(todayFrom);
                            xxlJobLogReport.setRunningCount(0);
                            xxlJobLogReport.setSucCount(0);
                            xxlJobLogReport.setFailCount(0);

                            Map<String, Object> triggerCountMap = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findLogReport(todayFrom, todayTo);  // 00:00:00:000-23:59:59:999
                            if (triggerCountMap!=null && triggerCountMap.size()>0) {
                                int triggerDayCount = triggerCountMap.containsKey("triggerDayCount")?Integer.valueOf(String.valueOf(triggerCountMap.get("triggerDayCount"))):0;
                                int triggerDayCountRunning = triggerCountMap.containsKey("triggerDayCountRunning")?Integer.valueOf(String.valueOf(triggerCountMap.get("triggerDayCountRunning"))):0;
                                int triggerDayCountSuc = triggerCountMap.containsKey("triggerDayCountSuc")?Integer.valueOf(String.valueOf(triggerCountMap.get("triggerDayCountSuc"))):0;
                                int triggerDayCountFail = triggerDayCount - triggerDayCountRunning - triggerDayCountSuc;

                                xxlJobLogReport.setRunningCount(triggerDayCountRunning);
                                xxlJobLogReport.setSucCount(triggerDayCountSuc);
                                xxlJobLogReport.setFailCount(triggerDayCountFail);
                            }

                            // do refresh: update the xxl_job_log_report row, or insert it if it does not exist yet
                            int ret = XxlJobAdminConfig.getAdminConfig().getXxlJobLogReportDao().update(xxlJobLogReport);
                            if (ret < 1) {
                                XxlJobAdminConfig.getAdminConfig().getXxlJobLogReportDao().save(xxlJobLogReport);
                            }
                        }

                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(">>>>>>>>>>> xxl-job, job log report thread error:{}", e);
                        }
                    }

                    // 2、log-clean: switch open & once each day
                    // delete expired logs from xxl_job_log
                    if (XxlJobAdminConfig.getAdminConfig().getLogretentiondays()>0
                            && System.currentTimeMillis() - lastCleanLogTime > 24*60*60*1000) {

                        // expire-time
                        Calendar expiredDay = Calendar.getInstance();
                        expiredDay.add(Calendar.DAY_OF_MONTH, -1 * XxlJobAdminConfig.getAdminConfig().getLogretentiondays());
                        expiredDay.set(Calendar.HOUR_OF_DAY, 0);
                        expiredDay.set(Calendar.MINUTE, 0);
                        expiredDay.set(Calendar.SECOND, 0);
                        expiredDay.set(Calendar.MILLISECOND, 0);
                        Date clearBeforeTime = expiredDay.getTime();

                        // clean expired log
                        List<Long> logIds = null;
                        do {
                            logIds = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findClearLogIds(0, 0, clearBeforeTime, 0, 1000);
                            if (logIds!=null && logIds.size()>0) {
                                XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().clearLog(logIds);
                            }
                        } while (logIds!=null && logIds.size()>0);

                        // update clean time
                        lastCleanLogTime = System.currentTimeMillis();
                    }

                    try {
                        TimeUnit.MINUTES.sleep(1);
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(e.getMessage(), e);
                        }
                    }

                }

                logger.info(">>>>>>>>>>> xxl-job, job log report thread stop");

            }
        });
        logrThread.setDaemon(true);
        logrThread.setName("xxl-job, admin JobLogReportHelper");
        logrThread.start();
    }
}

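The log report thread runs once a minute. It counts the triggered, running, successful and failed runs for each of the last three days (today and the two days before it) and saves the result to the xxl_job_log_report table. If log retention is enabled (logretentiondays greater than 0) and more than a day has passed since the last cleanup, it also deletes the expired logs from xxl_job_log.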
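The arithmetic in this thread is easy to check in isolation: which days are refreshed, how the fail count is derived, and when cleanup is allowed to run. The sketch below restates those three rules. It uses java.time in place of the listing's Calendar code, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

class LogReportSketch {
    private static final long ONE_DAY_MILLIS = 24L * 60 * 60 * 1000;

    // The days whose report rows are refreshed each round: today and the two before.
    static List<LocalDate> reportDays(LocalDate today) {
        List<LocalDate> days = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            days.add(today.minusDays(i));
        }
        return days;
    }

    // The fail count is not queried directly: it is what remains of the total.
    static int failCount(int triggered, int running, int succeeded) {
        return triggered - running - succeeded;
    }

    // Cleanup runs only if retention is switched on and the last run is over a day old.
    static boolean shouldClean(int retentionDays, long lastCleanMillis, long nowMillis) {
        return retentionDays > 0 && nowMillis - lastCleanMillis > ONE_DAY_MILLIS;
    }

    // Logs triggered before this instant (midnight, retentionDays ago) are expired.
    static LocalDateTime clearBefore(LocalDate today, int retentionDays) {
        return today.minusDays(retentionDays).atStartOfDay();
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2024, 3, 1);
        System.out.println("report days: " + reportDays(today));
        System.out.println("fail count: " + failCount(10, 2, 5));
        System.out.println("clear before: " + clearBefore(today, 30));
        System.out.println("clean now: " + shouldClean(30, 0L, System.currentTimeMillis()));
    }
}
```

Since lastCleanLogTime starts at 0 in the listing, the first round after startup always passes the one-day check, so cleanup runs right away whenever retention is enabled.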

start-schedule: start the job scheduling thread