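Role
The xxl-job-admin module is a combined front-end and back-end project (front-end JavaScript plus a Spring Boot back end). It is the xxl-job scheduling center, which manages and dispatches scheduled jobs.
Startup Flow
When the scheduling center's back-end service starts, it loads the XxlJobAdminConfig configuration class. That class creates an XxlJobScheduler (job scheduler) object and runs its initialization. The initialization mainly loads configuration and creates and starts a set of threads that keep running asynchronously in the background, which improves performance. The core code of XxlJobAdminConfig, which creates an XxlJobScheduler and calls its init method, is as follows: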
@Component
public class XxlJobAdminConfig implements InitializingBean, DisposableBean {
    @Override
    public void afterPropertiesSet() throws Exception {
        adminConfig = this;
        // create the scheduler and initialize it
        xxlJobScheduler = new XxlJobScheduler();
        xxlJobScheduler.init();
    }
}
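The assignment adminConfig = this is what lets plain helper classes, which Spring does not manage, reach the injected configuration through a static getter such as XxlJobAdminConfig.getAdminConfig() (it appears throughout the code later in this article). Below is a minimal sketch of that pattern without Spring. AdminConfigSketch and its single field are hypothetical stand-ins, not the real class.

```java
// Hypothetical stand-in for XxlJobAdminConfig; no Spring dependency.
class AdminConfigSketch {
    // Static handle, published once the object is fully configured.
    private static volatile AdminConfigSketch instance;

    private final int triggerPoolFastMax;

    AdminConfigSketch(int triggerPoolFastMax) {
        this.triggerPoolFastMax = triggerPoolFastMax;
    }

    // Plays the role of InitializingBean.afterPropertiesSet():
    // the container would call this after injecting all fields.
    void afterPropertiesSet() {
        instance = this;
    }

    // Static accessor usable from classes that are not Spring beans.
    static AdminConfigSketch getAdminConfig() {
        return instance;
    }

    int getTriggerPoolFastMax() {
        return triggerPoolFastMax;
    }

    public static void main(String[] args) {
        new AdminConfigSketch(200).afterPropertiesSet();
        System.out.println(AdminConfigSketch.getAdminConfig().getTriggerPoolFastMax());
    }
}
```

The trade-off of this design is that the getter returns null until initialization has run, so callers must only be started from inside or after afterPropertiesSet, which is exactly what the scheduler's init call does.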
Next we look at the initialization logic of XxlJobScheduler (the job scheduler) in detail. Its init method is as follows:
public class XxlJobScheduler {
    public void init() throws Exception {
        // init i18n
        initI18n();
        // admin trigger pool start
        JobTriggerPoolHelper.toStart();
        // admin registry monitor run
        JobRegistryHelper.getInstance().start();
        // admin fail-monitor run
        JobFailMonitorHelper.getInstance().start();
        // admin lose-monitor run ( depend on JobTriggerPoolHelper)
        JobCompleteHelper.getInstance().start();
        // admin log report start
        JobLogReportHelper.getInstance().start();
        // start-schedule ( depend on JobTriggerPoolHelper)
        JobScheduleHelper.getInstance().start();
        logger.info(">>>>>>>>> init xxl-job admin success.");
    }
}
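As the code shows, the init method of XxlJobScheduler consists of the following seven steps:
(1) Initialize i18n: set the title of each "executor block strategy" value. Simplified Chinese, English, and Traditional Chinese are supported.
(2) Create the trigger thread pools: create the fast and slow job-trigger thread pools (including their concrete parameters).
(3) Start the executor registry monitor thread: create the thread pool that handles executor registration and removal, then start the registry monitor thread, which deletes offline executors and refreshes the address lists of auto-registered executors.
(4) Start the failed-job monitor thread: it performs failure retries, failure alarms, and so on.
(5) Start the lost-result monitor thread: if a scheduling record stays in the "running" state for more than 10 minutes and the corresponding executor is offline (its heartbeat registration has lapsed), the record is proactively marked as failed.
(6) Start the log report thread: it aggregates job run statistics into the xxl_job_log_report table and purges expired logs from xxl_job_log.
(7) Start job scheduling: start the schedule thread (which keeps reading jobs due within the next 5 seconds from the database and either triggers them immediately or puts them on the time wheel) and the time-wheel thread (which takes jobs off the time wheel and triggers them).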
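The order of these steps matters, because the code comments note that the lose-monitor and the scheduler depend on JobTriggerPoolHelper. A common way to honor such dependencies is to start components in order and stop them in reverse. The sketch below only illustrates that idea. OrderedLifecycleSketch is a hypothetical class, and the assumption that shutdown mirrors startup in reverse is not shown in the code above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: starts components in order, stops them in reverse order.
class OrderedLifecycleSketch {
    // Stack of stop hooks; the most recently started component is on top.
    private final Deque<Runnable> stopHooks = new ArrayDeque<>();

    void start(Runnable startAction, Runnable stopAction) {
        startAction.run();
        stopHooks.push(stopAction);
    }

    void stopAll() {
        while (!stopHooks.isEmpty()) {
            stopHooks.pop().run();
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        OrderedLifecycleSketch lifecycle = new OrderedLifecycleSketch();
        lifecycle.start(() -> events.add("start trigger pool"), () -> events.add("stop trigger pool"));
        lifecycle.start(() -> events.add("start scheduler"), () -> events.add("stop scheduler"));
        lifecycle.stopAll();
        System.out.println(events);
    }
}
```

Stopping in reverse means the scheduler, which submits work to the trigger pool, is shut down before the pool it depends on.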
init i18n: initializing internationalization
The XxlJobScheduler.initI18n() method is as follows:
public class XxlJobScheduler {
    private void initI18n() {
        for (ExecutorBlockStrategyEnum item : ExecutorBlockStrategyEnum.values()) {
            /**
             * I18nUtil.getString reads one of the files under resources/i18n/ according to the configuration.
             * The directory contains message_en.properties, message_zh_CN.properties and message_zh_TC.properties,
             * the property files for English, Simplified Chinese and Traditional Chinese.
             * The value returned for the block strategy is assigned to title.
             * Each of the three files has three entries whose keys start with jobconf_block.
             */
            item.setTitle(I18nUtil.getString("jobconf_block_".concat(item.name())));
        }
    }
}
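i18n is short for "internationalization" (the first and last letters, i and n, with 18 standing for the number of letters in between). The initI18n method simply sets the title of each "executor block strategy" value.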
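To make the lookup concrete, here is a minimal sketch of the same idea using only java.util.Properties. I18nTitleSketch and its BlockStrategy enum are hypothetical stand-ins for I18nUtil and ExecutorBlockStrategyEnum, and the property values in main are illustrative rather than copied from the real files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class I18nTitleSketch {
    // Stand-in for ExecutorBlockStrategyEnum: each constant carries a localized title.
    enum BlockStrategy {
        SERIAL_EXECUTION, DISCARD_LATER, COVER_EARLY;

        private String title;

        String getTitle() { return title; }
        void setTitle(String title) { this.title = title; }
    }

    // Parses properties text; in a real application this would be a file per locale.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Same shape as initI18n: key = "jobconf_block_" + enum constant name.
    static void initTitles(Properties messages) {
        for (BlockStrategy item : BlockStrategy.values()) {
            item.setTitle(messages.getProperty("jobconf_block_".concat(item.name())));
        }
    }

    public static void main(String[] args) {
        initTitles(load("jobconf_block_SERIAL_EXECUTION=Serial execution\n"
                + "jobconf_block_DISCARD_LATER=Discard Later\n"
                + "jobconf_block_COVER_EARLY=Cover Early\n"));
        for (BlockStrategy item : BlockStrategy.values()) {
            System.out.println(item.name() + " -> " + item.getTitle());
        }
    }
}
```

Because the key is derived from the enum constant name, adding a new strategy only requires a new constant plus one entry per language file.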
admin trigger pool start: creating the trigger thread pools
The JobTriggerPoolHelper.toStart() method is as follows:
public class JobTriggerPoolHelper {
    public static void toStart() {
        helper.start();
    }
    // create the fast and slow trigger thread pools
    public void start() {
        fastTriggerPool = new ThreadPoolExecutor(
                10,
                XxlJobAdminConfig.getAdminConfig().getTriggerPoolFastMax(),
                60L,
                TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(1000),
                new ThreadFactory() {
                    @Override
                    public Thread newThread(Runnable r) {
                        return new Thread(r, "xxl-job, admin JobTriggerPoolHelper-fastTriggerPool-" + r.hashCode());
                    }
                });
        slowTriggerPool = new ThreadPoolExecutor(
                10,
                XxlJobAdminConfig.getAdminConfig().getTriggerPoolSlowMax(),
                60L,
                TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(2000),
                new ThreadFactory() {
                    @Override
                    public Thread newThread(Runnable r) {
                        return new Thread(r, "xxl-job, admin JobTriggerPoolHelper-slowTriggerPool-" + r.hashCode());
                    }
                });
    }
}
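This code creates the fast and slow thread pools used to trigger scheduled jobs. Each pool has 10 core threads and a 60-second keep-alive, and they are backed by blocking queues with capacities of 1000 and 2000 respectively.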
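The code above only creates the two pools; it does not show how a trigger is routed to one or the other. One plausible rule, sketched below, is to count how often a job's trigger was slow within the current minute and send repeat offenders to the slow pool so they cannot starve fast jobs. TriggerPoolRouterSketch, its method names, and both thresholds (500 ms, 10 occurrences) are assumptions made for illustration, not something the excerpt confirms.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical router deciding between a fast and a slow trigger pool.
class TriggerPoolRouterSketch {
    enum Pool { FAST, SLOW }

    static final long SLOW_COST_MS = 500;   // assumed: a trigger slower than this counts as slow
    static final int SLOW_COUNT_LIMIT = 10; // assumed: more than this many slow triggers per minute

    private volatile long currentMinute = -1;
    private final Map<Integer, AtomicInteger> slowCounts = new ConcurrentHashMap<>();

    // Jobs that were slow too often in the current minute go to the slow pool.
    Pool choosePool(int jobId) {
        AtomicInteger count = slowCounts.get(jobId);
        return (count != null && count.get() > SLOW_COUNT_LIMIT) ? Pool.SLOW : Pool.FAST;
    }

    // Called after each trigger with its measured cost; counters reset when the minute changes.
    void recordTrigger(int jobId, long costMs, long nowMs) {
        long minute = nowMs / 60_000;
        if (minute != currentMinute) {
            currentMinute = minute;
            slowCounts.clear();
        }
        if (costMs > SLOW_COST_MS) {
            slowCounts.computeIfAbsent(jobId, k -> new AtomicInteger()).incrementAndGet();
        }
    }

    public static void main(String[] args) {
        TriggerPoolRouterSketch router = new TriggerPoolRouterSketch();
        for (int i = 0; i < 11; i++) {
            router.recordTrigger(1, 600, 0);
        }
        System.out.println(router.choosePool(1)); // job 1 was slow 11 times in the same minute
    }
}
```

Passing the clock in as nowMs keeps the sketch deterministic; production code would read System.currentTimeMillis() instead.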
admin registry monitor run: starting the executor registry monitor thread
The JobRegistryHelper.getInstance().start() method is as follows:
public class JobRegistryHelper {
    public void start() {
        // for registry or remove: thread pool that handles executor registry/remove requests
        registryOrRemoveThreadPool = new ThreadPoolExecutor(...);
        // for monitor: registry monitor thread (deletes offline executors, refreshes the address lists of auto-registered executors)
        registryMonitorThread = new Thread(new Runnable() {
            @Override
            public void run() {
                while (!toStop) {
                    try {
                        // auto registry group: load the auto-registered executor groups from xxl_job_group
                        List<XxlJobGroup> groupList = XxlJobAdminConfig.getAdminConfig().getXxlJobGroupDao().findByAddressType(0);
                        if (groupList!=null && !groupList.isEmpty()) {
                            // remove dead address (admin/executor): find the registry records of offline executors in xxl_job_registry
                            List<Integer> ids = XxlJobAdminConfig.getAdminConfig().getXxlJobRegistryDao().findDead(RegistryConfig.DEAD_TIMEOUT, new Date());
                            // delete the registry records of offline executors from xxl_job_registry
                            if (ids!=null && ids.size()>0) {
                                XxlJobAdminConfig.getAdminConfig().getXxlJobRegistryDao().removeDead(ids);
                            }
                            // fresh online address (admin/executor)
                            // read the address list of every executor from xxl_job_registry; in appAddressMap the key is the appname (executor name) and the value is the list of live executor addresses
                            HashMap<String, List<String>> appAddressMap = new HashMap<String, List<String>>();
                            List<XxlJobRegistry> list = XxlJobAdminConfig.getAdminConfig().getXxlJobRegistryDao().findAll(RegistryConfig.DEAD_TIMEOUT, new Date());
                            if (list != null) {
                                for (XxlJobRegistry item: list) {
                                    if (RegistryConfig.RegistType.EXECUTOR.name().equals(item.getRegistryGroup())) {
                                        String appname = item.getRegistryKey();
                                        List<String> registryList = appAddressMap.get(appname);
                                        if (registryList == null) {
                                            registryList = new ArrayList<String>();
                                        }
                                        if (!registryList.contains(item.getRegistryValue())) {
                                            registryList.add(item.getRegistryValue());
                                        }
                                        appAddressMap.put(appname, registryList);
                                    }
                                }
                            }
                            // fresh group address: write the address list of each auto-registered executor from xxl_job_registry back to xxl_job_group
                            for (XxlJobGroup group: groupList) {
                                List<String> registryList = appAddressMap.get(group.getAppname());
                                String addressListStr = null;
                                if (registryList!=null && !registryList.isEmpty()) {
                                    Collections.sort(registryList);
                                    StringBuilder addressListSB = new StringBuilder();
                                    for (String item:registryList) {
                                        addressListSB.append(item).append(",");
                                    }
                                    addressListStr = addressListSB.toString();
                                    addressListStr = addressListStr.substring(0, addressListStr.length()-1);
                                }
                                group.setAddressList(addressListStr);
                                group.setUpdateTime(new Date());
                                XxlJobAdminConfig.getAdminConfig().getXxlJobGroupDao().update(group);
                            }
                        }
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(">>>>>>>>>>> xxl-job, job registry monitor thread error:{}", e);
                        }
                    }
                    try {
                        TimeUnit.SECONDS.sleep(RegistryConfig.BEAT_TIMEOUT);
                    } catch (InterruptedException e) {
                        if (!toStop) {
                            logger.error(">>>>>>>>>>> xxl-job, job registry monitor thread error:{}", e);
                        }
                    }
                }
                logger.info(">>>>>>>>>>> xxl-job, job registry monitor thread stop");
            }
        });
        registryMonitorThread.setDaemon(true);
        registryMonitorThread.setName("xxl-job, admin JobRegistryMonitorHelper-registryMonitorThread");
        registryMonitorThread.start();
    }
}
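This code mainly creates the registry monitor thread. The thread first deletes expired executors: each executor periodically sends a heartbeat to the admin scheduling center, which refreshes the executor's registry record (its address is stored in registry_value). If an executor sends no heartbeat for long enough, the scheduling center considers it offline (dead) and deletes it from the xxl_job_registry table. The thread then reads the live address list of each executor from xxl_job_registry and writes it to the address_list field of the auto-registered executors in xxl_job_group, so that address_list always holds the latest addresses of the auto-registered executors. Note that for a manually registered executor the address has to be entered by hand when the executor is created, and this thread never updates it.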
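The grouping and joining step can be isolated into a pure function, which makes its behavior easy to check. The sketch below is an in-memory approximation: RegistryRefreshSketch and its Registry class are hypothetical, liveness is decided in Java rather than in SQL, and the 90-second dead timeout is an assumed value for RegistryConfig.DEAD_TIMEOUT.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class RegistryRefreshSketch {
    // In-memory stand-in for one row of xxl_job_registry.
    static final class Registry {
        final String group;   // registry_group, e.g. "EXECUTOR"
        final String key;     // registry_key, the executor appname
        final String value;   // registry_value, the executor address
        final long updateTimeMs;

        Registry(String group, String key, String value, long updateTimeMs) {
            this.group = group;
            this.key = key;
            this.value = value;
            this.updateTimeMs = updateTimeMs;
        }
    }

    static final long DEAD_TIMEOUT_MS = 90_000; // assumed value

    // Returns appname -> comma-separated, sorted, de-duplicated live addresses.
    static Map<String, String> addressLists(List<Registry> rows, long nowMs) {
        Map<String, TreeSet<String>> byApp = new TreeMap<>();
        for (Registry row : rows) {
            boolean alive = nowMs - row.updateTimeMs < DEAD_TIMEOUT_MS;
            if (alive && "EXECUTOR".equals(row.group)) {
                // TreeSet gives sorting and de-duplication in one step.
                byApp.computeIfAbsent(row.key, k -> new TreeSet<>()).add(row.value);
            }
        }
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, TreeSet<String>> entry : byApp.entrySet()) {
            result.put(entry.getKey(), String.join(",", entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        long now = 1_000_000;
        List<Registry> rows = Arrays.asList(
                new Registry("EXECUTOR", "app-a", "http://10.0.0.2:9999", now - 1_000),
                new Registry("EXECUTOR", "app-a", "http://10.0.0.1:9999", now - 1_000),
                new Registry("EXECUTOR", "app-a", "http://10.0.0.3:9999", now - 120_000));
        System.out.println(addressLists(rows, now));
    }
}
```

String.join avoids the append-then-trim-the-last-comma step used in the original loop; sorting keeps the stored address_list stable between refreshes when nothing has changed.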
admin fail-monitor run: starting the failed-job monitor thread
The JobFailMonitorHelper.getInstance().start() method is as follows:
public class JobFailMonitorHelper {
    public void start() {
        monitorThread = new Thread(new Runnable() {
            @Override
            public void run() {
                // monitor
                while (!toStop) {
                    try {
                        // fetch the ids of failed job logs from xxl_job_log, at most 1000
                        List<Long> failLogIds = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findFailJobLogIds(1000);
                        if (failLogIds!=null && !failLogIds.isEmpty()) {
                            for (long failLogId: failLogIds) {
                                // lock log: switch the alarm status from default (0) to locked (-1)
                                int lockRet = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateAlarmStatus(failLogId, 0, -1);
                                if (lockRet < 1) {
                                    continue;
                                }
                                XxlJobLog log = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().load(failLogId);
                                XxlJobInfo info = XxlJobAdminConfig.getAdminConfig().getXxlJobInfoDao().loadById(log.getJobId());
                                // 1、fail retry monitor
                                // if the remaining retry count is greater than 0, trigger the job again
                                if (log.getExecutorFailRetryCount() > 0) {
                                    JobTriggerPoolHelper.trigger(log.getJobId(), TriggerTypeEnum.RETRY, (log.getExecutorFailRetryCount()-1), log.getExecutorShardingParam(), log.getExecutorParam(), null);
                                    String retryMsg = "<br><br><span style=\"color:#F39C12;\" > >>>>>>>>>>>"+ I18nUtil.getString("jobconf_trigger_type_retry") +"<<<<<<<<<<< </span><br>";
                                    log.setTriggerMsg(log.getTriggerMsg() + retryMsg);
                                    XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateTriggerInfo(log);
                                }
                                // 2、fail alarm monitor
                                int newAlarmStatus = 0; // alarm status: 0 = default, -1 = locked, 1 = no alarm needed, 2 = alarm succeeded, 3 = alarm failed
                                if (info != null) {
                                    boolean alarmResult = XxlJobAdminConfig.getAdminConfig().getJobAlarmer().alarm(info, log);
                                    newAlarmStatus = alarmResult?2:3;
                                } else {
                                    newAlarmStatus = 1;
                                }
                                // move the locked log to its final alarm status (newAlarmStatus)
                                XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateAlarmStatus(failLogId, -1, newAlarmStatus);
                            }
                        }
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(">>>>>>>>>>> xxl-job, job fail monitor thread error:{}", e);
                        }
                    }
                    try {
                        TimeUnit.SECONDS.sleep(10);
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(e.getMessage(), e);
                        }
                    }
                }
                logger.info(">>>>>>>>>>> xxl-job, job fail monitor thread stop");
            }
        });
        monitorThread.setDaemon(true);
        monitorThread.setName("xxl-job, admin JobFailMonitorHelper");
        monitorThread.start();
    }
}
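The thread fetches failed job logs from the xxl_job_log table (at most 1000 per pass, one pass every 10 seconds). For each log whose alarm status is still the default (0), it first locks the log by switching the status to -1, then loads the job details by the jobId recorded in the log. If the retry count recorded in the log is greater than 0, the job is triggered again with the count reduced by one. Finally, an alarm is raised for the failed log and the log's alarm status is updated with the outcome.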
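The lock step works as a compare-and-set: the update only succeeds if the status still has the expected old value, so when several admin instances scan the same logs, only one of them processes each log. The sketch below imitates this with an in-memory map. AlarmStatusLockSketch is hypothetical, and the SQL quoted in its comment is an assumed shape of the real statement.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory imitation of the alarm_status column of xxl_job_log.
class AlarmStatusLockSketch {
    private final Map<Long, Integer> alarmStatus = new ConcurrentHashMap<>();

    void addFailedLog(long logId) {
        alarmStatus.put(logId, 0); // 0 = default
    }

    // Imitates "UPDATE ... SET alarm_status = new WHERE id = ? AND alarm_status = old"
    // (assumed SQL shape) and returns the number of affected rows: 1 or 0.
    int updateAlarmStatus(long logId, int oldStatus, int newStatus) {
        return alarmStatus.replace(logId, oldStatus, newStatus) ? 1 : 0;
    }

    Integer status(long logId) {
        return alarmStatus.get(logId);
    }

    public static void main(String[] args) {
        AlarmStatusLockSketch dao = new AlarmStatusLockSketch();
        dao.addFailedLog(7L);
        System.out.println("first lock:  " + dao.updateAlarmStatus(7L, 0, -1));
        System.out.println("second lock: " + dao.updateAlarmStatus(7L, 0, -1));
        System.out.println("finish:      " + dao.updateAlarmStatus(7L, -1, 2));
    }
}
```

A caller that gets 0 back simply skips the log, which is what the continue statement after lockRet < 1 does in the monitor loop.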
admin lose-monitor run: starting the lost-result monitor thread
The JobCompleteHelper.getInstance().start() method is as follows:
public class JobCompleteHelper {
    public void start() {
        // for callback: create the callback thread pool
        callbackThreadPool = new ThreadPoolExecutor(...);
        // for monitor
        monitorThread = new Thread(new Runnable() {
            @Override
            public void run() {
                // wait for JobTriggerPoolHelper-init
                try {
                    TimeUnit.MILLISECONDS.sleep(50);
                } catch (InterruptedException e) {
                    if (!toStop) {
                        logger.error(e.getMessage(), e);
                    }
                }
                // monitor
                while (!toStop) {
                    try {
                        // lost-result handling: if a scheduling record stays in the "running" state for more than 10 min and the executor's heartbeat registration has lapsed (it is offline), proactively mark this scheduling attempt as failed
                        Date losedTime = DateUtil.addMinutes(new Date(), -10);
                        // todo: this variable would be better named losedLogIds, since it holds log ids
                        List<Long> losedJobIds = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findLostJobIds(losedTime);
                        if (losedJobIds!=null && losedJobIds.size()>0) {
                            for (Long logId: losedJobIds) {
                                XxlJobLog jobLog = new XxlJobLog();
                                jobLog.setId(logId);
                                jobLog.setHandleTime(new Date());
                                jobLog.setHandleCode(ReturnT.FAIL_CODE);
                                jobLog.setHandleMsg( I18nUtil.getString("joblog_lost_fail") );
                                XxlJobCompleter.updateHandleInfoAndFinish(jobLog);
                            }
                        }
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(">>>>>>>>>>> xxl-job, job fail monitor thread error:{}", e);
                        }
                    }
                    try {
                        TimeUnit.SECONDS.sleep(60);
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(e.getMessage(), e);
                        }
                    }
                }
                logger.info(">>>>>>>>>>> xxl-job, JobLosedMonitorHelper stop");
            }
        });
        monitorThread.setDaemon(true);
        monitorThread.setName("xxl-job, admin JobLosedMonitorHelper");
        monitorThread.start();
    }
}
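This starts the lost-result monitor thread, which runs once a minute. When a scheduling log in xxl_job_log has been in the "running" state for more than 10 minutes and its executor address no longer exists in xxl_job_registry (the executor's heartbeat registration has lapsed, so it is offline), the thread proactively marks that scheduling attempt as failed and finishes the job.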
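The selection rule can be written as a small in-memory filter. The sketch below is an assumption about what the query behind findLostJobIds does, based only on the description above: LostLogFinderSketch and its LogRow class are hypothetical, and encoding "running" as trigger code 200 with handle code 0 is an assumed convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LostLogFinderSketch {
    // In-memory stand-in for the relevant columns of xxl_job_log.
    static final class LogRow {
        final long id;
        final int triggerCode;   // assumed: 200 means the trigger succeeded
        final int handleCode;    // assumed: 0 means no result has been reported yet
        final long triggerTimeMs;
        final String executorAddress;

        LogRow(long id, int triggerCode, int handleCode, long triggerTimeMs, String executorAddress) {
            this.id = id;
            this.triggerCode = triggerCode;
            this.handleCode = handleCode;
            this.triggerTimeMs = triggerTimeMs;
            this.executorAddress = executorAddress;
        }
    }

    static final long LOST_AFTER_MS = 10 * 60_000L; // 10 minutes, as in the monitor loop

    // A log is lost if it is still running, was triggered at least 10 minutes ago,
    // and its executor address is not among the currently registered addresses.
    static List<Long> findLostLogIds(List<LogRow> logs, Set<String> onlineAddresses, long nowMs) {
        long lostBefore = nowMs - LOST_AFTER_MS;
        List<Long> ids = new ArrayList<>();
        for (LogRow log : logs) {
            boolean running = log.triggerCode == 200 && log.handleCode == 0;
            boolean stale = log.triggerTimeMs <= lostBefore;
            boolean offline = !onlineAddresses.contains(log.executorAddress);
            if (running && stale && offline) {
                ids.add(log.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        long now = 3_600_000;
        List<LogRow> logs = Arrays.asList(
                new LogRow(1, 200, 0, now - 11 * 60_000L, "http://10.0.0.9:9999"),  // stale, executor offline
                new LogRow(2, 200, 0, now - 11 * 60_000L, "http://10.0.0.1:9999"),  // stale, executor online
                new LogRow(3, 200, 0, now - 60_000L, "http://10.0.0.9:9999"));      // too recent
        Set<String> online = new HashSet<>(Arrays.asList("http://10.0.0.1:9999"));
        System.out.println(findLostLogIds(logs, online, now));
    }
}
```

Requiring the executor to be offline avoids failing long-running jobs whose executor is still alive and may yet report a result.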
admin log report start: starting the log report thread
The JobLogReportHelper.getInstance().start() method is as follows:
public class JobLogReportHelper {
    // First, for each of the last three days (today included), count the triggered, running and successful jobs, derive the failed count, and save the result to the database.
    // Second, based on the configured log retention period, look up the expired logs in the database and delete them.
    public void start() {
        logrThread = new Thread(new Runnable() {
            @Override
            public void run() {
                // last clean log time
                long lastCleanLogTime = 0;
                while (!toStop) {
                    // 1、log-report refresh: refresh log report in 3 days
                    try {
                        for (int i = 0; i < 3; i++) {
                            // today
                            Calendar itemDay = Calendar.getInstance();
                            itemDay.add(Calendar.DAY_OF_MONTH, -i);
                            itemDay.set(Calendar.HOUR_OF_DAY, 0);
                            itemDay.set(Calendar.MINUTE, 0);
                            itemDay.set(Calendar.SECOND, 0);
                            itemDay.set(Calendar.MILLISECOND, 0);
                            // 00:00:00:000-23:59:59:999
                            Date todayFrom = itemDay.getTime();
                            itemDay.set(Calendar.HOUR_OF_DAY, 23);
                            itemDay.set(Calendar.MINUTE, 59);
                            itemDay.set(Calendar.SECOND, 59);
                            itemDay.set(Calendar.MILLISECOND, 999);
                            Date todayTo = itemDay.getTime();
                            // refresh log-report every minute
                            XxlJobLogReport xxlJobLogReport = new XxlJobLogReport();
                            xxlJobLogReport.setTriggerDay(todayFrom);
                            xxlJobLogReport.setRunningCount(0);
                            xxlJobLogReport.setSucCount(0);
                            xxlJobLogReport.setFailCount(0);
                            Map<String, Object> triggerCountMap = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findLogReport(todayFrom, todayTo); // 00:00:00:000-23:59:59:999
                            if (triggerCountMap!=null && triggerCountMap.size()>0) {
                                int triggerDayCount = triggerCountMap.containsKey("triggerDayCount")?Integer.valueOf(String.valueOf(triggerCountMap.get("triggerDayCount"))):0;
                                int triggerDayCountRunning = triggerCountMap.containsKey("triggerDayCountRunning")?Integer.valueOf(String.valueOf(triggerCountMap.get("triggerDayCountRunning"))):0;
                                int triggerDayCountSuc = triggerCountMap.containsKey("triggerDayCountSuc")?Integer.valueOf(String.valueOf(triggerCountMap.get("triggerDayCountSuc"))):0;
                                int triggerDayCountFail = triggerDayCount - triggerDayCountRunning - triggerDayCountSuc;
                                xxlJobLogReport.setRunningCount(triggerDayCountRunning);
                                xxlJobLogReport.setSucCount(triggerDayCountSuc);
                                xxlJobLogReport.setFailCount(triggerDayCountFail);
                            }
                            // do refresh: update the xxl_job_log_report row, or insert it if it does not exist yet
                            int ret = XxlJobAdminConfig.getAdminConfig().getXxlJobLogReportDao().update(xxlJobLogReport);
                            if (ret < 1) {
                                XxlJobAdminConfig.getAdminConfig().getXxlJobLogReportDao().save(xxlJobLogReport);
                            }
                        }
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(">>>>>>>>>>> xxl-job, job log report thread error:{}", e);
                        }
                    }
                    // 2、log-clean: switch open & once each day
                    // purge expired logs from xxl_job_log
                    if (XxlJobAdminConfig.getAdminConfig().getLogretentiondays()>0
                            && System.currentTimeMillis() - lastCleanLogTime > 24*60*60*1000) {
                        // expire-time
                        Calendar expiredDay = Calendar.getInstance();
                        expiredDay.add(Calendar.DAY_OF_MONTH, -1 * XxlJobAdminConfig.getAdminConfig().getLogretentiondays());
                        expiredDay.set(Calendar.HOUR_OF_DAY, 0);
                        expiredDay.set(Calendar.MINUTE, 0);
                        expiredDay.set(Calendar.SECOND, 0);
                        expiredDay.set(Calendar.MILLISECOND, 0);
                        Date clearBeforeTime = expiredDay.getTime();
                        // clean expired log
                        List<Long> logIds = null;
                        do {
                            logIds = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findClearLogIds(0, 0, clearBeforeTime, 0, 1000);
                            if (logIds!=null && logIds.size()>0) {
                                XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().clearLog(logIds);
                            }
                        } while (logIds!=null && logIds.size()>0);
                        // update clean time
                        lastCleanLogTime = System.currentTimeMillis();
                    }
                    try {
                        TimeUnit.MINUTES.sleep(1);
                    } catch (Exception e) {
                        if (!toStop) {
                            logger.error(e.getMessage(), e);
                        }
                    }
                }
                logger.info(">>>>>>>>>>> xxl-job, job log report thread stop");
            }
        });
        logrThread.setDaemon(true);
        logrThread.setName("xxl-job, admin JobLogReportHelper");
        logrThread.start();
    }
}
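This starts the log report thread, which runs once a minute. For each of the last three days (today included) it counts the triggered, running and successful jobs, derives the failed count as triggered minus running minus successful, and saves the result to the xxl_job_log_report table. If log retention is enabled and more than one day has passed since the last cleanup, it also deletes the expired logs from xxl_job_log, in batches of 1000.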
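The date arithmetic and the two derived values can be sketched compactly. Note that this sketch swaps the Calendar code for java.time, which is not what the excerpt uses; LogReportWindowSketch and its method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

class LogReportWindowSketch {
    // Returns [from, to] for today and the two days before it,
    // each covering 00:00:00.000 through 23:59:59.999.
    static List<LocalDateTime[]> lastThreeDays(LocalDate today) {
        List<LocalDateTime[]> windows = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            LocalDate day = today.minusDays(i);
            LocalDateTime from = day.atStartOfDay();
            LocalDateTime to = day.atTime(23, 59, 59, 999_000_000);
            windows.add(new LocalDateTime[] {from, to});
        }
        return windows;
    }

    // The failed count is not queried directly; it is what remains of the triggered total.
    static int failCount(int triggered, int running, int succeeded) {
        return triggered - running - succeeded;
    }

    // Cleanup runs only if retention is enabled and the last cleanup is more than a day old.
    static boolean cleanDue(int retentionDays, long nowMs, long lastCleanMs) {
        return retentionDays > 0 && nowMs - lastCleanMs > 24L * 60 * 60 * 1000;
    }

    public static void main(String[] args) {
        for (LocalDateTime[] window : lastThreeDays(LocalDate.of(2024, 3, 1))) {
            System.out.println(window[0] + " .. " + window[1]);
        }
        System.out.println(failCount(10, 2, 5));
    }
}
```

Because lastCleanLogTime starts at 0 in the original loop, the cleanDue condition is true on the first pass after startup, so a restart always triggers one cleanup when retention is enabled.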