[Thread Pool] Reading the Java 1.8 ThreadPoolExecutor Source Code


1 Thread pools: why?

They mainly address:

  • 1 Asynchronous tasks
  • 2 Producer-consumer scenarios

2 Parameters in detail

2.1 Core pool size
// Fewer than corePoolSize threads: a new thread is created when a task is submitted, even if other threads are idle.
When a new task is submitted in method {@link #execute(Runnable)}, and fewer than corePoolSize threads are running, a new thread is created to handle the request, even if other worker threads are idle.
// At least corePoolSize but fewer than maximumPoolSize threads: a new thread is created only if the queue is full.
If there are more than corePoolSize but less than maximumPoolSize threads running, a new thread will be created only if the queue is full.
// corePoolSize == maximumPoolSize gives a fixed-size thread pool.
By setting corePoolSize and maximumPoolSize the same, you create a fixed-size thread pool.
// With an essentially unbounded maximumPoolSize such as Integer.MAX_VALUE, the pool can run an arbitrary number of tasks concurrently.
By setting maximumPoolSize to an essentially unbounded value such as {@code Integer.MAX_VALUE}, you allow the pool to accommodate an arbitrary number of concurrent tasks.
// The sizes are normally fixed at construction; adjusting them dynamically is left to the application code.
Most typically, core and maximum pool sizes are set only upon construction, but they may also be changed dynamically using {@link #setCorePoolSize} and {@link #setMaximumPoolSize}.
// By default, even core threads are created only when new tasks arrive.
By default, even core threads are initially created and started only when new tasks arrive, but this can be overridden dynamically using method {@link #prestartCoreThread} or {@link #prestartAllCoreThreads}. prestartAllCoreThreads starts all core threads, causing them to idly wait for work. This overrides the default policy of starting core threads only when new tasks are executed.
Why? Some applications need this. For example, some Tomcat components call prestartAllCoreThreads when initializing their thread pool, so that the core threads are started ahead of time.
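
The lazy-start default and the prestart override can be observed directly. The following is a minimal sketch; the class name PrestartDemo and the pool sizes (2 core, 4 maximum, queue capacity 10) are arbitrary choices for illustration, not values from the JDK or from Tomcat.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PrestartDemo {
    /** Returns {pool size before prestart, threads started by prestart, pool size after}. */
    static int[] poolSizes() {
        // Hypothetical sizing: 2 core threads, 4 maximum, bounded queue of 10.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>(10));
        try {
            int before = pool.getPoolSize();              // nothing submitted yet, no threads
            int started = pool.prestartAllCoreThreads();  // start every core thread now
            int after = pool.getPoolSize();
            return new int[] {before, started, after};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] s = poolSizes();
        System.out.println("before=" + s[0] + ", started=" + s[1] + ", after=" + s[2]);
    }
}
```

Before the call the pool holds no threads at all; afterwards it holds exactly corePoolSize idle workers waiting on the queue.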

2.2 Source walkthrough: core pool size

    public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        /*
         * Proceed in 3 steps:
         *
         * 1. If fewer than corePoolSize threads are running, try to
         * start a new thread with the given command as its first // create a core thread directly
         * task.  The call to addWorker atomically checks runState and
         * workerCount, and so prevents false alarms that would add
         * threads when it shouldn't, by returning false.
         *
         * 2. If a task can be successfully queued, then we still need  // re-check after enqueuing: the pool may have shut down, or existing workers may have died since the last check
         * to double-check whether we should have added a thread
         * (because existing ones died since last checking) or that
         * the pool shut down since entry into this method. So we
         * recheck state and if necessary roll back the enqueuing if
         * stopped, or start a new thread if there are none.
         *
         * 3. If we cannot queue task, then we try to add a new
         * thread.  If it fails, we know we are shut down or saturated // saturation (rejection) policy
         * and so reject the task.
         */
        int c = ctl.get();
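        if (workerCountOf(c) < corePoolSize) {  // below corePoolSize: addWorker as a core thread
            if (addWorker(command, true))
                return;
            c = ctl.get();                      // creating the core thread failed; re-read ctl
        }
        if (isRunning(c) && workQueue.offer(command)) { // enqueued successfully
            int recheck = ctl.get();
            if (! isRunning(recheck) && remove(command))  // pool shut down meanwhile: remove the task
                reject(command);  // removal succeeded: apply the rejection policy
            else if (workerCountOf(recheck) == 0) // no workers left: add one without a first task
                addWorker(null, false);
        }
        else if (!addWorker(command, false))  // enqueue failed: try to create a non-core thread
            reject(command); // creating the non-core thread failed: apply the rejection policy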
    }
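
The three steps of execute() can be made visible from the outside with tasks that block until released. This is a sketch under assumed sizing (1 core thread, 2 maximum, a queue of capacity 1); ExecuteStepsDemo and its variable names are made up for the illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecuteStepsDemo {
    /** Returns {pool size after task 1, after task 2, after task 3, 1 if task 4 was rejected}. */
    static int[] run() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();                      // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        try {
            pool.execute(blocker);                    // step 1: below corePoolSize, new core thread
            int afterFirst = pool.getPoolSize();
            pool.execute(blocker);                    // step 2: core thread busy, task is queued
            int afterSecond = pool.getPoolSize();
            pool.execute(blocker);                    // step 3: queue full, new non-core thread
            int afterThird = pool.getPoolSize();
            int rejected = 0;
            try {
                pool.execute(blocker);                // queue full and pool at maximum: rejected
            } catch (RejectedExecutionException e) {
                rejected = 1;                         // default AbortPolicy throws
            }
            return new int[] {afterFirst, afterSecond, afterThird, rejected};
        } finally {
            release.countDown();                      // let all blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

Because every running task is parked on the latch, no worker can drain the queue in between, so the pool sizes observed after each submission follow the three branches of execute() one by one.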

2.3 pool ctl

ctl = new AtomicInteger(ctlOf(RUNNING, 0));

ctl is the pool's main control state. Its runState part provides the main lifecycle control; the states mean:

  • RUNNING: Accept new tasks and process queued tasks
  • SHUTDOWN: Don't accept new tasks, but process queued tasks
  • STOP: Don't accept new tasks, don't process queued tasks, and interrupt in-progress tasks
  • TIDYING: All tasks have terminated, workerCount is zero, the thread transitioning to state TIDYING will run the terminated() hook method
  • TERMINATED: terminated() has completed

Threads waiting in awaitTermination() will return when the state reaches TERMINATED.

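ctl packs two fields into a single int: workerCount, the effective number of worker threads, and runState, the pool's lifecycle state. workerCount is limited to (2^29)-1 (about 500 million) threads, and it may be transiently different from the actual number of live threads.

111 1 1111 | 1111 1111 | 1111 1111 | 1111 1111
The high 3 bits hold runState and the low 29 bits hold workerCount. The numbers below are the state values before they are shifted left by 29 bits:
 [111 RUNNING] -1
 [000 SHUTDOWN] 0
 [001 STOP] 1
 [010 TIDYING] 2
 [011 TERMINATED] 3
The workerCount is the number of workers that have been permitted to start and not permitted to stop.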
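
The packing can be tried out in isolation. The sketch below mirrors the constants and helper methods of the JDK 8 source (COUNT_BITS, CAPACITY, runStateOf, workerCountOf, ctlOf); in the JDK they are private members of ThreadPoolExecutor, so the standalone class CtlBits is only an illustration.

```java
final class CtlBits {
    static final int COUNT_BITS = Integer.SIZE - 3;        // 29 bits for workerCount
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;   // 2^29 - 1

    // runState lives in the high 3 bits
    static final int RUNNING    = -1 << COUNT_BITS;        // 111
    static final int SHUTDOWN   =  0 << COUNT_BITS;        // 000
    static final int STOP       =  1 << COUNT_BITS;        // 001
    static final int TIDYING    =  2 << COUNT_BITS;        // 010
    static final int TERMINATED =  3 << COUNT_BITS;        // 011

    static int runStateOf(int c)     { return c & ~CAPACITY; }  // keep the high 3 bits
    static int workerCountOf(int c)  { return c & CAPACITY; }   // keep the low 29 bits
    static int ctlOf(int rs, int wc) { return rs | wc; }        // combine both fields

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println(Integer.toBinaryString(c));
        System.out.println("state is RUNNING: " + (runStateOf(c) == RUNNING));
        System.out.println("workers: " + workerCountOf(c));
    }
}
```

Since RUNNING is negative and the other states grow from 0 upward, the states are numerically ordered, which is what lets checks such as rs >= SHUTDOWN or runStateLessThan(c, STOP) work as plain integer comparisons.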

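2.4 Constructor parameters: thread factory (why?)
The thread factory is the class used to create the pool's threads. With the default thread factory, threads are non-daemon and have the default priority. (The thread that runs main is also a non-daemon thread. Daemon threads, such as GC threads, are simply abandoned when the JVM exits after the last non-daemon thread has ended, so their try/finally blocks may never run.) In real projects a custom thread factory is normally used, because with it you can alter the thread's name, thread group, priority, daemon status, etc., which makes production logs easier to read and problems easier to locate.
If the thread factory fails to create a thread when asked, by returning null from newThread, the executor will continue, but might not be able to execute any tasks. Threads should possess the "modifyThread" RuntimePermission; without it, service may be degraded: configuration changes may not take effect in a timely manner, and a shutdown pool may remain in a state in which termination is possible but not completed.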
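
A custom factory of the kind described above might look like the following sketch. NamedThreadFactory, its prefix parameter, and the name pattern prefix-N are assumptions made for the example; only the ThreadFactory interface and the Thread setters come from the JDK.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;                          // hypothetical pool name shown in logs
    private final boolean daemon;
    private final AtomicInteger seq = new AtomicInteger(1);

    NamedThreadFactory(String prefix, boolean daemon) {
        this.prefix = prefix;
        this.daemon = daemon;
    }

    @Override
    public Thread newThread(Runnable r) {
        // A recognizable name makes thread dumps and log lines easy to attribute.
        Thread t = new Thread(r, prefix + "-" + seq.getAndIncrement());
        t.setDaemon(daemon);                              // must be set before the thread starts
        t.setPriority(Thread.NORM_PRIORITY);
        return t;
    }
}
```

An instance can be passed as the threadFactory argument of the ThreadPoolExecutor constructors, for example new NamedThreadFactory("order-pool", false).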

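2.5 Daemon threads vs. non-daemon threads

  • // A thread is either a daemon or a non-daemon thread; non-daemon threads are also called "user" threads. Priority, thread group, and class loader are by default inherited from the parent (creating) thread.
  • // A thread is really just an object: it has its VM stack and native method stack, holds the task it runs, and has its own id (generated by a synchronized, auto-incrementing counter), stackSize, and so on.
  • // The JVM does not exit as long as even one non-daemon (user) thread is still alive.
  • // A newly created thread inherits the daemon status of its parent. You can call setDaemon(boolean) to make it a daemon or a non-daemon thread, but the call has to happen before the thread is started.
  • // When the Java virtual machine starts, there is a single non-daemon thread, which calls the main() method of the designated class. Because the main thread is non-daemon, all child threads it creates are non-daemon by default. This is why the JVM does not exit when main creates a thread pool with the default settings and never shuts it down.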
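
The inheritance of the daemon flag can be checked with a small experiment. This is a sketch only; DaemonInheritanceDemo and childIsDaemon are names invented for the illustration.

```java
final class DaemonInheritanceDemo {
    /** Reports whether a thread created from inside a parent with the given daemon status is a daemon. */
    static boolean childIsDaemon(boolean parentDaemon) {
        boolean[] result = new boolean[1];
        Thread parent = new Thread(() -> {
            Thread child = new Thread(() -> { });   // never started, only inspected
            result[0] = child.isDaemon();           // inherited from the creating thread
        });
        parent.setDaemon(parentDaemon);             // has to happen before start()
        parent.start();
        try {
            parent.join();                          // join makes result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("child of daemon parent is daemon: " + childIsDaemon(true));
        System.out.println("child of user parent is daemon:   " + childIsDaemon(false));
    }
}
```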

2.6 Constructor parameters: keep-alive times

If the pool currently has more than corePoolSize threads, excess threads will be terminated if they have been idle for more than the keepAliveTime (see {@link #getKeepAliveTime(TimeUnit)}).
// By default the keep-alive time applies only to the threads beyond corePoolSize: once such a thread has been idle for longer than the keep-alive time, it terminates. This is how the pool scales its resource usage up and down.
By default, the keep-alive policy applies only when there are more than corePoolSize threads. But method {@link #allowCoreThreadTimeOut(boolean)} can be used to apply this time-out policy to core threads as well, so long as the keepAliveTime value is non-zero.
// So the setting can also cover core threads, if allowCoreThreadTimeOut is enabled. The pool size may then drop to 0: the main worker loops end, and the pool waits for new tasks to arrive.
// Every worker spins in its main loop and blocks while fetching a task from the queue, i.e. in java.util.concurrent.ThreadPoolExecutor#getTask, where boolean timed = allowCoreThreadTimeOut || wc > corePoolSize; marks whether workers are subject to culling, that is, whether this worker is allowed to time out and exit.

2.7 Source walkthrough: keep-alive times

    private Runnable getTask() {
        boolean timedOut = false; // Did the last poll() time out?

        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                decrementWorkerCount();
                return null;
            }

            int wc = workerCountOf(c);

            // Are workers subject to culling?
            boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

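            // (over maximumPoolSize, or subject to culling and timed out) && (more than one worker, or the queue is empty): this worker should exit
            if ((wc > maximumPoolSize || (timed && timedOut))
                && (wc > 1 || workQueue.isEmpty())) {
                if (compareAndDecrementWorkerCount(c)) // workerCount--
                    return null;  // 1) shrinking after an interrupt (e.g. maximumPoolSize was lowered), or 2) an idle surplus thread timed out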
                continue;
            }

            try {
                Runnable r = timed ?
                    workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    workQueue.take();
                if (r != null)
                    return r;
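                timedOut = true;   // poll() timed out without a task; the next iteration tries to retire this thread
            } catch (InterruptedException retry) {
                timedOut = false;  // interrupted while waiting; loop again and re-check the pool state
                // When core thread timeout is switched on, allowCoreThreadTimeOut(true) calls
                // interruptIdleWorkers(), which sets the interrupt flag on every idle worker
                // that is not already interrupted. A worker blocked waiting on the queue leaves
                // the blocked state at once with an InterruptedException, its interrupt status
                // is reset to false, and nothing else happens.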
            }
        }
    }

For the interrupt handling involved, see my separate write-up on interrupts and java.util.concurrent.ThreadPoolExecutor#interruptIdleWorkers(boolean).
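
The effect of allowCoreThreadTimeOut on the pool size can be observed with a short keep-alive time. The sketch below assumes a 50 ms keep-alive and polls for up to 5 seconds; both numbers and the class name CoreTimeoutDemo are arbitrary choices for the illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class CoreTimeoutDemo {
    /** Returns {pool size right after prestart, pool size once the idle core threads timed out}. */
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 50, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            pool.prestartAllCoreThreads();
            int started = pool.getPoolSize();
            pool.allowCoreThreadTimeOut(true);       // requires keepAliveTime > 0
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            // Idle workers now use the timed poll() in getTask() and exit after the timeout.
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return new int[] {started, pool.getPoolSize()};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("started=" + r[0] + ", after idle timeout=" + r[1]);
    }
}
```

With an empty queue, the (wc > 1 || workQueue.isEmpty()) half of the exit condition holds for every worker, so even the last core thread is allowed to leave.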

2.8 Source walkthrough: BlockingQueue

// BlockingQueue
// How the queue is used interacts with pool sizing
Any {@link BlockingQueue} may be used to transfer and hold submitted tasks.  The use of this queue interacts with pool sizing:

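// Fewer threads running than corePoolSize: start a new thread rather than queue
If fewer than corePoolSize threads are running, the Executor always prefers adding a new thread rather than queuing.

// corePoolSize or more threads running: queue the request
If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread.

// The request cannot be queued: a thread may be created as long as maximumPoolSize is not exceeded; otherwise the rejection policy applies
If a request cannot be queued, a new thread is created unless this would exceed maximumPoolSize, in which case, the task will be rejected.
// Note: the default policy rejects by throwing an exception. If that exception ends the outer (submitting) thread, and the pool was created by that same thread with non-daemon worker threads, the pool itself does not terminate.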


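Queuing strategies: three kinds

There are three general strategies for queuing:
why? how?
// 1. SynchronousQueue ---> Direct handoffs
A SynchronousQueue hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks.
This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
Problem: when the pool cannot keep up, the number of threads surges and resource consumption becomes uncontrollable.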

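// 2. Unbounded LinkedBlockingQueue ---> Unbounded queues
// why? It handles a short burst of requests well, but sustained growth is a problem.
Using an unbounded queue will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. The implication of this kind of queue is that the maximumPoolSize setting has no effect.
Problem: when the pool cannot keep up, the queue length surges and resource consumption becomes uncontrollable.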

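// 3. Bounded ArrayBlockingQueue ---> A bounded queue; the most commonly used kind of queue.
// why? Combined with a finite maximum number of threads, it limits excessive resource consumption.
A bounded queue helps prevent resource exhaustion when used with finite maximumPoolSizes, but can be more difficult to tune and control.
Using large queues and small pools minimizes CPU usage, OS resources, and context-switching overhead, but can lead to artificially low throughput. // A large queue with a small pool saves CPU, OS resources, and context switches, at the price of artificially lowering throughput.
If tasks frequently block (for example if they are I/O bound), a system may be able to schedule time for more threads than you otherwise allow. // The system might be able to run more threads than the pool permits.
Use of small queues generally requires larger pool sizes, which keeps CPUs busier but may encounter unacceptable scheduling overhead, which also decreases throughput. // A small queue with many pool threads can make the scheduling overhead too heavy, which lowers throughput as well.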
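
The three strategies can be compared by submitting the same load of blocking tasks to pools that differ only in their queue. This is a sketch; QueueStrategyDemo, poolSizeUnderLoad, and all sizes are assumptions chosen so that the outcome is easy to follow.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class QueueStrategyDemo {
    /** Submits `tasks` blocking tasks and returns the pool size while all of them are pending. */
    static int poolSizeUnderLoad(BlockingQueue<Runnable> queue, int core, int max, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 30, TimeUnit.SECONDS, queue);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await();              // stay busy until the measurement is done
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Direct handoff: nothing is held in the queue, every pending task gets its own thread.
        System.out.println(poolSizeUnderLoad(new SynchronousQueue<>(), 0, Integer.MAX_VALUE, 5));
        // Unbounded queue: the pool never grows past corePoolSize; maximumPoolSize is irrelevant.
        System.out.println(poolSizeUnderLoad(new LinkedBlockingQueue<>(), 2, 10, 5));
        // Bounded queue: extra threads appear only once the queue is full.
        System.out.println(poolSizeUnderLoad(new ArrayBlockingQueue<>(2), 1, 3, 5));
    }
}
```

With five pending tasks, the direct handoff pool ends up with five threads, the unbounded-queue pool stays at its two core threads, and the bounded-queue pool reaches three threads (one core thread, two queued tasks, two extra threads).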

2.9 Why does Worker extend AQS? ThreadPoolExecutor members and inner classes

HashSet<Worker> workers  // Set containing all worker threads in the pool. Accessed only when holding mainLock.
ReentrantLock mainLock  // Lock guarding the workers set and related bookkeeping
BlockingQueue<Runnable> workQueue  // The task queue
AtomicInteger ctl  // Pool control state (runState and workerCount)
Condition termination  // Wait condition to support awaitTermination
int largestPoolSize  // Tracks largest attained pool size (the peak). Accessed only under mainLock.
long completedTaskCount  // Counter for completed tasks. Updated only on termination of worker threads. Accessed only under mainLock.
volatile ThreadFactory threadFactory  // Factory for new threads

     * All user control parameters are declared as volatiles so that
     * ongoing actions are based on freshest values, but without need
     * for locking, since no internal invariants depend on them  // no internal invariant relies on them
     * changing synchronously with respect to other actions.
volatile RejectedExecutionHandler handler; // Handler called when saturated or shutdown in execute. // the saturation policy
volatile long keepAliveTime; // Timeout in nanoseconds for idle threads waiting for work. Applies when allowCoreThreadTimeOut is set or when there are more than corePoolSize threads.
volatile boolean allowCoreThreadTimeOut; // If false (default), core threads stay alive even when idle. If true, core threads use keepAliveTime to time out waiting for work.
volatile int corePoolSize; // Minimum number of workers to keep alive, unless allowCoreThreadTimeOut is set, in which case the minimum is zero  // concurrent reads and writes need a happens-before edge, hence volatile
volatile int maximumPoolSize; // Maximum pool size
static final RejectedExecutionHandler defaultHandler = new AbortPolicy(); // The default rejected execution handler

private final class Worker extends AbstractQueuedSynchronizer implements Runnable  
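// Why does Worker extend AQS? To get a simple, non-reentrant exclusive lock around each task execution. The lock protects the interrupt status of the worker's thread and the worker's internal state from being changed by others: interruptIdleWorkers() interrupts only the workers whose tryLock() succeeds (the idle ones), and a running task cannot reacquire the lock when it calls pool control methods such as setCorePoolSize.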
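
The locking part of Worker can be reproduced in a few lines. The sketch below follows the same idea (state 0 means unlocked, 1 means locked, no reentrancy) but is a standalone class named NonReentrantLock for illustration; it leaves out the thread and task fields and the initial state of -1 that the real Worker uses to inhibit interrupts until runWorker starts.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

final class NonReentrantLock extends AbstractQueuedSynchronizer {
    // AQS state: 0 = unlocked, 1 = locked

    @Override
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;                     // also fails for the current owner: not reentrant
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    @Override
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }
}
```

A second tryLock() from the thread that already holds the lock fails, which is exactly the property a ReentrantLock would not give.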





  2.10  ctl control methods

/*
 * Methods for setting control state
*/

private void advanceRunState(int targetState) 
final void tryTerminate() 
private void checkShutdownAccess()
private void interruptWorkers()
private void interruptIdleWorkers(boolean onlyOne)
private void interruptIdleWorkers()
final void reject(Runnable command)
void onShutdown() // hook
final boolean isRunningOrShutdown(boolean shutdownOK)
private List<Runnable> drainQueue()

  2.11  Main worker loop
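Growing (setCorePoolSize with a larger value): the pool does not start as many threads as the difference. It starts at most the minimum of the work queue size and the delta, so after the call there may be no growth at all for a while // even though the pool size setting has already been changed.
Shrinking is carried out through interruptIdleWorkers() and does not take effect immediately. If all workers are blocked waiting for tasks, can they all be reclaimed? Yes, core threads can be reclaimed too: when a worker finally exits and finds no tasks left, the pool size really can drop to 0.
getTask() responds to the interrupt sent by interruptIdleWorkers(). Once the worker has woken up and obtained a task, how is the interrupt handled? See the main worker loop below:

    final void runWorker(Worker w) {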
        Thread wt = Thread.currentThread();
        Runnable task = w.firstTask;
        w.firstTask = null;
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;  // assume an abrupt exit until the loop ends normally
        try {
            while (task != null || (task = getTask()) != null) { // getTask() returning null is the signal for this worker to exit
                w.lock();
                // If pool is stopping, ensure thread is interrupted;
                // if not, ensure thread is not interrupted.  This
                // requires a recheck in second case to deal with
                // shutdownNow race while clearing interrupt
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;  // the worker left the main loop normally
        } finally {
            // Clean-up after the worker leaves the main loop, e.g. adding the number of
            // tasks this worker completed to the pool-wide total (completedTaskCount).
            processWorkerExit(w, completedAbruptly);
        }
    }
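
The beforeExecute and afterExecute calls in the loop are protected hooks that a subclass can override. The following sketch counts the invocations; HookedPool and its two counters are hypothetical names used only for this example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class HookedPool extends ThreadPoolExecutor {
    final AtomicInteger before = new AtomicInteger();
    final AtomicInteger after = new AtomicInteger();

    HookedPool() {
        super(1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        before.incrementAndGet();        // runs in the worker thread, just before task.run()
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        after.incrementAndGet();         // runs from the finally block around task.run()
    }

    /** Runs three no-op tasks and returns {beforeExecute calls, afterExecute calls, completed tasks}. */
    static long[] run() {
        HookedPool pool = new HookedPool();
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown();                 // queued tasks are still processed in SHUTDOWN state
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new long[] {pool.before.get(), pool.after.get(), pool.getCompletedTaskCount()};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```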

  2.12  Worker exit handling source:

 final void runWorker(Worker w)  // Main worker run loop. The most important core method: if an exception escapes from here, the worker dies
 // Repeatedly gets tasks from queue and executes them
 1. We may start out with an initial task
 2. Before running any task, the lock is acquired to prevent other pool interrupts while the task is executing
 3. Each task run is preceded by a call to beforeExecute, which might throw an exception, in which case we cause thread to die (breaking loop with completedAbruptly true) without processing the task. The exception is thrown straight to the outer level // so that task is lost!

   
    private void processWorkerExit(Worker w, boolean completedAbruptly) {
        // false: normal exit. workerCount is not modified here because getTask()
        // already decremented it before returning null; see
        // java.util.concurrent.ThreadPoolExecutor#getTask and its calls to
        // compareAndDecrementWorkerCount(c) and decrementWorkerCount().
        // true: an exception escaped while running a task, so decrement here.
        if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
            decrementWorkerCount();

        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            completedTaskCount += w.completedTasks;
            workers.remove(w);
        } finally {
            mainLock.unlock();
        }

        tryTerminate();

        int c = ctl.get();
        if (runStateLessThan(c, STOP)) {
            if (!completedAbruptly) { // completedAbruptly == false: normal exit
                int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
                if (min == 0 && ! workQueue.isEmpty()) // tasks are still queued: keep at least 1 worker
                    min = 1;
                if (workerCountOf(c) >= min)    // enough workers remain
                    return; // replacement not needed
            }
            addWorker(null, false);  // add a replacement worker with no first task
        }
    }
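
The replacement path (completedAbruptly == true followed by addWorker(null, false)) can be seen by letting a task throw in a single-thread pool. This is a sketch; WorkerReplacementDemo is a made-up name, and the thread factory with a silent uncaught-exception handler exists only to keep the demo's output clean.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class WorkerReplacementDemo {
    /** Returns {thread that ran the failing task, thread that ran the following task}. */
    static String[] run() {
        ThreadFactory quiet = r -> {
            Thread t = new Thread(r);
            t.setUncaughtExceptionHandler((thread, ex) -> { /* swallowed for the demo */ });
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(), quiet);
        String[] names = new String[2];
        CountDownLatch done = new CountDownLatch(1);
        try {
            pool.execute(() -> {
                names[0] = Thread.currentThread().getName();
                throw new IllegalStateException("boom");   // escapes runWorker, the worker dies
            });
            pool.execute(() -> {
                names[1] = Thread.currentThread().getName();
                done.countDown();
            });
            done.await(5, TimeUnit.SECONDS);               // wait for the replacement worker
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return names;
    }

    public static void main(String[] args) {
        String[] n = run();
        System.out.println("failing task ran on " + n[0] + ", next task ran on " + n[1]);
    }
}
```

The second task still runs, but on a different thread: the worker that hit the exception is gone and a fresh one has taken its place.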

  2.13  Summary of other methods:


/*
 * Methods for creating, running and cleaning up after workers
 */

 private boolean addWorker(Runnable firstTask, boolean core)
 private void addWorkerFailed(Worker w)
 private void processWorkerExit(Worker w, boolean completedAbruptly)
 private Runnable getTask()  // one of the core methods: fetches the next task
 // Public constructors and methods

 public void execute(Runnable command)   // one of the core methods: the entry point for external tasks
 public void shutdown()
 public List<Runnable> shutdownNow()
 public boolean isShutdown()
 public boolean isTerminating()
 public boolean isTerminated()
 public boolean awaitTermination(long timeout, TimeUnit unit)  // waits up to the given time and reports whether the TERMINATED state was reached
 // causes the current thread to wait until it is signalled or interrupted, or the specified waiting time elapses
 protected void finalize()
 public void setThreadFactory(ThreadFactory threadFactory)
 public ThreadFactory getThreadFactory() 
 public void setRejectedExecutionHandler(RejectedExecutionHandler handler)
 public RejectedExecutionHandler getRejectedExecutionHandler()
 public void setCorePoolSize(int corePoolSize)
 public int getCorePoolSize()
 public boolean prestartCoreThread()
 void ensurePrestart()
 public int prestartAllCoreThreads()
 public boolean allowsCoreThreadTimeOut()
 public void allowCoreThreadTimeOut(boolean value)
 public void setMaximumPoolSize(int maximumPoolSize)
 public int getMaximumPoolSize()
 public void setKeepAliveTime(long time, TimeUnit unit)
 public long getKeepAliveTime(TimeUnit unit)


 /* User-level queue utilities */
 public BlockingQueue<Runnable> getQueue()
 public boolean remove(Runnable task) // may fail for tasks passed to submit(): they are converted into a Future before being queued
 public void purge()  // removes all cancelled Future tasks from the queue
 NOTE:
Take slow path if we encounter interference during traversal.
Make copy for traversal and call remove for cancelled entries.
The slow path is more likely to be O(N*N).
// purge() is a storage reclamation operation: cancelled tasks are never executed, but they may accumulate in the queue.

 /* Statistics */

The toString() method reports:

pool size = nworkers = workers.size()
active threads = sum(w.isLocked())
// public int getActiveCount() returns the number of active threads, i.e. the number of locked workers: w.isLocked() is isHeldExclusively(), which is getState() != 0
queued tasks = workQueue.size()
completed tasks = completedTaskCount + sum(w.completedTasks)  // tasks completed by workers that have already exited + tasks completed by the current workers
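
The same four figures are available through public getters. The sketch below takes a snapshot while two tasks are running and one is queued; PoolStatsDemo and the latch names are hypothetical, and the `running` latch is there so that both workers have certainly locked themselves before the snapshot is taken.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolStatsDemo {
    /** Returns {pool size, active threads, queued tasks, completed tasks}. */
    static long[] snapshot() {
        CountDownLatch running = new CountDownLatch(2);   // both workers are inside task.run()
        CountDownLatch release = new CountDownLatch(1);
        Runnable task = () -> {
            running.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            pool.execute(task);
            pool.execute(task);
            pool.execute(task);                           // both workers busy: this one is queued
            running.await(5, TimeUnit.SECONDS);
            return new long[] {
                pool.getPoolSize(),                       // workers.size()
                pool.getActiveCount(),                    // workers whose lock is held
                pool.getQueue().size(),                   // workQueue.size()
                pool.getCompletedTaskCount()
            };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new long[0];
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(snapshot()));
    }
}
```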

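3 Summary

  1. The complete Worker workflow, and why Worker extends AQS: to get thread safety at low cost, through a simple non-reentrant lock around each task execution.

  2. The role interrupts play in the pool: shutting it down and resizing it.

  3. A detailed walkthrough of the AQS source follows on the next page.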