Dissecting the ThreadPoolExecutor Source: Core Parameters, Rejection Policies, and Backpressure


First, open ThreadPoolExecutor and you will see the following fields:

/**
 * The queue used for holding tasks and handing off to worker
 * threads.  We do not require that workQueue.poll() returning
 * null necessarily means that workQueue.isEmpty(), so rely
 * solely on isEmpty to see if the queue is empty (which we must
 * do for example when deciding whether to transition from
 * SHUTDOWN to TIDYING).  This accommodates special-purpose
 * queues such as DelayQueues for which poll() is allowed to
 * return null even if it may later return non-null when delays
 * expire.
 */
private final BlockingQueue<Runnable> workQueue;
/*
 * All user control parameters are declared as volatiles so that
 * ongoing actions are based on freshest values, but without need
 * for locking, since no internal invariants depend on them
 * changing synchronously with respect to other actions.
 */

/**
 * Factory for new threads. All threads are created using this
 * factory (via method addWorker).  All callers must be prepared
 * for addWorker to fail, which may reflect a system or user's
 * policy limiting the number of threads.  Even though it is not
 * treated as an error, failure to create threads may result in
 * new tasks being rejected or existing ones remaining stuck in
 * the queue.
 *
 * We go further and preserve pool invariants even in the face of
 * errors such as OutOfMemoryError, that might be thrown while
 * trying to create threads.  Such errors are rather common due to
 * the need to allocate a native stack in Thread.start, and users
 * will want to perform clean pool shutdown to clean up.  There
 * will likely be enough memory available for the cleanup code to
 * complete without encountering yet another OutOfMemoryError.
 */
private volatile ThreadFactory threadFactory;

/**
 * Handler called when saturated or shutdown in execute.
 */
private volatile RejectedExecutionHandler handler;

/**
 * Timeout in nanoseconds for idle threads waiting for work.
 * Threads use this timeout when there are more than corePoolSize
 * present or if allowCoreThreadTimeOut. Otherwise they wait
 * forever for new work.
 */
private volatile long keepAliveTime;

/**
 * If false (default), core threads stay alive even when idle.
 * If true, core threads use keepAliveTime to time out waiting
 * for work.
 */
private volatile boolean allowCoreThreadTimeOut;

/**
 * Core pool size is the minimum number of workers to keep alive
 * (and not allow to time out etc) unless allowCoreThreadTimeOut
 * is set, in which case the minimum is zero.
 *
 * Since the worker count is actually stored in COUNT_BITS bits,
 * the effective limit is {@code corePoolSize & COUNT_MASK}.
 */
private volatile int corePoolSize;

/**
 * Maximum pool size.
 *
 * Since the worker count is actually stored in COUNT_BITS bits,
 * the effective limit is {@code maximumPoolSize & COUNT_MASK}.
 */
private volatile int maximumPoolSize;

/**
 * The default rejected execution handler.
 */
private static final RejectedExecutionHandler defaultHandler =
    new AbortPolicy();

The main ones are:

  • threadFactory — the factory used to create threads
  • handler — the rejection policy
  • keepAliveTime — how long idle threads stay alive
  • allowCoreThreadTimeOut — whether core threads may also time out
  • corePoolSize — the number of core threads
  • maximumPoolSize — the maximum number of threads
  • defaultHandler — the default rejection policy (an AbortPolicy)
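These fields map one-to-one onto the most explicit ThreadPoolExecutor constructor. A minimal sketch (the concrete sizes here are illustrative, not prescribed by the source):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolConstruction {
    // Each constructor argument corresponds to one of the fields above;
    // the numeric values are illustrative only.
    public static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                2,                                    // corePoolSize
                4,                                    // maximumPoolSize
                60L, TimeUnit.SECONDS,                // keepAliveTime (+ unit)
                new ArrayBlockingQueue<>(8),          // workQueue (bounded)
                Executors.defaultThreadFactory(),     // threadFactory
                new ThreadPoolExecutor.AbortPolicy()  // handler
        );
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        System.out.println(pool.getCorePoolSize());    // 2
        System.out.println(pool.getMaximumPoolSize()); // 4
        pool.shutdown();
    }
}
```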

The thread pool's execution process:

/*
 * Proceed in 3 steps:
 *
 * 1. If fewer than corePoolSize threads are running, try to
 * start a new thread with the given command as its first
 * task.  The call to addWorker atomically checks runState and
 * workerCount, and so prevents false alarms that would add
 * threads when it shouldn't, by returning false.
 *
 * 2. If a task can be successfully queued, then we still need
 * to double-check whether we should have added a thread
 * (because existing ones died since last checking) or that
 * the pool shut down since entry into this method. So we
 * recheck state and if necessary roll back the enqueuing if
 * stopped, or start a new thread if there are none.
 *
 * 3. If we cannot queue task, then we try to add a new
 * thread.  If it fails, we know we are shut down or saturated
 * and so reject the task.
 */

The default rejection policy is AbortPolicy.

Now let's look at how each rejection policy is defined.


  • AbortPolicy — rejects outright
public static class AbortPolicy implements RejectedExecutionHandler {
    /**
     * Creates an {@code AbortPolicy}.
     */
    public AbortPolicy() { }

    /**
     * Always throws RejectedExecutionException.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     * @throws RejectedExecutionException always
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        throw new RejectedExecutionException("Task " + r.toString() +
                                             " rejected from " +
                                             e.toString());
    }
}

As you can see, it simply throws a RejectedExecutionException.
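A quick way to see AbortPolicy in action is to saturate a tiny pool and catch the exception; the 1-thread / 1-slot sizing below is purely illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AbortPolicyDemo {
    // Saturate a 1-thread pool with a 1-slot queue, then submit one more
    // task; AbortPolicy (the default) throws RejectedExecutionException.
    public static boolean submitUntilRejected() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch block = new CountDownLatch(1);
        try {
            pool.execute(() -> { try { block.await(); } catch (InterruptedException ignored) {} });
            pool.execute(() -> {});   // fills the queue
            pool.execute(() -> {});   // rejected: queue full, threads at max
            return false;
        } catch (RejectedExecutionException expected) {
            return true;
        } finally {
            block.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(submitUntilRejected()); // true
    }
}
```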

  • DiscardPolicy
public static class DiscardPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardPolicy}.
     */
    public DiscardPolicy() { }

    /**
     * Does nothing, which has the effect of discarding task r.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    }
}

As you can see, it does nothing at all, silently ignoring the task.
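The effect is easy to verify: with the same saturated-pool setup (sizes illustrative), the overflow task neither throws nor ever runs:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DiscardDemo {
    // Saturate a 1-thread pool with a 1-slot queue; with DiscardPolicy the
    // overflow task is dropped silently: no exception, and it never runs.
    public static boolean overflowWasDiscarded() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.DiscardPolicy());
        CountDownLatch block = new CountDownLatch(1);
        boolean[] ran = new boolean[1];
        pool.execute(() -> { try { block.await(); } catch (InterruptedException ignored) {} });
        pool.execute(() -> {});             // fills the queue
        pool.execute(() -> ran[0] = true);  // silently discarded
        block.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return !ran[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(overflowWasDiscarded()); // true
    }
}
```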

  • DiscardOldestPolicy
public static class DiscardOldestPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardOldestPolicy} for the given executor.
     */
    public DiscardOldestPolicy() { }

    /**
     * Obtains and ignores the next task that the executor
     * would otherwise execute, if one is immediately available,
     * and then retries execution of task r, unless the executor
     * is shut down, in which case task r is instead discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            e.getQueue().poll();
            e.execute(r);
        }
    }
}

As you can see, it removes the oldest task from the head of the work queue and then resubmits task r to the pool (if the pool has been shut down, r is simply discarded).
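A small demonstration (pool sizes illustrative): the task waiting at the head of the queue is evicted and the new one takes its place:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DiscardOldestDemo {
    // With DiscardOldestPolicy, the head of the queue (the "oldest" task)
    // is dropped and the newly submitted task is queued in its place.
    public static List<String> run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.DiscardOldestPolicy());
        List<String> executed = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch block = new CountDownLatch(1);
        pool.execute(() -> { try { block.await(); } catch (InterruptedException ignored) {} });
        pool.execute(() -> executed.add("old"));  // queued
        pool.execute(() -> executed.add("new"));  // evicts "old", takes its slot
        block.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return executed;                           // only "new" ever ran
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // [new]
    }
}
```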

  • CallerRunsPolicy
public static class CallerRunsPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code CallerRunsPolicy}.
     */
    public CallerRunsPolicy() { }

    /**
     * Executes task r in the caller's thread, unless the executor
     * has been shut down, in which case the task is discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            r.run();
        }
    }
}

As you can see, the task is not handed to a pool thread at all: whoever submitted it runs it, i.e. r.run() executes synchronously in the caller's own thread.
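This can be observed directly: when the pool is saturated (sizes below are illustrative), the overflow task runs on the submitting thread itself:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsDemo {
    // Saturate a 1-thread pool with a 1-slot queue; with CallerRunsPolicy
    // the overflow task runs synchronously on the submitting thread.
    public static boolean overflowRanOnCaller() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch block = new CountDownLatch(1);
        Thread[] ranOn = new Thread[1];
        try {
            pool.execute(() -> { try { block.await(); } catch (InterruptedException ignored) {} });
            pool.execute(() -> {});                                // fills the queue
            pool.execute(() -> ranOn[0] = Thread.currentThread()); // rejected -> runs here
            return ranOn[0] == Thread.currentThread();
        } finally {
            block.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(overflowRanOnCaller()); // true
    }
}
```

Because the rejected task runs to completion before execute() returns, the caller cannot submit anything else in the meantime — which is exactly the throttling effect discussed next.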

This brings in a mechanism known as backpressure.

An Analysis of the Thread Pool's Backpressure Mechanism

Backpressure is a self-protection strategy for systems under heavy load, used to keep the system from collapsing when overloaded. In a Java thread pool, backpressure is implemented mainly through the task queue and the rejection policy, ensuring that the inflow of tasks can be throttled sensibly when system resources run low.

The Basic Principle of Thread Pool Backpressure

Thread pool backpressure means that when the pool's processing capacity hits its limit, the submission of new tasks is restrained in some way, preventing unbounded task buildup from exhausting system resources. This resembles the "backpressure" concept in fluid dynamics: downstream pushes back on upstream to regulate flow.

The thread pool implements backpressure through two core components:

  1. A bounded task queue: limits how many tasks may wait in line; when the queue fills up, the backpressure mechanism is triggered
  2. A rejection policy: defines how newly submitted tasks are handled once the queue is full and the thread count has reached its maximum

How Thread Pool Backpressure Is Implemented

1. Basic backpressure via the task queue

The thread pool's task queue (a BlockingQueue) is the first line of defense:

  • Bounded queue: e.g. ArrayBlockingQueue with a fixed capacity; once it fills up, newly submitted tasks trigger the rejection policy (note that ThreadPoolExecutor enqueues with a non-blocking offer(), so execute itself never blocks on a full queue)
  • Unbounded queue: e.g. LinkedBlockingQueue with no capacity specified; it can grow without bound, risking memory exhaustion, and offers no real backpressure protection

While all core threads are busy and the queue still has room, new tasks wait in the queue; once the queue fills up, the backpressure mechanism takes effect.

2. Stronger backpressure via the rejection policy

When the pool's thread count has reached maximumPoolSize and the work queue is full, the rejection policy fires. ThreadPoolExecutor ships four built-in policies; the ones most relevant to backpressure are:

  1. CallerRunsPolicy (caller-runs): the most typical backpressure policy. The task is pushed back onto the submitting thread, which forces the caller to execute it synchronously and so naturally slows the rate at which new tasks are produced
  2. DiscardOldestPolicy (discard-oldest): drops the oldest task in the queue, then retries submitting the current one
  3. DiscardPolicy (discard): silently drops the task that cannot be handled
  4. AbortPolicy (abort): the default; throws a RejectedExecutionException

CallerRunsPolicy embodies the backpressure idea best, because making the producer (the caller) take part in execution naturally lowers the production rate.

The Backpressure Workflow

Following ThreadPoolExecutor's execution logic, backpressure is triggered through the following workflow:

  1. When a task is submitted, if fewer than corePoolSize threads are running, a new thread is created to run it
  2. If the running thread count has reached corePoolSize, the task is placed in the work queue
  3. If the queue is full and fewer than maximumPoolSize threads are running, a new thread is created to run it
  4. If the queue is full and the running thread count has reached maximumPoolSize, the rejection policy handles the task (backpressure triggers)
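The four steps above can be sketched as a tiny decision function. This is a deliberately simplified model of execute()'s ordering, not the real implementation (which packs run state and worker count into the atomic ctl field and re-checks them):

```java
public class ExecuteFlow {
    // A simplified, illustrative model of the task-submission decision.
    public static String decide(int running, int core, int max,
                                int queued, int queueCapacity) {
        if (running < core)         return "start core thread";
        if (queued < queueCapacity) return "enqueue task";
        if (running < max)          return "start non-core thread";
        return "reject (backpressure triggers)";
    }

    public static void main(String[] args) {
        System.out.println(decide(1, 2, 4, 0, 8)); // start core thread
        System.out.println(decide(2, 2, 4, 3, 8)); // enqueue task
        System.out.println(decide(2, 2, 4, 8, 8)); // start non-core thread
        System.out.println(decide(4, 2, 4, 8, 8)); // reject (backpressure triggers)
    }
}
```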

How Backpressure Behaves in Different Scenarios

1. CPU-bound tasks

For CPU-bound tasks, backpressure usually triggers early, because:

  • Task throughput is limited by the number of CPU cores
  • A small queue is recommended so that backpressure triggers quickly and tasks do not pile up

2. IO-bound tasks

For IO-bound tasks:

  • A larger queue is reasonable, since threads release the CPU while waiting on IO
  • maximumPoolSize can be set higher, but mind the system's resource limits

Configuration Advice

To use backpressure effectively, configuring the pool's parameters sensibly matters:

  1. Use a bounded queue: avoid unbounded queues that can exhaust memory; ArrayBlockingQueue is recommended
  2. Set a reasonable queue capacity: base it on task characteristics (execution time, resource consumption)
  3. Choose an appropriate rejection policy: CallerRunsPolicy is the best fit for backpressure
  4. Monitor pool state: watch getActiveCount(), getQueue().size(), and similar metrics, and tune the parameters over time
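A monitoring helper might look like the sketch below; the class name and output format are my own, but the getters all come from the public ThreadPoolExecutor API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolMonitor {
    // One-line snapshot built from the getters mentioned above.
    public static String snapshot(ThreadPoolExecutor pool) {
        return String.format("active=%d poolSize=%d queued=%d completed=%d",
                pool.getActiveCount(), pool.getPoolSize(),
                pool.getQueue().size(), pool.getCompletedTaskCount());
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(8));
        System.out.println(snapshot(pool)); // all zeros for an idle pool
        pool.shutdown();
    }
}
```

In practice this kind of snapshot would be logged or exported on a schedule, so that queue growth is visible before the rejection policy starts firing.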

Comparison with Backpressure in Other Systems

Thread pool backpressure shares its philosophy with backpressure in reactive programming (e.g. Reactor) and in distributed systems (e.g. BookKeeper), but the implementations differ:

  1. Backpressure in reactive systems: the Subscriber actively requests how much data it can handle
  2. Backpressure in BookKeeper: implemented by capping write caches, queue sizes, and the like
  3. Thread pool backpressure: implemented via a full queue plus the rejection policy

Summary

The thread pool's backpressure mechanism is an important safeguard of system stability: by combining a bounded queue with a rejection policy, it protects the system from being crushed under overload. Configuring the pool sensibly and choosing an appropriate rejection policy (especially CallerRunsPolicy) lets you build a system that is both efficient and robust. In practice, tune the backpressure thresholds to the task profile (CPU-bound vs. IO-bound) and keep refining the configuration through monitoring.