Java Source Code - java.util.concurrent.CyclicBarrier

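Preface

The earlier article on CountDownLatch showed that CountDownLatch can coordinate multiple threads: the main thread runs its task only after all the designated threads have finished.

However, CountDownLatch has a limitation that the JDK documentation also points out: it can be used only once. In some scenarios this is wasteful, because a new CountDownLatch instance has to be created every time. For that case the CountDownLatch documentation points to CyclicBarrier, a barrier that can be reused.

CyclicBarrier is widely used in multithreaded programming. For example, you may want a group of tasks to work in parallel and then, before moving on to the next step, wait until every task has finished. This is similar to join.

The JDK describes CyclicBarrier as follows: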

/*
 * A synchronization aid that allows a set of threads to all wait for
 * each other to reach a common barrier point.  CyclicBarriers are
 * useful in programs involving a fixed sized party of threads that
 * must occasionally wait for each other. The barrier is called
 * <em>cyclic</em> because it can be re-used after the waiting threads
 * are released.
 *
 * <p>A {@code CyclicBarrier} supports an optional {@link Runnable} command
 * that is run once per barrier point, after the last thread in the party
 * arrives, but before any threads are released.
 * This <em>barrier action</em> is useful
 * for updating shared-state before any of the parties continue.
 */

Differences between CyclicBarrier and CountDownLatch

CountDownLatch: 
A synchronization aid that allows one or more threads to wait until a set of 
operations being performed in other threads completes.

CyclicBarrier : 
A synchronization aid that allows a set of threads to all wait for each other 
to reach a common barrier point.

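CountDownLatch: one thread (or several) waits until N other threads have completed something before it proceeds. CyclicBarrier: N threads wait for each other; until every one of them has reached the barrier, all of them must wait. Put differently, with CountDownLatch the focus is on the "one thread": it is the one that waits, while the other N threads, once they have finished their "something", can keep running or terminate without waiting. With CyclicBarrier the focus is on the N threads themselves: as long as any one of them has not arrived, all of them must wait.

So:

CountDownLatch is a counter: each time a thread finishes, one is counted off, like a roll call, except that the count goes down.

CyclicBarrier is more like a sluice gate: running threads are like flowing water, held back at the gate until the water is full (all threads have arrived), and only then released.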
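The contrast can be made concrete with a small sketch. It is only an illustration: the class name `LatchVsBarrier` and its helper methods are made up for this example and are not part of the JDK. The first method shows that a latch stays at zero once it has counted down; the second lets two threads meet several times on the same barrier and counts the trips through a barrier action.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
class LatchVsBarrier {
    // A latch is one-shot: once the count reaches zero it stays there.
    static long latchCountAfterExtraCountDown() {
        CountDownLatch latch = new CountDownLatch(1);
        latch.countDown(); // 1 -> 0, any waiters are released
        latch.countDown(); // no effect, the latch cannot be re-armed
        return latch.getCount();
    }

    // Two threads meet `rounds` times on ONE barrier; returns how often it tripped.
    static int tripsOnOneBarrier(int rounds) {
        AtomicInteger trips = new AtomicInteger();
        // The barrier action runs once per trip, so it counts the trips.
        CyclicBarrier barrier = new CyclicBarrier(2, trips::incrementAndGet);
        Thread worker = new Thread(() -> meet(barrier, rounds));
        worker.start();
        meet(barrier, rounds);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return trips.get();
    }

    private static void meet(CyclicBarrier barrier, int rounds) {
        try {
            for (int i = 0; i < rounds; i++) {
                barrier.await(); // blocks until both threads have arrived
            }
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("latch count: " + latchCountAfterExtraCountDown());
        System.out.println("barrier trips: " + tripsOnOneBarrier(3));
    }
}
```

With a latch, each of the three rounds would need a fresh CountDownLatch; the single barrier instance serves all of them.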

CyclicBarrier data structure

CyclicBarrier is built on ReentrantLock and AbstractQueuedSynchronizer, so its data structure also rests on the data structures of AQS.

CyclicBarrier's static nested class

    /**
     * Each use of the barrier is represented as a generation instance.
     * The generation changes whenever the barrier is tripped, or
     * is reset. There can be many generations associated with threads
     * using the barrier - due to the non-deterministic way the lock
     * may be allocated to waiting threads - but only one of these
     * can be active at a time (the one to which {@code count} applies)
     * and all the rest are either broken or tripped.
     * There need not be an active generation if there has been a break
     * but no subsequent reset.
     */
    private static class Generation {
        boolean broken = false;
    }

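The Generation class has a single field, broken, which indicates whether the current barrier has been broken.

CyclicBarrier has the notion of a "generation". Because a CyclicBarrier can be reused, each time all threads pass the barrier one generation is over, much like the New Year: once everyone has crossed into New Year's Day, the calendar is replaced.

Why is this needed? It is hard to explain at this point, so we come back to it when reading the source code.

CyclicBarrier fields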

    /** The lock for guarding barrier entry */
    // reentrant lock
    private final ReentrantLock lock = new ReentrantLock();
    /** Condition to wait on until tripped */
    // condition queue
    private final Condition trip = lock.newCondition();
    /** The number of parties */
    // number of participating threads
    private final int parties;
    /* The command to run when tripped */
    // action executed by the last thread to enter the barrier
    private final Runnable barrierCommand;
    /** The current generation */
    // current Generation
    private Generation generation = new Generation();

    /**
     * Number of parties still waiting. Counts down from parties to 0
     * on each generation.  It is reset to parties on each new
     * generation or when broken.
     */
    // number of parties that have not yet arrived at the barrier
    private int count;

Among these fields, one is a ReentrantLock and one is a Condition, and the Condition is itself based on AQS, so in the end the underlying support comes from AQS.

CyclicBarrier constructors

public CyclicBarrier(int parties, Runnable barrierAction) {
    if (parties <= 0) throw new IllegalArgumentException();
    this.parties = parties;
    this.count = parties;
    this.barrierCommand = barrierAction;
}

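This constructor specifies the number of threads (parties) associated with the CyclicBarrier and, optionally, an action to run once all threads have entered the barrier. The action is executed by the last thread to enter the barrier.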

public CyclicBarrier(int parties) {
    this(parties, null);
}

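This constructor only specifies the number of threads associated with the CyclicBarrier and sets no barrier action.

Anyone who has used CyclicBarrier knows that it supports running a task at the moment all threads have reached the barrier.

The parties field is the number of threads; it determines when the barrier opens and lets all threads through.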
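A minimal sketch of both constructors follows. The class `BarrierActionDemo` and its method names are hypothetical, chosen for this illustration. To make the arrival order deterministic, the calling thread polls `getNumberWaiting()` until the worker is already waiting, so the caller is known to arrive last.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the JDK.
class BarrierActionDemo {
    // Returns true if the barrier action ran in the thread that arrived last.
    static boolean actionRunsInLastArrival() {
        AtomicReference<Thread> actionThread = new AtomicReference<>();
        CyclicBarrier barrier =
                new CyclicBarrier(2, () -> actionThread.set(Thread.currentThread()));

        Thread worker = new Thread(() -> arrive(barrier), "worker");
        worker.start();
        // Wait until the worker is parked at the barrier, so the caller is last.
        while (barrier.getNumberWaiting() < 1) {
            Thread.yield();
        }
        arrive(barrier); // last arrival: runs the action, then trips the barrier
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return actionThread.get() == Thread.currentThread();
    }

    // The constructor rejects a party count that is zero or negative.
    static boolean rejectsNonPositiveParties() {
        try {
            new CyclicBarrier(0);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    private static void arrive(CyclicBarrier barrier) {
        try {
            barrier.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("action ran in last arrival: " + actionRunsInLastArrival());
        System.out.println("rejects parties <= 0: " + rejectsNonPositiveParties());
    }
}
```

No extra thread is created for the action; it borrows the thread of the last arrival, which is why a slow action delays the release of every waiter.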

CyclicBarrier basic methods

dowait()

This is the core method of CyclicBarrier: the public await methods all delegate to dowait. Its source code is as follows.

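private int dowait(boolean timed, long nanos)
    throws InterruptedException, BrokenBarrierException,
           TimeoutException {
    final ReentrantLock lock = this.lock;
    // acquire the lock
    lock.lock();
    try {
        // the current generation
        final Generation g = generation;
        // if this generation is broken, throw
        if (g.broken)
            throw new BrokenBarrierException();

        // if the current thread has been interrupted, throw
        if (Thread.interrupted()) {
            // mark the generation as broken
            // and wake up the other threads blocked on this barrier
            breakBarrier();
            throw new InterruptedException();
        }
        // arrival index of this thread
        int index = --count;
        // 0 means this is the last thread to arrive
        if (index == 0) {  // tripped
            boolean ranAction = false;
            try {
                final Runnable command = barrierCommand;
                // run the barrier action
                if (command != null)
                    command.run();
                ranAction = true;
                // start the next generation: reset count, replace generation,
                // and wake up the threads that were waiting
                nextGeneration();
                // done
                return 0;
            } finally {
                // if the barrier action threw, break the barrier
                if (!ranAction)
                    breakBarrier();
            }
        }

        // loop until tripped, broken, interrupted, or timed out
        for (;;) {
            try {
                // without a time limit, wait until woken up
                if (!timed)
                    trip.await();
                // with a time limit, wait for at most the remaining time
                else if (nanos > 0L)
                    nanos = trip.awaitNanos(nanos);
            } catch (InterruptedException ie) {
                // g == generation: still the generation this thread arrived in
                // !g.broken: that generation has not been broken
                if (g == generation && ! g.broken) {
                    // break the barrier
                    breakBarrier();
                    throw ie;
                } else {
                    // Otherwise the wait was about to finish anyway (the barrier
                    // already tripped or broke), so the interrupt does not affect
                    // this generation. Just restore the interrupt flag.
                    Thread.currentThread().interrupt();
                }
            }
            // When any thread is interrupted or times out, breakBarrier is called
            // and wakes up the other threads, which must then throw as well.
            if (g.broken)
                throw new BrokenBarrierException();
            // g != generation: the barrier tripped and a new generation started,
            // so return this thread's arrival index.
            // If g == generation, the thread woke up although the barrier has not
            // tripped, for example on a spurious wakeup or when a timed wait
            // expired. The generation check keeps it from returning too early.
            if (g != generation)
                return index;
            // with a time limit and no time left, break the barrier and throw
            if (timed && nanos <= 0L) {
                breakBarrier();
                throw new TimeoutException();
            }
        }
    } finally {
        lock.unlock();
    }
}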

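The code is long, but the overall logic is simple. Here is a summary of the method.

First, every CyclicBarrier has a lock, and a thread must acquire it to execute await. All arrivals are therefore serialized on that lock, which limits throughput under heavy contention.
Next come the interruption checks. Note that if any one waiting thread is interrupted, it throws InterruptedException and all the other waiting threads throw BrokenBarrierException. The barrier then stays broken and cannot be used again until reset() is called.
Each call to await means that a thread has arrived at the barrier, so the counter is decremented. When the counter reaches 0, this is the last thread of the generation to arrive, and it tries to run the task passed to the constructor. Finally it starts a new generation, resets the counter, and wakes up all threads waiting at the barrier.
A thread that is not the last to arrive blocks in the Condition's await method. If it is interrupted while waiting, it throws. One case deserves attention: the interrupted thread may no longer belong to the current generation. For example, the last thread has already called signalAll and replaced the generation object, and the waiting thread is interrupted before it has returned. The JDK then considers the work of that generation complete and only restores the interrupt flag. The else branch in the catch block handles exactly this rare case: the generation has already changed and an interrupt arrives afterwards. The barrier ignores it because the rendezvous has already succeeded.
When a thread is interrupted, the other threads are woken up too, so each of them has to check the broken flag.
If a thread wakes up while g still equals generation (for example, on a spurious wakeup), it must not return; it loops and blocks again. Conversely, if the generation has changed, the barrier has tripped and the thread returns its arrival index, which completes one pass through the barrier.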
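The steps above can be condensed into a deliberately simplified sketch. `SimpleBarrier` is a hypothetical class written for this article, not the JDK implementation: it keeps only the lock, the condition, the counter and the generation, and leaves out the barrier action, interruption, timeouts and the broken state.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, stripped-down barrier: no action, no interrupts, no timeouts.
class SimpleBarrier {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition trip = lock.newCondition();
    private final int parties;
    private int count;                        // parties that have not arrived yet
    private Object generation = new Object(); // replaced on every trip

    SimpleBarrier(int parties) {
        if (parties <= 0) throw new IllegalArgumentException();
        this.parties = parties;
        this.count = parties;
    }

    /** Returns the arrival index: parties - 1 for the first arrival, 0 for the last. */
    int await() {
        lock.lock();
        try {
            Object g = generation;
            int index = --count;
            if (index == 0) {              // last arrival trips the barrier
                trip.signalAll();
                count = parties;           // reset for the next round
                generation = new Object(); // start a new generation
                return 0;
            }
            // Re-check the generation after every wakeup, so that a spurious
            // wakeup does not release the thread early.
            while (g == generation) {
                trip.awaitUninterruptibly();
            }
            return index;
        } finally {
            lock.unlock();
        }
    }

    // Demo helper: every thread passes the barrier `rounds` times; returns the
    // sum of all arrival indices that were handed out.
    static int sumOfArrivalIndices(int parties, int rounds) {
        SimpleBarrier barrier = new SimpleBarrier(parties);
        AtomicInteger sum = new AtomicInteger();
        Thread[] threads = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            threads[i] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    sum.addAndGet(barrier.await());
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return sum.get();
    }

    public static void main(String[] args) {
        // 3 parties hand out indices 2, 1, 0 per round: 3 per round, 6 in total.
        System.out.println(sumOfArrivalIndices(3, 2));
    }
}
```

Replacing the generation object is what lets a fast thread re-enter the barrier for the next round while slower threads of the previous round are still waking up: each waiter compares against the generation it captured on arrival.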

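To illustrate: dowait goes through a series of checks, and its overall flow is roughly as follows.

[Figure: flowchart of dowait]

nextGeneration()

This method is called once all threads have entered the barrier. It starts the next generation, after which all threads can enter the barrier again. Its source code is as follows.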

    /**
     * Updates state on barrier trip and wakes up everyone.
     * Called only while holding lock.
     */
    private void nextGeneration() {
        // signal completion of last generation
        trip.signalAll();
        // set up next generation
        // restore the number of parties still expected at the barrier
        count = parties;
        // start a new generation
        generation = new Generation();
    }

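This method calls signalAll on the Condition, which is implemented by AQS's ConditionObject and wakes up every thread waiting on the condition. Its source code is as follows.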

        public final void signalAll() {
            if (!isHeldExclusively()) // lock not held exclusively by the current thread
                throw new IllegalMonitorStateException();
            // save the head of the condition queue
            Node first = firstWaiter;
            if (first != null) // the condition queue is not empty
                // wake up all waiting threads
                doSignalAll(first);
        }
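The `isHeldExclusively()` check is easy to observe from the outside. The following sketch, with the hypothetical class name `SignalAllNeedsLock`, calls `signalAll()` on a Condition of a ReentrantLock once without and once with the lock held.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of the JDK.
class SignalAllNeedsLock {
    // Calling signalAll() without holding the lock fails the ownership check.
    static boolean throwsWithoutLock() {
        ReentrantLock lock = new ReentrantLock();
        Condition condition = lock.newCondition();
        try {
            condition.signalAll();
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        }
    }

    // With the lock held the call succeeds, even if nobody is waiting.
    static boolean succeedsWithLock() {
        ReentrantLock lock = new ReentrantLock();
        Condition condition = lock.newCondition();
        lock.lock();
        try {
            condition.signalAll();
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("throws without lock: " + throwsWithoutLock());
        System.out.println("succeeds with lock: " + succeedsWithLock());
    }
}
```

This is why nextGeneration and breakBarrier are documented as "Called only while holding lock".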

This method checks whether the head node is null, that is, whether the condition queue is empty, and then calls doSignalAll, whose source code is as follows.

        private void doSignalAll(Node first) {
            // clear both head and tail of the condition queue
            lastWaiter = firstWaiter = null;
            // loop over all waiters
            do {
                // remember the node that follows first
                Node next = first.nextWaiter;
                // detach first from the condition queue
                first.nextWaiter = null;
                // move first from the condition queue to the sync queue
                transferForSignal(first);
                // advance to the next node
                first = next;
            } while (first != null);
        }

This method moves the nodes of the condition queue to the sync queue one by one by calling transferForSignal, whose source code is as follows.

    final boolean transferForSignal(Node node) {
        /*
         * If cannot change waitStatus, the node has been cancelled.
         */
        if (!compareAndSetWaitStatus(node, Node.CONDITION, 0))
            return false;

        /*
         * Splice onto queue and try to set waitStatus of predecessor to
         * indicate that thread is (probably) waiting. If cancelled or
         * attempt to set waitStatus fails, wake up to resync (in which
         * case the waitStatus can be transiently and harmlessly wrong).
         */
        Node p = enq(node);
        int ws = p.waitStatus;
        if (ws > 0 || !compareAndSetWaitStatus(p, ws, Node.SIGNAL))
            LockSupport.unpark(node.thread);
        return true;
    }

This method moves a node from the condition queue to the sync queue and updates the node's wait status. It calls enq, whose source code is as follows.

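    private Node enq(final Node node) {
        for (;;) { // loop until the node has been enqueued
            // save the tail
            Node t = tail;
            if (t == null) { // tail is null: the queue is not initialized yet
                // install a new dummy node as head
                if (compareAndSetHead(new Node()))
                    tail = head; // head and tail point to the same new node
            } else { // tail is not null: the queue is already initialized
                // link node's prev to the current tail
                node.prev = t;
                // if t is still the tail, make node the new tail
                if (compareAndSetTail(t, node)) {
                    // link the old tail's next to node
                    t.next = node;
                    return t; // return the old tail (node's predecessor)
                }
            }
        }
    }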

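This method inserts the node into the sync queue and is easy to follow.

Putting the analysis together, the main method calls made by nextGeneration are as follows.

[Figure: call chain of nextGeneration]

breakBarrier()

This method breaks the current barrier and wakes up all threads waiting at it.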

    /**
     * Sets current barrier generation as broken and wakes up everyone.
     * Called only while holding lock.
     */
    private void breakBarrier() {
        // mark the current generation as broken
        generation.broken = true;
        // restore the number of parties still expected at the barrier
        count = parties;
        // wake up all waiting threads
        trip.signalAll();
    }

As you can see, this method also relies on the signalAll method of the AQS-based Condition.
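The effect of breakBarrier can be observed with a small experiment. In this sketch, whose class name `InterruptBreaksBarrier` is hypothetical, two workers wait on a three-party barrier whose third party never arrives, and one of them is then interrupted. The main thread polls `getNumberWaiting()` first so that both workers are known to be waiting before the interrupt is sent.

```java
import java.util.Map;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;

// Hypothetical demo class, not part of the JDK.
class InterruptBreaksBarrier {
    // Interrupts one of two waiters and reports what each await() threw.
    static String interruptOneWaiter() {
        CyclicBarrier barrier = new CyclicBarrier(3); // the third party never arrives
        Map<String, String> outcome = new ConcurrentHashMap<>();
        Runnable task = () -> {
            String name = Thread.currentThread().getName();
            try {
                barrier.await();
                outcome.put(name, "returned");
            } catch (InterruptedException | BrokenBarrierException e) {
                outcome.put(name, e.getClass().getSimpleName());
            }
        };
        Thread w1 = new Thread(task, "w1");
        Thread w2 = new Thread(task, "w2");
        w1.start();
        w2.start();
        // Make sure both workers are waiting before interrupting one of them.
        while (barrier.getNumberWaiting() < 2) {
            Thread.yield();
        }
        w1.interrupt();
        try {
            w1.join();
            w2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "w1=" + outcome.get("w1")
                + ", w2=" + outcome.get("w2")
                + ", broken=" + barrier.isBroken();
    }

    public static void main(String[] args) {
        System.out.println(interruptOneWaiter());
    }
}
```

Only the interrupted thread sees InterruptedException. The other waiter is woken by the signalAll inside breakBarrier, finds the generation broken and throws BrokenBarrierException.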

await()

    /*
     * Waits until all {@linkplain #getParties parties} have invoked
     * {@code await} on this barrier.
     */
    public int await() throws InterruptedException, BrokenBarrierException {
        try {
            return dowait(false, 0L);
        } catch (TimeoutException toe) {
            throw new Error(toe);
        }
    }

    /*
     * Waits until all {@linkplain #getParties parties} have invoked
     * {@code await} on this barrier, or the specified waiting time elapses.
     */
    public int await(long timeout, TimeUnit unit)
            throws InterruptedException,
            BrokenBarrierException,
            TimeoutException {
        return dowait(true, unit.toNanos(timeout));
    }
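A short sketch of the timed variant, using the hypothetical class name `TimedAwaitDemo`: a single thread waits on a two-party barrier with a timeout, and because the second party never arrives the wait times out and the barrier is left broken.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not part of the JDK.
class TimedAwaitDemo {
    // Waits alone on a two-party barrier, so the timed await must expire.
    static String timedAwaitOutcome() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        try {
            barrier.await(50, TimeUnit.MILLISECONDS);
            return "tripped";
        } catch (TimeoutException e) {
            // dowait breaks the barrier before it throws TimeoutException
            return "timeout, broken=" + barrier.isBroken();
        } catch (InterruptedException | BrokenBarrierException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(timedAwaitOutcome());
    }
}
```

The untimed await() can never see a TimeoutException, which is why it wraps that impossible case in an Error.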

getParties()

public int getParties() { return parties; }

isBroken()

    /**
     * Queries if this barrier is in a broken state.
     *
     * @return {@code true} if one or more parties broke out of this
     *         barrier due to interruption or timeout since
     *         construction or the last reset, or a barrier action
     *         failed due to an exception; {@code false} otherwise.
     */
    public boolean isBroken() {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
          return generation.broken;
        } finally {
            lock.unlock();
        }
    }

reset()

    /**
     * Resets the barrier to its initial state.  If any parties are
     * currently waiting at the barrier, they will return with a
     * {@link BrokenBarrierException}. Note that resets <em>after</em>
     * a breakage has occurred for other reasons can be complicated to
     * carry out; threads need to re-synchronize in some other way,
     * and choose one to perform the reset.  It may be preferable to
     * instead create a new barrier for subsequent use.
     */
    public void reset() {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            breakBarrier();       // break the current generation
            nextGeneration();     // start a new generation
        } finally {
            lock.unlock();
        }
    }
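What reset() means for a thread that is already waiting can be shown with a sketch (the class name `ResetDemo` is hypothetical): the waiter is released with BrokenBarrierException, while the barrier itself starts a fresh, unbroken generation.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the JDK.
class ResetDemo {
    // Resets a barrier while one thread is waiting on it.
    static String resetWhileWaiting() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        AtomicReference<String> seen = new AtomicReference<>("returned");
        Thread waiter = new Thread(() -> {
            try {
                barrier.await();
            } catch (InterruptedException | BrokenBarrierException e) {
                seen.set(e.getClass().getSimpleName());
            }
        });
        waiter.start();
        // Make sure the waiter is parked at the barrier before resetting.
        while (barrier.getNumberWaiting() < 1) {
            Thread.yield();
        }
        barrier.reset(); // breaks the old generation, then starts a new one
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "waiter=" + seen.get() + ", broken=" + barrier.isBroken();
    }

    public static void main(String[] args) {
        System.out.println(resetWhileWaiting());
    }
}
```

isBroken() reports false afterwards because it reads the new generation; the broken flag was set on the old generation object that the waiter still held.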

getNumberWaiting()

    /**
     * Returns the number of parties currently waiting at the barrier.
     * This method is primarily useful for debugging and assertions.
     */
    public int getNumberWaiting() {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            return parties - count;
        } finally {
            lock.unlock();
        }
    }
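Since getNumberWaiting() is simply parties - count, its value can be followed through one round. The sketch below uses the hypothetical class name `NumberWaitingDemo` and samples the value before any arrival, once two of three parties are waiting, and after the barrier has tripped.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical demo class, not part of the JDK.
class NumberWaitingDemo {
    // Samples getNumberWaiting() before, during and after one round.
    static int[] snapshots() {
        CyclicBarrier barrier = new CyclicBarrier(3);
        int[] seen = new int[3];
        seen[0] = barrier.getNumberWaiting(); // nobody has arrived yet

        Runnable arrive = () -> {
            try {
                barrier.await();
            } catch (InterruptedException | BrokenBarrierException e) {
                throw new IllegalStateException(e);
            }
        };
        Thread a = new Thread(arrive);
        Thread b = new Thread(arrive);
        a.start();
        b.start();
        // Poll until both workers have arrived and are waiting.
        while (barrier.getNumberWaiting() < 2) {
            Thread.yield();
        }
        seen[1] = barrier.getNumberWaiting();

        arrive.run(); // the caller is the third party and trips the barrier
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        seen[2] = barrier.getNumberWaiting(); // count was reset to parties
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(snapshots()));
    }
}
```

The polling loop is also a practical use of the method: in tests it gives a way to wait until other threads have actually reached the barrier.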

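Summary

Judging from the await method, CyclicBarrier is fairly simple. The JDK's approach is to keep a counter, decrement it each time a thread calls await, and block the thread on a Condition. When the counter reaches 0, the last thread runs the task passed to the constructor (if any) and then wakes up all waiting threads. Because a CyclicBarrier is reusable, the counter is then reset.

Another important point is the notion of a generation. A waiting thread can wake up without the barrier having tripped (for example, on a spurious wakeup), so each waiter compares the generation it arrived in with the current one. If they still match and the barrier is not broken, it goes back to waiting. The generation also records the broken state: if any waiting thread is interrupted or times out, that thread throws InterruptedException or TimeoutException, all the other waiters throw BrokenBarrierException, and the CyclicBarrier cannot be used again until reset() is called.

In short, CyclicBarrier is implemented with a counter: an internal count variable is decremented on every call. When a full round at the barrier is over, the counter is reset, which is what makes the barrier reusable.

The difference from CountDownLatch is that a CountDownLatch is finished after a single use, whereas a CyclicBarrier can be used many times. The two are similar in function, but CyclicBarrier adds reuse and an optional task that runs at the barrier, which makes it more flexible.

The following diagram shows the flow of CyclicBarrier, which is similar to that of CountDownLatch:

[Figure: CyclicBarrier flow diagram]

CyclicBarrier example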

package com.hust.grid.leesf.cyclicbarrier;

import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class MyThread extends Thread {
    private CyclicBarrier cb;
    public MyThread(String name, CyclicBarrier cb) {
        super(name);
        this.cb = cb;
    }
    
    public void run() {
        System.out.println(Thread.currentThread().getName() + " going to await");
        try {
            cb.await();
            System.out.println(Thread.currentThread().getName() + " continue");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
public class CyclicBarrierDemo {
    public static void main(String[] args) 
                           throws InterruptedException, BrokenBarrierException {
        CyclicBarrier cb = new CyclicBarrier(3, new Thread("barrierAction") {
            public void run() {
                System.out.println(Thread.currentThread().getName() 
                                            + " barrier action");
                
            }
        });
        MyThread t1 = new MyThread("t1", cb);
        MyThread t2 = new MyThread("t2", cb);
        t1.start();
        t2.start();
        System.out.println(Thread.currentThread().getName() + " going to await");
        cb.await();
        System.out.println(Thread.currentThread().getName() + " continue");

    }
}

Output

t1 going to await
main going to await
t2 going to await
t2 barrier action
t2 continue
t1 continue
main continue

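From this output, one possible call sequence is the following.

[Figure: sequence diagram of main, t1 and t2]

In the figure above, t1's cb.await is assumed to run after main's cb.await, and t2's cb.await after t1's, so t2 is the last thread to enter the barrier. The barrierAction is executed by the last thread to enter the barrier. Based on the sequence diagram, the internal workflow can be analyzed further.

　　① The main thread executes cb.await. The main method calls are as follows.

[Figure: method calls for main's cb.await]

Note: because ReentrantLock uses the non-fair policy by default, dowait calls the lock method of ReentrantLock.NonfairSync. At this point the AQS state is 0, meaning that no thread holds the lock, so main can acquire it. dowait then calls trip.await. As a result, a node containing the main thread is placed in the condition queue and main is parked. The lock held by main is released at the same time, so other threads can acquire it.

　　② The t1 thread executes cb.await, assuming its lock.lock call happens after main has released the lock. The main method calls are as follows.

[Figure: method calls for t1's cb.await]

Note: the condition queue now contains two nodes. The node containing t1 is appended at the tail, and t1 is also parked, so at this point both threads are blocked.

　　③ The t2 thread executes cb.await, assuming its lock.lock call happens after t1 has released the lock. The main method calls are as follows.

[Figure: method calls for t2's cb.await]

Note: as the figure shows, after t2 calls await it runs command.run directly. No new thread is started; the action is run by the last thread to enter the barrier. All nodes in the condition queue are then moved to the sync queue, and finally the main thread is unparked and can continue.

　　④ The main thread gets CPU time and continues. The figure below shows the main method calls.

[Figure: method calls as main resumes]

Note: because main was parked inside the await method of AQS.ConditionObject, it resumes inside that method. Once it has finished, t1 is unparked and can continue once it gets CPU time.

　　⑤ The t1 thread gets CPU time and continues. The figure below shows the main method calls.

[Figure: method calls as t1 resumes]

Note: because t1 was parked inside the await method of AQS.ConditionObject, it resumes inside that method. Afterwards the sync queue keeps a single empty node, and both head and tail point to it.

Note: if a thread is interrupted while it is in await, it throws an exception, the barrier is broken, and every thread waiting at the barrier is released.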

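Author: 莫那·鲁道
Link: juejin.cn/post/684490…
Source: 掘金
Copyright belongs to the author. For commercial reprints, contact the author for permission; for non-commercial reprints, credit the source.

Author: leesf
Link: www.cnblogs.com/leesf456/p/…
Source: cnblogs
Copyright belongs to the author. For commercial reprints, contact the author for permission; for non-commercial reprints, credit the source.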