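1. LongAdder/DoubleAdder in Detail

The previous article covered AtomicInteger for concurrent numeric updates. It solves the thread-safety problem, but under heavy contention threads keep retrying (spinning) until their update succeeds, and that retrying burns CPU and wastes resources.

This means:

When contention is low, AtomicInteger, AtomicLong and similar classes are a good fit.

When contention is high, LongAdder or DoubleAdder should be used instead.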
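To make the "spinning" above concrete, here is a minimal sketch of a CAS retry loop written on top of AtomicLong.compareAndSet. It only illustrates the idea: the class name CasLoopCounter and its methods are made up for this example, and the JDK's own incrementAndGet may be implemented differently (for example with a hardware fetch-and-add).

```java
import java.util.concurrent.atomic.AtomicLong;

class CasLoopCounter {
    private final AtomicLong value = new AtomicLong();

    // Read the current value, compute the next one, and try to swap it in.
    // If another thread got there first, the CAS fails and we retry: that retry is the "spin".
    long increment() {
        while (true) {
            long current = value.get();
            long next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    long get() {
        return value.get();
    }

    // Hypothetical helper: `threads` threads each call increment() `times` times.
    static long run(int threads, int times) {
        CasLoopCounter counter = new CasLoopCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```

The more threads hit the same variable at once, the more often compareAndSet fails and the loop repeats, which is where the wasted CPU comes from.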

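2. LongAdder Performance Test

Three configurations are used:
10 threads, 10,000 increments each
10 threads, 200,000 increments each
100 threads, 200,000 increments each

This compares LongAdder and AtomicLong with the same thread count but different increment counts, and with the same increment count but different thread counts.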

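import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class LongAdderTest {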
    public static void main(String[] args) {
        testAtomicLongVSLongAdder(10, 10000);
        System.out.println("==================");
        testAtomicLongVSLongAdder(10, 200000);
        System.out.println("==================");
        testAtomicLongVSLongAdder(100, 200000);
    }

    static void testAtomicLongVSLongAdder(final int threadCount, final int times) {
        try {
            long start = System.currentTimeMillis();
            testLongAdder(threadCount, times);
            long end = System.currentTimeMillis() - start;
            System.out.println("Setup >>>>>> threads: " + threadCount + ", increments per thread: " + times);
            System.out.println("Result >>>>>> LongAdder: " + (threadCount * times) + " increments, elapsed ms: " + end);

            long start2 = System.currentTimeMillis();
            testAtomicLong(threadCount, times);
            long end2 = System.currentTimeMillis() - start2;
            System.out.println("Setup >>>>>> threads: " + threadCount + ", increments per thread: " + times);
            System.out.println("Result >>>>>> AtomicLong: " + (threadCount * times) + " increments, elapsed ms: " + end2);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    static void testAtomicLong(final int threadCount, final int times) throws InterruptedException {
        CountDownLatch countDownLatch = new CountDownLatch(threadCount);
        AtomicLong atomicLong = new AtomicLong();
        for (int i = 0; i < threadCount; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int j = 0; j < times; j++) {
                        atomicLong.incrementAndGet();
                    }
                    countDownLatch.countDown();
                }
            }, "my-thread" + i).start();
        }
        countDownLatch.await();
    }

    static void testLongAdder(final int threadCount, final int times) throws InterruptedException {
        CountDownLatch countDownLatch = new CountDownLatch(threadCount);
        LongAdder longAdder = new LongAdder();
        for (int i = 0; i < threadCount; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int j = 0; j < times; j++) {
                        longAdder.add(1);
                    }
                    countDownLatch.countDown();
                }
            }, "my-thread" + i).start();
        }
        countDownLatch.await();
    }
}

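Results:

From this test we can conclude that AtomicLong is sufficient for low-contention, ordinary business scenarios. If contention is high and the workload is write-heavy with few reads, LongAdder is likely the better choice.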

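3. How LongAdder Works

As mentioned above, AtomicLong is lock-free: it achieves thread safety by retrying a CAS operation in a spin loop rather than by taking a lock. Now let's look at how LongAdder is implemented.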

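3-1. Design idea

AtomicLong keeps the actual long in a single internal variable, value, and every operation targets that variable. Under high concurrency value becomes a hotspot: N threads compete for one memory location. The basic idea of LongAdder is to spread that hotspot out. The value is distributed across an array, different threads hit different slots of the array, and each thread performs CAS only on the value in its own slot. The hotspot is dispersed and the chance of a conflict drops considerably. To obtain the real long value, the values in all slots are added up and returned.

Put simply, LongAdder splits the running total across the slots of an array. The array starts with 2 slots, its size is always a power of two, and it is doubled under contention until it reaches or exceeds the number of CPUs. Each thread is hashed to a slot and performs CAS only on that slot's value, and the final result is the sum over all slots.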
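The striping idea can be sketched in a few lines. This is a deliberately simplified illustration, not the JDK implementation: the slot count is fixed instead of growing, there is no base variable, and the slots sit next to each other in an AtomicLongArray, so false sharing is not avoided (the JDK pads each cell with @Contended). StripedCounter, PROBE and run are hypothetical names.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounter {
    // Per-thread pseudo-random number used to pick a slot (assumption: stands in for the JDK's thread probe).
    private static final ThreadLocal<Integer> PROBE =
            ThreadLocal.withInitial(() -> ThreadLocalRandom.current().nextInt());

    private final AtomicLongArray slots;
    private final int mask;

    // slotCount must be a power of two so that (probe & mask) is a valid index.
    StripedCounter(int slotCount) {
        if (slotCount <= 0 || Integer.bitCount(slotCount) != 1) {
            throw new IllegalArgumentException("slotCount must be a power of two");
        }
        this.slots = new AtomicLongArray(slotCount);
        this.mask = slotCount - 1;
    }

    // Each thread updates only the slot its probe maps to.
    void add(long x) {
        slots.addAndGet(PROBE.get() & mask, x);
    }

    // The logical value is the sum over all slots.
    long sum() {
        long total = 0;
        for (int i = 0; i < slots.length(); i++) {
            total += slots.get(i);
        }
        return total;
    }

    // Hypothetical helper: `threads` threads each add 1 `times` times.
    static long run(int slotCount, int threads, int times) {
        StripedCounter counter = new StripedCounter(slotCount);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.sum();
    }
}
```

Two threads collide only when their probes map to the same slot, so the more slots there are, the less often updates contend with each other.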

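3-2. Internal structure of LongAdder

Internally LongAdder has a base variable and a Cell[] array:

base: when there is no contention, values are added directly to this variable

Cell[] array: under contention, each thread adds to its own slot Cell[i]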

    /** Number of CPUS, to place bound on table size */
    // Number of CPU cores; bounds the size of the cell table
    static final int NCPU = Runtime.getRuntime().availableProcessors();

    /**
     * Table of cells. When non-null, size is a power of 2.
     */
    // The slot array; its size is always a power of two
    transient volatile Cell[] cells;

    /**
     * Base value, used mainly when there is no contention, but also as
     * a fallback during table initialization races. Updated via CAS.
     */
    // base is used in two cases:
    // 1. When there is no contention, values are accumulated directly into base.
    // 2. The cells array must be initialized exactly once (only one thread may do it);
    //    threads that lose that race add their value to base instead.
    transient volatile long base;

    /**
     * Spinlock (locked via CAS) used when resizing and/or creating Cells.
     */
    transient volatile int cellsBusy;

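Striped64 also defines an inner class Cell, which is the slot described earlier. Each Cell object holds a value field, and that field can be updated with CAS through Unsafe.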
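As a rough sketch of what such a slot looks like, the class below holds a volatile long and exposes a cas method. It swaps in AtomicLongFieldUpdater for Unsafe so that it compiles outside the JDK. CellSketch is a hypothetical name, and the sketch omits the padding (@Contended) that the JDK's Cell uses to avoid false sharing.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;

class CellSketch {
    // The slot's value; volatile so that updates are visible across threads.
    volatile long value;

    // Stand-in for Unsafe: performs CAS on the `value` field of a CellSketch instance.
    private static final AtomicLongFieldUpdater<CellSketch> VALUE =
            AtomicLongFieldUpdater.newUpdater(CellSketch.class, "value");

    CellSketch(long initial) {
        this.value = initial;
    }

    // Sets value to `updated` only if it currently equals `expected`; returns whether it succeeded.
    boolean cas(long expected, long updated) {
        return VALUE.compareAndSet(this, expected, updated);
    }
}
```

A caller would typically read value, compute the new total, and call cas with both, treating a false return as a sign of contention on this slot.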

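3-3. The LongAdder#add method

The logic of LongAdder#add is shown in the following figure:

The base value is updated only while no contention has ever occurred (apart from serving as a fallback during cell-table initialization races). Once a conflict has occurred, all subsequent updates go to the Cell elements of the Cell[] array.

If Cell[] has not been initialized yet, add calls the parent class's longAccumulate to initialize it. If Cell[] is already initialized but the conflict happens inside a Cell, add also calls longAccumulate, which may then need to expand Cell[].

This is the elegant part of LongAdder's design: keep hotspot contention to a minimum, and defer the more expensive work of creating or expanding cells until it is truly unavoidable.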
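The "base first, cells only after a conflict" flow can be sketched as follows. This is a simplified model under stated assumptions, not the JDK source: cells are plain AtomicLong objects, initialization uses synchronized instead of the cellsBusy spinlock, all cells are created eagerly, and the array never grows. MiniAdder, PROBE and initCells are hypothetical names.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

class MiniAdder {
    // Per-thread number used to pick a cell (assumption: stands in for the JDK's thread probe).
    private static final ThreadLocal<Integer> PROBE =
            ThreadLocal.withInitial(() -> ThreadLocalRandom.current().nextInt());

    private final AtomicLong base = new AtomicLong();
    private volatile AtomicLong[] cells; // stays null until a CAS on base fails

    void add(long x) {
        AtomicLong[] cs = cells;
        if (cs == null) {
            long b = base.get();
            if (base.compareAndSet(b, b + x)) {
                return; // fast path: no contention, base absorbs the update
            }
            cs = initCells(); // a CAS on base failed: switch to cells from now on
        }
        // Contended path: update only the cell this thread maps to.
        cs[PROBE.get() & (cs.length - 1)].addAndGet(x);
    }

    // Creates the cell array exactly once; later callers get the existing array.
    private synchronized AtomicLong[] initCells() {
        AtomicLong[] cs = cells;
        if (cs == null) {
            cs = new AtomicLong[2]; // power-of-two size, starting at 2
            for (int i = 0; i < cs.length; i++) {
                cs[i] = new AtomicLong();
            }
            cells = cs; // volatile write publishes the fully built array
        }
        return cs;
    }

    long sum() {
        long total = base.get();
        AtomicLong[] cs = cells;
        if (cs != null) {
            for (AtomicLong c : cs) {
                total += c.get();
            }
        }
        return total;
    }
}
```

In a single-threaded run the CAS on base never fails, so cells stays null and the adder behaves like a plain AtomicLong.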

3-3-1. The Striped64#longAccumulate method

The overall flow of Striped64#longAccumulate is shown in the following flowchart:

3-4. The LongAdder#sum method

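/**
 * Returns the accumulated sum, i.e. the count "at the current moment".
 * Note: under high concurrency, an exact value for a given instant cannot be
 * obtained without a global lock.
 * The result may not be exact, because other threads may still be adding while
 * this method runs. The call and the return happen at different points in time,
 * so under concurrency the value is only approximately accurate.
 */
public long sum() {
    Cell[] as = cells; Cell a;
    long sum = base;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null)
                sum += a.value;
        }
    }
    return sum;
}
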
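Because sum does not lock the Cell array while adding up the values, other threads may modify a Cell during the summation, and the array may even be expanded. The value returned by sum is therefore not an atomic snapshot taken at the moment of the call, and under concurrent updates it is only approximate.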
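The flip side is that sum is exact once no thread is updating the adder any more. The sketch below, with the hypothetical names SumAfterJoinDemo and countWith, waits for every writer with join before reading the total.

```java
import java.util.concurrent.atomic.LongAdder;

class SumAfterJoinDemo {
    // `threads` threads each increment a shared LongAdder `times` times.
    static long countWith(int threads, int times) {
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    adder.increment();
                }
            });
            workers[i].start();
        }
        try {
            // join() guarantees that every update has completed and is visible here.
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // No concurrent writers remain, so this sum is the exact total.
        return adder.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10_000)); // prints 40000
    }
}
```

Calling sum while the workers are still running would return some intermediate value instead, which is fine for metrics and statistics but not for logic that needs an exact reading.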

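3-5. LongAccumulator

LongAccumulator is a more general version of LongAdder. LongAdder only supports adding and subtracting values, whereas LongAccumulator accepts a user-supplied function. Its constructor takes that function and an identity (initial) value: LongAccumulator(LongBinaryOperator accumulatorFunction, long identity).

With LongBinaryOperator you can define your own operation on the inputs and return the result (it takes two long arguments and returns one long). The order in which values are accumulated is not guaranteed, so the function should be one whose result does not depend on that order. Internally LongAccumulator works almost exactly like LongAdder: both rely on the longAccumulate method of their parent class Striped64.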

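import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAccumulator;
import java.util.stream.IntStream;

public class LongAccumulatorTest {

    public static void main(String[] args) throws InterruptedException {
        // Accumulate with x + y, starting from the identity value 0
        LongAccumulator accumulator = new LongAccumulator((x, y) -> x + y, 0);

        ExecutorService executor = Executors.newFixedThreadPool(8);
        // Accumulate the values 1 through 9
        IntStream.range(1, 10).forEach(i -> executor.submit(() -> accumulator.accumulate(i)));

        // Wait for the submitted tasks to finish instead of sleeping for a fixed time;
        // shutting the pool down also lets the JVM exit
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(accumulator.getThenReset()); // prints 45

    }
}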
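Because the function is supplied by the caller, the same class can do more than sums. As a small sketch, the example below tracks a running maximum with Long::max, using Long.MIN_VALUE as the identity. MaxAccumulatorDemo and maxOf are hypothetical names; maximum is a suitable function here because its result does not depend on the order in which values arrive.

```java
import java.util.concurrent.atomic.LongAccumulator;
import java.util.stream.LongStream;

class MaxAccumulatorDemo {
    // Feeds the values into a max-accumulator from a parallel stream and returns the largest one.
    static long maxOf(long... values) {
        LongAccumulator max = new LongAccumulator(Long::max, Long.MIN_VALUE);
        LongStream.of(values).parallel().forEach(max::accumulate);
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println(maxOf(3, 42, 7, -1)); // prints 42
    }
}
```

With no input values, maxOf returns the identity, Long.MIN_VALUE.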