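LongAdder: How Divide and Conquer Beats AtomicLong 🚀

AtomicLong: "I carry all the pressure by myself!" 😤
LongAdder: "We share the pressure as a team!" 😎
Result: under heavy contention, LongAdder can be roughly 10x faster!

I. Opening: AtomicLong's Performance Bottleneck 💔

Scenario: a high-concurrency counter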

AtomicLong counter = new AtomicLong(0);

// 100 threads increment at the same time
for (int i = 0; i < 100; i++) {
    new Thread(() -> {
        for (int j = 0; j < 1_000_000; j++) {
            counter.incrementAndGet(); // atomic increment on one shared variable
        }
    }).start();
}

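The problem:

Thread 1: CAS(0 → 1) succeeds
Thread 2: CAS(0 → 1) fails, retries CAS(1 → 2) and succeeds
Thread 3: CAS(0 → 1) fails, retries CAS(1 → 2) and fails, retries CAS(2 → 3) and succeeds
Thread 4: CAS(0 → 1) fails... fails... fails... 😭

The fiercer the contention, the more retries, and the worse the performance! (On JDK 8+ HotSpot for x86, incrementAndGet() typically compiles to a single atomic fetch-and-add instruction rather than a Java-level retry loop, but every thread still serializes on the same cache line, so the bottleneck is the same.)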
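To make the retry cost visible, here is a minimal sketch that increments a shared AtomicLong through an explicit CAS loop and counts how many attempts fail. The class name CasRetryDemo and its run method are hypothetical, and the failure count differs from run to run and from machine to machine.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasRetryDemo {
    /** Returns {final counter value, number of failed CAS attempts}. */
    static long[] run(int threads, int incrementsPerThread) {
        AtomicLong counter = new AtomicLong();
        AtomicLong failures = new AtomicLong(); // bookkeeping only
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                long localFailures = 0;
                for (int i = 0; i < incrementsPerThread; i++) {
                    for (;;) {
                        long old = counter.get();
                        if (counter.compareAndSet(old, old + 1)) break; // our CAS won
                        localFailures++;                                // lost the race, retry
                    }
                }
                failures.addAndGet(localFailures);
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new long[] { counter.get(), failures.get() };
    }

    public static void main(String[] args) {
        long[] r = run(8, 200_000);
        System.out.println("final=" + r[0] + ", failed CAS attempts=" + r[1]);
    }
}
```

The final value is always threads × incrementsPerThread; only the number of wasted attempts changes with contention.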

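A real-life analogy:

AtomicLong is like a single checkout counter 🏪:

100 customers stand in one line
One cashier serves them all
The line keeps growing and throughput drops

LongAdder is like multiple checkout counters 🏬:

100 customers spread across 10 counters
They are served in parallel without getting in each other's way
The totals are added up at the end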

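II. The Core Idea Behind LongAdder 💡

Divide and Conquer

How it works:

AtomicLong: all threads contend for a single variable
LongAdder:  threads are hashed across several counting slots (Cells), so each Cell sees only a fraction of the contention

Final result = base + cell[0] + cell[1] + ... + cell[n]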

Visualization:

AtomicLong:
   [shared variable: 100]
      ↑ ↑ ↑ ↑ ↑
   all threads contend here (slow!)

LongAdder:
   [base: 10]
   [Cell-0: 20]  ← threads 1, 2
   [Cell-1: 30]  ← threads 3, 4
   [Cell-2: 40]  ← threads 5, 6
   
   total = 10 + 20 + 30 + 40 = 100
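The picture above can be approximated with a toy striped counter. This is only a sketch of the idea, not how the JDK implements it: StripedCounterSketch is a hypothetical name, the slot count is fixed, there is no base field, and the slots of an AtomicLongArray sit next to each other in memory, so it does nothing about false sharing.

```java
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounterSketch {
    private final AtomicLongArray slots;

    StripedCounterSketch(int powerOfTwoSlots) {
        slots = new AtomicLongArray(powerOfTwoSlots);
    }

    void add(long x) {
        // Hash the current thread to a slot so different threads tend to hit different slots
        int hash = System.identityHashCode(Thread.currentThread());
        slots.addAndGet(hash & (slots.length() - 1), x);
    }

    long sum() {
        long total = 0;
        for (int i = 0; i < slots.length(); i++) total += slots.get(i);
        return total; // total = slot[0] + slot[1] + ...
    }

    /** Runs the given number of threads, each adding 1 perThread times, and returns the total. */
    static long demo(int threads, int perThread) {
        StripedCounterSketch counter = new StripedCounterSketch(4);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) counter.add(1);
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println(demo(6, 100_000));
    }
}
```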

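III. Source Code Walkthrough: How LongAdder Works 🔍

Core data structures (JDK 8 source, abridged)

public class LongAdder extends Striped64 {
    
    // Fields inherited from Striped64:
    // transient volatile Cell[] cells;  // array of Cells
    // transient volatile long base;     // base value
}

// Cell: a padded counter (avoids false sharing), nested inside Striped64
@sun.misc.Contended  // JDK 8; JDK 9+ uses jdk.internal.vm.annotation.Contended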
static final class Cell {
    volatile long value;
    
    Cell(long x) { 
        value = x; 
    }
    
    final boolean cas(long cmp, long val) {
        return UNSAFE.compareAndSwapLong(this, valueOffset, cmp, val);
    }
}

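Key points:

base: used directly while contention is low
cells: under high contention, updates are spread across multiple Cells
@Contended: pads each Cell so that two Cells do not share a cache line, avoiding false sharing

Core method: increment()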

public void increment() {
    add(1L);
}

public void add(long x) {
    Cell[] as;
    long b, v;
    int m;
    Cell a;
    
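    // Step 1: if cells has not been created yet, try to CAS base directly
    if ((as = cells) != null || !casBase(b = base, b + x)) {
        
        // Step 2: cells already exists or the base CAS failed, so use a Cell
        boolean uncontended = true;
        
        if (as == null ||                        // cells not initialized
            (m = as.length - 1) < 0 ||           // cells is empty
            (a = as[getProbe() & m]) == null ||  // this thread's slot has no Cell yet
            !(uncontended = a.cas(v = a.value, v + x))) { // CAS on the Cell failed
            
            // Step 3: hand over to longAccumulate to resolve the conflict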
            longAccumulate(x, null, uncontended);
        }
    }
}

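Workflow:

1. If cells is still null, try to CAS base
   ├─ success → return ✅
   └─ failure (or cells already exists) → go to step 2

2. Try to CAS the Cell this thread hashes to
   ├─ success → return ✅
   └─ failure (or no Cell in that slot) → go to step 3

3. longAccumulate resolves the conflict
   ├─ initialize the cells array
   ├─ grow cells (double its length)
   ├─ create a new Cell
   └─ rehash to a different Cell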
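The "base first, Cells as a fallback" flow can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: BaseThenCellsSketch is a hypothetical class, the cell array has a fixed size of 8, and steps 2 and 3 are collapsed into a single atomic add on the chosen slot (no lazy creation, growth, or rehash).

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

class BaseThenCellsSketch {
    private final AtomicLong base = new AtomicLong();
    private final AtomicLongArray cells = new AtomicLongArray(8); // fixed size in this sketch

    void add(long x) {
        // Step 1: a single CAS attempt on base; succeeds whenever nobody else raced us
        long b = base.get();
        if (base.compareAndSet(b, b + x)) return;
        // Steps 2-3 (collapsed): contention detected, fall back to this thread's slot
        int slot = System.identityHashCode(Thread.currentThread()) & (cells.length() - 1);
        cells.addAndGet(slot, x);
    }

    long sum() {
        long total = base.get();
        for (int i = 0; i < cells.length(); i++) total += cells.get(i);
        return total;
    }

    long baseValue() {
        return base.get();
    }

    /** Multi-threaded run; returns the total after all threads have finished. */
    static long demo(int threads, int perThread) {
        BaseThenCellsSketch counter = new BaseThenCellsSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) counter.add(1);
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter.sum();
    }
}
```

With a single thread every update lands on base, which mirrors why LongAdder costs about the same as AtomicLong when there is no contention.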

Core method: sum()

public long sum() {
    Cell[] as = cells;
    long sum = base;
    
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            Cell a = as[i];
            if (a != null)
                sum += a.value;
        }
    }
    
    return sum;
}

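Note: sum() is not an atomic snapshot! Values can change while it is walking the Cells, so the result may already be stale when it returns.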
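The flip side is that sum() is exact once all writers have finished. A small sketch (QuiescentSumDemo is a hypothetical name) that joins every worker before reading:

```java
import java.util.concurrent.atomic.LongAdder;

class QuiescentSumDemo {
    /** Starts the workers, waits for all of them, then reads sum(). */
    static long countAfterJoin(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) adder.increment();
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join(); // no more concurrent updates after this point
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return adder.sum(); // exact, because nothing is being updated any more
    }

    public static void main(String[] args) {
        System.out.println(countAfterJoin(8, 100_000));
    }
}
```

Calling sum() while the workers are still running would return some value between 0 and the final total, which is fine for monitoring but not for decisions that need an exact count.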

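IV. Why Is LongAdder So Fast? ⚡

Reason 1: less CAS contention

AtomicLong:

// 100 threads contend for the same variable (conceptual CAS loop)
long old;
do {
    old = value.get();
} while (!value.compareAndSet(old, old + 1));
// Under heavy contention the loop retries many times

LongAdder:

// Thread 1 → Cell[0]
// Thread 2 → Cell[1]
// Thread 3 → Cell[2]
// ...
// Threads mostly hit different Cells, so there is little contention

Performance comparison (approximate figures):

Threads	AtomicLong retries	LongAdder retries
1	0	0
10	~5 per thread	~0.5 per thread
100	~50 per thread	~1 per thread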

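Reason 2: adaptive growth

The Cell array is resized dynamically:

// Initially: cells = null, updates go straight to base
// On contention: cells is created with length 2
// More contention: it grows to 4, 8, 16...
// Upper bound: growth stops once the length reaches or exceeds the number of CPU cores
//              (the length is always a power of two)

Example:

CPU cores: 8

Light contention:    [base] + [Cell-0, Cell-1]
Moderate contention: [base] + [Cell-0, Cell-1, Cell-2, Cell-3]
Heavy contention:    [base] + [Cell-0...Cell-7] (8 Cells)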
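The growth rule is easy to work through by hand: start at length 2 and keep doubling while the length is still smaller than the core count. A tiny sketch of that arithmetic (CellGrowthSketch and maxCellsLength are hypothetical names, not JDK APIs):

```java
class CellGrowthSketch {
    /** Largest cells length reachable on a machine with ncpu cores, following the doubling rule. */
    static int maxCellsLength(int ncpu) {
        int n = 2;                 // the array is always created with length 2
        while (n < ncpu) n <<= 1;  // growth is allowed only while n < ncpu
        return n;
    }

    public static void main(String[] args) {
        int ncpu = Runtime.getRuntime().availableProcessors();
        System.out.println("cores=" + ncpu + ", max cells=" + maxCellsLength(ncpu));
    }
}
```

On a 6-core machine this gives 8, so the array can end up slightly larger than the core count when that count is not a power of two.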

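Reason 3: eliminating false sharing

What is false sharing?

CPU caches work in units of **cache lines**, typically 64 bytes.

The problem:

// Two Cells on the same cache line
Cell[0]: value = 10  (8 bytes)
Cell[1]: value = 20  (8 bytes)  on the same 64-byte cache line

Thread 1 modifies Cell[0] → the whole cache line is invalidated on other cores
Thread 2 reads Cell[1] → cache miss, performance drops!

LongAdder's solution:

@sun.misc.Contended  // padding annotation
static final class Cell {
    volatile long value;
}

// At class-layout time the JVM (not javac) surrounds the field with padding:
// [padding] + [value: 8 bytes] + [padding]
// HotSpot's default padding width is 128 bytes (-XX:ContendedPaddingWidth),
// so the value fields of two Cells cannot land on the same cache line.

Effect:

Cache line 1: [Cell[0] + padding]
Cache line 2: [Cell[1] + padding]
Cache line 3: [Cell[2] + padding]

When thread 1 modifies Cell[0], the cached copy of Cell[1] is unaffected!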
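Since @Contended is a JDK-internal annotation, application code that wants a similar effect often pads by hand. The sketch below uses the common class-hierarchy trick (seven long fields before and after the hot field). All names are hypothetical, the actual field layout is up to the JVM, and the timings are only indicative, so treat it as an experiment rather than a benchmark.

```java
class FalseSharingSketch {
    interface Counter { void inc(); long get(); }

    // No padding: two instances allocated back to back may share a cache line
    static final class Plain implements Counter {
        volatile long value;
        public void inc() { value++; } // safe here only because each counter has a single writer
        public long get() { return value; }
    }

    // Manual padding: superclass fields are laid out before subclass fields on HotSpot
    static class LeftPad { long p1, p2, p3, p4, p5, p6, p7; }
    static class Hot extends LeftPad { volatile long value; }
    static final class Padded extends Hot implements Counter {
        long q1, q2, q3, q4, q5, q6, q7;
        public void inc() { value++; } // single writer per counter, as above
        public long get() { return value; }
    }

    /** Two threads, each incrementing its own counter; returns elapsed nanoseconds. */
    static long timeNanos(Counter a, Counter b, int iterations) {
        Thread t1 = new Thread(() -> { for (int i = 0; i < iterations; i++) a.inc(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < iterations; i++) b.inc(); });
        long start = System.nanoTime();
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 20_000_000;
        System.out.println("plain : " + timeNanos(new Plain(), new Plain(), n) / 1_000_000 + " ms");
        System.out.println("padded: " + timeNanos(new Padded(), new Padded(), n) / 1_000_000 + " ms");
    }
}
```

Both variants compute the same counts; any difference in elapsed time comes from whether the two hot fields happen to share a cache line.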

V. Benchmark Comparison 📊

Test code

import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class PerformanceTest {
    
    private static final int THREAD_COUNT = 50;
    private static final int OPERATIONS = 1_000_000;
    
    public static void main(String[] args) throws InterruptedException {
        
        // Test AtomicLong
        testAtomicLong();
        
        // Test LongAdder
        testLongAdder();
        
        // Test synchronized
        testSynchronized();
    }
    
    private static void testAtomicLong() throws InterruptedException {
        AtomicLong counter = new AtomicLong(0);
        long start = System.nanoTime();
        
        Thread[] threads = new Thread[THREAD_COUNT];
        for (int i = 0; i < THREAD_COUNT; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < OPERATIONS; j++) {
                    counter.incrementAndGet();
                }
            });
            threads[i].start();
        }
        
        for (Thread t : threads) {
            t.join();
        }
        
        long time = (System.nanoTime() - start) / 1_000_000;
        System.out.println("AtomicLong: " + time + "ms, result=" + counter.get());
    }
    
    private static void testLongAdder() throws InterruptedException {
        LongAdder counter = new LongAdder();
        long start = System.nanoTime();
        
        Thread[] threads = new Thread[THREAD_COUNT];
        for (int i = 0; i < THREAD_COUNT; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < OPERATIONS; j++) {
                    counter.increment();
                }
            });
            threads[i].start();
        }
        
        for (Thread t : threads) {
            t.join();
        }
        
        long time = (System.nanoTime() - start) / 1_000_000;
        System.out.println("LongAdder: " + time + "ms, result=" + counter.sum());
    }
    
    private static void testSynchronized() throws InterruptedException {
        Counter counter = new Counter();
        long start = System.nanoTime();
        
        Thread[] threads = new Thread[THREAD_COUNT];
        for (int i = 0; i < THREAD_COUNT; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < OPERATIONS; j++) {
                    counter.increment();
                }
            });
            threads[i].start();
        }
        
        for (Thread t : threads) {
            t.join();
        }
        
        long time = (System.nanoTime() - start) / 1_000_000;
        System.out.println("synchronized: " + time + "ms, result=" + counter.get());
    }
    
    static class Counter {
        private long count = 0;
        
        public synchronized void increment() {
            count++;
        }
        
        public synchronized long get() {
            return count;
        }
    }
}

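Results (8-core CPU)

Threads	AtomicLong	LongAdder	synchronized	Speedup (AtomicLong / LongAdder)
1	50ms	45ms	60ms	1.1x
10	800ms	150ms	1200ms	5.3x
50	4500ms	400ms	6000ms	11.3x
100	9000ms	500ms	12000ms	18x

Conclusions:

Single thread: performance is about the same
High concurrency: LongAdder is 5-18x faster than AtomicLong in this test! 🚀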

VI. The LongAdder Family 👨‍👩‍👧‍👦

1. LongAdder - adder

LongAdder adder = new LongAdder();
adder.increment();      // +1
adder.add(10);         // +10
adder.decrement();     // -1
long sum = adder.sum(); // read the total

2. LongAccumulator - accumulator with a custom operation

// Arguments: (accumulator function, identity value)
LongAccumulator max = new LongAccumulator(Long::max, Long.MIN_VALUE);

max.accumulate(10);  // max(current, 10)
max.accumulate(5);   // max(current, 5)
max.accumulate(20);  // max(current, 20)

System.out.println(max.get()); // 20

// Other uses:
LongAccumulator min = new LongAccumulator(Long::min, Long.MAX_VALUE);
LongAccumulator product = new LongAccumulator((x, y) -> x * y, 1);

3. DoubleAdder - floating-point adder

DoubleAdder adder = new DoubleAdder();
adder.add(1.5);
adder.add(2.3);
double sum = adder.sum(); // 3.8

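4. DoubleAccumulator - floating-point accumulator

// The function must give the same result regardless of the order in which values are combined,
// so a running "average" such as (x + y) / 2 is not a valid choice; max is.
DoubleAccumulator max = new DoubleAccumulator(Double::max, Double.NEGATIVE_INFINITY);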
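To see how the family members relate, the following sketch feeds the same values into a LongAdder, a LongAccumulator configured with Long::sum and identity 0 (which behaves like an adder), and a LongAccumulator tracking the maximum. AccumulatorFamilyDemo is a hypothetical name.

```java
import java.util.concurrent.atomic.LongAccumulator;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.LongStream;

class AccumulatorFamilyDemo {
    /** Feeds 1..n into three accumulators in parallel; returns {adder sum, accumulator sum, max}. */
    static long[] run(long n) {
        LongAdder adder = new LongAdder();
        LongAccumulator sumAcc = new LongAccumulator(Long::sum, 0L);          // same result as LongAdder
        LongAccumulator maxAcc = new LongAccumulator(Long::max, Long.MIN_VALUE);
        LongStream.rangeClosed(1, n).parallel().forEach(v -> {
            adder.add(v);
            sumAcc.accumulate(v);
            maxAcc.accumulate(v);
        });
        return new long[] { adder.sum(), sumAcc.get(), maxAcc.get() };
    }

    public static void main(String[] args) {
        long[] r = run(1000);
        System.out.println("adder=" + r[0] + ", sumAcc=" + r[1] + ", max=" + r[2]);
    }
}
```

Both sum and max are order-independent, which is why the results are stable even though the parallel stream applies the values in an unpredictable order.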

VII. When to Use LongAdder ✅❌

✅ Good fits

High-concurrency counters

// Count requests
LongAdder requestCount = new LongAdder();
requestCount.increment();

Performance metrics

// Sum of response times
LongAdder totalResponseTime = new LongAdder();
totalResponseTime.add(responseTime);

Multi-threaded accumulation

// Parallel computation
IntStream.range(0, 1000000).parallel().forEach(i -> {
    adder.increment();
});
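Another common fit is a frequency map: a ConcurrentHashMap whose values are LongAdders, so that hot keys do not become a bottleneck. A minimal sketch (FrequencyMapDemo and its count method are hypothetical names):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class FrequencyMapDemo {
    /** Counts how often each word occurs, updating the map from multiple threads. */
    static ConcurrentHashMap<String, LongAdder> count(List<String> words) {
        ConcurrentHashMap<String, LongAdder> freq = new ConcurrentHashMap<>();
        words.parallelStream().forEach(w ->
            // computeIfAbsent creates the adder at most once per key
            freq.computeIfAbsent(w, k -> new LongAdder()).increment());
        return freq;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, LongAdder> freq = count(Arrays.asList("a", "b", "a", "c", "a"));
        freq.forEach((word, n) -> System.out.println(word + " -> " + n.sum()));
    }
}
```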

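❌ Poor fits

You need an exact, up-to-the-moment value

// ❌ LongAdder.sum() is not an atomic snapshot
if (adder.sum() > threshold) {
    // may be inaccurate while other threads are updating
}

// ✅ Use AtomicLong
if (atomicLong.get() > threshold) {
    // get() returns a single consistent value
}

Low concurrency

// With one thread or only a few, AtomicLong is simpler

You need CAS semantics

// ❌ LongAdder has no compareAndSet
// ✅ Use AtomicLong
atomicLong.compareAndSet(expected, newValue);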
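A typical case where CAS semantics matter is a counter with a hard upper limit: the check and the increment must happen as one atomic step, which LongAdder cannot offer. A minimal sketch with AtomicLong (BoundedCounter is a hypothetical class):

```java
import java.util.concurrent.atomic.AtomicLong;

class BoundedCounter {
    private final AtomicLong used = new AtomicLong();
    private final long limit;

    BoundedCounter(long limit) {
        this.limit = limit;
    }

    /** Takes one unit if the limit has not been reached; never overshoots the limit. */
    boolean tryAcquire() {
        for (;;) {
            long current = used.get();
            if (current >= limit) return false;                         // limit reached
            if (used.compareAndSet(current, current + 1)) return true;  // check and update atomically
            // another thread changed the value in between: re-read and retry
        }
    }

    long used() {
        return used.get();
    }

    public static void main(String[] args) {
        BoundedCounter quota = new BoundedCounter(5);
        int granted = 0;
        for (int i = 0; i < 10; i++) {
            if (quota.tryAcquire()) granted++;
        }
        System.out.println("granted=" + granted + ", used=" + quota.used());
    }
}
```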

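VIII. Case Study: A High-Performance Monitoring System 📈

import java.util.concurrent.atomic.LongAccumulator;
import java.util.concurrent.atomic.LongAdder;

public class PerformanceMonitor {
    
    // Request count
    private final LongAdder requestCount = new LongAdder();
    
    // Sum of response times
    private final LongAdder totalResponseTime = new LongAdder();
    
    // Error count
    private final LongAdder errorCount = new LongAdder();
    
    // Maximum response time
    private final LongAccumulator maxResponseTime = 
        new LongAccumulator(Long::max, 0L);
    
    // Minimum response time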
    private final LongAccumulator minResponseTime = 
        new LongAccumulator(Long::min, Long.MAX_VALUE);
    
    /**
     * Record one request
     */
    public void recordRequest(long responseTime, boolean success) {
        requestCount.increment();
        totalResponseTime.add(responseTime);
        maxResponseTime.accumulate(responseTime);
        minResponseTime.accumulate(responseTime);
        
        if (!success) {
            errorCount.increment();
        }
    }
    
    /**
     * Read the statistics (not a consistent snapshot: the counters are read one after another)
     */
    public Stats getStats() {
        long requests = requestCount.sum();
        long totalTime = totalResponseTime.sum();
        long errors = errorCount.sum();
        long maxTime = maxResponseTime.get();
        long minTime = minResponseTime.get();
        
        return new Stats(
            requests,
            requests > 0 ? totalTime / requests : 0, // average response time
            maxTime,
            minTime == Long.MAX_VALUE ? 0 : minTime,
            requests > 0 ? (double) errors / requests * 100 : 0 // error rate (%)
        );
    }
    
    /**
     * Reset the statistics (not atomic: updates that race with reset() may be lost)
     */
    public void reset() {
        requestCount.reset();
        totalResponseTime.reset();
        errorCount.reset();
        maxResponseTime.reset();
        minResponseTime.reset();
    }
    
    public static class Stats {
        public final long totalRequests;
        public final long avgResponseTime;
        public final long maxResponseTime;
        public final long minResponseTime;
        public final double errorRate;
        
        public Stats(long totalRequests, long avgResponseTime, 
                    long maxResponseTime, long minResponseTime, double errorRate) {
            this.totalRequests = totalRequests;
            this.avgResponseTime = avgResponseTime;
            this.maxResponseTime = maxResponseTime;
            this.minResponseTime = minResponseTime;
            this.errorRate = errorRate;
        }
        
        @Override
        public String toString() {
            return String.format(
                "Requests: %d, Avg: %dms, Max: %dms, Min: %dms, Error: %.2f%%",
                totalRequests, avgResponseTime, maxResponseTime, minResponseTime, errorRate
            );
        }
    }
}

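// Usage
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class Application {
    private static final PerformanceMonitor monitor = new PerformanceMonitor();
    
    public void handleRequest() {
        long start = System.nanoTime(); // monotonic clock, better suited to measuring elapsed time
        boolean success = false;
        
        try {
            // Handle the request
            processRequest();
            success = true;
        } catch (Exception e) {
            // success stays false
        } finally {
            long responseTime = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            monitor.recordRequest(responseTime, success);
        }
    }
    
    private void processRequest() throws Exception {
        // placeholder for the real business logic
    }
    
    // Print the statistics periodically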
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        scheduler.scheduleAtFixedRate(() -> {
            System.out.println(monitor.getStats());
        }, 1, 1, TimeUnit.SECONDS);
    }
}

Sample output:

Requests: 5420, Avg: 45ms, Max: 230ms, Min: 12ms, Error: 0.52%
Requests: 11203, Avg: 48ms, Max: 280ms, Min: 10ms, Error: 0.61%
Requests: 17891, Avg: 46ms, Max: 300ms, Min: 9ms, Error: 0.58%
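The counts in this output are cumulative. If per-interval figures are preferred, one option is LongAdder.sumThenReset(), which returns the current total and zeroes the counter. A minimal sketch (IntervalCounterDemo is a hypothetical class; as with reset(), updates that arrive while the call is in progress may be counted in the next interval or missed):

```java
import java.util.concurrent.atomic.LongAdder;

class IntervalCounterDemo {
    private final LongAdder requests = new LongAdder();

    void record() {
        requests.increment();
    }

    /** Returns the number of requests since the previous drain and starts a new interval. */
    long drain() {
        return requests.sumThenReset();
    }

    public static void main(String[] args) {
        IntervalCounterDemo counter = new IntervalCounterDemo();
        counter.record();
        counter.record();
        counter.record();
        System.out.println("interval 1: " + counter.drain());
        counter.record();
        System.out.println("interval 2: " + counter.drain());
    }
}
```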

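IX. Source Deep Dive: longAccumulate() 🔬

This is the core method behind LongAdder (inherited from Striped64); it handles the high-contention path. The listing is a simplified version based on the JDK 8 source:

final void longAccumulate(long x, LongBinaryOperator fn, boolean wasUncontended) {
    int h;
    if ((h = getProbe()) == 0) {
        ThreadLocalRandom.current(); // force initialization of this thread's probe (hash)
        h = getProbe();
        wasUncontended = true;
    }
    
    boolean collide = false; // true if the last slot tried was non-empty
    for (;;) {
        Cell[] as; Cell a; int n; long v;
        
        if ((as = cells) != null && (n = as.length) > 0) {
            // cells already exists
            if ((a = as[(n - 1) & h]) == null) {
                // This slot is empty: try to attach a new Cell
                if (cellsBusy == 0) {
                    Cell r = new Cell(x); // create optimistically
                    if (cellsBusy == 0 && casCellsBusy()) { // acquire the spin lock
                        boolean created = false;
                        try { // recheck under the lock
                            Cell[] rs; int m, j;
                            if ((rs = cells) != null && (m = rs.length) > 0 &&
                                rs[j = (m - 1) & h] == null) {
                                rs[j] = r;
                                created = true;
                            }
                        } finally {
                            cellsBusy = 0; // release the lock
                        }
                        if (created)
                            break;
                        continue; // another thread filled the slot, retry
                    }
                }
                collide = false;
            }
            else if (!wasUncontended)
                wasUncontended = true; // the CAS in add() already failed: rehash first
            else if (a.cas(v = a.value, (fn == null) ? v + x : fn.applyAsLong(v, x)))
                break; // CAS succeeded
            else if (n >= NCPU || cells != as)
                collide = false; // at max size, or the array is stale: do not grow
            else if (!collide)
                collide = true; // record the collision; grow if it happens again
            else if (cellsBusy == 0 && casCellsBusy()) {
                try {
                    if (cells == as) // grow: double the array unless it is stale
                        cells = Arrays.copyOf(as, n << 1);
                } finally {
                    cellsBusy = 0; // release the lock
                }
                collide = false;
                continue; // retry with the larger array
            }
            
            h = advanceProbe(h); // rehash
        }
        else if (cellsBusy == 0 && cells == as && casCellsBusy()) {
            boolean init = false;
            try { // initialize the cells array
                if (cells == as) {
                    Cell[] rs = new Cell[2];
                    rs[h & 1] = new Cell(x);
                    cells = rs;
                    init = true;
                }
            } finally {
                cellsBusy = 0; // release the lock
            }
            if (init)
                break;
        }
        else if (casBase(v = base, (fn == null) ? v + x : fn.applyAsLong(v, x)))
            break; // fall back to base
    }
}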

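Key strategies:

Initialization: create a Cell array of length 2
CAS on a Cell: try to update the Cell the current thread hashes to
Rehash: after a failure, move to a different Cell slot
Growth: after repeated collisions, double the array (until its length reaches or exceeds the number of CPU cores)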
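The rehash step relies on a cheap xorshift generator: the thread's probe value is scrambled with three shift-and-XOR operations, and the slot is the probe masked by the array length. The sketch below reproduces that arithmetic in isolation. ProbeRehashSketch is a hypothetical class; in the JDK the new probe is also written back to the current thread, which is left out here.

```java
class ProbeRehashSketch {
    /** One xorshift step, the same shift constants (13, 17, 5) used for probe rehashing. */
    static int advanceProbe(int probe) {
        probe ^= probe << 13;
        probe ^= probe >>> 17;
        probe ^= probe << 5;
        return probe;
    }

    /** Maps a probe to a slot; tableLength must be a power of two. */
    static int slot(int probe, int tableLength) {
        return probe & (tableLength - 1);
    }

    public static void main(String[] args) {
        int probe = 1;
        for (int i = 0; i < 5; i++) {
            System.out.println("probe=" + probe + " -> slot " + slot(probe, 8));
            probe = advanceProbe(probe); // pretend the CAS on that slot failed
        }
    }
}
```

Because the slot is computed with a bit mask, the array length has to stay a power of two, which is why growth always doubles.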

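X. Frequently Asked Interview Questions 💯

Q1: Why is LongAdder faster than AtomicLong?

A: Three main reasons:

Spread contention: multiple Cells share the load
Adaptive growth: the number of Cells adjusts dynamically
No false sharing: @Contended pads each Cell

Q2: Is LongAdder's sum() an atomic operation?

A: No! sum() walks all the Cells and adds them up, and values can change while it runs. If you need an exact value at a point in time, use AtomicLong.

Q3: When does LongAdder use base, and when does it use cells?

Low contention: CAS directly on base
High contention: spread updates across the cells array

Q4: How large can LongAdder's cells array get?

A: Growth stops once the length reaches or exceeds the number of CPU cores (the length is always a power of two), because no matter how many threads there are, only that many can run in parallel.

Q5: How do you choose between AtomicLong and LongAdder?

High-concurrency accumulation → LongAdder
Need an exact, up-to-the-moment value → AtomicLong
Need CAS operations → AtomicLong
Low concurrency → AtomicLong (simpler)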

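XI. Summary: A Decision Tree 🌲

Need a counter or accumulator?
├─ Need an exact, up-to-the-moment value?
│  └─ Yes → AtomicLong ✅
├─ Need CAS operations?
│  └─ Yes → AtomicLong ✅
├─ Is concurrency high?
│  ├─ High (>10 threads) → LongAdder ⭐
│  └─ Low (<10 threads) → AtomicLong ✅
└─ Custom accumulation logic?
   └─ Yes → LongAccumulator ⭐

Best practices

Default to LongAdder in high-concurrency scenarios
Read sum() periodically rather than on every update (each call walks every Cell)
Prefer LongAdder for performance monitoring
For business counters, choose based on the level of concurrency
Verify the gain with load tests

Coming up next: How do you implement a lock that is fair to both readers and writers? The trade-off between fairness and performance! ⚖️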