Hand-Writing a High-Performance Producer-Consumer Queue: From Principles to Practice 🛠️


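Reinventing the wheel is not the goal; understanding how it works is. Let's start from zero and build a high-performance queue that can stand alongside the JDK's!

1. The Producer-Consumer Pattern: A Classic Scenario 🎬

A real-life analogy

Picture a bubble tea shop 🧋:

  • Producers: the staff making the drinks (there may be several)
  • Consumers: the staff packing takeaway orders (there may be several)
  • Queue: the waiting area on the counter

The problems:

  • Drinks are made too fast and the counter runs out of room (queue full)
  • Packing is too fast and there is nothing to pack (queue empty)
  • Several staff members work on the counter at once (concurrency)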

2. Starting with the Simplest Version 💡

Version 1: synchronized + wait/notify

public class SimpleQueue<T> {
    private final Object[] items;
    private int putIndex = 0;  // next slot to write (producer side)
    private int takeIndex = 0; // next slot to read (consumer side)
    private int count = 0;     // number of elements currently queued
    
    public SimpleQueue(int capacity) {
        items = new Object[capacity];
    }
    
    // Produce (add an element)
    public synchronized void put(T item) throws InterruptedException {
        // Queue is full: wait
        while (count == items.length) {
            wait();
        }
        
        items[putIndex] = item;
        putIndex = (putIndex + 1) % items.length; // wrap around (circular array)
        count++;
        
        notifyAll(); // wake waiting consumers
    }
    
    // Consume (remove an element)
    @SuppressWarnings("unchecked")
    public synchronized T take() throws InterruptedException {
        // Queue is empty: wait
        while (count == 0) {
            wait();
        }
        
        T item = (T) items[takeIndex];
        items[takeIndex] = null; // drop the reference so the element can be collected
        takeIndex = (takeIndex + 1) % items.length;
        count--;
        
        notifyAll(); // wake waiting producers
        return item;
    }
}

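Pros:

  • ✅ Simple and easy to understand
  • ✅ Thread-safe

Cons:

  • ❌ Every operation takes the same lock, so performance is poor
  • ❌ notifyAll() wakes every waiting thread, producers and consumers alike, which is wasteful
  • ❌ put and take are mutually exclusive (a producer and a consumer can never run at the same time)

Performance test: about 1 million ops/s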
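
The article quotes throughput figures without showing how they were measured. Below is a rough sketch of one way such a number could be obtained. It is not the author's harness: the class name ThroughputSketch, the queue size, and the item count are assumptions made for illustration, and the JDK's ArrayBlockingQueue stands in for the hand-written queue. Naive timing like this is only indicative; a proper benchmark would use a tool such as JMH.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ThroughputSketch {

    /**
     * Pushes the integers 0..total-1 through the queue with one producer and
     * one consumer thread, and returns the sum seen by the consumer.
     */
    static long runOnce(BlockingQueue<Integer> queue, int total) {
        long[] sum = new long[1]; // written only by the consumer thread

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    queue.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    sum[0] += queue.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // after join(), the consumer's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        int total = 1_000_000;
        runOnce(new ArrayBlockingQueue<>(1024), total); // warm-up pass for the JIT

        long start = System.nanoTime();
        long checksum = runOnce(new ArrayBlockingQueue<>(1024), total);
        double seconds = (System.nanoTime() - start) / 1e9;

        System.out.printf("checksum=%d, throughput=%.0f ops/s%n", checksum, total / seconds);
    }
}
```

The checksum is there to confirm that every element really went through the queue. To compare implementations, each one would need to be adapted to the same interface and run under the same thread counts.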


3. Version 2: ReentrantLock + Condition (the JDK's approach) ⚡

Core idea: separate the "not full" and "not empty" conditions so that wake-ups are targeted.

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BetterQueue<T> {
    private final Object[] items;
    private int putIndex = 0;
    private int takeIndex = 0;
    private int count = 0;
    
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition(); // "not empty" condition
    private final Condition notFull = lock.newCondition();  // "not full" condition
    
    public BetterQueue(int capacity) {
        items = new Object[capacity];
    }
    
    public void put(T item) throws InterruptedException {
        lock.lock();
        try {
            // Queue is full: wait on the notFull condition
            while (count == items.length) {
                notFull.await();
            }
            
            items[putIndex] = item;
            putIndex = (putIndex + 1) % items.length;
            count++;
            
            notEmpty.signal(); // wake exactly one waiting consumer
        } finally {
            lock.unlock();
        }
    }
    
    @SuppressWarnings("unchecked")
    public T take() throws InterruptedException {
        lock.lock();
        try {
            // Queue is empty: wait on the notEmpty condition
            while (count == 0) {
                notEmpty.await();
            }
            
            T item = (T) items[takeIndex];
            items[takeIndex] = null;
            takeIndex = (takeIndex + 1) % items.length;
            count--;
            
            notFull.signal(); // wake exactly one waiting producer
            return item;
        } finally {
            lock.unlock();
        }
    }
}

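Improvements:

  • ✅ Targeted wake-ups: a put only wakes a consumer and a take only wakes a producer
  • ✅ Two condition queues keep the logic clear

Performance test: about 3 million ops/s (a 3× improvement)

This is essentially how the JDK's ArrayBlockingQueue is implemented!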
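
Since this version mirrors ArrayBlockingQueue, it may help to see the JDK class in use. The sketch below is only an illustration (the class name ArrayBlockingQueueDemo and the 10 ms timeouts are arbitrary choices). It shows the blocking put/take pair next to the non-blocking and timed variants that the hand-written queue does not provide.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class ArrayBlockingQueueDemo {

    /** Exercises a capacity-2 queue and returns a comma-separated log of the results. */
    static String run() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        StringBuilder log = new StringBuilder();
        try {
            queue.put("a");
            queue.put("b"); // the queue is now full

            // offer() never blocks: it reports failure instead of waiting
            log.append(queue.offer("c"));
            // the timed offer() waits at most 10 ms for space, then gives up
            log.append(',').append(queue.offer("c", 10, TimeUnit.MILLISECONDS));

            // take() returns elements in FIFO order
            log.append(',').append(queue.take());
            log.append(',').append(queue.take());

            // the timed poll() returns null if nothing arrives within 10 ms
            log.append(',').append(queue.poll(10, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "false,false,a,b,null"
    }
}
```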


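4. Version 3: Separate Put and Take Locks (a Big Performance Jump) 🚀

Core idea: producers and consumers use different locks, which raises concurrency.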

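import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class FastQueue<T> {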
    private final Object[] items;
    private final int capacity;
    
    private final AtomicInteger count = new AtomicInteger(0);
    
    // Separate locks: one for producers, one for consumers
    private final ReentrantLock putLock = new ReentrantLock();
    private final Condition notFull = putLock.newCondition();
    
    private final ReentrantLock takeLock = new ReentrantLock();
    private final Condition notEmpty = takeLock.newCondition();
    
    private int putIndex = 0;
    private int takeIndex = 0;
    
    public FastQueue(int capacity) {
        this.capacity = capacity;
        this.items = new Object[capacity];
    }
    
    public void put(T item) throws InterruptedException {
        int c;
        putLock.lock();
        try {
            // Wait until the queue is not full
            while (count.get() == capacity) {
                notFull.await();
            }
            
            // Enqueue
            items[putIndex] = item;
            putIndex = (putIndex + 1) % capacity;
            
            c = count.getAndIncrement(); // atomic increment; c is the count before this put
            
            // If there is still room, wake another producer
            if (c + 1 < capacity) {
                notFull.signal();
            }
        } finally {
            putLock.unlock();
        }
        
        // If the queue was empty before this put, wake a consumer
        if (c == 0) {
            signalNotEmpty();
        }
    }
    
    @SuppressWarnings("unchecked")
    public T take() throws InterruptedException {
        int c;
        T item;
        takeLock.lock();
        try {
            // Wait until the queue is not empty
            while (count.get() == 0) {
                notEmpty.await();
            }
            
            // Dequeue
            item = (T) items[takeIndex];
            items[takeIndex] = null;
            takeIndex = (takeIndex + 1) % capacity;
            
            c = count.getAndDecrement(); // atomic decrement; c is the count before this take
            
            // If elements remain, wake another consumer
            if (c > 1) {
                notEmpty.signal();
            }
        } finally {
            takeLock.unlock();
        }
        
        // If the queue was full before this take, wake a producer
        if (c == capacity) {
            signalNotFull();
        }
        
        return item;
    }
    
    private void signalNotEmpty() {
        takeLock.lock();
        try {
            notEmpty.signal();
        } finally {
            takeLock.unlock();
        }
    }
    
    private void signalNotFull() {
        putLock.lock();
        try {
            notFull.signal();
        } finally {
            putLock.unlock();
        }
    }
}

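Core improvements:

  1. Separate put and take locks

    • Producers use putLock
    • Consumers use takeLock
    • A producer and a consumer can now work at the same time instead of blocking each other
  2. count is an AtomicInteger

    • Both locks need to access count
    • An atomic variable keeps those accesses thread-safe
  3. Cascading wake-ups

    If c + 1 < capacity, there is still room
    → wake another producer
    → that producer wakes the next one in turn
    → a chain reaction!
    

Performance test: about 8 million ops/s (roughly another 2.7×)

This is the core idea behind the JDK's LinkedBlockingQueue!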


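5. Version 4: Lock-Free Queues (the Ultimate CAS Optimization) ⚡⚡⚡

Core idea: replace locks with CAS to eliminate blocking.

Single producer, single consumer (the simplest case)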

public class LockFreeQueue<T> {
    private final Object[] items;
    private final int capacity;
    
    // volatile guarantees visibility between the two threads
    private volatile int writeIndex = 0; // written only by the producer
    
    // Padding placed between the two indexes to reduce false sharing.
    // Best-effort only: the JVM is free to reorder fields.
    private long p1, p2, p3, p4, p5, p6, p7;
    
    private volatile int readIndex = 0;  // written only by the consumer
    
    public LockFreeQueue(int capacity) {
        // One extra slot, so that empty and full can be told apart
        this.capacity = capacity + 1;
        this.items = new Object[this.capacity];
    }
    
    public boolean offer(T item) {
        int current = writeIndex;
        int next = (current + 1) % capacity;
        
        // Queue is full
        if (next == readIndex) {
            return false;
        }
        
        items[current] = item;
        writeIndex = next; // single producer: a plain volatile write is enough
        
        return true;
    }
    
    @SuppressWarnings("unchecked")
    public T poll() {
        int current = readIndex;
        
        // Queue is empty
        if (current == writeIndex) {
            return null;
        }
        
        T item = (T) items[current];
        items[current] = null;
        readIndex = (current + 1) % capacity; // single consumer: a plain volatile write is enough
        
        return item;
    }
}

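Key points:

  1. The capacity + 1 trick

    Empty queue: writeIndex == readIndex
    Full queue:  (writeIndex + 1) % capacity == readIndex
    
    Sacrificing one slot makes empty and full distinguishable!
    
  2. Single producer / single consumer

    • Only the producer writes writeIndex
    • Only the consumer writes readIndex
    • No contention, so no CAS is needed!
  3. volatile guarantees visibility

    • The volatile write to writeIndex happens-before the consumer's read of it, so the element stored just before that write is visible to the consumer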
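
The empty/full arithmetic is easier to trust after walking through concrete numbers. Here is a small sketch with 4 slots, which gives a usable capacity of 3. The helper names isEmpty, isFull and size are illustrative and do not appear in the queue itself.

```java
class RingIndexDemo {

    /** Empty: both indexes point at the same slot. */
    static boolean isEmpty(int readIndex, int writeIndex) {
        return readIndex == writeIndex;
    }

    /** Full: advancing writeIndex by one would collide with readIndex. */
    static boolean isFull(int readIndex, int writeIndex, int slots) {
        return (writeIndex + 1) % slots == readIndex;
    }

    /** Number of stored elements, correct even after writeIndex has wrapped around. */
    static int size(int readIndex, int writeIndex, int slots) {
        return (writeIndex - readIndex + slots) % slots;
    }

    public static void main(String[] args) {
        int slots = 4; // usable capacity 3, plus the one sacrificed slot
        int read = 0;
        int write = 0;
        System.out.println(isEmpty(read, write));        // true

        for (int i = 0; i < 3; i++) {
            write = (write + 1) % slots;                 // three offers: write ends at 3
        }
        System.out.println(isFull(read, write, slots));  // true, because (3 + 1) % 4 == 0
        System.out.println(size(read, write, slots));    // 3
    }
}
```

Without the spare slot, a completely full ring would also end up with writeIndex == readIndex, and the two states could only be told apart with an extra counter or flag.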

Performance test: about 30 million ops/s (another 3.75×) 🔥

Multiple producers, multiple consumers (the complex case)

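import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

public class MPMCQueue<T> {
    private final Object[] items;
    // One sequence number per slot. It tells producers and consumers whether the
    // slot is ready for them (the scheme used in Dmitry Vyukov's bounded MPMC queue).
    // Winning a CAS on an index alone is not enough: a consumer could otherwise
    // read a slot that a producer has claimed but not yet written.
    private final AtomicLongArray sequences;
    private final int capacity;
    
    // Monotonically increasing positions; the slot is position % capacity
    private final AtomicLong writeIndex = new AtomicLong(0);
    private final AtomicLong readIndex = new AtomicLong(0);
    
    public MPMCQueue(int capacity) {
        if (capacity < 2) {
            throw new IllegalArgumentException("capacity must be at least 2");
        }
        this.capacity = capacity;
        this.items = new Object[capacity];
        this.sequences = new AtomicLongArray(capacity);
        for (int i = 0; i < capacity; i++) {
            sequences.set(i, i); // slot i is first writable at position i
        }
    }
    
    public boolean offer(T item) {
        while (true) {
            long pos = writeIndex.get();
            int slot = (int) (pos % capacity);
            long diff = sequences.get(slot) - pos;
            
            if (diff == 0) {
                // Slot is free for this position: try to claim it with CAS
                if (writeIndex.compareAndSet(pos, pos + 1)) {
                    items[slot] = item;
                    sequences.set(slot, pos + 1); // publish: now readable
                    return true;
                }
            } else if (diff < 0) {
                return false; // slot not yet consumed: queue is full
            }
            // diff > 0: another producer took this position, retry
        }
    }
    
    @SuppressWarnings("unchecked")
    public T poll() {
        while (true) {
            long pos = readIndex.get();
            int slot = (int) (pos % capacity);
            long diff = sequences.get(slot) - (pos + 1);
            
            if (diff == 0) {
                // Slot has been published: try to claim it with CAS
                if (readIndex.compareAndSet(pos, pos + 1)) {
                    T item = (T) items[slot];
                    items[slot] = null;
                    sequences.set(slot, pos + capacity); // writable again on the next lap
                    return item;
                }
            } else if (diff < 0) {
                return null; // slot not yet written: queue is empty
            }
            // diff > 0: another consumer took this position, retry
        }
    }
}

Improvements:

  • Atomic counters and CAS handle contention between threads
  • Multiple producers/consumers compete for the indexes, and the per-slot sequence number ensures a slot is only read after it has been written, and only rewritten after it has been read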

Performance test: about 15 million ops/s (faster than the lock-based versions, but slower than single-producer)
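
The CAS retry loop is the heart of the multi-producer case, so here it is in isolation. This sketch is not a queue: it only shows several threads claiming ring positions through compareAndSet without ever claiming the same turn twice. The names CasClaimDemo, claim and run are made up for the illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class CasClaimDemo {

    /** Claims the current ring position and advances the shared index, retrying on contention. */
    static int claim(AtomicInteger index, int slots) {
        int current;
        int next;
        do {
            current = index.get();
            next = (current + 1) % slots;
        } while (!index.compareAndSet(current, next)); // lost the race: re-read and retry
        return current;
    }

    /** Runs several threads that each claim positions; returns how often each slot was claimed. */
    static int[] run(int threads, int claimsPerThread, int slots) {
        AtomicInteger index = new AtomicInteger(0);
        AtomicIntegerArray hits = new AtomicIntegerArray(slots);

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < claimsPerThread; i++) {
                    hits.incrementAndGet(claim(index, slots));
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }

        int[] result = new int[slots];
        for (int i = 0; i < slots; i++) {
            result[i] = hits.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // 4 threads x 1000 claims over 8 slots: every slot is claimed exactly 500 times
        for (int count : run(4, 1000, 8)) {
            System.out.println(count);
        }
    }
}
```

Each successful CAS moves the index forward by exactly one position, so the claims are spread evenly no matter how the threads interleave. Claiming a position is only half of a correct MPMC queue, though: handing the element over in the claimed slot must be synchronized as well.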


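6. Performance Comparison Summary 📊

| Version | Implementation | Throughput (ops/s) | Characteristics |
|---|---|---|---|
| Version 1 | synchronized | 1 million | Simple but slow |
| Version 2 | ReentrantLock + Condition | 3 million | The JDK's standard approach |
| Version 3 | Separate put/take locks | 8 million | High concurrency |
| Version 4 | Lock-free (single producer, single consumer) | 30 million | Fastest |
| Version 4 | Lock-free (multi-producer, multi-consumer) | 15 million | General-purpose |

Conclusions:

  • Simple scenarios: use synchronized
  • General-purpose scenarios: use the JDK's queues
  • Extreme performance: use a lock-free queue or Disruptor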

7. Practical Optimization Techniques 🔧

Technique 1: Batch operations

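// ❌ Inefficient: one lock acquisition per element
for (int i = 0; i < 1000; i++) {
    queue.put(data[i]);
}

// ✅ Efficient: enqueue a whole batch under a single lock acquisition
// (enqueue(item) is assumed to store the item and update putIndex and count,
// like the body of put() in version 2)
public void putBatch(T[] batch) throws InterruptedException {
    lock.lock();
    try {
        for (T item : batch) {
            // Never overwrite unconsumed elements: wait while the queue is full
            while (count == items.length) {
                notEmpty.signalAll(); // let consumers drain what is already queued
                notFull.await();      // releases the lock while waiting
            }
            enqueue(item);
        }
        notEmpty.signalAll();
    } finally {
        lock.unlock();
    }
}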

Performance gain: lock acquisitions drop from 1,000 to 1
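
Batching also works on the consumer side, and the JDK already provides it: BlockingQueue.drainTo moves up to a given number of elements into a collection in one call. A minimal sketch (the class name DrainToDemo and the batch size of 4 are arbitrary choices):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class DrainToDemo {

    /** Fills a queue with 0..9, drains a batch of up to 4, and reports what happened. */
    static String run() {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        for (int i = 0; i < 10; i++) {
            queue.offer(i);
        }

        List<Integer> batch = new ArrayList<>();
        int moved = queue.drainTo(batch, 4); // removes at most 4 elements in one call

        return moved + ":" + batch + ":" + queue.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "4:[0, 1, 2, 3]:6"
    }
}
```

drainTo does not block: on an empty queue it simply returns 0. A common pattern is therefore to take() one element first and then drain whatever else is already available.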

Technique 2: Pre-allocate objects

// ❌ A new object every time puts pressure on the GC
queue.put(new Event(data));

// ✅ Reuse objects from a pre-allocated pool
Event event = eventPool.get();
event.setData(data);
queue.put(event);
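
The eventPool above is not defined in the article. One possible shape for it is sketched below, purely as an assumption: SimplePool, PooledEvent and PoolDemo are hypothetical names, and the pool falls back to allocating a fresh object when it runs dry rather than blocking.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

class PoolDemo {
    public static void main(String[] args) {
        SimplePool<PooledEvent> pool = new SimplePool<>(1, PooledEvent::new);

        PooledEvent first = pool.get();
        first.setData("order-1");
        pool.release(first); // the consumer hands the object back when done

        PooledEvent second = pool.get();
        System.out.println(first == second); // true: the same instance is reused
    }
}

/** Hypothetical pool: hands out pre-allocated objects and allocates only when empty. */
class SimplePool<T> {
    private final ConcurrentLinkedQueue<T> free = new ConcurrentLinkedQueue<>();
    private final Supplier<T> factory;

    SimplePool(int size, Supplier<T> factory) {
        this.factory = factory;
        for (int i = 0; i < size; i++) {
            free.offer(factory.get()); // allocate everything up front
        }
    }

    T get() {
        T pooled = free.poll();
        return pooled != null ? pooled : factory.get(); // pool exhausted: allocate a new one
    }

    void release(T object) {
        free.offer(object);
    }
}

/** Hypothetical mutable event type standing in for the article's Event. */
class PooledEvent {
    private String data;

    void setData(String data) {
        this.data = data;
    }

    String getData() {
        return data;
    }
}
```

Pooling only pays off if consumers reliably call release() after processing and never keep a reference afterwards; otherwise a recycled object can be overwritten while it is still in use.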

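Technique 3: Eliminate false sharing

// ❌ writeIndex and readIndex are likely to sit in the same cache line
private volatile int writeIndex;
private volatile int readIndex;

// ✅ Padding between them, so that each index gets its own cache line
// (best-effort: the JVM may reorder fields, so verify the actual layout)
private volatile int writeIndex;
private long p1, p2, p3, p4, p5, p6, p7; // padding
private volatile int readIndex;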

Technique 4: Adaptive spinning

public T take() {
    int spins = 0;
    while (true) {
        T item = poll();
        if (item != null) {
            return item;
        }
        
        // Spin first, then yield, and finally park
        if (spins < 1000) {
            spins++;
        } else if (spins < 10000) {
            Thread.yield();
            spins++;
        } else {
            LockSupport.parkNanos(1); // sleep briefly
        }
    }
}

8. Full Code: A Producer-Consumer Queue with Monitoring 📊

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class MonitoredQueue<T> {
    
    private final Object[] items;
    private final int capacity;
    
    private final ReentrantLock putLock = new ReentrantLock();
    private final Condition notFull = putLock.newCondition();
    
    private final ReentrantLock takeLock = new ReentrantLock();
    private final Condition notEmpty = takeLock.newCondition();
    
    private final AtomicInteger count = new AtomicInteger(0);
    private int putIndex = 0;
    private int takeIndex = 0;
    
    // Monitoring metrics
    private final AtomicLong totalPut = new AtomicLong(0);
    private final AtomicLong totalTake = new AtomicLong(0);
    private final AtomicLong totalWaitTime = new AtomicLong(0);
    
    public MonitoredQueue(int capacity) {
        this.capacity = capacity;
        this.items = new Object[capacity];
    }
    
    public void put(T item) throws InterruptedException {
        long startTime = System.nanoTime();
        int c;
        
        putLock.lock();
        try {
            while (count.get() == capacity) {
                notFull.await();
            }
            
            items[putIndex] = item;
            putIndex = (putIndex + 1) % capacity;
            
            c = count.getAndIncrement();
            totalPut.incrementAndGet();
            
            if (c + 1 < capacity) {
                notFull.signal();
            }
        } finally {
            totalWaitTime.addAndGet(System.nanoTime() - startTime);
            putLock.unlock();
        }
        
        if (c == 0) {
            signalNotEmpty();
        }
    }
    
    @SuppressWarnings("unchecked")
    public T take() throws InterruptedException {
        long startTime = System.nanoTime();
        int c;
        T item;
        
        takeLock.lock();
        try {
            while (count.get() == 0) {
                notEmpty.await();
            }
            
            item = (T) items[takeIndex];
            items[takeIndex] = null;
            takeIndex = (takeIndex + 1) % capacity;
            
            c = count.getAndDecrement();
            totalTake.incrementAndGet();
            
            if (c > 1) {
                notEmpty.signal();
            }
        } finally {
            totalWaitTime.addAndGet(System.nanoTime() - startTime);
            takeLock.unlock();
        }
        
        if (c == capacity) {
            signalNotFull();
        }
        
        return item;
    }
    
    // Monitoring methods
    public int size() {
        return count.get();
    }
    
    public long getTotalPut() {
        return totalPut.get();
    }
    
    public long getTotalTake() {
        return totalTake.get();
    }
    
    public double getAvgWaitTimeMs() {
        long total = totalPut.get() + totalTake.get();
        if (total == 0) return 0;
        return totalWaitTime.get() / 1_000_000.0 / total;
    }
    
    public double getUtilization() {
        return (double) count.get() / capacity * 100;
    }
    
    public String getStats() {
        return String.format(
            "Queue Stats: size=%d, capacity=%d, utilization=%.2f%%, " +
            "totalPut=%d, totalTake=%d, avgWaitTime=%.3fms",
            size(), capacity, getUtilization(),
            getTotalPut(), getTotalTake(), getAvgWaitTimeMs()
        );
    }
    
    private void signalNotEmpty() {
        takeLock.lock();
        try {
            notEmpty.signal();
        } finally {
            takeLock.unlock();
        }
    }
    
    private void signalNotFull() {
        putLock.lock();
        try {
            notFull.signal();
        } finally {
            putLock.unlock();
        }
    }
}

Usage example:

MonitoredQueue<String> queue = new MonitoredQueue<>(1000);

// Producers
for (int i = 0; i < 10; i++) {
    new Thread(() -> {
        try {
            for (int j = 0; j < 10000; j++) {
                queue.put("data-" + j);
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }).start();
}

// Consumers
for (int i = 0; i < 5; i++) {
    new Thread(() -> {
        try {
            while (true) {
                String data = queue.take();
                // process the data
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }).start();
}

// Monitoring thread
new Thread(() -> {
    while (true) {
        try {
            Thread.sleep(1000);
            System.out.println(queue.getStats());
        } catch (InterruptedException e) {
            break;
        }
    }
}).start();

Sample output:

Queue Stats: size=856, capacity=1000, utilization=85.60%, 
totalPut=45230, totalTake=44374, avgWaitTime=0.125ms

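9. Frequently Asked Interview Questions 💯

Q1: Why use a circular array instead of a linked list?

A:

  • Array: contiguous memory, cache-friendly, fast
  • Linked list: scattered memory, a new node object for every element, more GC pressure

Q2: Why does the queue allocate capacity + 1 slots?

A: To tell empty from full:

  • Empty: writeIndex == readIndex
  • Full: (writeIndex + 1) % capacity == readIndex

Q3: How does the version with separate put and take locks stay thread-safe?

A:

  • count is an AtomicInteger, so both locks can access it safely
  • Only producers modify putIndex
  • Only consumers modify takeIndex
  • They never interfere with each other!

Q4: How much faster is a lock-free queue than a lock-based one?

A:

  • Single producer, single consumer: about 10× the single-lock version 2 in the figures above (30 million vs. 3 million ops/s)
  • Multiple producers, multiple consumers: about 2× the two-lock version 3 (15 million vs. 8 million ops/s)
  • The actual gap depends on how much contention there is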

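10. Summary: A Queue Selection Guide 🎯

Do you need blocking waits?
├─ No → use a lock-free queue (ConcurrentLinkedQueue)
└─ Yes
   ├─ Simple scenario → the synchronized version
   ├─ Single lock → ArrayBlockingQueue
   ├─ Two locks → LinkedBlockingQueue
   └─ Extreme performance → Disruptor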
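
The first branch of the tree comes down to what happens when the queue is empty. A small sketch of the difference, using only JDK classes (the class name QueueChoiceDemo and the "job" payload are illustrative):

```java
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueChoiceDemo {

    static String run() {
        // Non-blocking: poll() on an empty queue returns null immediately
        Queue<String> nonBlocking = new ConcurrentLinkedQueue<>();
        String polled = nonBlocking.poll();

        // Blocking: take() parks the consumer thread until an element arrives
        BlockingQueue<String> blocking = new LinkedBlockingQueue<>(16);
        String[] taken = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                taken[0] = blocking.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            blocking.put("job"); // wakes the consumer if it is already waiting
            consumer.join();     // after join(), taken[0] is visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return polled + "," + taken[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "null,job"
    }
}
```

With the non-blocking queue the caller has to decide what to do on null (retry, spin, or do other work); with the blocking queue that waiting is handled inside take().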

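Best practices:

  1. Prefer the JDK's queues (they are battle-tested)
  2. Only hand-optimize once the queue is a proven bottleneck
  3. Monitor throughput, latency, and queue length
  4. Pick a sensible capacity: too small causes frequent waiting, too large wastes memory

Coming next: how does the happens-before principle guarantee ordering across threads? A deep dive into the JMM (Java Memory Model)! 🧠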