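Disruptor's Astonishing Performance
I have read plenty of Chinese-language articles on concurrency without ever feeling I had fully digested them. Recently, while studying a high-performance message queue, I got a much deeper feel for how locks, CAS, and volatile differ in a concrete setting. First, meet the star of the show: Disruptor is a high-performance, high-throughput, low-latency implementation of a concurrent read/write queue.
Compared with the JDK's own ArrayBlockingQueue, its latency is markedly lower. So, Disruptor, what makes you so good?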
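The performance cost of locks
The Disruptor authors ran a test on a scenario that is extremely simple yet extremely common: the cost of incrementing a 64-bit counter 500 million times.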
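To make the experiment concrete, here is a minimal single-threaded sketch of that kind of measurement. It is not the Disruptor authors' benchmark code: the class name CounterCost, the method names, and the reduced iteration count are all assumptions made for illustration, and a naive timing loop like this is easily distorted by JIT warm-up, so treat any numbers it prints as rough at best.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: times the same increment loop under four strategies.
public class CounterCost {
    static long plain;                                  // no synchronization at all
    static volatile long vol;                           // volatile read and write per increment
    static final AtomicLong atomic = new AtomicLong();  // CAS-based increment
    static long locked;                                 // guarded by LOCK
    static final Object LOCK = new Object();

    static long runPlain(long n) {
        plain = 0;
        for (long i = 0; i < n; i++) plain++;
        return plain;
    }

    // Safe only because a single thread runs it: vol++ is not atomic.
    static long runVolatile(long n) {
        vol = 0;
        for (long i = 0; i < n; i++) vol++;
        return vol;
    }

    static long runCas(long n) {
        atomic.set(0);
        for (long i = 0; i < n; i++) atomic.incrementAndGet();
        return atomic.get();
    }

    static long runLocked(long n) {
        locked = 0;
        for (long i = 0; i < n; i++) {
            synchronized (LOCK) { locked++; }
        }
        return locked;
    }

    // Wall-clock time of one run, in nanoseconds.
    static long nanos(Runnable r) {
        long start = System.nanoTime();
        r.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        final long n = 10_000_000L; // assumption: far fewer than the 500 million in the text
        System.out.println("plain    ns: " + nanos(() -> runPlain(n)));
        System.out.println("volatile ns: " + nanos(() -> runVolatile(n)));
        System.out.println("CAS      ns: " + nanos(() -> runCas(n)));
        System.out.println("lock     ns: " + nanos(() -> runLocked(n)));
    }
}
```

All four loops run uncontended here. The gap between locks and CAS grows once several threads compete for the same counter, which is the case the following sections discuss.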
Locks, CAS, and volatile: how they work and how they compare
This CAS approach is significantly more efficient than locks because it does not require a context switch to the kernel for arbitration. However CAS operations are not free of cost. The processor must lock its instruction pipeline to ensure atomicity and employ a memory barrier to make the changes visible to other threads. CAS operations are available in Java by using the java.util.concurrent.Atomic* classes.
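- Locks: with synchronized, for example, the JVM runtime locks a specific object. Threads that lose the race are suspended, which causes context switches.
- CAS: supported directly by hardware. It compiles down to a special machine instruction, and the CPU itself guarantees that the instruction executes atomically, using a hardware-level lock plus a memory barrier.
- volatile: also a keyword. Like CAS, it relies on CPU memory barriers. It guarantees visibility but not atomicity, so a compound operation such as count++ on a volatile field is still unsafe when several threads write.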
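The three mechanisms in the list above can be put side by side in a short sketch. The class ThreeWays and everything in it are made up for illustration; only synchronized, volatile, and AtomicLong.compareAndSet come from the JDK. The explicit retry loop is written out to show what a CAS-based increment does, although AtomicLong.incrementAndGet would do the same job.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch contrasting a lock, a CAS retry loop, and a volatile flag.
public class ThreeWays {
    private long lockedCount;                              // guarded by this object's monitor
    private final AtomicLong casCount = new AtomicLong();  // updated with compare-and-set
    private volatile boolean done;                         // visibility only, no atomicity

    // Lock: a thread that loses the race may be suspended until the monitor is free.
    synchronized void incrementLocked() { lockedCount++; }
    synchronized long lockedCount() { return lockedCount; }

    // CAS: read the current value, try to swap in current + 1, retry if another thread won.
    void incrementCas() {
        long current;
        do {
            current = casCount.get();
        } while (!casCount.compareAndSet(current, current + 1));
    }
    long casCount() { return casCount.get(); }

    // volatile: a single write that later reads in other threads are guaranteed to see.
    void markDone() { done = true; }
    boolean isDone() { return done; }

    // Runs `threads` threads, each incrementing both counters `perThread` times.
    static ThreeWays runDemo(int threads, int perThread) {
        ThreeWays c = new ThreeWays();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.incrementLocked();
                    c.incrementCas();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        c.markDone();
        return c;
    }

    public static void main(String[] args) {
        ThreeWays c = runDemo(4, 100_000);
        System.out.println(c.lockedCount() + " " + c.casCount() + " " + c.isDone());
    }
}
```

Both counters end up exact under contention, but they get there differently: the lock serializes threads by blocking them, while the CAS loop never blocks and simply retries when it loses.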
If the critical section of the program is more complex than a simple increment of a counter it may take a complex state machine using multiple CAS operations to orchestrate the contention. Developing concurrent programs using locks is difficult; developing lock-free algorithms using CAS operations and memory barriers is many times more complex and it is very difficult to prove that they are correct.
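- Locks perform poorly, because contention suspends threads and forces context switches.
- CAS outperforms locks, but as the passage above says, it is not free: the processor has to lock its instruction pipeline and issue a memory barrier. Lock-free algorithms built on CAS are also much harder to develop than lock-based ones.
So is there an approach that is both fast and simple to write? Here comes the key point: the Disruptor authors' answer is so clever that I was floored, practically moved to tears.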
An ultra-high-performance concurrency solution
The ideal algorithm would be one with only a single thread owning all writes to a single resource with other threads reading the results. To read the results in a multi-processor environment requires memory barriers to make the changes visible to threads running on other processors.
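The ideal algorithm has a single thread doing all the writes while multiple other threads read, with the memory barriers behind volatile guaranteeing visibility (the writer's changes become visible to the other threads). That way the most basic tool, volatile, meets our needs at minimal cost. In other words:
The best way to solve a problem is to not need to solve it! The best way to deal with contention is to have no contention!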
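The single-writer idea can be sketched in a few lines. This is not Disruptor's actual API: the class SingleWriter, its cursor field, and the runDemo helper are hypothetical names chosen for illustration. One thread owns every write to a volatile long, and any number of readers observe it without a lock or a CAS.

```java
// Hypothetical sketch of the single-writer principle with a volatile field.
public class SingleWriter {
    // Written by exactly one thread; volatile makes each write visible to readers.
    private volatile long cursor = -1;

    // Only the writer thread may call this. With no competing writer, a plain
    // volatile store is enough: no lock and no CAS retry loop.
    void publish(long value) { cursor = value; }

    long read() { return cursor; }

    // Starts `readers` threads that spin until they observe `last`, then one writer
    // that publishes 0..last. Returns the final value each reader saw.
    static long[] runDemo(int readers, long last) {
        SingleWriter sw = new SingleWriter();
        long[] seen = new long[readers];
        Thread[] rs = new Thread[readers];
        for (int r = 0; r < readers; r++) {
            final int id = r;
            rs[r] = new Thread(() -> {
                long v;
                while ((v = sw.read()) < last) {
                    Thread.yield(); // busy-wait; a real system would pick a wait strategy
                }
                seen[id] = v;
            });
            rs[r].start();
        }
        Thread writer = new Thread(() -> {
            for (long i = 0; i <= last; i++) sw.publish(i);
        });
        writer.start();
        try {
            writer.join();
            for (Thread t : rs) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        long[] seen = runDemo(3, 1_000_000L);
        for (long v : seen) System.out.println("reader saw " + v);
    }
}
```

The design choice that matters is ownership: because no second thread ever writes cursor, there is nothing to arbitrate, and the only remaining cost is the memory barrier that volatile implies.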
That is the beauty of good design.