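I recently came across Disruptor, an impressive lock-free queue implementation. HBase uses it in several places, so this post takes a quick look at how HBase does so, as a way of learning by example.
This post does not cover Disruptor's advantages; it only uses code from a real open-source project as a demo of how to use the library.
Functionality
NamedQueueRecorder maintains a queue that records system information such as slow requests and balancer decisions. This kind of recording must not block the normal processing path, and individual records may even fail or be lost.
Source code analysis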
classDiagram
class NamedQueueRecorder {
Disruptor<RingBufferEnvelope> disruptor
LogEventHandler logEventHandler
static NamedQueueRecorder namedQueueRecorder
static boolean isInit
static final Object LOCK
private NamedQueueRecorder(Configuration conf)
public static NamedQueueRecorder getInstance(Configuration conf)
public void addRecord(NamedQueuePayload namedQueuePayload)
}
class EventHandler {
<<interface>>
}
class LogEventHandler {
Map<NamedQueuePayload.NamedQueueEvent, NamedQueueService> namedQueueServices
void onEvent(RingBufferEnvelope, long, boolean)
}
NamedQueueRecorder o-- Disruptor
NamedQueueRecorder o-- LogEventHandler
EventHandler <|-- LogEventHandler
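Member variables
The member variables are simple: disruptor and logEventHandler relate to the Disruptor library, and the rest support the singleton implementation.
Constructor (creating the Disruptor queue)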
/**
* Initialize disruptor with configurable ringbuffer size
*/
private NamedQueueRecorder(Configuration conf) {
// This is the 'writer' -- a single threaded executor. This single thread consumes what is
// put on the ringbuffer.
final String hostingThreadName = Thread.currentThread().getName();
int eventCount = conf.getInt("hbase.namedqueue.ringbuffer.size", 1024);
// disruptor initialization with BlockingWaitStrategy
this.disruptor = new Disruptor<>(RingBufferEnvelope::new, getEventCount(eventCount),
new ThreadFactoryBuilder().setNameFormat(hostingThreadName + ".slowlog.append-pool-%d")
.setDaemon(true).setUncaughtExceptionHandler(Threads.LOGGING_EXCEPTION_HANDLER).build(),
ProducerType.MULTI, new BlockingWaitStrategy());
this.disruptor.setDefaultExceptionHandler(new DisruptorExceptionHandler());
// initialize ringbuffer event handler
this.logEventHandler = new LogEventHandler(conf);
this.disruptor.handleEventsWith(new LogEventHandler[] { this.logEventHandler });
this.disruptor.start();
}
public static NamedQueueRecorder getInstance(Configuration conf) {
if (namedQueueRecorder != null) {
return namedQueueRecorder;
}
synchronized (LOCK) {
if (!isInit) {
namedQueueRecorder = new NamedQueueRecorder(conf);
isInit = true;
}
}
return namedQueueRecorder;
}
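The constructor is private, which points to a singleton. The static getInstance method uses all three static members; it is a typical singleton implementation and needs no further comment.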
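The lazy singleton pattern can be sketched in isolation as follows. This is a minimal stand-in, not HBase code: the `Recorder` class and its `String` configuration are hypothetical. The instance field is declared `volatile` here because double-checked locking relies on it for safe publication; the excerpt above does not show how HBase declares its own field.

```java
// Hypothetical stand-in for a lazily created, process-wide recorder.
final class Recorder {
    // volatile so that a fully constructed instance is visible to the unlocked fast path
    private static volatile Recorder instance;
    private static final Object LOCK = new Object();

    private final String name;

    private Recorder(String name) {
        this.name = name;
    }

    static Recorder getInstance(String name) {
        Recorder local = instance;
        if (local != null) {
            return local; // fast path: already initialized, no locking
        }
        synchronized (LOCK) {
            if (instance == null) { // re-check under the lock
                instance = new Recorder(name);
            }
            return instance;
        }
    }

    String name() {
        return name;
    }
}
```

As with `getInstance(conf)` in the excerpt, only the first call's argument is used; later calls return the existing instance and ignore theirs.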
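Now for the private constructor:
- Read eventCount from the configuration; it is later used to size the disruptor's RingBuffer
- Initialize the Disruptor:
  - The Event factory method
  - The ring buffer size, which must be a power of two (explained below)
  - ThreadFactory:
    - Names the threads. In a large project, always name every thread, so that a thread name leads straight to the relevant feature or code when troubleshooting.
    - Marks the threads as daemon threads
    - Sets an uncaught-exception handler: this handler only logs and does nothing else.
  - ProducerType: several threads may publish here, so MULTI is chosen
  - WaitStrategy: BlockingWaitStrategy, the most CPU-frugal choice. This is not a core feature and does not need high performance.
- Set the ExceptionHandler on disruptor. DisruptorExceptionHandler only logs.
- Set the EventHandler on disruptor, implemented by LogEventHandler.
- Start disruptor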
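The thread-naming advice can be tried without Guava. The sketch below is a JDK-only approximation of what the `ThreadFactoryBuilder` call in the constructor configures (name format, daemon flag, log-only uncaught-exception handler). The class name `NamedDaemonThreadFactory` is hypothetical, and it is not a drop-in replacement for the Guava builder.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// JDK-only sketch of a thread factory that names its threads and makes them daemons.
final class NamedDaemonThreadFactory implements ThreadFactory {
    private final String nameFormat; // expected to contain one %d for the counter
    private final AtomicInteger counter = new AtomicInteger();

    NamedDaemonThreadFactory(String nameFormat) {
        this.nameFormat = nameFormat;
    }

    @Override
    public Thread newThread(Runnable task) {
        String name = String.format(nameFormat, counter.getAndIncrement());
        Thread t = new Thread(task, name);
        t.setDaemon(true); // a diagnostics thread should not keep the JVM alive
        // Log-only handler: report the failure and do nothing else.
        t.setUncaughtExceptionHandler((thread, error) ->
            System.err.println("Thread " + thread.getName() + " died: " + error));
        return t;
    }

    public static void main(String[] args) {
        String host = Thread.currentThread().getName();
        ThreadFactory factory = new NamedDaemonThreadFactory(host + ".slowlog.append-pool-%d");
        Thread t = factory.newThread(() -> { });
        System.out.println(t.getName() + " daemon=" + t.isDaemon());
    }
}
```

With a name like this in a thread dump, the prefix identifies the thread that created the recorder and the suffix identifies the feature.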
// must be power of 2 for disruptor ringbuffer
private int getEventCount(int eventCount) {
Preconditions.checkArgument(eventCount >= 0, "hbase.namedqueue.ringbuffer.size must be > 0");
int floor = Integer.highestOneBit(eventCount);
if (floor == eventCount) {
return floor;
}
// max capacity is 1 << 30
if (floor >= 1 << 29) {
return 1 << 30;
}
return floor << 1;
}
The method returns the smallest power of two that is not less than eventCount. For example:
- if eventCount == 4, it returns 4
- if eventCount == 5, it returns 8
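Steps:
- Check whether the value is less than 0 and throw if it is. Note that the check is >= 0 although the message says "> 0", so 0 passes the check and is returned unchanged.
- Use Integer.highestOneBit(int) to compute floor, the value that keeps only the highest set bit of the binary representation.
- If floor equals eventCount, return it. Otherwise eventCount is not a power of two.
- Cap the result at 2^30.
- Otherwise shift floor left by one bit.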
flowchart TD
S[Start] --> A{eventCount >= 0}
A --> |Yes| B[floor = eventCount with only its highest set bit kept]
B --> C{floor == eventCount}
C --> |Yes| D[return floor]
D --> END[End]
C --> |No| E{floor >= 2^29}
E --> |Yes| F[return 2^30]
F --> END[End]
E --> |No| G[return floor << 1]
G --> END[End]
A --> |No| H[throw Exception]
H --> END[End]
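To check the rounding logic by hand, here is a standalone sketch of the same steps. The class and method names are hypothetical, and unlike the excerpt it rejects 0 as well as negative values; that stricter check is an assumption made for this sketch, not HBase's behavior.

```java
// Sketch of rounding a configured size up to a power of two.
final class RingSizes {
    static final int MAX_SIZE = 1 << 30; // largest power of two that fits in a positive int

    static int roundUpToPowerOfTwo(int requested) {
        if (requested < 1) {
            // Assumption for this sketch: reject 0 too, unlike the excerpt's >= 0 check.
            throw new IllegalArgumentException("size must be > 0, got " + requested);
        }
        int floor = Integer.highestOneBit(requested); // e.g. 5 (0b101) -> 4 (0b100)
        if (floor == requested) {
            return floor; // already a power of two
        }
        if (floor >= (1 << 29)) {
            return MAX_SIZE; // floor << 1 would reach or overflow the cap
        }
        return floor << 1; // next power of two above requested
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 4, 5, 1000, 1024, (1 << 29) + 1}) {
            System.out.println(n + " -> " + roundUpToPowerOfTwo(n));
        }
    }
}
```

The cap matters because shifting 1 << 30 left by one bit overflows a signed int and produces a negative number.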
Producing
/**
* Add various NamedQueue records to ringbuffer. Based on the type of the event (e.g slowLog),
* consumer of disruptor ringbuffer will have specific logic. This method is producer of disruptor
* ringbuffer which is initialized in NamedQueueRecorder constructor.
* @param namedQueuePayload namedQueue payload sent by client of ring buffer service
*/
public void addRecord(NamedQueuePayload namedQueuePayload) {
RingBuffer<RingBufferEnvelope> ringBuffer = this.disruptor.getRingBuffer();
long seqId = ringBuffer.next();
try {
ringBuffer.get(seqId).load(namedQueuePayload);
} finally {
ringBuffer.publish(seqId);
}
}
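This part is straightforward:
- Claim a seqId
- Fill the Event at that seqId with data
- Publish the Event, after which the consumer can consume it
The publish step sits in a finally block, which shows the author's care: if load threw and publish were skipped, the consumer would stall at that seqId and could not consume anything after it, even when later data had already been published.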
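Why a skipped publish stalls the consumer can be seen in a toy model. The sketch below is single-threaded and deliberately simplified (no capacity check, no memory barriers); `ToySequencer` is a hypothetical class and is not how Disruptor is implemented. It only models the rule that a consumer advances through sequences in order and stops at the first unpublished one.

```java
// Toy model of claim/publish sequencing; illustration only.
final class ToySequencer {
    private final boolean[] published;
    private long nextToClaim = 0;   // producer cursor
    private long nextToConsume = 0; // consumer cursor

    ToySequencer(int size) {
        published = new boolean[size];
    }

    long claim() {
        return nextToClaim++; // toy only: no check that the ring has free slots
    }

    void publish(long seq) {
        published[(int) (seq % published.length)] = true;
    }

    // Consumes in sequence order and stops at the first unpublished slot.
    int drain() {
        int consumed = 0;
        while (nextToConsume < nextToClaim
                && published[(int) (nextToConsume % published.length)]) {
            published[(int) (nextToConsume % published.length)] = false;
            nextToConsume++;
            consumed++;
        }
        return consumed;
    }

    public static void main(String[] args) {
        ToySequencer ring = new ToySequencer(8);
        long a = ring.claim();
        long b = ring.claim();
        long c = ring.claim();
        ring.publish(a);
        ring.publish(c);                  // b was claimed but never published
        System.out.println(ring.drain()); // 1: the consumer is stuck behind b
        ring.publish(b);
        System.out.println(ring.drain()); // 2: b and c are now reachable
    }
}
```

Publishing in finally means a slot whose load failed still gets published, so the consumer can move past it.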
Consuming
The consumer logic lives in LogEventHandler. Its member Map<NamedQueuePayload.NamedQueueEvent, NamedQueueService> namedQueueServices maps each event type to the service that handles it.
/**
* Called when a publisher has published an event to the {@link RingBuffer}. This is generic
* consumer of disruptor ringbuffer and for each new namedQueue that we add, we should also
* provide specific consumer logic here.
* @param event published to the {@link RingBuffer}
* @param sequence of the event being processed
* @param endOfBatch flag to indicate if this is the last event in a batch from the
* {@link RingBuffer}
*/
@Override
public void onEvent(RingBufferEnvelope event, long sequence, boolean endOfBatch) {
final NamedQueuePayload namedQueuePayload = event.getPayload();
// consume ringbuffer payload based on event type
namedQueueServices.get(namedQueuePayload.getNamedQueueEvent())
.consumeEventFromDisruptor(namedQueuePayload);
}
So the onEvent override in LogEventHandler is also simple: it reads the event type from the event, looks up the matching NamedQueueService, and hands the payload to it.
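The lookup-then-delegate shape of onEvent can be sketched without any HBase or Disruptor types. Everything in this block is hypothetical (`DispatchSketch`, `EventType`, `Payload`, `QueueService`); the two event types only echo the examples mentioned earlier in the post. One deliberate difference: this sketch returns false when no service is registered for a type, whereas the excerpt calls the looked-up service directly.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical miniature of "look up a service by event type, then delegate".
final class DispatchSketch {
    enum EventType { SLOW_LOG, BALANCE_DECISION }

    static final class Payload {
        final EventType type;
        final String detail;

        Payload(EventType type, String detail) {
            this.type = type;
            this.detail = detail;
        }
    }

    interface QueueService {
        void consume(Payload payload);
    }

    // Collects payload details in memory so the routing is observable.
    static final class CollectingService implements QueueService {
        final List<String> seen = new ArrayList<>();

        @Override
        public void consume(Payload payload) {
            seen.add(payload.detail);
        }
    }

    private final Map<EventType, QueueService> services = new EnumMap<>(EventType.class);

    void register(EventType type, QueueService service) {
        services.put(type, service);
    }

    // Returns false instead of throwing when no service is registered for the type.
    boolean onEvent(Payload payload) {
        QueueService service = services.get(payload.type);
        if (service == null) {
            return false;
        }
        service.consume(payload);
        return true;
    }
}
```

Adding a new kind of record then comes down to one new enum constant and one registered service, with the handler itself unchanged.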
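Summary
When reading the HBase NamedQueueRecorder source you do not need to care much about what the feature itself does, which makes it a good demo for learning the Disruptor library. Simple as it is, it has points worth learning from and watching for:
- All of Disruptor's key parameters are set explicitly; small as the example is, it is complete.
- Ring buffer size: Disruptor throws an error if the size is not a power of two, and operators of a large project cannot be expected to know the implementation details of a single feature, including this constraint. The author accommodates that by rounding a non-power-of-two value up to a power of two.
- The finally block guarantees that every claimed seqId is published, even if its data is invalid. Without that guarantee the consumer would stall at the seqId whose data is missing and could not continue.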