How HBase Uses the Lock-Free Queue Disruptor: The NamedQueueRecorder Implementation

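I recently came across Disruptor, a very capable lock-free queue implementation. HBase uses it in several places, so this post takes a quick look at how HBase uses it, as a way of learning by example.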

This post does not cover Disruptor's advantages. It only uses code from a real open-source project as a demo of how to use the library.

Functionality

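NamedQueueRecorder maintains a queue for recording system information such as slow requests and balancer decisions. Recording this kind of information must not block the normal request path, and records may even fail or be lost.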

Source Analysis

classDiagram

class NamedQueueRecorder {
  Disruptor<RingBufferEnvelope> disruptor
  LogEventHandler logEventHandler
  
  static NamedQueueRecorder namedQueueRecorder
  static boolean isInit
  static final Object LOCK

  private NamedQueueRecorder(Configuration conf)
  public static NamedQueueRecorder getInstance(Configuration conf)
  public void addRecord(NamedQueuePayload namedQueuePayload)
}

class EventHandler {
  <<interface>>
}

class LogEventHandler {
  Map<NamedQueuePayload.NamedQueueEvent, NamedQueueService> namedQueueServices
  void onEvent(RingBufferEnvelope, long, boolean)
}

NamedQueueRecorder o-- Disruptor
NamedQueueRecorder o-- LogEventHandler
EventHandler <|-- LogEventHandler

Member Variables

The member variables are simple: disruptor and logEventHandler relate to the Disruptor library, and the rest support the singleton pattern.

Constructor (Creating the Disruptor Queue)

/**
 * Initialize disruptor with configurable ringbuffer size
 */
private NamedQueueRecorder(Configuration conf) {

  // This is the 'writer' -- a single threaded executor. This single thread consumes what is
  // put on the ringbuffer.
  final String hostingThreadName = Thread.currentThread().getName();

  int eventCount = conf.getInt("hbase.namedqueue.ringbuffer.size", 1024);

  // disruptor initialization with BlockingWaitStrategy
  this.disruptor = new Disruptor<>(RingBufferEnvelope::new, getEventCount(eventCount),
    new ThreadFactoryBuilder().setNameFormat(hostingThreadName + ".slowlog.append-pool-%d")
    .setDaemon(true).setUncaughtExceptionHandler(Threads.LOGGING_EXCEPTION_HANDLER).build(),  
    ProducerType.MULTI, new BlockingWaitStrategy());  
  this.disruptor.setDefaultExceptionHandler(new DisruptorExceptionHandler());  
  
  // initialize ringbuffer event handler  
  this.logEventHandler = new LogEventHandler(conf);  
  this.disruptor.handleEventsWith(new LogEventHandler[] { this.logEventHandler });  
  this.disruptor.start();  
}  
  
public static NamedQueueRecorder getInstance(Configuration conf) {  
  if (namedQueueRecorder != null) {  
    return namedQueueRecorder;  
  }  
  synchronized (LOCK) {  
    if (!isInit) {  
      namedQueueRecorder = new NamedQueueRecorder(conf);  
      isInit = true;  
    }  
  }  
  return namedQueueRecorder;  
}

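The constructor is private, which suggests a singleton. The static getInstance method uses all three static member variables and is a typical singleton implementation, so there is not much more to say about it.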
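For readers who want to try the lazy-initialization pattern in isolation, here is a minimal stand-alone sketch. RecorderSingleton is a hypothetical stand-in for NamedQueueRecorder, and the volatile field and the CONSTRUCTED counter are choices made for this sketch, not something taken from the HBase source.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for NamedQueueRecorder; only the singleton plumbing is shown.
final class RecorderSingleton {
  // volatile is this sketch's choice so the unlocked fast-path read is safely published
  private static volatile RecorderSingleton instance;
  private static final Object LOCK = new Object();
  // Counts constructor calls so the demo can verify that only one instance is built
  static final AtomicInteger CONSTRUCTED = new AtomicInteger();

  private RecorderSingleton() {
    CONSTRUCTED.incrementAndGet();
  }

  static RecorderSingleton getInstance() {
    RecorderSingleton local = instance;
    if (local != null) {
      return local; // fast path: already initialized, no lock taken
    }
    synchronized (LOCK) {
      if (instance == null) { // re-check under the lock
        instance = new RecorderSingleton();
      }
      return instance;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread[] threads = new Thread[8];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(RecorderSingleton::getInstance);
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println("constructed " + CONSTRUCTED.get() + " time(s)");
  }
}
```

Even with eight threads racing to call getInstance, the constructor should run only once, because the second check happens while holding the lock.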

The private constructor deserves a closer look:

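  1. Read eventCount from the configuration; it is later used to size the Disruptor's RingBuffer.
  2. Initialize the Disruptor with:
    • The Event factory method
    • The ring buffer size, which must be a power of two (explained below)
    • A ThreadFactory that:
      • Names the thread. In a large project, always name every thread, so that a thread name leads directly to the related feature or code when debugging.
      • Marks the thread as a daemon thread
      • Sets an UncaughtExceptionHandler; this handler only logs and does nothing else.
    • ProducerType: multiple threads may publish, so MULTI is chosen.
    • WaitStrategy: BlockingWaitStrategy, one of the most CPU-conserving strategies. This feature is not on the critical path and does not need high performance.
  3. Set the Disruptor's ExceptionHandler. DisruptorExceptionHandler only logs.
  4. Set the Disruptor's EventHandler, implemented by LogEventHandler.
  5. Start the Disruptor.

// must be power of 2 for disruptor ringbuffer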
private int getEventCount(int eventCount) {
  Preconditions.checkArgument(eventCount >= 0, "hbase.namedqueue.ringbuffer.size must be > 0");
  int floor = Integer.highestOneBit(eventCount);
  if (floor == eventCount) {
    return floor;
  }
  // max capacity is 1 << 30
  if (floor >= 1 << 29) {
    return 1 << 30;
  }
  return floor << 1;
}

The method returns the smallest power of two that is not less than eventCount. For example:

  • If eventCount == 4, it returns 4
  • If eventCount == 5, it returns 8

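Steps:

  1. Check whether eventCount is negative; if so, throw an exception. (The check is eventCount >= 0 although the message says "must be > 0", so 0 passes the check and the method returns 0.)
  2. Integer.highestOneBit(int) keeps only the highest set bit of the binary representation; call the result floor.
  3. If floor equals eventCount, eventCount is already a power of two and is returned as-is. Otherwise eventCount is not a power of two.
  4. Cap the result at 2^30.
  5. Otherwise shift floor left by one bit.

flowchart TD
S[Start] --> A{eventCount >= 0}
A --> |Yes| B[floor = highest one bit of eventCount]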
B --> C{floor == eventCount}
C --> |Yes| D[return floor]
D --> END[End]
C --> |No| E{floor >= 2^29}
E --> |Yes| F[return 2^30]
F --> END[End]
E --> |No| G[return floor << 1]
G --> END[End]
A --> |No| H[throw Exception]
H --> END[End]
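The rounding logic is easy to check in isolation. The sketch below mirrors the steps above without the Guava dependency, and also illustrates the usual reason a ring buffer wants a power-of-two size: a sequence number can be mapped to a slot with a bit mask instead of a modulo. PowerOfTwoDemo, roundUpToPowerOfTwo and slotFor are names invented for this sketch.

```java
// Stand-alone sketch; roundUpToPowerOfTwo mirrors the getEventCount logic without Guava.
final class PowerOfTwoDemo {

  static int roundUpToPowerOfTwo(int n) {
    if (n < 0) {
      throw new IllegalArgumentException("size must not be negative");
    }
    int floor = Integer.highestOneBit(n); // keeps only the highest set bit
    if (floor == n) {
      return floor; // already a power of two (or 0, as in the original check)
    }
    if (floor >= 1 << 29) {
      return 1 << 30; // cap, so the left shift below cannot overflow
    }
    return floor << 1;
  }

  // With a power-of-two size, (sequence & (size - 1)) equals (sequence % size)
  // for non-negative sequences, and avoids a division.
  static int slotFor(long sequence, int size) {
    return (int) (sequence & (size - 1));
  }

  public static void main(String[] args) {
    System.out.println(roundUpToPowerOfTwo(4));    // 4
    System.out.println(roundUpToPowerOfTwo(5));    // 8
    System.out.println(roundUpToPowerOfTwo(1000)); // 1024

    int size = roundUpToPowerOfTwo(5);
    for (long seq = 6; seq < 11; seq++) {
      // Sequences keep growing; the slot index wraps around the ring.
      System.out.println("sequence " + seq + " -> slot " + slotFor(seq, size));
    }
  }
}
```

With size 8, sequences 6 through 10 land in slots 6, 7, 0, 1, 2, which is the wrap-around behaviour a ring buffer relies on.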

Producing

/**
 * Add various NamedQueue records to ringbuffer. Based on the type of the event (e.g slowLog),
 * consumer of disruptor ringbuffer will have specific logic. This method is producer of disruptor
 * ringbuffer which is initialized in NamedQueueRecorder constructor.
 * @param namedQueuePayload namedQueue payload sent by client of ring buffer service  
 */
 public void addRecord(NamedQueuePayload namedQueuePayload) {  
  RingBuffer<RingBufferEnvelope> ringBuffer = this.disruptor.getRingBuffer();  
  long seqId = ringBuffer.next();  
  try {  
    ringBuffer.get(seqId).load(namedQueuePayload);  
  } finally {  
    ringBuffer.publish(seqId);  
  }  
}

This part is straightforward:

  1. Claim a seqId
  2. Fill the Event at that seqId with data
  3. Publish the Event, after which the consumer can consume it

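The publish step sits in a finally block, which shows the author's care: if load threw an exception and publish were skipped, the consumer would be stuck at that seqId and could not consume any later data, even if it had been published.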
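The stall described above can be reproduced with a toy model. The sketch below is a single-threaded simplification written for this post; it does not use Disruptor's real classes, and PublishGapDemo with all of its methods is hypothetical. It only models the rule that the consumer advances over a contiguous run of published sequences.

```java
import java.util.ArrayList;
import java.util.List;

// Toy single-threaded model of claim/load/publish; NOT Disruptor's real data structures.
final class PublishGapDemo {
  private final String[] slots;
  private final boolean[] published;
  private int nextToClaim = 0;
  private int nextToConsume = 0;

  PublishGapDemo(int size) {
    slots = new String[size];
    published = new boolean[size];
  }

  int next() {
    return nextToClaim++; // claim the next sequence
  }

  void load(int seq, String value) {
    if (value == null) {
      throw new IllegalArgumentException("null payload"); // simulated load failure
    }
    slots[seq] = value;
  }

  void publish(int seq) {
    published[seq] = true;
  }

  // The consumer may only advance over a contiguous run of published sequences.
  List<String> drain() {
    List<String> out = new ArrayList<>();
    while (nextToConsume < nextToClaim && published[nextToConsume]) {
      out.add(slots[nextToConsume++]);
    }
    return out;
  }

  static void addWithoutFinally(PublishGapDemo q, String value) {
    int seq = q.next();
    try {
      q.load(seq, value);
      q.publish(seq); // skipped when load throws, leaving a gap at seq
    } catch (IllegalArgumentException e) {
      // swallowed so the demo can keep producing
    }
  }

  static void addWithFinally(PublishGapDemo q, String value) {
    int seq = q.next();
    try {
      q.load(seq, value);
    } catch (IllegalArgumentException e) {
      // swallowed so the demo can keep producing
    } finally {
      q.publish(seq); // always published, even if the slot holds no valid data
    }
  }

  public static void main(String[] args) {
    String[] inputs = {"a", null, "c"};
    PublishGapDemo stalled = new PublishGapDemo(8);
    PublishGapDemo healthy = new PublishGapDemo(8);
    for (String value : inputs) {
      addWithoutFinally(stalled, value);
      addWithFinally(healthy, value);
    }
    System.out.println(stalled.drain()); // [a]
    System.out.println(healthy.drain()); // [a, null, c]
  }
}
```

In the first queue the consumer never gets past the unpublished sequence, so "c" is never seen. In the second queue the consumer receives an empty slot instead, which is why the handler has to tolerate records without valid data.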

Consuming

The consumer logic lives in LogEventHandler. Its member variable Map<NamedQueuePayload.NamedQueueEvent, NamedQueueService> namedQueueServices maps each event type to the service that handles it.

/**
 * Called when a publisher has published an event to the {@link RingBuffer}. This is generic
 * consumer of disruptor ringbuffer and for each new namedQueue that we add, we should also
 * provide specific consumer logic here.
 * @param event      published to the {@link RingBuffer}
 * @param sequence   of the event being processed
 * @param endOfBatch flag to indicate if this is the last event in a batch from the
 *                   {@link RingBuffer}
 */
@Override  
public void onEvent(RingBufferEnvelope event, long sequence, boolean endOfBatch) {  
  final NamedQueuePayload namedQueuePayload = event.getPayload();  
  // consume ringbuffer payload based on event type  
  namedQueueServices.get(namedQueuePayload.getNamedQueueEvent())  
    .consumeEventFromDisruptor(namedQueuePayload);  
}

So LogEventHandler's overridden onEvent method is also simple: it reads the event type from the event, looks up the matching NamedQueueService, and hands the payload to it.
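The lookup-and-delegate shape of onEvent can be sketched without any HBase or Disruptor types. Everything in the block below (DispatchDemo, EventType, QueueService, Payload, RecordingService) is a hypothetical stand-in, and the null guard is an addition of this sketch rather than something shown in the HBase code above.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the HBase types; only the dispatch shape is illustrated.
final class DispatchDemo {
  enum EventType { SLOW_LOG, BALANCE_DECISION } // stand-in for NamedQueueEvent

  interface QueueService { // stand-in for NamedQueueService
    void consume(String body);
  }

  static final class Payload { // stand-in for NamedQueuePayload
    final EventType type;
    final String body;

    Payload(EventType type, String body) {
      this.type = type;
      this.body = body;
    }
  }

  // Test double that simply remembers what it was asked to consume
  static final class RecordingService implements QueueService {
    final List<String> seen = new ArrayList<>();

    @Override
    public void consume(String body) {
      seen.add(body);
    }
  }

  private final Map<EventType, QueueService> services = new EnumMap<>(EventType.class);

  void register(EventType type, QueueService service) {
    services.put(type, service);
  }

  // Same shape as onEvent: read the type, look up the service, delegate.
  void onEvent(Payload payload) {
    QueueService service = services.get(payload.type);
    if (service != null) { // guard added in this sketch for unregistered types
      service.consume(payload.body);
    }
  }

  public static void main(String[] args) {
    DispatchDemo handler = new DispatchDemo();
    RecordingService slowLog = new RecordingService();
    RecordingService balancer = new RecordingService();
    handler.register(EventType.SLOW_LOG, slowLog);
    handler.register(EventType.BALANCE_DECISION, balancer);

    handler.onEvent(new Payload(EventType.SLOW_LOG, "scan took 12s"));
    handler.onEvent(new Payload(EventType.BALANCE_DECISION, "moved 3 regions"));

    System.out.println(slowLog.seen);
    System.out.println(balancer.seen);
  }
}
```

Adding a new kind of record then only requires a new enum constant and a registered service; the handler itself stays unchanged.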

Summary

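When reading the HBase NamedQueueRecorder source, there is little need to care about what the feature itself does, which makes it a good demo for learning the Disruptor library. Simple as it is, it still has points worth learning from and paying attention to: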

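  • All of Disruptor's key parameters are set explicitly; the example is small but complete.
  • RingBuffer sizing: Disruptor throws an error if the size is not a power of two, and the operators of a large project like this cannot be expected to know the implementation details of one feature, so they would not know the value must be a power of two. The author allows for this by converting any other value to a power of two.
  • The finally block guarantees that every seqId is published, even if its data is invalid. Without this guarantee, the consumer would stall at the seqId with missing data and could not continue.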