brpc Internals Series (1): DoublyBufferedData

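DoublyBufferedData (DBD for short) is a data structure in brpc that combines double buffering with a thread-local cache. It targets read-heavy, write-light workloads such as load balancing: every request has to pick a suitable downstream node, while the list of downstream servers changes only rarely. This article analyzes the DBD code in brpc release-1.2.0, because that version is comparatively concise. Later versions keep the same overall design and only add improvements for corner cases, such as the upper limit on the number of pthread keys and the bthread suspended problem.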

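I extracted the DBD code from brpc into a standalone project, replaced its brpc dependencies with C++ standard library facilities, wrote complete and detailed comments, and added an example that simulates RoundRobin load balancing to make the data structure easier to understand. See GitHub for details.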

Data structure

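DBD mainly stores two kinds of data:

Double-buffered data of type <T>, usually a C++ container such as a vector. There are two copies of it, called the foreground (fg) and the background (bg). Reads go to the foreground copy; modifications are applied only to the background copy.
A thread-local cache of type <TLS>. It holds data that belongs exclusively to the current thread, which reduces data contention between threads.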

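The main structure of DBD is as follows:

The atomic variable _index points to the foreground copy of the double buffer, which may be either _data[0] or _data[1].
Each thread binds itself to a Wrapper by calling pthread_setspecific with _wrapper_key, and later calls pthread_getspecific to obtain the Wrapper* bound to the current thread and, through it, the TLS data.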

DBD diagram:

[Image: dbd1]

Note: every Wrapper* is visible to all threads; in practice, each thread simply uses _wrapper_key to obtain the Wrapper* bound to itself.
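The per-thread binding described above can be sketched with plain POSIX thread-specific data. This is a minimal illustration, not brpc's code: the global key, the simplified `Wrapper` struct, and the helper names `GetOrCreateWrapper` and `ThreadsGetDistinctWrappers` are assumptions made for the example, and error handling is mostly omitted.

```cpp
#include <pthread.h>
#include <thread>

// Hypothetical stand-in for DBD's Wrapper: one instance per thread.
struct Wrapper {
    int user_tls = 0;  // per-thread user data
};

static pthread_key_t g_wrapper_key;
static pthread_once_t g_key_once = PTHREAD_ONCE_INIT;

// Runs at thread exit with that thread's non-null value for the key.
static void DeleteWrapper(void* arg) {
    delete static_cast<Wrapper*>(arg);
}

static void CreateKey() {
    // Return value ignored for brevity in this sketch.
    pthread_key_create(&g_wrapper_key, DeleteWrapper);
}

// Returns the Wrapper bound to the calling thread, creating it on first use.
Wrapper* GetOrCreateWrapper() {
    pthread_once(&g_key_once, CreateKey);
    Wrapper* w = static_cast<Wrapper*>(pthread_getspecific(g_wrapper_key));
    if (w == nullptr) {
        w = new Wrapper;
        if (pthread_setspecific(g_wrapper_key, w) != 0) {
            delete w;
            return nullptr;
        }
    }
    return w;
}

// Demo: the calling thread and a second thread each get their own Wrapper,
// and repeated lookups on one thread return the same pointer.
bool ThreadsGetDistinctWrappers() {
    Wrapper* mine = GetOrCreateWrapper();
    bool ok_in_other = false;
    std::thread t([&] {
        Wrapper* theirs = GetOrCreateWrapper();
        ok_in_other = theirs != nullptr && theirs != mine &&
                      theirs == GetOrCreateWrapper();
    });
    t.join();
    return mine != nullptr && mine == GetOrCreateWrapper() && ok_in_other;
}
```

In DBD the key is a member of each instance rather than a global, and newly created Wrappers are additionally recorded in a shared list so that the modifying thread can reach them.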

Read and Modify operations

Read(ScopedPtr* ptr)

ScopedPtr is used to access the core data:

    class ScopedPtr {
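    // Acts like a proxy: callers access the buffered data and the TLS data through it.
    // _data points to the stored (double-buffered) data
    // _w points to the Wrapper that holds the TLS data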
    friend class DoublyBufferedData;
    public:
        ScopedPtr() : _data(NULL), _w(NULL) {}
        ~ScopedPtr() {
            if (_w) {
                _w->EndRead();
            }
        }
        const T* get() const { return _data; }
        const T& operator*() const { return *_data; }
        const T* operator->() const { return _data; }
        TLS& tls() { return _w->user_tls(); }   // Returns the TLS data
        
    private:
        const T* _data; // Data shared by all threads (the double-buffered data)
        Wrapper* _w;    // Wrapper of the current thread (holds the thread-local data)
    };

The read operation loads the foreground data and the current thread's TLS data into ptr.

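Each read takes a lock on the thread's Wrapper, but because a Wrapper is bound to a single thread, this lock is almost never contended. Contention is not impossible, though. When does it happen? When a Modify thread has finished changing the background copy and wants to swap foreground and background, it briefly acquires each Wrapper's lock, which can collide with an in-progress read. The sequence diagram later in this article shows the details.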

template <typename T, typename TLS>
class DoublyBufferedData<T, TLS>::Wrapper
    : public DoublyBufferedDataWrapperBase<T, TLS> {
friend class DoublyBufferedData;
public:
    explicit Wrapper(DoublyBufferedData* c) : _control(c) {
    }
    
    ~Wrapper() {
        if (_control != NULL) {
            _control->RemoveWrapper(this);
        }
    }

    // _mutex will be locked by the calling pthread and DoublyBufferedData.
    // Most of the time, no modifications are done, so the mutex is
    // uncontended and fast.
    // Used within a single thread, so the lock is essentially uncontended
    inline void BeginRead() {
        _mutex.lock();
    }

    inline void EndRead() {
        _mutex.unlock();
    }

    inline void WaitReadDone() {
        // Wait for the in-progress read to finish so that the old fg (now bg) can be modified safely
        std::lock_guard<std::mutex> lock(_mutex);
    }
    
private:
    DoublyBufferedData* _control;  
    std::mutex _mutex;
};
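To make the interplay between ScopedPtr and Wrapper concrete, here is a small sketch of the RAII idea: the read lock is taken when a read begins and released when the guard goes out of scope. `ReadGuard` and `IsLockedFromOtherThread` are hypothetical names introduced only for this illustration, and the `Wrapper` here is reduced to its mutex.

```cpp
#include <mutex>
#include <thread>

// Reduced Wrapper: only the per-thread read lock is kept.
struct Wrapper {
    std::mutex mutex;
    void BeginRead() { mutex.lock(); }
    void EndRead() { mutex.unlock(); }
};

// Hypothetical minimal counterpart to ScopedPtr: holds the read lock for its
// whole lifetime and releases it in the destructor.
class ReadGuard {
public:
    explicit ReadGuard(Wrapper* w) : _w(w) { _w->BeginRead(); }
    ~ReadGuard() { _w->EndRead(); }
    ReadGuard(const ReadGuard&) = delete;
    ReadGuard& operator=(const ReadGuard&) = delete;

private:
    Wrapper* _w;
};

// Probes from a different thread whether the wrapper's mutex is currently held.
// (try_lock must not be called by the thread that already owns the mutex.)
bool IsLockedFromOtherThread(Wrapper& w) {
    bool locked = false;
    std::thread probe([&] {
        if (w.mutex.try_lock()) {
            w.mutex.unlock();
        } else {
            locked = true;
        }
    });
    probe.join();
    return locked;
}
```

One practical consequence: a ScopedPtr should be kept alive only as long as the data is actually being read, because a Modify call cannot finish while any reader still holds its lock.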

Modify

Modify changes the double-buffered data using the caller-supplied function Fn.

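To reduce lock contention, Modify first changes the background copy; reads of the foreground are unaffected in the meantime. Once that change is done, it swaps foreground and background by updating _index. At that moment some threads may still be reading the old foreground, so the old foreground (now the new background) cannot be modified right away. Modify therefore acquires the lock of every reader thread in turn (the _mutex shown above) to confirm that all threads have finished reading the old foreground, and only then applies the same modification to it.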

template <typename T, typename TLS>
template <typename Fn>
size_t DoublyBufferedData<T, TLS>::Modify(Fn& fn) {
    // _modify_mutex sequences modifications. Using a separate mutex rather
    // than _wrappers_mutex is to avoid blocking threads calling
    // AddWrapper() or RemoveWrapper() too long. Most of the time, modifications
    // are done by one thread, contention should be negligible.
    std::lock_guard<std::mutex> lock(_modify_mutex);
    int bg_index = !_index.load(std::memory_order_relaxed);
    // background instance is not accessed by other threads, being safe to
    // modify.
    // Modify bg
    const size_t ret = fn(_data[bg_index]);
    if (!ret) {
        // Modification failed
        return 0;
    }

    // Publish, flip background and foreground.
    // The release fence matches with the acquire fence in UnsafeRead() to
    // make readers which just begin to read the new foreground instance see
    // all changes made in fn.
    // Switch foreground and background
    _index.store(bg_index, std::memory_order_release);
    bg_index = !bg_index;
    
    // Wait until all threads finishes current reading. When they begin next
    // read, they should see updated _index.
    // Wait until each reader finishes reading the old foreground; after that it is assumed to see the new foreground
    {
        std::lock_guard<std::mutex> lock(_wrappers_mutex);
        for (size_t i = 0; i < _wrappers.size(); ++i) {
            _wrappers[i]->WaitReadDone();
        }
    }
    // Modify the new background (the old foreground)
    const size_t ret2 = fn(_data[bg_index]);
    return ret2;
}
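The whole read/modify cycle can be condensed into a small self-contained sketch. It is a simplification under stated assumptions, not the brpc implementation: `MiniDBD`, `Reader`, `AddReader`, and `PickServer` are hypothetical names, readers are registered explicitly instead of through a pthread key, reader removal is left out, and the per-thread TLS value is passed in by the caller.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

template <typename T>
class MiniDBD {
public:
    // One Reader per thread; plays the role of Wrapper.
    class Reader {
    public:
        // Locks this reader's mutex, runs fn on the foreground copy, unlocks.
        template <typename Fn>
        auto Read(Fn&& fn) {
            std::lock_guard<std::mutex> lock(_mutex);
            const int fg = _owner->_index.load(std::memory_order_acquire);
            return fn(static_cast<const T&>(_owner->_data[fg]));
        }

    private:
        friend class MiniDBD;
        explicit Reader(MiniDBD* owner) : _owner(owner) {}
        MiniDBD* _owner;
        std::mutex _mutex;
    };

    // Registers a reader so that Modify can wait for it.
    Reader* AddReader() {
        std::lock_guard<std::mutex> lock(_readers_mutex);
        _readers.emplace_back(new Reader(this));
        return _readers.back().get();
    }

    // Applies fn to the background copy, flips, waits for readers, then
    // applies fn to the old foreground. Must not be called from inside Read.
    template <typename Fn>
    size_t Modify(Fn&& fn) {
        std::lock_guard<std::mutex> lock(_modify_mutex);
        int bg = !_index.load(std::memory_order_relaxed);
        const size_t ret = fn(_data[bg]);
        if (!ret) return 0;
        _index.store(bg, std::memory_order_release);  // publish new foreground
        bg = !bg;
        {
            std::lock_guard<std::mutex> readers_lock(_readers_mutex);
            for (auto& r : _readers) {
                // Acquire and release: any read of the old foreground is done.
                std::lock_guard<std::mutex> wait(r->_mutex);
            }
        }
        return fn(_data[bg]);
    }

private:
    T _data[2];
    std::atomic<int> _index{0};
    std::mutex _modify_mutex;
    std::mutex _readers_mutex;
    std::vector<std::unique_ptr<Reader>> _readers;
};

// Usage sketch: round-robin over a server list. `offset` stands in for the
// per-thread TLS value; it returns an empty string when the list is empty.
std::string PickServer(MiniDBD<std::vector<std::string>>::Reader* reader,
                       size_t* offset) {
    return reader->Read([&](const std::vector<std::string>& servers) {
        if (servers.empty()) return std::string();
        return servers[(*offset)++ % servers.size()];
    });
}
```

Because fn runs once on each copy, it should be deterministic and idempotent with respect to the copy it receives; otherwise the two buffers drift apart.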

Read & Modify diagram

[Image: dbd2]

Read & Modify lock sequence diagram

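As the figure below shows, while thread1 is still reading the old foreground, the Modify thread thread2 waits until thread1 finishes reading before it modifies the old foreground.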

[Image: dbd3]
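The same ordering can be reproduced in a few lines. The sketch below assumes a reduced `Wrapper` that contains only the mutex; `ModifierWaitsForReader` is a hypothetical helper in which the calling thread plays thread1 (the reader) and a spawned thread plays thread2 (the modifier after it has flipped _index).

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

struct Wrapper {
    std::mutex mutex;
    void BeginRead() { mutex.lock(); }
    void EndRead() { mutex.unlock(); }
    // Blocks until the current read (if any) has finished.
    void WaitReadDone() { std::lock_guard<std::mutex> lock(mutex); }
};

// Returns true if the modifier was still blocked while the reader held the
// lock and got through once the reader released it.
bool ModifierWaitsForReader() {
    Wrapper w;
    std::atomic<bool> modifier_passed{false};

    w.BeginRead();  // thread1: starts reading the old foreground

    std::thread modifier([&] {  // thread2: waits before touching old fg
        w.WaitReadDone();
        modifier_passed.store(true);
    });

    // Give the modifier time to reach WaitReadDone; it cannot get past it
    // while the read lock is held, regardless of scheduling.
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    const bool blocked_during_read = !modifier_passed.load();

    w.EndRead();  // thread1: read finished, modifier may proceed
    modifier.join();
    return blocked_during_read && modifier_passed.load();
}
```

A read that starts after the modifier has passed WaitReadDone loads the updated _index, so it only ever sees the new foreground.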