An Analysis of Threads in WebRTC

- Background

To dig into the details of WebRTC's functional modules and to read and debug the source code more effectively, it is worth understanding WebRTC's threads. This article walks through WebRTC's thread class, its thread event handling, and how its threads run.

- Threads

A thread is defined on Wikipedia as follows:

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system.[1] In many cases, a thread is a component of a process.


The Thread class in WebRTC is defined as follows (abridged):

class RTC_LOCKABLE RTC_EXPORT Thread : public webrtc::TaskQueueBase {
 public:
  static const int kForever = -1;

  explicit Thread(SocketServer* ss);
  explicit Thread(std::unique_ptr<SocketServer> ss);

  Thread(SocketServer* ss, bool do_init);
  Thread(std::unique_ptr<SocketServer> ss, bool do_init);

  ~Thread() override;

  Thread(const Thread&) = delete;
  Thread& operator=(const Thread&) = delete;
  
  static std::unique_ptr<Thread> CreateWithSocketServer();
  static std::unique_ptr<Thread> Create();
  static Thread* Current();
  
 protected:
  void PostTaskImpl(absl::AnyInvocable<void() &&> task,
                    const PostTaskTraits& traits,
                    const webrtc::Location& location) override;

  virtual void BlockingCallImpl(FunctionView<void()> functor,
                                const webrtc::Location& location);

  void DoInit();

  void DoDestroy() RTC_EXCLUSIVE_LOCKS_REQUIRED(mutex_);

  void WakeUpSocketServer();

  void Join();
  
 private:
  static const int kSlowDispatchLoggingThreshold = 50;  // 50 ms

  // Get() will process I/O until:
  //  1) A task is available (returns it)
  //  2) cmsWait seconds have elapsed (returns empty task)
  //  3) Stop() is called (returns empty task)
  absl::AnyInvocable<void() &&> Get(int cmsWait);
  void Dispatch(absl::AnyInvocable<void() &&> task);

  // Sets the per-thread allow-blocking-calls flag and returns the previous
  // value. Must be called on this thread.
  bool SetAllowBlockingCalls(bool allow);

#if defined(WEBRTC_WIN)
  static DWORD WINAPI PreRun(LPVOID context);
#else
  static void* PreRun(void* pv);
#endif

  // Return true if the thread is currently running.
  bool IsRunning();

  // Called by the ThreadManager when being set as the current thread.
  void EnsureIsCurrentTaskQueue();

  // Called by the ThreadManager when being unset as the current thread.
  void ClearCurrentTaskQueue();

  std::queue<absl::AnyInvocable<void() &&>> messages_ RTC_GUARDED_BY(mutex_);
  std::priority_queue<DelayedMessage> delayed_messages_ RTC_GUARDED_BY(mutex_);
  uint32_t delayed_next_num_ RTC_GUARDED_BY(mutex_);

  mutable webrtc::Mutex mutex_;
  bool fInitialized_;
  bool fDestroyed_;

  std::atomic<int> stop_;

  // The SocketServer might not be owned by Thread.
  SocketServer* const ss_;
  // Used if SocketServer ownership lies with `this`.
  std::unique_ptr<SocketServer> own_ss_;

  std::string name_;

#if defined(WEBRTC_POSIX)
  pthread_t thread_ = 0;
#endif

#if defined(WEBRTC_WIN)
  HANDLE thread_ = nullptr;
  DWORD thread_id_ = 0;
#endif

  bool owned_ = true;

  bool blocking_calls_allowed_ = true;

  std::unique_ptr<TaskQueueBase::CurrentTaskQueueSetter>
      task_queue_registration_;

  friend class ThreadManager;

  int dispatch_warning_ms_ RTC_GUARDED_BY(this) = kSlowDispatchLoggingThreshold;
};

The Thread class has both single-argument and two-argument constructors. The single-argument constructors are used when creating a Thread directly; the two-argument ones are for subclasses. When a subclass calls the base Thread constructor, it must pass do_init = false. The reason is that Thread::DoInit() passes the instance's this pointer to the thread-management class ThreadManager, and if that happened while the base-class constructor was still running, the virtual table pointer (vptr) would not yet be fully initialized, so the step has to be performed in the subclass instead. The other constructor argument, SocketServer, is the class used for event handling; see the SocketServer analysis below.

Tip: with inheritance, the base-class constructor first sets the vptr to the base class's vtable; the derived-class constructor then reassigns the vptr to the derived class's vtable.
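This rule is easy to demonstrate in isolation. The sketch below (a standalone illustration, unrelated to WebRTC's classes) shows that a virtual call made inside a base-class constructor resolves to the base version, never the derived override:

```cpp
#include <string>

// While Base's constructor runs, the object's vptr still points at Base's
// vtable, so a virtual call made there never reaches Derived's override.
inline std::string& ConstructionLog() {
  static std::string log;
  return log;
}

struct Base {
  Base() { Describe(); }  // dispatches to Base::Describe, not the override
  virtual ~Base() = default;
  virtual void Describe() { ConstructionLog() += "Base;"; }
};

struct Derived : Base {
  void Describe() override { ConstructionLog() += "Derived;"; }
};

inline std::string RunDemo() {
  ConstructionLog().clear();
  Derived d;     // Base's constructor runs first: logs "Base;"
  d.Describe();  // object fully constructed: logs "Derived;"
  return ConstructionLog();
}
```

This is exactly why DoInit() must run only after the most-derived constructor has installed the final vtable.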

#if defined(WEBRTC_POSIX)
  pthread_t thread_ = 0;
#endif

#if defined(WEBRTC_WIN)
  HANDLE thread_ = nullptr;
  DWORD thread_id_ = 0;
#endif

As this shows, a pthread_t is used on Linux and a HANDLE on Windows; these ultimately hold the handle of the created thread, making WebRTC's Thread class essentially a wrapper around a native thread.
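The wrapper pattern relies on a static trampoline, because the OS thread entry point must be a plain function. A rough standalone sketch (MiniThread and its members are invented names, not WebRTC's code) mirroring the PreRun() seen in the class above:

```cpp
#include <pthread.h>
#include <atomic>

// The static trampoline receives the object pointer and calls back into the
// instance, which is the same trick Thread::PreRun uses.
class MiniThread {
 public:
  void Start() { pthread_create(&handle_, nullptr, &MiniThread::PreRun, this); }
  void Join() { pthread_join(handle_, nullptr); }
  bool ran() const { return ran_.load(); }

 private:
  static void* PreRun(void* pv) {  // plays the role of Thread::PreRun
    static_cast<MiniThread*>(pv)->Run();
    return nullptr;
  }
  void Run() { ran_.store(true); }  // in WebRTC this would be the message loop

  pthread_t handle_{};
  std::atomic<bool> ran_{false};
};
```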

Thread inherits from the abstract class TaskQueueBase, whose purpose is to execute tasks asynchronously while guaranteeing that tasks run in FIFO order.

class RTC_LOCKABLE RTC_EXPORT TaskQueueBase {
 public:
  virtual void Delete() = 0;

  void PostTask(absl::AnyInvocable<void() &&> task,
                const Location& location = Location::Current()) {
    PostTaskImpl(std::move(task), PostTaskTraits{}, location);
  }
 
  static TaskQueueBase* Current();
  bool IsCurrent() const { return Current() == this; }

 protected:
  class RTC_EXPORT CurrentTaskQueueSetter {
   public:
    explicit CurrentTaskQueueSetter(TaskQueueBase* task_queue);
    CurrentTaskQueueSetter(const CurrentTaskQueueSetter&) = delete;
    CurrentTaskQueueSetter& operator=(const CurrentTaskQueueSetter&) = delete;
    ~CurrentTaskQueueSetter();

   private:
    TaskQueueBase* const previous_;
  };

  virtual void PostTaskImpl(absl::AnyInvocable<void() &&> task,
                            const PostTaskTraits& traits,
                            const Location& location) = 0;
  virtual ~TaskQueueBase() = default;
};

Here we focus on the Get() and PostTask() methods. Before introducing these two methods, we need to understand how WebRTC operates. WebRTC has three major threads: the signaling thread, which interacts with the application layer; the worker thread, which handles internal logic; and the network thread, which sends and receives network packets (this topic is already well covered elsewhere, so it is not expanded on here).

Thread collaboration in WebRTC is illustrated below:

[Figure: thread collaboration in WebRTC]

As the figure shows, WebRTC has one thread send a task and another thread execute it; that is how the threads cooperate. A task is sent through Thread's BlockingCall (synchronous) or PostTask (asynchronous), which inserts it into the target thread's message queue.

Tasks are processed by the while loop inside Thread::ProcessMessages(), which runs after the thread starts: Get() continuously fetches tasks from the thread's own message queue, and Dispatch() handles them.

The shared SocketServer object differs by thread type: NullSocketServer, or PhysicalSocketServer when socket events must be handled.

First, let's see how messages are processed, starting with Thread::ProcessMessages():

bool Thread::ProcessMessages(int cmsLoop) {
  // Using ProcessMessages with a custom clock for testing and a time greater
  // than 0 doesn't work, since it's not guaranteed to advance the custom
  // clock's time, and may get stuck in an infinite loop.
  RTC_DCHECK(GetClockForTesting() == nullptr || cmsLoop == 0 ||
             cmsLoop == kForever);
  int64_t msEnd = (kForever == cmsLoop) ? 0 : TimeAfter(cmsLoop);
  int cmsNext = cmsLoop;

  while (true) {
#if defined(WEBRTC_MAC)
    ScopedAutoReleasePool pool;
#endif
    absl::AnyInvocable<void()&&> task = Get(cmsNext);
    if (!task)
      return !IsQuitting();
    Dispatch(std::move(task));

    if (cmsLoop != kForever) {
      cmsNext = static_cast<int>(TimeUntil(msEnd));
      if (cmsNext < 0)
        return true;
    }
  }
}

Thread::Get() is then executed:

absl::AnyInvocable<void() &&> Thread::Get(int cmsWait) {
  // Get w/wait + timer scan / dispatch + socket / event multiplexer dispatch

  int64_t cmsTotal = cmsWait;
  int64_t cmsElapsed = 0;
  int64_t msStart = TimeMillis();
  int64_t msCurrent = msStart;
  while (true) {
    // Check for posted events
    int64_t cmsDelayNext = kForever;
    {
      // All queue operations need to be locked, but nothing else in this loop
      // can happen while holding the `mutex_`.
      MutexLock lock(&mutex_);
      // Check for delayed messages that have been triggered and calculate the
      // next trigger time.
      while (!delayed_messages_.empty()) {
        if (msCurrent < delayed_messages_.top().run_time_ms) {
          cmsDelayNext =
              TimeDiff(delayed_messages_.top().run_time_ms, msCurrent);
          break;
        }
        messages_.push(std::move(delayed_messages_.top().functor));
        delayed_messages_.pop();
      }
      // Pull a message off the message queue, if available.
      if (!messages_.empty()) {
        absl::AnyInvocable<void()&&> task = std::move(messages_.front());
        messages_.pop();
        return task;
      }
    }

    if (IsQuitting())
      break;

    // Which is shorter, the delay wait or the asked wait?

    int64_t cmsNext;
    if (cmsWait == kForever) {
      cmsNext = cmsDelayNext;
    } else {
      cmsNext = std::max<int64_t>(0, cmsTotal - cmsElapsed);
      if ((cmsDelayNext != kForever) && (cmsDelayNext < cmsNext))
        cmsNext = cmsDelayNext;
    }

    {
      // Wait and multiplex in the meantime
      if (!ss_->Wait(cmsNext == kForever ? SocketServer::kForever
                                         : webrtc::TimeDelta::Millis(cmsNext),
                     /*process_io=*/true))
        return nullptr;
    }

    // If the specified timeout expired, return

    msCurrent = TimeMillis();
    cmsElapsed = TimeDiff(msCurrent, msStart);
    if (cmsWait != kForever) {
      if (cmsElapsed >= cmsWait)
        return nullptr;
    }
  }
  return nullptr;
}

As we can see, when messages_ is non-empty a task is returned for processing; when it is empty, the thread enters Wait().
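The delayed-message scan at the top of Get() can also be isolated. In the sketch below (illustrative only; DelayedTask and ScanDelayed are invented names), due tasks move to the ready queue, and the earliest remaining run time bounds how long the thread may sleep:

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Minimal model of the delayed_messages_ priority queue ordered by run time.
struct DelayedTask {
  int64_t run_time_ms;
  int id;
  bool operator>(const DelayedTask& other) const {
    return run_time_ms > other.run_time_ms;
  }
};

// Moves due tasks into `ready` and returns the wait (ms) until the next
// delayed task; -1 means "wait forever" (nothing left scheduled).
inline int64_t ScanDelayed(
    std::priority_queue<DelayedTask, std::vector<DelayedTask>,
                        std::greater<DelayedTask>>& delayed,
    std::queue<int>& ready, int64_t now_ms) {
  while (!delayed.empty()) {
    if (now_ms < delayed.top().run_time_ms)
      return delayed.top().run_time_ms - now_ms;  // sleep at most this long
    ready.push(delayed.top().id);                 // due: promote to ready queue
    delayed.pop();
  }
  return -1;  // kForever
}
```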

Finally, Thread::Dispatch() is executed:

void Thread::Dispatch(absl::AnyInvocable<void() &&> task) {
  TRACE_EVENT0("webrtc", "Thread::Dispatch");
  RTC_DCHECK_RUN_ON(this);
  int64_t start_time = TimeMillis();
  std::move(task)();
  int64_t end_time = TimeMillis();
  int64_t diff = TimeDiff(end_time, start_time);
  if (diff >= dispatch_warning_ms_) {
    RTC_LOG(LS_INFO) << "Message to " << name() << " took " << diff
                     << "ms to dispatch.";
    // To avoid log spew, move the warning limit to only give warning
    // for delays that are larger than the one observed.
    dispatch_warning_ms_ = diff + 1;
  }
}

The body of task() here was supplied by the sending thread, i.e. at the call site of PostTask; knowing where to look makes it straightforward to debug through the source.

Now consider PostTask, which forwards to PostTaskImpl:

void Thread::PostTaskImpl(absl::AnyInvocable<void() &&> task,
                          const PostTaskTraits& traits,
                          const webrtc::Location& location) {
  if (IsQuitting()) {
    return;
  }

  // Keep thread safe
  // Add the message to the end of the queue
  // Signal for the multiplexer to return

  {
    MutexLock lock(&mutex_);
    messages_.push(std::move(task));
  }
  WakeUpSocketServer();
}

As shown, the task is inserted into the target thread's message queue, and the target thread is then woken up.
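BlockingCall, mentioned earlier as the synchronous counterpart, is not shown above; conceptually it posts the functor together with a "done" event and blocks the caller until the target thread has executed it. A hedged sketch of that idea (standard C++, invented names, not WebRTC's actual implementation):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// A one-shot event the caller blocks on until the target thread signals it.
class DoneEvent {
 public:
  void Set() {
    std::lock_guard<std::mutex> lock(mutex_);
    done_ = true;
    cv_.notify_all();
  }
  void Wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return done_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool done_ = false;
};

// `post(task)` must arrange for `task` to run on the target thread (e.g. via
// a PostTask-style mechanism). The caller blocks until the functor has run.
inline void BlockingCallSketch(
    const std::function<void(std::function<void()>)>& post,
    std::function<void()> functor) {
  DoneEvent done;
  post([&, f = std::move(functor)] {  // runs on the target thread
    f();
    done.Set();
  });
  done.Wait();  // caller blocks here until the target thread is finished
}
```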

- Event handling

WebRTC handles events with the SocketServer class (abridged):

class SocketServer : public SocketFactory {
 public:
  static constexpr webrtc::TimeDelta kForever = rtc::Event::kForever;

  static std::unique_ptr<SocketServer> CreateDefault();
  // When the socket server is installed into a Thread, this function is called
  // to allow the socket server to use the thread's message queue for any
  // messaging that it might need to perform. It is also called with a null
  // argument before the thread is destroyed.
  virtual void SetMessageQueue(Thread* queue) {}

  // Sleeps until:
  //  1) `max_wait_duration` has elapsed (unless `max_wait_duration` ==
  //  `kForever`)
  // 2) WakeUp() is called
  // While sleeping, I/O is performed if process_io is true.
  virtual bool Wait(webrtc::TimeDelta max_wait_duration, bool process_io) = 0;

  // Causes the current wait (if one is in progress) to wake up.
  virtual void WakeUp() = 0;

  // A network binder will bind the created sockets to a network.
  // It is only used in PhysicalSocketServer.
  void set_network_binder(NetworkBinderInterface* binder) {
    network_binder_ = binder;
  }
  NetworkBinderInterface* network_binder() const { return network_binder_; }

 private:
  NetworkBinderInterface* network_binder_ = nullptr;
};

Let's focus on the implementations of Wait() and WakeUp(), using SocketServer's subclass PhysicalSocketServer as the example:

bool PhysicalSocketServer::Wait(webrtc::TimeDelta max_wait_duration,
                                bool process_io) {
  // We don't support reentrant waiting.
  RTC_DCHECK(!waiting_);
  ScopedSetTrue s(&waiting_);
  const int cmsWait = ToCmsWait(max_wait_duration);
#if defined(WEBRTC_USE_EPOLL)
  // We don't keep a dedicated "epoll" descriptor containing only the non-IO
  // (i.e. signaling) dispatcher, so "poll" will be used instead of the default
  // "select" to support sockets larger than FD_SETSIZE.
  if (!process_io) {
    return WaitPoll(cmsWait, signal_wakeup_);
  } else if (epoll_fd_ != INVALID_SOCKET) {
    return WaitEpoll(cmsWait);
  }
#endif
  return WaitSelect(cmsWait, process_io);
}

1. First, RTC_DCHECK(!waiting_): if waiting_ is already true, this fails in debug builds, because re-entrant waiting is not supported;

2. ScopedSetTrue s(&waiting_) relies on stack scoping: the flag is set to true on construction and reset to false when the guard is destroyed on leaving the scope;

3. const int cmsWait = ToCmsWait(max_wait_duration) converts the wait duration into milliseconds;

4. With epoll enabled, when process_io is true and the epoll descriptor is valid, WaitEpoll() is entered (when process_io is false, WaitPoll() is used instead). The implementation of WaitEpoll() follows:

bool PhysicalSocketServer::WaitEpoll(int cmsWait) {
  RTC_DCHECK(epoll_fd_ != INVALID_SOCKET);
  int64_t tvWait = -1;
  int64_t tvStop = -1;
  if (cmsWait != kForeverMs) {
    tvWait = cmsWait;
    tvStop = TimeAfter(cmsWait);
  }

  fWait_ = true;
  while (fWait_) {
    // Wait then call handlers as appropriate
    // < 0 means error
    // 0 means timeout
    // > 0 means count of descriptors ready
    int n = epoll_wait(epoll_fd_, epoll_events_.data(), epoll_events_.size(),
                       static_cast<int>(tvWait));
    if (n < 0) {
      if (errno != EINTR) {
        RTC_LOG_E(LS_ERROR, EN, errno) << "epoll";
        return false;
      }
      // Else ignore the error and keep going. If this EINTR was for one of the
      // signals managed by this PhysicalSocketServer, the
      // PosixSignalDeliveryDispatcher will be in the signaled state in the next
      // iteration.
    } else if (n == 0) {
      // If timeout, return success
      return true;
    } else {
      // We have signaled descriptors
      CritScope cr(&crit_);
      for (int i = 0; i < n; ++i) {
        const epoll_event& event = epoll_events_[i];
        uint64_t key = event.data.u64;
        if (!dispatcher_by_key_.count(key)) {
          // The dispatcher for this socket no longer exists.
          continue;
        }
        Dispatcher* pdispatcher = dispatcher_by_key_.at(key);

        bool readable = (event.events & (EPOLLIN | EPOLLPRI));
        bool writable = (event.events & EPOLLOUT);
        bool error = (event.events & (EPOLLRDHUP | EPOLLERR | EPOLLHUP));

        ProcessEvents(pdispatcher, readable, writable, error, error);
      }
    }

    if (cmsWait != kForeverMs) {
      tvWait = TimeDiff(tvStop, TimeMillis());
      if (tvWait <= 0) {
        // Return success on timeout.
        return true;
      }
    }
  }

  return true;
}

epoll_wait blocks until events occur; any signaled descriptors are then handed to ProcessEvents().
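As an aside, the ScopedSetTrue guard noted in point 2 above is a classic RAII idiom; a minimal sketch of the idea:

```cpp
// Sets the flag for exactly the lifetime of the guard: true on construction,
// false again on destruction, even on early return or exception.
class ScopedSetTrue {
 public:
  explicit ScopedSetTrue(bool* value) : value_(value) { *value_ = true; }
  ~ScopedSetTrue() { *value_ = false; }

 private:
  bool* const value_;
};
```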

Next, let's look at how WakeUp() is implemented:

void PhysicalSocketServer::WakeUp() {
  signal_wakeup_->Signal();
}

which calls Signaler::Signal():

void Signal() {
  webrtc::MutexLock lock(&mutex_);
  if (!fSignaled_) {
    const uint8_t b[1] = {0};
    const ssize_t res = write(afd_[1], b, sizeof(b));
    RTC_DCHECK_EQ(1, res);
    fSignaled_ = true;
  }
}

Because Signaler adds itself to the set of descriptors watched by epoll via PhysicalSocketServer::Add(), the one-byte write makes its descriptor readable, causing epoll_wait to return and the subsequent logic to run.
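This wakeup path can be reproduced with the classic self-pipe trick. The sketch below is POSIX-only and uses poll() instead of epoll for brevity (SelfPipe is an invented name; WebRTC instead registers the descriptor through Add()):

```cpp
#include <poll.h>
#include <unistd.h>
#include <cstdint>

// The waiter polls the pipe's read end; Signal() writes one byte, which
// makes the descriptor readable and the wait return.
struct SelfPipe {
  int fds[2] = {-1, -1};  // fds[0]: watched read end, fds[1]: write end

  SelfPipe() {
    if (pipe(fds) != 0) fds[0] = fds[1] = -1;
  }
  ~SelfPipe() {
    close(fds[0]);
    close(fds[1]);
  }

  void Signal() {
    const uint8_t b = 0;
    const ssize_t res = write(fds[1], &b, 1);  // wakes any poller on fds[0]
    (void)res;
  }

  // Returns true if woken by Signal() before timeout_ms elapses.
  bool Wait(int timeout_ms) {
    pollfd pfd{fds[0], POLLIN, 0};
    if (poll(&pfd, 1, timeout_ms) <= 0) return false;
    uint8_t b;
    const ssize_t res = read(fds[0], &b, 1);  // drain so the next Wait blocks
    (void)res;
    return true;
  }
};
```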

- Summary

By reading through WebRTC's thread-related source, we gain insight into WebRTC's internal operation, which helps greatly in getting familiar with the project as a whole. WebRTC has a sending thread produce a task, insert it into the executing thread's message queue, and wake the executing thread, which then processes the task. Once you understand how WebRTC's threads work, you can also port this efficient threading code into your own projects.