Android Message Mechanism - How the Java Layer Relies on the Native Layer to Implement wait

The source code discussed here is from Android S (AOSP) and can be browsed online at cs.android.com/android/pla…

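Inter-thread communication is used extensively in Android. This article is part of a series that walks through the inter-thread communication mechanisms in Android's Java and native layers.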

The article "Android Message Mechanism - Java Layer" noted that Looper.loopOnce calls MessageQueue.next to obtain a Message, and that MessageQueue.next calls the native method nativePollOnce(ptr, nextPollTimeoutMillis); to implement wait.

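In addition, when Handler.sendMessage is called, the MessageQueue object is obtained from the Looper, and MessageQueue.enqueueMessage adds the message to the MessageQueue linked list. If the queue needs to be woken, enqueueMessage also calls nativeWake(mPtr); to wake the nativePollOnce(ptr, nextPollTimeoutMillis); call that is blocked in MessageQueue.next.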

The rest of this article centers on nativePollOnce(ptr, nextPollTimeoutMillis); and nativeWake(mPtr); and analyzes how wait and wake are implemented.

nativeInit

The MessageQueue constructor calls nativeInit to initialize the native layer.

frameworks/base/core/jni/android_os_MessageQueue.cpp

static jlong android_os_MessageQueue_nativeInit(JNIEnv* env, jclass clazz) {
    NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue();
    if (!nativeMessageQueue) {
        jniThrowRuntimeException(env, "Unable to allocate native queue");
        return 0;
    }

    nativeMessageQueue->incStrong(env);
    // Cast the NativeMessageQueue pointer to jlong and return it to the Java-layer MessageQueue
    return reinterpret_cast<jlong>(nativeMessageQueue);
}
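
The handle pattern used by nativeInit can be illustrated outside of JNI. The following is a minimal sketch, not the AOSP code: FakeNativeQueue, createHandle, fromHandle, and destroyHandle are hypothetical names, and std::int64_t stands in for jlong.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for NativeMessageQueue.
struct FakeNativeQueue {
    int pendingWakes = 0;
};

// Same idea as nativeInit: allocate the native object and hand its address
// to the managed side as an opaque 64-bit integer.
std::int64_t createHandle() {
    FakeNativeQueue* queue = new FakeNativeQueue();
    return reinterpret_cast<std::int64_t>(queue);
}

// Same idea as nativePollOnce/nativeWake: turn the integer back into a pointer.
FakeNativeQueue* fromHandle(std::int64_t handle) {
    assert(handle != 0 && "handle must come from createHandle");
    return reinterpret_cast<FakeNativeQueue*>(handle);
}

// Releases the object behind the handle once the owner is done with it.
void destroyHandle(std::int64_t handle) {
    delete fromHandle(handle);
}
```

The managed side never interprets the integer; it only stores it and passes it back, which is why the Java-layer field can be a plain long.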

NativeMessageQueue.NativeMessageQueue

NativeMessageQueue::NativeMessageQueue() :
        mPollEnv(NULL), mPollObj(NULL), mExceptionObj(NULL) {
    // Get the Looper bound to the current thread, similar to ThreadLocal.get() in the Java layer
    mLooper = Looper::getForThread();
    if (mLooper == NULL) {
        // Create the Looper
        mLooper = new Looper(false);
        // Store the new Looper in thread-local storage
        Looper::setForThread(mLooper);
    }
}
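
The get-or-create step in the constructor above follows a common per-thread singleton pattern. The sketch below shows that pattern under stated assumptions: FakeLooper, tLooper, getOrCreateLooper, and looperOfNewThread are hypothetical names, and C++ thread_local is swapped in for whatever storage Looper::getForThread and Looper::setForThread use internally.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical stand-in for the native Looper.
struct FakeLooper {
    explicit FakeLooper(bool allowNonCallbacks) : allowNonCallbacks(allowNonCallbacks) {}
    bool allowNonCallbacks;
};

// One slot per thread; plays the role of getForThread/setForThread.
thread_local std::shared_ptr<FakeLooper> tLooper;

// Returns the calling thread's looper, creating it on first use.
std::shared_ptr<FakeLooper> getOrCreateLooper() {
    if (tLooper == nullptr) {
        tLooper = std::make_shared<FakeLooper>(false);
    }
    assert(tLooper != nullptr);
    return tLooper;
}

// Fetches the looper that a freshly started thread sees.
std::shared_ptr<FakeLooper> looperOfNewThread() {
    std::shared_ptr<FakeLooper> result;
    std::thread worker([&result] { result = getOrCreateLooper(); });
    worker.join();
    return result;
}
```

Calling getOrCreateLooper twice on one thread yields the same object, while looperOfNewThread yields a different one, which is the property that lets each thread own exactly one looper.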

Looper.Looper

system/core/libutils/Looper.cpp

Looper::Looper(bool allowNonCallbacks)
    : mAllowNonCallbacks(allowNonCallbacks),
      mSendingMessage(false),
      mPolling(false),
      mEpollRebuildRequired(false),
      mNextRequestSeq(WAKE_EVENT_FD_SEQ + 1),
      mResponseIndex(0),
      mNextMessageUptime(LLONG_MAX) {
    mWakeEventFd.reset(eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC));
    LOG_ALWAYS_FATAL_IF(mWakeEventFd.get() < 0, "Could not make wake event fd: %s", strerror(errno));

    AutoMutex _l(mLock);
    // Build the epoll instance
    rebuildEpollLocked();
}

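The rebuildEpollLocked flow is not traced here; it was analyzed in "Android Message Mechanism - Native Layer Implementation", which uses the communication between InputReader and InputDispatcher as its background.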
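
For readers who have not seen that article, the general eventfd-plus-epoll setup can be sketched as follows. This is an illustration of the Linux pattern, not a copy of rebuildEpollLocked: WakeChannel, makeWakeChannel, closeWakeChannel, and kWakeTag are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

// Hypothetical bundle of the two descriptors involved.
struct WakeChannel {
    int wakeFd = -1;   // eventfd used purely as a wake-up signal
    int epollFd = -1;  // epoll instance that watches wakeFd
};

// Arbitrary tag stored in the epoll event so the wake fd can be recognized later.
constexpr std::uint64_t kWakeTag = 1;

// Creates the eventfd and registers it with a new epoll instance for EPOLLIN.
WakeChannel makeWakeChannel() {
    WakeChannel ch;
    ch.wakeFd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    assert(ch.wakeFd >= 0);
    ch.epollFd = epoll_create1(EPOLL_CLOEXEC);
    assert(ch.epollFd >= 0);

    epoll_event item{};
    item.events = EPOLLIN;
    item.data.u64 = kWakeTag;
    int rc = epoll_ctl(ch.epollFd, EPOLL_CTL_ADD, ch.wakeFd, &item);
    assert(rc == 0);
    (void)rc;
    return ch;
}

// Closes both descriptors and resets the struct.
void closeWakeChannel(WakeChannel& ch) {
    close(ch.epollFd);
    close(ch.wakeFd);
    ch = WakeChannel{};
}
```

Once the channel exists, an epoll_wait on epollFd reports an event tagged kWakeTag as soon as anything is written to wakeFd.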

nativePollOnce

frameworks/base/core/jni/android_os_MessageQueue.cpp

static void android_os_MessageQueue_nativePollOnce(JNIEnv* env, jobject obj,
        jlong ptr, jint timeoutMillis) {
    // ptr is the pointer returned by nativeInit; it points to the NativeMessageQueue, so cast it back here
    NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
    // Call NativeMessageQueue.pollOnce
    nativeMessageQueue->pollOnce(env, obj, timeoutMillis);
}

NativeMessageQueue.pollOnce

void NativeMessageQueue::pollOnce(JNIEnv* env, jobject pollObj, int timeoutMillis) {
    mPollEnv = env;
    mPollObj = pollObj;
    // Call Looper.pollOnce
    mLooper->pollOnce(timeoutMillis);
    mPollObj = NULL;
    mPollEnv = NULL;

    if (mExceptionObj) {
        env->Throw(mExceptionObj);
        env->DeleteLocalRef(mExceptionObj);
        mExceptionObj = NULL;
    }
}

Looper.pollOnce

system/core/libutils/include/utils/Looper.h

inline int pollOnce(int timeoutMillis) {
    return pollOnce(timeoutMillis, nullptr, nullptr, nullptr);
}

system/core/libutils/Looper.cpp

int Looper::pollOnce(int timeoutMillis, int* outFd, int* outEvents, void** outData) {
    int result = 0;
    for (;;) {
        // ...... (elided: returns result once it is non-zero)

        // Call pollInner
        result = pollInner(timeoutMillis);
    }
}

Looper.pollInner

int Looper::pollInner(int timeoutMillis) {
    // ......

    // epoll_wait also returns when the timeout expires
    int eventCount = epoll_wait(mEpollFd.get(), eventItems, EPOLL_MAX_EVENTS, timeoutMillis);
    
    // ......

    for (int i = 0; i < eventCount; i++) {
        const SequenceNumber seq = eventItems[i].data.u64;
        uint32_t epollEvents = eventItems[i].events;
        if (seq == WAKE_EVENT_FD_SEQ) {
            // epoll reported a readable event
            if (epollEvents & EPOLLIN) {
                // Call awoken to drain the pending wake events
                awoken();
            } else {
                ALOGW("Ignoring unexpected epoll events 0x%x on wake event fd.", epollEvents);
            }
        } else {
            // ......
        }
    }
Done: ;

    // ......
    
    return result;
}

Looper.awoken

void Looper::awoken() {
#if DEBUG_POLL_AND_WAKE
    ALOGD("%p ~ awoken", this);
#endif

    uint64_t counter;
    // Read the counter from mWakeEventFd, retrying only while the read is interrupted (EINTR)
    TEMP_FAILURE_RETRY(read(mWakeEventFd.get(), &counter, sizeof(uint64_t)));
}
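
A single read is enough in awoken because an eventfd holds a counter rather than a queue of events: each write adds to the counter, and one read returns the sum and resets it to zero. The sketch below demonstrates this; signalOnce, drain, DrainResult, and signalThenDrain are hypothetical helper names, not part of Looper.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/eventfd.h>
#include <unistd.h>

// Adds 1 to the eventfd counter, like a single wake-up signal.
bool signalOnce(int fd) {
    std::uint64_t inc = 1;
    return write(fd, &inc, sizeof(inc)) == static_cast<ssize_t>(sizeof(inc));
}

// Reads the counter once. Returns the accumulated value, or 0 if nothing was pending.
std::uint64_t drain(int fd) {
    std::uint64_t counter = 0;
    ssize_t n = read(fd, &counter, sizeof(counter));
    return n == static_cast<ssize_t>(sizeof(counter)) ? counter : 0;
}

struct DrainResult {
    std::uint64_t first;   // value returned by the first read
    std::uint64_t second;  // value returned by the read right after it
};

// Signals `times` times, then drains twice and reports both results.
DrainResult signalThenDrain(int times) {
    int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    assert(fd >= 0);
    for (int i = 0; i < times; i++) {
        signalOnce(fd);
    }
    DrainResult r{};
    r.first = drain(fd);
    r.second = drain(fd);
    close(fd);
    return r;
}
```

With three signals, the first drain sees the value 3 and the second finds nothing pending, so any number of wake calls between two polls collapses into one wake-up.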

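After Looper.awoken() is called, Looper.pollInner returns with result = POLL_WAKE, the for loop in Looper.pollOnce ends, and control returns to MessageQueue.next. MessageQueue.next hands the message to the Java-layer Looper, which dispatches it to the Handler, where handleMessage processes it.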

Next, let's look at how nativeWake wakes epoll_wait.

nativeWake

frameworks/base/core/jni/android_os_MessageQueue.cpp

static void android_os_MessageQueue_nativeWake(JNIEnv* env, jclass clazz, jlong ptr) {
    // Get the NativeMessageQueue
    NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
    // Call NativeMessageQueue.wake
    nativeMessageQueue->wake();
}

NativeMessageQueue.wake

void NativeMessageQueue::wake() {
    // Call wake on the Looper that was created in the NativeMessageQueue constructor
    mLooper->wake();
}

Looper.wake

system/core/libutils/Looper.cpp

void Looper::wake() {
#if DEBUG_POLL_AND_WAKE
    ALOGD("%p ~ wake", this);
#endif

    uint64_t inc = 1;
    // Write the integer 1 to mWakeEventFd, retrying only while the write is interrupted (EINTR)
    ssize_t nWrite = TEMP_FAILURE_RETRY(write(mWakeEventFd.get(), &inc, sizeof(uint64_t)));
    if (nWrite != sizeof(uint64_t)) {
        if (errno != EAGAIN) {
            LOG_ALWAYS_FATAL("Could not write wake signal to fd %d (returned %zd): %s",
                             mWakeEventFd.get(), nWrite, strerror(errno));
        }
    }
}

As soon as something is written to mWakeEventFd, epoll_wait wakes up.
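
Looper.wake treats EAGAIN as harmless. One way to see why: on a non-blocking eventfd, a write fails with EAGAIN only when the counter cannot grow any further, which means a wake-up is already pending. The sketch below reproduces that situation; WakeOutcome, tryWake, and wakeOnSaturatedCounter are hypothetical names, and EINTR retrying is left out for brevity.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <sys/eventfd.h>
#include <unistd.h>

// Outcome of one wake attempt in this sketch.
enum class WakeOutcome { Written, AlreadySaturated, Error };

// Tries to add 1 to the counter. EAGAIN means the counter is already at its
// maximum, so a wake-up is necessarily still pending.
WakeOutcome tryWake(int fd) {
    std::uint64_t inc = 1;
    ssize_t n = write(fd, &inc, sizeof(inc));
    if (n == static_cast<ssize_t>(sizeof(inc))) {
        return WakeOutcome::Written;
    }
    return errno == EAGAIN ? WakeOutcome::AlreadySaturated : WakeOutcome::Error;
}

// Fills the counter to its maximum (0xfffffffffffffffe), then tries one more wake.
WakeOutcome wakeOnSaturatedCounter() {
    int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    assert(fd >= 0);
    std::uint64_t maxValue = 0xfffffffffffffffeULL;
    ssize_t n = write(fd, &maxValue, sizeof(maxValue));
    assert(n == static_cast<ssize_t>(sizeof(maxValue)));
    (void)n;
    WakeOutcome outcome = tryWake(fd);
    close(fd);
    return outcome;
}
```

In practice the counter is drained on every wake-up, so saturation is very unlikely; the check mainly keeps a full counter from being reported as a fatal error.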

Summary

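When MessageQueue.next has no message, or the next message is not yet due for dispatch, it waits by calling nativePollOnce.
nativePollOnce waits on the epoll instance set up during nativeInit, which monitors mWakeEventFd.
Handler.sendMessage indirectly calls nativeWake, which writes data (the integer 1) to mWakeEventFd.
The epoll_wait inside nativePollOnce is woken (or times out), the wait ends, and control returns to MessageQueue.next.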
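
The whole wait/wake cycle summarized above can be condensed into a small, heavily reduced model. This is a sketch under stated assumptions, not the AOSP Looper: MiniLooper, PollResult, and pollWithDelayedWake are hypothetical names, and the model has no message queue, no callbacks, and no EINTR handling.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <thread>
#include <unistd.h>

enum class PollResult { Wake, Timeout, Error };

// Hypothetical reduced looper: one eventfd watched by one epoll instance.
class MiniLooper {
public:
    MiniLooper()
        : mWakeFd(eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC)),
          mEpollFd(epoll_create1(EPOLL_CLOEXEC)) {
        assert(mWakeFd >= 0 && mEpollFd >= 0);
        epoll_event item{};
        item.events = EPOLLIN;
        item.data.fd = mWakeFd;
        epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeFd, &item);
    }
    ~MiniLooper() {
        close(mEpollFd);
        close(mWakeFd);
    }
    MiniLooper(const MiniLooper&) = delete;
    MiniLooper& operator=(const MiniLooper&) = delete;

    // Blocks until wake() is called or timeoutMillis elapses (-1 waits indefinitely).
    PollResult pollOnce(int timeoutMillis) {
        epoll_event out{};
        int count = epoll_wait(mEpollFd, &out, 1, timeoutMillis);
        if (count < 0) return PollResult::Error;
        if (count == 0) return PollResult::Timeout;
        std::uint64_t counter = 0;
        ssize_t n = read(mWakeFd, &counter, sizeof(counter));  // drain the signal
        (void)n;
        return PollResult::Wake;
    }

    // May be called from any thread; adds 1 to the eventfd counter.
    void wake() {
        std::uint64_t inc = 1;
        ssize_t n = write(mWakeFd, &inc, sizeof(inc));
        (void)n;
    }

private:
    int mWakeFd;
    int mEpollFd;
};

// Starts a thread that wakes the looper after delayMillis while this thread polls.
PollResult pollWithDelayedWake(int delayMillis, int timeoutMillis) {
    MiniLooper looper;
    std::thread sender([&looper, delayMillis] {
        std::this_thread::sleep_for(std::chrono::milliseconds(delayMillis));
        looper.wake();
    });
    PollResult result = looper.pollOnce(timeoutMillis);
    sender.join();
    return result;
}
```

Here pollOnce corresponds to the nativePollOnce path and wake to the nativeWake path: a poll with no sender ends in Timeout, while pollWithDelayedWake ends in Wake as long as the delay is shorter than the timeout.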