From AIDL to the Kernel: A Complete Binder Communication


Preface

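The previous article, "Binder Overview: A Quick Tour of the Binder Architecture" (Binder概述,快速了解Binder体系), introduced the Binder architecture as a whole. This article starts from AIDL and traces one complete Binder communication flow.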

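I originally did not plan to publish this article. Gityuan's series already covers the details of Binder clearly enough, and source-code analysis is the kind of thing that is easy once you know it and hard until you do: reading articles alone is never enough to grasp the details. This article is organized around my own way of thinking and tries to follow a linear line of reasoning. In practice the code is not linear: while following one thread you keep running into another, so some side topics are only mentioned in passing and need further reading on your own.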

AIDL-Generated Code Analysis

Using AIDL

First, write an IHelloInterface.aidl file as follows:

interface IHelloInterface {
    void hello(String msg);
}

After the build, an IHelloInterface.java file is generated. Next, create a remote service:

class RemoteService : Service() {
    private val serviceBinder = object : IHelloInterface.Stub() {
        override fun hello(msg: String) {
            Log.e("remote", "hello from client: $msg")
        }
    }
    override fun onBind(intent: Intent): IBinder = serviceBinder
}

Bind to the remote service and call its method:

class MainActivity : AppCompatActivity() {
    private val conn = object : ServiceConnection {
        override fun onServiceConnected(name: ComponentName, service: IBinder) {
            // The service passed in here is a BinderProxy
            // asInterface returns an IHelloInterface.Stub.Proxy instance
            val proxy = IHelloInterface.Stub.asInterface(service)
            proxy.hello("client msg")
        }

        override fun onServiceDisconnected(name: ComponentName?) {
        }
    }
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        bindService(Intent("com.lyj.RemoteService"), conn, Service.BIND_AUTO_CREATE)
    }
}

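The remote service returns its server-side Binder instance from onBind. After the client binds to the service with bindService, it receives, via AMS, a BinderProxy instance corresponding to that server-side Binder in the ServiceConnection.onServiceConnected callback. The client can pass this BinderProxy to IHelloInterface.Stub.asInterface to obtain an IHelloInterface.Stub.Proxy instance and call its methods to perform IPC.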

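IHelloInterface Analysis

The generated file has three main parts, split up here for readability:

  • The IHelloInterface interface, which abstracts the functionality of the remote service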
    public interface IHelloInterface extends android.os.IInterface {
        public void hello(java.lang.String msg) throws android.os.RemoteException;
    }
    
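  • The IHelloInterface.Stub class represents the server-side implementation. It is an abstract class that extends Binder; we override hello when we use it (the remote Service returns an implementation of this class from onBind and overrides hello there)
    public static abstract class Stub extends android.os.Binder implements com.lyj.bindertest.IHelloInterface {
        // Interface descriptor (type identifier)
        private static final java.lang.String DESCRIPTOR = "com.lyj.bindertest.IHelloInterface";
        // Client and server both use this code to identify the hello method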
        static final int TRANSACTION_hello = (android.os.IBinder.FIRST_CALL_TRANSACTION + 0);
    
        public Stub() {
            this.attachInterface(this, DESCRIPTOR);
        }
        
        // Usually called from the ServiceConnection.onServiceConnected callback
        public static com.lyj.bindertest.IHelloInterface asInterface(android.os.IBinder obj) {
            if ((obj == null)) {
                return null;
            }
            android.os.IInterface iin = obj.queryLocalInterface(DESCRIPTOR);
            if (((iin != null) && (iin instanceof com.lyj.bindertest.IHelloInterface))) {
                return ((com.lyj.bindertest.IHelloInterface) iin);
            }
            return new com.lyj.bindertest.IHelloInterface.Stub.Proxy(obj);
        }
    
        @Override
        public android.os.IBinder asBinder() {
            return this;
        }
        
        // Decodes calls coming from the client
        @Override
        public boolean onTransact(int code, android.os.Parcel data, android.os.Parcel reply, int flags) throws android.os.RemoteException {
            java.lang.String descriptor = DESCRIPTOR;
            switch (code) {
                case INTERFACE_TRANSACTION: {
                    reply.writeString(descriptor);
                    return true;
                }
                case TRANSACTION_hello: {
                    // When code is TRANSACTION_hello, call the hello method
                    data.enforceInterface(descriptor);
                    java.lang.String _arg0;
                    // Read the argument from the parcel
                    _arg0 = data.readString();
                    this.hello(_arg0);
                    reply.writeNoException();
                    return true;
                }
                default: {
                    return super.onTransact(code, data, reply, flags);
                }
            }
        }
    }
    
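  • The IHelloInterface.Stub.Proxy class is the client-side proxy of the remote service; its member mRemote is the BinderProxy that corresponds to the remote service's Binder
    private static class Proxy implements com.lyj.bindertest.IHelloInterface {
      // The BinderProxy that corresponds to the remote service's Binder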
      private android.os.IBinder mRemote;
    
      Proxy(android.os.IBinder remote) {
          mRemote = remote;
      }
    
      @Override
      public android.os.IBinder asBinder() {
          return mRemote;
      }
    
      @Override
      public void hello(java.lang.String msg) throws android.os.RemoteException {
          android.os.Parcel _data = android.os.Parcel.obtain();
          android.os.Parcel _reply = android.os.Parcel.obtain();
          try {
              // Write the interface descriptor
              _data.writeInterfaceToken(DESCRIPTOR);
              // Write the argument
              _data.writeString(msg);
              // Call BinderProxy.transact to start the communication
              boolean _status = mRemote.transact(Stub.TRANSACTION_hello, _data, _reply, 0);
              if (!_status && getDefaultImpl() != null) {
                  getDefaultImpl().hello(msg);
                  return;
              }
              _reply.readException();
          } finally {
              _reply.recycle();
              _data.recycle();
          }
      }
      // Remaining generated members are omitted here
    }

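To summarize the relationship between IHelloInterface.Stub and IHelloInterface.Stub.Proxy:

IHelloInterface.Stub is itself a Binder and represents the server. It receives calls from the client in onTransact and, when code == TRANSACTION_hello, calls the IHelloInterface.hello interface method.

IHelloInterface.Stub.Proxy represents the client. It holds a BinderProxy named mRemote (which internally holds the handle of the server-side Binder). Proxy.hello calls BinderProxy.transact with code = TRANSACTION_hello, which is the same code the server receives.

So the call that actually initiates communication is BinderProxy.transact, the server receives it in Binder.onTransact, and AIDL is only a thin wrapper around the two.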
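To make the Stub/Proxy relationship concrete, here is a minimal in-process sketch of the same dispatch pattern. It is not the real Binder API: MiniParcel, MiniStub and MiniProxy are hypothetical stand-ins, the proxy calls the stub directly instead of going through the driver, and the "ok" reply merely stands in for writeNoException. Only the descriptor string and the FIRST_CALL_TRANSACTION + 0 numbering are taken from the generated code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for android.os.Parcel: an ordered list of strings.
struct MiniParcel {
    std::vector<std::string> items;
    std::size_t pos = 0;
    void writeString(const std::string& s) { items.push_back(s); }
    std::string readString() {
        assert(pos < items.size());
        return items[pos++];
    }
};

constexpr std::uint32_t FIRST_CALL_TRANSACTION = 1;
constexpr std::uint32_t TRANSACTION_hello = FIRST_CALL_TRANSACTION + 0;
const std::string DESCRIPTOR = "com.lyj.bindertest.IHelloInterface";

// Server side: plays the role of IHelloInterface.Stub.
struct MiniStub {
    std::string lastMessage;  // records what hello() received

    void hello(const std::string& msg) { lastMessage = msg; }

    // Decode the call by its transaction code, as onTransact does.
    bool onTransact(std::uint32_t code, MiniParcel& data, MiniParcel& reply) {
        switch (code) {
            case TRANSACTION_hello: {
                if (data.readString() != DESCRIPTOR) return false;  // like enforceInterface
                hello(data.readString());
                reply.writeString("ok");  // stands in for writeNoException
                return true;
            }
            default:
                return false;
        }
    }
};

// Client side: plays the role of IHelloInterface.Stub.Proxy.
struct MiniProxy {
    MiniStub& remote;  // in real Binder this would be a BinderProxy holding a handle
    explicit MiniProxy(MiniStub& r) : remote(r) {}

    bool hello(const std::string& msg) {
        MiniParcel data;
        MiniParcel reply;
        data.writeString(DESCRIPTOR);  // like writeInterfaceToken
        data.writeString(msg);
        return remote.onTransact(TRANSACTION_hello, data, reply);
    }
};
```

Calling MiniProxy::hello("client msg") ends up in MiniStub::hello with the same string, and an unknown transaction code is rejected by the default branch. In the real system the direct call to onTransact is replaced by the trip through the driver that the rest of this article follows.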

The Transition from the Java Layer to the Native Layer

Starting from BinderProxy.transact, the JNI method transactNative enters the native layer. The native function behind BinderProxy.transactNative is android_os_BinderProxy_transact in android_util_Binder.cpp.

final class BinderProxy implements IBinder {
    public boolean transact(int code, Parcel data, Parcel reply, int flags) throws RemoteException {
        // Check whether the parcel data is larger than 800 KB
        Binder.checkParcel(this, code, data, "Unreasonably large binder buffer");
        // Call into the native layer
        return transactNative(code, data, reply, flags);
    }
}

frameworks\base\core\jni\android_util_Binder.cpp

static jboolean android_os_BinderProxy_transact(JNIEnv* env, jobject obj,
        jint code, jobject dataObj, jobject replyObj, jint flags) // throws RemoteException
{
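    // Convert the Java Parcels to native Parcels
    Parcel* data = parcelForJavaObject(env, dataObj);
    Parcel* reply = parcelForJavaObject(env, replyObj);
    // Get the BpBinder pointer from the Java BinderProxy object
    IBinder* target = (IBinder*) env->GetLongField(obj, gBinderProxyOffsets.mObject);
    // Call BpBinder::transact
    status_t err = target->transact(code, *data, reply, flags);
    // Report success to Java; the error handling of the original function is abridged here
    if (err == NO_ERROR) {
        return JNI_TRUE;
    }
    return JNI_FALSE;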
}

android_os_BinderProxy_transact obtains the BpBinder pointer from the BinderProxy object passed in and then calls BpBinder::transact.

frameworks\native\libs\binder\BpBinder.cpp

status_t BpBinder::transact(
    uint32_t code, const Parcel& data, Parcel* reply, uint32_t flags)
{
    if (mAlive) {
        // mHandle is the handle of the receiving side's BBinder
        status_t status = IPCThreadState::self()->transact(
            mHandle, code, data, reply, flags);
        if (status == DEAD_OBJECT) mAlive = 0;
        return status;
    }
    return DEAD_OBJECT;
}

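BpBinder calls IPCThreadState::transact. IPCThreadState::self returns the current thread's IPCThreadState singleton, creating it if it does not exist. As mentioned in the previous article, every thread that performs Binder communication corresponds to one IPCThreadState object in the native layer.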

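Creating IPCThreadState

IPCThreadState::self does very little: it returns the current thread's IPCThreadState singleton (pthread_getspecific can be thought of as ThreadLocal.get) and, if none exists, creates one with the no-argument constructor.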

frameworks\native\libs\binder\IPCThreadState.cpp

IPCThreadState* IPCThreadState::self()
{
    // Has the TLS key already been created?
    if (gHaveTLS) {
restart:
        const pthread_key_t k = gTLS;
        // Look up this thread's instance in thread-local storage
        IPCThreadState* st = (IPCThreadState*)pthread_getspecific(k);
        if (st) return st;
        return new IPCThreadState;
    }
    if (gShutdown) {
        return NULL;
    }
    // Mutex guarding creation of the TLS key
    pthread_mutex_lock(&gTLSMutex);
    if (!gHaveTLS) {
        int key_create_value = pthread_key_create(&gTLS, threadDestructor);
        if (key_create_value != 0) {
            pthread_mutex_unlock(&gTLSMutex);
            return NULL;
        }
        gHaveTLS = true;
    }
    pthread_mutex_unlock(&gTLSMutex);
    goto restart;
}
IPCThreadState::IPCThreadState()
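    // Store the process-wide ProcessState
    : mProcess(ProcessState::self()),
      mStrictModePolicy(0),
      mLastTransactionBinderFlags(0)
{
    // Save this object in the current thread's thread-local storage
    pthread_setspecific(gTLS, this);
    clearCaller();
    // mIn and mOut are the two Parcels used to read from and write to the binder driver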
    mIn.setDataCapacity(256);
    mOut.setDataCapacity(256);
}
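
The per-thread singleton idea can be sketched in a few lines of standard C++. This is only an illustration under my own assumptions: ThreadStateSketch is a hypothetical class, and it uses the thread_local keyword instead of the pthread_key_create / pthread_getspecific calls that IPCThreadState really uses.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for IPCThreadState: at most one instance per thread.
class ThreadStateSketch {
public:
    static ThreadStateSketch* self() {
        // Each thread gets its own copy of this variable, created on first use,
        // which mirrors "look up in TLS, create if missing".
        thread_local ThreadStateSketch instance;
        return &instance;
    }

    // The thread that created (and therefore owns) this instance.
    std::thread::id owner() const { return owner_; }

private:
    ThreadStateSketch() : owner_(std::this_thread::get_id()) {}
    std::thread::id owner_;
};
```

Repeated calls to self() on one thread return the same pointer, while a call on another thread constructs a separate object for that thread.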

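In the IPCThreadState constructor the main thing to note is how the ProcessState instance is obtained. It is a process-wide singleton: ProcessState::self returns it, creating it if it does not exist. In fact the ProcessState singleton already exists by the time this call runs. As the previous article mentioned, after each app process is forked from Zygote it reaches the onZygoteInit function in app_main.cpp, which creates the process's ProcessState and starts the binder thread pool. The app process startup path is not covered here; read the source if you are interested. Below we go straight to onZygoteInit to look at Binder initialization.

Binder Initialization in a Process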

frameworks\base\cmds\app_process\app_main.cpp

virtual void onZygoteInit()
{
    sp<ProcessState> proc = ProcessState::self();
    // Start the binder thread pool
    proc->startThreadPool();
}

Creating ProcessState

frameworks\native\libs\binder\ProcessState.cpp

sp<ProcessState> ProcessState::self()
{
    // Lock that guards the singleton
    Mutex::Autolock _l(gProcessMutex);
    if (gProcess != NULL) {
        return gProcess;
    }
    gProcess = new ProcessState("/dev/binder");
    return gProcess;
}

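The ProcessState constructor first calls open_driver to open the binder driver, then issues the mmap system call, which reaches binder_mmap in the Binder driver and sets up the memory mapping used to receive data.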

ProcessState::ProcessState(const char *driver)
    // Open the binder driver; mDriverFD stores the file descriptor
    : mDriverFD(open_driver(driver))
    ......
    // Maximum number of binder threads, 15
    , mMaxThreads(DEFAULT_MAX_BINDER_THREADS)
{
    if (mDriverFD >= 0) {
        // Reaches binder_mmap and sets up a (1M - 8K) memory-mapped region
        mVMStart = mmap(0, BINDER_VM_SIZE, PROT_READ, MAP_PRIVATE | MAP_NORESERVE, mDriverFD, 0);
        if (mVMStart == MAP_FAILED) {
            close(mDriverFD);
            mDriverFD = -1;
            mDriverName.clear();
        }
    }
}

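open_driver mainly calls into the kernel-side Binder driver to initialize it:

  1. The open system call maps to binder_open in the driver; the returned fd is a file descriptor that all later operations must pass in
  2. ioctl BINDER_VERSION reaches the BINDER_VERSION case of binder_ioctl in the driver and retrieves the kernel binder version
  3. ioctl BINDER_SET_MAX_THREADS reaches the BINDER_SET_MAX_THREADS case of binder_ioctl and sets the thread limit for this process in the driver

static int open_driver(const char *driver)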
{
    int fd = open(driver, O_RDWR | O_CLOEXEC);
    if (fd >= 0) {
        int vers = 0;
        // Query the kernel binder version
        status_t result = ioctl(fd, BINDER_VERSION, &vers);
        if (result == -1) {
            close(fd);
            fd = -1;
        }
        // Compare the kernel binder version with the version the framework expects
        if (result != 0 || vers != BINDER_CURRENT_PROTOCOL_VERSION) {
            close(fd);
            fd = -1;
        }
        // Sets binder_proc.max_threads = DEFAULT_MAX_BINDER_THREADS in the driver
        size_t maxThreads = DEFAULT_MAX_BINDER_THREADS;
        result = ioctl(fd, BINDER_SET_MAX_THREADS, &maxThreads);
        if (result == -1) {
        }
    }
    return fd;
}

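Binder Initialization in the Driver Layer

In the driver, binder_open creates a binder_proc from the current process's information and inserts it into a global list. It then stores the binder_proc pointer in the file structure behind the user-space fd so that it can be retrieved on later calls.

drivers/android/binder.c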

static int binder_open(struct inode *nodp, struct file *filp)
{
    // The process object that owns this binder instance; corresponds to ProcessState
    struct binder_proc *proc;
    // Allocate a zeroed binder_proc in kernel memory
    proc = kzalloc(sizeof(*proc), GFP_KERNEL);
    if (proc == NULL)
        return -ENOMEM;
    // Take a reference to the current process's task descriptor
    get_task_struct(current);
    proc->tsk = current;
    // Initialize the process's todo (work) queue
    INIT_LIST_HEAD(&proc->todo);
    // Wait queue
    init_waitqueue_head(&proc->wait);
    proc->default_priority = task_nice(current);

    binder_lock(__func__);

    binder_stats_created(BINDER_STAT_PROC);
    // Insert this binder_proc into the global list
    hlist_add_head(&proc->proc_node, &binder_procs);
    proc->pid = current->group_leader->pid;
    INIT_LIST_HEAD(&proc->delivered_death);
    // Store the binder_proc pointer in the file's private_data so that later calls from user space through this fd can recover it
    filp->private_data = proc;
    binder_unlock(__func__);
    return 0;
}

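binder_mmap sets up the memory mapping for the current process:

  • The vm_area_struct structure describes a range of virtual addresses in user space; vm_struct describes a range of virtual addresses in kernel space
  • The earlier user-space mmap call requested a mapping of 1M - 8KB. The kernel automatically reserves a user-space address range of that size and records it in a vm_area_struct. Once binder_mmap is reached, it reserves a kernel virtual address range of the same size and stores it in the area pointer
  • A binder_buffer is created to record the start address and size of the user/kernel mapped region, to be used later for storing transaction data
  • At this point the user and kernel virtual ranges are not yet connected. binder_update_page_range first allocates one page (4KB) of physical memory and makes both virtual ranges point to that physical page, which completes the mapping

Only one physical page is allocated here; more memory is allocated on demand later, when binder_transaction carries out an actual transaction.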

static int binder_mmap(struct file *filp, struct vm_area_struct *vma)
{
    int ret;
    // Kernel virtual address range
    struct vm_struct *area;
    struct binder_proc *proc = filp->private_data;
    const char *failure_string;
    // Every Binder transfer first allocates a binder_buffer from the Binder memory region to hold the data
    struct binder_buffer *buffer;

    if (proc->tsk != current)
            return -EINVAL;
    // Cap the mapping size at 4M
    if ((vma->vm_end - vma->vm_start) > SZ_4M)
        vma->vm_end = vma->vm_start + SZ_4M;
    ......
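    // Reserve a contiguous kernel virtual range (VM_IOREMAP) of the same size as the user-space range
    // vma is the virtual memory area describing the user-space mapping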
    area = get_vm_area(vma->vm_end - vma->vm_start, VM_IOREMAP);
    if (area == NULL) {
            ret = -ENOMEM;
            failure_string = "get_vm_area";
            goto err_get_vm_area_failed;
    }
    // Start address of the kernel virtual range
    proc->buffer = area->addr;
    // Offset = user-space start address - kernel-space start address
    proc->user_buffer_offset = vma->vm_start - (uintptr_t)proc->buffer;
    ......
    // Allocate the array of physical page pointers; its length is the number of pages covered by vma
    proc->pages = kzalloc(sizeof(proc->pages[0]) * ((vma->vm_end - vma->vm_start) / PAGE_SIZE), GFP_KERNEL);
    if (proc->pages == NULL) {
            ret = -ENOMEM;
            failure_string = "alloc page array";
            goto err_alloc_pages_failed;
    }
    proc->buffer_size = vma->vm_end - vma->vm_start;

    vma->vm_ops = &binder_vm_ops;
    vma->vm_private_data = proc;
    // Allocate physical pages and map them into both kernel space and the process; one page for now
    if (binder_update_page_range(proc, 1, proc->buffer, proc->buffer + PAGE_SIZE, vma)) {
            ret = -ENOMEM;
            failure_string = "alloc small buf";
            goto err_alloc_small_buf_failed;
    }
    buffer = proc->buffer;
    // Initialize the buffers list and add this buffer to it
    INIT_LIST_HEAD(&proc->buffers);
    list_add(&buffer->entry, &proc->buffers);
    buffer->free = 1;
    binder_insert_free_buffer(proc, buffer);
    // Oneway (asynchronous) transactions may use at most half of the total space
    proc->free_async_space = proc->buffer_size / 2;
    barrier();
    proc->files = get_files_struct(current);
    proc->vma = vma;
    proc->vma_vm_mm = vma->vm_mm;
    return 0;
}
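
The address arithmetic in binder_mmap is easy to lose among the kernel details, so here is a small user-space sketch of it. MappingSketch and its field names are hypothetical, a 4 KB page size is assumed, and the addresses used with it are made-up numbers; the only things taken from the code above are the 1M - 8K mapping size and the rule that a user address equals the kernel address plus a fixed offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kPageSize = 4096;                             // assumed 4 KB pages
constexpr std::size_t kBinderVmSize = 1024 * 1024 - 2 * kPageSize;  // 1M - 8K

// Hypothetical model of the two virtual ranges that share the same physical pages.
struct MappingSketch {
    std::uintptr_t kernelStart;  // plays the role of proc->buffer
    std::uintptr_t userStart;    // plays the role of vma->vm_start
    std::size_t size;            // plays the role of proc->buffer_size

    // Translate a kernel-side address to the user-side address of the same byte.
    // Equivalent to kernelAddr + user_buffer_offset in the driver.
    std::uintptr_t toUser(std::uintptr_t kernelAddr) const {
        assert(kernelAddr >= kernelStart && kernelAddr < kernelStart + size);
        return userStart + (kernelAddr - kernelStart);
    }

    // Number of entries needed in the page-pointer array (proc->pages).
    std::size_t pageCount() const { return size / kPageSize; }
};
```

With these assumptions the mapping covers 254 pages, yet only the first of them has physical memory behind it right after binder_mmap returns.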

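binder_update_page_range allocates physical pages for the mapped addresses. Here it allocates one physical page (4KB) and maps that page into both the user-space address range and the kernel-space address range.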

static int binder_update_page_range(struct binder_proc *proc, int allocate,
				    void *start, void *end,
				    struct vm_area_struct *vma)
{
    // Current address in the kernel mapping
    void *page_addr;
    // Corresponding address in the user mapping
    unsigned long user_page_addr;
    struct page **page;
    // Memory descriptor of the process
    struct mm_struct *mm;
    
    if (end <= start)
        return 0;
    ......
    // Allocate each physical page in the range and map it into both user space and kernel space
    for (page_addr = start; page_addr < end; page_addr += PAGE_SIZE) {
        int ret;
        page = &proc->pages[(page_addr - proc->buffer) / PAGE_SIZE];

        BUG_ON(*page);
        // Allocate one page of physical memory
        *page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
        if (*page == NULL) {
                pr_err("%d: binder_alloc_buf failed for page at %p\n",
                        proc->pid, page_addr);
                goto err_alloc_page_failed;
        }
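        // Map the physical page into the kernel virtual range
        ret = map_kernel_range_noflush((unsigned long)page_addr,
                                PAGE_SIZE, PAGE_KERNEL, page);
        flush_cache_vmap((unsigned long)page_addr,
                                (unsigned long)page_addr + PAGE_SIZE);
        // User-space address = kernel address + offset
        user_page_addr =
                (uintptr_t)page_addr + proc->user_buffer_offset;
        // Map the physical page into the user virtual range
        ret = vm_insert_page(vma, user_page_addr, page[0]);
    }
    ......
    return 0;
}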

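Starting the Binder Thread Pool

Every complete Binder transaction reads and writes the driver in a loop and blocks the current thread while doing so, so handling concurrent work with several threads is the natural choice. Once ProcessState has been created, startThreadPool starts a new thread that reads the binder driver in an endless loop. Whenever it reads a request from another process, that thread handles the request, and, as long as the process has not reached its binder thread limit, an additional thread is created to be ready for later requests, which improves responsiveness. Let's verify this in the code.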

frameworks\native\libs\binder\ProcessState.cpp

void ProcessState::startThreadPool()
{
    AutoMutex _l(mLock);
    if (!mThreadPoolStarted) {
        mThreadPoolStarted = true;
        // Start the main binder thread
        spawnPooledThread(true);
    }
}
void ProcessState::spawnPooledThread(bool isMain)
{
    if (mThreadPoolStarted) {
        // Thread name is Binder:pid_N, where N starts at 1
        String8 name = makeBinderThreadName();
        // isMain marks the main binder thread
        sp<Thread> t = new PoolThread(isMain);
        // PoolThread::run eventually calls PoolThread::threadLoop
        t->run(name.string());
    }
}

After a chain of calls, Thread::run eventually reaches PoolThread::threadLoop:

virtual bool threadLoop()
{
    // Now running on the new thread; register it with the binder thread pool (as the main thread when mIsMain is true)
    IPCThreadState::self()->joinThreadPool(mIsMain);
    return false;
}

IPCThreadState::joinThreadPool calls getAndExecuteCommand in an endless loop to read and write the Binder driver. It first writes BC_ENTER_LOOPER, then reads forever, sleeping when there is no work.

void IPCThreadState::joinThreadPool(bool isMain)
{
    // The main thread writes BC_ENTER_LOOPER; other threads write BC_REGISTER_LOOPER
    mOut.writeInt32(isMain ? BC_ENTER_LOOPER : BC_REGISTER_LOOPER);
    status_t result;
    do {
        // Process pending decrements of Binder strong/weak references
        processPendingDerefs();
        // Handle the next command, which talkWithDriver read from the driver
        result = getAndExecuteCommand();

        if (result < NO_ERROR && result != TIMED_OUT && result != -ECONNREFUSED && result != -EBADF) {
            abort();
        }
        // A non-main thread exits when it times out
        if(result == TIMED_OUT && !isMain) {
            break;
        }
    } while (result != -ECONNREFUSED && result != -EBADF);
    // Notify the driver that this thread is leaving the loop
    mOut.writeInt32(BC_EXIT_LOOPER);
    talkWithDriver(false);
}

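On its first call, talkWithDriver writes the BC_ENTER_LOOPER command that joinThreadPool placed in mOut to the driver, telling the driver that this thread has entered its wait loop. It then starts reading from the driver, and executeCommand handles the commands that are read.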

status_t IPCThreadState::getAndExecuteCommand()
{
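    // This method is called in a loop
    status_t result;
    int32_t cmd;
    // Talk to the driver: first write BC_ENTER_LOOPER to the driver,
    // then read commands from it; the thread sleeps here when there is no work
    result = talkWithDriver();
    if (result >= NO_ERROR) {
        size_t IN = mIn.dataAvail();
        if (IN < sizeof(int32_t)) return result;
        cmd = mIn.readInt32();
        // Maximum thread count bookkeeping
        ......
        // Parse and handle the command
        result = executeCommand(cmd);
        // Maximum thread count bookkeeping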
        ......
    }
    return result;
}

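Here it is enough to know that ioctl reaches binder_ioctl_write_read in the Binder driver and then the BC_ENTER_LOOPER branch of binder_thread_write; the details are left for the later section on the actual transaction.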

status_t IPCThreadState::talkWithDriver(bool doReceive)
{
    if (mProcess->mDriverFD <= 0) {
        return -EBADF;
    }
    .....
    do {
        // Call the driver's binder_ioctl_write_read, retrying if interrupted
        // It first runs the BC_ENTER_LOOPER case of binder_thread_write
        // and then calls binder_thread_read
        if (ioctl(mProcess->mDriverFD, BINDER_WRITE_READ, &bwr) >= 0)
            err = NO_ERROR;
        else
            err = -errno;
        if (mProcess->mDriverFD <= 0) {
            err = -EBADF;
        }
    } while (err == -EINTR);
    .......
}

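Thread Management in the Driver Layer

Continuing from the previous section: talkWithDriver reaches binder_ioctl_write_read in the binder driver, which first enters binder_thread_write. When the driver reads BC_REGISTER_LOOPER or BC_ENTER_LOOPER it updates the looper flags of the binder_thread for the calling thread to mark that the thread has started looping. The difference is that threads registered with BC_REGISTER_LOOPER (isMain = false) are counted in requested_threads_started and are therefore subject to the maximum thread limit, while BC_ENTER_LOOPER threads are not.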

drivers/android/binder.c

static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
    ......
    while (ptr < end && thread->return_error == BR_OK) {
        ......
        switch (cmd) {
        // Non-main thread
        case BC_REGISTER_LOOPER:
            // This thread already entered the loop as a main binder thread; it cannot register again
            if (thread->looper & BINDER_LOOPER_STATE_ENTERED) {
                thread->looper |= BINDER_LOOPER_STATE_INVALID;
            } else if (proc->requested_threads == 0) {
                // The driver did not request a new thread, so none should have been created
                thread->looper |= BINDER_LOOPER_STATE_INVALID;
            } else {
                proc->requested_threads--;
                proc->requested_threads_started++;
            }
            thread->looper |= BINDER_LOOPER_STATE_REGISTERED;
            break;
        // Main thread
        case BC_ENTER_LOOPER:
            if (thread->looper & BINDER_LOOPER_STATE_REGISTERED) {
                thread->looper |= BINDER_LOOPER_STATE_INVALID;
            }
            // Set the looper flag on the calling thread's binder_thread
            thread->looper |= BINDER_LOOPER_STATE_ENTERED;
            break;
        }
    }
    ......
}
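
The two looper commands amount to a small state machine, sketched below in user-space C++. ProcCounters, ThreadFlags and the flag constants are hypothetical names with illustrative values; they only mirror the roles of the binder_proc counters and the BINDER_LOOPER_STATE_* bits in the excerpt above.

```cpp
#include <cassert>

// Illustrative flag bits; the names mirror the BINDER_LOOPER_STATE_* constants.
constexpr unsigned kRegistered = 0x01;
constexpr unsigned kEntered = 0x02;
constexpr unsigned kInvalid = 0x08;

struct ProcCounters {
    int requestedThreads = 0;         // spawn requests the driver has issued
    int requestedThreadsStarted = 0;  // spawned threads that have registered
};

struct ThreadFlags {
    unsigned looper = 0;
};

// Mirrors the BC_REGISTER_LOOPER case: only valid if the driver asked for a thread.
void onRegisterLooper(ProcCounters& proc, ThreadFlags& thread) {
    assert(proc.requestedThreads >= 0);
    if (thread.looper & kEntered) {
        thread.looper |= kInvalid;  // already a main looper
    } else if (proc.requestedThreads == 0) {
        thread.looper |= kInvalid;  // nobody requested this thread
    } else {
        proc.requestedThreads--;
        proc.requestedThreadsStarted++;
    }
    thread.looper |= kRegistered;
}

// Mirrors the BC_ENTER_LOOPER case: main loopers are not counted against the limit.
void onEnterLooper(ThreadFlags& thread) {
    if (thread.looper & kRegistered) {
        thread.looper |= kInvalid;
    }
    thread.looper |= kEntered;
}
```

A thread that registers without a pending request ends up flagged invalid, which is how the driver detects a user-space process that spawns binder threads on its own.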

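Next comes binder_thread_read:

  1. The binder_proc.todo queue holds requests from other processes to this process (BINDER_WORK_TRANSACTION); binder_thread.todo holds work for the current thread (BINDER_WORK_TRANSACTION_COMPLETE)
  2. When the current binder_thread.todo is empty, the thread sleeps until binder_proc.todo is no longer empty
  3. When a request from a client arrives, the thread is woken and goes to the BINDER_WORK_TRANSACTION case (analyzed in the communication section). Finally, after a series of checks and if the conditions hold, the driver writes BR_SPAWN_LOOPER to tell the framework layer to start a new thread

static int binder_thread_read(struct binder_proc *proc,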
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
    ......
    // Whether to wait for process-level work
    int wait_for_proc_work;
    ......
retry:
    // True when this thread's todo queue and transaction stack are both empty, meaning the thread is idle
    wait_for_proc_work = thread->transaction_stack == NULL &&
				list_empty(&thread->todo);
    ......

    // Update the looper state flags
    thread->looper |= BINDER_LOOPER_STATE_WAITING;
    if (wait_for_proc_work)
        // One more idle/ready thread
        proc->ready_threads++;

    binder_unlock(__func__);
    
    if (wait_for_proc_work) {
        // This branch is taken
        if (!(thread->looper & (BINDER_LOOPER_STATE_REGISTERED |
                                BINDER_LOOPER_STATE_ENTERED))) {
                wait_event_interruptible(binder_user_error_wait,
                                         binder_stop_on_user_error < 2);
        }
        binder_set_nice(proc->default_priority);
        if (non_block) {
            // Does proc.todo contain any binder_work?
            if (!binder_has_proc_work(proc, thread))
                ret = -EAGAIN;
        } else
            // In blocking mode, sleep until binder_proc.todo is not empty
            // proc->wait is the wait queue this thread is placed on
            ret = wait_event_freezable_exclusive(proc->wait, binder_has_proc_work(proc, thread));
    } else {
        ......
    }
    
    binder_lock(__func__);
    
    // Execution resumes here after the thread is woken =========================
    
    // One fewer waiting thread; clear the waiting flag
    if (wait_for_proc_work)
        proc->ready_threads--;
    thread->looper &= ~BINDER_LOOPER_STATE_WAITING;

    if (ret)
        return ret;
    
    while (1) {
        uint32_t cmd;
        struct binder_transaction_data tr;
        struct binder_work *w;
        struct binder_transaction *t = NULL;

        if (!list_empty(&thread->todo)) {
            w = list_first_entry(&thread->todo, struct binder_work,
                                     entry);
        } else if (!list_empty(&proc->todo) && wait_for_proc_work) {
            w = list_first_entry(&proc->todo, struct binder_work,
                                     entry);
        } else {
            ......
        }
        ......
        switch (w->type) {
            case BINDER_WORK_TRANSACTION: {
                // Recover the binder_transaction that embeds this binder_work
                t = container_of(w, struct binder_transaction, work);
            } break;
        }
        // Only BINDER_WORK_TRANSACTION sets t; for other work types keep looping.
        // When t is set there is a transaction to deliver and execution continues below
        if (!t)
            continue;
        ......
        break;
    }
done:

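    // No binder thread creation is currently pending, i.e. requested_threads == 0;
    // the process has no idle binder thread, i.e. ready_threads == 0;
    // the number of started threads is below the limit (15 by default);
    // the current thread has already entered its loop.
    // The branch is taken when all of the above hold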
    if (proc->requested_threads + proc->ready_threads == 0 &&
        proc->requested_threads_started < proc->max_threads &&
        (thread->looper & (BINDER_LOOPER_STATE_REGISTERED |
         BINDER_LOOPER_STATE_ENTERED)) /* the user-space code fails to */
         /*spawn a new thread if we leave this out */) {
            proc->requested_threads++;
            // Write the BR_SPAWN_LOOPER command into the read buffer
            if (put_user(BR_SPAWN_LOOPER, (uint32_t __user *)buffer))
                    return -EFAULT;
            binder_stat_br(proc, thread, BR_SPAWN_LOOPER);
    }
}
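
The spawn decision at the end of binder_thread_read boils down to one predicate. The sketch below restates it as a plain function; shouldSpawnLooper and its parameter names are my own, and the boolean looping parameter stands in for the REGISTERED/ENTERED flag test.

```cpp
#include <cassert>

// Returns true when the driver would write BR_SPAWN_LOOPER, under the four
// conditions listed in the excerpt above.
bool shouldSpawnLooper(int requestedThreads, int readyThreads,
                       int requestedThreadsStarted, int maxThreads,
                       bool looping) {
    assert(maxThreads >= 0);
    return requestedThreads + readyThreads == 0 &&   // nothing pending, nobody idle
           requestedThreadsStarted < maxThreads &&   // still below the limit
           looping;                                  // caller is a registered looper
}
```

With the default limit of 15, the predicate stops firing once 15 spawned threads have registered. Main threads that entered with BC_ENTER_LOOPER are not part of that count.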

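Back in the native layer, IPCThreadState::talkWithDriver returns, and the data read from the driver is handled by IPCThreadState::executeCommand. Here the BR_SPAWN_LOOPER case starts and registers another thread; the difference is that isMain = false.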

status_t IPCThreadState::executeCommand(int32_t cmd)
{
    switch ((uint32_t)cmd) {
    case BR_SPAWN_LOOPER:
        // Start a thread that loops listening to the driver
        // isMain is false here, so it registers as an ordinary binder thread via BC_REGISTER_LOOPER
        mProcess->spawnPooledThread(false);
        break;
    }
}

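The Sender Initiates Communication

Having covered Binder initialization, we continue with IPCThreadState::transact:

  1. writeTransactionData packages the data to be sent
  2. waitForResponse sends the data to the receiver and waits for the reply (after processing the data, the receiver sends the sender a BR_REPLY). In oneway (asynchronous) mode there is no need to wait for the receiver's reply

status_t IPCThreadState::transact(int32_t handle,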
                                  uint32_t code, const Parcel& data,
                                  Parcel* reply, uint32_t flags)
{
    ......
    if (err == NO_ERROR) {
        // Package the data
        err = writeTransactionData(BC_TRANSACTION, flags, handle, code, data, NULL);
    }

    if (err != NO_ERROR) {
        if (reply) reply->setError(err);
        return (mLastError = err);
    }
    // Without TF_ONE_WAY the sender has to wait for the receiver's reply
    if ((flags & TF_ONE_WAY) == 0) {
        if (reply) {
            // Send the data to the receiver and wait for the reply
            err = waitForResponse(reply);
        } else {
            Parcel fakeReply;
            err = waitForResponse(&fakeReply);
        }
    } else {
        // oneway: asynchronous, no reply to wait for
        err = waitForResponse(NULL, NULL);
    }
    return err;
}

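IPCThreadState::writeTransactionData packs the data Parcel, the handle and the other fields into a binder_transaction_data structure, then writes BC_TRANSACTION followed by that structure into the mOut Parcel, which holds what will be written to the kernel.
The fields to pay attention to here are tr.target.handle, tr.code, tr.data.ptr.buffer, tr.data.ptr.offsets, and cmd.

Note that if Binder objects need to be sent to the server, the positions of those objects within the data buffer are recorded in tr.data.ptr.offsets. This is common in two-way AIDL communication: when the client registers a callback with the server, the callback has its own BBinder, and registering it means sending that BBinder to the server.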

status_t IPCThreadState::writeTransactionData(int32_t cmd, uint32_t binderFlags,
    int32_t handle, uint32_t code, const Parcel& data, status_t* statusBuffer)
{
    binder_transaction_data tr;

    tr.target.ptr = 0;
    // Handle that refers to the server-side BBinder
    tr.target.handle = handle;
    // code is the code of the hello function here, TRANSACTION_hello
    tr.code = code;
    tr.flags = binderFlags;
    tr.cookie = 0;
    tr.sender_pid = 0;
    tr.sender_euid = 0;

    const status_t err = data.errorCheck();
    if (err == NO_ERROR) {
        tr.data_size = data.ipcDataSize();
        // data.ipcData() returns a pointer to the original payload
        tr.data.ptr.buffer = data.ipcData();
        tr.offsets_size = data.ipcObjectsCount()*sizeof(binder_size_t);
        // ipcObjects() points to the offsets of the Binder objects being passed to the server
        tr.data.ptr.offsets = data.ipcObjects();
    } else if (statusBuffer) {
        ......
    } else {
        ......
    }
    // cmd = BC_TRANSACTION here
    mOut.writeInt32(cmd);
    mOut.write(&tr, sizeof(tr));

    return NO_ERROR;
}
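
The wire layout that writeTransactionData produces, a 32-bit command followed immediately by a fixed-size structure, can be sketched with a plain byte buffer. Everything named here is hypothetical: TxnSketch is far smaller than binder_transaction_data, kCmdTransaction is a made-up value rather than the real BC_TRANSACTION, and readCommand only imitates the way the driver later reads the command and then copies the structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint32_t kCmdTransaction = 0x1234;  // made-up command value

// Hypothetical, much smaller stand-in for binder_transaction_data.
struct TxnSketch {
    std::uint32_t handle;
    std::uint32_t code;
    std::uint32_t flags;
};

// Append "cmd, then the struct" to the out buffer, like mOut.writeInt32 + mOut.write.
void writeCommand(std::vector<std::uint8_t>& out, std::uint32_t cmd, const TxnSketch& tr) {
    const auto* c = reinterpret_cast<const std::uint8_t*>(&cmd);
    out.insert(out.end(), c, c + sizeof(cmd));
    const auto* p = reinterpret_cast<const std::uint8_t*>(&tr);
    out.insert(out.end(), p, p + sizeof(tr));
}

// Read one "cmd + struct" record starting at pos and advance pos past it.
bool readCommand(const std::vector<std::uint8_t>& in, std::size_t& pos,
                 std::uint32_t& cmd, TxnSketch& tr) {
    assert(pos <= in.size());
    if (in.size() - pos < sizeof(cmd) + sizeof(tr)) return false;
    std::memcpy(&cmd, in.data() + pos, sizeof(cmd));
    pos += sizeof(cmd);
    std::memcpy(&tr, in.data() + pos, sizeof(tr));
    pos += sizeof(tr);
    return true;
}
```

Because the structure only carries pointers to the payload and to the offsets array, the record written into mOut stays small no matter how large the Parcel is; the payload itself is copied later, inside the driver.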

waitForResponse calls talkWithDriver in a loop and then reads the data that arrives in the mIn Parcel:

status_t IPCThreadState::waitForResponse(Parcel *reply, status_t *acquireResult)
{
    uint32_t cmd;
    int32_t err;

    while (1) {
        // Build a binder_write_read and exchange it with the driver
        if ((err=talkWithDriver()) < NO_ERROR) break;
        // ..... handling of the data read from the driver is analyzed later
    }
    return err;
}

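talkWithDriver is where communication through the driver actually starts:

  1. Wrap the mIn and mOut buffers in a binder_write_read structure
  2. Call ioctl, which reaches binder_ioctl_write_read in the driver. At this point mOut has data and mIn has none; the driver writes first and then reads, so it first enters the BC_TRANSACTION case of binder_thread_write
  3. It then enters binder_thread_read and handles the BINDER_WORK_TRANSACTION_COMPLETE case

status_t IPCThreadState::talkWithDriver(bool doReceive)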
{
    if (mProcess->mDriverFD <= 0) {
        return -EBADF;
    }
    // doReceive (default true) means the caller wants to receive the return commands from the binder driver
    binder_write_read bwr;
    // needRead is true when all data in mIn has been consumed
    const bool needRead = mIn.dataPosition() >= mIn.dataSize();
    // Write only when no reply is expected or mIn has no unread data
    const size_t outAvail = (!doReceive || needRead) ? mOut.dataSize() : 0;
    bwr.write_size = outAvail;
    bwr.write_buffer = (uintptr_t)mOut.data();
    if (doReceive && needRead) {
        // A reply is expected and mIn is empty, so data can be read into mIn
        // read_size = 256 here, the capacity set when IPCThreadState was initialized
        bwr.read_size = mIn.dataCapacity();
        bwr.read_buffer = (uintptr_t)mIn.data();
    } else {
        bwr.read_size = 0;
        bwr.read_buffer = 0;
    }

    if ((bwr.write_size == 0) && (bwr.read_size == 0)) return NO_ERROR;
    bwr.write_consumed = 0;
    bwr.read_consumed = 0;
    status_t err;
    do {
        // Call the driver's binder_ioctl_write_read, retrying if interrupted
        // It first runs the BC_TRANSACTION case of binder_thread_write
        // and then binder_thread_read handles BINDER_WORK_TRANSACTION_COMPLETE
        if (ioctl(mProcess->mDriverFD, BINDER_WRITE_READ, &bwr) >= 0)
            err = NO_ERROR;
        else
            err = -errno;
        if (mProcess->mDriverFD <= 0) {
            err = -EBADF;
        }
    } while (err == -EINTR);

    if (err >= NO_ERROR) {
        if (bwr.write_consumed > 0) {
            // Remove the portion that the driver has consumed
            if (bwr.write_consumed < mOut.dataSize())
                mOut.remove(0, bwr.write_consumed);
            else
                mOut.setDataSize(0);
        }
        if (bwr.read_consumed > 0) {
            // Data was read from the driver
            // Reset the size and read position of the input buffer
            mIn.setDataSize(bwr.read_consumed);
            mIn.setDataPosition(0);
        }
        ......
        return NO_ERROR;
    }
    return err;
}
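
The rule that decides how much talkWithDriver writes and reads in one ioctl is compact enough to isolate. The function below is a sketch with names of my own choosing (planTransfer, BwrSizes); it reproduces only the needRead / doReceive logic from the listing above and takes the Parcel state as plain numbers.

```cpp
#include <cassert>
#include <cstddef>

// The two sizes that end up in binder_write_read.
struct BwrSizes {
    std::size_t writeSize;
    std::size_t readSize;
};

// inPos/inSize/inCapacity describe mIn; outSize is mOut.dataSize().
BwrSizes planTransfer(bool doReceive, std::size_t inPos, std::size_t inSize,
                      std::size_t inCapacity, std::size_t outSize) {
    assert(inSize <= inCapacity);
    // All previously read data has been consumed, so a new read is allowed.
    const bool needRead = inPos >= inSize;
    BwrSizes sizes{};
    // Write unless the caller still has unread input it wants to process first.
    sizes.writeSize = (!doReceive || needRead) ? outSize : 0;
    // Read only when a reply is wanted and the input buffer is drained.
    sizes.readSize = (doReceive && needRead) ? inCapacity : 0;
    return sizes;
}
```

When both sizes come out as zero, talkWithDriver returns immediately without calling ioctl, which is the early return on bwr.write_size == 0 && bwr.read_size == 0 in the listing.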

What is finally written to the driver is a binder_write_read structure, laid out as follows:

[Figure: data encapsulation (数据封装.png)]

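Handling BC_TRANSACTION in the Driver Layer

drivers/android/binder.c

binder_ioctl_write_read writes first and then reads. As the code shows, it first copies the binder_write_read argument from user space, hands write_buffer and read_buffer to binder_thread_write and binder_thread_read respectively, and finally copies the structure back to user space.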

static int binder_ioctl_write_read(struct file *filp,
				unsigned int cmd, unsigned long arg,
				struct binder_thread *thread)
{
    int ret = 0;
    // Retrieve the binder_proc pointer stored in private_data by binder_open
    struct binder_proc *proc = filp->private_data;
    unsigned int size = _IOC_SIZE(cmd);
    void __user *ubuf = (void __user *)arg;
    struct binder_write_read bwr;

    if (size != sizeof(struct binder_write_read)) {
        ret = -EINVAL;
        goto out;
    }
    // Copy the bwr structure from user space
    if (copy_from_user(&bwr, ubuf, sizeof(bwr))) {
        ret = -EFAULT;
        goto out;
    }
    // write_size > 0 means there is data to write
    if (bwr.write_size > 0) {
        ret = binder_thread_write(proc, thread,
                                  bwr.write_buffer,
                                  bwr.write_size,
                                  &bwr.write_consumed);
        if (ret < 0) {
            bwr.read_consumed = 0;
            if (copy_to_user(ubuf, &bwr, sizeof(bwr)))
                    ret = -EFAULT;
            goto out;
        }
    }
    if (bwr.read_size > 0) {
        ret = binder_thread_read(proc, thread, bwr.read_buffer,
                                 bwr.read_size,
                                 &bwr.read_consumed,
                                 filp->f_flags & O_NONBLOCK);
        // Wake up waiting threads
        if (!list_empty(&proc->todo))
            wake_up_interruptible(&proc->wait);
        if (ret < 0) {
            if (copy_to_user(ubuf, &bwr, sizeof(bwr)))
                    ret = -EFAULT;
            goto out;
        }
    }
    // Copy bwr back to user space
    if (copy_to_user(ubuf, &bwr, sizeof(bwr))) {
        ret = -EFAULT;
        goto out;
    }
out:
    return ret;
}

binder_thread_write reads the cmd at the front of the write buffer and enters the BC_TRANSACTION case. It then copies the binder_transaction_data structure from user space and calls binder_transaction to start a binder transaction.

static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
    uint32_t cmd;
    // bwr->write_buffer
    void __user *buffer = (void __user *)(uintptr_t)binder_buffer;
    // Skip the data that has already been consumed
    void __user *ptr = buffer + *consumed;
    void __user *end = buffer + size;
    ......
    while (ptr < end && thread->return_error == BR_OK) {
        // Read the cmd that was written into mOut
        if (get_user(cmd, (uint32_t __user *)ptr))
            return -EFAULT;
        ......
        switch (cmd) {
        case BC_TRANSACTION:
        case BC_REPLY: {
            struct binder_transaction_data tr;
            // 从用户空间拷贝数据,即mOut中tr
            if (copy_from_user(&tr, ptr, sizeof(tr)))
                return -EFAULT;
            ptr += sizeof(tr);
            // 开始一个binder_transaction
            binder_transaction(proc, thread, &tr, cmd == BC_REPLY);
            break;
        }
    }
}
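
The write buffer is just a stream of "cmd, then payload" records. The following user-space sketch walks such a stream the same way the loop above does. The names toy_tr and toy_thread_write and the command values are made up for illustration; the real payload is a binder_transaction_data and the real commands are the BC_* ioctl codes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { TOY_BC_TRANSACTION = 1, TOY_BC_REPLY = 2 };   /* hypothetical values */

/* Hypothetical, much smaller stand-in for binder_transaction_data. */
struct toy_tr { uint32_t handle; uint32_t code; };

/* Parses records starting at *consumed. Returns the number of transactions
 * seen, or -1 on a truncated or unknown record. */
static int toy_thread_write(const uint8_t *buf, size_t size,
                            size_t *consumed, uint32_t *last_code)
{
    assert(buf != NULL && *consumed <= size);
    const uint8_t *ptr = buf + *consumed;            /* skip consumed data */
    const uint8_t *end = buf + size;
    int count = 0;

    while (ptr < end) {
        uint32_t cmd;
        struct toy_tr tr;
        if ((size_t)(end - ptr) < sizeof(cmd))
            return -1;
        memcpy(&cmd, ptr, sizeof(cmd));              /* plays the role of get_user */
        ptr += sizeof(cmd);
        switch (cmd) {
        case TOY_BC_TRANSACTION:
        case TOY_BC_REPLY:
            if ((size_t)(end - ptr) < sizeof(tr))
                return -1;
            memcpy(&tr, ptr, sizeof(tr));            /* plays the role of copy_from_user */
            ptr += sizeof(tr);
            *last_code = tr.code;
            count++;
            break;
        default:
            return -1;
        }
        *consumed = (size_t)(ptr - buf);
    }
    return count;
}
```

Because *consumed is updated after every record, a caller can resume from where the previous call stopped.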

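binder_transaction is the key function and covers a lot of ground, so here is a summary:

  1. binder_ref corresponds to BpBinder in the native layer, and binder_node corresponds to BBinder.
  2. Using the BpBinder.handle passed down from the native layer, the driver looks up the binder_ref that points at the receiver, and from it obtains the receiver's binder_node and binder_proc.
  3. It creates a binder_transaction and calls binder_alloc_buf to create a binder_buffer for the receiving process: physical memory is allocated and mapped into both the receiving process and kernel space, and the buffer is inserted into the receiving process's buffer list and assigned to binder_transaction.buffer. This buffer is where the receiver gets the sender's data. As mentioned earlier, binder_mmap only creates a binder_buffer backed by a single page of physical memory; the rest is allocated during communication, and this is where that happens.
  4. Fields such as binder_transaction_data.code are stored in the binder_transaction, and binder_transaction_data.data.ptr.buffer is copied from user space into the binder_buffer.data created above. The physical memory behind buffer.data is shared by the receiving process and the kernel, so this one copy completes the transfer from the sending process to the receiving process.
  5. binder_transaction.from is set to the sender's binder_thread, and the binder_transaction is pushed onto the sender thread's binder_thread.transaction_stack.
  6. binder_transaction.work gets the type BINDER_WORK_TRANSACTION and is inserted into the receiving process's binder_proc.todo queue, and the target process is woken up to handle it.
  7. A binder_work of type BINDER_WORK_TRANSACTION_COMPLETE is created to tell the sender that the send has completed; it is inserted into the sender thread's binder_thread.todo queue.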
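
Steps 3 and 4 rely on a simple layout inside the binder_buffer: the data comes first, and the offsets array starts at the next pointer-aligned address. A minimal sketch of that arithmetic follows; TOY_ALIGN, toy_offsets_start and toy_buffer_size are illustrative names, and the sketch assumes pointer-size alignment as in the excerpt below.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of a (a must be a power of two). */
#define TOY_ALIGN(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Byte offset inside the buffer at which the offsets array begins. */
static size_t toy_offsets_start(size_t data_size)
{
    /* the mask trick only works for power-of-two alignments */
    assert((sizeof(void *) & (sizeof(void *) - 1)) == 0);
    return TOY_ALIGN(data_size, sizeof(void *));
}

/* Total bytes needed for the data plus the offsets array. */
static size_t toy_buffer_size(size_t data_size, size_t offsets_size)
{
    return toy_offsets_start(data_size) + TOY_ALIGN(offsets_size, sizeof(void *));
}
```

For example, 3 bytes of data and no offsets still occupy one pointer-sized slot, and the offsets array would begin right after that slot.
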
static void binder_transaction(struct binder_proc *proc,
			       struct binder_thread *thread,
			       struct binder_transaction_data *tr, int reply)
{
    struct binder_transaction *t;
    struct binder_work *tcomplete;
    // target process
    struct binder_proc *target_proc;
    // target thread
    struct binder_thread *target_thread = NULL;
    // target binder node
    struct binder_node *target_node = NULL;
    // target todo list
    struct list_head *target_list;
    // wait queue of the target process
    wait_queue_head_t *target_wait;
    if (reply) {
        ......
    }else {
        if (tr->target.handle) {
            struct binder_ref *ref;
            // look up the binder_ref (Binder reference) for this handle, then get its binder_node (Binder entity)
            ref = binder_get_ref(proc, tr->target.handle);
            target_node = ref->node;
        } else {
            ......
        }
        // get the binder_proc that owns the binder_node
        target_proc = target_node->proc;
    }
    ......
    if (target_thread) {
        ......
    } else {
        // target_thread is NULL, so any thread of the receiving process may handle it
        // use the todo list of the target proc
        target_list = &target_proc->todo;
        target_wait = &target_proc->wait;
    }
    ......
    // allocate the binder_transaction
    t = kzalloc(sizeof(*t), GFP_KERNEL);
    if (t == NULL) {
        return_error = BR_FAILED_REPLY;
        goto err_alloc_t_failed;
    }
    binder_stats_created(BINDER_STAT_TRANSACTION);
    // allocate the binder_work used for the completion notice
    tcomplete = kzalloc(sizeof(*tcomplete), GFP_KERNEL);
    if (tcomplete == NULL) {
        return_error = BR_FAILED_REPLY;
        goto err_alloc_tcomplete_failed;
    }
    binder_stats_created(BINDER_STAT_TRANSACTION_COMPLETE);
    ......
    if (!reply && !(tr->flags & TF_ONE_WAY))
        // record the current thread as the sender
        t->from = thread;
    else
        t->from = NULL;
    // copy the binder_transaction_data fields into the binder_transaction
    t->code = tr->code;
    t->flags = tr->flags;
    ......
    // create the binder_buffer for this call, i.e. carve a block out of the mapped memory
    t->buffer = binder_alloc_buf(target_proc, tr->data_size,
		tr->offsets_size, !reply && (t->flags & TF_ONE_WAY));
	if (t->buffer == NULL) {
		return_error = BR_FAILED_REPLY;
		goto err_binder_alloc_buf_failed;
	}
    t->buffer->allow_user_free = 0;
    t->buffer->debug_id = t->debug_id;
    t->buffer->transaction = t;
    // set the target binder_node
    t->buffer->target_node = target_node;
    ......
    // offp points just past the aligned data; binder_transaction_data.data.ptr.offsets is stored there
    offp = (binder_size_t *)(t->buffer->data + ALIGN(tr->data_size, sizeof(void *)));
    // copy data.ptr.buffer of the user-space binder_transaction_data into the kernel,
    // i.e. into binder_transaction.buffer.data
    if (copy_from_user(t->buffer->data, (const void __user *)(uintptr_t)
                       tr->data.ptr.buffer, tr->data_size)) {
            return_error = BR_FAILED_REPLY;
            goto err_copy_data_failed;
    }
    // copy data.ptr.offsets of the user-space binder_transaction_data to offp
    if (copy_from_user(offp, (const void __user *)(uintptr_t)
                       tr->data.ptr.offsets, tr->offsets_size)) {
            return_error = BR_FAILED_REPLY;
            goto err_copy_data_failed;
    }
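    // the range offp..off_end holds the Binder objects the client passes to the server
    // the call analyzed here only sends a string and passes no callback Binder, so the loop body is skipped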
    off_end = (void *)offp + tr->offsets_size;
	off_min = 0;
    for (; offp < off_end; offp++) {
        ......
    }
    if (reply) {
            .....
    } else if (!(t->flags & TF_ONE_WAY)) {
        // BC_TRANSACTION and not oneway: push onto the sender thread's transaction stack
        t->need_reply = 1;
        t->from_parent = thread->transaction_stack;
        thread->transaction_stack = t;
    } else {
        ......
    }
    t->work.type = BINDER_WORK_TRANSACTION;
    // queue the BINDER_WORK_TRANSACTION on the target list
    // for this call the target list is the todo list of the server's proc
    list_add_tail(&t->work.entry, target_list);
    tcomplete->type = BINDER_WORK_TRANSACTION_COMPLETE;
    // queue the BINDER_WORK_TRANSACTION_COMPLETE on the current thread's todo list
    list_add_tail(&tcomplete->entry, &thread->todo);
    // wake up the target wait queue so the BINDER_WORK_TRANSACTION gets handled
    if (target_wait)
            wake_up_interruptible(target_wait);
    return;
}
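
Stripped of memory management and error handling, the routing part of binder_transaction boils down to "resolve the handle, queue one work item for the receiver, queue one completion for the sender". Here is a toy user-space model of just that part. All toy_* types are hypothetical simplifications, with fixed-size arrays standing in for the kernel's linked lists.

```c
#include <assert.h>
#include <stddef.h>

enum toy_work_type {
    TOY_WORK_NONE,
    TOY_WORK_TRANSACTION,
    TOY_WORK_TRANSACTION_COMPLETE
};

#define TOY_QUEUE_CAP 8
struct toy_queue  { enum toy_work_type items[TOY_QUEUE_CAP]; size_t len; };
struct toy_proc   { struct toy_queue todo; };                  /* receiver process */
struct toy_thread { struct toy_queue todo; };                  /* sender thread */
struct toy_node   { struct toy_proc *proc; };                  /* Binder entity */
struct toy_ref    { unsigned handle; struct toy_node *node; }; /* Binder reference */

static void toy_enqueue(struct toy_queue *q, enum toy_work_type w)
{
    assert(q->len < TOY_QUEUE_CAP);
    q->items[q->len++] = w;
}

/* handle -> ref, searched in the sender's own reference table */
static struct toy_ref *toy_get_ref(struct toy_ref *refs, size_t n, unsigned handle)
{
    for (size_t i = 0; i < n; i++)
        if (refs[i].handle == handle)
            return &refs[i];
    return NULL;
}

/* Returns 0 on success, -1 if the handle is unknown. */
static int toy_transaction(struct toy_ref *refs, size_t n,
                           struct toy_thread *sender, unsigned handle)
{
    struct toy_ref *ref = toy_get_ref(refs, n, handle);
    if (ref == NULL)
        return -1;
    /* ref -> node -> proc: the work lands on the receiver process's todo list */
    toy_enqueue(&ref->node->proc->todo, TOY_WORK_TRANSACTION);
    /* the completion notice lands on the sender thread's own todo list */
    toy_enqueue(&sender->todo, TOY_WORK_TRANSACTION_COMPLETE);
    return 0;
}
```

The two queues are the reason the sender later sees BR_TRANSACTION_COMPLETE while the receiver sees BR_TRANSACTION.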

The sender handles BINDER_WORK_TRANSACTION_COMPLETE

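After binder_thread_write, the sender moves on to binder_thread_read. thread.todo now holds a BINDER_WORK_TRANSACTION_COMPLETE work item, so the corresponding branch runs and writes cmd=BR_TRANSACTION_COMPLETE back to user space, which tells the sender that its part of the send is done.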

static int binder_thread_read(struct binder_proc *proc,
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
    while (1) {
        uint32_t cmd;
        struct binder_transaction_data tr;
        struct binder_work *w;
        struct binder_transaction *t = NULL;
        if (!list_empty(&thread->todo)) {
            // BINDER_WORK_TRANSACTION_COMPLETE was queued on the sender thread's todo list,
            // so this branch is taken and picks up that binder_work
            w = list_first_entry(&thread->todo, struct binder_work, entry);
        } else if (!list_empty(&proc->todo) && wait_for_proc_work) {
            ......
        } else {
            ......
        }
        switch (w->type) {
        case BINDER_WORK_TRANSACTION_COMPLETE: {
            // map the work type to the return command
            cmd = BR_TRANSACTION_COMPLETE;
            // write cmd back to the sender's user-space read buffer
            if (put_user(cmd, (uint32_t __user *)ptr))
                return -EFAULT;
            ptr += sizeof(uint32_t);
            binder_stat_br(proc, thread, cmd);
            // unlink and free the binder_work
            list_del(&w->entry);
            kfree(w);
            binder_stats_deleted(BINDER_STAT_TRANSACTION_COMPLETE);
        } break;
        ......
        }
        ......
    }
    ......
}

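Once the driver returns, IPCThreadState::talkWithDriver finishes this round and returns as well. Control is back in IPCThreadState::waitForResponse, which reads mIn (the bwr read_buffer) and handles the BR_TRANSACTION_COMPLETE that the driver wrote there.

After BR_TRANSACTION_COMPLETE has been handled, the sender still has to wait for the receiver's reply, so the loop calls talkWithDriver again and the thread goes to sleep in the driver's binder_thread_read until the reply arrives.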

status_t IPCThreadState::waitForResponse(Parcel *reply, status_t *acquireResult)
{
    uint32_t cmd;
    int32_t err;

    while (1) {
        if ((err=talkWithDriver()) < NO_ERROR) break;
        err = mIn.errorCheck();
        if (err < NO_ERROR) break;
        if (mIn.dataAvail() == 0) continue;
        cmd = (uint32_t)mIn.readInt32();
        switch (cmd) {
        case BR_TRANSACTION_COMPLETE:
            // reply is non-null here, so a later BR_XXX response from the server is expected
            // and the loop goes on to call talkWithDriver again
            if (!reply && !acquireResult) goto finish;
            break;
        }
	}
}
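
The control flow of that loop can be reduced to a tiny state machine over the BR_* commands coming back from the driver. The sketch below is a hypothetical model: toy_br and toy_wait_for_response are made-up names, and each array element stands for the result of one talkWithDriver round trip.

```c
#include <assert.h>
#include <stddef.h>

enum toy_br { TOY_BR_NOOP, TOY_BR_TRANSACTION_COMPLETE, TOY_BR_REPLY };

/* Returns how many driver round trips were needed before the wait ended,
 * or -1 if the command stream ran out first (the real thread would sleep). */
static int toy_wait_for_response(const enum toy_br *from_driver, size_t n,
                                 int expects_reply)
{
    assert(from_driver != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (from_driver[i]) {
        case TOY_BR_TRANSACTION_COMPLETE:
            /* a oneway-style caller is done as soon as the send completes */
            if (!expects_reply)
                return (int)(i + 1);
            break;                      /* otherwise keep waiting for the reply */
        case TOY_BR_REPLY:
            return (int)(i + 1);
        default:
            break;
        }
    }
    return -1;
}
```

With the stream { TOY_BR_TRANSACTION_COMPLETE, TOY_BR_REPLY }, a caller that expects a reply needs both round trips, while a caller that does not expect one stops after the first.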

The receiver handles the request

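As mentioned in the Binder initialization part, once a process starts its Binder thread pool, the threads read from the driver in a loop, sleep in binder_thread_read when there is nothing to do, and are woken when binder_proc.todo becomes non-empty. The sender has just put a BINDER_WORK_TRANSACTION work item into the receiver's binder_proc.todo, so a receiver thread wakes up to handle it.
From the binder_work queued by the sender, the driver recovers the enclosing binder_transaction, copies its data into a binder_transaction_data, sets cmd to BR_TRANSACTION, and copies both back to user space. target_node->ptr is the weak-reference pointer of the BBinder, and target_node->cookie is the pointer to the BBinder itself.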

static int binder_thread_read(struct binder_proc *proc,
			      struct binder_thread *thread,
			      binder_uintptr_t binder_buffer, size_t size,
			      binder_size_t *consumed, int non_block)
{
    .......
    if (wait_for_proc_work)
        // leaving sleep: one fewer idle thread
        proc->ready_threads--;
    // clear the waiting flag
    thread->looper &= ~BINDER_LOOPER_STATE_WAITING;
    while (1) {
        uint32_t cmd;
        struct binder_transaction_data tr;
        struct binder_work *w;
        struct binder_transaction *t = NULL;
        if (!list_empty(&thread->todo)) {
            ......
        } else if (!list_empty(&proc->todo) && wait_for_proc_work) {
            // take the BINDER_WORK_TRANSACTION binder_work queued by the sender
            w = list_first_entry(&proc->todo, struct binder_work, entry);
        } else {
            ......
        }
        switch (w->type) {
        case BINDER_WORK_TRANSACTION: {
            // recover the enclosing binder_transaction from its binder_work member
            t = container_of(w, struct binder_transaction, work);
        } break;
        }
    ......
    if (t->buffer->target_node) {
        // this branch is taken
        // binder_node of the receiver
        struct binder_node *target_node = t->buffer->target_node;

        tr.target.ptr = target_node->ptr;
        // target_node->cookie
        tr.cookie =  target_node->cookie;
        ......
        // set the return command
        cmd = BR_TRANSACTION;
    } else {
        ......
    }
    // fill tr from the binder_transaction
    tr.code = t->code;
    tr.flags = t->flags;
    tr.sender_euid = from_kuid(current_user_ns(), t->sender_euid);
    ......
    tr.data_size = t->buffer->data_size;
    tr.offsets_size = t->buffer->offsets_size;
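    // t->buffer->data is the kernel address of the sender's data; adding user_buffer_offset gives the receiver's user-space address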
    tr.data.ptr.buffer = (binder_uintptr_t)(
                            (uintptr_t)t->buffer->data +
                            proc->user_buffer_offset);
    tr.data.ptr.offsets = tr.data.ptr.buffer +
                            ALIGN(t->buffer->data_size,
                                sizeof(void *));
    // copy cmd and the binder_transaction_data struct to user space
    if (put_user(cmd, (uint32_t __user *)ptr))
            return -EFAULT;
    ptr += sizeof(uint32_t);
    if (copy_to_user(ptr, &tr, sizeof(tr)))
            return -EFAULT;
    ptr += sizeof(tr);
    ......
}
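
The step from binder_work back to binder_transaction works because the work item is embedded inside the transaction, so subtracting the member offset yields the enclosing struct. A minimal user-space sketch of the same trick follows; toy_container_of, toy_work and toy_transaction are illustrative names, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as the kernel's container_of, without its type checking. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_work { int type; };

struct toy_transaction {
    unsigned code;
    struct toy_work work;   /* embedded, like binder_transaction.work */
};

/* Given a pointer to the embedded work item, return the enclosing transaction. */
static struct toy_transaction *toy_work_to_transaction(struct toy_work *w)
{
    assert(w != NULL);
    return toy_container_of(w, struct toy_transaction, work);
}
```

This is why only the small binder_work needs to sit on the todo list: the queue entry itself is enough to find the whole transaction again.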

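When the driver returns on the receiver side, control is back in IPCThreadState::getAndExecuteCommand, right after talkWithDriver(). It reads the data from the driver out of mIn (the bwr read_buffer) and calls IPCThreadState::executeCommand to handle it.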

status_t IPCThreadState::getAndExecuteCommand()
{
    status_t result;
    int32_t cmd;
    // talk to the driver: first write BC_ENTER_LOOPER to it,
    // then read commands from it; with no work the thread sleeps here
    result = talkWithDriver();
    if (result >= NO_ERROR) {
        size_t IN = mIn.dataAvail();
        if (IN < sizeof(int32_t)) return result;
        cmd = mIn.readInt32();
        // max-thread-count bookkeeping
        ......
        // parse and handle the command
        result = executeCommand(cmd);
        // max-thread-count bookkeeping
        ......
    }
    return result;
}

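The BR_TRANSACTION case is taken.
binder_transaction_data.cookie is cast to a BBinder pointer, BBinder::transact is called to pass the data up towards the Java layer, and finally sendReply sends the reply to the client.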

status_t IPCThreadState::executeCommand(int32_t cmd)
{
    BBinder* obj;
    RefBase::weakref_type* refs;
    status_t result = NO_ERROR;
    switch ((uint32_t)cmd) {
    case BR_TRANSACTION:
        {
            binder_transaction_data tr;
            // read the binder_transaction_data struct out of mIn
            result = mIn.read(&tr, sizeof(tr));
            if (result != NO_ERROR) break;

            Parcel buffer;
            // wrap the binder_transaction_data payload in a Parcel
            buffer.ipcSetDataReference(
                reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
                tr.data_size,
                reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
                tr.offsets_size/sizeof(binder_size_t), freeBuffer, this);
            ......
            Parcel reply;
            status_t error;
            if (tr.target.ptr) {
                // use the BBinder weak reference to check that a strong reference can still be taken
                if (reinterpret_cast<RefBase::weakref_type*>(
                        tr.target.ptr)->attemptIncStrong(this)) {
                    // call BBinder::transact
                    error = reinterpret_cast<BBinder*>(tr.cookie)->transact(tr.code, buffer,
                            &reply, tr.flags);
                    reinterpret_cast<BBinder*>(tr.cookie)->decStrong(this);
                } else {
                    error = UNKNOWN_TRANSACTION;
                }

            } else {
                error = the_context_object->transact(tr.code, buffer, &reply, tr.flags);
            }

            if ((tr.flags & TF_ONE_WAY) == 0) {
                // not oneway: a reply must be sent to the client
                if (error < NO_ERROR) reply.setError(error);
                sendReply(reply, 0);
            } else {
                ......
            }
            ......
        }
        break;
    ......
    }
    ......
    return result;
}

Delivering the data to the Java layer

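The BBinder pointer here is actually a JavaBBinder, so BBinder::transact ends up in JavaBBinder::onTransact.
As for why it is a JavaBBinder, look at how a Java-layer Binder object is initialized: it ultimately creates the matching JavaBBinder in the native layer through JNI.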

frameworks\native\libs\binder\Binder.cpp

status_t BBinder::transact(
    uint32_t code, const Parcel& data, Parcel* reply, uint32_t flags)
{
    data.setDataPosition(0);
    status_t err = NO_ERROR;
    switch (code) {
        case PING_TRANSACTION:
            reply->writeInt32(pingBinder());
            break;
        default:
            err = onTransact(code, data, reply, flags);
            break;
    }

    if (reply != NULL) {
        reply->setDataPosition(0);
    }
    return err;
}
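
BBinder::transact is a small template method: it handles a few reserved codes itself and hands everything else to the virtual onTransact of the concrete subclass. The same shape can be sketched in plain C with a function pointer. Everything below is hypothetical (toy_binder, the toy codes and error values are not the real constants), and a single int32_t stands in for each Parcel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical codes and status values, chosen only for this sketch. */
enum { TOY_PING_TRANSACTION = 0, TOY_TRANSACTION_hello = 1 };
enum { TOY_NO_ERROR = 0, TOY_UNKNOWN_TRANSACTION = -1 };

struct toy_binder;
typedef int (*toy_on_transact_fn)(struct toy_binder *self, uint32_t code,
                                  int32_t in, int32_t *out);

/* The function pointer plays the role of the virtual onTransact. */
struct toy_binder { toy_on_transact_fn on_transact; };

/* Base-class entry point: reserved codes first, everything else is delegated. */
static int toy_transact(struct toy_binder *b, uint32_t code, int32_t in, int32_t *out)
{
    assert(b != NULL && b->on_transact != NULL);
    switch (code) {
    case TOY_PING_TRANSACTION:
        if (out != NULL)
            *out = TOY_NO_ERROR;
        return TOY_NO_ERROR;
    default:
        return b->on_transact(b, code, in, out);
    }
}

/* A "subclass" that only understands the hello code and answers in + 1. */
static int toy_hello_on_transact(struct toy_binder *self, uint32_t code,
                                 int32_t in, int32_t *out)
{
    (void)self;
    if (code != TOY_TRANSACTION_hello)
        return TOY_UNKNOWN_TRANSACTION;
    if (out != NULL)
        *out = in + 1;
    return TOY_NO_ERROR;
}
```

A JavaBBinder corresponds to a toy_binder whose on_transact forwards the call into Java instead of handling it natively.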

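JavaBBinder::onTransact calls up into the execTransact method of Binder.java through JNI, a reflection-style call that uses a method ID cached at registration time. frameworks\base\core\jni\android_util_Binder.cpp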

// class name of the Java Binder
const char* const kBinderPathName = "android/os/Binder";

static int int_register_android_os_Binder(JNIEnv* env)
{
    jclass clazz = FindClassOrDie(env, kBinderPathName);
    // cache the class, the Binder.execTransact method ID (looked up by its signature) and the mObject field ID
    gBinderOffsets.mClass = MakeGlobalRefOrDie(env, clazz);
    gBinderOffsets.mExecTransact = GetMethodIDOrDie(env, clazz, "execTransact", "(IJJI)Z");
    gBinderOffsets.mObject = GetFieldIDOrDie(env, clazz, "mObject", "J");
    // register the JNI methods
    return RegisterMethodsOrDie(
        env, kBinderPathName,
        gBinderMethods, NELEM(gBinderMethods));
}


class JavaBBinder : public BBinder
{
protected:
    virtual status_t onTransact(
        uint32_t code, const Parcel& data, Parcel* reply, uint32_t flags = 0)
    {
        JNIEnv* env = javavm_to_jnienv(mVM);
        // mExecTransact is the cached method ID of Binder.execTransact
        // call execTransact of Binder.java through JNI
        jboolean res = env->CallBooleanMethod(mObject, gBinderOffsets.mExecTransact,
            code, reinterpret_cast<jlong>(&data), reinterpret_cast<jlong>(reply), flags);

        if (env->ExceptionCheck()) {
            jthrowable excep = env->ExceptionOccurred();
            env->DeleteLocalRef(excep);
        }
        ......
        return res != JNI_FALSE ? NO_ERROR : UNKNOWN_TRANSACTION;
    }
};

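Back in the Java layer, Binder.execTransact calls Binder.onTransact. The Binder here is an IHelloInterface.Stub, so the call ends up in IHelloInterface.Stub.onTransact, which checks code and invokes the hello method of the anonymous IHelloInterface.Stub subclass defined in the remote service.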
frameworks\base\core\java\android\os\Binder.java

private boolean execTransact(int code, long dataObj, long replyObj,
            int flags) {
        Parcel data = Parcel.obtain(dataObj);
        Parcel reply = Parcel.obtain(replyObj);
        final boolean tracingEnabled = Binder.isTracingEnabled();
        boolean res;
        try {
            // dispatch to the subclass, here IHelloInterface.Stub.onTransact
            res = onTransact(code, data, reply, flags);
        }
        ......

The server sends the reply to the client

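One last step remains, because the client is still waiting for a response. Back in IPCThreadState::sendReply the familiar pattern repeats: BC_REPLY is written to the driver and eventually reaches the BC_REPLY case of binder_thread_write in the driver.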

status_t IPCThreadState::sendReply(const Parcel& reply, uint32_t flags)
{
    status_t err;
    status_t statusBuffer;
    err = writeTransactionData(BC_REPLY, flags, -1, 0, reply, &statusBuffer);
    if (err < NO_ERROR) return err;

    return waitForResponse(NULL, NULL);
}

binder_thread_write calls binder_transaction again, this time with the reply argument set to true.

static int binder_thread_write(struct binder_proc *proc,
			struct binder_thread *thread,
			binder_uintptr_t binder_buffer, size_t size,
			binder_size_t *consumed)
{
    ......
    // loop over the commands in the buffer
    while (ptr < end && thread->return_error == BR_OK) {
        ......
        switch (cmd) {
        case BC_TRANSACTION:
        case BC_REPLY: {
            struct binder_transaction_data tr;
            if (copy_from_user(&tr, ptr, sizeof(tr)))
                    return -EFAULT;
            ptr += sizeof(tr);
            // start a transaction back to the client
            binder_transaction(proc, thread, &tr, cmd == BC_REPLY);
            break;
        }
        ......
        }
    }
    ......
}

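The server-to-client reply follows much the same path as the client-initiated binder_transaction above, except that it takes the other branch:

  1. A BINDER_WORK_TRANSACTION work item is added to the client thread's todo queue.
  2. A BINDER_WORK_TRANSACTION_COMPLETE work item is added to the todo queue of the server's own thread.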
static void binder_transaction(struct binder_proc *proc,
			       struct binder_thread *thread,
			       struct binder_transaction_data *tr, int reply)
{	
    struct binder_transaction *t;
    struct binder_work *tcomplete;
    // target todo list
    struct list_head *target_list;
    // target wait queue
    wait_queue_head_t *target_wait;
    if (reply) {
            // transaction stack of the receiving (server) thread
            in_reply_to = thread->transaction_stack;
            if (in_reply_to == NULL) {
                    return_error = BR_FAILED_REPLY;
                    goto err_empty_call_stack;
            }
            binder_set_nice(in_reply_to->saved_priority);
            if (in_reply_to->to_thread != thread) {
                    return_error = BR_FAILED_REPLY;
                    in_reply_to = NULL;
                    goto err_bad_call_stack;
            }
            thread->transaction_stack = in_reply_to->to_parent;
            // binder_thread of the original sender (the client)
            target_thread = in_reply_to->from;
            if (target_thread == NULL) {
                    return_error = BR_DEAD_REPLY;
                    goto err_dead_binder;
            }
            // the binder_transaction on top of both stacks must be the same one
            if (target_thread->transaction_stack != in_reply_to) {
                    return_error = BR_FAILED_REPLY;
                    in_reply_to = NULL;
                    target_thread = NULL;
                    goto err_dead_binder;
            }
            target_proc = target_thread->proc;
	} else {
		......
	}
    
    if (target_thread) {
      e->to_thread = target_thread->pid;
      // the thread that started the call
      target_list = &target_thread->todo;
      target_wait = &target_thread->wait;
    } else {
      ...
    }
    // allocate the binder_transaction
    t = kzalloc(sizeof(*t), GFP_KERNEL);
    // allocate the binder_work used for the completion notice
    tcomplete = kzalloc(sizeof(*tcomplete), GFP_KERNEL);
    ......
    if (!reply && !(tr->flags & TF_ONE_WAY))
        ......
    else
        // a BC_REPLY needs no further reply
        t->from = NULL;
    // create the binder_buffer for this call, i.e. carve a block out of the mapped memory
    t->buffer = binder_alloc_buf(target_proc, tr->data_size, tr->offsets_size, !reply && (t->flags & TF_ONE_WAY));
    ......
    if (reply) {
        // pop this binder_transaction off the transaction stack
        binder_pop_transaction(target_thread, in_reply_to);
    } else if (!(t->flags & TF_ONE_WAY)) {
        ......
    } else {
        ......
    }
    
    t->work.type = BINDER_WORK_TRANSACTION;
    // queue the BINDER_WORK_TRANSACTION binder_work on the original sender thread's todo list
    list_add_tail(&t->work.entry, target_list);
    tcomplete->type = BINDER_WORK_TRANSACTION_COMPLETE;
    // queue the BINDER_WORK_TRANSACTION_COMPLETE on the current thread's todo list
    list_add_tail(&tcomplete->entry, &thread->todo);
    // wake up the original sender's wait queue
    if (target_wait)
            wake_up_interruptible(target_wait);
    return;
}
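
The stack checks in the reply branch are easier to follow with a toy model of the two transaction stacks. In the sketch below all toy_* names are hypothetical. The receive-side push is not shown in the excerpts above and is included here on the assumption that the driver links the transaction onto the receiving thread when it delivers BR_TRANSACTION. Unlike the kernel code, the model validates both stacks before it changes either of them.

```c
#include <assert.h>
#include <stddef.h>

struct toy_thread;

struct toy_txn {
    struct toy_thread *from;        /* sender thread */
    struct toy_thread *to_thread;   /* receiver thread */
    struct toy_txn *from_parent;    /* next entry on the sender's stack */
    struct toy_txn *to_parent;      /* next entry on the receiver's stack */
};

struct toy_thread { struct toy_txn *transaction_stack; };

/* Sender side of a synchronous call: push t onto the sender's stack. */
static void toy_send(struct toy_thread *sender, struct toy_txn *t)
{
    assert(sender != NULL && t != NULL);
    t->from = sender;
    t->from_parent = sender->transaction_stack;
    sender->transaction_stack = t;
}

/* Receiver side (assumed): push the same t onto the receiver's stack. */
static void toy_receive(struct toy_thread *receiver, struct toy_txn *t)
{
    t->to_thread = receiver;
    t->to_parent = receiver->transaction_stack;
    receiver->transaction_stack = t;
}

/* Reply: returns the thread to wake, or NULL if the stacks do not match. */
static struct toy_thread *toy_reply(struct toy_thread *replier)
{
    struct toy_txn *in_reply_to = replier->transaction_stack;
    if (in_reply_to == NULL || in_reply_to->to_thread != replier)
        return NULL;
    struct toy_thread *target = in_reply_to->from;
    /* both stack tops must be the very same transaction */
    if (target == NULL || target->transaction_stack != in_reply_to)
        return NULL;
    replier->transaction_stack = in_reply_to->to_parent;  /* pop on the receiver */
    target->transaction_stack = in_reply_to->from_parent; /* pop on the sender */
    return target;
}
```

Because the reply is routed through in_reply_to->from, it goes to the exact client thread that is blocked in waitForResponse rather than to the client process's shared todo list.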

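Next, the server thread handles BINDER_WORK_TRANSACTION_COMPLETE in binder_thread_read; the cmd becomes BR_TRANSACTION_COMPLETE and is returned to IPCThreadState::waitForResponse, which ends the server-side logic.

The client thread handles BINDER_WORK_TRANSACTION in binder_thread_read; the cmd becomes BR_REPLY and is likewise returned to IPCThreadState::waitForResponse, which ends the client-side logic. At this point the whole exchange is complete.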

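Finally, a diagram. It comes from the article 彻底理解Android Binder通信架构 ("Thoroughly Understanding the Android Binder Communication Architecture"). That article analyzes an app process calling a system service, whereas this one analyzes two app processes talking to each other. The flows are largely the same, so just read system_server in the diagram as another app process. [Figure: image.png]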

Afterword

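Source analysis ultimately means digging into the code yourself, and what it takes is patience; however detailed this article is, it can only serve as a starting point. One more note: service_manager, which belongs to the Binder system in the framework layer, does not use the IPCThreadState wrapper, although the rest works in much the same way. This article does not cover it, so read the source if you are interested.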