Android System Boot

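1. Android Architecture Overview

Android architecture diagram:

  Reference: developer.android.com/guide/platf…

From top to bottom, the Android operating system is divided into five layers: the System Apps layer, the Java Framework layer, the Native & Android Runtime layer, the HAL layer, and the Linux kernel layer.

The work of each layer, from top to bottom, is summarized below: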

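1.1 System Apps layer

The System Apps layer is the graphical interface through which users interact with the system. Put simply, it consists of applications that use system interfaces, such as Launcher, Settings, SystemUI, Email, Dialer, and Messaging. These applications usually depend on specific system interfaces, are tightly coupled to the system, and hold higher privileges than ordinary applications.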

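1.2 Java Framework layer

The Java Framework layer links the layers above and below it. Downward, it builds on the interfaces of the Native layer; upward, it provides the interfaces and components that the App layer needs. These APIs form the building blocks needed to create Android apps and simplify the reuse of core, modular system components and services, including the following:

  • A rich and extensible view system for building an app's UI, including lists, grids, text boxes, buttons, and even an embeddable web browser
  • A resource manager, providing access to non-code resources such as localized strings, graphics, and layout files
  • A notification manager, which lets any app display custom alerts in the status bar
  • An activity manager, which manages the lifecycle of apps and provides a common navigation back stack
  • Content Providers, which let apps access data from other apps (such as the Contacts app) or share their own data

Almost all the interfaces and components used in System Apps development, such as Activity, Broadcast Receiver, Content Provider, Service, and Fragment, are provided by the Java Framework layer.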

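1.3 Native & Android Runtime layer

The Native layer plays a role similar to that of the Java Framework layer. It is built mainly from native libraries and native code written in C/C++. Many core Android system components and services, such as Binder and multimedia, are provided upward by the Native layer. In other words, the Java APIs and many of the components implemented in the Java Framework layer depend on the Native layer.

Besides providing services upward, the Native layer also implements functionality specific to this layer, such as ART compilation and OpenGL graphics rendering.

Android Runtime (ART) is Android's managed runtime, that is, its virtual machine. It became the default runtime in Android 5.0, replacing the earlier Dalvik virtual machine. Rather than running standard Java bytecode, it executes DEX files, a bytecode format derived from Java bytecode and optimized for Android, which improves the efficiency of compilation and execution. Android Runtime serves both the Java Framework and System Apps.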

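1.4 HAL layer

HAL stands for Hardware Abstraction Layer. The HAL consists of multiple library modules, each of which implements an interface for a specific type of hardware component, such as the camera or the Bluetooth module. When a framework API makes a call to access device hardware, the Android system loads the library module for that hardware component.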

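1.5 Linux Kernel layer

The foundation of the Android platform is the Linux kernel. For example, Android Runtime (ART) relies on the Linux kernel for underlying functionality such as threading and low-level memory management. Using the Linux kernel lets Android take advantage of key security features and allows device manufacturers to develop hardware drivers for a well-known kernel.

This layer mainly includes drivers, memory management, process management, network protocols, and similar components.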

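2. Android Boot Architecture

Reference: gityuan.com/android/

The Android boot flow is shown in the figure above.

Booting starts at the bootloader and proceeds from the bottom up: Bootloader -> Linux kernel -> Native -> Java Framework -> Apps.

  The following sections go through the boot flow layer by layer.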

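3. Boot Flow

3.1 Loader layer

The Loader layer's main job is to load the vendor's hard-coded boot program and bring up the kernel.

  • Boot ROM: when the phone is powered off and the Power key is held down, the boot chip starts executing preset code burned into ROM, then loads the bootloader into RAM.

The bootrom (or Boot ROM) is a small piece of mask ROM or write-protected flash embedded in the processor chip (SoC). It contains the very first code the processor executes on power-up or reset. Depending on the configuration of certain strapping pins or internal fuses, it decides where to load the next stage of code from, and how or whether to verify its correctness or validity. It sometimes contains additional functionality that user code may use during or after boot.

  • Boot Loader: the program that runs before the Android system starts. It mainly checks RAM and initializes hardware parameters.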

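3.2 Kernel layer

  The Linux kernel is the foundation of the Android platform. The kernel's security mechanisms underpin Android's own, and device vendors can develop drivers against the kernel. The main tasks at the Kernel layer are starting the kernel's swapper process, starting the kthreadd process, and starting drivers.

  • swapper process (pid=0): also called the idle process. It is the first process, created from nothing by the kernel during system initialization. It initializes process management and memory management and loads the Display, Camera, and Binder drivers, among other work.
  • Process 1 (pid=1, ppid=0): created by process 0. Its entry function is kernel_init(), which runs in kernel mode. After finishing some initialization, kernel_init() enters user mode by executing the init program through run_init_process(). The final name of process 1 is determined by that user-space program. Once in user space, init parses and executes init.rc.
  • kthreadd process (pid=2, ppid=0): a Linux kernel process that creates kernel daemons such as the kworker worker threads, the ksoftirqd softirq threads, and thermal. kthreadd is the ancestor of all kernel threads.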

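3.2.1 The swapper process in detail

Reference: 【Linux内核|进程管理】0号线程swapper简介

Take a 64-bit ARM machine as an example. In the earliest phase of kernel initialization there is no notion of threads or processes. After the MMU is enabled, the first step of __primary_switched is to write the address of init_task to sp_el0; from then on, get_current() or current returns the task_struct of process 0. In the context of process 0, once the scheduler-related initialization is done, processes 1 and 2 are created. The scheduler is then started and init_task enters the idle state.

3.2.1.1 What swapper does

Once the assembly code finishes, execution enters the start_kernel() function in init/main.c. This function initializes the scheduler, creates processes 1 and 2, and finally turns init_task into the idle task.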

3.2.1.2 start_kernel

path:kernel-4.14/init/main.c

asmlinkage __visible void __init start_kernel(void)
{
        ......
        local_irq_disable(); // disable interrupts on the current CPU (the boot CPU)
        early_boot_irqs_disabled = true;

        /*
         * Interrupts are still disabled. Do necessary setups, then
         * enable them.
         */
        boot_cpu_init();
        page_address_init();
        pr_notice("%s", linux_banner);
        setup_arch(&command_line);
        ......
        // initialize the scheduler
        sched_init();
        ......
        // initialize timers, register clockevent devices, etc.
        time_init();
        ......
        // enable interrupts on the current CPU (the boot CPU)
        local_irq_enable();
        ......
        // calibrate loops_per_jiffy (lpj)
        calibrate_delay();
        // initialize pid_max, init_pid_ns, etc.
        pidmap_init();
        // create the thread_stack_cache kmem_cache
        thread_stack_cache_init();
        // create the cred_jar kmem_cache
        cred_init();
        // set up task_struct_cachep, rlimits and some other initialization
        fork_init();
        // initialize several other kmem_caches
        proc_caches_init();
        ......
        rest_init();                
        ......
}

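start_kernel() performs a large amount of initialization, including the scheduler and the timers.

3.2.1.2 Creating swapper processes for the other CPUs

The boot CPU uses the statically declared init_task structure, whose comm is swapper. During sched_init(), init_idle() renames it to swapper/<cpu>, so on the boot CPU it is usually swapper/0. After process 1 starts running, smp_init() calls idle_threads_init() to create the swapper processes for the non-boot CPUs, each of which is likewise named swapper/<cpu>. As shown in the figure: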

3.2.1.3 Creating processes 1 and 2

path:kernel-4.14/init/main.c

At the end of start_kernel(), the swapper process calls rest_init(), which creates process 1 and the kthreadd process.

/*
 * We need to finalize in a non-__init function or else race conditions
 * between the root thread and the init thread may cause start_kernel to
 * be reaped by free_initmem before the root thread has proceeded to
 * cpu_idle.
 *
 * gcc-3.4 accidentally inlines this function, so use noinline.
 */

static __initdata DECLARE_COMPLETION(kthreadd_done);

static noinline void __ref rest_init(void)
{
        struct task_struct *tsk;
        int pid;
        // start the RCU (Read-Copy-Update) mechanism
        rcu_scheduler_starting();
        /*
         * We need to spawn init first so that it obtains pid 1, however
         * the init task will end up wanting to create kthreads, which, if
         * we schedule it before we create kthreadd, will OOPS.
         */
        // create the kernel thread that becomes process 1 and record its pid
        pid = kernel_thread(kernel_init, NULL, CLONE_FS);
        ......
        // create the kthreadd process
        pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
        // kernel_init may create kernel threads, so it begins by waiting for kthreadd to exist; signal that here
        complete(&kthreadd_done);

        /*
         * The boot idle thread must execute schedule()
         * at least once to get things moving:
         */
        // the boot CPU's idle task must call schedule() so that processes 1 and 2 can run
        schedule_preempt_disabled();
        /* Call into cpu_idle with preempt disabled */
        // turn the swapper process into the idle task
        cpu_startup_entry(CPUHP_ONLINE);
}

Inside this function, kernel_thread() creates process 1 (kernel_init) and process 2 (kthreadd), and schedule_preempt_disabled() lets kernel_init and kthreadd be scheduled. Once this is done, the swapper process enters the idle state.
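The kthreadd_done handshake between rest_init() and kernel_init can be pictured with a small user-space model. This is only a sketch: completion_model and its helper functions are hypothetical stand-ins for the kernel's struct completion, complete(), and wait_for_completion(), and the blocking wait is replaced by a non-blocking check so that the example runs in a single thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct completion. */
struct completion_model {
    unsigned int done;  /* signals raised but not yet consumed */
};

void completion_init(struct completion_model *c)
{
    assert(c != NULL);
    c->done = 0;
}

/* Plays the role of complete(): record that the awaited event happened. */
void completion_signal(struct completion_model *c)
{
    assert(c != NULL);
    c->done++;
}

/* Non-blocking stand-in for wait_for_completion(): consume one signal
 * if there is one, otherwise report that the caller would have to wait. */
bool completion_try_wait(struct completion_model *c)
{
    assert(c != NULL);
    if (c->done == 0)
        return false;
    c->done--;
    return true;
}

/* Returns true if kernel_init would be allowed to proceed. */
bool boot_handshake_demo(bool kthreadd_created)
{
    struct completion_model kthreadd_done;

    completion_init(&kthreadd_done);
    if (kthreadd_created)
        completion_signal(&kthreadd_done);       /* rest_init() side */
    return completion_try_wait(&kthreadd_done);  /* kernel_init side */
}
```

Calling boot_handshake_demo(true) models the normal boot order, while boot_handshake_demo(false) models kernel_init running before kthreadd exists, in which case the real kernel thread would sleep instead of returning.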

3.2.1.4 Allowing processes 1 and 2 to run

path:kernel-4.14/kernel/sched/core.c

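void __sched schedule_preempt_disabled(void)
{
        // re-enable preemption without rescheduling
        sched_preempt_enable_no_resched();
        // schedule voluntarily so that processes 1 and 2 can run
        schedule();
        // disable preemption again; swapper is the last process to be picked, so nothing preempts it here
        preempt_disable();
}

3.2.1.5 Setting the init_task state

init_task is put into the idle state.

The main job of do_idle() is to check whether swapper needs to be rescheduled and to keep looping while it does not. If the processor supports it, the CPU enters a low-power state and waits to be woken up. Once rescheduling is needed, schedule_idle() switches to another process.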

path:kernel-4.14/kernel/sched/idle.c

void cpu_startup_entry(enum cpuhp_state state)
{
        ......
        // empty by default
        arch_cpu_idle_prepare();
        cpuhp_online_idle(state);
        while (1)
                do_idle();
}

/*
 * Generic idle loop implementation
 *
 * Called with polling cleared.
 */
static void do_idle(void)
{
        /*
         * If the arch has a polling bit, we maintain an invariant:
         *
         * Our polling bit is clear if we're not scheduled (i.e. if rq->curr !=
         * rq->idle). This means that, if rq->idle has the polling bit set,
         * then setting need_resched is guaranteed to cause the CPU to
         * reschedule.
         */

        __current_set_polling();
        quiet_vmstat();
        tick_nohz_idle_enter();

        while (!need_resched()) {
                check_pgt_cache();
                rmb();

                if (cpu_is_offline(smp_processor_id())) {
                        tick_nohz_idle_stop_tick_protected();
                        cpuhp_report_idle_dead();
                        arch_cpu_idle_dead();
                }

                local_irq_disable();
                arch_cpu_idle_enter();

                /*
                 * In poll mode we reenable interrupts and spin. Also if we
                 * detected in the wakeup from idle path that the tick
                 * broadcast device expired for us, we don't want to go deep
                 * idle as we know that the IPI is going to arrive right away.
                 */
                if (cpu_idle_force_poll || tick_check_broadcast_expired()) {
                        tick_nohz_idle_restart_tick();
                        cpu_idle_poll();
                } else {
                        cpuidle_idle_call();
                }
                arch_cpu_idle_exit();
        }

        /*
         * Since we fell out of the loop above, we know TIF_NEED_RESCHED must
         * be set, propagate it into PREEMPT_NEED_RESCHED.
         *
         * This is required because for polling idle loops we will not have had
         * an IPI to fold the state for us.
         */
        preempt_set_need_resched();
        tick_nohz_idle_exit();
        __current_clr_polling();

        /*
         * We promise to call sched_ttwu_pending() and reschedule if
         * need_resched() is set while polling is set. That means that clearing
         * polling needs to be visible before doing these things.
         */
        smp_mb__after_atomic();

        sched_ttwu_pending();
        schedule_idle();

        if (unlikely(klp_patch_pending(current)))
                klp_update_patch_state(current);
}

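3.2.2 Process 1

3.2.2.1 Starting process 1

kernel_init runs in kernel mode; after it enters user mode it becomes the init process.

Process 1 is created by the swapper process through kernel_thread(), so the entry point for creating and starting process 1 is swapper's rest_init() function.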

path:kernel-4.14/init/main.c

static noinline void __ref rest_init(void)
{
        ......
        // create process 1
        pid = kernel_thread(kernel_init, NULL, CLONE_FS);
        ......
}

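kernel_thread() calls the fork routine (_do_fork() in kernel 4.14, do_fork() in older kernels) to create the process. Once the process has been created, kernel_init() is invoked through the function pointer that was passed in.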

path:kernel-4.14/init/main.c

static int __ref kernel_init(void *unused)
{
        int ret;
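        // initialization work for the init process
        kernel_init_freeable();
        // wait for all asynchronous calls to finish; this must happen before init memory is freed
        async_synchronize_full();
        // free init memory
        ftrace_free_init_mem();
        free_initmem();
        mark_readonly();
        // mark the system as running
        system_state = SYSTEM_RUNNING;
        // set the default memory access policy for NUMA systems
        numa_default_policy();
        // tell RCU that the in-kernel boot phase has ended
        rcu_end_inkernel_boot();
        ......
        // ramdisk_execute_command is "/init"
        if (ramdisk_execute_command) {
                // run the init program in the root directory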
                ret = run_init_process(ramdisk_execute_command);
                if (!ret)
                        return 0;
                pr_err("Failed to execute %s (error %d)\n",
                       ramdisk_execute_command, ret);
        }

        /*
         * We try each of these until one succeeds.
         *
         * The Bourne shell can be used instead of init if we are
         * trying to recover a really broken machine.
         */
        // if execute_command is set (the init= kernel option), run that program
        if (execute_command) {
                ret = run_init_process(execute_command);
                if (!ret)
                        return 0;
                panic("Requested init %s failed (error %d).",
                      execute_command, ret);
        }
        // if neither ramdisk_execute_command nor execute_command produced a running init,
        // try /sbin/init, /etc/init, /bin/init and /bin/sh in turn
        if (!try_to_run_init_process("/sbin/init") ||
            !try_to_run_init_process("/etc/init") ||
            !try_to_run_init_process("/bin/init") ||
            !try_to_run_init_process("/bin/sh"))
                return 0;

        panic("No working init found.  Try passing init= option to kernel. "
              "See Linux Documentation/admin-guide/init.rst for guidance.");
}

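kernel_init() mainly does the following:

  1. It calls kernel_init_freeable() to finish the initialization needed by init, then frees the corresponding init memory.
  2. It then tries the programs named by ramdisk_execute_command and execute_command, running them with run_init_process(). If execute_command is set but cannot be run, the kernel panics. If neither yields a running program, it tries /sbin/init, /etc/init, /bin/init, and /bin/sh in order and stops at the first one that starts.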
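The selection order described above can be sketched in ordinary user-space C. Everything here is a hypothetical model: choose_init(), the can_exec_fn predicate type, and the sample predicates do not exist in the kernel, and a NULL return stands in for the kernel's panic() paths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: nonzero if `path` could be executed. */
typedef int (*can_exec_fn)(const char *path);

/* Fallback candidates, in the order kernel_init() tries them. */
static const char *const fallback_inits[] = {
    "/sbin/init", "/etc/init", "/bin/init", "/bin/sh",
};

/* Returns the path that would become init, or NULL where the
 * kernel would panic. */
const char *choose_init(const char *ramdisk_cmd, const char *cmdline_init,
                        can_exec_fn can_exec)
{
    size_t i;

    assert(can_exec != NULL);

    /* 1. The ramdisk init ("/init" by default); failure is not fatal. */
    if (ramdisk_cmd != NULL && can_exec(ramdisk_cmd))
        return ramdisk_cmd;

    /* 2. An explicit init= request; failure here is fatal. */
    if (cmdline_init != NULL)
        return can_exec(cmdline_init) ? cmdline_init : NULL;

    /* 3. The built-in fallback list; the first one that starts wins. */
    for (i = 0; i < sizeof(fallback_inits) / sizeof(fallback_inits[0]); i++)
        if (can_exec(fallback_inits[i]))
            return fallback_inits[i];

    return NULL;  /* "No working init found." */
}

/* Sample predicates for experimenting with the model. */
int only_bin_sh(const char *path) { return strcmp(path, "/bin/sh") == 0; }
int everything(const char *path)  { (void)path; return 1; }
int nothing(const char *path)     { (void)path; return 0; }
```

For instance, choose_init("/init", NULL, only_bin_sh) walks past the ramdisk init and the first three fallbacks before settling on /bin/sh, which is the recovery case mentioned in the kernel comment.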

The kernel_init_freeable() function is shown below:

path:kernel-4.14/init/main.c

static noinline void __init kernel_init_freeable(void)
{
        /*
         * Wait until kthreadd is all set-up.
         */
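        // wait for kthreadd_done, which rest_init() completes once the kthreadd process has been created
        wait_for_completion(&kthreadd_done);
        // allow all GFP flags, so that allocations may block, including on I/O
        gfp_allowed_mask = __GFP_BITS_MASK;
        // let the init process allocate physical pages
        set_mems_allowed(node_states[N_MEMORY]);
        // store init's pid in cad_pid; cad means ctrl-alt-del, so init is the process that handles the ctrl-alt-del signal
        cad_pid = task_pid(current);
        // set the maximum number of CPUs for SMP bring-up and mark that many CPUs as present
        smp_prepare_cpus(setup_max_cpus);
        // initialize the workqueues
        workqueue_init();
        // finish initializing memory-management internals
        init_mm_internals();
        do_pre_smp_initcalls();
        // start the watchdog threads, which monitor the running state of the CPUs
        lockup_detector_init();
        // bring up the cores other than cpu0
        smp_init();
        // initialize the scheduler domains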
        sched_init_smp();

        page_alloc_init_late();
        // initialize devices, drivers, etc.
        do_basic_setup();

        /* Open the /dev/console on the rootfs, this should never fail */
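        // open /dev/console as file descriptor 0, the init process's standard input
        if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
                pr_err("Warning: unable to open an initial console.\n");
        // duplicate fd 0 as fd 1: standard output
        (void) sys_dup(0);
        // duplicate fd 0 as fd 2: standard error
        (void) sys_dup(0);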
        /*
         * check if there is an early userspace init.  If yes, let it do all
         * the work
         */
        // if ramdisk_execute_command is not set, default it to "/init"
        if (!ramdisk_execute_command)
                ramdisk_execute_command = "/init";
        // check whether the file named by ramdisk_execute_command is accessible; if not, mount the root filesystem
        if (sys_access((const char __user *) ramdisk_execute_command, 0) != 0) {
                ramdisk_execute_command = NULL;
                prepare_namespace();
        }

        /*
         * Ok, we have completed the initial bootup, and
         * we're essentially up and running. Get rid of the
         * initmem segments and start the user-mode stuff..
         *
         * rootfs is available now, try loading the public keys
         * and default modules
         */

        integrity_load_keys();
        // load the default modules, i.e. the I/O scheduler (elevator) algorithm
        load_default_modules();
}

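kernel_init_freeable() brings up SMP, starts the watchdog, initializes devices and drivers, opens standard input and output, initializes the filesystem, and so on.

SMP stands for Symmetrical Multi-Processing. With this technology, a system can run multiple processors at the same time, sharing memory and the other resources of the host.

do_basic_setup() initializes many devices and drivers. The function is shown below: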

path:kernel-4.14/init/main.c

/*
 * Ok, the machine is now initialized. None of the devices
 * have been touched yet, but the CPU subsystem is up and
 * running, and memory and process management works.
 *
 * Now we can finally start doing some real work..
 */
static void __init do_basic_setup(void)
{
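        // on SMP systems, initialize the cpuset subsystem of the kernel control groups
        cpuset_init_smp();
        // initialize shared memory
        shmem_init();
        // initialize the device drivers
        driver_init();
        // create the /proc/irq directory and a subdirectory for every interrupt in the system
        init_irq_proc();
        // run the kernel's constructors
        do_ctors();
        // enable usermodehelper
        usermodehelper_enable();
        // walk the initcall_levels array and call each initcall; this mainly initializes devices, drivers and filesystems. Keeping the functions in an array makes the mechanism easy to extend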
        do_initcalls();
}
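The idea behind do_initcalls(), walking a table of function pointers level by level, can be sketched in user space as follows. This is a simplified model rather than kernel code: the real kernel collects initcalls into linker sections, whereas the arrays, function names, and order_log used here are hypothetical and written out by hand.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*initcall_t)(void);

#define MAX_CALLS 8

int calls_made;            /* how many initcalls have run */
int order_log[MAX_CALLS];  /* ids in the order they ran */

static int record(int id)
{
    assert(calls_made < MAX_CALLS);
    order_log[calls_made++] = id;
    return 0;
}

/* Hypothetical initcalls, one id each so the order can be checked. */
int core_setup(void)   { return record(1); }
int bus_setup(void)    { return record(2); }
int driver_setup(void) { return record(3); }
int late_setup(void)   { return record(4); }

/* One NULL-terminated array per level. */
static initcall_t level_core[]   = { core_setup, NULL };
static initcall_t level_device[] = { bus_setup, driver_setup, NULL };
static initcall_t level_late[]   = { late_setup, NULL };

static initcall_t *levels[] = { level_core, level_device, level_late };

/* Runs every initcall, lower levels first; returns how many ran. */
int run_initcalls(void)
{
    size_t level;
    initcall_t *fn;

    calls_made = 0;
    for (level = 0; level < sizeof(levels) / sizeof(levels[0]); level++)
        for (fn = levels[level]; *fn != NULL; fn++)
            (*fn)();
    return calls_made;
}
```

Adding a new subsystem in this model means appending one function pointer to the right level array, which is the extensibility benefit the comment on do_initcalls() refers to.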

The driver_init() function is shown below:

path:kernel-4.14/drivers/base/init.c
/**
 * driver_init - initialize driver model.
 *
 * Call the driver model init functions to initialize their
 * subsystems. Called early from init/main.c.
 */ 
void __init driver_init(void)
{
        /* These are the core pieces */
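        // register the devtmpfs filesystem
        devtmpfs_init();
        // initialize part of the driver model: kset devices and kobjects dev, dev/block, dev/char
        devices_init();
        // initialize the bus subsystem of the driver model: kset bus, devices/system
        buses_init();
        // initialize the class subsystem of the driver model: kset class
        classes_init();
        // initialize the firmware subsystem of the driver model: kobject firmware
        firmware_init();
        // initialize the hypervisor subsystem of the driver model: kobject hypervisor
        hypervisor_init();

        /* These are also core pieces, but must come after the
         * core core pieces.
         */
        // initialize the bus/platform subsystem; this is the bus type to which all platform devices and drivers attach
        platform_bus_init();
        // initialize the devices/system/cpu subsystem, which holds CPU-related attributes
        cpu_dev_init();
        // initialize the devices/system/memory subsystem, which holds memory-related attributes such as block size
        memory_dev_init();
        // register the container bus type
        container_dev_init();
        // initialize the core that creates, accesses and interprets the device tree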
        of_core_init();
}

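With these driver subsystems built, the overall framework for Linux device drivers is in place and the initialization in kernel_init_freeable() is essentially complete. Next, look at the run_init_process() function that kernel_init() calls afterwards.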

static int run_init_process(const char *init_filename)
{
        argv_init[0] = init_filename;
        return do_execve(getname_kernel(init_filename),
                (const char __user *const __user *)argv_init,
                (const char __user *const __user *)envp_init);
}

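run_init_process() calls do_execve() to execute the init program, at which point the thread becomes the user-mode process 1, that is, the init process.

3.2.2.2 Summary of process 1 startup

The init process is the first user process started by the Linux kernel. Its creation can be summarized as: swapper process (pid=0) -> kernel process 1 -> init process (user process 1).

Kernel process 1 and user process 1 differ as follows:

  • Kernel process 1 runs in kernel mode and executes kernel code; user process 1 runs in user mode.
  • Kernel process 1 is created by swapper, the kernel's first process; user process 1 evolves from kernel process 1.
  • Kernel process 1 has no name of its own and is just a function or chain of functions starting at kernel_init(); user process 1 is named init.
  • Kernel process 1 becomes user process 1 (the init process) by executing the init program with execve(), without any call to fork(). Kernel process 1 and user process 1 are therefore one and the same process, seen at different times and in different states.

The main functions of process 1:

  • Set up the conditions the system needs to run, for example bringing up SMP, starting the watchdog, initializing devices and drivers, opening standard input and output, and initializing the filesystem.
  • Load the init program and enter user space.

3.2.3 Process 2

Process 2 is the kthreadd process.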

3.2.3.1 Starting the kthreadd process

kthreadd is also started in rest_init().

static noinline void __ref rest_init(void)
{
        ......
        // create process 1
        pid = kernel_thread(kernel_init, NULL, CLONE_FS);
        ......
        // create the kthreadd process
        pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
        ......
        // signal that process 2 (kthreadd) has been created
        complete(&kthreadd_done);
        ......
}

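rest_init() creates process 1 first and then process 2.

Creating a process requires passing the function it will execute. From the kernel_thread() call in rest_init(), we can see that the body of kthreadd is the kthreadd() function.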

path:kernel-4.14/kernel/kthread.c

int kthreadd(void *unused)
{
        struct task_struct *tsk = current;

        /* Setup a clean context for our children to inherit. */
        set_task_comm(tsk, "kthreadd");
        ignore_signals(tsk);
        // allow kthreadd to run on any CPU
        set_cpus_allowed_ptr(tsk, cpu_all_mask);
        set_mems_allowed(node_states[N_MEMORY]);

        current->flags |= PF_NOFREEZE;
        cgroup_init_kthreadd();

        for (;;) {
                // set the current state to interruptible sleep
                set_current_state(TASK_INTERRUPTIBLE);
                // nothing to create: give up the CPU
                if (list_empty(&kthread_create_list))
                        schedule();
                // there are threads to create: mark ourselves as running
                __set_current_state(TASK_RUNNING);
                // take the lock that protects the list
                spin_lock(&kthread_create_lock);
                // handle the queued requests one by one
                while (!list_empty(&kthread_create_list)) {
                        struct kthread_create_info *create;
                        // take the first entry from the list
                        create = list_entry(kthread_create_list.next,
                                            struct kthread_create_info, list);
                        // remove it from the request list
                        list_del_init(&create->list);
                        spin_unlock(&kthread_create_lock);
                        // create the thread
                        create_kthread(create);

                        spin_lock(&kthread_create_lock);
                }
                // release the lock
                spin_unlock(&kthread_create_lock);
        }

        return 0;
}

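As the code shows, the job of the kthreadd process is to wait for creation requests. Inside an infinite loop, if the request list is empty the process gives up the CPU and sleeps; if it is not empty, it takes the requests off the list one by one and creates the corresponding threads.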
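The loop body can be modelled in user space with a plain singly linked list. This is a single-threaded sketch under simplifying assumptions: there is no locking or sleeping, and create_request, create_list, request_kthread(), and kthreadd_drain() are hypothetical names standing in for kthread_create_info, kthread_create_list, kthread_create(), and one pass of the for (;;) loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kthread_create_info. */
struct create_request {
    int (*threadfn)(void *data);
    void *data;
    int created;                  /* set once the request is handled */
    struct create_request *next;
};

/* Hypothetical stand-in for kthread_create_list. */
struct create_list {
    struct create_request *head;
    struct create_request *tail;
};

/* Caller side: append a creation request to the list. */
void request_kthread(struct create_list *list, struct create_request *req)
{
    assert(list != NULL && req != NULL);
    req->created = 0;
    req->next = NULL;
    if (list->tail != NULL)
        list->tail->next = req;
    else
        list->head = req;
    list->tail = req;
}

/* One pass of the kthreadd loop body. Returns the number of requests
 * handled; 0 means the real kthreadd would go back to sleep. */
int kthreadd_drain(struct create_list *list)
{
    int handled = 0;

    assert(list != NULL);
    while (list->head != NULL) {
        struct create_request *req = list->head;

        list->head = req->next;   /* plays the role of list_del_init() */
        if (list->head == NULL)
            list->tail = NULL;
        req->next = NULL;
        req->created = 1;         /* stands in for create_kthread(create) */
        handled++;
    }
    return handled;
}

/* Two requests queued, then one drain pass; returns 2 on success. */
int kthreadd_demo(void)
{
    struct create_list list = { NULL, NULL };
    struct create_request a = { 0 }, b = { 0 };
    int n;

    if (kthreadd_drain(&list) != 0)   /* empty list: nothing to do */
        return -1;
    request_kthread(&list, &a);
    request_kthread(&list, &b);
    n = kthreadd_drain(&list);
    return (a.created && b.created) ? n : -1;
}
```

The real code additionally drops kthread_create_lock around create_kthread(), so that callers can keep queueing requests while a thread is being created.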

The function used to create a thread is create_kthread():

path:kernel-4.14/kernel/kthread.c

static void create_kthread(struct kthread_create_info *create)
{
        int pid;

#ifdef CONFIG_NUMA
        current->pref_node_fork = create->node;
#endif
        /* We want our own signal handler (we take no signals by default). */
        // create the thread with kernel_thread(); its body is kthread
        pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
        if (pid < 0) {
                /* If user was SIGKILLed, I release the structure. */
                struct completion *done = xchg(&create->done, NULL);

                if (!done) {
                        kfree(create);
                        return;
                }
                create->result = ERR_PTR(pid);
                complete(done);
        }
}

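The kernel creates threads through the kernel_thread() interface; kernel-mode process 1 and the kthreadd process itself are both created this way. Every thread created in create_kthread() has kthread as its body.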

path:kernel-4.14/kernel/kthread.c

static int kthread(void *_create)
{
        /* Copy data: it's on kthread's stack */
        struct kthread_create_info *create = _create;
        int (*threadfn)(void *data) = create->threadfn;
        void *data = create->data;
        struct completion *done;
        struct kthread *self;
        int ret;

        self = kmalloc(sizeof(*self), GFP_KERNEL);
        set_kthread_struct(self);

        /* If user was SIGKILLed, I release the structure. */
        done = xchg(&create->done, NULL);
        if (!done) {
                kfree(create);
                do_exit(-EINTR);
        }

        if (!self) {
                create->result = ERR_PTR(-ENOMEM);
                complete(done);
                do_exit(-ENOMEM);
        }

        self->flags = 0;
        self->data = data;
        init_completion(&self->exited);
        init_completion(&self->parked);
        current->vfork_done = &self->exited;

        /* OK, tell user we're spawned, wait for stop or wakeup */
        __set_current_state(TASK_UNINTERRUPTIBLE);
        create->result = current;
        complete(done);
        // sleep
        schedule();

        ret = -EINTR;
        // after wake-up, continue only if we have not been asked to stop
        if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
                cgroup_kthread_ready();
                __kthread_parkme(self);
                // run the function supplied by the caller
                ret = threadfn(data);
        }
        do_exit(ret);
}

In kthread(), once the thread has been created it sleeps until someone wakes it with wake_up_process(). After it is woken, it runs threadfn(data) unless it has been asked to stop.
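The decision kthread() makes after wake-up can be reduced to a few lines of user-space C. This is an illustrative sketch: kthread_model, kthread_after_wakeup(), double_it(), and kthread_demo() are hypothetical names, the should_stop field stands in for the KTHREAD_SHOULD_STOP bit, and the return value stands in for the argument the kernel passes to do_exit().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state kept in struct kthread. */
struct kthread_model {
    int (*threadfn)(void *data);  /* body supplied by the caller */
    void *data;
    bool should_stop;             /* models KTHREAD_SHOULD_STOP */
};

/* What happens after the new thread is woken for the first time. */
int kthread_after_wakeup(struct kthread_model *self)
{
    int ret = -EINTR;             /* result if the body never runs */

    assert(self != NULL && self->threadfn != NULL);
    if (!self->should_stop)
        ret = self->threadfn(self->data);
    return ret;                   /* the kernel calls do_exit(ret) */
}

/* Sample thread body: doubles the int that data points to. */
int double_it(void *data)
{
    return 2 * *(int *)data;
}

/* Runs the model with or without a pending stop request. */
int kthread_demo(bool stop_requested)
{
    int value = 21;
    struct kthread_model k = { double_it, &value, stop_requested };

    return kthread_after_wakeup(&k);
}
```

With no stop request the caller's function runs and its result becomes the exit code; with a stop request pending before the first wake-up, the body is skipped and the thread exits with -EINTR.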

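3.2.3.2 Summary of kthreadd startup

  • Some thread A (the circle at the top left) calls kthread_create() to create a new thread and then blocks. kthread_create() wraps the request and adds it to the list that kthreadd watches.
  • When kthreadd sees a request on the list, it leaves its sleep state and calls create_kthread(), which ends up in kernel_thread() -> do_fork to create the thread. The body of the new thread is kthread.
  • Once the new thread has been created it runs kthread, and the kthreadd thread goes back to sleep, waiting to create more threads.
  • After kthread_create() returns, thread A wakes the newly created thread with wake_up_process() at an appropriate time.
  • The new thread is woken inside kthread and checks whether it has been asked to stop. If not, it runs the thread body supplied by the caller. (The thread body changes along the way: first the default kthread, then the caller's threadfn. The thread may also call do_exit() and exit directly.)

The kthreadd process is created by swapper through kernel_thread() and always runs in kernel space. Its job is to create and manage the other kernel threads: it loops in the kthreadd() function, creating a thread for each request held on the global kthread_create_list. Requests made through kthread_create() are added to this list, so all such kernel threads have kthreadd as their parent, directly or indirectly.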

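3.2.4 Kernel layer summary

The processes started by the kernel are shown below:

3.3 Native layer

In the Native layer, the init process (pid=1) is in charge. It spawns the various system services and daemons, the most important of which is the zygote process. zygote is the ancestor of the Java layer: every Java-layer process depends on zygote to be created.

3.3.1 The main function

In the Native layer, the init process also needs to start services such as Media Server.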

/system/core/init/main.cpp

#include "init.h"
int main(int argc, char** argv) {
    android::init::main(argc, argv);
}

main() calls android::init::main(), which is declared in init.h and implemented in init.cpp.

/system/core/init/init.cpp

namespace android {
namespace init {
int main(int argc, char** argv) {
    // run as ueventd
    if (!strcmp(basename(argv[0]), "ueventd")) {
        return ueventd_main(argc, argv);
    }
    // run as watchdogd
    if (!strcmp(basename(argv[0]), "watchdogd")) {
        return watchdogd_main(argc, argv);
    }
    // run as a subcontext
    if (argc > 1 && !strcmp(argv[1], "subcontext")) {
        InitKernelLogging(argv);
        const BuiltinFunctionMap function_map;
        return SubcontextMain(argc, argv, &function_map);
    }

    if (REBOOT_BOOTLOADER_ON_PANIC) {
        InstallRebootSignalHandlers();
    }
    // determine whether this is the first stage of init
    bool is_first_stage = (getenv("INIT_SECOND_STAGE") == nullptr);
    // first stage
    if (is_first_stage) {
        ......
        // create and mount the basic filesystems
        mount("tmpfs", "/dev", "tmpfs", MS_NOSUID, "mode=0755");
        mkdir("/dev/pts", 0755);
        mkdir("/dev/socket", 0755);
        mount("devpts", "/dev/pts", "devpts", 0, NULL);
        #define MAKE_STR(x) __STRING(x)
        mount("proc", "/proc", "proc", 0, "hidepid=2,gid=" MAKE_STR(AID_READPROC));
        // Don't expose the raw commandline to unprivileged processes.
        chmod("/proc/cmdline", 0440);
        gid_t groups[] = { AID_READPROC };
        setgroups(arraysize(groups), groups);
        mount("sysfs", "/sys", "sysfs", 0, NULL);
        mount("selinuxfs", "/sys/fs/selinux", "selinuxfs", 0, NULL);

        mknod("/dev/kmsg", S_IFCHR | 0600, makedev(1, 11));

        if constexpr (WORLD_WRITABLE_KMSG) {
            mknod("/dev/kmsg_debug", S_IFCHR | 0622, makedev(1, 11));
        }

        mknod("/dev/random", S_IFCHR | 0666, makedev(1, 8));
        mknod("/dev/urandom", S_IFCHR | 0666, makedev(1, 9));

        // Mount staging areas for devices managed by vold
        // See storage config details at http://source.android.com/devices/storage/
        mount("tmpfs", "/mnt", "tmpfs", MS_NOEXEC | MS_NOSUID | MS_NODEV,
              "mode=0755,uid=0,gid=1000");
        // /mnt/vendor is used to mount vendor-specific partitions that can not be
        // part of the vendor partition, e.g. because they are mounted read-write.
        mkdir("/mnt/vendor", 0755);

        // Now that tmpfs is mounted on /dev and we have /dev/kmsg, we can actually
        // talk to the outside world...
        InitKernelLogging(argv);

        LOG(INFO) << "init first stage started!";

        if (!DoFirstStageMount()) {
            LOG(FATAL) << "Failed to mount required partitions early ...";
        }

        SetInitAvbVersionInRecovery();

        // Enable seccomp if global boot option was passed (otherwise it is enabled in zygote).
        global_seccomp();

        // Set up SELinux, loading the SELinux policy.
        SelinuxSetupKernelLogging();
        SelinuxInitialize();

        // We're in the kernel domain, so re-exec init to transition to the init domain now
        // that the SELinux policy has been loaded.
        if (selinux_android_restorecon("/init", 0) == -1) {
            PLOG(FATAL) << "restorecon failed of /init failed";
        }

        setenv("INIT_SECOND_STAGE", "true", 1);

        static constexpr uint32_t kNanosecondsPerMillisecond = 1e6;
        uint64_t start_ms = start_time.time_since_epoch().count() / kNanosecondsPerMillisecond;
        setenv("INIT_STARTED_AT", std::to_string(start_ms).c_str(), 1);

        char* path = argv[0];
        char* args[] = { path, nullptr };
        execv(path, args);

        // execv() only returns if an error happened, in which case we
        // panic and never fall through this conditional.
        PLOG(FATAL) << "execv(\"" << path << "\") failed";
    }
    
    // second stage
    // At this point we're in the second stage of init.
    InitKernelLogging(argv);
    LOG(INFO) << "init second stage started!";
    ......
    // Indicate that booting is in progress to background fw loaders, etc.
    close(open("/dev/.booting", O_WRONLY | O_CREAT | O_CLOEXEC, 0000));
    // initialize the system properties
    property_init();
    ......
    // start the system property service
    start_property_service();
    set_usb_controller();
    ......
    // get the ActionManager and the ServiceList
    ActionManager& am = ActionManager::GetInstance();
    ServiceList& sm = ServiceList::GetInstance();
    // load the boot scripts, i.e. parse init.rc
    LoadBootScripts(am, sm);
    ......
}

}  // namespace init
}  // namespace android

android::init::main() uses LoadBootScripts() to load the init.rc configuration files.

static void LoadBootScripts(ActionManager& action_manager, ServiceList& service_list) {
    Parser parser = CreateParser(action_manager, service_list);

    std::string bootscript = GetProperty("ro.boot.init_rc", "");
    if (bootscript.empty()) {
        // parse the configuration files
        parser.ParseConfig("/init.rc");
        if (!parser.ParseConfig("/system/etc/init")) {
            late_import_paths.emplace_back("/system/etc/init");
        }
        if (!parser.ParseConfig("/product/etc/init")) {
            late_import_paths.emplace_back("/product/etc/init");
        }
        if (!parser.ParseConfig("/odm/etc/init")) {
            late_import_paths.emplace_back("/odm/etc/init");
        }
        if (!parser.ParseConfig("/vendor/etc/init")) {
            late_import_paths.emplace_back("/vendor/etc/init");
        }
    } else {
        parser.ParseConfig(bootscript);
    }
}

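To sum up, the main() function of the init process mainly does two things:

  • Initialize certain system properties
  • Parse init.rc

The processes that init (pid=1) starts in the boot architecture, such as zygote, servicemanager, MediaServer, and bootanimation, are all defined in init.rc.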
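The way android::init::main() tells its two stages apart, by checking the INIT_SECOND_STAGE environment variable before re-executing itself, can be sketched in plain C. This is a minimal illustration: is_first_stage() mirrors the check in the listing, while stage_to_run(), finish_first_stage(), and reset_stage_marker() are hypothetical helpers, and the execv() of /init that follows in the real code is left out.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() and unsetenv() */

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Same test as in init: first stage if the marker is not set. */
bool is_first_stage(void)
{
    return getenv("INIT_SECOND_STAGE") == NULL;
}

/* Which stage a freshly started init would run: 1 or 2. */
int stage_to_run(void)
{
    return is_first_stage() ? 1 : 2;
}

/* End of the first stage: leave the marker in the environment.
 * The real init then calls execv() on itself, and the new image
 * inherits the marker. Returns 0 on success. */
int finish_first_stage(void)
{
    return setenv("INIT_SECOND_STAGE", "true", 1);
}

/* Helper for experiments: clear the marker. Returns 0 on success. */
int reset_stage_marker(void)
{
    return unsetenv("INIT_SECOND_STAGE");
}
```

Because the environment survives execv(), the re-executed binary starts from main() again but takes the second-stage path, which is where the property service is started and init.rc is parsed.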

3.3.2 init.rc

init.rc syntax: Android init.rc syntax

path: system/core/rootdir/init.rc

# Copyright (C) 2012 The Android Open Source Project
#
# IMPORTANT: Do not create world writable files or directories.
# This is a common source of Android security bugs.
#

"【import <filename>: import another init configuration file, extending the current configuration.】"
import /init.environ.rc
import /init.usb.rc
import /init.${ro.hardware}.rc
import /init.${ro.zygote}.rc
import /init.trace.rc

"【Trigger early-init: the lines below are run during the early-init stage】"
on early-init
    # Set init and its forked children's oom_adj.
    write /proc/1/oom_score_adj -1000
    "【write: open the file at <path> and write one or more strings to it】"
    # Apply strict SELinux checking of PROT_EXEC on mmap/mprotect calls.
    write /sys/fs/selinux/checkreqprot 0
 
    # Set the security context for the init process.
    # This should occur before anything else (e.g. ueventd) is started.
    "【Right after it starts, the init process calls setcon to set its own security context to u:r:init:s0, that is, it sets the domain of the init process to init.】"
    setcon u:r:init:s0
 
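    # Set the security context of /adb_keys if present.
    "【Restore the given file to the security context specified in the file_contexts configuration】"
    restorecon /adb_keys

    "【Run the start ueventd command. ueventd is a service defined later.】"
    start ueventd

    "【mkdir <path> [mode] [owner] [group]   // create the directory <path>, optionally with a mode, owner and group. If none are given, the default permissions are 755 and the directory belongs to the root user and the root group.】"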
    # create mountpoints
    mkdir /mnt 0775 root system
 
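on init
    "【Set the reference for the system clock; for example 0 means GMT, i.e. Greenwich Mean Time】"
    sysclktz 0

"【Set the kernel log level】"
loglevel 6 ####
    write /proc/bootprof "INIT: on init start" ####

    "【symlink <target> <path>    // create a symbolic link at <path> that points to <target>.】"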
    # Backward compatibility
    symlink /system/etc /etc
    symlink /sys/kernel/debug /d
 
    # Right now vendor lives on the same filesystem as system,
    # but someday that may change.
    symlink /system/vendor /vendor

    "【Create the directory <path>, optionally with a mode, owner and group.】"
    # Create cgroup mount point for cpu accounting
    mkdir /acct
    mount cgroup none /acct cpuacct
    mkdir /acct/uid

    "【mount <type> <device> <dir> [ <mountoption> ]   // mount the given device on the directory <dir>. <device> may name an mtd block device in the form mtd@name. <mountoption> includes ro, rw, remount, noatime, ...】"
    # Create cgroup mount point for memory
    mount tmpfs none /sys/fs/cgroup mode=0750,uid=0,gid=1000
    mkdir /sys/fs/cgroup/memory 0750 root system
    mount cgroup none /sys/fs/cgroup/memory memory
    write /sys/fs/cgroup/memory/memory.move_charge_at_immigrate 1
    "【chown <owner> <group> <path>   // change the owner and group of a file.】"

    "【The lines that follow are similar and are omitted】"
    .....
 
# Healthd can trigger a full boot from charger mode by signaling this
# property when the power button is held.
on property:sys.boot_from_charger_mode=1
    "【Stop all running services that belong to the given service class】"
    class_stop charger
    "【Trigger an event, queueing that action behind another action (used to order actions)】"
    trigger late-init
 
# Load properties from /system/ + /factory after fs mount.
on load_all_props_action
    "【Load properties from /system and /vendor. Included in init.rc by default.】"
    load_all_props
 
# Indicate to fw loaders that the relevant mounts are up.
on firmware_mounts_complete
    "【Remove the file at the given path】"
    rm /dev/.booting
 
# Mount filesystems and start core system services.
on late-init
    "【Trigger an event. Used to queue one action behind another.】"
    trigger early-fs
    trigger fs
    trigger post-fs
    trigger post-fs-data
 
    # Load properties from /system/ + /factory after fs mount. Place
    # this in another action so that the load will be scheduled after the prior
    # issued fs triggers have completed.
    trigger load_all_props_action
 
    # Remove a file to wake up anything waiting for firmware.
    trigger firmware_mounts_complete
 
    trigger early-boot
    trigger boot
 
 
on post-fs
    ...
    "【一些创造目录,建立链接,更改权限的操作,这里省略】"
 
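on post-fs-data
    ...
    "【Operations that create directories, create links, and change permissions; omitted here】"

    "【restorecon: restores the given file to the security context specified in file_contexts】"
    restorecon /data/mediaserver

    "【setprop: sets system property <name> to <value>, that is, sets a system property as a key-value pair】"
    # Reload policy from /data/security if present.
    setprop selinux.reload_policy 1

    "【restorecon_recursive: recursively restores the given directory to the security contexts specified in file_contexts】"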
    # Set SELinux security contexts on upgrade or policy update.
    restorecon_recursive /data
 
    # If there is no fs-post-data action in the init.<device>.rc file, you
    # must uncomment this line, otherwise encrypted filesystems
    # won't work.
    # Set indication (checked by vold) that we have finished this action
    #setprop vold.post_fs_data_done 1
 
on boot
    "【Initialize networking】"
    # basic network init
    ifup lo
    "【Set the host name to localhost】"
    hostname localhost
    "【Set the domain name to localdomain】"
    domainname localdomain

    "【Set resource limits】"
    # set RLIMIT_NICE to allow priorities from 19 to -20
    setrlimit 13 40 40

    "【Some chmod, chown, and similar operations are omitted here】"
   ...
 
 
    # Define default initial receive window size in segments.
    setprop net.tcp.default_init_rwnd 60

    "【class_start: starts all services of class core】"
    class_start core
 
on nonencrypted
    class_start main
    class_start late_start
 
on property:vold.decrypt=trigger_default_encryption
    start defaultcrypto
 
on property:vold.decrypt=trigger_encryption
    start surfaceflinger
    start encrypt
 
on property:sys.init_log_level=*
    loglevel ${sys.init_log_level}
 
on charger
    class_start charger
 
on property:vold.decrypt=trigger_reset_main
    class_reset main
 
on property:vold.decrypt=trigger_load_persist_props
    load_persist_props
 
on property:vold.decrypt=trigger_post_fs_data
    trigger post-fs-data
 
on property:vold.decrypt=trigger_restart_min_framework
    class_start main
 
on property:vold.decrypt=trigger_restart_framework
    class_start main
    class_start late_start
 
on property:vold.decrypt=trigger_shutdown_framework
    class_reset late_start
    class_reset main
 
on property:sys.powerctl=*
    powerctl ${sys.powerctl}
 
# system server cannot write to /proc/sys files,
# and chown/chmod does not work for /proc/sys/ entries.
# So proxy writes through init.
on property:sys.sysctl.extra_free_kbytes=*
    write /proc/sys/vm/extra_free_kbytes ${sys.sysctl.extra_free_kbytes}
 
# "tcp_default_init_rwnd" Is too long!
on property:sys.sysctl.tcp_def_init_rwnd=*
    write /proc/sys/net/ipv4/tcp_default_init_rwnd ${sys.sysctl.tcp_def_init_rwnd}

"【Daemons】"
## Daemon processes to be run by init.
##
service ueventd /sbin/ueventd
    class core
    critical
    seclabel u:r:ueventd:s0

"【Logging daemon】"
service logd /system/bin/logd
    class core
    socket logd stream 0666 logd logd
    socket logdr seqpacket 0666 logd logd
    socket logdw dgram 0222 logd logd
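    seclabel u:r:logd:s0

"【healthd is an intermediary introduced after Android 4.4. It listens for battery events from the lower layers and passes battery data up to BatteryService in the Framework layer, which uses it to compute battery state】"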
service healthd /sbin/healthd
    class core
    critical
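    seclabel u:r:healthd:s0

"【Console process】"
service console /system/bin/sh
    "【class: assigns the service to a class. Services of the same class are started or stopped together; the default class name is default】"
    class core
    "【console: the service needs a console】"
    console
    "【disabled: the service is not started automatically; it must be started explicitly by name】"
    disabled
    "【user: switches to the given user before executing the service; the default is root. As of Android M, this option should be used even when the service requires Linux capabilities. Previously, a process had to run as root to acquire them】"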
    user shell
    seclabel u:r:shell:s0
 
on property:ro.debuggable=1
    start console
 
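# Start the adbd service process
service adbd /sbin/adbd --root_seclabel=u:r:su:s0
    class core
    "【socket: creates a unix domain socket named /dev/socket/<name> and passes its file descriptor to the service process. The type must be dgram, stream, or seqpacket; user and group default to 0. seclabel is the SELinux security context of the socket; it defaults to the context of the service, as set by seclabel】"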
    socket adbd stream 660 system system
    disabled
    seclabel u:r:adbd:s0
 
# adbd on at boot in emulator
on property:ro.kernel.qemu=1
    start adbd

"【Memory management service (low memory killer daemon): frees memory when memory runs low】"
service lmkd /system/bin/lmkd
    class core
    critical
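    socket lmkd seqpacket 0660 system system

"【ServiceManager is a daemon that maintains Binder communication between system services and clients.
Binder is the most widely used communication mechanism in Android. It consists of the Client, the Server, ServiceManager, and the Binder driver. The Client, the Server, and ServiceManager run in user space, while the Binder driver runs in kernel space. The Binder driver is the core component and ServiceManager provides auxiliary management: before communicating, both Client and Server must first contact ServiceManager. ServiceManager manages Servers and lets Clients look them up.】"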
service servicemanager /system/bin/servicemanager
    class core
    user system
    group system
    critical
    onrestart restart healthd
    "【When servicemanager restarts, the zygote service is restarted as well】"
    onrestart restart zygote
    onrestart restart media
    onrestart restart surfaceflinger
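    onrestart restart drm

"【Vold is short for Volume Daemon. It is the control center of the external storage system on Android: the background process that manages and controls external storage devices】"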
service vold /system/bin/vold
    class core
    socket vold stream 0660 root mount
    ioprio be 2

"【netd is the background daemon responsible for network management and control on Android】"
service netd /system/bin/netd
    class main
    socket netd stream 0660 root system
    socket dnsproxyd stream 0660 root inet
    socket mdns stream 0660 root system
    socket fwmarkd stream 0660 root inet

"【debuggerd is a daemon started by init during boot. It is mainly responsible for dumping information about running processes to a file or to the console】"
service debuggerd /system/bin/debuggerd
    class main
 
service debuggerd64 /system/bin/debuggerd64
    class main

"【The Android RIL (Radio Interface Layer) provides the abstraction layer between the Telephony services and the radio hardware】"
# for using TK init.modem.rc rild-daemon setting
#service ril-daemon /system/bin/rild
#    class main
#    socket rild stream 660 root radio
#    socket rild-debug stream 660 radio system
#    user root
#    group radio cache inet misc audio log

"【Provides the system-wide surface composer, which composites the 2D and 3D surfaces of all applications.】"
service surfaceflinger /system/bin/surfaceflinger
    class core
    user system
    group graphics drmrpc
    onrestart restart zygote

"【drmserver is the DRM (Digital Rights Management) service, which handles content protection for protected media. It is unrelated to the Direct Rendering Manager of the Linux graphics stack, which shares the same abbreviation.】"
#make sure drm server has rights to read and write sdcard ####
service drm /system/bin/drmserver
    class main
    user drm
    # group drm system inet drmrpc ####
    group drm system inet drmrpc sdcard_r ####

"【Media service; needs no further explanation】"
service media /system/bin/mediaserver
    class main
    user root ####
#   google default ####
#   user media    ####
    group audio camera inet net_bt net_bt_admin net_bw_acct drmrpc mediadrm media sdcard_r system net_bt_stack ####
#   google default ####
#   group audio camera inet net_bt net_bt_admin net_bw_acct drmrpc mediadrm ####
 
    ioprio rt 4

"【Device encryption services】"
# One shot invocation to deal with encrypted volume.
service defaultcrypto /system/bin/vdc --wait cryptfs mountdefaultencrypted
    disabled
    "【oneshot: do not restart the service when it exits】"
    oneshot
    # vold will set vold.decrypt to trigger_restart_framework (default
    # encryption) or trigger_restart_min_framework (other encryption)
 
# One shot invocation to encrypt unencrypted volumes
service encrypt /system/bin/vdc --wait cryptfs enablecrypto inplace default
    disabled
    oneshot
    # vold will set vold.decrypt to trigger_restart_framework (default
    # encryption)

"【Boot animation service】"
service bootanim /system/bin/bootanimation
    class core
    user graphics
#    group graphics audio ####
    group graphics media audio ####
    disabled
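    oneshot

"【In Android, PackageManagerService manages information about all installed packages as well as app installation and removal. The installation and removal themselves, however, are not performed by PackageManagerService; it calls into the installd service to do that work.】"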
service installd /system/bin/installd
    class main
    socket installd stream 600 system system
 
service flash_recovery /system/bin/install-recovery.sh
    class main
    seclabel u:r:install_recovery:s0
    oneshot

"【VPN-related services】"
service racoon /system/bin/racoon
    class main
    socket racoon stream 600 system system
    # IKE uses UDP port 500. Racoon will setuid to vpn after binding the port.
    group vpn net_admin inet
    disabled
    oneshot

"【The mtpd command on Android can be used to connect to a VPN】"
service mtpd /system/bin/mtpd
    class main
    socket mtpd stream 600 system system
    user vpn
    group vpn net_admin inet net_raw
    disabled
    oneshot
 
service keystore /system/bin/keystore /data/misc/keystore
    class main
    user keystore
    group keystore drmrpc

"【dumpstate can be used to collect all kinds of information about the device】"
service dumpstate /system/bin/dumpstate -s
    class main
    socket dumpstate stream 0660 shell log
    disabled
    oneshot

"【mdnsd is the daemon for multicast DNS and DNS service discovery.】"
service mdnsd /system/bin/mdnsd
    class main
    user mdnsr
    group inet net_raw
    socket mdnsd stream 0660 mdnsr inet
    disabled
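    oneshot

"【Lets the shutdown flow continue: runs uncrypt before the device goes into recovery】"
service pre-recovery /system/bin/uncrypt
    class main
    disabled
    "【oneshot: do not restart the service when it exits】"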
    oneshot

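Summary: init.rc mainly does the following:

  • Preparatory work such as creating device nodes and adjusting permissions.
  • Starts daemons such as ueventd, logd, healthd, console, adbd, and lmkd.
  • Starts key services such as servicemanager (the Binder service registry), bootanim (the boot animation), and surfaceflinger.
  • Spawns the zygote process. zygote is the first Java process in Android and the parent of all other Java processes.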
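
To make the structure of the listing above easier to see, here is a minimal Java sketch that groups init.rc-style text into "on" (action) and "service" sections. The class InitRcSketch and its Section type are hypothetical helpers written for this article; they are not part of AOSP and do not reproduce the real init parser, which also handles imports, option validation, and property expansion.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not AOSP code): groups init.rc lines into sections. */
final class InitRcSketch {
    /** One "on <trigger>" or "service <name> <path>" header plus its body lines. */
    static final class Section {
        final String kind;    // "on" or "service"
        final String header;  // the rest of the header line
        final List<String> body = new ArrayList<>();

        Section(String kind, String header) {
            this.kind = kind;
            this.header = header;
        }
    }

    static List<Section> parse(String text) {
        List<Section> sections = new ArrayList<>();
        Section current = null;
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            if (line.startsWith("on ") || line.startsWith("service ")) {
                int space = line.indexOf(' ');
                current = new Section(line.substring(0, space),
                        line.substring(space + 1).trim());
                sections.add(current);
            } else if (current != null) {
                current.body.add(line); // a command or option of the open section
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        String rc = String.join("\n",
                "on boot",
                "    ifup lo",
                "    class_start core",
                "",
                "# Daemon processes to be run by init.",
                "service lmkd /system/bin/lmkd",
                "    class core",
                "    critical");
        for (Section s : parse(rc)) {
            System.out.println(s.kind + " [" + s.header + "] -> " + s.body);
        }
    }
}
```

Each section keeps its header and body separate, which mirrors how an action carries a list of commands while a service carries a list of options.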

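3.3.3 Starting zygote

During the parsing stage, init's parser (init_parser.cpp) reads the relevant rc files. Since Android 7.0, the services that used to be defined in init.rc have each been split into their own rc file.

zygote is defined in system/core/rootdir/init.zygote*.rc. Take init.zygote64.rc as an example: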

system/core/rootdir/init.zygote64.rc

service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server
    class main
    priority -20
    user root
    group root readproc reserved_disk
    socket zygote stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart audioserver
    onrestart restart cameraserver
    onrestart restart media
    onrestart restart netd
    onrestart restart wificond
    writepid /dev/cpuset/foreground/tasks

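The init.zygote64.rc above defines a service named zygote whose executable is /system/bin/app_process64, creates a socket (a socketinfo structure) for socket communication, and declares five onrestart restart commands plus two onrestart write commands (the green part in the figure).

In Android 9.0, taking the X10P device as an example, many services use the onrestart keyword to restart other services, for example servicemanager, zygote, and surfaceflinger. However, zygote is restarted only when servicemanager, surfaceflinger, zygote itself, or system_server is killed.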

./frameworks/native/cmds/servicemanager/servicemanager.rc
service servicemanager /system/bin/servicemanager
    class core animation
    user system
    group system readproc
    critical
    onrestart restart healthd
    onrestart restart zygote
    onrestart restart audioserver
    onrestart restart media
    onrestart restart surfaceflinger
    onrestart restart inputflinger
    onrestart restart drm
    onrestart restart cameraserver
    onrestart restart keystore
    onrestart restart gatekeeperd
    writepid /dev/cpuset/system-background/tasks
    shutdown critical
    
system/core/rootdir/init.zygote64.rc
service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server
    class main
    priority -20
    user root
    group root readproc reserved_disk
    socket zygote stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart audioserver
    onrestart restart cameraserver
    onrestart restart media
    onrestart restart netd
    onrestart restart wificond
    writepid /dev/cpuset/foreground/tasks
    
./frameworks/native/services/surfaceflinger/surfaceflinger.rc    
service surfaceflinger /system/bin/surfaceflinger
    class core animation
    user system
    group graphics drmrpc readproc
    onrestart restart zygote
    writepid /dev/stune/foreground/tasks
    socket pdx/system/vr/display/client     stream 0666 system graphics u:object_r:pdx_display_client_endpoint_socket:s0
    socket pdx/system/vr/display/manager    stream 0666 system graphics u:object_r:pdx_display_manager_endpoint_socket:s0
    socket pdx/system/vr/display/vsync      stream 0666 system graphics u:object_r:pdx_display_vsync_endpoint_socket:s0    
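
The three listings above can be checked mechanically. The following Java sketch collects the "onrestart restart <name>" lines of each service and then answers which services directly restart a given one. OnRestartSketch is a hypothetical helper written for this article, and the rc text in main() is abbreviated from the listings; it looks only at direct onrestart entries and does not model how init itself schedules restarts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not AOSP code): inspects onrestart relations in rc text. */
final class OnRestartSketch {
    /** Maps each service to the services named in its "onrestart restart" lines. */
    static Map<String, List<String>> restartTargets(String rcText) {
        Map<String, List<String>> targets = new LinkedHashMap<>();
        String current = null;
        for (String raw : rcText.split("\n")) {
            String[] tok = raw.trim().split("\\s+");
            if (tok.length >= 2 && tok[0].equals("service")) {
                current = tok[1];
                targets.put(current, new ArrayList<>());
            } else if (tok[0].equals("on")) {
                current = null; // an action section ends the current service
            } else if (current != null && tok.length == 3
                    && tok[0].equals("onrestart") && tok[1].equals("restart")) {
                targets.get(current).add(tok[2]);
            }
        }
        return targets;
    }

    /** Services whose own restart directly triggers a restart of victim. */
    static List<String> whoRestarts(Map<String, List<String>> targets, String victim) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : targets.entrySet()) {
            if (e.getValue().contains(victim)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String rc = String.join("\n",
                "service servicemanager /system/bin/servicemanager",
                "    onrestart restart healthd",
                "    onrestart restart zygote",
                "service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server",
                "    onrestart write /sys/power/state on",
                "    onrestart restart netd",
                "service surfaceflinger /system/bin/surfaceflinger",
                "    onrestart restart zygote");
        // Prints the services that list zygote in an onrestart restart line.
        System.out.println(whoRestarts(restartTargets(rc), "zygote"));
    }
}
```

With the abbreviated input above, only servicemanager and surfaceflinger name zygote in an onrestart restart line, which matches the listings.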

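The zygote service starts when main() in app_main.cpp runs. If it exits, init starts it again; because the service is not marked critical in the rc file above, repeated restarts do not push the device into recovery mode. zygote is created in Service::Start() via fork(); execv() then enters main() in app_main, which starts com.android.internal.os.ZygoteInit.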

Result<Success> Service::Start() {
    ......
    pid_t pid = -1;
    if (namespace_flags_) {
        pid = clone(nullptr, nullptr, namespace_flags_ | SIGCHLD, nullptr);
    } else {
        pid = fork();
    }
    if (pid == 0) {
        ......
        if (!ExpandArgsAndExecv(args_)) {
            PLOG(ERROR) << "cannot execve('" << args_[0] << "')";
        }

        _exit(127);
    }

    if (pid < 0) {
        pid_ = 0;
        return ErrnoError() << "Failed to fork";
    }
    ......
    return Success();
}

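In the native layer, init.cpp finally calls parser.ParseConfig("/init.rc") to parse the init scripts, which is what leads to the zygote process being started. The flow is summarized below:

3.3.4 Native layer summary:

android_native_start.drawio.png

In the native layer, once process 1 has finished its kernel-space work, it becomes the first user-space process: init. init is the ancestor of all user-space processes. In user space, init prepares for the Java layer as follows:

  • Initializes some system properties, creates device nodes, and adjusts permissions, as preparation for starting the Java layer.
  • Starts daemons such as ueventd, logd, healthd, console, adbd, and lmkd (memory monitoring; frees resources under low memory to keep the system running).
  • Starts key services such as servicemanager (the Binder service registry), bootanim (the boot animation), and surfaceflinger.
  • Spawns the zygote process. zygote is the first Java process in Android and the parent of all other Java processes.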

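3.4 Java Framework layer

The first process of this layer is zygote. It registers the zygote server socket, loads the virtual machine, and runs preloadClasses and preloadResources.

  System Server process: starts and manages the whole Java framework, including services such as AMS, WMS, and PMS. For the detailed PKMS startup flow, see: PackageManagerService startup flow

3.4.1 zygote enters the Java layer

The first process in the Java layer is zygote; every other process in this layer is a child forked from it. zygote is therefore the parent of all Java-layer processes.

In the native layer (3.3.3), zygote is created in Service::Start() via fork(); execv() then enters main() in app_main, which starts com.android.internal.os.ZygoteInit.

main() in app_main.cpp is as follows: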

frameworks/base/cmds/app_process/app_main.cpp

class AppRuntime : public AndroidRuntime
{    
    ......
}

int main(int argc, char* const argv[])
{
    ......
    AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
    ......
    if (zygote) {
        runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
    } else if (className) {
        runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
    } else {
        fprintf(stderr, "Error: no class name or --zygote supplied.\n");
        app_usage();
        LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
    }
}

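After processing its arguments, main() calls runtime.start() to start ZygoteInit. Here runtime is an AppRuntime object. AppRuntime is defined in app_main.cpp and is a subclass of AndroidRuntime, so runtime.start() is in fact a call to AndroidRuntime::start().

Next, look at AndroidRuntime::start().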

frameworks/base/core/jni/AndroidRuntime.cpp

/*
 * Start the Android runtime.  This involves starting the virtual machine
 * and calling the "static void main(String[] args)" method in the class
 * named by "className".
 *
 * Passes the main function two arguments, the class name and the specified
 * options string.
 */
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
    ......
    /* start the virtual machine */
    JniInvocation jni_invocation;
    jni_invocation.Init(NULL);
    JNIEnv* env;
    if (startVm(&mJavaVM, &env, zygote) != 0) {
        return;
    }
    onVmCreated(env);

    /*
     * Register android functions.
     */
    if (startReg(env) < 0) {
        ALOGE("Unable to register all android natives\n");
        return;
    }
    ......
    /*
     * Start VM.  This thread becomes the main thread of the VM, and will
     * not return until the VM exits.
     */
    char* slashClassName = toSlashClassName(className != NULL ? className : "");
    jclass startClass = env->FindClass(slashClassName);
    if (startClass == NULL) {
        ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
        /* keep going */
    } else {
        jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
            "([Ljava/lang/String;)V");
        if (startMeth == NULL) {
            ALOGE("JavaVM unable to find main() in '%s'\n", className);
            /* keep going */
        } else {
            env->CallStaticVoidMethod(startClass, startMeth, strArray);

#if 0
            if (env->ExceptionCheck())
                threadExitUncaughtException(env);
#endif
        }
    }
    ......
    
}

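AndroidRuntime::start() mainly does the following:

  • startVm(&mJavaVM, &env, zygote): creates the Java virtual machine; startReg(env) then registers the JNI functions with it.
  • GetStaticMethodID(startClass, "main", "([Ljava/lang/String;)V"): looks up main() of ZygoteInit; env->CallStaticVoidMethod(startClass, startMeth, strArray) then makes the JNI call that crosses from the native layer into the Java layer.

3.4.2 zygote startup in the Java layer

Through the JNI call in the native AndroidRuntime::start(), zygote crosses from the native layer into the Java layer.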

frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

public static void main(String argv[]) {
    // Create the ZygoteServer object
    ZygoteServer zygoteServer = new ZygoteServer();

    // Mark zygote start. This ensures that thread creation will throw
    // an error.
    ZygoteHooks.startZygoteNoThreadCreation();
    // Declare the Runnable that the select loop will return
    final Runnable caller;
    try {
        ......
        // Whether to start system_server
        boolean startSystemServer = false;
        for (int i = 1; i < argv.length; i++) {
            // The command that starts zygote (init.zygote*.rc) contains start-system-server, so system_server is started
            if ("start-system-server".equals(argv[i])) {
                startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(argv[i])) {
                enableLazyPreload = true;
            } else if (argv[i].startsWith(ABI_LIST_ARG)) {
                abiList = argv[i].substring(ABI_LIST_ARG.length());
            } else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
                socketName = argv[i].substring(SOCKET_NAME_ARG.length());
            } else {
                throw new RuntimeException("Unknown command line argument: " + argv[i]);
            }
        }
        ......
        // Register the socket that zygote uses for communication
        zygoteServer.registerServerSocketFromEnv(socketName);
        if (!enableLazyPreload) {
            ......
            // Preload resources, classes, and so on
            preload(bootTimingsTraceLog);
            ......
        } else {
            Zygote.resetNicePriority();
        }
        ......
        // true in this case
        if (startSystemServer) {
            // Prepare the arguments and fork system_server
            Runnable r = forkSystemServer(abiList, socketName, zygoteServer);
            if (r != null) {
                r.run();
                return;
            }
        }
        ......
        // Enter the select loop and wait for requests from AMS to create new app processes.
        caller = zygoteServer.runSelectLoop(abiList);
    } catch (Throwable ex) {
        Log.e(TAG, "System zygote died with exception", ex);
        throw ex;
    } finally {
        zygoteServer.closeServerSocket();
    }
    ......
}

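main() of ZygoteInit in the Java layer mainly does the following:

  • Calls zygoteServer.registerServerSocketFromEnv(socketName) to set up the socket channel; zygote acts as the server side and answers client requests.
  • Calls preload() to preload common classes, resources such as drawables and colors, the OpenGL shared libraries, WebView, and so on, which improves app startup efficiency.
  • Handles the start-system-server argument that zygote was created with, then calls forkSystemServer() to create and start system_server.
  • Finally calls runSelectLoop() and loops, waiting for requests from AMS to create new app processes.

Preloading improves app startup efficiency for the following reason: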

image.png

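In ZygoteInit.main(), zygote uses preload() to preload common classes, resources such as drawables and colors, the OpenGL shared libraries, WebView, and so on. When a child process is forked from zygote, parent and child initially share the same physical pages, which are mapped read-only. When a process needs to write, the affected content is copied from zygote's physical page to a new physical page for that process to use. The child therefore inherits all of zygote's data without copying it up front, which is what speeds up app startup. This is a copy-on-write fork.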

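3.4.3 zygote startup summary

zygote crosses the native and Java layers through JNI and is the first process of the Java layer, so it plays an important role. Its startup flow is therefore summarized separately here.

zygote startup sequence diagram:

android_zygote_start.drawio.png

  • In the native layer, the arguments from init.zygote*.rc are parsed and AndroidRuntime::start() is called.
    • Inside it, startVm() creates the virtual machine.
    • startReg() registers the JNI functions.
    • main() of ZygoteInit.java is looked up through JNI and then called, which enters the Java world.
  • Calls zygoteServer.registerServerSocketFromEnv(socketName) to set up the socket channel; zygote acts as the server side and answers client requests.
  • Calls preload() to preload common classes, resources such as drawables and colors, the OpenGL shared libraries, WebView, and so on, which improves app startup efficiency.
  • Handles the start-system-server argument that zygote was created with, then calls forkSystemServer() to create and start system_server.
  • Finally calls runSelectLoop() and loops, waiting for requests from AMS to create new app processes.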
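
The argument handling at the start of ZygoteInit.main() can be tried in isolation. The Java sketch below mirrors the loop from the excerpt; ZygoteArgsSketch is a hypothetical class written for this article, and the prefix strings "--abi-list=" and "--socket-name=" as well as the default socket name "zygote" are assumptions, since the excerpt shows only the constant names ABI_LIST_ARG and SOCKET_NAME_ARG.

```java
/** Hypothetical helper (not AOSP code): mirrors the argument loop of ZygoteInit.main(). */
final class ZygoteArgsSketch {
    // Assumed values; the excerpt shows only the constant names.
    static final String ABI_LIST_ARG = "--abi-list=";
    static final String SOCKET_NAME_ARG = "--socket-name=";

    boolean startSystemServer = false;
    boolean enableLazyPreload = false;
    String abiList = null;
    String socketName = "zygote"; // assumed default

    static ZygoteArgsSketch parse(String[] argv) {
        ZygoteArgsSketch parsed = new ZygoteArgsSketch();
        // As in the excerpt, the loop starts at index 1 and skips argv[0].
        for (int i = 1; i < argv.length; i++) {
            if ("start-system-server".equals(argv[i])) {
                parsed.startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(argv[i])) {
                parsed.enableLazyPreload = true;
            } else if (argv[i].startsWith(ABI_LIST_ARG)) {
                parsed.abiList = argv[i].substring(ABI_LIST_ARG.length());
            } else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
                parsed.socketName = argv[i].substring(SOCKET_NAME_ARG.length());
            } else {
                throw new RuntimeException("Unknown command line argument: " + argv[i]);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        ZygoteArgsSketch p = parse(new String[] {
                "com.android.internal.os.ZygoteInit",
                "start-system-server",
                "--abi-list=arm64-v8a"});
        System.out.println("startSystemServer=" + p.startSystemServer
                + " abiList=" + p.abiList + " socketName=" + p.socketName);
    }
}
```

An unknown argument aborts startup with an exception, as in the excerpt, rather than being silently ignored.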

3.4.4 SystemServer startup

After zygote has started and entered the Java world, ZygoteInit.java calls forkSystemServer() to create and start system_server.

/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

private static Runnable forkSystemServer(String abiList, String socketName,
        ZygoteServer zygoteServer) {
    ......
    int pid;
    try {
        ......
        // Create system_server
        pid = Zygote.forkSystemServer(
                parsedArgs.uid, parsedArgs.gid,
                parsedArgs.gids,
                parsedArgs.runtimeFlags,
                null,
                parsedArgs.permittedCapabilities,
                parsedArgs.effectiveCapabilities);
    } catch (IllegalArgumentException ex) {
        throw new RuntimeException(ex);
    }
    /* For child process */
    if (pid == 0) {
        if (hasSecondZygote(abiList)) {
            waitForSecondaryZygote(socketName);
        }
        // Close the socket
        zygoteServer.closeServerSocket();
        // Finish the remaining setup in the system_server process
        return handleSystemServerProcess(parsedArgs);
    }
    ......
}

ZygoteInit.java calls Zygote.forkSystemServer() to create system_server; Zygote.java in turn forks the process through a JNI call.

frameworks/base/core/java/com/android/internal/os/Zygote.java

public static int forkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
        int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
    VM_HOOKS.preFork();
    // Resets nice priority for zygote process.
    resetNicePriority();
    // Fork the system_server process through JNI
    int pid = nativeForkSystemServer(
            uid, gid, gids, runtimeFlags, rlimits, permittedCapabilities, effectiveCapabilities);
    // Enable tracing as soon as we enter the system_server.
    if (pid == 0) {
        Trace.setTracingEnabled(true, runtimeFlags);
    }
    VM_HOOKS.postForkCommon();
    return pid;
}

native private static int nativeForkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
        int[][] rlimits, long permittedCapabilities, long effectiveCapabilities);

The JNI implementation of nativeForkSystemServer() is as follows:

frameworks/base/core/jni/com_android_internal_os_Zygote.cpp

static jint com_android_internal_os_Zygote_nativeForkSystemServer(
        JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
        jint runtime_flags, jobjectArray rlimits, jlong permittedCapabilities,
        jlong effectiveCapabilities) {
  // Call ForkAndSpecializeCommon()
  pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
                                      runtime_flags, rlimits,
                                      permittedCapabilities, effectiveCapabilities,
                                      MOUNT_EXTERNAL_DEFAULT, NULL, NULL, true, NULL,
                                      NULL, false, NULL, NULL);
   ......
   return pid;
}

The JNI implementation of nativeForkSystemServer() calls ForkAndSpecializeCommon() internally.

// Utility routine to fork zygote and specialize the child process.
static pid_t ForkAndSpecializeCommon(JNIEnv* env, uid_t uid, gid_t gid, jintArray javaGids,
                                     jint runtime_flags, jobjectArray javaRlimits,
                                     jlong permittedCapabilities, jlong effectiveCapabilities,
                                     jint mount_external,
                                     jstring java_se_info, jstring java_se_name,
                                     bool is_system_server, jintArray fdsToClose,
                                     jintArray fdsToIgnore, bool is_child_zygote,
                                     jstring instructionSet, jstring dataDir) {
  // Install the signal handlers so that zygote is notified when a child such as system_server exits
  SetSignalHandlers();
  ......
  // Fork the process
  pid_t pid = fork();
  ......
  return pid;
}

The system_server startup flow is shown below:

android_systemserver_start.drawio.png

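3.4.5 Startup of other Java processes

After the fork, the pid is passed all the way back to ZygoteInit.java, which uses it to tell whether system_server was created and whether it is now running inside that new process, and continues accordingly.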

/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

private static Runnable forkSystemServer(String abiList, String socketName,
        ZygoteServer zygoteServer) {
    ......
    int pid;
    try {
        ......
        // Create system_server
        pid = Zygote.forkSystemServer(
                parsedArgs.uid, parsedArgs.gid,
                parsedArgs.gids,
                parsedArgs.runtimeFlags,
                null,
                parsedArgs.permittedCapabilities,
                parsedArgs.effectiveCapabilities);
    } catch (IllegalArgumentException ex) {
        throw new RuntimeException(ex);
    }
    /* For child process */
    if (pid == 0) {
        // Use ro.product.cpu.abilist to decide whether a second zygote is needed
        if (hasSecondZygote(abiList)) {
            waitForSecondaryZygote(socketName);
        }
        // Close the socket
        zygoteServer.closeServerSocket();
        // Finish the remaining setup in the system_server process
        return handleSystemServerProcess(parsedArgs);
    }
    ......
}

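In forkSystemServer() in ZygoteInit.java, Zygote.forkSystemServer() returns a pid. pid == 0 means the code is running in the child, that is, in the newly created system_server process. It first uses ro.product.cpu.abilist to decide whether a second zygote is needed and, if so, waits until that zygote has finished starting. It then closes the socket and calls handleSystemServerProcess() to finish the remaining system_server setup.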

 /**
* Finish remaining work for the newly forked system server process.
*/
private static Runnable handleSystemServerProcess(ZygoteConnection.Arguments parsedArgs) {
   ......
    else {
        ClassLoader cl = null;
        if (systemServerClasspath != null) {
            cl = createPathClassLoader(systemServerClasspath, parsedArgs.targetSdkVersion);

            Thread.currentThread().setContextClassLoader(cl);
        }
        // Perform the common initialization
        return ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
    }

    /* should never reach here */
}

handleSystemServerProcess() finishes the remaining work for the newly created system_server process.

In ZygoteInit.java, zygoteInit() redirects log output, performs the common initialization, and calls nativeZygoteInit().

   nativeZygoteInit() reaches com_android_internal_os_ZygoteInit_nativeZygoteInit() through JNI, which in turn enters onZygoteInit() in app_main.cpp.

frameworks/base/core/jni/AndroidRuntime.cpp

static void com_android_internal_os_ZygoteInit_nativeZygoteInit(JNIEnv* env, jobject clazz)
{
    gCurRuntime->onZygoteInit();
}

proc->startThreadPool() starts the Binder thread pool, so that the process can communicate with other processes.

frameworks/base/cmds/app_process/app_main.cpp

virtual void onZygoteInit()
{
    sp<ProcessState> proc = ProcessState::self();
    ALOGV("App process: starting thread pool.\n");
    proc->startThreadPool();
}

Once nativeZygoteInit() has run, the Binder thread pool in the native layer is up.

ZygoteInit.java then goes on to call RuntimeInit.applicationInit().

frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

protected static Runnable applicationInit(int targetSdkVersion, String[] argv,
        ClassLoader classLoader) {
    ......
    // Find main() of system_server
    return findStaticMain(args.startClass, args.startArgs, classLoader);
}

applicationInit() uses findStaticMain() to locate main() of SystemServer.java so that it can be executed.

frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

protected static Runnable findStaticMain(String className, String[] argv,
        ClassLoader classLoader) {
    Class<?> cl;

    try {
        cl = Class.forName(className, true, classLoader);
    } catch (ClassNotFoundException ex) {
        throw new RuntimeException(
                "Missing class when invoking static main " + className,
                ex);
    }

    Method m;
    try {
        m = cl.getMethod("main", new Class[] { String[].class });
    } catch (NoSuchMethodException ex) {
        throw new RuntimeException(
                "Missing static main on " + className, ex);
    } catch (SecurityException ex) {
        throw new RuntimeException(
                "Problem getting static main on " + className, ex);
    }

    int modifiers = m.getModifiers();
    if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
        throw new RuntimeException(
                "Main method is not public and static on " + className);
    }

    /*
     * This throw gets caught in ZygoteInit.main(), which responds
     * by invoking the exception's run() method. This arrangement
     * clears up all the stack frames that were required in setting
     * up the process.
     */
    return new MethodAndArgsCaller(m, argv);
}

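This function uses reflection to look up main() of SystemServer.java and wraps it in a MethodAndArgsCaller, which is a Runnable. main() itself runs later, when ZygoteInit.main() calls r.run() on the Runnable returned by forkSystemServer().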
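
The same lookup-then-defer pattern can be reproduced with plain Java reflection. The sketch below is a simplified stand-in, not the AOSP implementation: StaticMainSketch and its nested Demo class are hypothetical, a lambda takes the place of MethodAndArgsCaller, and the default class loader is used instead of the one created for system_server.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical helper (not AOSP code): resolves a static main() and defers the call. */
final class StaticMainSketch {
    /** Stand-in target; plays the role that SystemServer has in the article. */
    public static final class Demo {
        static String lastArg = null;

        public static void main(String[] args) {
            lastArg = args.length > 0 ? args[0] : "";
        }
    }

    /** Looks up public static main(String[]) and returns a Runnable that invokes it. */
    static Runnable findStaticMain(String className, String[] argv) {
        final Method m;
        try {
            Class<?> cl = Class.forName(className);
            m = cl.getMethod("main", String[].class);
        } catch (ClassNotFoundException | NoSuchMethodException ex) {
            throw new RuntimeException("Cannot resolve static main on " + className, ex);
        }
        int modifiers = m.getModifiers();
        if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException("Main method is not public and static on " + className);
        }
        // Nothing is invoked here; the caller decides when to run it.
        return () -> {
            try {
                m.invoke(null, (Object) argv);
            } catch (IllegalAccessException | InvocationTargetException ex) {
                throw new RuntimeException(ex);
            }
        };
    }

    public static void main(String[] args) {
        Runnable r = findStaticMain(Demo.class.getName(), new String[] {"hello"});
        System.out.println("before run: " + Demo.lastArg);
        r.run();
        System.out.println("after run: " + Demo.lastArg);
    }
}
```

Returning a Runnable instead of calling main() directly lets the caller unwind the setup frames first and invoke main() from a shallow stack, which is the arrangement the comment inside findStaticMain() describes.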

frameworks/base/services/java/com/android/server/SystemServer.java

/**
* The main entry point from zygote.
*/
public static void main(String[] args) {
    new SystemServer().run();
}

public SystemServer() {
    // Check for factory test mode.
    mFactoryTestMode = FactoryTest.getMode();
    // Remember if it's runtime restart(when sys.boot_completed is already set) or reboot
    mRuntimeRestart = "1".equals(SystemProperties.get("sys.boot_completed"));

    mRuntimeStartElapsedTime = SystemClock.elapsedRealtime();
    mRuntimeStartUptime = SystemClock.uptimeMillis();
}

After the SystemServer() constructor has finished, run() is executed.

frameworks/base/services/java/com/android/server/SystemServer.java

private void run() {
    try {
        traceBeginAndSlog("InitBeforeStartServices");
        // If the system clock is before 1970, set it to 1970
        if (System.currentTimeMillis() < EARLIEST_SUPPORTED_TIME) {
            Slog.w(TAG, "System clock is before 1970; setting to 1970.");
            SystemClock.setCurrentTimeMillis(EARLIEST_SUPPORTED_TIME);
        }
        // Set the system time zone (defaults to GMT when unset)
        String timezoneProperty =  SystemProperties.get("persist.sys.timezone");
        if (timezoneProperty == null || timezoneProperty.isEmpty()) {
            Slog.w(TAG, "Timezone not set; setting to GMT.");
            SystemProperties.set("persist.sys.timezone", "GMT");
        }
        // Set the system language, region, and country information
        if (!SystemProperties.get("persist.sys.language").isEmpty()) {
            final String languageTag = Locale.getDefault().toLanguageTag();

            SystemProperties.set("persist.sys.locale", languageTag);
            SystemProperties.set("persist.sys.language", "");
            SystemProperties.set("persist.sys.country", "");
            SystemProperties.set("persist.sys.localevar", "");
        }

        // Here we go!
        Slog.i(TAG, "Entered the Android system server!");
        int uptimeMillis = (int) SystemClock.elapsedRealtime();
        EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN, uptimeMillis);
        if (!mRuntimeRestart) {
            MetricsLogger.histogram(null, "boot_system_server_init", uptimeMillis);
        }
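        // Update the property that records the VM library in use
        SystemProperties.set("persist.sys.dalvik.vm.lib.2", VMRuntime.getRuntime().vmLibrary());
        // Clear the VM heap growth limit
        VMRuntime.getRuntime().clearGrowthLimit();
        // Set the target heap utilization to 0.8
        VMRuntime.getRuntime().setTargetHeapUtilization(0.8f);
        // Some devices rely on a fingerprint generated at runtime, so make sure it is defined before booting further
        Build.ensureFingerprintProperty();
        // Require an explicit user when accessing environment paths
        Environment.setUserRequired(true);
        // Inside the system server, incoming Bundles are defused to avoid throwing BadParcelableException
        BaseBundle.setShouldDefuse(true);
        // Include the stack trace when parceling exceptions
        Parcel.setStackTraceParceling(true);
        // Make sure binder calls into the system process always run at foreground priority
        BinderInternal.disableBackgroundScheduling(true);
        // Increase the number of binder threads in system_server
        BinderInternal.setMaxThreads(sMaxBinderThreads);
        // Prepare the main thread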
        android.os.Process.setThreadPriority(
            android.os.Process.THREAD_PRIORITY_FOREGROUND);
        android.os.Process.setCanSelfBackground(false);
        Looper.prepareMainLooper();
        Looper.getMainLooper().setSlowLogThresholdMs(
                SLOW_DISPATCH_THRESHOLD_MS, SLOW_DELIVERY_THRESHOLD_MS);
        // Initialize native services: load the android_servers library (sources under frameworks/base/services/)
        System.loadLibrary("android_servers");
        // Check whether the last shutdown attempt failed
        performPendingShutdown();
        // Initialize the system context
        createSystemContext();
        // Create the SystemServiceManager
        mSystemServiceManager = new SystemServiceManager(mSystemContext);
        mSystemServiceManager.setStartInfo(mRuntimeRestart,
                mRuntimeStartElapsedTime, mRuntimeStartUptime);
        // Add mSystemServiceManager to LocalServices (its sLocalServiceObjects member)
        LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);
        // Prepare the thread pool for init tasks that can be parallelized
        SystemServerInitThreadPool.get();
    } finally {
        traceEnd();  // InitBeforeStartServices
    }

    /// M: Set parameters to mtkSystemserver.
    sMtkSystemServerIns.setPrameters(BOOT_TIMINGS_TRACE_LOG, mSystemServiceManager,
        mSystemContext);
    // Start services.
    try {
        traceBeginAndSlog("StartServices");
        // Start the bootstrap services
        startBootstrapServices();
        /// Added by MTK: start the MTK bootstrap services
        sMtkSystemServerIns.startMtkBootstrapServices();
        // Start the core services
        startCoreServices();
        /// Added by MTK: start the MTK core services
        sMtkSystemServerIns.startMtkCoreServices();
        // Start the other services
        startOtherServices();
        // Shut down the SystemServer init thread pool
        SystemServerInitThreadPool.shutdown();
    } catch (Throwable ex) {
        Slog.e("System", "******************************************");
        Slog.e("System", "************ Failure starting system services", ex);
        throw ex;
    } finally {
        traceEnd();
    }

    StrictMode.initVmDefaults(null);

    /// M: open wtf when load is user.
    // Added by MTK: on user builds, report a wtf if system server init took too long
    if ("user".equals(Build.TYPE) && !mRuntimeRestart && !isFirstBootOrUpgrade()) {
        int uptimeMillis = (int) SystemClock.elapsedRealtime();
        MetricsLogger.histogram(null, "boot_system_server_ready", uptimeMillis);
        final int MAX_UPTIME_MILLIS = 60 * 1000;
        if (uptimeMillis > MAX_UPTIME_MILLIS) {
            Slog.wtf(SYSTEM_SERVER_TIMING_TAG,
                    "SystemServer init took too long. uptimeMillis=" + uptimeMillis);
        }
    }
    /// Added by MTK
    sMtkSystemServerIns.addBootEvent("Android:SysServerInit_END");
    // Loop forever
    Looper.loop();
    throw new RuntimeException("Main thread loop unexpectedly exited");
}
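
The last two statements of run() show the shape of the main thread: it parks in Looper.loop(), and reaching the throw means the loop ended when it never should. The following is a minimal sketch of that pattern in plain Java. MiniLooper and its methods are hypothetical names built on a BlockingQueue; the real android.os.Looper is backed by a native MessageQueue and is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for android.os.Looper; all names here are illustrative.
final class MiniLooper {
    // Sentinel task that tells loop() to return.
    private static final Runnable QUIT = () -> { };
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    void post(Runnable task) {
        queue.add(task);
    }

    void quit() {
        queue.add(QUIT);
    }

    // Blocks the calling thread, running tasks in FIFO order until quit() is seen.
    void loop() {
        try {
            while (true) {
                Runnable task = queue.take();
                if (task == QUIT) {
                    return;
                }
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    // Demo: queue two tasks and a quit marker, then drain them on this thread.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        MiniLooper looper = new MiniLooper();
        looper.post(() -> log.add("first"));
        looper.post(() -> log.add("second"));
        looper.quit();
        looper.loop();
        return log;
    }
}
```

In system_server nothing is expected to end the main loop, which is why the code after Looper.loop() throws instead of returning quietly.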

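SystemServer's run() method does its work mainly through the following steps:

  • Create the SystemServiceManager
  • startBootstrapServices(): starts the bootstrap services, e.g. AMS, PMS, DMS
  • startCoreServices(): starts the core services, e.g. BatteryService
  • startOtherServices(): starts the remaining services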
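
Most services in these stages are started with a call of the form mSystemServiceManager.startService(SomeService.class). The sketch below illustrates one way such a call can work: create the service reflectively, record it, then call its start hook. MiniService, MiniServiceManager and the *Stub classes are hypothetical. The real SystemServiceManager passes a Context to the service constructor and does more bookkeeping, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for SystemService / SystemServiceManager.
abstract class MiniService {
    abstract void onStart();
}

final class MiniServiceManager {
    private final List<MiniService> started = new ArrayList<>();

    // Instantiate the service class reflectively, register it, then start it.
    <T extends MiniService> T startService(Class<T> serviceClass) {
        try {
            T service = serviceClass.getDeclaredConstructor().newInstance();
            started.add(service);
            service.onStart();
            return service;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Failed to create " + serviceClass.getName(), e);
        }
    }

    // Names of the started services, in start order.
    List<String> startedNames() {
        List<String> names = new ArrayList<>();
        for (MiniService service : started) {
            names.add(service.getClass().getSimpleName());
        }
        return names;
    }
}

final class InstallerStub extends MiniService { void onStart() { } }
final class BatteryStub extends MiniService { void onStart() { } }
final class AlarmStub extends MiniService { void onStart() { } }

final class MiniSystemServer {
    // Mirrors the bootstrap -> core -> other ordering with one stub per stage.
    static List<String> run() {
        MiniServiceManager manager = new MiniServiceManager();
        manager.startService(InstallerStub.class); // bootstrap stage
        manager.startService(BatteryStub.class);   // core stage
        manager.startService(AlarmStub.class);     // other stage
        return manager.startedNames();
    }
}
```

Keeping the started services in one ordered list lets the manager later notify them in the same order in which they were started.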

Next, let's look at the services that SystemServer starts.

  1. startBootstrapServices()
frameworks/base/services/java/com/android/server/SystemServer.java

/**
* Starts the small tangle of critical services that are needed to get
* the system off the ground.  These services have complex mutual dependencies
* which is why we initialize them all in one place here.  Unless your service
* is also entwined in these dependencies, it should be initialized in one of
* the other functions.
*/
private void startBootstrapServices() {
    Slog.i(TAG, "Reading configuration...");
    final String TAG_SYSTEM_CONFIG = "ReadingSystemConfig";
    traceBeginAndSlog(TAG_SYSTEM_CONFIG);
    SystemServerInitThreadPool.get().submit(SystemConfig::getInstance, TAG_SYSTEM_CONFIG);
    traceEnd();
    // Start Installer; block until the socket channel to installd is established
    traceBeginAndSlog("StartInstaller");
    Installer installer = mSystemServiceManager.startService(Installer.class);
    traceEnd();
    // Start DeviceIdentifiersPolicyService
    traceBeginAndSlog("DeviceIdentifiersPolicyService");
    mSystemServiceManager.startService(DeviceIdentifiersPolicyService.class);
    traceEnd();
    // Start ActivityManagerService
    traceBeginAndSlog("StartActivityManager");
    mActivityManagerService = mSystemServiceManager.startService(
            ActivityManagerService.Lifecycle.class).getService();
    mActivityManagerService.setSystemServiceManager(mSystemServiceManager);
    mActivityManagerService.setInstaller(installer);
    traceEnd();
    // Start PowerManagerService
    traceBeginAndSlog("StartPowerManager");
    mPowerManagerService = mSystemServiceManager.startService(PowerManagerService.class);
    traceEnd();
    // Initialize power management
    traceBeginAndSlog("InitPowerManagement");
    mActivityManagerService.initPowerManagement();
    traceEnd();
    // Start RecoverySystemService
    traceBeginAndSlog("StartRecoverySystemService");
    mSystemServiceManager.startService(RecoverySystemService.class);
    traceEnd();

    // Now that we have the bare essentials of the OS up and running, take
    // note that we just booted, which might send out a rescue party if
    // we're stuck in a runtime restart loop.
    RescueParty.noteBoot(mSystemContext);
    // Start LightsService
    traceBeginAndSlog("StartLightsService");
    mSystemServiceManager.startService(LightsService.class);
    traceEnd();
    
    traceBeginAndSlog("StartSidekickService");
    // Package manager isn't started yet; need to use SysProp not hardware feature
    if (SystemProperties.getBoolean("config.enable_sidekick_graphics", false)) {
        mSystemServiceManager.startService(WEAR_SIDEKICK_SERVICE_CLASS);
    }
    traceEnd();
    // Start DisplayManagerService
    traceBeginAndSlog("StartDisplayManager");
    mDisplayManagerService = mSystemServiceManager.startService(DisplayManagerService.class);
    traceEnd();
    // The default display must be ready before PackageManagerService is initialized
    traceBeginAndSlog("WaitForDisplay");
    mSystemServiceManager.startBootPhase(SystemService.PHASE_WAIT_FOR_DEFAULT_DISPLAY);
    traceEnd();

    // Only run core apps while the device is encrypting or encrypted
    String cryptState = SystemProperties.get("vold.decrypt");
    if (ENCRYPTING_STATE.equals(cryptState)) {
        Slog.w(TAG, "Detected encryption in progress - only parsing core apps");
        mOnlyCore = true;
    } else if (ENCRYPTED_STATE.equals(cryptState)) {
        Slog.w(TAG, "Device encrypted - only parsing core apps");
        mOnlyCore = true;
    }

    // Start the package manager.
    if (!mRuntimeRestart) {
        MetricsLogger.histogram(null, "boot_package_manager_init_start",
                (int) SystemClock.elapsedRealtime());
    }
    // Start PackageManagerService
    traceBeginAndSlog("StartPackageManagerService");
    mPackageManagerService = PackageManagerService.main(mSystemContext, installer,
            mFactoryTestMode != FactoryTest.FACTORY_TEST_OFF, mOnlyCore);
    mFirstBoot = mPackageManagerService.isFirstBoot();
    mPackageManager = mSystemContext.getPackageManager();
    traceEnd();
    if (!mRuntimeRestart && !isFirstBootOrUpgrade()) {
        MetricsLogger.histogram(null, "boot_package_manager_init_ready",
                (int) SystemClock.elapsedRealtime());
    }
    // Dex optimization: start OtaDexoptService unless it is disabled
    if (!mOnlyCore) {
        boolean disableOtaDexopt = SystemProperties.getBoolean("config.disable_otadexopt",
                false);
        if (!disableOtaDexopt) {
            traceBeginAndSlog("StartOtaDexOptService");
            try {
                OtaDexoptService.main(mSystemContext, mPackageManagerService);
            } catch (Throwable e) {
                reportWtf("starting OtaDexOptService", e);
            } finally {
                traceEnd();
            }
        }
    }
    // Start UserManagerService
    traceBeginAndSlog("StartUserManagerService");
    mSystemServiceManager.startService(UserManagerService.LifeCycle.class);
    traceEnd();
    // Initialize the attribute cache used to cache resources from packages
    traceBeginAndSlog("InitAttributerCache");
    AttributeCache.init(mSystemContext);
    traceEnd();
    // Set up the Application instance for the system process and get started
    traceBeginAndSlog("SetSystemProcess");
    mActivityManagerService.setSystemProcess();
    traceEnd();

    // DisplayManagerService needs to setup android.display scheduling related policies
    // since setSystemProcess() would have overridden policies due to setProcessGroup
    mDisplayManagerService.setupSchedulerPolicies();

    // Start OverlayManagerService
    traceBeginAndSlog("StartOverlayManagerService");
    mSystemServiceManager.startService(new OverlayManagerService(mSystemContext, installer));
    traceEnd();
    // Start the sensor service on a separate thread; check that it has completed before using it
    mSensorServiceStart = SystemServerInitThreadPool.get().submit(() -> {
        TimingsTraceLog traceLog = new TimingsTraceLog(
                SYSTEM_SERVER_TIMING_ASYNC_TAG, Trace.TRACE_TAG_SYSTEM_SERVER);
        traceLog.traceBegin(START_SENSOR_SERVICE);
        startSensorService();
        traceLog.traceEnd();
    }, START_SENSOR_SERVICE);
}
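
The sensor service start above is handed to an init thread pool, and the returned future (mSensorServiceStart) has to be checked before the service is used. A minimal sketch of that "submit now, wait later" pattern with the standard java.util.concurrent API follows. AsyncInitSketch and the "sensor-ready" marker are hypothetical; SystemServerInitThreadPool itself is not reproduced.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of starting slow init work on a pool thread.
final class AsyncInitSketch {
    static String startSlowServiceAsync() {
        ExecutorService initPool = Executors.newFixedThreadPool(2);
        try {
            StringBuilder state = new StringBuilder();
            // Kick off the slow start on a pool thread and keep the Future.
            Future<?> sensorStart = initPool.submit(() -> {
                state.append("sensor-ready");
            });
            // ... other startup work would continue on the main thread here ...
            // Before using the service, wait for the async start to finish.
            sensorStart.get();
            return state.toString();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for init", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Async init failed", e);
        } finally {
            // Analogous to shutting the init pool down once startup is done.
            initPool.shutdown();
        }
    }
}
```

Future.get() both waits for the task and makes its writes visible to the waiting thread, which is why the result can be read safely afterwards.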
  2. startCoreServices()
frameworks/base/services/java/com/android/server/SystemServer.java
/**
* Starts some essential services that are not tangled up in the bootstrap process.
*/
private void startCoreServices() {
    traceBeginAndSlog("StartBatteryService");
    // Start BatteryService
    mSystemServiceManager.startService(BatteryService.class);
    traceEnd();

    // Start UsageStatsService
    traceBeginAndSlog("StartUsageService");
    mSystemServiceManager.startService(UsageStatsService.class);
    mActivityManagerService.setUsageStatsManager(
            LocalServices.getService(UsageStatsManagerInternal.class));
    traceEnd();

    // Start WebViewUpdateService
    if (mPackageManager.hasSystemFeature(PackageManager.FEATURE_WEBVIEW)) {
        traceBeginAndSlog("StartWebViewUpdateService");
        mWebViewUpdateService = mSystemServiceManager.startService(WebViewUpdateService.class);
        traceEnd();
    }

    // Start BinderCallsStatsService
    traceBeginAndSlog("StartBinderCallsStatsService");
    BinderCallsStatsService.start();
    traceEnd();
}
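
startCoreServices() fetches an in-process interface with LocalServices.getService(UsageStatsManagerInternal.class), and run() earlier registered the SystemServiceManager with LocalServices.addService(...). The sketch below shows the underlying idea, a registry keyed by class that lives in one process. MiniLocalServices and UsageStatsInternalStub are hypothetical names, and the duplicate-registration check is an assumption about reasonable behavior, not a copy of the framework class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-process service registry keyed by interface class.
final class MiniLocalServices {
    private static final Map<Class<?>, Object> SERVICES = new HashMap<>();

    private MiniLocalServices() { }

    // Register one implementation per type; a second registration is rejected.
    static synchronized <T> void addService(Class<T> type, T service) {
        if (SERVICES.containsKey(type)) {
            throw new IllegalStateException("Service already registered: " + type.getName());
        }
        SERVICES.put(type, service);
    }

    // Returns the registered implementation, or null if none was added.
    static synchronized <T> T getService(Class<T> type) {
        return type.cast(SERVICES.get(type));
    }
}

// Hypothetical internal interface, standing in for something like UsageStatsManagerInternal.
interface UsageStatsInternalStub {
    String describe();
}

final class LocalServicesDemo {
    static String demo() {
        UsageStatsInternalStub impl = () -> "usage-stats";
        MiniLocalServices.addService(UsageStatsInternalStub.class, impl);
        return MiniLocalServices.getService(UsageStatsInternalStub.class).describe();
    }
}
```

Because the lookup stays inside the process, callers get a plain Java object and no binder call is involved.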
  3. startOtherServices()
frameworks/base/services/java/com/android/server/SystemServer.java

private void startOtherServices() {
        ...
        SystemConfig.getInstance();
        mContentResolver = context.getContentResolver(); // resolver
        ...
        mActivityManagerService.installSystemProviders(); //provider
        mSystemServiceManager.startService(AlarmManagerService.class); // alarm
        // watchdog
        watchdog.init(context, mActivityManagerService); 
        inputManager = new InputManagerService(context); // input
        wm = WindowManagerService.main(...); // window
        inputManager.start();  // start input
        mDisplayManagerService.windowManagerAndInputReady();
        ...
        mSystemServiceManager.startService(MOUNT_SERVICE_CLASS); // mount
        mPackageManagerService.performBootDexOpt();  // dexopt
        ActivityManagerNative.getDefault().showBootMessage(...); // show the boot message screen
        ...
        statusBar = new StatusBarManagerService(context, wm); //statusBar
        //dropbox
        ServiceManager.addService(Context.DROPBOX_SERVICE,
                    new DropBoxManagerService(context, new File("/data/system/dropbox")));
        mSystemServiceManager.startService(JobSchedulerService.class); // JobScheduler
        lockSettings.systemReady(); // lockSettings

        // boot phases 480 and 500
        mSystemServiceManager.startBootPhase(SystemService.PHASE_LOCK_SETTINGS_READY);
        mSystemServiceManager.startBootPhase(SystemService.PHASE_SYSTEM_SERVICES_READY);
        ...
        // Mark the window, power, package, and display services as ready
        wm.systemReady();
        mPowerManagerService.systemReady(...);
        mPackageManagerService.systemReady();
        mDisplayManagerService.systemReady(...);
        
        // The main event: ActivityManagerService.systemReady()
        mActivityManagerService.systemReady(new Runnable() {
            public void run() {
              ...
            }
        });
    }
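
The startBootPhase() calls in the listing announce boot phases (480 and 500 here) to the services that have already been started. A minimal sketch of such a dispatcher follows. BootPhaseSketch and PhaseListener are hypothetical names, the two constants take the values given in the listing's comment, and the rule that phases must increase is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical boot-phase dispatcher; not the real SystemServiceManager.
final class BootPhaseSketch {
    static final int PHASE_LOCK_SETTINGS_READY = 480;
    static final int PHASE_SYSTEM_SERVICES_READY = 500;

    interface PhaseListener {
        void onBootPhase(int phase);
    }

    private final List<PhaseListener> services = new ArrayList<>();
    private int currentPhase = -1;

    void register(PhaseListener service) {
        services.add(service);
    }

    // Announce a phase to every registered service, in registration order.
    void startBootPhase(int phase) {
        if (phase <= currentPhase) {
            throw new IllegalArgumentException("Phases must increase: " + phase);
        }
        currentPhase = phase;
        for (PhaseListener service : services) {
            service.onBootPhase(phase);
        }
    }

    // Demo: two listeners observe both phases.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        BootPhaseSketch manager = new BootPhaseSketch();
        manager.register(phase -> log.add("window:" + phase));
        manager.register(phase -> log.add("power:" + phase));
        manager.startBootPhase(PHASE_LOCK_SETTINGS_READY);
        manager.startBootPhase(PHASE_SYSTEM_SERVICES_READY);
        return log;
    }
}
```

Numbered phases give each service a well-defined point at which it may start relying on other services being ready.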

Each of these three functions starts a set of system services.

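startBootstrapServices: bootstrap services

Service                          Function
Installer                        Service used when the system installs an APK; other system services can start only after Installer is up
DeviceIdentifiersPolicyService   Enforces the policy for device identifiers, used for user privacy management
ActivityManagerService           Starts, switches, and schedules the four major Android components
PowerManagerService              Handles power-related calculations in the Android system
LightsService                    Manages the display backlight and LEDs
PackageManagerService            Installs, parses, deletes, uninstalls, and pre-optimizes apps
DisplayManagerService            Manages display devices
UserManagerService               Multi-user management
SensorService                    Provides sensor services to the system
...                              ...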

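startCoreServices: core services

Service                   Function
BatteryService            Manages battery-related services
UsageStatsService         Collects how often and for how long the user uses each app
WebViewUpdateService      WebView update service
BinderCallsStatsService   Provides performance and statistics data for Binder calls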

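startOtherServices: other services

There are more than 90 other services, mostly services that are neither bootstrap nor core services, such as CameraService and AlarmManagerService.

Service                       Function
KeyChainSystemService         Key part of system security, related to key storage and certificate management
SchedulingPolicyService       Thread scheduling management
TelecomLoaderService          Loads and manages telephony features
EntropyMixer (EntropyService) Random number generation service
AccountManagerService         Core service for managing user accounts
...                           ...
CameraService                 Camera-related service
AlarmManagerService           Global timer (alarm) management service
InputManagerService           Manages input events
WindowManagerService          Window management service
VrManagerService              VR mode management service
BluetoothService              Bluetooth management service
NotificationManagerService    Notification management service
DeviceStorageMonitorService   Storage monitoring service
LocationManagerService        Location management service
AudioService                  Audio management service
...                           ...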

3.4.6 SystemServer summary

SystemServer startup flow in the Java layer

android_systemserver_start_fluence.drawio.png

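In the Java layer, SystemServer startup is driven entirely by Zygote. ZygoteInit.java calls forkSystemServer() to create a process named system_server, then calls applicationInit() to finish the setup that follows the fork. RuntimeInit.java then invokes SystemServer.main() via reflection. SystemServer then runs, in order:

  • startBootstrapServices()
  • startCoreServices()
  • startOtherServices()
  • Looper.loop()

This brings up the core services of the Java layer.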
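
The reflective hand-off to SystemServer.main() mentioned above can be sketched with the standard java.lang.reflect API: look up a public static main(String[]) on a class found by name, then invoke it. FakeSystemServer and ReflectiveMainSketch are hypothetical stand-ins, and the argument "hello" is made up for the demo. The actual RuntimeInit code path is not reproduced.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical target class standing in for com.android.server.SystemServer.
final class FakeSystemServer {
    static String lastArg = null;

    public static void main(String[] args) {
        lastArg = args.length > 0 ? args[0] : "";
    }
}

final class ReflectiveMainSketch {
    // Find "public static void main(String[])" on the named class and call it.
    static void invokeStaticMain(String className, String[] argv) {
        try {
            Class<?> cl = Class.forName(className);
            Method m = cl.getMethod("main", String[].class);
            int modifiers = m.getModifiers();
            if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
                throw new RuntimeException("Main method is not public and static on " + className);
            }
            // Cast to Object so the array is passed as one argument, not as varargs.
            m.invoke(null, (Object) argv);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Could not invoke main on " + className, e);
        }
    }

    static String demo() {
        invokeStaticMain(FakeSystemServer.class.getName(), new String[] {"hello"});
        return FakeSystemServer.lastArg;
    }
}
```

Looking the class up by name means the launcher needs no compile-time dependency on the class it starts.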

3.4.7 Java Framework

android_javaframework_start.drawio.png

3.5 App layer

The detailed App-layer startup flow is covered in:

Activity startup flow (Part 1)

Activity startup flow (Part 2)

android_activity_start.drawio.png

4. Boot flow summary

image.png