When we buy a phone or tablet and press the power button, the system boots all the way to the Launcher, where we pick an app and open it. Have you ever looked closely at what the system is actually doing during that journey? The Framework is an obscure topic and can feel dry, but starting with this article we will dissect Framework internals in depth, beginning with the system boot flow.
1 System Boot Flow Analysis
When the power key is pressed, the first code the hardware runs is the BootLoader, which performs basic initialization such as setting up the CPU clock and memory. The kernel then brings up the first process, the idle process (pid = 0), which is initialized in kernel space.
As the first process in the system, the idle process creates two more processes (the system always creates processes via fork): kthreadd, created in kernel space, and init (pid = 1), created in user space, a process we are all quite familiar with.
When we launch an app or a system application, the zygote process is needed to incubate the new process, and zygote itself is created by init. System services, in turn, are created and managed by the system_server process, which is forked from zygote.
So the diagram below gives a rough picture of the flow from the moment the power button is pressed to the moment an application starts.
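In text form, the process genealogy that diagram depicts is roughly the following (the pid of kthreadd is the conventional Linux value, added here for orientation):

idle (pid 0, kernel space)
├── kthreadd (pid 2, parent of all kernel threads)
└── init (pid 1, user space)
    └── zygote
        ├── system_server
        └── app processes (Launcher, your apps, ...)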
Next, let's analyze the startup flow of each of these processes.
2 C/C++ Framework Native Layer
2.1 init Process Startup Analysis
From the flow chart above we know that the init process is launched from kernel space, so let's look at the kernel-layer code first.
kernel_common/init/main.c
In the kernel's main.c there is a static function, kernel_init, which is the function that runs first during this phase of boot.
static int kernel_init(void *);

static int __ref kernel_init(void *unused)
{
    int ret;

    kernel_init_freeable();
    /* need to finish all async __init code before freeing the memory */
    async_synchronize_full();
    kprobe_free_init_mem();
    ftrace_free_init_mem();
    free_initmem();
    mark_readonly();

    /*
     * Kernel mappings are now finalized - update the userspace page-table
     * to finalize PTI.
     */
    pti_finalize();

    system_state = SYSTEM_RUNNING;
    numa_default_policy();

    rcu_end_inkernel_boot();

    // Try each candidate init binary in turn
    if (!try_to_run_init_process("/sbin/init") ||
        !try_to_run_init_process("/etc/init") ||
        !try_to_run_init_process("/bin/init") ||
        !try_to_run_init_process("/bin/sh"))
        return 0;

    panic("No working init found. Try passing init= option to kernel. "
          "See Linux Documentation/admin-guide/init.rst for guidance.");
}
In kernel_init we can see calls to try_to_run_init_process, which tries to exec a series of init binaries. The one we care about is /bin/init, which on an Android device corresponds to the init binary under system/bin/.
init can be thought of as a module at the same level as system utilities such as install and gzip: they are all binaries built from the platform source. So which code actually runs when this binary is loaded? To answer that, we need to see how the module is compiled, which means looking at its Android.bp file.
cc_binary {
    name: "init_second_stage",
    recovery_available: true,
    stem: "init",
    defaults: ["init_defaults"],
    static_libs: ["libinit"],
    srcs: ["main.cpp"],
    symlinks: ["ueventd"],
    target: {
        platform: {
            required: [
                "init.rc",
                "ueventd.rc",
                "e2fsdroid",
                "extra_free_kbytes",
                "make_f2fs",
                "mke2fs",
                "sload_f2fs",
            ],
        },
        recovery: {
            cflags: ["-DRECOVERY"],
            exclude_static_libs: [
                "libxml2",
            ],
            exclude_shared_libs: [
                "libbinder",
                "libutils",
            ],
            required: [
                "init_recovery.rc",
                "ueventd.rc.recovery",
                "e2fsdroid.recovery",
                "make_f2fs.recovery",
                "mke2fs.recovery",
                "sload_f2fs.recovery",
            ],
        },
    },
    visibility: ["//packages/modules/Virtualization/microdroid"],
}
When the system builds the init module, its srcs entry is main.cpp. In other words, the entry point of the init binary under system/bin/ is main.cpp: when the kernel's kernel_init execs the init binary, execution lands in main.cpp.
system/core/init/main.cpp
int main(int argc, char** argv) {
#if __has_feature(address_sanitizer)
    __asan_set_error_report_callback(AsanReportCallback);
#endif

    if (!strcmp(basename(argv[0]), "ueventd")) {
        return ueventd_main(argc, argv);
    }

    if (argc > 1) {
        if (!strcmp(argv[1], "subcontext")) {
            android::base::InitLogging(argv, &android::base::KernelLogger);
            const BuiltinFunctionMap function_map;
            return SubcontextMain(argc, argv, &function_map);
        }

        if (!strcmp(argv[1], "selinux_setup")) {
            return SetupSelinux(argv);
        }

        if (!strcmp(argv[1], "second_stage")) {
            return SecondStageMain(argc, argv);
        }
    }

    return FirstStageMain(argc, argv);
}
Every program starts at main, so let's see what this main does. On the very first entry there are no extra arguments, so FirstStageMain runs; init then re-enters its own main via execv, first with the "selinux_setup" argument and then with "second_stage", which routes to SetupSelinux and SecondStageMain respectively. Let's start with the first stage and see what the system does.
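Before diving in, here is a minimal sketch (illustrative only, not init's actual code) of the self re-exec pattern used between these stages: execv replaces the current process image in place, so the pid stays the same while main is entered again with a new stage argument.

#include <cstdio>
#include <cstring>
#include <unistd.h>

int main(int argc, char** argv) {
    if (argc > 1 && strcmp(argv[1], "second_stage") == 0) {
        printf("second stage, pid %d\n", getpid());  // same pid as before
        return 0;
    }
    printf("first stage, pid %d, re-execing...\n", getpid());
    const char* args[] = {argv[0], "second_stage", nullptr};
    execv(argv[0], const_cast<char**>(args));  // only returns on failure
    perror("execv");
    return 1;
}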
system/core/init/first_stage_init.cpp
Let's pick out the key parts of this file.
int FirstStageMain(int argc, char** argv) {
    if (REBOOT_BOOTLOADER_ON_PANIC) {
        // Key code 1
        // If init crashes, reboot to the bootloader
        InstallRebootSignalHandlers();
    }

    boot_clock::time_point start_time = boot_clock::now();

    std::vector<std::pair<std::string, int>> errors;
#define CHECKCALL(x) \
    if (x != 0) errors.emplace_back(#x " failed", errno);

    // Clear the umask.
    umask(0);

    // Key code 2
    CHECKCALL(clearenv());
    CHECKCALL(setenv("PATH", _PATH_DEFPATH, 1));
    // Get the basic filesystem setup we need put together in the initramdisk
    // on / and then we'll let the rc file figure out the rest.
    CHECKCALL(mount("tmpfs", "/dev", "tmpfs", MS_NOSUID, "mode=0755"));
    CHECKCALL(mkdir("/dev/pts", 0755));
    CHECKCALL(mkdir("/dev/socket", 0755));
    CHECKCALL(mount("devpts", "/dev/pts", "devpts", 0, NULL));
#define MAKE_STR(x) __STRING(x)
    CHECKCALL(mount("proc", "/proc", "proc", 0, "hidepid=2,gid=" MAKE_STR(AID_READPROC)));
#undef MAKE_STR
    // Don't expose the raw commandline to unprivileged processes.
    CHECKCALL(chmod("/proc/cmdline", 0440));
    gid_t groups[] = {AID_READPROC};
    CHECKCALL(setgroups(arraysize(groups), groups));
    CHECKCALL(mount("sysfs", "/sys", "sysfs", 0, NULL));
    CHECKCALL(mount("selinuxfs", "/sys/fs/selinux", "selinuxfs", 0, NULL));
    CHECKCALL(mknod("/dev/kmsg", S_IFCHR | 0600, makedev(1, 11)));
    if constexpr (WORLD_WRITABLE_KMSG) {
        CHECKCALL(mknod("/dev/kmsg_debug", S_IFCHR | 0622, makedev(1, 11)));
    }
    CHECKCALL(mknod("/dev/random", S_IFCHR | 0666, makedev(1, 8)));
    CHECKCALL(mknod("/dev/urandom", S_IFCHR | 0666, makedev(1, 9)));
    // This is needed for log wrapper, which gets called before ueventd runs.
    CHECKCALL(mknod("/dev/ptmx", S_IFCHR | 0666, makedev(5, 2)));
    CHECKCALL(mknod("/dev/null", S_IFCHR | 0666, makedev(1, 3)));
    // These below mounts are done in first stage init so that first stage mount can mount
    // subdirectories of /mnt/{vendor,product}/. Other mounts, not required by first stage mount,
    // should be done in rc files.
    // Mount staging areas for devices managed by vold
    // See storage config details at http://source.android.com/devices/storage/
    CHECKCALL(mount("tmpfs", "/mnt", "tmpfs", MS_NOEXEC | MS_NOSUID | MS_NODEV,
                    "mode=0755,uid=0,gid=1000"));
    // /mnt/vendor is used to mount vendor-specific partitions that can not be
    // part of the vendor partition, e.g. because they are mounted read-write.
    CHECKCALL(mkdir("/mnt/vendor", 0755));
    // /mnt/product is used to mount product-specific partitions that can not be
    // part of the product partition, e.g. because they are mounted read-write.
    CHECKCALL(mkdir("/mnt/product", 0755));
    // /apex is used to mount APEXes
    CHECKCALL(mount("tmpfs", "/apex", "tmpfs", MS_NOEXEC | MS_NOSUID | MS_NODEV,
                    "mode=0755,uid=0,gid=0"));
    // /debug_ramdisk is used to preserve additional files from the debug ramdisk
    CHECKCALL(mount("tmpfs", "/debug_ramdisk", "tmpfs", MS_NOEXEC | MS_NOSUID | MS_NODEV,
                    "mode=0755,uid=0,gid=0"));
#undef CHECKCALL

    SetStdioToDevNull(argv);
    // Now that tmpfs is mounted on /dev and we have /dev/kmsg, we can actually
    // talk to the outside world...
    // Initialize the kernel logging module
    InitKernelLogging(argv);

    //......

    const char* path = "/system/bin/init";
    const char* args[] = {path, "selinux_setup", nullptr};
    execv(path, const_cast<char**>(args));

    // execv() only returns if an error happened, in which case we
    // panic and never fall through this conditional.
    PLOG(FATAL) << "execv(\"" << path << "\") failed";

    return 1;
}
Key code 1
On first entry, InstallRebootSignalHandlers is called. It installs a handler (with the SA_RESTART flag) for fatal signals such as SIGABRT, SIGSEGV and SIGBUS: if init itself (pid 1) ever crashes with one of these signals, the handler calls panic(), which reboots the device to the bootloader; children forked from init simply _exit instead.
static void InstallRebootSignalHandlers() {
    // Instead of panic'ing the kernel as is the default behavior when init crashes,
    // we prefer to reboot to bootloader on development builds, as this will prevent
    // boot looping bad configurations and allow both developers and test farms to easily
    // recover.
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    sigfillset(&action.sa_mask);  // Add every signal to the handler's signal mask
    action.sa_handler = [](int signal) {
        // These signal handlers are also caught for processes forked from init, however we do not
        // want them to trigger reboot, so we directly call _exit() for children processes here.
        if (getpid() != 1) {
            _exit(signal);
        }

        // panic() reboots to bootloader
        panic();  // Reboot the system
    };
    action.sa_flags = SA_RESTART;
    sigaction(SIGABRT, &action, nullptr);
    sigaction(SIGBUS, &action, nullptr);
    sigaction(SIGFPE, &action, nullptr);
    sigaction(SIGILL, &action, nullptr);
    sigaction(SIGSEGV, &action, nullptr);
#if defined(SIGSTKFLT)
    sigaction(SIGSTKFLT, &action, nullptr);
#endif
    sigaction(SIGSYS, &action, nullptr);
    sigaction(SIGTRAP, &action, nullptr);
}
Key code 2
Next comes a long series of CHECKCALL invocations that run Linux primitives such as mount and mkdir, collecting any failures together with their errno.
If you are familiar with Linux, you know that mount attaches a filesystem: for example, when you plug a USB drive into a computer and the computer can read its data, that is a mount in action. So in this first stage, the system's main job is to mount a set of filesystems and create the basic directories and device nodes it needs.
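To see the CHECKCALL pattern in isolation, here is a minimal standalone sketch (the paths are illustrative; this is not AOSP code): each call is executed, and on failure its stringified form plus errno is recorded instead of aborting, so first-stage init can report every failure in one place later.

#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <utility>
#include <vector>

int main() {
    std::vector<std::pair<std::string, int>> errors;
// Same shape as first-stage init's macro: run x, and on failure record "#x failed" plus errno.
#define CHECKCALL(x) \
    if ((x) != 0) errors.emplace_back(#x " failed", errno);

    CHECKCALL(mkdir("/tmp/checkcall_demo", 0755));    // likely succeeds
    CHECKCALL(mkdir("/no/such/parent/dir", 0755));    // fails with ENOENT and is recorded
#undef CHECKCALL

    return errors.empty() ? 0 : 1;  // errors now holds every failed call with its errno
}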
Finally, the args array is filled with the parameter selinux_setup and init re-executes itself, entering main again. This time argc > 1, so the SetupSelinux function runs.
system/core/init/selinux.cpp
int SetupSelinux(char** argv) {
    InitKernelLogging(argv);

    if (REBOOT_BOOTLOADER_ON_PANIC) {
        InstallRebootSignalHandlers();
    }

    // Set up SELinux, loading the SELinux policy.
    SelinuxSetupKernelLogging();
    SelinuxInitialize();

    // We're in the kernel domain and want to transition to the init domain. File systems that
    // store SELabels in their xattrs, such as ext4 do not need an explicit restorecon here,
    // but other file systems do. In particular, this is needed for ramdisks such as the
    // recovery image for A/B devices.
    if (selinux_android_restorecon("/system/bin/init", 0) == -1) {
        PLOG(FATAL) << "restorecon failed of /system/bin/init failed";
    }

    const char* path = "/system/bin/init";
    const char* args[] = {path, "second_stage", nullptr};
    execv(path, const_cast<char**>(args));

    // execv() only returns if an error happened, in which case we
    // panic and never return from this function.
    PLOG(FATAL) << "execv(\"" << path << "\") failed";

    return 1;
}
The important part of this function is the end: args is loaded with "second_stage" and init's main is entered once more, which this time calls the SecondStageMain function.
system/core/init/init.cpp
int SecondStageMain(int argc, char** argv) {
    if (REBOOT_BOOTLOADER_ON_PANIC) {
        InstallRebootSignalHandlers();
    }

    SetStdioToDevNull(argv);
    InitKernelLogging(argv);
    LOG(INFO) << "init second stage started!";

    // Set init and its forked children's oom_adj.
    if (auto result = WriteFile("/proc/1/oom_score_adj", "-1000"); !result) {
        LOG(ERROR) << "Unable to write -1000 to /proc/1/oom_score_adj: " << result.error();
    }

    // Enable seccomp if global boot option was passed (otherwise it is enabled in zygote).
    GlobalSeccomp();

    // Set up a session keyring that all processes will have access to. It
    // will hold things like FBE encryption keys. No process should override
    // its session keyring.
    keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 1);

    // Indicate that booting is in progress to background fw loaders, etc.
    close(open("/dev/.booting", O_WRONLY | O_CREAT | O_CLOEXEC, 0000));

    // Initialize the system property area
    property_init();

    //......

    // Clean up our environment.
    unsetenv("INIT_STARTED_AT");
    unsetenv("INIT_SELINUX_TOOK");
    unsetenv("INIT_AVB_VERSION");
    unsetenv("INIT_FORCE_DEBUGGABLE");

    // Now set up SELinux for second stage.
    SelinuxSetupKernelLogging();
    SelabelInitialize();
    SelinuxRestoreContext();

    Epoll epoll;
    if (auto result = epoll.Open(); !result) {
        PLOG(FATAL) << result.error();
    }

    InstallSignalFdHandler(&epoll);

    property_load_boot_defaults(load_debug_prop);
    UmountDebugRamdisk();
    fs_mgr_vendor_overlay_mount_all();
    export_oem_lock_status();
    StartPropertyService(&epoll);
    MountHandler mount_handler(&epoll);
    set_usb_controller();

    // Map built-in commands to functions, e.g. the rc command "mkdir"
    // maps to the function that creates a directory
    const BuiltinFunctionMap function_map;
    Action::set_function_map(&function_map);

    if (!SetupMountNamespaces()) {
        PLOG(FATAL) << "SetupMountNamespaces failed";
    }

    subcontexts = InitializeSubcontexts();

    ActionManager& am = ActionManager::GetInstance();
    ServiceList& sm = ServiceList::GetInstance();

    // Key code 1: parse the init.rc files
    LoadBootScripts(am, sm);

    // Turning this on and letting the INFO logging be discarded adds 0.2s to
    // Nexus 9 boot time, so it's disabled by default.
    if (false) DumpState();

    // Make the GSI status available before scripts start running.
    if (android::gsi::IsGsiRunning()) {
        property_set("ro.gsid.image_running", "1");
    } else {
        property_set("ro.gsid.image_running", "0");
    }

    am.QueueBuiltinAction(SetupCgroupsAction, "SetupCgroups");
    am.QueueEventTrigger("early-init");

    // Queue an action that waits for coldboot done so we know ueventd has set up all of /dev...
    am.QueueBuiltinAction(wait_for_coldboot_done_action, "wait_for_coldboot_done");
    // ... so that we can start queuing up actions that require stuff from /dev.
    am.QueueBuiltinAction(MixHwrngIntoLinuxRngAction, "MixHwrngIntoLinuxRng");
    am.QueueBuiltinAction(SetMmapRndBitsAction, "SetMmapRndBits");
    am.QueueBuiltinAction(SetKptrRestrictAction, "SetKptrRestrict");
    Keychords keychords;
    am.QueueBuiltinAction(
            [&epoll, &keychords](const BuiltinArguments& args) -> Result<Success> {
                for (const auto& svc : ServiceList::GetInstance()) {
                    keychords.Register(svc->keycodes());
                }
                keychords.Start(&epoll, HandleKeychord);
                return Success();
            },
            "KeychordInit");
    am.QueueBuiltinAction(console_init_action, "console_init");

    // Trigger all the boot actions to get us started.
    am.QueueEventTrigger("init");

    // Starting the BoringSSL self test, for NIAP certification compliance.
    am.QueueBuiltinAction(StartBoringSslSelfTest, "StartBoringSslSelfTest");

    // Repeat mix_hwrng_into_linux_rng in case /dev/hw_random or /dev/random
    // wasn't ready immediately after wait_for_coldboot_done
    am.QueueBuiltinAction(MixHwrngIntoLinuxRngAction, "MixHwrngIntoLinuxRng");

    // Initialize binder before bringing up other system services
    am.QueueBuiltinAction(InitBinder, "InitBinder");

    // Don't mount filesystems or start core system services in charger mode.
    std::string bootmode = GetProperty("ro.bootmode", "");
    if (bootmode == "charger") {
        am.QueueEventTrigger("charger");
    } else {
        am.QueueEventTrigger("late-init");
    }

    // Run all property triggers based on current state of the properties.
    am.QueueBuiltinAction(queue_property_triggers_action, "queue_property_triggers");

    // Key code 2
    while (true) {
        // By default, sleep until something happens.
        auto epoll_timeout = std::optional<std::chrono::milliseconds>{};

        if (do_shutdown && !shutting_down) {
            do_shutdown = false;
            if (HandlePowerctlMessage(shutdown_command)) {
                shutting_down = true;
            }
        }

        if (!(waiting_for_prop || Service::is_exec_service_running())) {
            am.ExecuteOneCommand();
        }
        if (!(waiting_for_prop || Service::is_exec_service_running())) {
            if (!shutting_down) {
                auto next_process_action_time = HandleProcessActions();

                // If there's a process that needs restarting, wake up in time for that.
                if (next_process_action_time) {
                    epoll_timeout = std::chrono::ceil<std::chrono::milliseconds>(
                            *next_process_action_time - boot_clock::now());
                    if (*epoll_timeout < 0ms) epoll_timeout = 0ms;
                }
            }

            // If there's more work to do, wake up again immediately.
            if (am.HasMoreCommands()) epoll_timeout = 0ms;
        }

        if (auto result = epoll.Wait(epoll_timeout); !result) {
            LOG(ERROR) << result.error();
        }
    }

    return 0;
}
Key code 1
After the ActionManager and ServiceList instances are obtained, LoadBootScripts is called to parse the init.rc files.
static void LoadBootScripts(ActionManager& action_manager, ServiceList& service_list) {
    Parser parser = CreateParser(action_manager, service_list);

    std::string bootscript = GetProperty("ro.boot.init_rc", "");
    if (bootscript.empty()) {
        parser.ParseConfig("/init.rc");
        if (!parser.ParseConfig("/system/etc/init")) {
            late_import_paths.emplace_back("/system/etc/init");
        }
        if (!parser.ParseConfig("/product/etc/init")) {
            late_import_paths.emplace_back("/product/etc/init");
        }
        if (!parser.ParseConfig("/product_services/etc/init")) {
            late_import_paths.emplace_back("/product_services/etc/init");
        }
        if (!parser.ParseConfig("/odm/etc/init")) {
            late_import_paths.emplace_back("/odm/etc/init");
        }
        if (!parser.ParseConfig("/vendor/etc/init")) {
            late_import_paths.emplace_back("/vendor/etc/init");
        }
    } else {
        parser.ParseConfig(bootscript);
    }
}
So how is init.rc parsed? First, CreateParser builds the parser, registering a separate section parser for each of the three keywords: service, on and import.
Parser CreateParser(ActionManager& action_manager, ServiceList& service_list) {
    Parser parser;

    parser.AddSectionParser("service", std::make_unique<ServiceParser>(&service_list, subcontexts));
    parser.AddSectionParser("on", std::make_unique<ActionParser>(&action_manager, subcontexts));
    parser.AddSectionParser("import", std::make_unique<ImportParser>(&parser));

    return parser;
}
Key code 2
Here we enter an infinite while loop, somewhat like the Handler message loop. If a process has no such loop, it exits as soon as its work is done, and init obviously must never die: it has to wait for commands to arrive and execute the corresponding actions. When there is nothing to do, epoll.Wait parks the process until the next event comes in.
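As a minimal sketch of the same "sleep until something happens" pattern (plain Linux epoll here, not init's Epoll wrapper class): the loop blocks inside epoll_wait rather than spinning, which is why init's infinite loop costs essentially no CPU while idle.

#include <cstdio>
#include <sys/epoll.h>
#include <unistd.h>

int main() {
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0) { perror("epoll_create1"); return 1; }

    // Watch stdin as a stand-in for the fds init registers (property socket, signal fd, ...).
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = STDIN_FILENO;
    epoll_ctl(epfd, EPOLL_CTL_ADD, STDIN_FILENO, &ev);

    while (true) {
        epoll_event out{};
        int n = epoll_wait(epfd, &out, 1, 1000);  // park here, wake on event or 1 s timeout
        if (n > 0) {
            printf("fd %d is ready, handle the pending command\n", out.data.fd);
            break;
        }
        // n == 0: timeout with nothing to do; go back to sleep.
    }
    close(epfd);
    return 0;
}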
2.2 init Process Startup Summary
At this point the init process's main work is done. To summarize what init does:
(1) The init process is forked from the kernel's idle process, so its startup is also driven by the kernel: kernel_init runs and looks up the init binary, which on a device lives under system/bin;
(2) The init binary is built from an Android.bp rule, where we can see its srcs is main.cpp, i.e. system/core/init/main.cpp, whose entry point is the main function;
(3) On the first pass through main, FirstStageMain runs: it installs the reboot signal handlers, mounts filesystems and creates files and directories, performs other early initialization, then re-executes into main;
(4) The next pass through main runs SetupSelinux, which sets up the Linux security policy, then re-executes init's main once more;
(5) The final pass runs SecondStageMain: it initializes the property area and hooks the property service into epoll, parses the init.rc files, then enters the while loop to keep executing the commands queued from init.rc.
3 Java Framework Layer
Having walked through the C/C++ source, the first source that truly ends in .java belongs to the Zygote process, which is forked by init. In other words, Zygote is the ancestor of all Java processes.
3.1 The init.rc File
As mentioned earlier, SecondStageMain parses init.rc. So what exactly is init.rc? You can think of it as a script file, one whose statements are commands for the system to execute.
system/core/rootdir/init.rc
import /init.${ro.zygote}.rc

# Mount filesystems and start core system services.
on late-init
    # ......

    # Now we can start zygote for devices with file based encryption
    trigger zygote-start

on zygote-start && property:ro.crypto.state=unencrypted
    # A/B update verifier that marks a successful boot.
    exec_start update_verifier_nonencrypted
    start netd
    start zygote
    start zygote_secondary
From init.rc we can see that when SecondStageMain parses the file, the Zygote process gets started: this is the point where we truly enter the world of Java processes.
Looking at the script, start zygote ultimately runs the service defined in the imported init.${ro.zygote}.rc file (init.zygote32.rc, init.zygote64.rc, and so on).
3.2 Zygote Startup Flow
As the figure above shows, starting Zygote really means running app_process under system/bin, with the system deciding whether to launch the 32-bit or 64-bit variant.
system/core/rootdir/init.zygote32.rc
So when the Zygote process is started on a 32-bit system, the init.zygote32.rc file is the one that gets parsed.
service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
    class main
    priority -20
    user root
    group root readproc reserved_disk
    socket zygote stream 660 root system
    socket usap_pool_primary stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart audioserver
    onrestart restart cameraserver
    onrestart restart media
    onrestart restart netd
    onrestart restart wificond
    writepid /dev/cpuset/foreground/tasks
A quick primer on .rc syntax: the service command has the following format (see the annotated example after this list):
service <name> <pathname> [args......]
name: the name of the service;
pathname: the path to the executable binary that backs the service;
args: the arguments the service should be started with.
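Mapping the zygote entry above onto this grammar (the annotation is added here purely for illustration):

service  zygote  /system/bin/app_process  -Xzygote /system/bin --zygote --start-system-server
#        ^name   ^pathname                ^args......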
So starting the Zygote service process runs the /system/bin/app_process binary. Like init, this binary is produced by an Android.bp build rule; let's look at the relevant file.
cc_binary {
    name: "app_process",
    srcs: ["app_main.cpp"],
    multilib: {
        lib32: {
            suffix: "32",
        },
        lib64: {
            suffix: "64",
        },
    },
}
We can see that the entry point of the app_process executable is app_main.cpp: once the Zygote process is launched, execution lands in app_main.cpp. Let's look at its main function.
frameworks/base/cmds/app_process/app_main.cpp
int main(int argc, char* const argv[])
{
    if (!LOG_NDEBUG) {
        String8 argv_String;
        for (int i = 0; i < argc; ++i) {
            argv_String.append("\"");
            argv_String.append(argv[i]);
            argv_String.append("\" ");
        }
        ALOGV("app_process main with argv: %s", argv_String.string());
    }

    // Create the app runtime object
    AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));

    // Process command line arguments
    // ignore argv[0]
    argc--;
    argv++;

    // Everything up to '--' or first non '-' arg goes to the vm.
    // (Declarations below are restored from the surrounding AOSP source, as the
    // original excerpt omitted them.)
    const char* spaced_commands[] = {"-cp", "-classpath"};
    // Allow "spaced commands" to be succeeded by exactly 1 argument.
    bool known_command = false;

    int i;
    for (i = 0; i < argc; i++) {
        if (known_command == true) {
            runtime.addOption(strdup(argv[i]));
            // The static analyzer gets upset that we don't ever free the above
            // string. Since the allocation is from main, leaking it doesn't seem
            // problematic. NOLINTNEXTLINE
            ALOGV("app_process main add known option '%s'", argv[i]);
            known_command = false;
            continue;
        }

        for (int j = 0;
             j < static_cast<int>(sizeof(spaced_commands) / sizeof(spaced_commands[0]));
             ++j) {
            if (strcmp(argv[i], spaced_commands[j]) == 0) {
                known_command = true;
                ALOGV("app_process main found known command '%s'", argv[i]);
            }
        }

        if (argv[i][0] != '-') {
            break;
        }
        if (argv[i][1] == '-' && argv[i][2] == 0) {
            ++i;  // Skip --.
            break;
        }

        runtime.addOption(strdup(argv[i]));
        // The static analyzer gets upset that we don't ever free the above
        // string. Since the allocation is from main, leaking it doesn't seem
        // problematic. NOLINTNEXTLINE
        ALOGV("app_process main add option '%s'", argv[i]);
    }

    // Parse runtime arguments. Stop at first unrecognized option.
    bool zygote = false;
    bool startSystemServer = false;
    bool application = false;
    String8 niceName;
    String8 className;

    ++i;  // Skip unused "parent dir" argument.

    // Key code 1
    while (i < argc) {
        const char* arg = argv[i++];
        if (strcmp(arg, "--zygote") == 0) {
            zygote = true;
            niceName = ZYGOTE_NICE_NAME;
        } else if (strcmp(arg, "--start-system-server") == 0) {
            startSystemServer = true;
        } else if (strcmp(arg, "--application") == 0) {
            application = true;
        } else if (strncmp(arg, "--nice-name=", 12) == 0) {
            niceName.setTo(arg + 12);
        } else if (strncmp(arg, "--", 2) != 0) {
            className.setTo(arg);
            break;
        } else {
            --i;
            break;
        }
    }

    //......

    // Key code 2
    if (zygote) {
        runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
    } else if (className) {
        runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
    } else {
        fprintf(stderr, "Error: no class name or --zygote supplied.\n");
        app_usage();
        LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
    }
}
As we know, the Zygote process is the ancestor of Java processes: once Zygote starts, we formally enter the app runtime environment. You could say that the Zygote process creates the app runtime environment.
What we see here is still C++ code, so how do we get into the Java program? Notice that the very first thing main does is create an AppRuntime object.
Key code 1
Here the arguments passed when starting the Zygote process are parsed:
-Xzygote /system/bin --zygote --start-system-server
which sets the corresponding flags:
zygote = true;
niceName = ZYGOTE_NICE_NAME;
startSystemServer = true;
We know that after Zygote starts it must create the system_server process, and this is where the startSystemServer flag gets set to true.
Key code 2
Since zygote is now true, the AppRuntime object is used to start the com.android.internal.os.ZygoteInit class. Let's look at the concrete implementation.
frameworks/base/core/jni/AndroidRuntime.cpp
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
    ALOGD(">>>>>> START %s uid %d <<<<<<\n",
          className != NULL ? className : "(unknown)", getuid());

    static const String8 startSystemServer("start-system-server");

    /*
     * 'startSystemServer == true' means runtime is obsolete and not run from
     * init.rc anymore, so we print out the boot start event here.
     */
    for (size_t i = 0; i < options.size(); ++i) {
        if (options[i] == startSystemServer) {
            /* track our progress through the boot sequence */
            const int LOG_BOOT_PROGRESS_START = 3000;
            LOG_EVENT_LONG(LOG_BOOT_PROGRESS_START, ns2ms(systemTime(SYSTEM_TIME_MONOTONIC)));
        }
    }

    const char* rootDir = getenv("ANDROID_ROOT");
    if (rootDir == NULL) {
        rootDir = "/system";
        if (!hasDir("/system")) {
            LOG_FATAL("No root directory specified, and /system does not exist.");
            return;
        }
        setenv("ANDROID_ROOT", rootDir, 1);
    }

    const char* runtimeRootDir = getenv("ANDROID_RUNTIME_ROOT");
    if (runtimeRootDir == NULL) {
        LOG_FATAL("No runtime directory specified with ANDROID_RUNTIME_ROOT environment variable.");
        return;
    }

    const char* tzdataRootDir = getenv("ANDROID_TZDATA_ROOT");
    if (tzdataRootDir == NULL) {
        LOG_FATAL("No tz data directory specified with ANDROID_TZDATA_ROOT environment variable.");
        return;
    }

    //const char* kernelHack = getenv("LD_ASSUME_KERNEL");
    //ALOGD("Found LD_ASSUME_KERNEL='%s'\n", kernelHack);

    /* start the virtual machine */
    JniInvocation jni_invocation;
    jni_invocation.Init(NULL);
    JNIEnv* env;
    // Key code 1
    if (startVm(&mJavaVM, &env, zygote) != 0) {
        return;
    }
    onVmCreated(env);

    /*
     * Register android functions.
     */
    if (startReg(env) < 0) {
        ALOGE("Unable to register all android natives\n");
        return;
    }

    /*
     * We want to call main() with a String array with arguments in it.
     * At present we have two arguments, the class name and an option string.
     * Create an array to hold them.
     */
    jclass stringClass;
    jobjectArray strArray;
    jstring classNameStr;

    stringClass = env->FindClass("java/lang/String");
    assert(stringClass != NULL);
    strArray = env->NewObjectArray(options.size() + 1, stringClass, NULL);
    assert(strArray != NULL);
    classNameStr = env->NewStringUTF(className);
    assert(classNameStr != NULL);
    env->SetObjectArrayElement(strArray, 0, classNameStr);

    for (size_t i = 0; i < options.size(); ++i) {
        jstring optionsStr = env->NewStringUTF(options.itemAt(i).string());
        assert(optionsStr != NULL);
        env->SetObjectArrayElement(strArray, i + 1, optionsStr);
    }

    /*
     * Start VM. This thread becomes the main thread of the VM, and will
     * not return until the VM exits.
     */
    char* slashClassName = toSlashClassName(className != NULL ? className : "");
    jclass startClass = env->FindClass(slashClassName);
    if (startClass == NULL) {
        ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
        /* keep going */
    } else {
        jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
            "([Ljava/lang/String;)V");
        if (startMeth == NULL) {
            ALOGE("JavaVM unable to find main() in '%s'\n", className);
            /* keep going */
        } else {
            env->CallStaticVoidMethod(startClass, startMeth, strArray);

#if 0
            if (env->ExceptionCheck())
                threadExitUncaughtException(env);
#endif
        }
    }
    free(slashClassName);

    ALOGD("Shutting down VM\n");
    if (mJavaVM->DetachCurrentThread() != JNI_OK)
        ALOGW("Warning: unable to detach main thread\n");
    if (mJavaVM->DestroyJavaVM() != 0)
        ALOGW("Warning: VM did not shut down cleanly\n");
}
Our focus is the start method of AndroidRuntime, and the code here is actually quite clear. It first calls startVm which, as the name suggests, starts the virtual machine, then startReg which, as the comment says, registers the JNI functions; without that registration, Java could not call into C++ nor could C++ call back into Java.
Finally, it calls CallStaticVoidMethod:
jmethodID startMeth = env->GetStaticMethodID(startClass, "main", "([Ljava/lang/String;)V");
env->CallStaticVoidMethod(startClass, startMeth, strArray);
The JNI signature "([Ljava/lang/String;)V" describes a static method taking a String[] and returning void, so this call ultimately executes the main method of ZygoteInit.java.
3.3 Summary of Native-Layer Zygote Startup
At this point the native side of Zygote startup is complete. To recap briefly: while parsing init.rc, the init process forks the zygote process.
The system then executes the init.rc script. Running start zygote executes the imported init.zygote.rc script, with the 32-bit or 64-bit variant chosen according to the platform. Executing the service zygote command runs the app_process binary under system/bin, which enters the main function of app_main.cpp.
From there, AndroidRuntime's start method is called to execute the main method of ZygoteInit.java; before that happens, the native layer creates the VM and registers the JNI functions that make two-way calls between C++ and Java possible.
3.4 Zygote Startup in the Java Layer
As we saw, the native layer finishes Zygote startup by invoking the Java-side main method of ZygoteInit.java. Let's look at that class.
public static void main(String[] argv) {
    ZygoteServer zygoteServer = null;
    //......
    Runnable caller;
    try {
        // ......
        boolean startSystemServer = false;
        String zygoteSocketName = "zygote";
        String abiList = null;
        boolean enableLazyPreload = false;
        // As in the native layer, flags are assigned based on the arguments passed in
        for (int i = 1; i < argv.length; i++) {
            if ("start-system-server".equals(argv[i])) {
                startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(argv[i])) {
                enableLazyPreload = true;
            } else if (argv[i].startsWith(ABI_LIST_ARG)) {
                abiList = argv[i].substring(ABI_LIST_ARG.length());
            } else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
                zygoteSocketName = argv[i].substring(SOCKET_NAME_ARG.length());
            } else {
                throw new RuntimeException("Unknown command line argument: " + argv[i]);
            }
        }

        // .....

        // In some configurations, we avoid preloading resources and classes eagerly.
        // In such cases, we will preload things prior to our first fork.
        // Key code 1
        if (!enableLazyPreload) {
            bootTimingsTraceLog.traceBegin("ZygotePreload");
            EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
                    SystemClock.uptimeMillis());
            preload(bootTimingsTraceLog);
            EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
                    SystemClock.uptimeMillis());
            bootTimingsTraceLog.traceEnd(); // ZygotePreload
        }

        // Do an initial gc to clean up after startup
        bootTimingsTraceLog.traceBegin("PostZygoteInitGC");
        gcAndFinalize();
        bootTimingsTraceLog.traceEnd(); // PostZygoteInitGC

        bootTimingsTraceLog.traceEnd(); // ZygoteInit

        Zygote.initNativeState(isPrimaryZygote);

        ZygoteHooks.stopZygoteNoThreadCreation();

        // Create the server socket object
        zygoteServer = new ZygoteServer(isPrimaryZygote);

        // Key code 2
        if (startSystemServer) {
            Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);

            // {@code r == null} in the parent (zygote) process, and {@code r != null} in the
            // child (system_server) process.
            if (r != null) {
                r.run();
                return;
            }
        }

        Log.i(TAG, "Accepting command socket connections");

        // The select loop returns early in the child process after a fork and
        // loops forever in the zygote.
        caller = zygoteServer.runSelectLoop(abiList);
    } catch (Throwable ex) {
        Log.e(TAG, "System zygote died with fatal exception", ex);
        throw ex;
    } finally {
        if (zygoteServer != null) {
            zygoteServer.closeServerSocket();
        }
    }

    // We're in the child process and have exited the select loop. Proceed to execute the
    // command.
    if (caller != null) {
        caller.run();
    }
}
At the start of the method there is a ZygoteServer object. It is essentially a socket, used for communication with other processes. But since IPC is needed anyway, why not use Binder?
Have you ever wondered why a socket is used here? Suppose AMS wants to create a process: it asks Zygote to incubate one, and the new process is created via fork, which is effectively a copy of the current process. Only the calling thread survives in the child; the other threads' stacks and objects are copied as mere memory, so they no longer behave as live threads, and calling into them from the child is meaningless. Worse, if some thread in the parent held a lock at fork time, the child's copy of that lock may never be released, so any attempt to acquire it in the child deadlocks.
So if the Zygote process used Binder, whose implementation relies on a multi-threaded thread pool, there would be a real risk of such deadlocks; communicating over a socket avoids exactly this situation.
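To make the deadlock scenario concrete, here is a minimal toy sketch (illustrative only, not Android code) of fork in a multithreaded process: the worker thread that owns the mutex is not duplicated into the child, but the locked state of the mutex is, so the child blocks forever. Build with -pthread.

#include <cstdio>
#include <mutex>
#include <thread>
#include <unistd.h>

std::mutex g_lock;

int main() {
    // A worker thread takes the lock and holds it for a long critical section.
    std::thread worker([] {
        std::lock_guard<std::mutex> guard(g_lock);
        sleep(3600);
    });
    sleep(1);  // make sure the worker owns the lock before we fork

    pid_t pid = fork();
    if (pid == 0) {
        // Child: only this thread was copied; g_lock arrived in its locked
        // state and its owner no longer exists here, so lock() never returns.
        printf("child: waiting for a lock nobody can release...\n");
        g_lock.lock();
        printf("child: never printed\n");
    }
    worker.join();  // parent side; the demo keeps it simple
    return 0;
}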
Key code 1 -- resource preloading
Whether preloading happens depends on the enableLazyPreload flag, which is set from the arguments passed when Zygote is started via the init.zygoteXX.rc file. If eager preloading is in effect (i.e. lazy preload was not requested), the preload method is called.
static void preload(TimingsTraceLog bootTimingsTraceLog) {
    Log.d(TAG, "begin preload");
    bootTimingsTraceLog.traceBegin("BeginPreload");
    beginPreload();
    bootTimingsTraceLog.traceEnd(); // BeginPreload
    bootTimingsTraceLog.traceBegin("PreloadClasses");
    preloadClasses();
    bootTimingsTraceLog.traceEnd(); // PreloadClasses
    bootTimingsTraceLog.traceBegin("CacheNonBootClasspathClassLoaders");
    cacheNonBootClasspathClassLoaders();
    bootTimingsTraceLog.traceEnd(); // CacheNonBootClasspathClassLoaders
    bootTimingsTraceLog.traceBegin("PreloadResources");
    preloadResources();
    bootTimingsTraceLog.traceEnd(); // PreloadResources
    Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadAppProcessHALs");
    nativePreloadAppProcessHALs();
    Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
    Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadGraphicsDriver");
    maybePreloadGraphicsDriver();
    Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
    preloadSharedLibraries();
    preloadTextResources();
    // Ask the WebViewFactory to do any initialization that must run in the zygote process,
    // for memory sharing purposes.
    WebViewFactory.prepareWebViewInZygote();
    endPreload();
    warmUpJcaProviders();
    Log.d(TAG, "end preload");

    sPreloadComplete = true;
}
The system frequently needs certain resources at runtime, and a resource can only be used after it has been initialized, so whether to preload depends on the scenario. For example, preloadClasses preloads the classes listed in the system/etc/preloaded-classes file.
You can inspect that file yourself to see exactly which classes are included.
Similarly, preloadResources loads resources ahead of time, such as those referenced via com.android.internal.R.xx. The whole point of preloading is to speed up process startup.
Key code 2 -- forkSystemServer
Another important job is forking the system_server process. Afterwards, runSelectLoop is called, again entering an infinite loop: the Zygote process must not exit once its setup is done, because it has to stand by to receive process-creation requests and fork new processes at any time.
First, let's see how system_server is forked:
private static native int nativeForkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
        int[][] rlimits, long permittedCapabilities, long effectiveCapabilities);
I won't trace every step of this code path; it ultimately calls the native function nativeForkSystemServer to create the system_server process.
We mentioned earlier that when the native layer starts Zygote, startReg registers the JNI functions, so this function must have been registered at that point too. Let's verify.
frameworks/base/core/jni/com_android_internal_os_Zygote.cpp
Here is the function mapping table:
static const JNINativeMethod gMethods[] = {
    { "nativeForkAndSpecialize",
      "(II[II[[IILjava/lang/String;Ljava/lang/String;[I[IZLjava/lang/String;Ljava/lang/String;)I",
      (void *) com_android_internal_os_Zygote_nativeForkAndSpecialize },
    { "nativeForkSystemServer", "(II[II[[IJJ)I",
      (void *) com_android_internal_os_Zygote_nativeForkSystemServer },
    // ......
};
We can see that when the Java layer calls nativeForkSystemServer, the corresponding JNI-layer function is com_android_internal_os_Zygote_nativeForkSystemServer.
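For context, dynamic JNI registration generally looks like the sketch below (a generic illustration of env->RegisterNatives, not the exact AOSP registration helper): the gMethods table is handed to the VM so that the Java-side native declaration resolves to the C++ implementation.

static int registerZygoteNatives(JNIEnv* env) {
    // Look up the Java class that declares the native methods.
    jclass clazz = env->FindClass("com/android/internal/os/Zygote");
    if (clazz == nullptr) return JNI_ERR;
    // Bind each {name, signature, fnPtr} entry of gMethods to that class.
    return env->RegisterNatives(clazz, gMethods, sizeof(gMethods) / sizeof(gMethods[0]));
}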
static jint com_android_internal_os_Zygote_nativeForkSystemServer(
        JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
        jint runtime_flags, jobjectArray rlimits, jlong permitted_capabilities,
        jlong effective_capabilities) {
    std::vector<int> fds_to_close(MakeUsapPipeReadFDVector()),
                     fds_to_ignore(fds_to_close);

    fds_to_close.push_back(gUsapPoolSocketFD);

    if (gUsapPoolEventFD != -1) {
        fds_to_close.push_back(gUsapPoolEventFD);
        fds_to_ignore.push_back(gUsapPoolEventFD);
    }

    // Fork the child process
    pid_t pid = ForkCommon(env, true,
                           fds_to_close,
                           fds_to_ignore);

    if (pid == 0) {
        SpecializeCommon(env, uid, gid, gids, runtime_flags, rlimits,
                         permitted_capabilities, effective_capabilities,
                         MOUNT_EXTERNAL_DEFAULT, nullptr, nullptr, true,
                         false, nullptr, nullptr);
    } else if (pid > 0) {
        // The zygote process checks whether the child process has died or not.
        ALOGI("System server process %d has been created", pid);
        gSystemServerPid = pid;

        // There is a slight window that the system server process has crashed
        // but it went unnoticed because we haven't published its pid yet. So
        // we recheck here just to make sure that all is well.
        int status;
        if (waitpid(pid, &status, WNOHANG) == pid) {
            ALOGE("System server process %d has died. Restarting Zygote!", pid);
            RuntimeAbort(env, __LINE__, "System server process has died. Restarting Zygote!");
        }

        if (UsePerAppMemcg()) {
            // Assign system_server to the correct memory cgroup.
            // Not all devices mount memcg so check if it is mounted first
            // to avoid unnecessarily printing errors and denials in the logs.
            if (!SetTaskProfiles(pid, std::vector<std::string>{"SystemMemoryProcess"})) {
                ALOGE("couldn't add process %d into system memcg group", pid);
            }
        }
    }

    return pid;
}
ForkCommon is essentially a wrapper around the system's fork() call to create the process; interested readers can trace it further.
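As a minimal illustration (a toy, not the AOSP code path) of the fork() convention the function above relies on: fork returns 0 in the child and the child's pid in the parent, which is exactly how nativeForkSystemServer distinguishes the two branches.

#include <cstdio>
#include <unistd.h>

int main() {
    pid_t pid = fork();
    if (pid == 0) {
        // Child branch: this is where zygote would specialize into system_server.
        printf("child: my pid is %d\n", getpid());
    } else if (pid > 0) {
        // Parent branch: zygote records the child's pid and keeps serving requests.
        printf("parent: created child %d\n", pid);
    } else {
        perror("fork");  // fork failed
    }
    return 0;
}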
3.5 Summary of Java-Layer Zygote Startup
After system_server has been forked, the Java-side Zygote process enters its infinite loop, receiving and handling messages. In short:
(1) After the native layer creates the VM and registers the JNI functions, the main method of ZygoteInit.java runs and we enter Java code;
(2) In main, the passed-in arguments are parsed and flags are set; the flags then decide whether to preload eagerly. Preloading covers, among other things, classes and resources, and exists to make process startup fast;
(3) After preloading (if required), the server socket is created; then forkSystemServer is called to fork the system_server process, which bottoms out in the C++ layer calling the system's fork function;
(4) Finally, runSelectLoop is called on the ZygoteServer (socket), entering an infinite loop in which the socket server receives and handles requests from clients, for example AMS asking for a new process to be created.
Those are the steps from power-on all the way into a Java process. The most important takeaways are the flow chart in the first section and an understanding of the .rc files. The startup of system_server will be covered in the next article, as part of the AMS topic.
I have recently opened a WeChat official account: search for [layz4Android] or scan the QR code to follow. I post updates weekly (at no fixed time), sometimes with lucky-money 🧧 surprises, and you can also leave a message about topics you'd like to see covered.