The zygote Process in the Android System


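After reading quite a few blog posts, zygote no longer feels unfamiliar, yet it is still fuzzy whenever it comes up. I only know that it is the ancestor of all user processes and that it performs a series of initialization steps; when asked which ones, I keep forgetting, probably because I never connected the relevant pieces of knowledge. This time we follow other blogs to walk through and record the whole flow. That will do for now; once later study lets us tie the related material together, we will come back and revise.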

What Is Zygote

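zygote means "fertilized egg". It is a process and the ancestor of all user (app) processes: when we launch an app, the zygote process is forked to create the new process, hence the name. The zygote process provides its own ART virtual machine together with a series of initialization steps, laying the groundwork for the runtime environment of every app.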

Zygote Process Startup Flow

app_main.main

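As mentioned in the discussion of the init process, init eventually forks the zygote process and calls exec(), jumping into the zygote code. That code is app_main.cpp->main. What it does is simple: it parses the argv argument list, adds AppRuntime options via runtime.addOption() according to that list, and finally calls runtime.start("com.android.internal.os.ZygoteInit", args, true).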

int main(int argc, char* const argv[])
{

    AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
    // Process command line arguments
    // ignore argv[0]
    argc--;
    argv++;

    bool known_command = false;

    int i;
    for (i = 0; i < argc; i++) {
        ... // collect the VM-related options
        runtime.addOption(strdup(argv[i]));
    }

    bool zygote = false;
    bool startSystemServer = false;
    bool application = false;
    String8 niceName;
    String8 className;

    ++i;  // Skip unused "parent dir" argument.
    while (i < argc) {
        const char* arg = argv[i++];
        if (strcmp(arg, "--zygote") == 0) {
            zygote = true;
            niceName = ZYGOTE_NICE_NAME;
        } else if (strcmp(arg, "--start-system-server") == 0) {
            startSystemServer = true;
        } else if (strcmp(arg, "--application") == 0) {
            application = true;
        } else if (strncmp(arg, "--nice-name=", 12) == 0) {
            niceName.setTo(arg + 12);
        } else if (strncmp(arg, "--", 2) != 0) {
            className.setTo(arg);
            break;
        } else {
            --i;
            break;
        }
    }

    if (!niceName.isEmpty()) {
        runtime.setArgv0(niceName.string(), true /* setProcName */);
    }

    if (zygote) {
        runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
    } else if (className) {
        runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
    } else {
        fprintf(stderr, "Error: no class name or --zygote supplied.\n");
        app_usage();
        LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
    }
}

AndroidRuntime::start

void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{

    static const String8 startSystemServer("start-system-server");

    ... // some file and environment setup omitted
    
    /* Start the virtual machine */
    JniInvocation jni_invocation;
    jni_invocation.Init(NULL);
    JNIEnv* env;
    if (startVm(&mJavaVM, &env, zygote) != 0) {
        return;
    }
    
    onVmCreated(env);

    /*
     * Register android functions via startReg(env).
     */
    if (startReg(env) < 0) {
        ALOGE("Unable to register all android natives\n");
        return;
    }

    /*
     * We want to call main() with a String array with arguments in it.
     * At present we have two arguments, the class name and an option string.
     * Create an array to hold them.
     * This is JNI at work: the class name and the option list are packed together into strArray.
     */
    jclass stringClass;
    jobjectArray strArray;
    jstring classNameStr;

    stringClass = env->FindClass("java/lang/String");
    strArray = env->NewObjectArray(options.size() + 1, stringClass, NULL);
    classNameStr = env->NewStringUTF(className);
    env->SetObjectArrayElement(strArray, 0, classNameStr);

    for (size_t i = 0; i < options.size(); ++i) {
        jstring optionsStr = env->NewStringUTF(options.itemAt(i).string());
        env->SetObjectArrayElement(strArray, i + 1, optionsStr);
    }

    /*
     * Start VM.  This thread becomes the main thread of the VM, and will
     * not return until the VM exits.
     * GetStaticMethodID: looks up ZygoteInit.main
     * CallStaticVoidMethod: invokes ZygoteInit.main
     */
    char* slashClassName = toSlashClassName(className != NULL ? className : "");
    jclass startClass = env->FindClass(slashClassName);
    if (startClass == NULL) {
        /* keep going */
    } else {
        jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
            "([Ljava/lang/String;)V");
        if (startMeth == NULL) {
            ALOGE("JavaVM unable to find main() in '%s'\n", className);
            /* keep going */
        } else {
            env->CallStaticVoidMethod(startClass, startMeth, strArray);
        }
    }
    ...
}
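  1. startVm: initializes the virtual machine; we skip the details.
  2. startReg: registers a set of Android native methods with the VM and also hooks thread creation, so that threads created through this hook from now on are attached to the JavaVM.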
/*
 * Register android native functions with the VM.
 */
/*static*/ int AndroidRuntime::startReg(JNIEnv* env)
{
    /*
     * This hook causes all future threads created in this process to be
     * attached to the JavaVM.  (This needs to go away in favor of JNI
     * Attach calls.)
     */
    androidSetCreateThreadFunc((android_create_thread_fn) javaCreateThreadEtc);

    /*
     * Every "register" function calls one or more things that return
     * a local reference (e.g. FindClass).  Because we haven't really
     * started the VM yet, they're all getting stored in the base frame
     * and never released.  Use Push/Pop to manage the storage.
     */

    env->PushLocalFrame(200);

    if (register_jni_procs(gRegJNI, NELEM(gRegJNI), env) < 0) {
        env->PopLocalFrame(NULL);
        return -1;
    }
    env->PopLocalFrame(NULL);

    return 0;
}
  3. env->CallStaticVoidMethod(startClass, startMeth, strArray): this call is where we enter ZygoteInit.main.

ZygoteInit.main

Now turn to ZygoteInit.main. Happily, at this point we have entered the Java layer.

public static void main(String argv[]) {

       // Zygote server class, used to register and listen on the socket
       ZygoteServer zygoteServer = null;

       // Mark zygote start. This ensures that thread creation will throw
       // an error.
       // Thread creation is forbidden from here on; creating one anyway throws
       ZygoteHooks.startZygoteNoThreadCreation();

       // Zygote goes into its own process group.
       try {
           Os.setpgid(0, 0);
       } catch (ErrnoException ex) {
           throw new RuntimeException("Failed to setpgid(0,0)", ex);
       }

       Runnable caller;
       try {
           // Enable DDMS, the VM monitoring and debugging service
           RuntimeInit.enableDdms();

           boolean startSystemServer = false;
           String zygoteSocketName = "zygote";
           String abiList = null;
           boolean enableLazyPreload = false;

           // Parse the arguments
           for (int i = 1; i < argv.length; i++) {
               if ("start-system-server".equals(argv[i])) {
                   startSystemServer = true;
               } else if ("--enable-lazy-preload".equals(argv[i])) {
                   enableLazyPreload = true;
               } else if (argv[i].startsWith(ABI_LIST_ARG)) {
                   abiList = argv[i].substring(ABI_LIST_ARG.length());
               } else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
                   zygoteSocketName = argv[i].substring(SOCKET_NAME_ARG.length());
               } else {
                   throw new RuntimeException("Unknown command line argument: " + argv[i]);
               }
           }

           final boolean isPrimaryZygote = zygoteSocketName.equals(Zygote.PRIMARY_SOCKET_NAME);

           // In some configurations, we avoid preloading resources and classes eagerly.
           // In such cases, we will preload things prior to our first fork.
           if (!enableLazyPreload) {
               bootTimingsTraceLog.traceBegin("ZygotePreload");
               EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
                       SystemClock.uptimeMillis());
               preload(bootTimingsTraceLog);
               EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
                       SystemClock.uptimeMillis());
               bootTimingsTraceLog.traceEnd(); // ZygotePreload
           } else {
               Zygote.resetNicePriority();
           }

           // Do an initial gc to clean up after startup
           bootTimingsTraceLog.traceBegin("PostZygoteInitGC");

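           // From the official comment: runs several special GCs to try to clean up a few
           // generations of softly- and final-reachable objects, along with any other garbage.
           // This is only useful just before a fork().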
           gcAndFinalize();
   
           // Disable tracing so that forked processes do not inherit stale tracing tags from

           Zygote.initNativeState(isPrimaryZygote);
           
           // Thread creation is allowed again from here
           ZygoteHooks.stopZygoteNoThreadCreation();

           // Initialize the zygote server that manages process requests
           zygoteServer = new ZygoteServer(isPrimaryZygote);

           // Fork the system_server child process. A non-null r means we are inside the
           // system_server child: call r.run() and return
           if (startSystemServer) {
               Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);

               // {@code r == null} in the parent (zygote) process, and {@code r != null} in the
               // child (system_server) process.
               if (r != null) {
                   r.run();
                   return;
               }
           }

       
           Log.i(TAG, "Accepting command socket connections");

           // The select loop returns early in the child process after a fork and
           // loops forever in the zygote.

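           // Only the zygote process loops forever here (it is the zygote server, after all),
           // handling the socket requests it receives.
           // A lot happens in here, including starting child processes:
           // 1. ZygoteConnection.processOneCommand is called
           // 2. processOneCommand calls Zygote.forkAndSpecialize to fork the child, closes the zygoteServer socket in the child, and returns
           // 3. Inside its infinite loop, runSelectLoop checks whether it is running in the child; if so it leaves the loop, so execution continues below
           // 4. Note the return value: it is non-null only in the child process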
           caller = zygoteServer.runSelectLoop(abiList);
       } catch (Throwable ex) {
           Log.e(TAG, "System zygote died with exception", ex);
           throw ex;
       } finally {
           if (zygoteServer != null) {
               // Cleanup: closes the server socket in the child process, or in zygote when an exception occurs
               zygoteServer.closeServerSocket();
           }
       }

       // We're in the child process and have exited the select loop. Proceed to execute the
       // command.

       // caller is non-null only in the child process; see the comments on runSelectLoop
       if (caller != null) {
           caller.run();
       }
   }

The comments already explain most of it. Below we look at several pieces of this code separately, then summarize what happened overall.
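One detail worth isolating is why the parsing loop in ZygoteInit.main starts at index 1: AndroidRuntime::start puts the class name in slot 0 of strArray and the options after it. The following is only a minimal sketch of that hand-off, not AOSP code. The class ZygoteArgsSketch and its method names are made up for illustration, and the values of ABI_LIST_ARG and SOCKET_NAME_ARG are assumptions, since the excerpt references the constants without showing them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative stand-in for the argument handling in ZygoteInit.main; not AOSP code. */
class ZygoteArgsSketch {
    // Assumed values for the constants the excerpt references but does not show.
    static final String ABI_LIST_ARG = "--abi-list=";
    static final String SOCKET_NAME_ARG = "--socket-name=";

    boolean startSystemServer = false;
    boolean enableLazyPreload = false;
    String zygoteSocketName = "zygote";
    String abiList = null;

    /** Mirrors the native side: slot 0 holds the class name, the options follow. */
    static String[] buildArgv(String className, List<String> options) {
        List<String> argv = new ArrayList<>();
        argv.add(className);
        argv.addAll(options);
        return argv.toArray(new String[0]);
    }

    /** Mirrors the Java-side loop, which skips argv[0] (the class name). */
    static ZygoteArgsSketch parse(String[] argv) {
        ZygoteArgsSketch parsed = new ZygoteArgsSketch();
        for (int i = 1; i < argv.length; i++) {
            String arg = argv[i];
            if ("start-system-server".equals(arg)) {
                parsed.startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(arg)) {
                parsed.enableLazyPreload = true;
            } else if (arg.startsWith(ABI_LIST_ARG)) {
                parsed.abiList = arg.substring(ABI_LIST_ARG.length());
            } else if (arg.startsWith(SOCKET_NAME_ARG)) {
                parsed.zygoteSocketName = arg.substring(SOCKET_NAME_ARG.length());
            } else {
                throw new RuntimeException("Unknown command line argument: " + arg);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        String[] argv = buildArgv("com.android.internal.os.ZygoteInit",
                Arrays.asList("start-system-server", "--abi-list=arm64-v8a"));
        ZygoteArgsSketch parsed = parse(argv);
        System.out.println(parsed.startSystemServer + " " + parsed.abiList
                + " " + parsed.zygoteSocketName);
    }
}
```

Because slot 0 is skipped, the class name never reaches the "unknown argument" branch; any other unrecognized option aborts startup with an exception.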

SystemServer

if (startSystemServer) {
                Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);

                // {@code r == null} in the parent (zygote) process, and {@code r != null} in the
                // child (system_server) process.
                if (r != null) {
                    r.run();
                    return;
                }
            }

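ZygoteInit.main calls forkSystemServer, which forks the system_server child process. When the method returns, a non-null Runnable means we are inside the system_server process, and r.run() is called. Let's step into forkSystemServer to see what exactly it does.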

private static Runnable forkSystemServer(String abiList, String socketName,
            ZygoteServer zygoteServer) {
        
        /* Hardcoded command line to start the system server */
        String args[] = {
                "--setuid=1000",
                "--setgid=1000",
                "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,"
                        + "1024,1032,1065,3001,3002,3003,3006,3007,3009,3010",
                "--capabilities=" + capabilities + "," + capabilities,
                "--nice-name=system_server",
                "--runtime-args",
                "--target-sdk-version=" + VMRuntime.SDK_VERSION_CUR_DEVELOPMENT,
                "com.android.server.SystemServer",
        };
        ZygoteArguments parsedArgs = null;

        int pid;

        try {
            parsedArgs = new ZygoteArguments(args);
            Zygote.applyDebuggerSystemProperty(parsedArgs);
            Zygote.applyInvokeWithSystemProperty(parsedArgs);

            /* Request to fork the system server process */
            pid = Zygote.forkSystemServer(
                    parsedArgs.mUid, parsedArgs.mGid,
                    parsedArgs.mGids,
                    parsedArgs.mRuntimeFlags,
                    null,
                    parsedArgs.mPermittedCapabilities,
                    parsedArgs.mEffectiveCapabilities);
        } catch (IllegalArgumentException ex) {
            throw new RuntimeException(ex);
        }

        /* For child process */
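        // pid == 0 means we are in the child process
        if (pid == 0) {
            // abiList may indicate that a second zygote process exists (for example when
            // the device supports more than one ABI); we ignore this detail here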
            if (hasSecondZygote(abiList)) {
                waitForSecondaryZygote(socketName);
            }
            
            // Close the server socket; the child process has no use for it
            zygoteServer.closeServerSocket();
            
            // A lot is nested in here. The main goal is to let us jump into the SystemServer code:
            // the returned Runnable carries the entry point of system server.
            return handleSystemServerProcess(parsedArgs);
        }

        return null;
    }

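The arguments system server needs are given as a hardcoded command line. Zygote.forkSystemServer then forks the system_server child process, and in the child a Runnable is returned so that ZygoteInit can invoke it. The parent process gets null and takes the other branch.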
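The "non-null Runnable in the child, null in the parent" convention can be looked at on its own. Plain Java cannot fork, so the sketch below takes the pid as a parameter that plays the role of the value returned by Zygote.forkSystemServer. The class ForkResultSketch, its methods, and the strings it records are all hypothetical and only show the control flow.

```java
/** Illustrative only: models "child gets a Runnable, parent gets null". */
class ForkResultSketch {
    /**
     * pid stands in for the fork result: 0 in the child process,
     * the child's pid (greater than 0) in the parent.
     */
    static Runnable afterFork(int pid, StringBuilder log) {
        if (pid == 0) {
            // Child: hand back the work the new process should perform.
            return () -> log.append("SystemServer.main");
        }
        // Parent (zygote): nothing to run here, it carries on to the select loop.
        return null;
    }

    /** Mirrors the caller in ZygoteInit.main and reports which path was taken. */
    static String zygoteMain(int pid) {
        StringBuilder log = new StringBuilder();
        Runnable r = afterFork(pid, log);
        if (r != null) {
            r.run();            // child path: run the entry point, then return
            return log.toString();
        }
        log.append("runSelectLoop"); // parent path
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(zygoteMain(0));
        System.out.println(zygoteMain(1234));
    }
}
```

Returning the work as a Runnable rather than running it deep inside the fork helper lets the caller unwind its setup stack frames before the new process starts its real job.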

zygoteServer.runSelectLoop(abiList)

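A brief description of what happens here: the zygote process enters a loop and waits. Waiting for what? For socket data; for our purposes, it is waiting for a command line that asks it to start a child process. Once such a command arrives, zygote forks a child; the child leaves runSelectLoop, closes the zygote server socket, and goes on to run the child's own code. In other words, this is where a new app starts.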

 /**
     * Runs the zygote process's select loop. Accepts new connections as
     * they happen, and reads commands from connections one spawn-request's
     * worth at a time.
     */
    Runnable runSelectLoop(String abiList) {
        ArrayList<FileDescriptor> socketFDs = new ArrayList<FileDescriptor>();
        ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();

        socketFDs.add(mZygoteSocket.getFileDescriptor());
        peers.add(null);

        while (true) {
            ...
                        ZygoteConnection connection = peers.get(pollIndex);
                        final Runnable command = connection.processOneCommand(this);

                        if (mIsForkChild) {
                            return command;
                        } else {
                            if (connection.isClosedByPeer()) {
                                connection.closeSocket();
                                peers.remove(pollIndex);
                                socketFDs.remove(pollIndex);
                            }
                        }
                    }
    }

We have left out a large amount of code here and kept only the key line, connection.processOneCommand(this);
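To make the loop's exit condition concrete, here is a single-process simulation. It is only a sketch under loud assumptions: there is no real socket or fork, the queue of strings stands in for incoming requests, and the childFor parameter is a hypothetical switch that picks the one request we choose to follow from the child's point of view. The names SelectLoopSketch and handledInParent are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

/** Illustrative only: a single-process simulation of the select loop's control flow. */
class SelectLoopSketch {
    private boolean isForkChild = false; // plays the role of mIsForkChild
    int handledInParent = 0;

    /** simulatedChild is a made-up switch standing in for fork() returning 0. */
    Runnable processOneCommand(String command, boolean simulatedChild, StringBuilder out) {
        if (simulatedChild) {
            isForkChild = true; // corresponds to zygoteServer.setForkChild()
            return () -> out.append("started ").append(command);
        }
        handledInParent++;      // parent side: bookkeeping only
        return null;
    }

    /** The real loop is while (true) and blocks waiting for data; this one drains a queue. */
    Runnable runSelectLoop(Queue<String> requests, String childFor, StringBuilder out) {
        while (!requests.isEmpty()) {
            String command = requests.poll();
            Runnable r = processOneCommand(command, command.equals(childFor), out);
            if (isForkChild) {
                return r;       // the child leaves the loop with its work
            }
        }
        return null;            // the parent never gets here in the real loop
    }

    public static void main(String[] args) {
        Queue<String> requests = new ArrayDeque<>(Arrays.asList("app.one", "app.two"));
        StringBuilder out = new StringBuilder();
        Runnable caller = new SelectLoopSketch().runSelectLoop(requests, "app.two", out);
        if (caller != null) {
            caller.run();
        }
        System.out.println(out);
    }
}
```

Seen from the parent, every request ends with a null result and another trip around the loop; seen from the child, the first thing that happens after the fork is an early return carrying the Runnable to execute.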

connection.processOneCommand

Runnable processOneCommand(ZygoteServer zygoteServer) {
        String args[];
        ZygoteArguments parsedArgs = null;
        FileDescriptor[] descriptors;

        args = Zygote.readArgumentList(mSocketReader);
        descriptors = mSocket.getAncillaryFileDescriptors();
       
        int pid = -1;
        FileDescriptor childPipeFd = null;
        FileDescriptor serverPipeFd = null;

        parsedArgs = new ZygoteArguments(args);
        
        // Apply the security policies
        Zygote.applyUidSecurityPolicy(parsedArgs, peer);
        Zygote.applyInvokeWithSecurityPolicy(parsedArgs, peer);
        
        // More settings; no need to look closely at these
        Zygote.applyDebuggerSystemProperty(parsedArgs);
        Zygote.applyInvokeWithSystemProperty(parsedArgs);

        
        pid = Zygote.forkAndSpecialize(parsedArgs.mUid, parsedArgs.mGid, parsedArgs.mGids,
                parsedArgs.mRuntimeFlags, rlimits, parsedArgs.mMountExternal, parsedArgs.mSeInfo,
                parsedArgs.mNiceName, fdsToClose, fdsToIgnore, parsedArgs.mStartChildZygote,
                parsedArgs.mInstructionSet, parsedArgs.mAppDataDir, parsedArgs.mTargetSdkVersion);

        try {
            if (pid == 0) {
                // in child
                zygoteServer.setForkChild();

                zygoteServer.closeServerSocket();
                IoUtils.closeQuietly(serverPipeFd);
                serverPipeFd = null;

                return handleChildProc(parsedArgs, descriptors, childPipeFd,
                        parsedArgs.mStartChildZygote);
            } else {
                // In the parent. A pid < 0 indicates a failure and will be handled in
                // handleParentProc.
                IoUtils.closeQuietly(childPipeFd);
                childPipeFd = null;
                handleParentProc(pid, descriptors, serverPipeFd);
                return null;
            }
        } finally {
            IoUtils.closeQuietly(childPipeFd);
            IoUtils.closeQuietly(serverPipeFd);
        }
    }

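The point of interest here is Zygote.forkAndSpecialize, which forks the process. The fork itself is done in a native method; all we need to know is that it produces a child process. In the child, cleanup such as closing the socket is performed, and then handleChildProc runs. We are now inside the new child process, that is, our app process.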

handleChildProc

private Runnable handleChildProc(ZygoteArguments parsedArgs, FileDescriptor[] descriptors,
            FileDescriptor pipeFd, boolean isZygote) {
        /**
         * By the time we get here, the native code has closed the two actual Zygote
         * socket connections, and substituted /dev/null in their place.  The LocalSocket
         * objects still need to be closed properly.
         */

        closeSocket();
        
        if (!isZygote) {
            // This is the branch taken for an ordinary app process
            return ZygoteInit.zygoteInit(parsedArgs.mTargetSdkVersion,
                    parsedArgs.mRemainingArgs, null /* classLoader */);
        } else {
            return ZygoteInit.childZygoteInit(parsedArgs.mTargetSdkVersion,
                    parsedArgs.mRemainingArgs, null /* classLoader */);
        }
    }

The code is simple; the part we care about is ZygoteInit.zygoteInit.

ZygoteInit.zygoteInit

/**
     * The main function called when started through the zygote process. This could be unified with
     * main(), if the native code in nativeFinishInit() were rationalized with Zygote startup.<p>
     *
     * Current recognized args:
     * <ul>
     *   <li> <code> [--] &lt;start class name&gt;  &lt;args&gt;
     * </ul>
     *
     * @param targetSdkVersion target SDK version
     * @param argv arg strings
     */
    public static final Runnable zygoteInit(int targetSdkVersion, String[] argv,
            ClassLoader classLoader) {
        RuntimeInit.commonInit(); // common initialization
        ZygoteInit.nativeZygoteInit(); // native-side zygote initialization
        return RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
    }

Next, let's see what RuntimeInit.applicationInit does.

RuntimeInit.applicationInit

protected static Runnable applicationInit(int targetSdkVersion, String[] argv,
            ClassLoader classLoader) {
        
        final Arguments args = new Arguments(argv);

        // Remaining arguments are passed to the start class's static main
        return findStaticMain(args.startClass, args.startArgs, classLoader);
    }
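The Arguments class is not shown in the excerpt, but the Javadoc on zygoteInit documents the accepted format as `[--] <start class name> <args>`. Assuming that format, the split into startClass and startArgs could look roughly like the sketch below. StartArgsSketch is a hypothetical stand-in written for this article, not the real RuntimeInit.Arguments, and com.example.Entry is a placeholder class name.

```java
import java.util.Arrays;

/** Illustrative stand-in for an argument splitter following "[--] <start class name> <args>". */
class StartArgsSketch {
    final String startClass;
    final String[] startArgs;

    StartArgsSketch(String[] args) {
        int cur = 0;
        // Skip leading "--xxx" options; a bare "--" ends the option list explicitly.
        for (; cur < args.length; cur++) {
            String arg = args[cur];
            if (arg.equals("--")) {
                cur++;
                break;
            } else if (!arg.startsWith("--")) {
                break;
            }
        }
        if (cur == args.length) {
            throw new IllegalArgumentException("Missing class name argument");
        }
        // The first non-option entry is the class whose static main will be run.
        startClass = args[cur++];
        // Everything after it is passed to that main unchanged.
        startArgs = Arrays.copyOfRange(args, cur, args.length);
    }

    public static void main(String[] args) {
        StartArgsSketch parsed =
                new StartArgsSketch(new String[] {"--", "com.example.Entry", "seq=1"});
        System.out.println(parsed.startClass + " " + Arrays.toString(parsed.startArgs));
    }
}
```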

findStaticMain

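This code needs little explanation: it looks up the static main method of the class name passed in and wraps it in a Runnable, which is run later. One question remains: where does the class name come from? Following the call chain, we end up back in runSelectLoop. A socket connection is accepted there and data is read from it; that data is the command line, and its final entry specifies which class to invoke.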

protected static Runnable findStaticMain(String className, String[] argv,
            ClassLoader classLoader) {
        Class<?> cl;

        try {
            cl = Class.forName(className, true, classLoader);
        } catch (ClassNotFoundException ex) {
            throw new RuntimeException(
                    "Missing class when invoking static main " + className,
                    ex);
        }

        Method m;
        try {
            m = cl.getMethod("main", new Class[] { String[].class });
        } catch (NoSuchMethodException ex) {
            throw new RuntimeException(
                    "Missing static main on " + className, ex);
        } catch (SecurityException ex) {
            throw new RuntimeException(
                    "Problem getting static main on " + className, ex);
        }

        int modifiers = m.getModifiers();
        if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException(
                    "Main method is not public and static on " + className);
        }

        /*
         * This throw gets caught in ZygoteInit.main(), which responds
         * by invoking the exception's run() method. This arrangement
         * clears up all the stack frames that were required in setting
         * up the process.
         */
        return new MethodAndArgsCaller(m, argv);
    }
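The lookup-then-defer idea can be tried outside Android with plain reflection. The following is a minimal sketch, not the AOSP implementation: StaticMainSketch and FakeEntry are hypothetical names, and a lambda takes the place of the MethodAndArgsCaller class, whose body is not shown in the excerpt.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Illustrative only: resolve a static main reflectively and defer the call via a Runnable. */
class StaticMainSketch {
    /** Hypothetical entry class standing in for whatever class name arrives in the request. */
    public static class FakeEntry {
        static String[] received;

        public static void main(String[] args) {
            received = args;
        }
    }

    static Runnable findStaticMain(String className, String[] argv, ClassLoader loader) {
        final Method m;
        try {
            Class<?> cl = Class.forName(className, true, loader);
            m = cl.getMethod("main", String[].class);
        } catch (ClassNotFoundException | NoSuchMethodException ex) {
            throw new RuntimeException("Cannot resolve static main on " + className, ex);
        }
        int modifiers = m.getModifiers();
        if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException("Main method is not public and static on " + className);
        }
        // Nothing is invoked yet; the caller decides when to run the entry point.
        return () -> {
            try {
                // Wrap argv so it is passed as one String[] parameter, not spread as varargs.
                m.invoke(null, new Object[] {argv});
            } catch (IllegalAccessException | InvocationTargetException ex) {
                throw new RuntimeException(ex);
            }
        };
    }

    public static void main(String[] args) {
        Runnable caller = findStaticMain(FakeEntry.class.getName(),
                new String[] {"hello"}, StaticMainSketch.class.getClassLoader());
        caller.run();
        System.out.println(FakeEntry.received[0]);
    }
}
```

Deferring the call means the entry point runs only after the caller has returned from the select loop, with the setup frames already popped off the stack.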

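OK, by this point we know that the zygote process keeps waiting for command-line data arriving at zygoteServer. But who sends the socket request that asks for an app to be opened? And which class does the class name in it refer to? That class name determines where execution begins when an app is opened, so we really need to know which class it is. We can leave these questions for the system server process to answer.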

The Android source code is fairly clear and easy to follow.

As before, let's summarize what happens on the way through ZygoteInit:

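  1. app_main.cpp::main creates the AppRuntime and calls its start() method;
  2. AndroidRuntime's startVm() creates the virtual machine, then startReg() registers the JNI functions;
  3. ZygoteInit.main() is invoked through JNI, the first entry into the Java world;
  4. zygoteServer sets up the socket channel; zygote acts as the server side and responds to client requests;
  5. with most of its work done, zygote forks the system_server process via forkSystemServer(); system_server is the process that hosts the upper-layer framework;
  6. its job done, zygote calls runSelectLoop() and stands by; when a request to create a new process arrives, it wakes up immediately and does the corresponding work.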

A sequence diagram is missing here. Drawing one is tedious, so we simply borrow someone else's.

[Figure: sequence diagram of the zygote startup flow]