Android Rendering Series: A Complete Walkthrough of the App Rendering Pipeline

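1 Introduction

When people talk about rendering on Android, the three passes of measure, layout and draw usually come to mind. But how does a View actually make its way onto the screen, step by step? What do CPU rendering and GPU rendering mean for an app? What are OpenGL, Vulkan and Skia? And what are SurfaceFlinger and the HAL?

With these questions in mind, let's take a close look at the whole Android drawing pipeline.

Following a layered approach, we roughly split rendering into an App layer and a SurfaceFlinger layer. We first cover what each layer does, and then tie the two together.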

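2 Key concepts

2.1 The Vsync signal

The Vsync signal is generated by the device. On a 60 Hz display the screen is scanned roughly every 16.7 ms (1000 ms / 60), and there is a short gap between two scans. In that gap the system emits a Vsync signal, which tells the app to render (Vsync-app) and tells SurfaceFlinger to swap buffers for display (Vsync-sf). As long as the app's rendering work for a frame (CPU computation plus GPU drawing) stays within about 16.7 ms, the UI looks smooth.

Notes:

If the system detects hardware support, the Vsync signal is generated by hardware; otherwise it is simulated in software. It is enough to be aware of this.
Vsync offset: Vsync-app and Vsync-sf are not delivered at exactly the same time, and Vsync-sf arrives slightly later. For app developers it is fine to treat them as roughly simultaneous.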
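
To make the frame budget concrete, here is a minimal sketch in plain Java. FrameBudget and its two methods are hypothetical helpers, not framework APIs, and the check simply adds CPU time and GPU time the way the paragraph above does.

```java
// Hypothetical helper for illustration only; not part of the Android framework.
final class FrameBudget {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    /** Interval between two Vsync signals for the given refresh rate, in nanoseconds. */
    static long frameIntervalNanos(int refreshRateHz) {
        if (refreshRateHz <= 0) {
            throw new IllegalArgumentException("refreshRateHz must be positive");
        }
        return NANOS_PER_SECOND / refreshRateHz;
    }

    /** True when CPU work plus GPU work fits into a single Vsync interval. */
    static boolean fitsInOneFrame(long cpuNanos, long gpuNanos, int refreshRateHz) {
        return cpuNanos + gpuNanos <= frameIntervalNanos(refreshRateHz);
    }

    public static void main(String[] args) {
        // 6 ms of CPU work plus 8 ms of GPU work, checked against 60 Hz and 120 Hz.
        System.out.println(frameIntervalNanos(60));
        System.out.println(fitsInOneFrame(6_000_000L, 8_000_000L, 60));
        System.out.println(fitsInOneFrame(6_000_000L, 8_000_000L, 120));
    }
}
```

The same 14 ms of work fits into a 60 Hz frame but not into a 120 Hz one, which is why higher refresh rates leave less room per frame.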

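2.2 OpenGL, Vulkan and Skia

OpenGL: a cross-platform API specification for 3D graphics. OpenGL ES is the variant designed for embedded devices such as phones.
Vulkan: serves the same purpose as OpenGL and covers both 3D and 2D, but is designed to be lower-overhead and faster.
Skia: a graphics rendering library. It can do 2D drawing entirely on its own, while 3D effects (which depend on hardware) go through OpenGL, Vulkan or Metal. It supports both CPU software rendering and GPU hardware acceleration. Android and Flutter both use it for drawing.

2.3 GPU and OpenGL

OpenGL is the specification; the GPU, together with its driver, is the device that implements it.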

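2.4 Surface and graphic buffers

A quick question first: how many windows does an Android app have?

Answer: every Activity has an application window, a Dialog gets a window of its own, a PopupWindow is a sub-window, and a Toast is a system window. In theory there is no upper limit on the number of windows.

In addition, the status bar at the top and the navigation bar at the bottom of the screen each have their own window.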

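On Android each window has one Surface, and each Surface has one BufferQueue. The reverse does not hold: a Surface does not necessarily belong to a window. SurfaceView, for example, wraps its own Surface and can be drawn from a background thread, yet it is still a View.

So one app can own several Surfaces. Each Surface corresponds to a Layer, and SurfaceFlinger manages the z-order of these Layers and displays them accordingly.

A Canvas is obtained through Surface.lockCanvas(), which eventually goes through JNI to the native Surface's lock method to obtain a graphic buffer.

The Surface dequeues a graphic buffer, the content is rendered into it, the buffer is queued back into the BufferQueue, and SurfaceFlinger is notified to consume it.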
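
The dequeue, draw, queue, consume cycle described above can be modelled in a few lines of plain Java. This is a toy under stated assumptions: ToyBufferQueue, Buffer and the four method names are hypothetical, and the real BufferQueue lives in native code and tracks more states and fences than this.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a producer/consumer buffer queue. All names are hypothetical.
final class ToyBufferQueue {
    static final class Buffer {
        final int id;
        String content = "";
        Buffer(int id) { this.id = id; }
    }

    private final Deque<Buffer> free = new ArrayDeque<>();   // available to the producer
    private final Deque<Buffer> queued = new ArrayDeque<>(); // filled, waiting for the consumer

    ToyBufferQueue(int bufferCount) {
        for (int i = 0; i < bufferCount; i++) {
            free.add(new Buffer(i));
        }
    }

    /** Producer side: take a free buffer to draw into, or null if none is free. */
    Buffer dequeue() { return free.poll(); }

    /** Producer side: hand a filled buffer over to the consumer. */
    void queue(Buffer buffer) { queued.add(buffer); }

    /** Consumer side: take the oldest filled buffer, or null if nothing is pending. */
    Buffer acquire() { return queued.poll(); }

    /** Consumer side: give a displayed buffer back so the producer can reuse it. */
    void release(Buffer buffer) {
        buffer.content = "";
        free.add(buffer);
    }

    public static void main(String[] args) {
        ToyBufferQueue bufferQueue = new ToyBufferQueue(2);
        Buffer buffer = bufferQueue.dequeue();   // the app (producer) gets a buffer
        buffer.content = "frame 1";              // ... and draws into it
        bufferQueue.queue(buffer);               // ... then queues it
        Buffer shown = bufferQueue.acquire();    // the consumer picks it up
        System.out.println("consumer shows: " + shown.content);
        bufferQueue.release(shown);              // and returns it for reuse
    }
}
```

When every buffer is either being drawn or waiting to be shown, dequeue() returns null in this model. That is the toy equivalent of the producer having to wait for a free buffer.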

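2.5 What is SurfaceFlinger?

Think of it as the coordinator between buffer data and the display device. Vsync signals, triple buffering and the composition of buffers are all under its control.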

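3 How Android rendering evolved

Knowing how Android has kept optimizing rendering over the years makes the pipeline much easier to understand.

3.1 Android 4.1

Introduced Project Butter: Vsync, triple buffering and the Choreographer.

3.2 Android 5.0

Introduced RenderThread, a thread maintained by the framework. The work of issuing drawing commands (OpenGL/Vulkan/Skia), which the main thread used to do itself, was handed over to this dedicated render thread. This takes load off the main thread, and work that is already in RenderThread's hands is not held up when the main thread stalls.

3.3 Android 7.0

Added Vulkan support. OpenGL is a 3D rendering API, and Vulkan is positioned as its successor. It handles 2D as well as 3D and is more lightweight.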

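4 What the app does (the key part)

4.1 How does an Activity get displayed?

We distinguish between the root Activity (cold start) and an ordinary Activity.

The root Activity is started by the Launcher (home screen) app, while an ordinary Activity is started by the current app. Both paths end up in Activity.startActivity().

Starting the root Activity is more involved because it requires inter-process communication. We analyze the root Activity here; the flow for an ordinary Activity is a subset of it.

App launch flow in brief:

4.1.1 What the Launcher does when an app icon is tapped

When the icon is tapped, the Launcher process calls startActivitySafely(), then Activity.startActivityForResult(), and then Instrumentation.execStartActivity():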

public ActivityResult execStartActivity(
        Context who, IBinder contextThread, IBinder token, Activity target,
        Intent intent, int requestCode, Bundle options) {

// ... code omitted
// Cross-process call to AMS through Binder.
// The call carries the caller's thread and package name along with the Intent for the target Activity.
int result = ActivityManager.getService()
    .startActivity(whoThread, who.getBasePackageName(), intent,
            intent.resolveTypeIfNeeded(who.getContentResolver()),
            token, target != null ? target.mEmbeddedID : null,
            requestCode, 0, null, options);
checkStartActivityResult(result, intent);
}

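At this point the call leaves the Launcher process and enters AMS (ActivityManagerService) inside the SystemServer process.

4.1.2 From Launcher to AMS

Inside AMS the call reaches ActivityStackSupervisor.startSpecificActivityLocked():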

void startSpecificActivityLocked(ActivityRecord r,
        boolean andResume, boolean checkConfig) {
    // Is this activity's application already running?
    ProcessRecord app = mService.getProcessRecordLocked(r.processName,
            r.info.applicationInfo.uid, true);

    r.getStack().setLaunchTime(r);

    // ① If the target process is already running, start the Activity directly.
    if (app != null && app.thread != null) {
        try {
            if ((r.info.flags&ActivityInfo.FLAG_MULTIPROCESS) == 0
                    || !"android".equals(r.info.packageName)) {
                // Don't add this if it is a platform component that is marked
                // to run in multiple processes, because this is actually
                // part of the framework so doesn't make sense to track as a
                // separate apk in the process.
                app.addPackage(r.info.packageName, r.info.applicationInfo.versionCode,
                        mService.mProcessStats);
            }
            realStartActivityLocked(r, app, andResume, checkConfig);
            return;
        } catch (RemoteException e) {
            Slog.w(TAG, "Exception when starting activity "
                    + r.intent.getComponent().flattenToShortString(), e);
        }

        // If a dead object exception was thrown -- fall through to
        // restart the application.
    }
    // ② Otherwise start the target process; this goes back into AMS.startProcessLocked().
    mService.startProcessLocked(r.processName, r.info.applicationInfo, true, 0,
            "activity", r.intent.getComponent(), false, false, true);
}

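We are starting the root Activity, so the process cannot exist yet and path ② is taken. (The Activity might well ask: have you forgotten about me? When will you come back and call realStartActivityLocked()?)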

ActivityManagerService.java


private final void startProcessLocked(ProcessRecord app, String hostingType,
        String hostingNameStr, String abiOverride, String entryPoint, String[] entryPointArgs) {

// ... code omitted
// Call Process.start() to launch the app process
startResult = Process.start(entryPoint,
        app.processName, uid, uid, gids, debugFlags, mountExternal,
        app.info.targetSdkVersion, seInfo, requiredAbi, instructionSet,
        app.info.dataDir, invokeWith, entryPointArgs);

// ... code omitted
 }

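4.1.3 AMS starts the app process

Process.start() asks the Zygote process to fork our app process, which then runs the static main() method of the Java class ActivityThread. Note that the request to Zygote is sent over a socket, not Binder.

4.1.4 Initializing the target app process

main() creates and starts the main thread's Looper. It also creates the ActivityThread object and calls ActivityThread.attach(), which binds the ApplicationThread to AMS and tells AMS that the app process is up.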

private void attach(boolean system) {

final IActivityManager mgr = ActivityManager.getService();
try {
    // Tell AMS that the app process and its main thread are initialized.
    // ApplicationThread is the bridge between the app process and the SystemServer process.
    mgr.attachApplication(mAppThread);
} catch (RemoteException ex) {
    throw ex.rethrowFromSystemServer();
}
}
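
The main-thread Looper mentioned in 4.1.4 is at heart a loop that takes tasks off a queue and runs them one by one on a single thread. The sketch below shows only that idea in plain Java. ToyLooper and its methods are hypothetical names; the real Looper and MessageQueue also handle delayed messages, sync barriers and native wake-ups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy message loop for illustration. All names are hypothetical.
final class ToyLooper {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private boolean running = true; // only touched on the looping thread

    /** Can be called from any thread; the task runs later on the looping thread. */
    void post(Runnable task) { queue.add(task); }

    /** Stops the loop after all tasks posted before this call have run. */
    void quit() { post(() -> running = false); }

    /** Blocks the calling thread and runs tasks in arrival order until quit(). */
    void loop() {
        while (running) {
            try {
                queue.take().run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        ToyLooper looper = new ToyLooper();
        List<String> log = new ArrayList<>();
        // Stand-ins for the messages that Binder callbacks post to the main thread.
        looper.post(() -> log.add("bindApplication"));
        looper.post(() -> log.add("launchActivity"));
        looper.quit();
        looper.loop();
        System.out.println(log);
    }
}
```

Because every task runs on the one looping thread, lifecycle callbacks posted from Binder threads are serialized without extra locking. This is also why a slow task on that thread delays everything queued behind it.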

4.1.5 AMS calls back into the app process's lifecycle methods

Let's look at AMS's attachApplication(IApplicationThread thread) method:

@Override
public final void attachApplication(IApplicationThread thread) {
    synchronized (this) {
        int callingPid = Binder.getCallingPid();
        final long origId = Binder.clearCallingIdentity();
        attachApplicationLocked(thread, callingPid);
        Binder.restoreCallingIdentity(origId);
    }
}

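attachApplicationLocked() does two things:

It calls back into the app process through ApplicationThread.bindApplication(), which sends a message via a Handler to ActivityThread and eventually calls Application.onCreate().
It checks whether this process has an Activity waiting to be started. At long last we are back to actually starting the Activity.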

private final boolean attachApplicationLocked(IApplicationThread thread,
        int pid) {

// ... code omitted

// 1. Call back into the app process; this eventually reaches Application.onCreate()
if (app.instr != null) {
    thread.bindApplication(processName, appInfo, providers,
            app.instr.mClass,
            profilerInfo, app.instr.mArguments,
            app.instr.mWatcher,
            app.instr.mUiAutomationConnection, testMode,
            mBinderTransactionTrackingEnabled, enableTrackAllocation,
            isRestrictedBackupMode || !normalMode, app.persistent,
            new Configuration(getGlobalConfiguration()), app.compat,
            getCommonServicesLocked(app.isolated),
            mCoreSettingsObserver.getCoreSettingsLocked(),
            buildSerial);
} else {
    thread.bindApplication(processName, appInfo, providers, null, profilerInfo,
            null, null, null, testMode,
            mBinderTransactionTrackingEnabled, enableTrackAllocation,
            isRestrictedBackupMode || !normalMode, app.persistent,
            new Configuration(getGlobalConfiguration()), app.compat,
            getCommonServicesLocked(app.isolated),
            mCoreSettingsObserver.getCoreSettingsLocked(),
            buildSerial);
}


// 2. Check whether this process has an Activity waiting to be launched
// See if the top visible activity is waiting to run in this process...
if (normalMode) {
    try {
        // This is where we get back to launching the Activity.
        if (mStackSupervisor.attachApplicationLocked(app)) {
            didSomething = true;
        }
    } catch (Exception e) {
        Slog.wtf(TAG, "Exception thrown launching activities in " + app, e);
        badApp = true;
    }
}

}

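4.1.6 AMS starts the Activity

Once the Application is up, AMS calls ApplicationThread.scheduleLaunchActivity() through Binder. This is dispatched to the main thread, where performLaunchActivity() calls the Activity's onCreate() and handleResumeActivity() then calls onResume().

4.1.7 The app process displays the Activity

Finally the Activity's DecorView is added through WindowManagerGlobal.addView(), which stores it in its list of views and calls ViewRootImpl.setView(view). That in turn calls requestLayout(), which drives the measure, layout and draw passes.

4.2 The rendering entry point

From the summary above we know that displaying an Activity ends in a call to requestLayout().

To redraw a particular View we call invalidate() instead. It triggers View.onDraw() when the next Vsync arrives, that is, on the next frame.

invalidate() invalidates the drawing cache, in other words it marks the View dirty, which is why the View gets drawn again.

4.2.1 The relationship between View and ViewRootImpl

When ViewRootImpl.setView() is called, the ViewRootImpl assigns itself as the View's parent (the mParent field of type ViewParent):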

ViewRootImpl

/**
 * We have one child
 */
public void setView(View view, WindowManager.LayoutParams attrs, View panelParentView) {
// ... code omitted

// Bind this ViewRootImpl as the view's parent
view.assignParent(this);
mAddedTouchMode = (res & WindowManagerGlobal.ADD_FLAG_IN_TOUCH_MODE) != 0;
mAppVisible = (res & WindowManagerGlobal.ADD_FLAG_APP_VISIBLE) != 0;

if (mAccessibilityManager.isEnabled()) {
    mAccessibilityInteractionConnectionManager.ensureConnection();
}
}

View.java

void assignParent(ViewParent parent) {
    if (mParent == null) {
        mParent = parent;
    } else if (parent == null) {
        mParent = null;
    } else {
        throw new RuntimeException("view " + this + " being added, but"
                + " it already has a parent");
    }
}

When View.invalidate() is called, it calls invalidateChild() on its mParent field. The call travels up the hierarchy and ends in ViewRootImpl, going from invalidate() to scheduleTraversals().

// Propagate the damage rectangle to the parent view.
final AttachInfo ai = mAttachInfo;
final ViewParent p = mParent;
if (p != null && ai != null && l < r && t < b) {
    final Rect damage = ai.mTmpInvalRect;
    damage.set(l, t, r, b);
    p.invalidateChild(this, damage);
}

So let's look at ViewRootImpl.scheduleTraversals():

ViewRootImpl.java

void invalidate() {
    mDirty.set(0, 0, mWidth, mHeight);
    if (!mWillDrawSoon) {
        scheduleTraversals();
    }
}

@Override
public void requestLayout() {
    if (!mHandlingLayoutInLayoutRequest) {
        checkThread();
        mLayoutRequested = true;
        scheduleTraversals();
    }
}

void scheduleTraversals() {
    if (!mTraversalScheduled) {
        mTraversalScheduled = true;
        mTraversalBarrier = mHandler.getLooper().getQueue().postSyncBarrier();
        // Post a callback to the Choreographer; it runs on the next frame.
        mChoreographer.postCallback(
                Choreographer.CALLBACK_TRAVERSAL, mTraversalRunnable, null);
        if (!mUnbufferedInputDispatch) {
            scheduleConsumeBatchedInput();
        }
        notifyRendererOfFramePending();
        pokeDrawLockIfNeeded();
    }
}

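ViewRootImpl.invalidate() therefore posts a callback to the Choreographer.

The Choreographer, which the ViewRootImpl obtains when it is created, listens for the system Vsync signal.

When onVsync() fires for the next frame, Choreographer.doFrame() runs and executes the callback, which calls ViewRootImpl's doTraversal() and then performTraversals(). That is where the three passes onMeasure(), onLayout() and onDraw() are carried out.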
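
One consequence of the mTraversalScheduled guard is that any number of invalidate() or requestLayout() calls within a frame results in a single traversal on the next Vsync. The sketch below models just that behaviour in plain Java. ToyViewRoot and ToyChoreographer are hypothetical stand-ins, and the sync barrier and input handling of the real code are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of scheduleTraversals() and the Choreographer. All names are hypothetical.
final class ToyViewRoot {
    /** Stand-in for Choreographer: holds callbacks until the next Vsync. */
    static final class ToyChoreographer {
        private final List<Runnable> callbacks = new ArrayList<>();

        void postCallback(Runnable callback) { callbacks.add(callback); }

        /** Simulates onVsync() leading to doFrame(): runs and clears pending callbacks. */
        void onVsync() {
            List<Runnable> pending = new ArrayList<>(callbacks);
            callbacks.clear();
            for (Runnable callback : pending) {
                callback.run();
            }
        }
    }

    private final ToyChoreographer choreographer;
    private boolean traversalScheduled;
    int traversalCount;

    ToyViewRoot(ToyChoreographer choreographer) { this.choreographer = choreographer; }

    void invalidate() { scheduleTraversals(); }

    void requestLayout() { scheduleTraversals(); }

    private void scheduleTraversals() {
        if (!traversalScheduled) {   // plays the role of mTraversalScheduled
            traversalScheduled = true;
            choreographer.postCallback(this::doTraversal);
        }
    }

    private void doTraversal() {
        traversalScheduled = false;
        traversalCount++;            // measure, layout and draw would run here
    }

    public static void main(String[] args) {
        ToyChoreographer choreographer = new ToyChoreographer();
        ToyViewRoot root = new ToyViewRoot(choreographer);
        root.invalidate();
        root.invalidate();
        root.requestLayout();
        choreographer.onVsync();
        System.out.println("traversals after first vsync: " + root.traversalCount);
    }
}
```

Three requests before the Vsync produce one traversal, and a Vsync with nothing pending produces none.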

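4.3 What does draw() on the UI thread actually do?

performMeasure() and performLayout() only use the CPU to compute each View's size and position. The real drawing starts in performDraw().

When hardware acceleration is not in use, ViewRootImpl.draw() calls drawSoftware():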

private boolean drawSoftware(Surface surface, AttachInfo attachInfo, int xoff, int yoff,
        boolean scalingRequired, Rect dirty) {

// ... code omitted
// Draw with software renderer.
final Canvas canvas;
try {
    final int left = dirty.left;
    final int top = dirty.top;
    final int right = dirty.right;
    final int bottom = dirty.bottom;
    // Get a Canvas from the Surface to start drawing
    canvas = mSurface.lockCanvas(dirty);
    // The dirty rectangle can be modified by Surface.lockCanvas()
    //noinspection ConstantConditions
    if (left != dirty.left || top != dirty.top || right != dirty.right
            || bottom != dirty.bottom) {
        attachInfo.mIgnoreDirtyState = true;
    }

    // TODO: Do this in native
    canvas.setDensity(mDensity);
} catch (Surface.OutOfResourcesException e) {
    handleOutOfResourcesException(e);
    return false;
} catch (IllegalArgumentException e) {
    Log.e(mTag, "Could not lock surface", e);
    // Don't assume this is due to out of memory, it could be
    // something else, and if it is something else then we could
    // kill stuff (or ourself) for no reason.
    mLayoutRequested = true;    // ask wm for a new surface next time.
    return false;
}
// ... code omitted

try {
    canvas.translate(-xoff, -yoff);
    if (mTranslator != null) {
        mTranslator.translateCanvas(canvas);
    }
    canvas.setScreenDensity(scalingRequired ? mNoncompatDensity : 0);
    attachInfo.mSetIgnoreDirtyState = false;
    // Pass the Canvas down and start drawing. All of this runs on the main thread, so avoid slow work here.
    mView.draw(canvas);

    drawAccessibilityFocusedDrawableIfNeeded(canvas);
} finally {
    if (!attachInfo.mSetIgnoreDirtyState) {
        // Only clear the flag if it was not set during the mView.draw() call
        attachInfo.mIgnoreDirtyState = false;
    }
}

// ... code omitted

} finally {
    try {
        // Finish drawing and hand the data to the native Surface (Surface.cpp) through JNI
        surface.unlockCanvasAndPost(canvas);
    } catch (IllegalArgumentException e) {
        Log.e(mTag, "Could not unlock surface", e);
        mLayoutRequested = true;    // ask wm for a new surface next time.
        //noinspection ReturnInsideFinallyBlock
        return false;
    }

    if (LOCAL_LOGV) {
        Log.v(mTag, "Surface " + surface + " unlockCanvasAndPost");
    }
}

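The code above does three things:

Surface.lockCanvas() binds the Canvas object to the native side. The native layer dequeues a graphic buffer and attaches the Canvas to it, so the Canvas is effectively the Java-side representative of the graphic buffer.
With the Canvas bound, the View's draw pass begins.
When drawing finishes, the graphic buffer is queued back and the Canvas is unbound.

4.4 How the drawing approach evolved

Software rendering was the default before Android 4.0, and hardware rendering has been enabled by default only since 4.0, so the third step differs in its details.

With software rendering (the only option before Android 3.0) the CPU does all the drawing, which easily makes the UI thread janky or even causes an ANR.
With hardware rendering the work is handed to the GPU through OpenGL, which takes part of the load off the CPU. At this stage it still all happens on the main thread, which has to both maintain the display list and translate it into OpenGL commands.
Since Android 5.0 a RenderThread in the native layer is dedicated to calling OpenGL. The UI thread only records and maintains the display list of View updates; once the list is updated it notifies RenderThread to draw, without taking part in the drawing details. This greatly reduces the work on the UI thread.

4.5 Passing data between the UI thread, RenderThread and SurfaceFlinger

How does the UI thread interact with RenderThread, and when is the drawn data handed to SurfaceFlinger?

onMeasure() and onLayout() compute each View's size and position; all of this is the UI thread's job.

draw() is called next, but nothing is actually drawn at this point. The drawing commands are recorded into a DisplayList, wrapped in a RenderNode, and then synced to RenderThread.
RenderThread dequeues a graphic buffer (one of the buffers shared with SurfaceFlinger) and replays the commands against the OpenGL API, so that the GPU renders them into this off-screen graphic buffer.
When rendering is done the buffer is queued back to the BufferQueue for SurfaceFlinger, which composes the Layers with the help of the hardware and finally shows the result on screen.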
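
The split between recording on the UI thread and executing on RenderThread can be sketched in plain Java as follows. ToyRecordingCanvas and ToyRenderThread are hypothetical names, drawing commands are reduced to strings, and drawFrame() blocks until the frame is done purely to keep the example short. The real UI thread does not wait for the whole frame.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Records drawing commands instead of executing them. Hypothetical name.
final class ToyRecordingCanvas {
    private final List<String> ops = new ArrayList<>();

    void drawRect(int left, int top, int right, int bottom) {
        ops.add("rect " + left + "," + top + "," + right + "," + bottom);
    }

    void drawText(String text) {
        ops.add("text " + text);
    }

    /** Ends recording and returns a read-only snapshot: the "display list". */
    List<String> finishRecording() {
        return Collections.unmodifiableList(new ArrayList<>(ops));
    }
}

// Replays a display list on its own thread. Hypothetical name.
final class ToyRenderThread {
    private final ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "ToyRenderThread");
        thread.setDaemon(true);
        return thread;
    });

    /** Executes every recorded command on the render thread and reports what ran. */
    List<String> drawFrame(List<String> displayList) {
        Future<List<String>> frame = executor.submit(() -> {
            List<String> executed = new ArrayList<>();
            String threadName = Thread.currentThread().getName();
            for (String op : displayList) {
                executed.add(threadName + " executes " + op); // GPU calls would go here
            }
            return executed;
        });
        try {
            return frame.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    void shutdown() { executor.shutdown(); }

    public static void main(String[] args) {
        ToyRecordingCanvas canvas = new ToyRecordingCanvas(); // UI thread: record only
        canvas.drawRect(0, 0, 100, 50);
        canvas.drawText("hello");
        ToyRenderThread renderThread = new ToyRenderThread();
        renderThread.drawFrame(canvas.finishRecording()).forEach(System.out::println);
        renderThread.shutdown();
    }
}
```

The point of the design is that recording is cheap and stays on the UI thread, while the expensive translation into GPU work happens elsewhere.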

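This flow is also an instance of the producer/consumer pattern:

Producer: the app, or more precisely Canvas -> Surface.

Consumer: SurfaceFlinger.

The number of graphic buffers is usually three:

One buffer is handed by SurfaceFlinger to the device for display.
One buffer is used by the app to draw the next frame.
The third comes into play when the app takes longer than one frame (about 16.7 ms) to draw. When the next Vsync arrives the other two buffers are both still in use, so the third one lets the CPU and GPU start working instead of sitting idle for that Vsync (if they idled, the frame after next would jank as well).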

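5 What SurfaceFlinger does

SurfaceFlinger is the display composition system. When an app requests a Surface, SurfaceFlinger creates a Layer, which is the basic unit it composes. So each Surface corresponds to one Layer.

Once the app has put a drawn GraphicBuffer into the BufferQueue, the rest of the work is up to SurfaceFlinger.

Notes:

The system runs many apps, and one app can have several BufferQueues. SurfaceFlinger decides when and how these queues are managed and displayed.

SurfaceFlinger asks the HAL (the Hardware Composer) whether the buffers should be composed by hardware or by SurfaceFlinger itself through OpenGL.

Finally the composed buffer is shown on the screen.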
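
To illustrate what composing Layers by z-order means, here is a deliberately tiny model in plain Java: a one-row "screen" on which layers are painted from bottom to top. ToyCompositor, Layer and compose() are hypothetical names. Real composition is carried out by the Hardware Composer or the GPU, with blending, cropping and transforms that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy compositor for illustration. All names are hypothetical.
final class ToyCompositor {
    static final class Layer {
        final String name;
        final int z;       // higher z is closer to the viewer
        final int start;   // first covered column (inclusive)
        final int end;     // last covered column (exclusive)
        final char pixel;  // what this layer draws

        Layer(String name, int z, int start, int end, char pixel) {
            this.name = name;
            this.z = z;
            this.start = start;
            this.end = end;
            this.pixel = pixel;
        }
    }

    /** Paints layers from lowest to highest z so upper layers overwrite lower ones. */
    static String compose(List<Layer> layers, int width) {
        char[] screen = new char[width];
        Arrays.fill(screen, '.');
        List<Layer> sorted = new ArrayList<>(layers);
        sorted.sort(Comparator.comparingInt((Layer layer) -> layer.z));
        for (Layer layer : sorted) {
            int from = Math.max(0, layer.start);
            int to = Math.min(width, layer.end);
            for (int x = from; x < to; x++) {
                screen[x] = layer.pixel;
            }
        }
        return new String(screen);
    }

    public static void main(String[] args) {
        List<Layer> layers = Arrays.asList(
                new Layer("dialog", 3, 5, 8, 'D'),
                new Layer("app", 1, 0, 10, 'A'),
                new Layer("statusBar", 2, 0, 3, 'S'));
        System.out.println(compose(layers, 10)); // prints SSSAADDDAA
    }
}
```

The order in which the layers are submitted does not matter; only their z values decide which one ends up visible at each position.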

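The official end-to-end rendering architecture:

Notes:

Image stream producers: the producers of rendering data. For example, an app's draw() method passes drawing commands through the Canvas to the framework's RenderThread.
Native framework: RenderThread dequeues a graphic buffer from the Surface, executes the actual rendering commands on it through OpenGL, and then queues the buffer back into the BufferQueue.
Image stream consumers: SurfaceFlinger takes buffers from the queue, works with the HAL to compose the Layers, and hands the result to the HAL for display.
HAL: the hardware abstraction layer, which puts the image data on the device's screen.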
