
Android Source Code Series: Demystifying BlockCanary

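What is BlockCanary?

BlockCanary is a performance monitoring component written by the Chinese developer MarkZhai. It monitors main-thread operations transparently and outputs enough information to help developers analyze and locate a problem and optimize the app quickly.

The figure below is the official diagram illustrating how it works:

[Image: official BlockCanary principle diagram]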

Introduction

GitHub repository: blockcanary

Features

  • Non-intrusive
  • Easy to use
  • Real-time monitoring
  • Provides detailed stack and memory information

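Android rendering mechanism

On a 60 Hz display, the Android system emits a VSYNC signal roughly every 16 ms (1000 ms / 60 ≈ 16.67 ms), and each signal triggers a UI render pass. If every pass finishes in time, the app reaches the 60 fps needed for smooth visuals. In practice this means most of the work for a frame has to finish within 16 ms; anything longer may cause dropped frames.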
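The 16 ms figure is simply the frame period at 60 Hz. As a rough illustration, the budget and the number of missed frames can be computed as below. The class and method names are made up for this sketch and are not part of Android or BlockCanary, and the calculation assumes the work starts exactly at a VSYNC.

```java
final class FrameBudget {
    /** Frame period in milliseconds for a given refresh rate. */
    static double budgetMs(int refreshRateHz) {
        return 1000.0 / refreshRateHz;
    }

    /**
     * Number of frames missed by a piece of main-thread work,
     * assuming the work starts exactly at a VSYNC.
     */
    static int droppedFrames(double workMs, int refreshRateHz) {
        double framesOccupied = Math.ceil(workMs / budgetMs(refreshRateHz));
        // The first occupied frame is the one the work was meant for.
        return (int) Math.max(0, framesOccupied - 1);
    }
}
```

Under these assumptions, 10 ms of work at 60 Hz fits inside one frame, while 40 ms of work spans three frame periods and therefore misses two frames.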

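This article focuses on how BlockCanary works. For the details of the rendering mechanism and how to optimize it, the following article is recommended:

Android Performance Optimization: Rendering Optimization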

How do you use BlockCanary?

1. Add the library in Gradle

 debugImplementation 'com.github.markzhai:blockcanary-android:1.5.0'
 releaseImplementation 'com.github.markzhai:blockcanary-no-op:1.5.0'
 

2. Create a custom Application and initialize BlockCanary in onCreate

public class ExampleApplication extends Application {

    @Override public void onCreate() {
        super.onCreate();
        BlockCanary.install(this, new BlockCanaryContext()).start();
    }
}

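What does BlockCanary's core execution flow look like?

BlockCanary's core idea is to install a custom Printer on the main thread's Looper (the main Looper that ActivityThread runs). The main Looper calls that Printer once before and once after it dispatches each message, so the difference between the two call times is the time spent handling the message. If that difference exceeds the configured threshold, BlockCanary posts a notification carrying the recorded stack and CPU information.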
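To make the mechanism concrete, here is a minimal plain-Java sketch of the idea that runs without Android. Everything in it is a stand-in: Printer mimics android.util.Printer, loop mimics the relevant part of Looper.loop(), and BlockPrinter is a simplified, hypothetical counterpart of LooperMonitor. The clock is injected only so that the behaviour is easy to verify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

final class PrinterHookSketch {
    /** Stand-in for android.util.Printer. */
    interface Printer {
        void println(String x);
    }

    /** Alternates between "before dispatch" and "after dispatch" on each call. */
    static final class BlockPrinter implements Printer {
        private final long thresholdMs;
        private final LongSupplier clock;
        private boolean started;
        private long startMs;
        /** Durations (ms) of every dispatch that exceeded the threshold. */
        final List<Long> blocks = new ArrayList<>();

        BlockPrinter(long thresholdMs, LongSupplier clock) {
            this.thresholdMs = thresholdMs;
            this.clock = clock;
        }

        @Override
        public void println(String x) {
            if (!started) {
                // First call of the pair: the message is about to be dispatched.
                startMs = clock.getAsLong();
                started = true;
            } else {
                // Second call of the pair: the message has been handled.
                started = false;
                long cost = clock.getAsLong() - startMs;
                if (cost > thresholdMs) {
                    blocks.add(cost);
                }
            }
        }
    }

    /** Stand-in for the dispatch loop: log, run the message, log again. */
    static void loop(List<Runnable> messages, Printer logging) {
        for (Runnable message : messages) {
            if (logging != null) logging.println(">>>>> Dispatching");
            message.run();
            if (logging != null) logging.println("<<<<< Finished");
        }
    }
}
```

With a 200 ms threshold, a message that takes 50 ms passes unnoticed and one that takes 300 ms is recorded as a block. Note that the printer relies purely on call order, not on the text it receives, which is also how the println implementation shown later in this article behaves.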

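Key classes

Class: Description
BlockCanary: Facade class; provides initialization and starts and stops monitoring.
BlockCanaryContext: Configuration context; configures the id, current network information, block threshold, log path, and so on.
BlockCanaryInternals: Core scheduling class. It holds monitor (the Printer set on the main Looper), stackSampler (the stack sampler), cpuSampler (the CPU sampler), and mInterceptorChain (the registered interceptors), and it implements the onBlockEvent callback that dispatches to the interceptors.
LooperMonitor: Implements the Printer interface and is set on the main Looper. Its println override measures the time between the calls made before and after a dispatch, and it starts and stops sampling in stackSampler and cpuSampler.
StackSampler: Collects the thread's stack via mCurrentThread.getStackTrace(), which returns the thread's StackTraceElement array, and stores each sample in a LinkedHashMap keyed by timestamp.
CpuSampler: Collects CPU information by reading /proc/stat and stores each sample in a LinkedHashMap keyed by timestamp.
DisplayService: Implements the BlockInterceptor interface; its onBlock callback posts a notification.
DisplayActivity: Activity that displays the recorded block information.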

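Code execution flow

BlockCanary's core flow consists of three steps.

1. init: initialization

2. monitor: measure the main Looper's dispatch time and post a notification

3. dump: collect the thread stack and CPU information

Here is the overall flow chart first; it is best read alongside the source code.

[Image: overall flow chart]

The sections below walk through the source code for these three steps.

1. init

Following the usage in Application, we start with the install method: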

    public static BlockCanary install(Context context, BlockCanaryContext blockCanaryContext) {
        // BlockCanaryContext.init stores the application context and the user-supplied configuration
        BlockCanaryContext.init(context, blockCanaryContext);
        // setEnabled enables or disables DisplayActivity according to displayNotification()
        setEnabled(context, DisplayActivity.class, BlockCanaryContext.get().displayNotification());
        return get();
    }
    

Next, the get method is implemented as follows:

    // Lazily creates the singleton BlockCanary instance (double-checked locking)
    public static BlockCanary get() {
        if (sInstance == null) {
            synchronized (BlockCanary.class) {
                if (sInstance == null) {
                    sInstance = new BlockCanary();
                }
            }
        }
        return sInstance;
    }
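A side note on the pattern used by get(): under the Java memory model, double-checked locking is only safe when the instance field is declared volatile. The excerpt above does not show how sInstance is declared, so the following is a generic sketch of the safe form rather than a description of BlockCanary's code; the class name LazySingleton is hypothetical.

```java
final class LazySingleton {
    // volatile guarantees that a thread seeing a non-null reference
    // also sees a fully constructed object.
    private static volatile LazySingleton sInstance;

    private LazySingleton() {
    }

    static LazySingleton get() {
        LazySingleton local = sInstance;      // single volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = sInstance;            // re-check under the lock
                if (local == null) {
                    local = new LazySingleton();
                    sInstance = local;
                }
            }
        }
        return local;
    }
}
```

Copying the field into a local variable is optional; it just avoids a second volatile read once the instance exists.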

Next, the BlockCanary constructor:

    private BlockCanary() {
        // Initialize the BlockCanaryInternals scheduling class
        BlockCanaryInternals.setContext(BlockCanaryContext.get());
        mBlockCanaryCore = BlockCanaryInternals.getInstance();
        // Register BlockCanaryContext as an interceptor (chain of responsibility);
        // its default BlockInterceptor implementation is empty
        mBlockCanaryCore.addBlockInterceptor(BlockCanaryContext.get());
        if (!BlockCanaryContext.get().displayNotification()) {
            return;
        }
        // DisplayService is added only when notifications are enabled;
        // when a block occurs, it posts the notification
        mBlockCanaryCore.addBlockInterceptor(new DisplayService());
    }


Next, the BlockCanaryInternals constructor:

    public BlockCanaryInternals() {
        // Initialize the stack sampler
        stackSampler = new StackSampler(
                Looper.getMainLooper().getThread(),
                sContext.provideDumpInterval());
        // Initialize the CPU sampler
        cpuSampler = new CpuSampler(sContext.provideDumpInterval());

        // Create the LooperMonitor and implement onBlockEvent, which is called once the threshold is exceeded
        setMonitor(new LooperMonitor(new LooperMonitor.BlockListener() {

            @Override
            public void onBlockEvent(long realTimeStart, long realTimeEnd,
                                     long threadTimeStart, long threadTimeEnd) {
                ArrayList<String> threadStackEntries = stackSampler
                        .getThreadStackEntries(realTimeStart, realTimeEnd);
                if (!threadStackEntries.isEmpty()) {
                    BlockInfo blockInfo = BlockInfo.newInstance()
                            .setMainThreadTimeCost(realTimeStart, realTimeEnd, threadTimeStart, threadTimeEnd)
                            .setCpuBusyFlag(cpuSampler.isCpuBusy(realTimeStart, realTimeEnd))
                            .setRecentCpuRate(cpuSampler.getCpuRateInfo())
                            .setThreadStackEntries(threadStackEntries)
                            .flushString();
                    LogWriter.save(blockInfo.toString());

                    if (mInterceptorChain.size() != 0) {
                        for (BlockInterceptor interceptor : mInterceptorChain) {
                            interceptor.onBlock(getContext().provideContext(), blockInfo);
                        }
                    }
                }
            }
        }, getContext().provideBlockThreshold(), getContext().stopWhenDebugging()));

        LogWriter.cleanObsolete();
    }

2. monitor

First, look at how the system's Looper.loop() method uses the Printer:

   for (;;) {
            Message msg = queue.next(); // might block
            if (msg == null) {
                // No message indicates that the message queue is quitting.
                return;
            }

            // Before dispatchMessage, call the Printer's println method
            final Printer logging = me.mLogging;
            if (logging != null) {
                logging.println(">>>>> Dispatching to " + msg.target + " " +
                        msg.callback + ": " + msg.what);
            }

            final long traceTag = me.mTraceTag;
            long slowDispatchThresholdMs = me.mSlowDispatchThresholdMs;
            long slowDeliveryThresholdMs = me.mSlowDeliveryThresholdMs;
            if (thresholdOverride > 0) {
                slowDispatchThresholdMs = thresholdOverride;
                slowDeliveryThresholdMs = thresholdOverride;
            }
            final boolean logSlowDelivery = (slowDeliveryThresholdMs > 0) && (msg.when > 0);
            final boolean logSlowDispatch = (slowDispatchThresholdMs > 0);

            final boolean needStartTime = logSlowDelivery || logSlowDispatch;
            final boolean needEndTime = logSlowDispatch;

            if (traceTag != 0 && Trace.isTagEnabled(traceTag)) {
                Trace.traceBegin(traceTag, msg.target.getTraceName(msg));
            }

            final long dispatchStart = needStartTime ? SystemClock.uptimeMillis() : 0;
            final long dispatchEnd;
            try {
                msg.target.dispatchMessage(msg);
                dispatchEnd = needEndTime ? SystemClock.uptimeMillis() : 0;
            } finally {
                if (traceTag != 0) {
                    Trace.traceEnd(traceTag);
                }
            }
            if (logSlowDelivery) {
                if (slowDeliveryDetected) {
                    if ((dispatchStart - msg.when) <= 10) {
                        Slog.w(TAG, "Drained");
                        slowDeliveryDetected = false;
                    }
                } else {
                    if (showSlowLog(slowDeliveryThresholdMs, msg.when, dispatchStart, "delivery",
                            msg)) {
                        // Once we write a slow delivery log, suppress until the queue drains.
                        slowDeliveryDetected = true;
                    }
                }
            }
            if (logSlowDispatch) {
                showSlowLog(slowDispatchThresholdMs, dispatchStart, dispatchEnd, "dispatch", msg);
            }
            // After dispatchMessage, call the Printer's println method again
            if (logging != null) {
                logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
            }

            // Make sure that during the course of dispatching the
            // identity of the thread wasn't corrupted.
            final long newIdent = Binder.clearCallingIdentity();
            if (ident != newIdent) {
                Log.wtf(TAG, "Thread identity changed from 0x"
                        + Long.toHexString(ident) + " to 0x"
                        + Long.toHexString(newIdent) + " while dispatching to "
                        + msg.target.getClass().getName() + " "
                        + msg.callback + " what=" + msg.what);
            }

            msg.recycleUnchecked();
        }

After install completes, start() is called next:

  public void start() {
        if (!mMonitorStarted) {
            mMonitorStarted = true;
            // Set mBlockCanaryCore's monitor on the main Looper as its message-logging Printer
            Looper.getMainLooper().setMessageLogging(mBlockCanaryCore.monitor);
        }
    }
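One practical consequence of start(): Looper.setMessageLogging accepts a single Printer, so setting a new one replaces whatever was set before. If several components need to observe dispatches, one possible approach is a printer that fans each call out to a list of delegates. The sketch below is plain Java; Printer is a stand-in for android.util.Printer, and PrinterFanOut and Composite are hypothetical names that do not exist in BlockCanary.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class PrinterFanOut {
    /** Stand-in for android.util.Printer. */
    interface Printer {
        void println(String x);
    }

    /** Forwards every println call to all registered delegates, in order. */
    static final class Composite implements Printer {
        // Copy-on-write so delegates can be added while the loop is printing.
        private final List<Printer> delegates = new CopyOnWriteArrayList<>();

        Composite add(Printer delegate) {
            delegates.add(delegate);
            return this;
        }

        @Override
        public void println(String x) {
            for (Printer delegate : delegates) {
                delegate.println(x);
            }
        }
    }
}
```

On Android, the equivalent class would implement android.util.Printer and be passed to setMessageLogging once, with the individual monitors registered as delegates.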

When the main Looper dispatches a message, it calls the Printer's println method before and after the dispatch, so next we look at how LooperMonitor implements println:

    @Override
    public void println(String x) {
        // Skip monitoring while a debugger is attached (if stopWhenDebugging is enabled)
        if (mStopWhenDebugging && Debug.isDebuggerConnected()) {
            return;
        }
        if (!mPrintingStarted) { // println called before dispatchMessage
            // Record the start times
            mStartTimestamp = System.currentTimeMillis();
            mStartThreadTimestamp = SystemClock.currentThreadTimeMillis();
            mPrintingStarted = true;
            // Start sampling the stack and CPU information
            startDump();
        } else { // println called after dispatchMessage
            // Record the end time
            final long endTime = System.currentTimeMillis();
            mPrintingStarted = false;
            // Check whether the elapsed time exceeds the threshold
            if (isBlock(endTime)) {
                notifyBlockEvent(endTime);
            }
            stopDump();
        }
    }

    // Returns true if the dispatch took longer than the threshold
    private boolean isBlock(long endTime) {
        return endTime - mStartTimestamp > mBlockThresholdMillis;
    }

    // Invoke the listener on the log-writing thread
    private void notifyBlockEvent(final long endTime) {
        final long startTime = mStartTimestamp;
        final long startThreadTime = mStartThreadTimestamp;
        final long endThreadTime = SystemClock.currentThreadTimeMillis();
        HandlerThreadFactory.getWriteLogThreadHandler().post(new Runnable() {
            @Override
            public void run() {
                mBlockListener.onBlockEvent(startTime, endTime, startThreadTime, endThreadTime);
            }
        });
    }
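The println implementation records two kinds of time: wall-clock time (System.currentTimeMillis) and thread time (SystemClock.currentThreadTimeMillis, the time the thread actually spent running). The gap between them hints at whether the main thread was busy computing or merely waiting. The plain-Java sketch below illustrates the difference using ThreadMXBean as a rough desktop-JVM analogue of the Android call; the class name WallVsThreadTime is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class WallVsThreadTime {
    /**
     * Runs the task on the current thread and returns {wallMs, cpuMs}.
     * cpuMs is -1 when the JVM cannot measure per-thread CPU time.
     */
    static long[] measure(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = bean.isCurrentThreadCpuTimeSupported();

        long wallStart = System.nanoTime();
        long cpuStart = cpuSupported ? bean.getCurrentThreadCpuTime() : 0;
        task.run();
        long cpuEnd = cpuSupported ? bean.getCurrentThreadCpuTime() : 0;
        long wallEnd = System.nanoTime();

        long wallMs = (wallEnd - wallStart) / 1_000_000;
        long cpuMs = cpuSupported ? (cpuEnd - cpuStart) / 1_000_000 : -1;
        return new long[] {wallMs, cpuMs};
    }
}
```

A task that only sleeps shows a large wall time and almost no CPU time, which corresponds to a main thread blocked on I/O or a lock. A tight computation loop shows the two values close together.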

When the time difference exceeds the threshold, onBlockEvent is called. It is implemented in the BlockCanaryInternals constructor:

        setMonitor(new LooperMonitor(new LooperMonitor.BlockListener() {

            @Override
            public void onBlockEvent(long realTimeStart, long realTimeEnd,
                                     long threadTimeStart, long threadTimeEnd) {
                // Fetch the stack samples recorded between the start and end times
                ArrayList<String> threadStackEntries = stackSampler
                        .getThreadStackEntries(realTimeStart, realTimeEnd);
                if (!threadStackEntries.isEmpty()) {
                    // Build the BlockInfo object and fill in the details
                    BlockInfo blockInfo = BlockInfo.newInstance()
                            .setMainThreadTimeCost(realTimeStart, realTimeEnd, threadTimeStart, threadTimeEnd)
                            .setCpuBusyFlag(cpuSampler.isCpuBusy(realTimeStart, realTimeEnd))
                            .setRecentCpuRate(cpuSampler.getCpuRateInfo())
                            .setThreadStackEntries(threadStackEntries)
                            .flushString();
                    // Write the record to the log
                    LogWriter.save(blockInfo.toString());
                    // Notify every registered interceptor
                    if (mInterceptorChain.size() != 0) {
                        for (BlockInterceptor interceptor : mInterceptorChain) {
                            interceptor.onBlock(getContext().provideContext(), blockInfo);
                        }
                    }
                }
            }
        }, getContext().provideBlockThreshold(), getContext().stopWhenDebugging()));
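The article does not show the body of getThreadStackEntries, but the call site suggests it selects the samples whose timestamps fall inside the blocked interval. A minimal sketch of such a lookup follows, assuming samples are kept in a timestamp-keyed map and that both interval bounds are exclusive; the class name, method name, and boundary choice are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class SampleWindow {
    /**
     * Returns the samples whose timestamps lie strictly between startMs and endMs,
     * in the map's iteration order.
     */
    static List<String> entriesBetween(Map<Long, String> samples, long startMs, long endMs) {
        List<String> result = new ArrayList<>();
        // Lock on the map because a sampler thread may be writing to it concurrently.
        synchronized (samples) {
            for (Map.Entry<Long, String> entry : samples.entrySet()) {
                long timestamp = entry.getKey();
                if (startMs < timestamp && timestamp < endMs) {
                    result.add(entry.getValue());
                }
            }
        }
        return result;
    }
}
```

This also explains the isEmpty() check in onBlockEvent: if no sample was taken while the message was running, there is nothing useful to report.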

Finally, look at DisplayService, the interceptor implementation that posts the notification:

  @Override
    public void onBlock(Context context, BlockInfo blockInfo) {
        Intent intent = new Intent(context, DisplayActivity.class);
        intent.putExtra("show_latest", blockInfo.timeStart);
        intent.setFlags(Intent.FLAG_ACTIVITY_NEW_TASK | Intent.FLAG_ACTIVITY_CLEAR_TOP);
        PendingIntent pendingIntent = PendingIntent.getActivity(context, 1, intent, FLAG_UPDATE_CURRENT);
        String contentTitle = context.getString(R.string.block_canary_class_has_blocked, blockInfo.timeStart);
        String contentText = context.getString(R.string.block_canary_notification_message);
        show(context, contentTitle, contentText, pendingIntent);
    }

3. dump

As the flow above shows, the println call made before dispatchMessage runs startDump(), and the println call made after dispatchMessage runs stopDump().

 private void startDump() {
        if (null != BlockCanaryInternals.getInstance().stackSampler) {
            BlockCanaryInternals.getInstance().stackSampler.start();
        }

        if (null != BlockCanaryInternals.getInstance().cpuSampler) {
            BlockCanaryInternals.getInstance().cpuSampler.start();
        }
    }

    private void stopDump() {
        if (null != BlockCanaryInternals.getInstance().stackSampler) {
            BlockCanaryInternals.getInstance().stackSampler.stop();
        }

        if (null != BlockCanaryInternals.getInstance().cpuSampler) {
            BlockCanaryInternals.getInstance().cpuSampler.stop();
        }
    }

The sections below cover StackSampler and CpuSampler in turn.

1. StackSampler

start() works as follows:

    public void start() {
        if (mShouldSample.get()) {
            return;
        }
        mShouldSample.set(true);

        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
        // Schedule mRunnable on a HandlerThread after a delay
        HandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable,
                BlockCanaryInternals.getInstance().getSampleDelay());
    }

    // mRunnable is defined in the base class AbstractSampler
    private Runnable mRunnable = new Runnable() {
        @Override
        public void run() {
            // Abstract method implemented by each sampler
            doSample();
            // Keep sampling until stop() is called
            if (mShouldSample.get()) {
                HandlerThreadFactory.getTimerThreadHandler()
                        .postDelayed(mRunnable, mSampleInterval);
            }
        }
    };

    // StackSampler's implementation of doSample()
    @Override
    protected void doSample() {
        StringBuilder stringBuilder = new StringBuilder();
        // Append every StackTraceElement from mCurrentThread.getStackTrace() to the StringBuilder
        for (StackTraceElement stackTraceElement : mCurrentThread.getStackTrace()) {
            stringBuilder
                    .append(stackTraceElement.toString())
                    .append(BlockInfo.SEPARATOR);
        }

        synchronized (sStackMap) {
            // Bound the LinkedHashMap: once it is full, evict the oldest inserted entry
            if (sStackMap.size() == mMaxEntryCount && mMaxEntryCount > 0) {
                sStackMap.remove(sStackMap.keySet().iterator().next());
            }
            // Store the new sample keyed by its timestamp
            sStackMap.put(System.currentTimeMillis(), stringBuilder.toString());
        }
    }
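The eviction in doSample removes the first key of an insertion-ordered LinkedHashMap, which makes it a first-in, first-out bound rather than a true LRU cache. The same behaviour can be expressed with LinkedHashMap's removeEldestEntry hook. The sketch below is plain Java with hypothetical names (BoundedSamples, create, dump) and is an alternative formulation, not BlockCanary's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedSamples {
    /** Creates an insertion-ordered map that drops its oldest entry beyond maxEntries. */
    static <V> Map<Long, V> create(final int maxEntries) {
        return new LinkedHashMap<Long, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, V> eldest) {
                // Called by put() after insertion; returning true evicts the eldest entry.
                return size() > maxEntries;
            }
        };
    }

    /** Renders a thread's current stack trace as one string, one frame per line. */
    static String dump(Thread thread) {
        StringBuilder builder = new StringBuilder();
        for (StackTraceElement element : thread.getStackTrace()) {
            builder.append(element).append("\r\n");
        }
        return builder.toString();
    }
}
```

A sampler could then call something like samples.put(System.currentTimeMillis(), BoundedSamples.dump(mainThread)) on each tick and let the map discard old entries on its own. The returned map is not thread-safe, so callers would still need to synchronize as the original code does.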

stop() works as follows:

    public void stop() {
        if (!mShouldSample.get()) {
            return;
        }
        // Clear the control flag
        mShouldSample.set(false);
        // Cancel the pending handler callback
        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
    }

2. CpuSampler

The rest of its flow is the same as StackSampler, so we only look at the doSample implementation here:

    // Reads /proc/stat and /proc/<pid>/stat to obtain the CPU information
  protected void doSample() {
        BufferedReader cpuReader = null;
        BufferedReader pidReader = null;

        try {
            cpuReader = new BufferedReader(new InputStreamReader(
                    new FileInputStream("/proc/stat")), BUFFER_SIZE);
            String cpuRate = cpuReader.readLine();
            if (cpuRate == null) {
                cpuRate = "";
            }

            if (mPid == 0) {
                mPid = android.os.Process.myPid();
            }
            pidReader = new BufferedReader(new InputStreamReader(
                    new FileInputStream("/proc/" + mPid + "/stat")), BUFFER_SIZE);
            String pidCpuRate = pidReader.readLine();
            if (pidCpuRate == null) {
                pidCpuRate = "";
            }

            parse(cpuRate, pidCpuRate);
        } catch (Throwable throwable) {
            Log.e(TAG, "doSample: ", throwable);
        } finally {
            try {
                if (cpuReader != null) {
                    cpuReader.close();
                }
                if (pidReader != null) {
                    pidReader.close();
                }
            } catch (IOException exception) {
                Log.e(TAG, "doSample: ", exception);
            }
        }
    }
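The parse method called by doSample is not shown in this article. As a generic illustration of how CPU usage is usually derived from /proc/stat, the sketch below compares two snapshots of the aggregate "cpu" line: the counters are cumulative, so usage over an interval is the non-idle share of the counter increase. The class name CpuUsage and both method names are hypothetical, and the sample lines in the usage note are invented values.

```java
final class CpuUsage {
    /** Parses the aggregate "cpu" line of /proc/stat into {total, idle} ticks. */
    static long[] parse(String cpuLine) {
        String[] fields = cpuLine.trim().split("\\s+");
        if (fields.length < 5 || !fields[0].equals("cpu")) {
            throw new IllegalArgumentException("not an aggregate cpu line: " + cpuLine);
        }
        long total = 0;
        // Fields 1..8: user, nice, system, idle, iowait, irq, softirq, steal.
        // guest and guest_nice are skipped; the kernel already counts them in user and nice.
        for (int i = 1; i < Math.min(fields.length, 9); i++) {
            total += Long.parseLong(fields[i]);
        }
        long idle = Long.parseLong(fields[4]);
        return new long[] {total, idle};
    }

    /** Fraction of CPU time spent non-idle between two snapshots, in [0, 1]. */
    static double busyFraction(String before, String after) {
        long[] a = parse(before);
        long[] b = parse(after);
        long totalDelta = b[0] - a[0];
        if (totalDelta <= 0) {
            return 0.0; // no ticks elapsed, or counters went backwards
        }
        long idleDelta = b[1] - a[1];
        return (totalDelta - idleDelta) / (double) totalDelta;
    }
}
```

For example, with the invented snapshots "cpu  100 0 50 800 50 0 0 0 0 0" and "cpu  200 0 100 1000 100 0 0 0 0 0", the total grows by 400 ticks and idle by 200, so busyFraction returns 0.5. This version counts iowait as busy time; treating it as idle is an equally common choice.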

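How does BlockCanary decide that a block has occurred?

BlockCanary installs a custom Printer on the main thread's Looper (the main Looper that ActivityThread runs). The main Looper calls the Printer before and after it dispatches each message, which yields the time spent on that message. If the time exceeds the configured threshold, the message is judged to be a block.

How does BlockCanary obtain the thread's stack information?

It calls mCurrentThread.getStackTrace(), iterates over the returned StackTraceElement array, appends each element to a StringBuilder, and stores the resulting string as the value in a LinkedHashMap keyed by timestamp.

How does BlockCanary obtain CPU information?

It reads /proc/stat, which records the activity of all CPUs, together with /proc/<pid>/stat for the app's own process, and computes CPU usage from them. The parsed result is assembled with a StringBuilder and stored as the value in a LinkedHashMap keyed by timestamp.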

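Summary

Thoughts

BlockCanary makes full use of the Looper mechanism. In the main Looper's loop method, the Printer's println is called before and after every dispatchMessage, and Looper exposes a method (setMessageLogging) for installing that Printer. By comparing the time between the two println calls with the threshold, BlockCanary decides whether a block has occurred.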

References

Android Performance Optimization: Rendering Optimization

Analysis of How BlockCanary, the Android UI Block Detection Framework, Works

