LeakCanary Source Code Reading Notes (3)

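In the previous articles I covered LeakCanary's initialization and how it watches for the destruction of Activity, Fragment, ViewModel, root view, and Service: LeakCanary Source Code Reading Notes (1), LeakCanary Source Code Reading Notes (2).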

Whenever any of these components is destroyed, DeletableObjectReporter#expectDeletionFor() is notified. Let's see how this object is created:

fun appDefaultWatchers(
  application: Application,
  deletableObjectReporter: DeletableObjectReporter = objectWatcher.asDeletableObjectReporter()
): List<InstallableWatcher> {
  // Use app context resources to avoid NotFoundException
  // https://github.com/square/leakcanary/issues/2137
  val resources = application.resources
  val watchDismissedDialogs = resources.getBoolean(R.bool.leak_canary_watcher_watch_dismissed_dialogs)
  return listOf(
    ActivityWatcher(application, deletableObjectReporter),
    FragmentAndViewModelWatcher(application, deletableObjectReporter),
    RootViewWatcher(deletableObjectReporter, WindowTypeFilter(watchDismissedDialogs)),
    ServiceWatcher(deletableObjectReporter)
  )
}

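We already met it when looking at how the default watchers are obtained; hopefully the code above still looks familiar. It is created by ObjectWatcher#asDeletableObjectReporter(). ObjectWatcher is already marked as deprecated, so it will probably be removed in the not-too-distant future.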

fun asDeletableObjectReporter(): DeletableObjectReporter =
  DeletableObjectReporter { target, reason ->
    // This lambda is effectively the implementation of expectDeletionFor()
    expectWeaklyReachable(target, reason)
    // This exists for backward-compatibility purposes and as such is unable to return
    // an accurate [TrackedObjectReachability] implementation.
    object : TrackedObjectReachability {
      override val isStronglyReachable: Boolean
        get() = error("Use a non deprecated DeletableObjectReporter implementation instead")
      override val isRetained: Boolean
        get() = error("Use a non deprecated DeletableObjectReporter implementation instead")
    }
  }

The main logic then calls expectWeaklyReachable(). Let's look at its implementation in ObjectWatcher:


override fun expectWeaklyReachable(
  watchedObject: Any,
  description: String
) {
  if (!isEnabled()) {
    return
  }
  val retainTrigger =
    retainedObjectTracker.expectDeletionOnTriggerFor(watchedObject, description)

  checkRetainedExecutor.execute {
    retainTrigger.markRetainedIfStronglyReachable()
  }
}

It calls ReferenceQueueRetainedObjectTracker#expectDeletionOnTriggerFor(), which returns a RetainTrigger, and then hands RetainTrigger#markRetainedIfStronglyReachable() to checkRetainedExecutor to be run later.

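I believe most people have used WeakReference, but not everyone knows how to find out when the object inside one has been collected: pass a ReferenceQueue when creating the WeakReference. Once the referent is collected, the WeakReference itself is added to that queue, which tells us which objects the GC has reclaimed. LeakCanary detects leaks the same way. Every object that should be collected is handed to it, a GC is triggered manually along the way, and if an object has still not been collected after a while, LeakCanary treats it as leaked and moves on to the next steps to pin down the problem.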
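
Here is a minimal sketch of the WeakReference plus ReferenceQueue pattern described above. LabeledWeakReference and drainLabels are hypothetical names made up for this example, and ref.enqueue() is used as a deterministic stand-in for what the garbage collector does once the referent is gone.

```kotlin
import java.lang.ref.ReferenceQueue
import java.lang.ref.WeakReference

// Hypothetical helper class for illustration: a weak reference that carries a label.
class LabeledWeakReference(
  referent: Any,
  val label: String,
  queue: ReferenceQueue<Any>
) : WeakReference<Any>(referent, queue)

// Drains the queue and returns the label of every reference found in it.
fun drainLabels(queue: ReferenceQueue<Any>): List<String> {
  val labels = mutableListOf<String>()
  while (true) {
    val ref = queue.poll() as LabeledWeakReference? ?: break
    labels += ref.label
  }
  return labels
}

fun main() {
  val queue = ReferenceQueue<Any>()
  val screen = Any()
  val ref = LabeledWeakReference(screen, "destroyed screen", queue)

  // Stand-in for the GC: in real code the collector enqueues the reference
  // after the referent becomes unreachable.
  ref.enqueue()

  println(drainLabels(queue)) // prints [destroyed screen]
}
```

Note that it is the reference object, not the collected object, that comes out of the queue, which is why the reference has to carry its own identifying data.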

With that background, let's continue with expectDeletionOnTriggerFor():


override fun expectDeletionOnTriggerFor(
  target: Any,
  reason: String
): RetainTrigger {
  // Remove objects that have already been collected
  removeWeaklyReachableObjects()
  val key = UUID.randomUUID()
    .toString()
  val watchUptime = clock.uptime()
  // Create a custom keyed weak reference; note that the ReferenceQueue is passed in here
  val reference =
    KeyedWeakReference(target, key, reason, watchUptime.inWholeMilliseconds, queue)
  SharkLog.d {
    "Watching " +
      (if (target is Class<*>) target.toString() else "instance of ${target.javaClass.name}") +
      (if (reason.isNotEmpty()) " ($reason)" else "") +
      " with key $key"
  }
  
  // Store the weak reference in the member map
  watchedObjects[key] = reference
  // Build and return a trigger object
  return object : RetainTrigger {
  
    // Whether the target is still strongly reachable (its reference has not been enqueued)
    override val isStronglyReachable: Boolean
      get() {
        removeWeaklyReachableObjects()
        val weakRef = watchedObjects[key]
        return weakRef != null
      }
   
    // Whether the object has been marked as retained; true means it is considered leaked
    override val isRetained: Boolean
      get() {
        removeWeaklyReachableObjects()
        val weakRef = watchedObjects[key]
        return weakRef?.retained ?: false
      }
    
    // Trigger the check for whether this object is retained
    override fun markRetainedIfStronglyReachable() {
      moveToRetained(key)
    }
  }
}

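As mentioned above, RetainTrigger#markRetainedIfStronglyReachable() is run through checkRetainedExecutor. That task fires after a delay of 5s by default and calls moveToRetained() internally.

To summarize so far: when an Android component reaches its destroy lifecycle, LeakCanary wraps the corresponding Java object in a WeakReference registered with a ReferenceQueue, so that it can later check whether the object has been collected. It then schedules a check delayed by 5s. When that check runs, it looks at whether the target has been collected, and if not, the follow-up evidence gathering starts.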
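
The "watch now, check after a delay" flow can be sketched on a plain JVM as follows. This is only an illustration under my own assumptions: delayedExecutor and watchWithDelay are hypothetical helpers, a ScheduledExecutorService stands in for whatever LeakCanary uses to delay the task, and the delay is shortened to 100 ms.

```kotlin
import java.lang.ref.ReferenceQueue
import java.lang.ref.WeakReference
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executor
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledExecutorService
import java.util.concurrent.TimeUnit

// Hypothetical helper: wraps a scheduler into an Executor that delays every task.
fun delayedExecutor(scheduler: ScheduledExecutorService, delayMillis: Long): Executor =
  Executor { task -> scheduler.schedule(task, delayMillis, TimeUnit.MILLISECONDS) }

// Watches target and, once the executor runs the task, reports whether the
// weak reference has been enqueued (collected) or not (still retained).
fun watchWithDelay(
  target: Any,
  executor: Executor,
  onChecked: (stillRetained: Boolean) -> Unit
) {
  val queue = ReferenceQueue<Any>()
  val ref = WeakReference(target, queue)
  executor.execute {
    val collected = queue.poll() === ref
    onChecked(!collected)
  }
}

fun main() {
  val scheduler = Executors.newSingleThreadScheduledExecutor()
  val leaked = Any() // kept strongly reachable on purpose
  val done = CountDownLatch(1)
  watchWithDelay(leaked, delayedExecutor(scheduler, 100)) { stillRetained ->
    println("retained after delay: $stillRetained")
    done.countDown()
  }
  done.await(2, TimeUnit.SECONDS)
  scheduler.shutdown()
  println("still holding $leaked") // keeps 'leaked' reachable until here
}
```

Because main() still holds leaked when the check runs, the callback reports it as retained. The lambda passed to the executor captures only the reference and the queue, never the target itself, so the sketch does not keep the target alive on its own.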

Let's first take a quick look at LeakCanary's custom KeyedWeakReference class:


class KeyedWeakReference(
  referent: Any,
  // Key identifying the watched object
  val key: String,
  // Description of the component
  val description: String,
  // Uptime timestamp at which the object started being watched (when the component was destroyed)
  val watchUptimeMillis: Long,
  referenceQueue: ReferenceQueue<Any>
) : WeakReference<Any>(
  referent, referenceQueue
) {
  /**
   * Time at which the associated object ([referent]) was considered retained, or -1 if it hasn't
   * been yet.
   */
  // Timestamp at which the retention check ran and found the object not yet collected
  @Volatile
  var retainedUptimeMillis = -1L

  // Whether the object is considered retained (checked and not yet collected)
  val retained: Boolean
    get() = retainedUptimeMillis != -1L

  override fun clear() {
    super.clear()
    retainedUptimeMillis = -1L
  }

  override fun get(): Any? {
    error("Calling KeyedWeakReference.get() is a mistake as it revives the reference")
  }

  /**
   * Same as [WeakReference.get] but does not trigger an intentional crash.
   *
   * Calling this method will end up creating local references to the objects, preventing them from
   * becoming weakly reachable, and creating a leak. If you need to check for identity equality, use
   * Reference.refersTo instead.
   */
  fun getAndLeakReferent(): Any? {
    return super.get()
  }

  companion object {
    @Volatile
    @JvmStatic var heapDumpUptimeMillis = 0L
  }
}

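The class above does not do much: it adds the key that identifies the object, and uses one field to record whether the object is considered retained, in other words whether it is suspected of leaking.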

Next, let's see how removeWeaklyReachableObjects() clears out the collected objects.

private fun removeWeaklyReachableObjects() {
  // WeakReferences are enqueued as soon as the object to which they point to becomes weakly
  // reachable. This is before finalization or garbage collection has actually happened.
  var ref: KeyedWeakReference?
  do {
    ref = queue.poll() as KeyedWeakReference?
    if (ref != null) {
      watchedObjects.remove(ref.key)
    }
  } while (ref != null)
}

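The method is simple: it polls the ReferenceQueue. Every reference found there points to an object that has become weakly reachable (collected or about to be), so its entry is removed from watchedObjects. This method is called from many places.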
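
To make the pruning pattern concrete, here is a simplified, self-contained sketch. KeyedRef and MiniTracker are hypothetical stand-ins written for this article, not LeakCanary classes; the sketch is not thread-safe, and enqueue() again plays the role of the GC.

```kotlin
import java.lang.ref.ReferenceQueue
import java.lang.ref.WeakReference
import java.util.UUID

// Hypothetical, simplified stand-in for KeyedWeakReference.
class KeyedRef(referent: Any, val key: String, queue: ReferenceQueue<Any>) :
  WeakReference<Any>(referent, queue)

// Simplified tracker for illustration only.
class MiniTracker {
  private val queue = ReferenceQueue<Any>()
  private val watched = mutableMapOf<String, KeyedRef>()

  fun watch(target: Any): KeyedRef {
    val key = UUID.randomUUID().toString()
    return KeyedRef(target, key, queue).also { watched[key] = it }
  }

  // Same loop shape as removeWeaklyReachableObjects(): drop every enqueued key.
  private fun prune() {
    while (true) {
      val ref = queue.poll() as KeyedRef? ?: break
      watched.remove(ref.key)
    }
  }

  fun isStillWatched(key: String): Boolean {
    prune()
    return key in watched
  }
}

fun main() {
  val tracker = MiniTracker()
  val first = Any()
  val second = Any()
  val firstRef = tracker.watch(first)
  val secondRef = tracker.watch(second)

  firstRef.enqueue() // stand-in for the GC enqueuing the reference

  println(tracker.isStillWatched(firstRef.key))  // prints false
  println(tracker.isStillWatched(secondRef.key)) // prints true
  println(second === first) // keeps 'second' strongly reachable until here
}
```

The key stored in each reference is what allows the map entry to be found again once the referent itself can no longer be reached.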

Now let's see how moveToRetained() checks whether a given object is retained:


private fun moveToRetained(key: String) {
  // Clear out objects that have already been collected
  removeWeaklyReachableObjects()
  // Look up the weak reference for this key
  val retainedRef = watchedObjects[key]
  if (retainedRef != null) {
    // Reaching here means the object is considered retained.

    // Record the time at which it was marked as retained
    retainedRef.retainedUptimeMillis = clock.uptime().inWholeMilliseconds
    // Notify the listener that an object is retained
    onObjectRetainedListener.onObjectRetained()
  }
}

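Checking whether an object is retained is just as simple: clear out collected objects once, then look up whether the entry for this object still exists. If it does, the object is considered leaked; the retained timestamp is updated, and a listener is notified so that other parts of the library know about it. So where is this listener registered?
As I said in the first article when covering LeakCanary's initialization, the setup has two main parts: installing the various watchers that monitor the destruction of Android components, and initializing LeakCanaryDelegate. You may have forgotten, so here is a reminder: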

@JvmOverloads
fun manualInstall(
  application: Application,
  retainedDelayMillis: Long = TimeUnit.SECONDS.toMillis(5),
  watchersToInstall: List<InstallableWatcher> = appDefaultWatchers(application)
) {
  checkMainThread()
  if (isInstalled) {
    throw IllegalStateException(
      "AppWatcher already installed, see exception cause for prior install call", installCause
    )
  }
  check(retainedDelayMillis >= 0) {
    "retainedDelayMillis $retainedDelayMillis must be at least 0 ms"
  }
  this.retainedDelayMillis = retainedDelayMillis
  if (application.isDebuggableBuild) {
    LogcatSharkLog.install()
  }
  // Requires AppWatcher.objectWatcher to be set
  // Initialize LeakCanaryDelegate
  LeakCanaryDelegate.loadLeakCanary(application)
  
  // Install the watchers
  watchersToInstall.forEach {
    it.install()
  }
  // Only install after we're fully done with init.
  installCause = RuntimeException("manualInstall() first called here")
}

Let's look at the implementation of LeakCanaryDelegate.loadLeakCanary():

@Suppress("UNCHECKED_CAST")
val loadLeakCanary by lazy {
  try {
    val leakCanaryListener = Class.forName("leakcanary.internal.InternalLeakCanary")
    leakCanaryListener.getDeclaredField("INSTANCE")
      .get(null) as (Application) -> Unit
  } catch (ignored: Throwable) {
    NoLeakCanary
  }
}
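
The lookup above relies on the fact that a Kotlin object compiles to a class with a static INSTANCE field. Here is a small sketch of the same pattern with a fallback. RealPlugin, NoOpPlugin and loadPlugin are hypothetical names invented for this example.

```kotlin
// Hypothetical plugin implementations, for illustration only.
object RealPlugin : (String) -> Unit {
  override fun invoke(name: String) = println("real plugin loaded for $name")
}

object NoOpPlugin : (String) -> Unit {
  override fun invoke(name: String) = Unit
}

@Suppress("UNCHECKED_CAST")
fun loadPlugin(className: String): (String) -> Unit =
  try {
    // Read the static INSTANCE field that the compiler generates for an `object`.
    Class.forName(className).getDeclaredField("INSTANCE").get(null) as (String) -> Unit
  } catch (ignored: Throwable) {
    // Class missing from the classpath: fall back to the no-op implementation.
    NoOpPlugin
  }

fun main() {
  val present = loadPlugin(RealPlugin::class.java.name)
  val missing = loadPlugin("com.example.DoesNotExist")
  present("demo")
  println(missing === NoOpPlugin) // prints true
}
```

This kind of lookup lets one module call into another without a compile-time dependency: if the implementing class is not packaged, the caller silently gets the no-op.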

Reflection is used here to obtain the InternalLeakCanary singleton, whose invoke() method is then called:

override fun invoke(application: Application) {
  _application = application

  checkRunningInDebuggableBuild()
  
  // Register the retained object listener
  AppWatcher.objectWatcher.addOnObjectRetainedListener(this)
  
  // GC trigger
  val gcTrigger = GcTrigger.inProcess()

  val configProvider = { LeakCanary.config }

  // Set up the background thread
  val handlerThread = HandlerThread(LEAK_CANARY_THREAD_NAME)
  handlerThread.start()
  val backgroundHandler = Handler(handlerThread.looper)

  // Heap dump trigger
  heapDumpTrigger = HeapDumpTrigger(
    application, backgroundHandler, AppWatcher.objectWatcher, gcTrigger,
    configProvider
  )
  
  // Listen for app visibility changes
  application.registerVisibilityListener { applicationVisible ->
    this.applicationVisible = applicationVisible
    heapDumpTrigger.onApplicationVisibilityChanged(applicationVisible)
  }
  registerResumedActivityListener(application)

  // Add the dynamic Android shortcut that opens the leak UI
  LeakCanaryAndroidInternalUtils.addLeakActivityDynamicShortcut(application)

  // We post so that the log happens after Application.onCreate() where
  // the config could be updated.
  mainHandler.post {
    // https://github.com/square/leakcanary/issues/1981
    // We post to a background handler because HeapDumpControl.iCanHasHeap() checks a shared pref
    // which blocks until loaded and that creates a StrictMode violation.
    backgroundHandler.post {
      SharkLog.d {
        when (val iCanHasHeap = HeapDumpControl.iCanHasHeap()) {
          is Yup -> application.getString(R.string.leak_canary_heap_dump_enabled_text)
          is Nope -> application.getString(
            R.string.leak_canary_heap_dump_disabled_text, iCanHasHeap.reason()
          )
        }
      }
    }
  }
}

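First it registers the most important piece, the retained object listener, which answers the question raised above. It then creates the GC trigger, starts a background HandlerThread, creates the heap dump trigger, listens for app visibility changes, and adds the Android shortcut UI.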

Let's see what it does once it is told that an object is retained:

override fun onObjectRetained() = scheduleRetainedObjectCheck()

fun scheduleRetainedObjectCheck() {
  if (this::heapDumpTrigger.isInitialized) {
    heapDumpTrigger.scheduleRetainedObjectCheck()
  }
}

It simply calls HeapDumpTrigger#scheduleRetainedObjectCheck():

fun scheduleRetainedObjectCheck(
  delayMillis: Long = 0L
) {
  val checkCurrentlyScheduledAt = checkScheduledAt
  if (checkCurrentlyScheduledAt > 0) {
    return
  }
  checkScheduledAt = SystemClock.uptimeMillis() + delayMillis
  backgroundHandler.postDelayed({
    checkScheduledAt = 0
    checkRetainedObjects()
  }, delayMillis)
}
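
The checkScheduledAt guard means that repeated notifications collapse into a single pending check. A minimal sketch of that idea, with the clock and the handler injected so it can run without Android (CheckScheduler and its parameters are hypothetical names for this example):

```kotlin
// Hypothetical sketch: `now` and `post` are injected so the logic runs on a plain JVM.
class CheckScheduler(
  private val now: () -> Long,
  private val post: (delayMillis: Long, task: () -> Unit) -> Unit,
  private val runCheck: () -> Unit
) {
  @Volatile
  private var checkScheduledAt = 0L

  fun schedule(delayMillis: Long = 0L) {
    // A check is already pending: ignore this request.
    if (checkScheduledAt > 0) return
    checkScheduledAt = now() + delayMillis
    post(delayMillis) {
      // Reset the marker first so that a new check can be scheduled again.
      checkScheduledAt = 0
      runCheck()
    }
  }
}

fun main() {
  val pending = mutableListOf<() -> Unit>()
  var runs = 0
  val scheduler = CheckScheduler(
    now = { 1_000L },
    post = { _, task -> pending += task },
    runCheck = { runs++ }
  )
  scheduler.schedule()
  scheduler.schedule()             // ignored: one check is already pending
  println(pending.size)            // prints 1
  pending.removeAt(0).invoke()     // the "handler" runs the posted task
  scheduler.schedule()             // accepted again
  println("${pending.size} $runs") // prints 1 1
}
```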

This posts checkRetainedObjects() to the background handler, skipping the request if a check is already scheduled. Let's keep tracing:

private fun checkRetainedObjects() {
  // Check whether a heap dump is currently allowed
  val iCanHasHeap = HeapDumpControl.iCanHasHeap()

  val config = configProvider()

  if (iCanHasHeap is Nope) {
    // Dumping is not allowed
    // ...
    return
  }
  
  // Number of retained objects
  var retainedReferenceCount = retainedObjectTracker.retainedObjectCount

  if (retainedReferenceCount > 0) {
    // When there is at least one retained object, trigger a GC and count again
    gcTrigger.runGc()
    retainedReferenceCount = retainedObjectTracker.retainedObjectCount
  }

  // Check whether the retained count has reached the configured threshold; if not, skip the dump. The default threshold is 5
  if (checkRetainedCount(retainedReferenceCount, config.retainedVisibleThreshold)) return

  // Check the interval between two dumps: if it is below the minimum (60s), schedule a delayed check and dump once the interval has elapsed
  val now = SystemClock.uptimeMillis()
  val elapsedSinceLastDumpMillis = now - lastHeapDumpUptimeMillis
  if (elapsedSinceLastDumpMillis < WAIT_BETWEEN_HEAP_DUMPS_MILLIS) {
    onRetainInstanceListener.onEvent(DumpHappenedRecently)
    showRetainedCountNotification(
      objectCount = retainedReferenceCount,
      contentText = application.getString(R.string.leak_canary_notification_retained_dump_wait)
    )
    scheduleRetainedObjectCheck(
      delayMillis = WAIT_BETWEEN_HEAP_DUMPS_MILLIS - elapsedSinceLastDumpMillis
    )
    return
  }
  
  // Dismiss the notification UI
  dismissRetainedCountNotification()
  val visibility = if (applicationVisible) "visible" else "not visible"
  // Perform the dump
  dumpHeap(
    retainedReferenceCount = retainedReferenceCount,
    retry = true,
    reason = "$retainedReferenceCount retained objects, app is $visibility"
  )
}
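
To summarize:
  1. Check whether a dump is currently allowed; if not, return immediately.
  2. Check the number of retained objects; if it is greater than 0, trigger a GC through GcTrigger and read the count again.
  3. Check whether the retained count has reached the configured threshold (5 by default); if not, skip the dump.
  4. Check the time since the last dump; if it is less than 60s, reschedule this check for when the 60s have elapsed.
  5. Perform the dump.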
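
The steps above can be sketched as a pure decision function. This is a simplification under my own assumptions: the Decision types and decide() are hypothetical, retainedCount is taken to be the count after the GC of step 2, and the threshold test is reduced to a plain comparison, while the real checkRetainedCount() is not shown here and may take more into account.

```kotlin
// Hypothetical types for illustration; these are not LeakCanary names.
sealed interface Decision
object CannotDump : Decision
object BelowThreshold : Decision
data class WaitBeforeDump(val delayMillis: Long) : Decision
object DumpNow : Decision

fun decide(
  canDump: Boolean,
  retainedCount: Int,
  threshold: Int = 5,
  elapsedSinceLastDumpMillis: Long,
  minIntervalMillis: Long = 60_000L
): Decision = when {
  // Step 1: dumping is not allowed right now.
  !canDump -> CannotDump
  // Step 3: not enough retained objects yet.
  retainedCount < threshold -> BelowThreshold
  // Step 4: the last dump was too recent, wait for the remaining time.
  elapsedSinceLastDumpMillis < minIntervalMillis ->
    WaitBeforeDump(minIntervalMillis - elapsedSinceLastDumpMillis)
  // Step 5: go ahead and dump.
  else -> DumpNow
}

fun main() {
  println(decide(true, 3, elapsedSinceLastDumpMillis = 120_000L) is BelowThreshold) // prints true
  println(decide(true, 7, elapsedSinceLastDumpMillis = 20_000L)) // prints WaitBeforeDump(delayMillis=40000)
  println(decide(true, 7, elapsedSinceLastDumpMillis = 120_000L) is DumpNow) // prints true
}
```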

Let's look at the GcTrigger implementation:

override fun runGc() {
  // Code taken from AOSP FinalizationTest:
  // https://android.googlesource.com/platform/libcore/+/master/support/src/test/java/libcore/
  // java/lang/ref/FinalizationTester.java
  System.gc()
  enqueueReferences()
  System.runFinalization()
  System.gc()
}

private fun enqueueReferences() {
  // Hack. We don't have a programmatic way to wait for the reference queue daemon to move
  // references to the appropriate queues.
  try {
    Thread.sleep(100)
  } catch (e: InterruptedException) {
    throw AssertionError()
  }
}

According to the comment, this code was taken from test code in AOSP, so we can borrow it as-is too.
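
As a rough illustration of borrowing it, the sketch below copies the same call sequence into a standalone function and uses it with two weak references. Keep in mind that System.gc() is only a request to the runtime, and that System.runFinalization() is deprecated on recent JDKs, so treat this as a demo rather than something to rely on.

```kotlin
import java.lang.ref.WeakReference

// Same call sequence as the runGc() shown above, as a standalone function.
fun runGc() {
  System.gc()
  Thread.sleep(100) // give the reference queue daemon time to enqueue references
  System.runFinalization()
  System.gc()
}

fun main() {
  val kept = Any()
  val keptRef = WeakReference(kept)
  val droppedRef = WeakReference(Any())

  runGc()

  // A strongly reachable object is never cleared, however many GCs run.
  println(keptRef.get() === kept)
  // Usually null by now, but that is not guaranteed.
  println(droppedRef.get())
}
```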

Now let's look at the implementation of dumpHeap():


private fun dumpHeap(
  retainedReferenceCount: Int,
  retry: Boolean,
  reason: String
) {
  val directoryProvider =
    InternalLeakCanary.createLeakDirectoryProvider(InternalLeakCanary.application)
  // Get the output file for the dump
  val heapDumpFile = directoryProvider.newHeapDumpFile()

  val durationMillis: Long
  if (currentEventUniqueId == null) {
    currentEventUniqueId = UUID.randomUUID().toString()
  }
  try {
    // Send the event signalling that the dump is starting
    InternalLeakCanary.sendEvent(DumpingHeap(currentEventUniqueId!!))
    if (heapDumpFile == null) {
      throw RuntimeException("Could not create heap dump file")
    }
    saveResourceIdNamesToMemory()
    val heapDumpUptimeMillis = SystemClock.uptimeMillis()
    KeyedWeakReference.heapDumpUptimeMillis = heapDumpUptimeMillis
    durationMillis = measureDurationMillis {
      // Perform the dump
      configProvider().heapDumper.dumpHeap(heapDumpFile)
    }
    if (heapDumpFile.length() == 0L) {
      throw RuntimeException("Dumped heap file is 0 byte length")
    }
    lastDisplayedRetainedObjectCount = 0
    lastHeapDumpUptimeMillis = SystemClock.uptimeMillis()
    retainedObjectTracker.clearObjectsTrackedBefore(heapDumpUptimeMillis.milliseconds)
    currentEventUniqueId = UUID.randomUUID().toString()
    // Send the event signalling that the dump succeeded
    InternalLeakCanary.sendEvent(HeapDump(currentEventUniqueId!!, heapDumpFile, durationMillis, reason))
  } catch (throwable: Throwable) {
    // Something went wrong
    InternalLeakCanary.sendEvent(HeapDumpFailed(currentEventUniqueId!!, throwable, retry))
    if (retry) {
      scheduleRetainedObjectCheck(
        delayMillis = WAIT_AFTER_DUMP_FAILED_MILLIS
      )
    }
    // Show a notification saying the dump failed
    showRetainedCountNotification(
      objectCount = retainedReferenceCount,
      contentText = application.getString(
        R.string.leak_canary_notification_retained_dump_failed
      )
    )
    return
  }
}

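The code above is straightforward: the start, success, and failure of a dump are all reported to InternalLeakCanary as events, and each event is handled differently. The dump itself is done by heapDumper. Here is its interface: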

fun interface HeapDumper {

  /**
   * Dumps the heap. The implementation is expected to be blocking until the heap is dumped
   * or heap dumping failed.
   *
   * Implementations can throw a runtime exception if heap dumping failed.
   */
  fun dumpHeap(heapDumpFile: File)

  /**
   * This allows external modules to add factory methods for implementations of this interface as
   * extension functions of this companion object.
   */
  companion object
}

fun HeapDumper.withGc(gcTrigger: GcTrigger = GcTrigger.inProcess()): HeapDumper {
  val delegate = this
  return HeapDumper { file ->
    gcTrigger.runGc()
    delegate.dumpHeap(file)
  }
}
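
Because HeapDumper is a fun interface, an implementation can be written as a lambda and wrapped by decorators such as withGc(). The sketch below shows that shape with local stand-ins so that it runs without LeakCanary: Dumper mirrors the interface above, and withBefore is a hypothetical decorator of my own.

```kotlin
import java.io.File

// Local stand-in mirroring the shape of the HeapDumper interface shown above.
fun interface Dumper {
  fun dumpHeap(heapDumpFile: File)
}

// Hypothetical decorator in the style of withGc(): runs an action, then delegates.
fun Dumper.withBefore(before: () -> Unit): Dumper {
  val delegate = this
  return Dumper { file ->
    before()
    delegate.dumpHeap(file)
  }
}

fun main() {
  val steps = mutableListOf<String>()
  // Fake dumper: writes a few bytes instead of a real hprof file.
  val fake = Dumper { file ->
    steps += "dump"
    file.writeText("not a real heap dump")
  }
  val file = File.createTempFile("demo", ".hprof")
  fake.withBefore { steps += "gc" }.dumpHeap(file)
  println(steps) // prints [gc, dump]
  file.delete()
}
```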

I think these three HeapDumper implementations are worth a look: AndroidDebugHeapDumper, HotSpotHeapDumper, and UiAutomatorShellHeapDumper.

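AndroidDebugHeapDumper is the default in-process dumper on Android. It relies on Debug.dumpHprofData, and as its doc comment points out, despite living in the Debug class this can also be called from non-debuggable builds: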

/**
 * Dumps the Android heap using [Debug.dumpHprofData].
 *
 * Note: despite being part of the Debug class, [Debug.dumpHprofData] can be called from non
 * debuggable non profileable builds.
 */
object AndroidDebugHeapDumper : HeapDumper {
  override fun dumpHeap(heapDumpFile: File) {
    Debug.dumpHprofData(heapDumpFile.absolutePath)
  }
}

fun HeapDumper.Companion.forAndroidInProcess() = AndroidDebugHeapDumper

Plain, no-frills code.

HotSpotHeapDumper is for HotSpot JVMs; I saw it used in the test code.

object HotSpotHeapDumper : HeapDumper {
  private val hotspotMBean: HotSpotDiagnosticMXBean by lazy {
    val mBeanServer = ManagementFactory.getPlatformMBeanServer()
    ManagementFactory.newPlatformMXBeanProxy(
      mBeanServer,
      "com.sun.management:type=HotSpotDiagnostic",
      HotSpotDiagnosticMXBean::class.java
    )
  }

  override fun dumpHeap(heapDumpFile: File) {
    val live = true
    hotspotMBean.dumpHeap(heapDumpFile.absolutePath, live)
  }
}

fun HeapDumper.Companion.forJvmInProcess() = HotSpotHeapDumper

Equally plain code.

UiAutomatorShellHeapDumper dumps the heap on Android through shell commands:

class UiAutomatorShellHeapDumper(
  private val withGc: Boolean,
  private val dumpedAppPackageName: String
) : HeapDumper {
  override fun dumpHeap(heapDumpFile: File) {
    val instrumentation = InstrumentationRegistry.getInstrumentation()
    val device = UiDevice.getInstance(instrumentation)
    val processId = device.getPidsForProcess(dumpedAppPackageName)
      // TODO Figure out what to do when we get more than one.
      .single()

    SharkLog.d { "Dumping heap for \"$dumpedAppPackageName\" with pid $processId to ${heapDumpFile.absolutePath}" }

    val forceGc = if (withGc && Build.VERSION.SDK_INT >= 27) {
      "-g "
    } else {
      ""
    }

    device.executeShellCommand("am dumpheap $forceGc$processId ${heapDumpFile.absolutePath}")
    // Make the heap dump world readable, otherwise we can't read it.
    device.executeShellCommand("chmod +r ${heapDumpFile.absolutePath}")
  }

  // Based on https://cs.android.com/androidx/platform/frameworks/support/+/androidx-main:benchmark/benchmark-common/src/main/java/androidx/benchmark/Shell.kt;l=467;drc=8f2ba6a5469f67b7e385878d704f97bde22419ce
  private fun UiDevice.getPidsForProcess(processName: String): List<Int> {
    if (Build.VERSION.SDK_INT >= 23) {
      return pgrepLF(pattern = processName)
        .mapNotNull { (pid, fullProcessName) ->
          if (fullProcessNameMatchesProcess(fullProcessName, processName)) {
            pid
          } else {
            null
          }
        }
    }
    val processList = executeShellCommand("ps")
    return processList.lines()
      .filter { psLineContainsProcess(it, processName) }
      .map {
        val columns = SPACE_PATTERN.split(it)
        columns[1].toInt()
      }
  }

  private fun UiDevice.pgrepLF(pattern: String): List<Pair<Int, String>> {
    return executeShellCommand("pgrep -l -f $pattern")
      .split(Regex("\r?\n"))
      .filter { it.isNotEmpty() }
      .map {
        val (pidString, process) = it.trim().split(" ")
        Pair(pidString.toInt(), process)
      }
  }

  private fun psLineContainsProcess(
    psOutputLine: String,
    processName: String
  ): Boolean {
    return psOutputLine.endsWith(" $processName") || psOutputLine.endsWith("/$processName")
  }

  private fun fullProcessNameMatchesProcess(
    fullProcessName: String,
    processName: String
  ): Boolean {
    return fullProcessName == processName || fullProcessName.endsWith("/$processName")
  }

  private companion object {
    private val SPACE_PATTERN = Regex("\\s+")
  }
}

fun HeapDumper.Companion.forUiAutomatorAsShell(
  withGc: Boolean,
  dumpedAppPackageName: String = InstrumentationRegistry.getInstrumentation().targetContext.packageName
) = UiAutomatorShellHeapDumper(withGc, dumpedAppPackageName)

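This uses the androidx UiAutomator library (UiDevice) to run shell commands. The actual command is am dumpheap [-g] [pid] [file], where -g is only added when withGc is true and the SDK level is at least 27; chmod +r [file] then makes the file readable. Most of the code above is there to obtain the target app's pid, using pgrep -l -f [process name] on SDK 23 and above and falling back to parsing ps output otherwise.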
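
The pid lookup boils down to parsing "pid name" lines and filtering by process name. Here is a sketch of that parsing as pure functions. parsePgrepOutput and pidsFor are hypothetical helpers, and the sample text is made up rather than captured from a device.

```kotlin
// Hypothetical pure function: parses the text that `pgrep -l -f <pattern>` prints.
fun parsePgrepOutput(output: String): List<Pair<Int, String>> =
  output.lines()
    .map { it.trim() }
    .filter { it.isNotEmpty() }
    .mapNotNull { line ->
      // Split into "pid" and "rest of the line"; skip lines that do not fit.
      val parts = line.split(Regex("\\s+"), limit = 2)
      val pid = parts[0].toIntOrNull()
      if (pid != null && parts.size == 2) pid to parts[1] else null
    }

// Keeps only the pids whose process name matches exactly or as a path suffix.
fun pidsFor(output: String, processName: String): List<Int> =
  parsePgrepOutput(output)
    .filter { (_, name) -> name == processName || name.endsWith("/$processName") }
    .map { it.first }

fun main() {
  // Made-up sample output, not captured from a real device.
  val sample = "1234 com.example.app\n1300 com.example.app:remote\n77 /system/bin/com.example.app\n"
  println(pidsFor(sample, "com.example.app")) // prints [1234, 77]
}
```

The exact-or-suffix match is what keeps a secondary process such as com.example.app:remote from being picked up when only the main process is wanted.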

Finally

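Once we have the heap dump file, it still has to be parsed before the GC roots holding the leaked objects can be found. I will cover that in a later article.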