LeakCanary Source Code Reading Notes (Part 5)

This is the fifth article in my series on reading the LeakCanary source code. If you have not read the previous four, I suggest starting with them:

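LeakCanary Source Code Reading Notes (Part 1)
LeakCanary Source Code Reading Notes (Part 2)
LeakCanary Source Code Reading Notes (Part 3)
LeakCanary Source Code Reading Notes (Part 4)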

The fourth article covered how an HPROF file is parsed. This one covers how the leaking objects are found from the parsed result. Let's get started.

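As the previous article explained, the parsed HPROF content is stored in HprofHeapGraph. The subsequent analysis of leaking objects goes through HeapAnalyzer#analyze(), which is also the entry point for today's walkthrough.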

    fun analyze(
      heapDumpFile: File,
      graph: HeapGraph,
      leakingObjectFinder: LeakingObjectFinder,
      referenceMatchers: List<ReferenceMatcher> = emptyList(),
      computeRetainedHeapSize: Boolean = false,
      objectInspectors: List<ObjectInspector> = emptyList(),
      metadataExtractor: MetadataExtractor = MetadataExtractor.NO_OP,
    ): HeapAnalysis {
      val analysisStartNanoTime = System.nanoTime()

      return try {
        // Create the LeakTracer
        val leakTracer = RealLeakTracerFactory(
          shortestPathFinderFactory = PrioritizingShortestPathFinder.Factory(
            listener = { event -> /* ... */ },
            referenceReaderFactory = AndroidReferenceReaderFactory(referenceMatchers),
            gcRootProvider = MatchingGcRootProvider(referenceMatchers),
            computeRetainedHeapSize = computeRetainedHeapSize,
          ),
          objectInspectors
        ) { event ->
          // ...
        }.createFor(graph)

        listener.onAnalysisProgress(EXTRACTING_METADATA)
        val metadata = metadataExtractor.extractMetadata(graph)

        // Count the KeyedWeakReference instances that are marked retained but whose referent has already been cleared
        val retainedClearedWeakRefCount = KeyedWeakReferenceFinder.findKeyedWeakReferences(graph)
          .count { it.isRetained && !it.hasReferent }

        // This should rarely happen, as we generally remove all cleared weak refs right before a heap
        // dump.
        val metadataWithCount = if (retainedClearedWeakRefCount > 0) {
          metadata + ("Count of retained yet cleared" to "$retainedClearedWeakRefCount KeyedWeakReference instances")
        } else {
          metadata
        }

        listener.onAnalysisProgress(FINDING_RETAINED_OBJECTS)
        // Get the IDs of the leaking objects
        val leakingObjectIds = leakingObjectFinder.findLeakingObjectIds(graph)
        
        // Trace the leaking objects and build the leak results
        val (applicationLeaks, libraryLeaks, unreachableObjects) = leakTracer.traceObjects(
          leakingObjectIds
        )

        HeapAnalysisSuccess(
          heapDumpFile = heapDumpFile,
          createdAtTimeMillis = System.currentTimeMillis(),
          analysisDurationMillis = since(analysisStartNanoTime),
          metadata = metadataWithCount,
          applicationLeaks = applicationLeaks,
          libraryLeaks = libraryLeaks,
          unreachableObjects = unreachableObjects
        )
      } catch (exception: Throwable) {
        HeapAnalysisFailure(
          heapDumpFile = heapDumpFile,
          createdAtTimeMillis = System.currentTimeMillis(),
          analysisDurationMillis = since(analysisStartNanoTime),
          exception = HeapAnalysisException(exception)
        )
      }
    }

First, KeyedWeakReferenceFinder.findKeyedWeakReferences() is used to look up the leaking objects:

internal fun findKeyedWeakReferences(graph: HeapGraph): List<KeyedWeakReferenceMirror> {
  // context is a simple in-memory cache; the lambda only runs when nothing is cached yet
  return graph.context.getOrPut(KEYED_WEAK_REFERENCE.name) {
  
    // Look up the KeyedWeakReference class. Two class names are handled for compatibility: the current one and the legacy one
    val keyedWeakReferenceClass = graph.findClassByName("leakcanary.KeyedWeakReference")

    val keyedWeakReferenceClassId = keyedWeakReferenceClass?.objectId ?: 0
    val legacyKeyedWeakReferenceClassId =
      graph.findClassByName("com.squareup.leakcanary.KeyedWeakReference")?.objectId ?: 0
    
    val heapDumpUptimeMillis = heapDumpUptimeMillis(graph)
    // Search through all instances
    val addedToContext: List<KeyedWeakReferenceMirror> = graph.instances
      // Keep only KeyedWeakReference instances
      .filter { instance ->
        instance.instanceClassId == keyedWeakReferenceClassId || instance.instanceClassId == legacyKeyedWeakReferenceClassId
      }
      .map {
        // Extract the leaking object's information from the KeyedWeakReference instance
        KeyedWeakReferenceMirror.fromInstance(
          it, heapDumpUptimeMillis
        )
      }
      .toList()
    // Store the result in the cache
    graph.context[KEYED_WEAK_REFERENCE.name] = addedToContext
    addedToContext
  }
}

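As the earlier articles explained, every object that has been destroyed is placed in a custom weak reference, KeyedWeakReference. If the object leaks, the referent held by its KeyedWeakReference is never collected. So once all KeyedWeakReference instances have been found, the leaking objects can be found too.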

KeyedWeakReferenceMirror.fromInstance() retrieves the leaking object referenced by a KeyedWeakReference. Here is the source:

fun fromInstance(
  weakRef: HeapInstance,
  // Null for pre 2.0 alpha 3 heap dumps
  heapDumpUptimeMillis: Long?
): KeyedWeakReferenceMirror {

  val keyWeakRefClassName = weakRef.instanceClassName
  // How long the object had been watched when the heap was dumped
  val watchDurationMillis = if (heapDumpUptimeMillis != null) {
    heapDumpUptimeMillis - weakRef[keyWeakRefClassName, "watchUptimeMillis"]!!.value.asLong!!
  } else {
    null
  }
  
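  // How long the object had been considered retained when the heap was dumped (-1 is passed through unchanged)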
  val retainedDurationMillis = if (heapDumpUptimeMillis != null) {
    val retainedUptimeMillis =
      weakRef[keyWeakRefClassName, "retainedUptimeMillis"]!!.value.asLong!!
    if (retainedUptimeMillis == -1L) -1L else heapDumpUptimeMillis - retainedUptimeMillis
  } else {
    null
  }

  // Read the leaking object's key
  val keyString = weakRef[keyWeakRefClassName, "key"]!!.value.readAsJavaString()!!

  // Changed from name to description after 2.0
  // Read the leaking object's description; there is a version fallback here as well
  val description = (weakRef[keyWeakRefClassName, "description"]
    ?: weakRef[keyWeakRefClassName, "name"])?.value?.readAsJavaString() ?: UNKNOWN_LEGACY
  return KeyedWeakReferenceMirror(
    watchDurationMillis = watchDurationMillis,
    retainedDurationMillis = retainedDurationMillis,
    // Read the reference to the leaking object
    referent = weakRef["java.lang.ref.Reference", "referent"]!!.value.holder as ReferenceHolder,
    key = keyString,
    description = description
  )
}

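The code above is fairly simple. It reads a few timestamps from the KeyedWeakReference, along with the leaking object's key and description, and then obtains the reference to the leaking object through the Reference#referent field. All of this data is wrapped in a KeyedWeakReferenceMirror object.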
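
To make the timestamp arithmetic concrete, here is a minimal sketch of the two duration computations. `Durations` and `computeDurations` are hypothetical names used only for illustration and are not part of LeakCanary; only the formulas are taken from the code above.

```kotlin
// Hypothetical holder for the two values computed in fromInstance().
data class Durations(
  val watchDurationMillis: Long?,
  val retainedDurationMillis: Long?
)

// heapDumpUptimeMillis is null for very old heap dumps, in which case
// neither duration can be computed.
fun computeDurations(
  heapDumpUptimeMillis: Long?,
  watchUptimeMillis: Long,
  retainedUptimeMillis: Long
): Durations {
  if (heapDumpUptimeMillis == null) return Durations(null, null)
  val watchDuration = heapDumpUptimeMillis - watchUptimeMillis
  // -1 is passed through unchanged rather than subtracted from the dump time.
  val retainedDuration =
    if (retainedUptimeMillis == -1L) -1L else heapDumpUptimeMillis - retainedUptimeMillis
  return Durations(watchDuration, retainedDuration)
}
```

For a heap dumped at uptime 10000 ms, an object watched since 4000 ms and retained since 9000 ms has been watched for 6000 ms and retained for 1000 ms.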

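With that we have all the leaking objects. Next, let's see how LeakTracer#traceObjects() finds the reference tree of each leaking object. We first need to look at where the LeakTracer is created: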

override fun createFor(heapGraph: HeapGraph): LeakTracer {
  // TODO Remove the listener and replace that by specific events
  //  Also for each event some notion of progress? Should that be configurable?
  //  We should be able to tell the total number of objects so we'll know when we've
  //  traversed the whole graph.
  //  referenceMatchers are only needed for the NativeGlobalVariablePattern, which is related
  //  to GC roots
  return LeakTracer { objectIds ->
    val helpers = FindLeakInput(
      heapGraph,
      shortestPathFinderFactory.createFor(heapGraph),
      objectInspectors,
    )
    helpers.findLeaks(objectIds)
  }
}

The LeakTracer is implemented as a lambda. It builds a FindLeakInput object and then calls findLeaks() on it to find the reference trees of the leaking objects.

private fun FindLeakInput.findLeaks(leakingObjectIds: Set<Long>): LeaksAndUnreachableObjects {
  // Find the reference paths
  val pathFindingResults =
    shortestPathFinder.findShortestPathsFromGcRoots(leakingObjectIds)
  
  // Find the leaking objects that are unreachable
  val unreachableObjects = findUnreachableObjects(pathFindingResults, leakingObjectIds)
  
  // Prune the paths: when one path runs through another leaking object, keep only the shorter path to the GC root
  val shortestPaths =
    deduplicateShortestPaths(pathFindingResults.pathsToLeakingObjects)
  
  // Mark which objects are leaking and attach extra leak information
  val inspectedObjectsByPath = inspectObjects(shortestPaths)
  
  // Compute the retained sizes of the leaking objects
  val retainedSizes =
    if (pathFindingResults.dominatorTree != null) {
      computeRetainedSizes(inspectedObjectsByPath, pathFindingResults.dominatorTree)
    } else {
      null
    }
  val (applicationLeaks, libraryLeaks) = buildLeakTraces(
    shortestPaths, inspectedObjectsByPath, retainedSizes
  )
  return LeaksAndUnreachableObjects(applicationLeaks, libraryLeaks, unreachableObjects)
}

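PrioritizingShortestPathFinder#findShortestPathsFromGcRoots() finds the reference paths to the leaking objects. findUnreachableObjects() then finds the leaking objects that cannot be reached from any GC root. deduplicateShortestPaths() prunes the paths, keeping only the shortest ones to a GC root. Finally, computeRetainedSizes() computes the retained size of the leaking objects.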

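The part that deserves the most attention is how the path from a GC root to each leaking object is found, that is, PrioritizingShortestPathFinder#findShortestPathsFromGcRoots(). Its core is the findPathsFromGcRoots() function: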

private fun State.findPathsFromGcRoots(): PathFindingResults {
  // Enqueue all GC roots (except ROOT_JAVA_FRAME, which are found while traversing Thread instances)
  enqueueGcRoots()

  val shortestPathsToLeakingObjects = mutableListOf<ReferencePathNode>()
  // Start visiting the nodes waiting in the queue
  visitingQueue@ while (queuesNotEmpty) {
    // Take the next node to visit
    val node = poll()
    
    // If the current node's id is one of the leaking objects, we have found a path to a leak
    if (leakingObjectIds.contains(node.objectId)) {
      // Add the path to the results
      shortestPathsToLeakingObjects.add(node)
      // Found all refs, stop searching (unless computing retained size)
      // Check whether paths to all leaking objects have been found
      if (shortestPathsToLeakingObjects.size == leakingObjectIds.size()) {
        if (computeRetainedHeapSize) {
          listener.onEvent(StartedFindingDominators)
        } else {
          // If the retained size does not need to be computed, leave the loop
          break@visitingQueue
        }
      }
    }

    val heapObject = try {
      // Look up the heap object for this node
      graph.findObjectById(node.objectId)
    } catch (objectIdNotFound: IllegalArgumentException) {
      // This should never happen (a heap should only have references to objects that exist)
      // but when it does happen, let's at least display how we got there.
      throw RuntimeException(graph.invalidObjectIdErrorMessage(node), objectIdNotFound)
    }
    // Use the Reader built by AndroidReferenceReaderFactory to read the objects referenced by the current node, then enqueue them for later visiting
    objectReferenceReader.read(heapObject).forEach { reference ->
      // Build a new ChildNode
      val newNode = ChildNode(
        objectId = reference.valueObjectId,
        // Parent node
        parent = node,
        lazyDetailsResolver = reference.lazyDetailsResolver
      )
      // Add it to the queue
      enqueue(
        node = newNode,
        isLowPriority = reference.isLowPriority,
        isLeafObject = reference.isLeafObject
      )
    }
  }
  return PathFindingResults(
    shortestPathsToLeakingObjects,
    if (visitTracker is Dominated) visitTracker.dominatorTree else null
  )
}

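Every node to be visited is added to a queue. All GC roots (except ROOT_JAVA_FRAME, which are found when Thread instances are traversed) are enqueued first, so the search starts from the GC roots. The Reader built by AndroidReferenceReaderFactory then reads the instances referenced by the current node, and those are added to the queue in turn. Because each ChildNode keeps a pointer to its parent, the full path back to the GC root can be rebuilt from any node.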
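
The traversal can be sketched on a toy graph as follows. This is a simplified illustration under my own assumptions, not the Shark implementation: object IDs are plain Longs, `Edge`, `PathNode` and `findShortestPaths` are made-up names, and the two queues only approximate how the `isLowPriority` flag is used (leaf objects and dominator tracking are left out).

```kotlin
// Toy model: each object id maps to its outgoing references.
data class Edge(val target: Long, val isLowPriority: Boolean = false)

// A path is a linked list of nodes pointing back toward the GC root.
class PathNode(val objectId: Long, val parent: PathNode?)

fun findShortestPaths(
  gcRoots: List<Long>,
  edges: Map<Long, List<Edge>>,
  leakingIds: Set<Long>
): Map<Long, List<Long>> {
  val toVisit = ArrayDeque<PathNode>()      // normal references
  val toVisitLast = ArrayDeque<PathNode>()  // low priority references
  val visited = HashSet<Long>()
  val results = LinkedHashMap<Long, List<Long>>()
  var visitingLast = false

  gcRoots.forEach { toVisit.addLast(PathNode(it, parent = null)) }

  while (toVisit.isNotEmpty() || toVisitLast.isNotEmpty()) {
    val node = if (!visitingLast && toVisit.isNotEmpty()) {
      toVisit.removeFirst()
    } else {
      visitingLast = true
      toVisitLast.removeFirst()
    }
    // Breadth-first order: the first visit of an object is via a shortest path.
    if (!visited.add(node.objectId)) continue

    if (node.objectId in leakingIds) {
      // Follow parent pointers back to the root, then reverse.
      results[node.objectId] = generateSequence(node) { it.parent }
        .map { it.objectId }
        .toList()
        .reversed()
      if (results.size == leakingIds.size) break
    }

    for (edge in edges[node.objectId].orEmpty()) {
      val child = PathNode(edge.target, parent = node)
      if (visitingLast || edge.isLowPriority) {
        toVisitLast.addLast(child)
      } else {
        toVisit.addLast(child)
      }
    }
  }
  return results
}
```

In this sketch an object that is reachable both through a normal reference and through a low priority one is reported with the normal path, even if that path is longer, because the low priority queue is only drained once the normal queue is empty.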
Let's look at the source of AndroidReferenceReaderFactory:

class AndroidReferenceReaderFactory(
  private val referenceMatchers: List<ReferenceMatcher>
) : ReferenceReader.Factory<HeapObject> {

  private val virtualizingFactory = VirtualizingMatchingReferenceReaderFactory(
    referenceMatchers = referenceMatchers,
    virtualRefReadersFactory = { graph ->
      listOf(
        JavaLocalReferenceReader(graph, referenceMatchers),
      ) +
        AndroidReferenceReaders.values().mapNotNull { it.create(graph) } +
        OpenJdkInstanceRefReaders.values().mapNotNull { it.create(graph) } +
        ApacheHarmonyInstanceRefReaders.values().mapNotNull { it.create(graph) }
    }
  )

  override fun createFor(heapGraph: HeapGraph): ReferenceReader<HeapObject> {
    return virtualizingFactory.createFor(heapGraph)
  }
}

The actual implementation is VirtualizingMatchingReferenceReaderFactory. Several other Readers are passed to its constructor: JavaLocalReferenceReader, AndroidReferenceReaders, OpenJdkInstanceRefReaders and ApacheHarmonyInstanceRefReaders. Their main job is to optimize how the elements of collection classes are read. When reading an ArrayList, for example, what we care about are its elements, so those are surfaced directly as individual references. They have some other duties too. As mentioned earlier, ROOT_JAVA_FRAME roots are left out when the GC roots are enqueued; JavaLocalReferenceReader finds them through the Thread instances instead. AndroidReferenceReaders also attaches leak information that is specific to the Android framework. We will take a brief look at JavaLocalReferenceReader and AndroidReferenceReaders later.

Next, the implementation of VirtualizingMatchingReferenceReaderFactory#createFor():

override fun createFor(heapGraph: HeapGraph): ReferenceReader<HeapObject> {
  // Reader for the reference fields of an instance
  val fieldRefReader = FieldInstanceReferenceReader(heapGraph, referenceMatchers)
  return DelegatingObjectReferenceReader(
    classReferenceReader = ClassReferenceReader(heapGraph, referenceMatchers),
    instanceReferenceReader = ChainingInstanceReferenceReader(
      virtualRefReaders = virtualRefReadersFactory.createFor(heapGraph),
      flatteningInstanceReader = FlatteningPartitionedInstanceReferenceReader(heapGraph, fieldRefReader),
      fieldRefReader = fieldRefReader
    ),
    objectArrayReferenceReader = ObjectArrayReferenceReader()
  )
}

Next, DelegatingObjectReferenceReader#read():

override fun read(source: HeapObject): Sequence<Reference> {
  return when(source) {
    is HeapClass -> classReferenceReader.read(source)
    is HeapInstance -> instanceReferenceReader.read(source)
    is HeapObjectArray -> objectArrayReferenceReader.read(source)
    is HeapPrimitiveArray -> emptySequence()
  }
}

A different Reader is chosen depending on the type of the heap object.

Handling of classes:

override fun read(source: HeapClass): Sequence<Reference> {
  val ignoredStaticFields = staticFieldNameByClassName[source.name] ?: emptyMap()

  return source.readStaticFields().mapNotNull { staticField ->
    // not non null: no null + no primitives.
    if (!staticField.value.isNonNullReference) {
      return@mapNotNull null
    }
    val fieldName = staticField.name
    if (
    // Android noise
      fieldName == "$staticOverhead" ||
      // Android noise
      fieldName == "$classOverhead" ||
      // JVM noise
      fieldName == "<resolved_references>"
    ) {
      return@mapNotNull null
    }

    // Note: instead of calling staticField.value.asObjectId!! we cast holder to ReferenceHolder
    // and access value directly. This allows us to avoid unnecessary boxing of Long.
    val valueObjectId = (staticField.value.holder as ReferenceHolder).value
    val referenceMatcher = ignoredStaticFields[fieldName]

    if (referenceMatcher is IgnoredReferenceMatcher) {
      null
    } else {
      val sourceObjectId = source.objectId
      Reference(
        valueObjectId = valueObjectId,
        isLowPriority = referenceMatcher != null,
        lazyDetailsResolver = {
          LazyDetails(
            name = fieldName,
            locationClassObjectId = sourceObjectId,
            locationType = STATIC_FIELD,
            isVirtual = false,
            matchedLibraryLeak = referenceMatcher as LibraryLeakReferenceMatcher?,
          )
        }
      )
    }
  }
}

The code is simple: it reads only static fields, and only those holding non-null references. It also drops the fields that are Android or JVM internals.

Handling of object arrays:

override fun read(source: HeapObjectArray): Sequence<Reference> {
  if (source.isSkippablePrimitiveWrapperArray) {
    // primitive wrapper arrays aren't interesting.
    // That also means the wrapped size isn't added to the dominator tree, so we need to
    // add that back when computing shallow size in ShallowSizeCalculator.
    // Another side effect is that if the wrapped primitive is referenced elsewhere, we might
    // double count its size.
    return emptySequence()
  }

  val graph = source.graph
  val record = source.readRecord()
  val arrayClassId = source.arrayClassId
  return record.elementIds.asSequence().filter { objectId ->
    objectId != ValueHolder.NULL_REFERENCE && graph.objectExists(objectId)
  }.mapIndexed { index, elementObjectId ->
    Reference(
      valueObjectId = elementObjectId,
      isLowPriority = false,
      lazyDetailsResolver = {
        LazyDetails(
          name = index.toString(),
          locationClassObjectId = arrayClassId,
          locationType = ARRAY_ENTRY,
          isVirtual = false,
          matchedLibraryLeak = null
        )
      }
    )
  }
}

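This code is also simple. Arrays of primitive wrapper types (Integer[] and the like) are skipped first, then null elements and elements that do not exist in the graph are filtered out.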
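
One detail of that chain is worth spelling out: because `filter` runs before `mapIndexed`, the index passed to the lambda is the position among the elements that survived the filter, not the slot in the original array. The sketch below shows the same Kotlin behavior on a plain `LongArray`; `indexedNonNull` is a made-up helper, and using 0 as the null reference is an assumption of this example.

```kotlin
// Mirrors the filter-then-mapIndexed shape of the reader above.
// 0L stands in for a null reference in this sketch.
fun indexedNonNull(elementIds: LongArray): List<Pair<Int, Long>> =
  elementIds.asSequence()
    .filter { id -> id != 0L }
    .mapIndexed { index, id -> index to id }
    .toList()
```

With the input `[0, 7, 0, 9]` the two surviving elements get the indices 0 and 1, although they sit in slots 1 and 3 of the array.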

Plain instances are considerably more involved than the other kinds. They are handled by ChainingInstanceReferenceReader:

override fun read(source: HeapInstance): Sequence<Reference> {
  // Find a VirtualRefReader that handles this instance
  val virtualRefReader = findMatchingVirtualReader(source)
  return if (virtualRefReader == null) {
    // None found: read the fields directly
    fieldRefReader.read(source)
  } else {
    if (flatteningInstanceReader != null && virtualRefReader.readsCutSet) {
      flatteningInstanceReader.read(virtualRefReader, source)
    } else {
      // Call the VirtualRefReader
      val virtualRefs = virtualRefReader.read(source)
      // Note: always forwarding to fieldRefReader means we may navigate the structure twice
      // which increases IO reads. However this is a trade-off that allows virtualRef impls to
      // focus on a subset of references and more importantly it means we still get a proper
      // calculation of retained size as we don't skip any instance.
      // Read the fields
      val fieldRefs = fieldRefReader.read(source)
      // Combine both results
      virtualRefs + fieldRefs
    }
  }
}

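The VirtualRefReader in the source above is one of the specialized Readers mentioned earlier. If no matching VirtualRefReader is found, the instance's fields are read directly. If there is one, its results are combined with the field references (or, when the reader declares readsCutSet, the work is handed to flatteningInstanceReader). As the comment explains, VirtualRefReader and FieldInstanceReferenceReader may walk the same structure twice, which is a deliberate trade-off.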

Let's first look at how FieldInstanceReferenceReader#read() reads the references held in member fields:

override fun read(source: HeapInstance): Sequence<Reference> {
  
  // Skip primitive wrapper instances, String instances, and instances whose size is not larger than sizeOfObjectInstances
  if (source.isPrimitiveWrapper ||
    // We ignore the fact that String references a value array to avoid having
    // to read the string record and find the object id for that array, since we know
    // it won't be interesting anyway.
    // That also means the value array isn't added to the dominator tree, so we need to
    // add that back when computing shallow size in ShallowSizeCalculator.
    // Another side effect is that if the array is referenced elsewhere, we might
    // double count its size.
    source.instanceClassName == "java.lang.String" ||
    source.instanceClass.instanceByteSize <= sizeOfObjectInstances
  ) {
    return emptySequence()
  }

  val fieldReferenceMatchers = LinkedHashMap<String, ReferenceMatcher>()
  
  // Collect the class of the current instance and all of its superclasses (java.lang.Object excluded)
  val classHierarchy = source.instanceClass.classHierarchyWithoutJavaLangObject(javaLangObjectId)

  // Collect the matching ReferenceMatchers; we will skip over this part
  classHierarchy.forEach {
    val referenceMatcherByField = fieldNameByClassName[it.name]
    if (referenceMatcherByField != null) {
      for ((fieldName, referenceMatcher) in referenceMatcherByField) {
        if (!fieldReferenceMatchers.containsKey(fieldName)) {
          fieldReferenceMatchers[fieldName] = referenceMatcher
        }
      }
    }
  }

  return with(source) {
    // Assigning to local variable to avoid repeated lookup and cast:
    // HeapInstance.graph casts HeapInstance.hprofGraph to HeapGraph in its getter
    val hprofGraph = graph
    // Lazily wrap the bytes of the instance record in a FieldIdReader, used below to read the values
    val fieldReader by lazy(NONE) {
      FieldIdReader(readRecord(), hprofGraph.identifierByteSize)
    }
    val result = mutableListOf<Pair<String, Reference>>()
    var skipBytesCount = 0
    
    // Iterate over the current class and all of its superclasses
    for (heapClass in classHierarchy) {
      // Iterate over all member fields declared by the class
      for (fieldRecord in heapClass.readRecordFields()) {
        if (fieldRecord.type != PrimitiveType.REFERENCE_HPROF_TYPE) {
          // Skip all fields that are not references. Track how many bytes to skip
          // Not a reference type: skip it
          skipBytesCount += hprofGraph.getRecordSize(fieldRecord)
        } else {
          // Skip the accumulated bytes offset
          fieldReader.skipBytes(skipBytesCount)
          skipBytesCount = 0
          // Read the ID of the referenced object
          val valueObjectId = fieldReader.readId()
          if (valueObjectId != 0L) {
            // Read the name of the member field
            val name = heapClass.instanceFieldName(fieldRecord)
            val referenceMatcher = fieldReferenceMatchers[name]
            if (referenceMatcher !is IgnoredReferenceMatcher) {
              val locationClassObjectId = heapClass.objectId
              // Add it to the result
              result.add(
                name to Reference(
                  valueObjectId = valueObjectId,
                  isLowPriority = referenceMatcher != null,
                  lazyDetailsResolver = {
                    LazyDetails(
                      name = name,
                      locationClassObjectId = locationClassObjectId,
                      locationType = INSTANCE_FIELD,
                      matchedLibraryLeak = referenceMatcher as LibraryLeakReferenceMatcher?,
                      isVirtual = false
                    )
                  }
                )
              )
            }
          }
        }
      }
    }
    result.sortBy { it.first }
    result.asSequence().map { it.second }
  }
}

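In short, the code first collects the class hierarchy of the instance, then reads the content of the instance, which is a byte array, and wraps it in a FieldIdReader that is used to read IDs and references. It then iterates over every class and its reference-typed member fields, reading the corresponding reference ID through the FieldIdReader. For the details, see the comments in the code above.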
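
The skip-and-read pattern can be illustrated with a plain `ByteBuffer`. This is a minimal sketch under stated assumptions: identifiers are 4 bytes wide, the field layout is given up front, and `FieldType`, `FieldDecl` and `readReferenceFields` are hypothetical names rather than Shark classes.

```kotlin
import java.nio.ByteBuffer

// Hypothetical field types with their sizes in the instance record.
enum class FieldType(val byteSize: Int) { REFERENCE(4), INT(4), LONG(8), BOOLEAN(1) }

data class FieldDecl(val name: String, val type: FieldType)

// Returns the non-null reference fields found in the record, by field name.
fun readReferenceFields(record: ByteArray, fields: List<FieldDecl>): Map<String, Long> {
  val buffer = ByteBuffer.wrap(record)
  val result = LinkedHashMap<String, Long>()
  var skipBytesCount = 0
  for (field in fields) {
    if (field.type != FieldType.REFERENCE) {
      // Do not touch the buffer yet, only remember how many bytes to jump over.
      skipBytesCount += field.type.byteSize
    } else {
      // Jump over all the non-reference bytes accumulated so far.
      buffer.position(buffer.position() + skipBytesCount)
      skipBytesCount = 0
      // Read a 4-byte identifier as an unsigned value.
      val valueObjectId = buffer.int.toLong() and 0xFFFFFFFFL
      if (valueObjectId != 0L) {
        result[field.name] = valueObjectId
      }
    }
  }
  return result
}
```

Accumulating the skip count means that trailing primitive fields are never read at all, and that a run of primitive fields costs a single position update.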

Next, the implementation of JavaLocalReferenceReader#read():

override fun read(source: HeapInstance): Sequence<Reference> {
  val referenceMatcher =  source[Thread::class, "name"]?.value?.readAsJavaString()?.let { threadName ->
    threadNameReferenceMatchers[threadName]
  }

  if (referenceMatcher is IgnoredReferenceMatcher) {
    return emptySequence()
  }
  // ID of the instance's class
  val threadClassId = source.instanceClassId
  // Look up the ROOT_JAVA_FRAME roots that belong to this thread instance's ID
  return JavaFrames.getByThreadObjectId(graph, source.objectId)?.let { frames ->
    frames.asSequence().map { frame ->
      Reference(
        valueObjectId = frame.id,
        // Java Frames always have low priority because their path is harder to understand
        // for developers
        isLowPriority = true,
        lazyDetailsResolver = {
          LazyDetails(
            // Unfortunately Android heap dumps do not include stack trace data, so
            // JavaFrame.frameNumber is always -1 and we cannot know which method is causing the
            // reference to be held.
            name = "",
            locationClassObjectId = threadClassId,
            locationType = LOCAL,
            matchedLibraryLeak = referenceMatcher as LibraryLeakReferenceMatcher?,
            isVirtual = true
          )
        }
      )
    }
  } ?: emptySequence()
}

JavaLocalReferenceReader only handles instances of Thread and its subclasses.

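Now let's look at the ACTIVITY_THREAD__NEW_ACTIVITIES implementation in AndroidReferenceReaders. This Reader is interesting and worth learning from. It does two things. First, it walks the ActivityClientRecord objects that ActivityThread keeps as a linked list and exposes them as ARRAY_ENTRY references (as mentioned earlier, this is what most VirtualReaders do: they turn the contents of a collection into ARRAY_ENTRY references). Second, it attaches a description for a known leak. On Android, every newly created Activity is added to the ActivityThread member mNewActivities and is only removed once the main thread becomes idle. If the main thread stays busy, the record cannot be removed, and if the corresponding Activity is destroyed in the meantime, it may leak. This problem is similar to one I analyzed before: [Framework] How the Activity onDestroy lifecycle callback gets delayed.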

ACTIVITY_THREAD__NEW_ACTIVITIES {
  override fun create(graph: HeapGraph): VirtualInstanceReferenceReader? {
    // Look up the ActivityThread class
    val activityThreadClass = graph.findClassByName("android.app.ActivityThread") ?: return null
    
    // Check that it has an mNewActivities member field
    if (activityThreadClass.readRecordFields().none {
        activityThreadClass.instanceFieldName(it) == "mNewActivities"
      }
    ) {
      return null
    }
    
    // Look up the ActivityClientRecord class
    val activityClientRecordClass =
      graph.findClassByName("android.app.ActivityThread$ActivityClientRecord") ?: return null
    
    // Collect all field names of ActivityClientRecord
    val activityClientRecordFieldNames = activityClientRecordClass.readRecordFields()
      // This looks like a bug: activityClientRecordClass should be used here, although the mistake does not affect the result
      .map { activityThreadClass.instanceFieldName(it) }
      .toList()
    
    // If ActivityClientRecord lacks the nextIdle or activity field, return null
    if ("nextIdle" !in activityClientRecordFieldNames ||
      "activity" !in activityClientRecordFieldNames
    ) {
      return null
    }

    val activityThreadClassId = activityThreadClass.objectId
    val activityClientRecordClassId = activityClientRecordClass.objectId

    return object : VirtualInstanceReferenceReader {
      // Only handles ActivityThread and ActivityClientRecord instances
      override fun matches(instance: HeapInstance) =
        instance.instanceClassId == activityThreadClassId ||
          instance.instanceClassId == activityClientRecordClassId

      override val readsCutSet = false

      override fun read(source: HeapInstance): Sequence<Reference> {
        return if (source.instanceClassId == activityThreadClassId) {
          // The instance is an ActivityThread

          // Read the object ID held in ActivityThread#mNewActivities
          val mNewActivities =
            source["android.app.ActivityThread", "mNewActivities"]!!.value.asObjectId!!
          if (mNewActivities == ValueHolder.NULL_REFERENCE) {
            emptySequence()
          } else {
            // Cache the object ID of mNewActivities
            source.graph.context[ACTIVITY_THREAD__NEW_ACTIVITIES.name] = mNewActivities
           
            sequenceOf(
              Reference(
                valueObjectId = mNewActivities,
                isLowPriority = false,
                lazyDetailsResolver = {
                  // Attach the leak description for this case
                  LazyDetails(
                    name = "mNewActivities",
                    locationClassObjectId = activityThreadClassId,
                    locationType = INSTANCE_FIELD,
                    isVirtual = false,
                    matchedLibraryLeak = instanceField(
                      className = "android.app.ActivityThread",
                      fieldName = "mNewActivities"
                    ).leak(
                      description = """
                     New activities are leaked by ActivityThread until the main thread becomes idle.
                     Tracked here: https://issuetracker.google.com/issues/258390457
                   """.trimIndent()
                    )
                  )
                })
            )
          }
        } else {
          // The instance is an ActivityClientRecord
          
          val mNewActivities =
            source.graph.context.get<Long?>(ACTIVITY_THREAD__NEW_ACTIVITIES.name)
          if (mNewActivities == null || source.objectId != mNewActivities) {
            emptySequence()
          } else {
            // Walk the ActivityClientRecord linked list; nextIdle points to the next node
            generateSequence(source) { node ->
              // Read the next node
              node["android.app.ActivityThread$ActivityClientRecord", "nextIdle"]!!.valueAsInstance
            }.withIndex().mapNotNull { (index, node) ->

              // Read the Activity instance
              val activity =
                node["android.app.ActivityThread$ActivityClientRecord", "activity"]!!.valueAsInstance
              if (activity == null ||
                // Skip non destroyed activities.
                // (!= true because we also skip if mDestroyed is missing)
                // Read whether the Activity has been destroyed; only destroyed activities are kept
                activity["android.app.Activity", "mDestroyed"]?.value?.asBoolean != true
              ) {
                null
              } else {
                // Wrap it as a result
                Reference(
                  valueObjectId = activity.objectId,
                  isLowPriority = false,
                  lazyDetailsResolver = {
                    LazyDetails(
                      name = "$index",
                      locationClassObjectId = activityClientRecordClassId,
                      locationType = ARRAY_ENTRY,
                      isVirtual = true,
                      matchedLibraryLeak = null
                    )
                  })
              }
            }
          }
        }
      }
    }
  }
}

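The code above is not complicated either; just follow the comments. One thing, though: when reading the field names of ActivityClientRecord, it appears to use the ActivityThread class where the ActivityClientRecord class should be used, although the mistake does not affect the final result. I submitted a PR for this; hopefully it gets merged. 😂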
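
The linked-list walk relies on Kotlin's `generateSequence`, which keeps calling the lambda until it returns null. Here is a stripped-down sketch of the same shape; `Record` and `destroyedActivities` are hypothetical stand-ins for the heap objects, not LeakCanary types.

```kotlin
// Hypothetical stand-in for ActivityClientRecord: an optional activity name,
// its destroyed flag, and a pointer to the next record in the list.
class Record(val activityName: String?, val destroyed: Boolean, val nextIdle: Record?)

// Walks the list from head and keeps only destroyed activities,
// paired with their position in the list.
fun destroyedActivities(head: Record): List<Pair<Int, String>> =
  generateSequence(head) { node -> node.nextIdle }
    .withIndex()
    .mapNotNull { (index, node) ->
      val name = node.activityName
      if (name == null || !node.destroyed) null else index to name
    }
    .toList()
```

Since `withIndex()` is applied before `mapNotNull`, the index is the position in the linked list, so skipped records still consume an index.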

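As for computing the memory held by the leaking objects, I will not paste that code and will only describe it briefly. Each instance has a dedicated 4 bytes describing the size of the instance; this is the so-called shallow size. The retained size of an object is the sum of the shallow sizes of all nodes in the reference tree that starts at the object (when several nodes reference the same node, that node is attributed to whichever is closest to a GC root).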
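
A rough sketch of that summation, assuming we already know which object each object is attributed to (its owner in the tree). The function name and both input maps are made up for this illustration and are not Shark APIs.

```kotlin
// shallowSize: object id -> its own size in bytes.
// ownerOf: object id -> the object it is attributed to (absent for top-level objects).
fun retainedSizes(shallowSize: Map<Long, Int>, ownerOf: Map<Long, Long>): Map<Long, Int> {
  // Every object retains at least its own shallow size.
  val retained = shallowSize.toMutableMap()
  for ((objectId, size) in shallowSize) {
    // Add this object's shallow size to every ancestor in the tree.
    var owner = ownerOf[objectId]
    while (owner != null) {
      retained[owner] = (retained[owner] ?: 0) + size
      owner = ownerOf[owner]
    }
  }
  return retained
}
```

For a tree where object 1 (10 bytes) owns objects 2 (20 bytes) and 3 (30 bytes), and object 2 owns object 4 (5 bytes), the retained sizes are 65, 25, 30 and 5 bytes respectively.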

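The code that prunes the reference paths is rather interesting. It is based on a trie (also known as a prefix tree); if tries are new to you, it is worth reading up on them first. Here is the pruning code: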

internal sealed class TrieNode {
  abstract val objectId: Long

  class ParentNode(override val objectId: Long) : TrieNode() {
    val children = mutableMapOf<Long, TrieNode>()
    override fun toString(): String {
      return "ParentNode(objectId=$objectId, children=$children)"
    }
  }

  class LeafNode(
    override val objectId: Long,
    val pathNode: ReferencePathNode
  ) : TrieNode()
}

// Entry point of the pruning
private fun deduplicateShortestPaths(
  // The paths from GC roots to the leaking objects found earlier. When one path runs through another leaking object, only the shorter path to the GC root is kept
  inputPathResults: List<ReferencePathNode>
): List<ShortestPath> {
  val rootTrieNode = ParentNode(0)
  
  // Iterate over all input paths
  inputPathResults.forEach { pathNode ->
    // Go through the linked list of nodes and build the reverse list of instances from
    // root to leaking.
    // Walk the nodes of a single path; path ends up holding the object IDs from the GC root to the leaking object
    // Once the loop finishes, leakNode is the GC root
    val path = mutableListOf<Long>()
    var leakNode: ReferencePathNode = pathNode
    while (leakNode is ChildNode) {
      path.add(0, leakNode.objectId)
      leakNode = leakNode.parent
    }
    path.add(0, leakNode.objectId)
    // Update the trie
    updateTrie(pathNode, path, 0, rootTrieNode)
  }

  val outputPathResults = mutableListOf<ReferencePathNode>()
  // Once the trie has been built the pruning is done: each remaining leaf node holds a shortest path from a GC root to a leaking object
  findResultsInTrie(rootTrieNode, outputPathResults)

  if (outputPathResults.size != inputPathResults.size) {
    SharkLog.d {
      "Found ${inputPathResults.size} paths to retained objects," +
        " down to ${outputPathResults.size} after removing duplicated paths"
    }
  } else {
    SharkLog.d { "Found ${outputPathResults.size} paths to retained objects" }
  }

  return outputPathResults.map { retainedObjectNode ->
    val shortestChildPath = mutableListOf<ChildNode>()
    var node = retainedObjectNode
    while (node is ChildNode) {
      shortestChildPath.add(0, node)
      node = node.parent
    }
    val rootNode = node as RootNode
    ShortestPath(rootNode, shortestChildPath)
  }
}

// Update the trie
private fun updateTrie(
  // The original input path
  pathNode: ReferencePathNode,
  // Object IDs from the GC root to the leaking object
  path: List<Long>,
  // Current index into path
  pathIndex: Int,
  // Current trie node
  parentNode: ParentNode
) {
  val objectId = path[pathIndex]
  if (pathIndex == path.lastIndex) {
    // Last element of the path: store a LeafNode. If a ParentNode was already here, a longer path ran through this object; it gets replaced, which is the pruning
    parentNode.children[objectId] = LeafNode(objectId, pathNode)
  } else {
    // If there is no child node yet, create a new ParentNode
    val childNode = parentNode.children[objectId] ?: run {
      val newChildNode = ParentNode(objectId)
      parentNode.children[objectId] = newChildNode
      newChildNode
    }
    // If childNode is a LeafNode there is no need to go further: the rest of this path is longer than the one already stored
    if (childNode is ParentNode) {
      // Keep walking the path with index + 1
      updateTrie(pathNode, path, pathIndex + 1, childNode)
    }
  }
}

private fun findResultsInTrie(
  parentNode: ParentNode,
  outputPathResults: MutableList<ReferencePathNode>
) {
  parentNode.children.values.forEach { childNode ->
    when (childNode) {
      is ParentNode -> {
        findResultsInTrie(childNode, outputPathResults)
      }

      is LeafNode -> {
        outputPathResults += childNode.pathNode
      }
    }
  }
}
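
To see the pruning on concrete data, here is a condensed sketch that works on plain lists of object IDs instead of ReferencePathNode. `Trie`, `insert`, `collect` and `dedupPaths` are simplified names of my own; the logic follows updateTrie() and findResultsInTrie() above.

```kotlin
sealed class Trie {
  class Parent : Trie() {
    // LinkedHashMap keeps insertion order, so results come out in a stable order.
    val children = LinkedHashMap<Long, Trie>()
  }
  class Leaf(val path: List<Long>) : Trie()
}

fun insert(parent: Trie.Parent, path: List<Long>, index: Int) {
  val objectId = path[index]
  if (index == path.lastIndex) {
    // End of the path: a Leaf replaces whatever was here, dropping longer paths.
    parent.children[objectId] = Trie.Leaf(path)
  } else {
    val child = parent.children.getOrPut(objectId) { Trie.Parent() }
    // A Leaf on the way means a shorter path already ends here: stop.
    if (child is Trie.Parent) insert(child, path, index + 1)
  }
}

fun collect(parent: Trie.Parent, out: MutableList<List<Long>>) {
  for (child in parent.children.values) {
    when (child) {
      is Trie.Parent -> collect(child, out)
      is Trie.Leaf -> out += child.path
    }
  }
}

// Each path lists object ids from the GC root to a leaking object.
fun dedupPaths(paths: List<List<Long>>): List<List<Long>> {
  val root = Trie.Parent()
  paths.forEach { insert(root, it, 0) }
  val out = mutableListOf<List<Long>>()
  collect(root, out)
  return out
}
```

Given the paths `[1, 2, 3]`, `[1, 2, 3, 4]` and `[1, 5]`, the second one is dropped because it runs through object 3, which is itself the end of a shorter path. The result is the same whichever of the two is inserted first.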

Wrapping up

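This is the last article in the LeakCanary source code reading series. Every time I read source code I learn something new. I hope I can keep up the habit of reading the source of excellent open-source libraries, and I hope that reading the LeakCanary source has given you some inspiration too.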