A First Look at VM Memory Management Through the Hprof Source Code

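Background

In Android development, memory problems such as memory leaks and OOMs come up again and again.
Offline, we can usually bring in LeakCanary to track down memory leaks, but its limitations make it unsuitable for production.
For production, a number of excellent frameworks have emerged that collect and analyze memory leaks and OOMs in the field, such as Kuaishou's KOOM.
By integrating these two frameworks directly, we can diagnose and fix the memory leaks and OOMs we run into.
But is that the end of the story? Of course not.
Today we explore what the hprof heap-snapshot format that these tools analyze actually looks like.
What is a GcRoot, and how are those objects found?
How are the objects we create in Java managed?
With these questions in mind, and some enthusiasm for the technology, let's move on to the next part, "Tracing the hprof format to its source".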

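Hprof format

Introduction

[Figure: hprof format]

If the hprof format above leaves you a little lost, the code should make it clearer. Let's move on to the next part, "The hprof format in source".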

Source code

art/runtime/hprof/hprof.cc

ProcessHeap

Overall, a heap snapshot consists of two parts: header + body.

void ProcessHeap(bool header_first)
      REQUIRES(Locks::mutator_lock_) {
    // Reset current heap and object count.
    current_heap_ = HPROF_HEAP_DEFAULT;
    objects_in_segment_ = 0;
    if (header_first) {
      // Skim this part to see how the format is organized.
      ProcessHeader(true);
      // Focus here!
      ProcessBody();
    } else {
      ProcessBody();
      ProcessHeader(false);
    }
}

ProcessHeader

First, the header part of the heap dump.

void ProcessHeader(bool string_first) REQUIRES(Locks::mutator_lock_) {
    // Write the header.
    // A fixed-layout header, comparable to the class file format or the ELF format of a .so.
    WriteFixedHeader();
    // Write the string and class tables, and any stack traces, to the header.
    if (string_first) {
      WriteStringTable();
    }
    // Class table: every class we have loaded is tracked in this structure.
    WriteClassTable();
    WriteStackTraces();
    if (!string_first) {
      WriteStringTable();
    }
    output_->EndRecord();
  }
FixedHeader

Comparing this code with the hprof format makes this part of the format easier to understand.

void WriteFixedHeader() {
    // Write the file header.
    // U1: NUL-terminated magic string.
    const char magic[] = "JAVA PROFILE 1.0.3";
    __ AddU1List(reinterpret_cast<const uint8_t*>(magic), sizeof(magic));

    // U4: size of identifiers.  We're using addresses as IDs and our heap references are stored
    // as uint32_t.
    // Note of warning: hprof-conv hard-codes the size of identifiers to 4.
    static_assert(sizeof(mirror::HeapReference<mirror::Object>) == sizeof(uint32_t),
                  "Unexpected HeapReference size");
    __ AddU4(sizeof(uint32_t));

    // The current time, in milliseconds since 0:00 GMT, 1/1/70.
    timeval now;
    const uint64_t nowMs = (gettimeofday(&now, nullptr) < 0) ? 0 :
        (uint64_t)now.tv_sec * 1000 + now.tv_usec / 1000;
    // TODO: It seems it would be correct to use U8.
    // U4: high word of the 64-bit time.
    __ AddU4(static_cast<uint32_t>(nowMs >> 32));
    // U4: low word of the 64-bit time.
    __ AddU4(static_cast<uint32_t>(nowMs & 0xFFFFFFFF));
  }
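To make the layout written by WriteFixedHeader concrete, here is a minimal reader sketch. It assumes the whole dump is already in memory and that multi-byte values are big-endian, as in the hprof format. `FixedHeader`, `ReadU4`, and `ParseFixedHeader` are hypothetical names for illustration and are not part of ART.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical holder for the three fixed-header fields.
struct FixedHeader {
  std::string magic;          // e.g. "JAVA PROFILE 1.0.3"
  uint32_t id_size = 0;       // size of an identifier in bytes
  uint64_t timestamp_ms = 0;  // milliseconds since the epoch
};

// Reads a big-endian u4 at `pos` and advances `pos` past it.
inline uint32_t ReadU4(const std::vector<uint8_t>& buf, size_t& pos) {
  assert(pos + 4 <= buf.size());
  const uint32_t v = (uint32_t{buf[pos]} << 24) | (uint32_t{buf[pos + 1]} << 16) |
                     (uint32_t{buf[pos + 2]} << 8) | uint32_t{buf[pos + 3]};
  pos += 4;
  return v;
}

// Parses the fixed header and returns the offset of the first record.
inline size_t ParseFixedHeader(const std::vector<uint8_t>& buf, FixedHeader* out) {
  size_t pos = 0;
  // The magic string is NUL-terminated.
  while (pos < buf.size() && buf[pos] != 0) {
    out->magic.push_back(static_cast<char>(buf[pos++]));
  }
  assert(pos < buf.size());
  ++pos;  // skip the terminating NUL
  out->id_size = ReadU4(buf, pos);
  // The timestamp is stored as two u4 words, high word first.
  const uint64_t high = ReadU4(buf, pos);
  const uint64_t low = ReadU4(buf, pos);
  out->timestamp_ms = (high << 32) | low;
  return pos;
}
```

Under these assumptions, a dump produced by the code above would report an `id_size` of 4, matching the `AddU4(sizeof(uint32_t))` call.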
StringTable
void WriteStringTable() {
    // SafeMap<std::string, HprofStringId> strings_;
    // using HprofStringId = uint32_t;
    for (const auto& p : strings_) {
      const std::string& string = p.first;
      const HprofStringId id = p.second;

      output_->StartNewRecord(HPROF_TAG_STRING, kHprofTime);

      // STRING format:
      // ID:  ID for this string
      // U1*: UTF8 characters for string (NOT null terminated)
      //      (the record format encodes the length)
      __ AddU4(id);
      __ AddUtf8String(string.c_str());
    }
  }
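Each top-level hprof record is framed as a one-byte tag, a u4 time offset, and a u4 body length, which is why the STRING body needs no terminator. The sketch below encodes and decodes one STRING record under the assumption of 4-byte IDs and big-endian values. `EncodeStringRecord` and `DecodeStringRecord` are hypothetical helpers, not ART functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr uint8_t kTagString = 0x01;  // STRING record tag in the hprof format

// Appends a big-endian u4.
inline void AppendU4(std::vector<uint8_t>& out, uint32_t v) {
  out.push_back(static_cast<uint8_t>(v >> 24));
  out.push_back(static_cast<uint8_t>(v >> 16));
  out.push_back(static_cast<uint8_t>(v >> 8));
  out.push_back(static_cast<uint8_t>(v));
}

// Encodes one STRING record: tag, time offset, body length, then ID + UTF-8 bytes.
inline std::vector<uint8_t> EncodeStringRecord(uint32_t id, const std::string& s) {
  std::vector<uint8_t> out;
  out.push_back(kTagString);
  AppendU4(out, 0);                                    // time offset (unused here)
  AppendU4(out, static_cast<uint32_t>(4 + s.size()));  // body length: ID + characters
  AppendU4(out, id);
  out.insert(out.end(), s.begin(), s.end());           // no NUL terminator
  return out;
}

// Decodes a buffer holding exactly one STRING record.
inline std::string DecodeStringRecord(const std::vector<uint8_t>& rec, uint32_t* id) {
  assert(rec.size() >= 13 && rec[0] == kTagString);
  auto u4 = [&rec](size_t p) -> uint32_t {
    return (uint32_t{rec[p]} << 24) | (uint32_t{rec[p + 1]} << 16) |
           (uint32_t{rec[p + 2]} << 8) | uint32_t{rec[p + 3]};
  };
  const uint32_t body_len = u4(5);
  assert(body_len >= 4 && rec.size() == 9 + static_cast<size_t>(body_len));
  *id = u4(9);
  return std::string(rec.begin() + 13, rec.end());
}
```

The string length is recovered as the body length minus the ID size, which is the point of the "(the record format encodes the length)" comment in the source.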

ClassTable
void WriteClassTable() REQUIRES_SHARED(Locks::mutator_lock_) {
    // classes_ is effectively a Map<Class, Integer>; its actual type is:
    // SafeMap<mirror::Class*, HprofClassSerialNumber> classes_;
    // using HprofClassSerialNumber = uint32_t;
    for (const auto& p : classes_) {
      mirror::Class* c = p.first;
      HprofClassSerialNumber sn = p.second;
      CHECK(c != nullptr);
      // A class that has already been loaded
      output_->StartNewRecord(HPROF_TAG_LOAD_CLASS, kHprofTime);
      // LOAD CLASS format:
      // U4: class serial number (always > 0)
      // ID: class object ID. We use the address of the class object structure as its ID.
      // U4: stack trace serial number
      // ID: class name string ID
      __ AddU4(sn);
      __ AddObjectId(c);
      __ AddStackTraceSerialNumber(LookupStackTraceSerialNumber(c));
      __ AddStringId(LookupClassNameId(c));
    }
  }

StackTraces
void WriteStackTraces() REQUIRES_SHARED(Locks::mutator_lock_) {
    // Write a fake stack trace record so the analysis tools don't freak out.
    output_->StartNewRecord(HPROF_TAG_STACK_TRACE, kHprofTime);
    __ AddStackTraceSerialNumber(kHprofNullStackTrace);
    __ AddU4(kHprofNullThread);
    __ AddU4(0);    // no frames

    // TODO: jhat complains "WARNING: Stack trace not found for serial # -1", but no trace should
    // have -1 as its serial number (as long as HprofStackTraceSerialNumber doesn't overflow).
    for (const auto& it : traces_) {
      const gc::AllocRecordStackTrace* trace = it.first;
      HprofStackTraceSerialNumber trace_sn = it.second;
      size_t depth = trace->GetDepth();

      // First write stack frames of the trace
      for (size_t i = 0; i < depth; ++i) {
        const gc::AllocRecordStackTraceElement* frame = &trace->GetStackElement(i);
        ArtMethod* method = frame->GetMethod();
        CHECK(method != nullptr);
        output_->StartNewRecord(HPROF_TAG_STACK_FRAME, kHprofTime);
        // STACK FRAME format:
        // ID: stack frame ID. We use the address of the AllocRecordStackTraceElement object as its ID.
        // ID: method name string ID
        // ID: method signature string ID
        // ID: source file name string ID
        // U4: class serial number
        // U4: >0, line number; 0, no line information available; -1, unknown location
        auto frame_result = frames_.find(frame);
        CHECK(frame_result != frames_.end());
        __ AddU4(frame_result->second);
        __ AddStringId(LookupStringId(method->GetName()));
        __ AddStringId(LookupStringId(method->GetSignature().ToString()));
        const char* source_file = method->GetDeclaringClassSourceFile();
        if (source_file == nullptr) {
          source_file = "";
        }
        __ AddStringId(LookupStringId(source_file));
        auto class_result = classes_.find(method->GetDeclaringClass().Ptr());
        CHECK(class_result != classes_.end());
        __ AddU4(class_result->second);
        __ AddU4(frame->ComputeLineNumber());
      }

      // Then write the trace itself
      output_->StartNewRecord(HPROF_TAG_STACK_TRACE, kHprofTime);
      // STACK TRACE format:
      // U4: stack trace serial number. We use the address of the AllocRecordStackTrace object as its serial number.
      // U4: thread serial number. We use Thread::GetTid().
      // U4: number of frames
      // [ID]*: series of stack frame ID's
      __ AddStackTraceSerialNumber(trace_sn);
      __ AddU4(trace->GetTid());
      __ AddU4(depth);
      for (size_t i = 0; i < depth; ++i) {
        const gc::AllocRecordStackTraceElement* frame = &trace->GetStackElement(i);
        auto frame_result = frames_.find(frame);
        CHECK(frame_result != frames_.end());
        __ AddU4(frame_result->second);
      }
    }
  }

ProcessBody

This is where the questions raised at the start get answered. This is the key part, so pay close attention.

void ProcessBody() REQUIRES(Locks::mutator_lock_) {
    Runtime* const runtime = Runtime::Current();
    // Walk the roots and the heap.
    output_->StartNewRecord(HPROF_TAG_HEAP_DUMP_SEGMENT, kHprofTime);
    simple_roots_.clear();
    // 1. Which GC roots are there?
    // 1.1 GC roots
    runtime->VisitRoots(this);
    // 1.2 Image roots
    runtime->VisitImageRoots(this);
    // C++ lambda expression (used as a callback)
    auto dump_object = [this](mirror::Object* obj) REQUIRES_SHARED(Locks::mutator_lock_) {
      DCHECK(obj != nullptr);
      // 3. Dump the object
      DumpHeapObject(obj);
    };
    // 2.1 Where do these objects come from?
    runtime->GetHeap()->VisitObjectsPaused(dump_object);
    output_->StartNewRecord(HPROF_TAG_HEAP_DUMP_END, kHprofTime);
    output_->EndRecord();
  }
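From a reader's point of view, ProcessBody produces a HEAP_DUMP_SEGMENT record followed by a HEAP_DUMP_END record. A minimal scanner sketch that walks top-level records by their 9-byte headers is shown below. It assumes an in-memory buffer, big-endian lengths, and a starting offset just past the fixed header. `RecordInfo` and `ScanRecords` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint8_t kTagHeapDumpSegment = 0x1C;  // HEAP_DUMP_SEGMENT in the hprof format
constexpr uint8_t kTagHeapDumpEnd = 0x2C;      // HEAP_DUMP_END in the hprof format

struct RecordInfo {
  uint8_t tag;
  uint32_t length;     // body length in bytes
  size_t body_offset;  // offset of the body within the buffer
};

inline uint32_t ReadBE32(const uint8_t* p) {
  return (uint32_t{p[0]} << 24) | (uint32_t{p[1]} << 16) | (uint32_t{p[2]} << 8) | uint32_t{p[3]};
}

// Walks top-level records starting at `pos`. Each record is:
// u1 tag, u4 time offset, u4 body length, then `length` bytes of body.
inline std::vector<RecordInfo> ScanRecords(const std::vector<uint8_t>& buf, size_t pos) {
  std::vector<RecordInfo> records;
  while (pos + 9 <= buf.size()) {
    const uint8_t tag = buf[pos];
    const uint32_t len = ReadBE32(&buf[pos + 5]);
    const size_t body = pos + 9;
    if (len > buf.size() - body) {
      break;  // truncated record: stop rather than read out of bounds
    }
    records.push_back({tag, len, body});
    pos = body + len;
  }
  return records;
}
```

A tool that only cares about heap contents could filter the result for `kTagHeapDumpSegment` and parse sub-records inside each body.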
GcRoot

First, what categories of GC roots are there?

Thread Root

GC roots associated with threads

template <bool kPrecise>
void Thread::VisitRoots(RootVisitor* visitor) {
  const uint32_t thread_id = GetThreadId();
  // Readers who know how a Java Thread works underneath will find this familiar.
  // Now for the GC roots themselves.
  // 1. kRootThreadObject
  visitor->VisitRootIfNonNull(&tlsPtr_.opeer, RootInfo(kRootThreadObject, thread_id));
  if (tlsPtr_.exception != nullptr && tlsPtr_.exception != GetDeoptimizationException()) {
    // kRootNativeStack
    visitor->VisitRoot(reinterpret_cast<mirror::Object**>(&tlsPtr_.exception),
                       RootInfo(kRootNativeStack, thread_id));
  }
  if (tlsPtr_.async_exception != nullptr) {
    // kRootNativeStack
    visitor->VisitRoot(reinterpret_cast<mirror::Object**>(&tlsPtr_.async_exception),
                       RootInfo(kRootNativeStack, thread_id));
  }
  // kRootNativeStack
  visitor->VisitRootIfNonNull(&tlsPtr_.monitor_enter_object, RootInfo(kRootNativeStack, thread_id));
  // kRootJNILocal
  tlsPtr_.jni_env->VisitJniLocalRoots(visitor, RootInfo(kRootJNILocal, thread_id));
  // kRootJNIMonitor
  tlsPtr_.jni_env->VisitMonitorRoots(visitor, RootInfo(kRootJNIMonitor, thread_id));
  HandleScopeVisitRoots(visitor, thread_id);
  // Visit roots for deoptimization.
  if (tlsPtr_.stacked_shadow_frame_record != nullptr) {
    RootCallbackVisitor visitor_to_callback(visitor, thread_id);
    ReferenceMapVisitor<RootCallbackVisitor, kPrecise> mapper(this, nullptr, visitor_to_callback);
    for (StackedShadowFrameRecord* record = tlsPtr_.stacked_shadow_frame_record;
         record != nullptr;
         record = record->GetLink()) {
      for (ShadowFrame* shadow_frame = record->GetShadowFrame();
           shadow_frame != nullptr;
           shadow_frame = shadow_frame->GetLink()) {
        // Java Frame    
        mapper.VisitShadowFrame(shadow_frame);
      }
    }
  }
  for (DeoptimizationContextRecord* record = tlsPtr_.deoptimization_context_stack;
       record != nullptr;
       record = record->GetLink()) {
    if (record->IsReference()) {
      // kRootThreadObject
      visitor->VisitRootIfNonNull(record->GetReturnValueAsGCRoot(),
                                  RootInfo(kRootThreadObject, thread_id));
    }
    // kRootThreadObject
    visitor->VisitRootIfNonNull(record->GetPendingExceptionAsGCRoot(),
                                RootInfo(kRootThreadObject, thread_id));
  }
  if (tlsPtr_.frame_id_to_shadow_frame != nullptr) {
    RootCallbackVisitor visitor_to_callback(visitor, thread_id);
    ReferenceMapVisitor<RootCallbackVisitor, kPrecise> mapper(this, nullptr, visitor_to_callback);
    for (FrameIdToShadowFrame* record = tlsPtr_.frame_id_to_shadow_frame;
         record != nullptr;
         record = record->GetNext()) {
      mapper.VisitShadowFrame(record->GetShadowFrame());
    }
  }
  for (auto* verifier = tlsPtr_.method_verifier; verifier != nullptr; verifier = verifier->link_) {
    verifier->VisitRoots(visitor, RootInfo(kRootNativeStack, thread_id));
  }
  // Visit roots on this thread's stack
  RuntimeContextType context;
  RootCallbackVisitor visitor_to_callback(visitor, thread_id);
  ReferenceMapVisitor<RootCallbackVisitor, kPrecise> mapper(this, &context, visitor_to_callback);
  mapper.template WalkStack<StackVisitor::CountTransitions::kNo>(false);
  for (auto& entry : *GetInstrumentationStack()) {
    visitor->VisitRootIfNonNull(&entry.second.this_object_, RootInfo(kRootVMInternal, thread_id));
  }
}
NonThread Root

Roots in this part are mostly kRootVMInternal.

void Runtime::VisitNonThreadRoots(RootVisitor* visitor) {
  // JavaVMExt globals_
  java_vm_->VisitRoots(visitor);
  sentinel_.VisitRootIfNonNull(visitor, RootInfo(kRootVMInternal));
  pre_allocated_OutOfMemoryError_when_throwing_exception_
      .VisitRootIfNonNull(visitor, RootInfo(kRootVMInternal));
  pre_allocated_OutOfMemoryError_when_throwing_oome_
      .VisitRootIfNonNull(visitor, RootInfo(kRootVMInternal));
  pre_allocated_OutOfMemoryError_when_handling_stack_overflow_
      .VisitRootIfNonNull(visitor, RootInfo(kRootVMInternal));
  pre_allocated_NoClassDefFoundError_.VisitRootIfNonNull(visitor, RootInfo(kRootVMInternal));
  VisitImageRoots(visitor);
  verifier::ClassVerifier::VisitStaticRoots(visitor);
  VisitTransactionRoots(visitor);
}
ConcurrentRoot
void Runtime::VisitConcurrentRoots(RootVisitor* visitor, VisitRootFlags flags) {
  // kRootInternedString
  intern_table_->VisitRoots(visitor, flags);
  // kRootVMInternal
  // kRootStickyClass
  class_linker_->VisitRoots(visitor, flags);
  // kRootVMInternal
  jni_id_manager_->VisitRoots(visitor);
  // kRootDebugger
  heap_->VisitAllocationRecords(visitor);
  if ((flags & kVisitRootFlagNewRoots) == 0) {
    // Guaranteed to have no new roots in the constant roots.
    // kRootVMInternal
    VisitConstantRoots(visitor);
  }
}
ImageRoot

The ImageRoot enum

enum ImageRoot {
    kDexCaches,
    kClassRoots,
    kSpecialRoots,                    // Different for boot image and app image, see aliases below.
    kImageRootsMax,

    // Aliases.
    kAppImageClassLoader = kSpecialRoots,   // The class loader used to build the app image.
    kBootImageLiveObjects = kSpecialRoots,  // Array of boot image objects that must be kept live.
};
void Runtime::VisitImageRoots(RootVisitor* visitor) {
  for (auto* space : GetHeap()->GetContinuousSpaces()) {
    if (space->IsImageSpace()) {
      auto* image_space = space->AsImageSpace();
      const auto& image_header = image_space->GetImageHeader();
      for (int32_t i = 0, size = image_header.GetImageRoots()->GetLength(); i != size; ++i) {
        mirror::Object* obj =
            image_header.GetImageRoot(static_cast<ImageHeader::ImageRoot>(i)).Ptr();
        if (obj != nullptr) {
          mirror::Object* after_obj = obj;
          // kRootStickyClass
          visitor->VisitRoot(&after_obj, RootInfo(kRootStickyClass));
          CHECK_EQ(after_obj, obj);
        }
      }
    }
  }
}

boot ImageSpace

// boot ImageSpace: the classes loaded from these images can be stripped from the dump
6fa92000-6fd21000 rw-p 00000000 00:00 0                                  [anon:dalvik-/apex/com.android.art/javalib/boot.art]
6fd21000-6fd7c000 rw-p 00000000 00:00 0                                  [anon:dalvik-/apex/com.android.art/javalib/boot-core-libart.art]
6fd7c000-6fe48000 rw-p 00000000 00:00 0                                  [anon:dalvik-/apex/com.android.art/javalib/boot-core-icu4j.art]
6fe48000-6fe7f000 rw-p 00000000 00:00 0                                  [anon:dalvik-/apex/com.android.art/javalib/boot-okhttp.art]
6fe7f000-6fec3000 rw-p 00000000 00:00 0                                  [anon:dalvik-/apex/com.android.art/javalib/boot-bouncycastle.art]
6fec3000-6fed2000 rw-p 00000000 00:00 0                                  [anon:dalvik-/apex/com.android.art/javalib/boot-apache-xml.art]

app ImageSpace

// dex -> dex2oat
// The app image produced alongside the oat file is loaded as the app ImageSpace
std::unique_ptr<ImageSpace> ImageSpace::CreateFromAppImage(const char* image,
                                                           const OatFile* oat_file,
                                                           std::string* error_msg) {
  // Note: The oat file has already been validated.
  const std::vector<ImageSpace*>& boot_image_spaces =
      Runtime::Current()->GetHeap()->GetBootImageSpaces();
  return CreateFromAppImage(image,
                            oat_file,
                            ArrayRef<ImageSpace* const>(boot_image_spaces),
                            error_msg);
}
MarkRootObject
// Always called when marking objects, but only does
// something when ctx->gc_scan_state_ is non-zero, which is usually
// only true when marking the root set or unreachable
// objects.  Used to add rootset references to obj.
// Marks an object as a GC root
void Hprof::MarkRootObject(const mirror::Object* obj, jobject jni_obj, HprofHeapTag heap_tag,
                           uint32_t thread_serial) {
  if (heap_tag == 0) {
    return;
  }

  CheckHeapSegmentConstraints();

  switch (heap_tag) {
    // ID: object ID
    case HPROF_ROOT_UNKNOWN:
    case HPROF_ROOT_STICKY_CLASS:
    case HPROF_ROOT_MONITOR_USED:
    case HPROF_ROOT_INTERNED_STRING:
    case HPROF_ROOT_DEBUGGER:
    case HPROF_ROOT_VM_INTERNAL: {
      uint64_t key = (static_cast<uint64_t>(heap_tag) << 32) | PointerToLowMemUInt32(obj);
      if (simple_roots_.insert(key).second) {
        __ AddU1(heap_tag);
        __ AddObjectId(obj);
      }
      break;
    }

      // ID: object ID
      // ID: JNI global ref ID
    case HPROF_ROOT_JNI_GLOBAL:
      __ AddU1(heap_tag);
      __ AddObjectId(obj);
      __ AddJniGlobalRefId(jni_obj);
      break;

      // ID: object ID
      // U4: thread serial number
      // U4: frame number in stack trace (-1 for empty)
    case HPROF_ROOT_JNI_LOCAL:
    case HPROF_ROOT_JNI_MONITOR:
    case HPROF_ROOT_JAVA_FRAME:
      __ AddU1(heap_tag);
      __ AddObjectId(obj);
      __ AddU4(thread_serial);
      __ AddU4((uint32_t)-1);
      break;

      // ID: object ID
      // U4: thread serial number
    case HPROF_ROOT_NATIVE_STACK:
    case HPROF_ROOT_THREAD_BLOCK:
      __ AddU1(heap_tag);
      __ AddObjectId(obj);
      __ AddU4(thread_serial);
      break;

      // ID: thread object ID
      // U4: thread serial number
      // U4: stack trace serial number
    case HPROF_ROOT_THREAD_OBJECT:
      __ AddU1(heap_tag);
      __ AddObjectId(obj);
      __ AddU4(thread_serial);
      __ AddU4((uint32_t)-1);    // xxx
      break;

    case HPROF_CLASS_DUMP:
    case HPROF_INSTANCE_DUMP:
    case HPROF_OBJECT_ARRAY_DUMP:
    case HPROF_PRIMITIVE_ARRAY_DUMP:
    case HPROF_HEAP_DUMP_INFO:
    case HPROF_PRIMITIVE_ARRAY_NODATA_DUMP:
      // Ignored.
      break;

    case HPROF_ROOT_FINALIZING:
    case HPROF_ROOT_REFERENCE_CLEANUP:
    case HPROF_UNREACHABLE:
      LOG(FATAL) << "obsolete tag " << static_cast<int>(heap_tag);
      UNREACHABLE();
  }
}
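The first case group in MarkRootObject deduplicates "simple" roots by packing the tag and the 32-bit object address into one 64-bit key. The idea can be sketched on its own as follows. `SimpleRootDeduper` is a hypothetical stand-in for the `simple_roots_` bookkeeping, and `std::unordered_set` is an assumption made for this sketch.

```cpp
#include <cstdint>
#include <unordered_set>

// Hypothetical stand-in for Hprof's simple_roots_ set.
class SimpleRootDeduper {
 public:
  // Returns true the first time a (tag, object id) pair is seen, false afterwards.
  bool ShouldEmit(uint8_t heap_tag, uint32_t object_id) {
    // Tag in the high 32 bits, object id in the low 32 bits, as in MarkRootObject.
    const uint64_t key = (static_cast<uint64_t>(heap_tag) << 32) | object_id;
    return seen_.insert(key).second;
  }

  // Mirrors simple_roots_.clear() at the start of ProcessBody.
  void Clear() { seen_.clear(); }

 private:
  std::unordered_set<uint64_t> seen_;
};
```

Because the tag is part of the key, the same object can still be reported once per root kind, for example once as a sticky class and once as a VM-internal root.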

With the GC roots marked, the next question is where the mirror::Object instances come from.

mirror::Object
// The visitor argument here is the dump_object lambda from ProcessBody
template <typename Visitor>
inline void Heap::VisitObjectsPaused(Visitor&& visitor) {
  Thread* self = Thread::Current();
  Locks::mutator_lock_->AssertExclusiveHeld(self);
  // Objects in the RegionSpace
  VisitObjectsInternalRegionSpace(visitor);
  // Objects in the other spaces
  VisitObjectsInternal(visitor);
}
Region Space
// Visit objects in the region spaces.
template <typename Visitor>
inline void Heap::VisitObjectsInternalRegionSpace(Visitor&& visitor) {
  Thread* self = Thread::Current();
  Locks::mutator_lock_->AssertExclusiveHeld(self);
  if (region_space_ != nullptr) {
    DCHECK(IsGcConcurrentAndMoving());
    // Walk the objects in the region space.
    // Interested readers can look into how this walk is implemented.
    region_space_->Walk(visitor);
  }
}

What is RegionSpace?

Other Spaces
// Visit objects in the other spaces.
template <typename Visitor>
inline void Heap::VisitObjectsInternal(Visitor&& visitor) {
  if (bump_pointer_space_ != nullptr) {
    // Visit objects in bump pointer space.
    // Walk the objects in the bump pointer space.
    // And what is BumpPointerSpace?
    bump_pointer_space_->Walk(visitor);
  }
  // TODO: Switch to standard begin and end to use ranged a based loop.
  // Which objects does allocation_stack_ manage?
  for (auto* it = allocation_stack_->Begin(), *end = allocation_stack_->End(); it < end; ++it) {
    mirror::Object* const obj = it->AsMirrorPtr();

    mirror::Class* kls = nullptr;
    if (obj != nullptr && (kls = obj->GetClass()) != nullptr) {
      // Visit each mirror::Object
      visitor(obj);
    }
  }
  {
    ReaderMutexLock mu(Thread::Current(), *Locks::heap_bitmap_lock_);
    // Which objects does live_bitmap_ manage?
    GetLiveBitmap()->Visit<Visitor>(visitor);
  }
}

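What is BumpPointerSpace?
Which objects does allocation_stack_ manage?
Which objects does Heap::GetLiveBitmap() manage?
Set these questions aside for now; we will come back to them. The key point is who the visitor is. Remember dump_object in ProcessBody?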
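The visitor plumbing can be illustrated in isolation: a templated walk accepts any callable, and the caller passes a lambda such as dump_object. The sketch below mimics the allocation stack loop, including its skipping of null entries and of objects whose class is not set yet. `FakeClass`, `FakeObject`, and `VisitAllocationStack` are hypothetical stand-ins, not ART types.

```cpp
#include <vector>

// Hypothetical stand-ins for mirror::Class and mirror::Object.
struct FakeClass {};
struct FakeObject {
  const FakeClass* klass;  // nullptr models an object that is not fully initialized
};

// Calls `visitor` for every entry that is non-null and already has a class,
// mirroring the checks in the allocation stack loop of Heap::VisitObjectsInternal.
template <typename Visitor>
void VisitAllocationStack(const std::vector<FakeObject*>& stack, Visitor&& visitor) {
  for (FakeObject* obj : stack) {
    if (obj != nullptr && obj->klass != nullptr) {
      visitor(obj);
    }
  }
}
```

Since the visitor is a template parameter, the lambda is invoked directly at each call site, with no virtual dispatch involved.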

DumpHeapObject
void Hprof::DumpHeapObject(mirror::Object* obj) {
  // Other code omitted ...
  gc::Heap* const heap = Runtime::Current()->GetHeap();
  const gc::space::ContinuousSpace* const space = heap->FindContinuousSpaceFromObject(obj, true);
  // Defaults to the app heap (HPROF_HEAP_APP)
  HprofHeapId heap_type = HPROF_HEAP_APP;
  if (space != nullptr) {
    if (space->IsZygoteSpace()) {
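      // Objects managed by the ZygoteSpace.
      // Objects here are inherited from the zygote and are effectively never reclaimed,
      // so this space can be stripped to reduce the size of the heap snapshot.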
      heap_type = HPROF_HEAP_ZYGOTE;
      VisitRoot(obj, RootInfo(kRootVMInternal));
    } else if (space->IsImageSpace() && heap->ObjectIsInBootImageSpace(obj)) {
      // Only count objects in the boot image as HPROF_HEAP_IMAGE, 
      // this leaves app image objects as HPROF_HEAP_APP. b/35762934
      // The boot ImageSpace can be stripped.
      // Like the ZygoteSpace, its objects are never collected,
      // so this space can be stripped to reduce the size of the heap snapshot.
      heap_type = HPROF_HEAP_IMAGE;
      VisitRoot(obj, RootInfo(kRootVMInternal));
    }
  } else {
    const auto* los = heap->GetLargeObjectsSpace();
    if (los->Contains(obj) && los->IsZygoteLargeObject(Thread::Current(), obj)) {
      // Large object belonging to the zygote;
      // it can likewise be stripped to reduce the size of the heap snapshot.
      heap_type = HPROF_HEAP_ZYGOTE;
      VisitRoot(obj, RootInfo(kRootVMInternal));
    }
  }
  // Start a new heap dump segment if the current one has grown too large
  CheckHeapSegmentConstraints();

  if (heap_type != current_heap_) {
    HprofStringId nameId;
    __ AddU1(HPROF_HEAP_DUMP_INFO);
    __ AddU4(static_cast<uint32_t>(heap_type));   // uint32_t: heap type
    switch (heap_type) {
    case HPROF_HEAP_APP:
      nameId = LookupStringId("app");
      break;
    case HPROF_HEAP_ZYGOTE:
      nameId = LookupStringId("zygote");
      break;
    case HPROF_HEAP_IMAGE:
      nameId = LookupStringId("image");
      break;
    default:
      // Internal error
      LOG(ERROR) << "Unexpected desiredHeap";
      nameId = LookupStringId("<ILLEGAL>");
      break;
    }
    __ AddStringId(nameId);
    // app, zygote, image
    // Of these, zygote and image can be stripped.
    // Kuaishou's open-source KOOM strips these two parts from its snapshots to shrink the hprof file.
    current_heap_ = heap_type;
  }

  mirror::Class* c = obj->GetClass();
  if (c == nullptr) {
    // This object will bother HprofReader, because it has a null
    // class, so just don't dump it. It could be
    // gDvm.unlinkedJavaLangClass or it could be an object just
    // allocated which hasn't been initialized yet.
  } else {
    if (obj->IsClass()) {
      DumpHeapClass(obj->AsClass().Ptr());
    } else if (c->IsArrayClass()) {
      DumpHeapArray(obj->AsArray().Ptr(), c);
    } else {
      DumpHeapInstanceObject(obj, c, visitor.GetRoots());
    }
  }

  ++objects_in_segment_;
}
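The heap bookkeeping in DumpHeapObject boils down to two steps: classify each object into a heap, and emit a HEAP_DUMP_INFO sub-record only when the heap differs from that of the previous object. A simplified sketch follows. `HeapId`, `FakeObjectInfo`, `ClassifyHeap`, and `CountHeapInfoRecords` are hypothetical, and the two boolean flags stand in for the real space lookups.

```cpp
#include <cstddef>
#include <vector>

// Simplified counterpart of HprofHeapId; kDefault models HPROF_HEAP_DEFAULT.
enum class HeapId { kDefault, kZygote, kApp, kImage };

// Hypothetical summary of where an object lives.
struct FakeObjectInfo {
  bool in_zygote;      // zygote space, or a zygote large object
  bool in_boot_image;  // boot image space
};

// Everything that is neither zygote nor boot image counts as "app",
// including app image objects.
inline HeapId ClassifyHeap(const FakeObjectInfo& o) {
  if (o.in_zygote) return HeapId::kZygote;
  if (o.in_boot_image) return HeapId::kImage;
  return HeapId::kApp;
}

// Counts the HEAP_DUMP_INFO sub-records needed when dumping objects in this order.
inline size_t CountHeapInfoRecords(const std::vector<FakeObjectInfo>& objects) {
  HeapId current = HeapId::kDefault;
  size_t info_records = 0;
  for (const FakeObjectInfo& o : objects) {
    const HeapId heap = ClassifyHeap(o);
    if (heap != current) {
      ++info_records;  // the real code writes the tag, the heap type and a name string ID
      current = heap;
    }
  }
  return info_records;
}
```

A stripping tool could use the same classification to skip the bodies of zygote and image objects while keeping app objects intact. Whether KOOM does exactly this is not shown here.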
DumpHeapClass

void Hprof::DumpHeapClass(mirror::Class* klass) {
  if (!klass->IsResolved()) {
    // Class is allocated but not yet resolved: we cannot access its fields or super class.
    return;
  }

  // Note: We will emit instance fields of Class as synthetic static fields with a prefix of
  //       "$class$" so the class fields are visible in hprof dumps. For tools to account for that
  //       correctly, we'll emit an instance size of zero for java.lang.Class, and also emit the
  //       instance fields of java.lang.Object.
  //
  //       For other overhead (currently only the embedded vtable), we will generate a synthetic
  //       byte array (or field[s] in case the overhead size is of reference size or less).

  const size_t num_static_fields = klass->NumStaticFields();

  // Total class size:
  //   * class instance fields (including Object instance fields)
  //   * vtable
  //   * class static fields
  const size_t total_class_size = klass->GetClassSize();

  // Base class size (common parts of all Class instances):
  //   * class instance fields (including Object instance fields)
  constexpr size_t base_class_size = sizeof(mirror::Class);
  CHECK_LE(base_class_size, total_class_size);

  // Difference of Total and Base:
  //   * vtable
  //   * class static fields
  const size_t base_overhead_size = total_class_size - base_class_size;

  // Tools (ahat/Studio) will count the static fields and account for them in the class size. We
  // must thus subtract them from base_overhead_size or they will be double-counted.
  size_t class_static_fields_size = 0;
  for (ArtField& class_static_field : klass->GetSFields()) {
    size_t size = 0;
    SignatureToBasicTypeAndSize(class_static_field.GetTypeDescriptor(), &size);
    class_static_fields_size += size;
  }

  CHECK_GE(base_overhead_size, class_static_fields_size);
  // Now we have:
  //   * vtable
  const size_t base_no_statics_overhead_size = base_overhead_size - class_static_fields_size;

  // We may decide to display native overhead (the actual IMT, ArtFields and ArtMethods) in the
  // future.
  const size_t java_heap_overhead_size = base_no_statics_overhead_size;

  // For overhead greater 4, we'll allocate a synthetic array.
  if (java_heap_overhead_size > 4) {
    // Create a byte array to reflect the allocation of the
    // StaticField array at the end of this class.
    __ AddU1(HPROF_PRIMITIVE_ARRAY_DUMP);
    __ AddClassStaticsId(klass);
    __ AddStackTraceSerialNumber(LookupStackTraceSerialNumber(klass));
    __ AddU4(java_heap_overhead_size - 4);
    __ AddU1(hprof_basic_byte);
    for (size_t i = 0; i < java_heap_overhead_size - 4; ++i) {
      __ AddU1(0);
    }
  }
  const size_t java_heap_overhead_field_count = java_heap_overhead_size > 0
                                                    ? (java_heap_overhead_size == 3 ? 2u : 1u)
                                                    : 0;

  __ AddU1(HPROF_CLASS_DUMP);
  __ AddClassId(LookupClassId(klass));
  __ AddStackTraceSerialNumber(LookupStackTraceSerialNumber(klass));
  __ AddClassId(LookupClassId(klass->GetSuperClass().Ptr()));
  __ AddObjectId(klass->GetClassLoader().Ptr());
  __ AddObjectId(nullptr);    // no signer
  __ AddObjectId(nullptr);    // no prot domain
  __ AddObjectId(nullptr);    // reserved
  __ AddObjectId(nullptr);    // reserved
  // Instance size.
  if (klass->IsClassClass()) {
    // As mentioned above, we will emit instance fields as synthetic static fields. So the
    // base object is "empty."
    __ AddU4(0);
  } else if (klass->IsStringClass()) {
    // Strings are variable length with character data at the end like arrays.
    // This outputs the size of an empty string.
    __ AddU4(sizeof(mirror::String));
  } else if (klass->IsArrayClass() || klass->IsPrimitive()) {
    __ AddU4(0);
  } else {
    __ AddU4(klass->GetObjectSize());  // instance size
  }

  __ AddU2(0);  // empty const pool

  // Static fields
  //
  // Note: we report Class' and Object's instance fields here, too. This is for visibility reasons.
  //       (b/38167721)
  mirror::Class* class_class = klass->GetClass();

  DCHECK(class_class->GetSuperClass()->IsObjectClass());
  const size_t static_fields_reported = class_class->NumInstanceFields()
                                        + class_class->GetSuperClass()->NumInstanceFields()
                                        + java_heap_overhead_field_count
                                        + num_static_fields;
  __ AddU2(dchecked_integral_cast<uint16_t>(static_fields_reported));

  if (java_heap_overhead_size != 0) {
    __ AddStringId(LookupStringId(kClassOverheadName));
    size_t overhead_fields = 0;
    if (java_heap_overhead_size > 4) {
      __ AddU1(hprof_basic_object);
      __ AddClassStaticsId(klass);
      ++overhead_fields;
    } else {
      switch (java_heap_overhead_size) {
        case 4: {
          __ AddU1(hprof_basic_int);
          __ AddU4(0);
          ++overhead_fields;
          break;
        }

        case 2: {
          __ AddU1(hprof_basic_short);
          __ AddU2(0);
          ++overhead_fields;
          break;
        }

        case 3: {
          __ AddU1(hprof_basic_short);
          __ AddU2(0);
          __ AddStringId(LookupStringId(std::string(kClassOverheadName) + "2"));
          ++overhead_fields;
        }
        FALLTHROUGH_INTENDED;

        case 1: {
          __ AddU1(hprof_basic_byte);
          __ AddU1(0);
          ++overhead_fields;
          break;
        }
      }
    }
    DCHECK_EQ(java_heap_overhead_field_count, overhead_fields);
  }

  // Helper lambda to emit the given static field. The second argument name_fn will be called to
  // generate the name to emit. This can be used to emit something else than the field's actual
  // name.
  auto static_field_writer = [&](ArtField& field, auto name_fn)
      REQUIRES_SHARED(Locks::mutator_lock_) {
    __ AddStringId(LookupStringId(name_fn(field)));

    size_t size;
    HprofBasicType t = SignatureToBasicTypeAndSize(field.GetTypeDescriptor(), &size);
    __ AddU1(t);
    switch (t) {
      case hprof_basic_byte:
        __ AddU1(field.GetByte(klass));
        return;
      case hprof_basic_boolean:
        __ AddU1(field.GetBoolean(klass));
        return;
      case hprof_basic_char:
        __ AddU2(field.GetChar(klass));
        return;
      case hprof_basic_short:
        __ AddU2(field.GetShort(klass));
        return;
      case hprof_basic_float:
      case hprof_basic_int:
      case hprof_basic_object:
        __ AddU4(field.Get32(klass));
        return;
      case hprof_basic_double:
      case hprof_basic_long:
        __ AddU8(field.Get64(klass));
        return;
    }
    LOG(FATAL) << "Unexpected size " << size;
    UNREACHABLE();
  };

  {
    auto class_instance_field_name_fn = [](ArtField& field) REQUIRES_SHARED(Locks::mutator_lock_) {
      return std::string("$class$") + field.GetName();
    };
    for (ArtField& class_instance_field : class_class->GetIFields()) {
      static_field_writer(class_instance_field, class_instance_field_name_fn);
    }
    for (ArtField& object_instance_field : class_class->GetSuperClass()->GetIFields()) {
      static_field_writer(object_instance_field, class_instance_field_name_fn);
    }
  }

  {
    auto class_static_field_name_fn = [](ArtField& field) REQUIRES_SHARED(Locks::mutator_lock_) {
      return field.GetName();
    };
    for (ArtField& class_static_field : klass->GetSFields()) {
      static_field_writer(class_static_field, class_static_field_name_fn);
    }
  }

  // Instance fields for this class (no superclass fields)
  int iFieldCount = klass->NumInstanceFields();
  // add_internal_runtime_objects is only for classes that may retain objects live through means
  // other than fields. It is never the case for strings.
  const bool add_internal_runtime_objects = AddRuntimeInternalObjectsField(klass);
  if (klass->IsStringClass() || add_internal_runtime_objects) {
    __ AddU2((uint16_t)iFieldCount + 1);
  } else {
    __ AddU2((uint16_t)iFieldCount);
  }
  for (int i = 0; i < iFieldCount; ++i) {
    ArtField* f = klass->GetInstanceField(i);
    __ AddStringId(LookupStringId(f->GetName()));
    HprofBasicType t = SignatureToBasicTypeAndSize(f->GetTypeDescriptor(), nullptr);
    __ AddU1(t);
  }
  // Add native value character array for strings / byte array for compressed strings.
  if (klass->IsStringClass()) {
    __ AddStringId(LookupStringId("value"));
    __ AddU1(hprof_basic_object);
  } else if (add_internal_runtime_objects) {
    __ AddStringId(LookupStringId("runtimeInternalObjects"));
    __ AddU1(hprof_basic_object);
  }
}
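The overhead arithmetic at the top of DumpHeapClass is easier to follow with numbers plugged in. The sketch below reproduces only that arithmetic. `ClassOverhead` and `ComputeClassOverhead` are hypothetical names, and the sizes are passed in rather than read from a real class.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result of the overhead computation.
struct ClassOverhead {
  size_t overhead_bytes;         // what remains after removing base size and statics
  size_t synthetic_array_bytes;  // length of the synthetic byte array, 0 if none
  size_t overhead_field_count;   // number of synthetic static fields emitted
};

inline ClassOverhead ComputeClassOverhead(size_t total_class_size,
                                          size_t base_class_size,
                                          size_t static_fields_size) {
  assert(base_class_size <= total_class_size);
  const size_t base_overhead = total_class_size - base_class_size;
  assert(static_fields_size <= base_overhead);
  const size_t overhead = base_overhead - static_fields_size;

  ClassOverhead result;
  result.overhead_bytes = overhead;
  // Above 4 bytes, a byte array of (overhead - 4) bytes is emitted and
  // referenced through a single object-typed field.
  result.synthetic_array_bytes = overhead > 4 ? overhead - 4 : 0;
  // 3 bytes need two fields (short + byte); any other non-zero size needs one.
  result.overhead_field_count = overhead > 0 ? (overhead == 3 ? 2u : 1u) : 0u;
  return result;
}
```

For instance, with a total size of 200, a base size of 120 and 16 bytes of statics, the overhead is 64 bytes, represented as a 60-byte array plus one 4-byte reference field.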

DumpHeapArray
void Hprof::DumpHeapArray(mirror::Array* obj, mirror::Class* klass) {
  uint32_t length = obj->GetLength();

  if (obj->IsObjectArray()) {
    // obj is an object array.
    __ AddU1(HPROF_OBJECT_ARRAY_DUMP);

    __ AddObjectId(obj);
    __ AddStackTraceSerialNumber(LookupStackTraceSerialNumber(obj));
    __ AddU4(length);
    __ AddClassId(LookupClassId(klass));

    // Dump the elements, which are always objects or null.
    __ AddIdList(obj->AsObjectArray<mirror::Object>().Ptr());
  } else {
    size_t size;
    HprofBasicType t = SignatureToBasicTypeAndSize(
        Primitive::Descriptor(klass->GetComponentType()->GetPrimitiveType()), &size);

    // obj is a primitive array.
    __ AddU1(HPROF_PRIMITIVE_ARRAY_DUMP);

    __ AddObjectId(obj);
    __ AddStackTraceSerialNumber(LookupStackTraceSerialNumber(obj));
    __ AddU4(length);
    __ AddU1(t);

    // Dump the raw, packed element values.
    if (size == 1) {
      __ AddU1List(reinterpret_cast<const uint8_t*>(obj->GetRawData(sizeof(uint8_t), 0)), length);
    } else if (size == 2) {
      __ AddU2List(reinterpret_cast<const uint16_t*>(obj->GetRawData(sizeof(uint16_t), 0)), length);
    } else if (size == 4) {
      __ AddU4List(reinterpret_cast<const uint32_t*>(obj->GetRawData(sizeof(uint32_t), 0)), length);
    } else if (size == 8) {
      __ AddU8List(reinterpret_cast<const uint64_t*>(obj->GetRawData(sizeof(uint64_t), 0)), length);
    }
  }
}
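DumpHeapArray relies on SignatureToBasicTypeAndSize to pick the element type and width. A self-contained approximation of that mapping, together with the resulting size of a primitive array sub-record, is sketched below. The enum values follow the hprof basic type codes. `BasicTypeFromDescriptor` and `PrimitiveArrayRecordSize` are hypothetical helpers, and references are assumed to be 4 bytes wide, as in the fixed header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Basic type codes as defined by the hprof format.
enum HprofBasicType : uint8_t {
  kBasicObject = 2, kBasicBoolean = 4, kBasicChar = 5, kBasicFloat = 6,
  kBasicDouble = 7, kBasicByte = 8, kBasicShort = 9, kBasicInt = 10, kBasicLong = 11,
};

// Maps the first character of a type descriptor to a basic type and its size in bytes.
inline HprofBasicType BasicTypeFromDescriptor(char c, size_t* size) {
  switch (c) {
    case '[':
    case 'L': *size = 4; return kBasicObject;  // 4-byte heap references assumed
    case 'Z': *size = 1; return kBasicBoolean;
    case 'C': *size = 2; return kBasicChar;
    case 'F': *size = 4; return kBasicFloat;
    case 'D': *size = 8; return kBasicDouble;
    case 'B': *size = 1; return kBasicByte;
    case 'S': *size = 2; return kBasicShort;
    case 'I': *size = 4; return kBasicInt;
    case 'J': *size = 8; return kBasicLong;
    default:
      assert(false && "unexpected descriptor");
      *size = 0;
      return kBasicObject;
  }
}

// Bytes following the one-byte PRIMITIVE_ARRAY_DUMP tag:
// object ID + u4 stack trace serial + u4 length + u1 element type + packed elements.
inline size_t PrimitiveArrayRecordSize(size_t id_size, size_t length, size_t elem_size) {
  return id_size + 4 + 4 + 1 + length * elem_size;
}
```

Under these assumptions, an `int[3]` occupies 25 bytes after its tag, which is the amount a parser would skip to reach the next sub-record.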
DumpHeapInstanceObject
void Hprof::DumpHeapInstanceObject(mirror::Object* obj,
                                   mirror::Class* klass,
                                   const std::set<mirror::Object*>& fake_roots) {
  // obj is an instance object.
  __ AddU1(HPROF_INSTANCE_DUMP);
  __ AddObjectId(obj);
  __ AddStackTraceSerialNumber(LookupStackTraceSerialNumber(obj));
  __ AddClassId(LookupClassId(klass));

  // Reserve some space for the length of the instance data, which we won't
  // know until we're done writing it.
  size_t size_patch_offset = output_->Length();
  __ AddU4(0x77777777);

  // What we will use for the string value if the object is a string.
  mirror::Object* string_value = nullptr;
  mirror::Object* fake_object_array = nullptr;

  // Write the instance data;  fields for this class, followed by super class fields, and so on.
  do {
    const size_t instance_fields = klass->NumInstanceFields();
    for (size_t i = 0; i < instance_fields; ++i) {
      ArtField* f = klass->GetInstanceField(i);
      size_t size;
      HprofBasicType t = SignatureToBasicTypeAndSize(f->GetTypeDescriptor(), &size);
      switch (t) {
      case hprof_basic_byte:
        __ AddU1(f->GetByte(obj));
        break;
      case hprof_basic_boolean:
        __ AddU1(f->GetBoolean(obj));
        break;
      case hprof_basic_char:
        __ AddU2(f->GetChar(obj));
        break;
      case hprof_basic_short:
        __ AddU2(f->GetShort(obj));
        break;
      case hprof_basic_int:
        if (mirror::kUseStringCompression &&
            klass->IsStringClass() &&
            f->GetOffset().SizeValue() == mirror::String::CountOffset().SizeValue()) {
          // Store the string length instead of the raw count field with compression flag.
          __ AddU4(obj->AsString()->GetLength());
          break;
        }
        FALLTHROUGH_INTENDED;
      case hprof_basic_float:
      case hprof_basic_object:
        __ AddU4(f->Get32(obj));
        break;
      case hprof_basic_double:
      case hprof_basic_long:
        __ AddU8(f->Get64(obj));
        break;
      }
    }
    // Add value field for String if necessary.
    if (klass->IsStringClass()) {
      ObjPtr<mirror::String> s = obj->AsString();
      if (s->GetLength() == 0) {
        // If string is empty, use an object-aligned address within the string for the value.
        string_value = reinterpret_cast<mirror::Object*>(
            reinterpret_cast<uintptr_t>(s.Ptr()) + kObjectAlignment);
      } else {
        if (s->IsCompressed()) {
          string_value = reinterpret_cast<mirror::Object*>(s->GetValueCompressed());
        } else {
          string_value = reinterpret_cast<mirror::Object*>(s->GetValue());
        }
      }
      __ AddObjectId(string_value);
    } else if (AddRuntimeInternalObjectsField(klass)) {
      // We need an id that is guaranteed to not be used, use 1/2 of the object alignment.
      fake_object_array = reinterpret_cast<mirror::Object*>(
          reinterpret_cast<uintptr_t>(obj) + kObjectAlignment / 2);
      __ AddObjectId(fake_object_array);
    }
    klass = klass->GetSuperClass().Ptr();
  } while (klass != nullptr);

  // Patch the instance field length.
  __ UpdateU4(size_patch_offset, output_->Length() - (size_patch_offset + 4));

  // Output native value character array for strings.
  CHECK_EQ(obj->IsString(), string_value != nullptr);
  if (string_value != nullptr) {
    ObjPtr<mirror::String> s = obj->AsString();
    __ AddU1(HPROF_PRIMITIVE_ARRAY_DUMP);
    __ AddObjectId(string_value);
    __ AddStackTraceSerialNumber(LookupStackTraceSerialNumber(obj));
    __ AddU4(s->GetLength());
    if (s->IsCompressed()) {
      __ AddU1(hprof_basic_byte);
      __ AddU1List(s->GetValueCompressed(), s->GetLength());
    } else {
      __ AddU1(hprof_basic_char);
      __ AddU2List(s->GetValue(), s->GetLength());
    }
  } else if (fake_object_array != nullptr) {
    DumpFakeObjectArray(fake_object_array, fake_roots);
  }
}
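
The `0x77777777` placeholder followed by `UpdateU4` is a reserve-then-patch pattern: the length slot is written before the payload, and its real value is filled in once the payload size is known. A minimal sketch of the idea, assuming a hypothetical `RecordBuffer` class that stands in for ART's output buffer (only `Length`, `AddU4`, and `UpdateU4` mirror names seen in the listing above; the rest is invented for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ART's output buffer; only what the sketch needs.
class RecordBuffer {
 public:
  size_t Length() const { return bytes_.size(); }
  void AddU1(uint8_t v) { bytes_.push_back(v); }
  void AddU4(uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8) {
      bytes_.push_back(static_cast<uint8_t>(v >> shift));
    }
  }
  // Overwrites four already-written bytes at `offset` with big-endian `v`.
  void UpdateU4(size_t offset, uint32_t v) {
    assert(offset + 4 <= bytes_.size());
    for (int i = 0; i < 4; ++i) {
      bytes_[offset + i] = static_cast<uint8_t>(v >> (24 - 8 * i));
    }
  }
  uint32_t ReadU4(size_t offset) const {
    assert(offset + 4 <= bytes_.size());
    uint32_t v = 0;
    for (int i = 0; i < 4; ++i) v = (v << 8) | bytes_[offset + i];
    return v;
  }

 private:
  std::vector<uint8_t> bytes_;
};

// Writes a length-prefixed payload whose size is not known up front.
// Returns the offset of the length slot.
size_t WriteLengthPrefixed(RecordBuffer& out,
                           const std::vector<uint32_t>& int_fields,
                           bool has_byte_field) {
  const size_t size_patch_offset = out.Length();
  out.AddU4(0x77777777);  // placeholder, patched below
  for (uint32_t f : int_fields) out.AddU4(f);
  if (has_byte_field) out.AddU1(0x01);
  // Payload size = bytes written after the 4-byte length slot.
  out.UpdateU4(size_patch_offset,
               static_cast<uint32_t>(out.Length() - (size_patch_offset + 4)));
  return size_patch_offset;
}
```

With two int fields and one byte field the patched length is 9. The placeholder value itself is arbitrary; a recognizable pattern such as `0x77777777` simply makes an unpatched slot easy to spot in a hex dump.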

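Summary

android_hprof_format.png

The black box in the middle of the figure is the complete content of an Hprof file. It consists of the following parts:
String:  the value and ID of every string
LoadClass:  the name and class ID of every loaded class
HeapSegment:  during a heap dump, objects are written to the file batch by batch, and each batch corresponds to one Segment. A Segment in turn contains several kinds of records:
  • GcRoot:  root objects, that is, the entry points the garbage collector scans from; there are many kinds
  • Class:  the Class object of a class, including the class metadata and static fields
  • Instance:  an instance object, including the values of its fields
  • ObjectArray:  an object array, including the array length and the elements (stored as object IDs)
  • PrimitiveArray:  a primitive array, including the array length, the element type, and the value of each element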
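
The outer structure listed above can be walked without understanding any record body. The sketch below is a minimal reader, not the parser of any particular tool. It assumes the standard hprof layout: a NUL-terminated magic string, a u4 identifier size, a u8 timestamp, then records of the form tag(u1), time(u4), length(u4), body. The function name `CountTopLevelRecords` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Reads a big-endian u4; the caller guarantees pos + 4 <= data.size().
uint32_t ReadU4(const std::vector<uint8_t>& data, size_t pos) {
  return (static_cast<uint32_t>(data[pos]) << 24) |
         (static_cast<uint32_t>(data[pos + 1]) << 16) |
         (static_cast<uint32_t>(data[pos + 2]) << 8) |
         static_cast<uint32_t>(data[pos + 3]);
}

// Counts top-level records by tag (for example 0x01 for a string record and
// 0x1C for a heap dump segment). Returns an empty map if the input is
// truncated or malformed.
std::map<uint8_t, int> CountTopLevelRecords(const std::vector<uint8_t>& data) {
  std::map<uint8_t, int> counts;
  // Skip the NUL-terminated magic string, e.g. "JAVA PROFILE 1.0.3".
  size_t pos = 0;
  while (pos < data.size() && data[pos] != 0) ++pos;
  if (pos == data.size()) return {};
  pos += 1;      // the NUL byte
  pos += 4 + 8;  // identifier size + timestamp
  if (pos > data.size()) return {};
  while (pos < data.size()) {
    if (data.size() - pos < 9) return {};  // tag + time + length
    const uint8_t tag = data[pos];
    const uint32_t length = ReadU4(data, pos + 5);
    pos += 9;
    if (data.size() - pos < length) return {};
    ++counts[tag];
    pos += length;  // skip the body without interpreting it
  }
  return counts;
}
```

A full analyzer would then descend into each heap dump segment and decode the GcRoot, Class, Instance, ObjectArray, and PrimitiveArray sub-records, whose sizes depend on the identifier size read from the header.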

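With the source code alongside, the hprof format should now be clear.
Earlier we came across RegionSpace, BumpPointerSpace, Heap::GetLiveBitmap(), and so on. What are they? Next, let's explore how Android manages its Spaces.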

Space management

GetLiveBitmap()

// hprof::VisitObject
template <typename Visitor>
inline void HeapBitmap::Visit(Visitor&& visitor) {
  // continuous_space_bitmaps_
  for (const auto& bitmap : continuous_space_bitmaps_) {
    bitmap->VisitMarkedRange(bitmap->HeapBegin(), bitmap->HeapLimit(), visitor);
  }
  // large_object_bitmaps_
  for (const auto& bitmap : large_object_bitmaps_) {
    bitmap->VisitMarkedRange(bitmap->HeapBegin(), bitmap->HeapLimit(), visitor);
  }
}
// Call sites found by searching for AddSpace
art/runtime/gc/heap.cc(12 occurrences)
494: AddSpace(space.release());
607: AddSpace(non_moving_space_);
618: AddSpace(region_space_); // RegionSpace
626: AddSpace(bump_pointer_space_); // BumpPointerSpace
630: AddSpace(temp_space_);
635: AddSpace(main_space_); // DlMallocSpace / RosAllocSpace
650: AddSpace(main_space_backup_.get());
668: AddSpace(large_object_space_); // LargeObjectSpace
2123: AddSpace(to_space);
2416: AddSpace(main_space_);
2464: AddSpace(zygote_space_); // ZygoteSpace
2466: AddSpace(non_moving_space_);
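
HeapBitmap::Visit walks one bitmap per space registered through AddSpace and reports every marked object. The idea behind such a bitmap can be sketched as one bit per aligned slot of the address range it covers. The class below is a simplified model, not ART's SpaceBitmap: the name `SimpleLiveBitmap`, the 8-byte alignment, and the slot-by-slot scan are assumptions made for illustration (a production implementation would scan a word at a time).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model: one bit per aligned slot in [heap_begin, heap_limit).
class SimpleLiveBitmap {
 public:
  static constexpr uintptr_t kAlignment = 8;  // assumed object alignment

  SimpleLiveBitmap(uintptr_t heap_begin, size_t heap_capacity)
      : heap_begin_(heap_begin),
        heap_limit_(heap_begin + heap_capacity),
        words_((heap_capacity / kAlignment + 63) / 64, 0) {}

  uintptr_t HeapBegin() const { return heap_begin_; }
  uintptr_t HeapLimit() const { return heap_limit_; }

  // Marks the object starting at `addr` as live.
  void Set(uintptr_t addr) {
    assert(addr >= heap_begin_ && addr < heap_limit_);
    assert(addr % kAlignment == 0);
    const size_t index = (addr - heap_begin_) / kAlignment;
    words_[index / 64] |= uint64_t{1} << (index % 64);
  }

  // Calls visitor(addr) for every marked slot in [begin, end), in address order.
  template <typename Visitor>
  void VisitMarkedRange(uintptr_t begin, uintptr_t end, Visitor&& visitor) const {
    for (uintptr_t addr = begin; addr < end; addr += kAlignment) {
      const size_t index = (addr - heap_begin_) / kAlignment;
      if ((words_[index / 64] >> (index % 64)) & 1) {
        visitor(addr);
      }
    }
  }

 private:
  uintptr_t heap_begin_;
  uintptr_t heap_limit_;
  std::vector<uint64_t> words_;
};
```

A heap dump then amounts to calling `VisitMarkedRange(HeapBegin(), HeapLimit(), visitor)` on each bitmap with a visitor that writes one record per object, which is the shape of the loop shown in HeapBitmap::Visit.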

Virtual memory on a device

// 1. adb shell
// 2. su
// 3. top | grep packageName
// 4. cat /proc/pid/maps
// Virtual memory size: 384MB
12c00000-2ac00000 rw-p 00000000 00:00 0                                  [anon:dalvik-main space (region space)]
// zygote / non moving space 
// Virtual memory size: 64MB
71de5000-720fa000 rw-p 00000000 00:00 0                                  [anon:dalvik-zygote space]
720fa000-720fb000 rw-p 00000000 00:00 0                                  [anon:dalvik-non moving space]
720fb000-720fc000 rw-p 00000000 00:00 0                                  [anon:dalvik-non moving space]
720fc000-755e6000 ---p 00000000 00:00 0                                  [anon:dalvik-non moving space]
755e6000-75de5000 rw-p 00000000 00:00 0                                  [anon:dalvik-non moving space]
// free list large object space (objects larger than 12KB)
// Virtual memory size: 576MB
75de5000-99de5000 rw-p 00000000 00:00 0                                  [anon:dalvik-free list large object space]
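
The sizes quoted in the comments come straight from the address ranges: each maps line starts with `<begin>-<end>` in hexadecimal, and the mapping size is the difference. A small sketch of that arithmetic follows; the function name `ParseMappingSize` is hypothetical, and only the leading address range of the line is parsed.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

// Parses the leading "<hex>-<hex>" of a /proc/<pid>/maps line and stores the
// mapping size in bytes. Returns false if the line does not start with a
// valid address range.
bool ParseMappingSize(const std::string& maps_line, uint64_t* size_bytes) {
  const char* text = maps_line.c_str();
  char* end = nullptr;
  const uint64_t begin_addr = std::strtoull(text, &end, 16);
  if (end == text || *end != '-') return false;
  const char* second = end + 1;
  const uint64_t end_addr = std::strtoull(second, &end, 16);
  if (end == second || end_addr < begin_addr) return false;
  *size_bytes = end_addr - begin_addr;
  return true;
}
```

For the region space line, 0x2ac00000 - 0x12c00000 = 0x18000000 bytes, which is 384MB. The zygote and non moving space lines together span 0x71de5000 to 0x75de5000, which is 64MB, and the large object space line spans 0x24000000 bytes, which is 576MB.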

Space categories

art/runtime/gc/heap.cc

// Define space name.
static const char* kDlMallocSpaceName[2] = {"main dlmalloc space", "main dlmalloc space 1"};
static const char* kRosAllocSpaceName[2] = {"main rosalloc space", "main rosalloc space 1"};
static const char* kMemMapSpaceName[2] = {"main space", "main space 1"};
static const char* kNonMovingSpaceName = "non moving space";
static const char* kZygoteSpaceName = "zygote space";
ImageSpace
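  • boot ImageSpace: collection policy "never gc" -> can be trimmed
  • app ImageSpace
ZygoteSpace
  • Collection policy "full gc" -> can be trimmed
RegionSpace / BumpPointerSpace
  • NewTLAB: thread-local allocation buffers
  • Does not support Free for a single mirror::Object
  • Objects can only be released a whole region at a time
  • always gc
MallocSpace
  • DlMallocSpace
  • RosAllocSpace
  • always gc
LargeObjectSpace
  • Manages objects whose requested size exceeds 3 × PAGE_SIZE (12KB)
  • always gc
SemiSpace
  • When the app moves to the background, a compacting algorithm is used to reduce memory fragmentation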
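
The RegionSpace / BumpPointerSpace entries above follow from how bump-pointer allocation works: allocating only advances a pointer, so there is no bookkeeping that would allow one object to be handed back on its own. A minimal sketch of the idea, assuming a hypothetical `BumpRegion` class with 8-byte alignment (this is a model of the technique, not ART's implementation):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one bump-pointer region.
class BumpRegion {
 public:
  static constexpr size_t kAlignment = 8;  // assumed object alignment

  explicit BumpRegion(size_t capacity) : storage_(capacity), top_(0) {}

  // Rounds the request up to the alignment and advances the top pointer.
  // Returns nullptr when the region cannot fit the request.
  void* Alloc(size_t num_bytes) {
    if (num_bytes > storage_.size()) return nullptr;
    const size_t aligned = (num_bytes + kAlignment - 1) & ~(kAlignment - 1);
    if (aligned > storage_.size() - top_) return nullptr;
    void* result = storage_.data() + top_;
    top_ += aligned;
    return result;
  }

  // There is deliberately no Free(void*): memory only comes back when the
  // whole region is cleared.
  void Clear() { top_ = 0; }

  size_t BytesAllocated() const { return top_; }

 private:
  std::vector<uint8_t> storage_;
  size_t top_;
};
```

Two 10-byte allocations land 16 bytes apart because each request is rounded up to the alignment. This is also why such spaces pair naturally with a copying collector: live objects are moved out, and the emptied region is reclaimed in one step.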

References

《深入理解Android:Java虚拟机ART》

cs.android.com/android

hg.openjdk.java.net/jdk8/jdk8/j…