Understanding G1 Young GC Through the Source Code (2)

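Preface

The previous article first gave a brief introduction to JVM startup, then to the G1 heap components needed later, then explained how G1 chooses the collection set (CSet), and finally walked through the source code of the Young GC preparation work, which mainly consists of flushing thread-local data. This article covers the second phase of Young GC: root scanning (Roots Scan) and copying live objects (Evacuate Live Objects).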

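Remembered sets

The remembered set is a very important structure in G1; G1 relies on it to implement incremental collection. An earlier article gave a brief introduction to remembered sets based on the Java 8 source code. This section uses the latest code to look at the structure in more depth and does not repeat the parts the two versions have in common.

Overall structure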

class G1HeapRegion : public CHeapObj<mtGC> {
  // The remembered set for this region.
  G1HeapRegionRemSet* _rem_set;
};
class G1HeapRegionRemSet : public CHeapObj<mtGC> {
  // The set of cards in the Java heap
  G1CardSet* _card_set;
};
class G1CardSet : public CHeapObj<mtGCCardSet> {
  G1CardSetHashTable* _table;
};

A region's remembered set is maintained mainly by G1CardSet, which in turn holds a G1CardSetHashTable, a wrapper around the underlying ConcurrentHashTable.

class G1CardSetHashTable : public CHeapObj<mtGCCardSet> {
  CardSetHash _table;
  CHTScanTask _table_scanner;
}
using CardSetHash = ConcurrentHashTable<G1CardSetHashTableConfig, mtGCCardSet>;

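As shown, the lowest-level structure of the remembered set is a hash table; this article does not dig into its implementation. The key is the index of another region that points into the current region. The value is a pointer to the set of card ids in that other region, and the data structure behind it changes with the number of cards recorded.

In the figure, the remembered set of r1 should be { (6,[12]) } and that of r6 should be { (7,[14]) }.

Note that a card is actually 512 bytes in size.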
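
As a rough mental model of the structure described above (not the HotSpot implementation, which uses a concurrent hash table and compact containers), a region's remembered set can be pictured as a map from source-region index to the set of card indices in that source region. ToyRemSet, add, contains and remset_of_r1 are names made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical model: key = index of a region that points into this region,
// value = indices of the cards in that source region holding such pointers.
struct ToyRemSet {
  std::map<uint32_t, std::set<uint32_t>> cards_by_region;

  // Returns true if the card was newly recorded, false if already present.
  bool add(uint32_t from_region, uint32_t card_in_region) {
    return cards_by_region[from_region].insert(card_in_region).second;
  }

  bool contains(uint32_t from_region, uint32_t card_in_region) const {
    auto it = cards_by_region.find(from_region);
    return it != cards_by_region.end() && it->second.count(card_in_region) > 0;
  }
};

// Mirrors the example in the text: r1 is referenced from card 12 of region 6.
inline ToyRemSet remset_of_r1() {
  ToyRemSet rs;
  bool added = rs.add(6, 12);
  assert(added);
  (void)added;
  return rs;
}
```

Adding the same (region, card) pair twice leaves the set unchanged, which is the property the real containers also provide when they report that a card was already found.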

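Adding to the remembered set

The following uses the source code of the add operation to show how G1 maintains the remembered set.

Source object pointer

The parameter p of do_oop_work points to a field of the source object. From this pointer G1 can obtain the region and card that contain the field and, by loading through it, the region and card of the target object, which shows how much a single pointer can do.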

inline void G1ConcurrentRefineOopClosure::do_oop_work(T* p) {
  T o = RawAccess<MO_RELAXED>::oop_load(p);
  // The target object
  oop obj = CompressedOops::decode_not_null(o);
  // The remembered set of the target object's region
  G1HeapRegionRemSet* to_rem_set = _g1h->heap_region_containing(obj)->rem_set();
  // Record the source location in the target's remembered set
  if (to_rem_set->is_tracked()) {
    to_rem_set->add_reference(p, _worker_id);
  }
}

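The figure below shows a cross-region reference in the old generation: p points to a field of a source object, and that field in turn points to the target object.

Pointer conversion

The pointer is converted into a region and a card, which are added to the target object's remembered set as key and value. Note that card_within_region is the index of the card inside its region, which saves memory. For example, a card whose global number is 1024 may have the number 12 inside its region; the former needs 11 bits to store (1024 = 2^10), while the latter needs only 4 bits (12 < 2^4).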

_card_set->add_card(to_card(from));

uintptr_t G1HeapRegionRemSet::to_card(OopOrNarrowOopStar from) const {
  return pointer_delta(from, _heap_base_address, 1) >> CardTable::card_shift();
}

G1AddCardResult G1CardSet::add_card(uintptr_t card) {
  uint card_region; uint card_within_region;
  split_card(card, card_region, card_within_region); // Region and card of the source location
  return add_card(card_region, card_within_region, true /* increment_total */);
}
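
The conversion above boils down to two shifts and a mask. The following is a minimal sketch assuming 512-byte cards and 8 MB regions; the constants and the names CardLocation, to_global_card and split_global_card are assumptions for illustration, not HotSpot identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Assumed parameters for illustration: 512-byte cards, 8 MB regions.
constexpr unsigned kCardShift = 9;                                   // log2(512)
constexpr unsigned kRegionShift = 23;                                // log2(8 MB)
constexpr unsigned kLogCardsPerRegion = kRegionShift - kCardShift;   // 14
constexpr uint64_t kCardInRegionMask = (uint64_t{1} << kLogCardsPerRegion) - 1;

struct CardLocation {
  uint32_t region;
  uint32_t card_in_region;
};

// Global card index of an address: byte offset from the heap base divided
// by the card size.
inline uint64_t to_global_card(uint64_t addr, uint64_t heap_base) {
  assert(addr >= heap_base);
  return (addr - heap_base) >> kCardShift;
}

// Split a global card index into (region index, card index inside the region).
inline CardLocation split_global_card(uint64_t card) {
  return CardLocation{static_cast<uint32_t>(card >> kLogCardsPerRegion),
                      static_cast<uint32_t>(card & kCardInRegionMask)};
}
```

With these parameters, an address 12 cards into the fourth region maps to region 3 and in-region card 12, so only the small in-region number has to be stored in the container.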

ContainerPtr

using ContainerPtr = void*;

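ContainerPtr points to the container that actually stores the cards. The container type is upgraded as the number of cards grows, in this order:

ContainerInlinePtr: stored inline in the pointer itself; holds the fewest cards.
ContainerArrayOfCards: stored in an array.
ContainerBitMap: stored in a bitmap.
ContainerHowl: stored in a nested structure.
Full: nothing is stored; every card of the region is traversed when the entry is used.

The choice of container type is a trade-off between precision and memory footprint.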

ContainerInlinePtr

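This article focuses on the structure of ContainerInlinePtr; readers can consult the source for the other containers.

ContainerInlinePtr is stored in a 64-bit pointer (on a 64-bit machine). The lowest two bits are 00 and denote the container type. The next three bits hold the number of cards stored, and the remaining high bits hold the cards themselves.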

// MSB                                                 LSB
// +------+         +---------------+--------------+-----+
// |unused|   ...   |  card_index1  | card_index0  |SSS00|
// +------+         +---------------+--------------+-----+

For example, in an 8 MB region the cards are numbered from 0 to 16383 (8*1024*1024/512 = 16384 cards). Representing 16383 (2^14 - 1) takes 14 bits, so at most 4 cards fit (4*14 + 5 = 61 bits).
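
The arithmetic above can be checked with a small sketch of the bit layout. It follows the diagram (2 tag bits, 3 count bits, then the cards), but the helper names max_inline_cards, inline_num_cards, inline_card_at and inline_append are hypothetical and the code is not taken from HotSpot.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kTagBits = 2;                        // container type, 00 = inline
constexpr unsigned kSizeBits = 3;                       // number of cards stored
constexpr unsigned kHeaderBits = kTagBits + kSizeBits;  // 5
constexpr unsigned kWordBits = 64;

// How many cards of the given width fit above the 5 header bits. The 3-bit
// count field cannot express more than 7, so the result is capped there.
constexpr unsigned max_inline_cards(unsigned bits_per_card) {
  return (kWordBits - kHeaderBits) / bits_per_card < 7
             ? (kWordBits - kHeaderBits) / bits_per_card
             : 7;
}

inline unsigned inline_num_cards(uint64_t v) {
  return static_cast<unsigned>((v >> kTagBits) & ((1u << kSizeBits) - 1));
}

inline uint32_t inline_card_at(uint64_t v, unsigned idx, unsigned bits_per_card) {
  const uint64_t mask = (uint64_t{1} << bits_per_card) - 1;
  return static_cast<uint32_t>((v >> (kHeaderBits + idx * bits_per_card)) & mask);
}

// Append a card; the caller must have checked that there is room.
inline uint64_t inline_append(uint64_t v, uint32_t card, unsigned bits_per_card) {
  const unsigned n = inline_num_cards(v);
  assert(n < max_inline_cards(bits_per_card));
  assert(card < (uint64_t{1} << bits_per_card));
  const uint64_t size_mask = uint64_t{(1u << kSizeBits) - 1} << kTagBits;
  const uint64_t with_card =
      v | (uint64_t{card} << (kHeaderBits + n * bits_per_card));
  return (with_card & ~size_mask) | (uint64_t{n + 1} << kTagBits);
}
```

For 14-bit cards max_inline_cards returns 4, matching the 61-bit calculation, and the two lowest bits stay 00 after every append so the value is still recognizable as an inline container.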

Adding a card

G1AddCardResult G1CardSet::add_card(uint card_region, uint card_in_region, bool increment_total) {
    // Look up (or create) the hash table entry for the region
    G1CardSetHashTableValue* table_entry = get_or_add_container(card_region, &should_grow_table);
    while (true) {
        container = acquire_container(&table_entry->_container);
        // Add the card to the container
        add_result = add_to_container(&table_entry->_container, container, card_region, card_in_region, increment_total);

        if (add_result != Overflow) break; // No overflow, so no coarsening is needed

        // Card set has overflown. Coarsen or retry.
        // The card did not fit; upgrade to another container type
        bool coarsened = coarsen_container(&table_entry->_container, container, card_in_region);
        if (coarsened)  break;
    }
}

The add method is chosen according to the container type:

G1AddCardResult G1CardSet::add_to_container(ContainerPtr volatile* container_addr,
                                            ContainerPtr container,
                                            uint card_region,
                                            uint card_in_region) {

  G1AddCardResult add_result;
  switch (container_type(container)) {
    case ContainerInlinePtr: {
      add_result = add_to_inline_ptr(container_addr, container, card_in_region);
      break;
    }
    case ContainerArrayOfCards: {
      add_result = add_to_array(container, card_in_region);
      break;
    }
    case ContainerBitMap: {
      add_result = add_to_bitmap(container, card_in_region);
      break;
    }
    // Other cases omitted
  }
  return add_result;
}

This article looks closely at adding to a ContainerInlinePtr:

inline G1AddCardResult G1CardSetInlinePtr::add(uint card_idx, uint bits_per_card, uint max_cards_in_inline_ptr) {
  uint cur_idx = 0;
  while (true) {
    uint num_cards = num_cards_in(_value);
    // If any cards are stored, first check whether this card is already present
    if (num_cards > 0) {
      cur_idx = find(card_idx, bits_per_card, cur_idx, num_cards);
    }
    // Check if the card is already stored in the pointer.
    // Already present: nothing to add, return
    if (cur_idx < num_cards) {
      return Found;
    }
    // Check if there is actually enough space.
    // The container is full and must be upgraded
    if (num_cards >= max_cards_in_inline_ptr) {
      return Overflow;
    }
    // Add the card with a compare-and-swap
    ContainerPtr new_value = merge(_value, card_idx, num_cards, bits_per_card);
    ContainerPtr old_value = Atomic::cmpxchg(_value_addr, _value, new_value, memory_order_relaxed);
    if (_value == old_value) {
      return Added;
    }
    // Rest omitted
  }
}
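
The retry loop around the compare-and-swap is the part that makes the add safe without a lock. Below is a simplified sketch of that pattern using std::atomic in place of HotSpot's Atomic::cmpxchg. It assumes fixed 14-bit cards and the layout from the diagram, and it leaves out what the real code does after a failed swap (the omitted part above). AddResult, count_of, holds and add_card_cas are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

enum class AddResult { Added, Found, Overflow };

// Toy layout (an assumption mirroring the diagram): bits 0-1 tag,
// bits 2-4 card count, then 14-bit card indices.
constexpr unsigned kBitsPerCard = 14;
constexpr unsigned kHeader = 5;
constexpr unsigned kMaxCards = (64 - kHeader) / kBitsPerCard;  // 4

inline unsigned count_of(uint64_t v) {
  return static_cast<unsigned>((v >> 2) & 0x7);
}

inline bool holds(uint64_t v, uint32_t card) {
  for (unsigned i = 0; i < count_of(v); i++) {
    if (((v >> (kHeader + i * kBitsPerCard)) & 0x3FFF) == card) return true;
  }
  return false;
}

// Lock-free insert: build the new word from the latest observed value and
// retry whenever another thread changed the slot in between.
inline AddResult add_card_cas(std::atomic<uint64_t>& slot, uint32_t card) {
  assert(card < (1u << kBitsPerCard));
  uint64_t seen = slot.load(std::memory_order_relaxed);
  while (true) {
    if (holds(seen, card)) return AddResult::Found;
    const unsigned n = count_of(seen);
    if (n >= kMaxCards) return AddResult::Overflow;
    const uint64_t next = (seen & ~uint64_t{0x1C})        // clear old count
                        | (uint64_t{n + 1} << 2)          // store new count
                        | (uint64_t{card} << (kHeader + n * kBitsPerCard));
    // On failure, compare_exchange_weak reloads `seen` with the current value.
    if (slot.compare_exchange_weak(seen, next, std::memory_order_relaxed)) {
      return AddResult::Added;
    }
  }
}
```

Because every iteration re-checks for the card and for free space against the freshly observed value, a concurrent insert of the same card by another thread ends in Found rather than a duplicate.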

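Iterating over the remembered set

Iterating over the remembered set is very important and is used during garbage collection. do_value does the actual processing of the card set that belongs to each region. The code is only listed briefly here and is explained later in the garbage collection sections.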

void G1CardSet::iterate_containers(ContainerPtrClosure* cl, bool at_safepoint) {
  auto do_value =
    [&] (G1CardSetHashTableValue* value) {
      // Process the card set of each region
      cl->do_containerptr(value->_region_idx, value->_num_occupied, value->_container);
      return true;
    };
  _table->iterate(do_value);
}

void do_containerptr(uint card_region_idx, size_t num_occupied, G1CardSet::ContainerPtr container) override {
  CardOrRanges<Closure> cl(_cl,
                           card_region_idx >> _log_card_regions_per_region,
                           (card_region_idx & _card_regions_per_region_mask) << _log_card_region_size);
  _card_set->iterate_cards_or_ranges_in_container(container, cl);
}

inline void G1CardSet::iterate_cards_or_ranges_in_container(ContainerPtr const container, CardOrRangeVisitor& cl) {
  switch (container_type(container)) {
    case ContainerInlinePtr: {
      if (cl.start_iterate(G1GCPhaseTimes::MergeRSMergedInline)) {
        G1CardSetInlinePtr ptr(container);
        // cl wraps the logic that processes each visited card
        ptr.iterate(cl, _config->inline_ptr_bits_per_card());
      }
      return;
    }
    // Other cases omitted
  }
}

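Remembered set optimization

The earlier article, "Understanding Garbage-First (G1) Remembered Sets in Depth", mentioned that Java has consolidated the remembered sets of the G1 young generation into one; in other words, all young regions share a single remembered set. There are two reasons for this:

Cross-region references within the young generation do not need to be recorded.
A GC always collects all young regions.

This optimization greatly reduces memory consumption. For the code, see 8336086: G1: Use one G1CardSet instance for all young regions.

The main logic is:

Declare a global field _young_regions_cardset of type G1CardSet in the heap and create it when the heap is initialized.
When a new region is allocated, assign the heap's global G1CardSet to the new region.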

class G1CollectedHeap : public CollectedHeap {
    G1CardSet _young_regions_cardset;
}
// Created when the heap is initialized
_young_regions_cardset(card_set_config(), &_young_regions_cardset_mm)

void G1CollectedHeap::set_region_short_lived_locked(G1HeapRegion* hr) {
  _eden.add(hr);
  _policy->set_region_eden(hr);
  // Assign the heap's global G1CardSet to the new region.
  hr->install_group_cardset(young_regions_cardset());
}

Evacuate Collection Set

The evacuation phase consists of three steps: 1. merge heap roots, 2. root scan, 3. evacuate regions.

// Actually do the work...
void G1YoungCollector::evacuate_initial_collection_set(G1ParScanThreadStateSet* per_thread_states,bool has_optional_evacuation_work) {
    rem_set()->merge_heap_roots(true /* initial_evacuation */);
    G1RootProcessor root_processor(_g1h, num_workers);
    G1EvacuateRegionsTask g1_par_task(_g1h,per_thread_states,task_queues(),&root_processor,num_workers,has_optional_evacuation_work);
    task_time = run_task_timed(&g1_par_task);
}

merge heap roots

The first step, prepare_for_merge_heap_roots, initializes the structures that record state; the second step creates a G1MergeHeapRootsTask that wraps the work.

void G1RemSet::merge_heap_roots(bool initial_evacuation) {
    _scan_state->prepare_for_merge_heap_roots();
    // Note that the initial_evacuation argument is true
    G1MergeHeapRootsTask cl(_scan_state, num_workers, initial_evacuation);
    workers->run_task(&cl, num_workers);
}

Next comes the work method of G1MergeHeapRootsTask.

virtual void work(uint worker_id) {
    // Humongous object handling is skipped for now
    // 2. collection set
    G1MergeCardSetClosure merge(_scan_state);
    G1ClearBitmapClosure clear(g1h);
    G1CombinedClosure combined(&merge, &clear);

    if (_initial_evacuation)  G1HeapRegionRemSet::iterate_for_merge(g1h->young_regions_cardset(), merge);
    g1h->collection_set_iterate_increment_from(&combined, nullptr, worker_id);
}

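Iterating over the young-generation remembered set

g1h->young_regions_cardset() returns the single remembered set of the young generation. merge is of type G1MergeCardSetClosure, and the code that follows wraps it once more in a G1HeapRegionRemSetMergeCardClosure.

// merge is of type G1MergeCardSetClosure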
G1HeapRegionRemSet::iterate_for_merge(g1h->young_regions_cardset(), merge);
 
void G1HeapRegionRemSet::iterate_for_merge(G1CardSet* card_set, CardOrRangeVisitor& cl) {
  G1HeapRegionRemSetMergeCardClosure<CardOrRangeVisitor, G1ContainerCardsOrRanges> cl2(card_set, cl,....);
  
  card_set->iterate_containers(&cl2, true /* at_safepoint */);
}

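As mentioned earlier, iterate_containers iterates over the remembered set, so we go straight to the code that processes each element. cl is of type G1HeapRegionRemSetMergeCardClosure; its do_containerptr method wraps _cl, which is of type G1MergeCardSetClosure, once more.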

cl->do_containerptr(value->_region_idx, value->_num_occupied, value->_container);

void do_containerptr(uint card_region_idx, size_t num_occupied, G1CardSet::ContainerPtr container) override {
  // The type is G1ContainerCardsOrRanges
  CardOrRanges<Closure> cl(_cl,.....);
  _card_set->iterate_cards_or_ranges_in_container(container, cl);
}

iterate_cards_or_ranges_in_container visits all cards of one source region. The call chain ends in mark_card, which marks the corresponding card.

class G1ContainerCardsOrRanges {
  // Computes _region_base_idx
  bool start_iterate(uint tag) {
    return _cl.start_iterate(tag, _region_idx);
  }
  void operator()(uint card_idx) {
    // _cl is a G1MergeCardSetClosure
    _cl.do_card(card_idx + _offset);
  }
}

class G1MergeCardSetClosure : public G1HeapRegionClosure {
    void do_card(uint const card_idx) {
      G1CardTable::CardValue* to_prefetch = _ct->byte_for_index(_region_base_idx + card_idx);
      G1CardTable::CardValue* to_process = _merge_card_set_cache.push(to_prefetch);
      mark_card(to_process);
    }
}

void mark_card(G1CardTable::CardValue* value) {
  if (_ct->mark_clean_as_dirty(value)) {
    _scan_state->set_chunk_dirty(_ct->index_for_cardvalue(value));
  }
}
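
The net effect of the merge step is that remembered-set entries are turned into dirty marks on the card table, with a coarser summary of which chunks contain dirty cards. The sketch below illustrates that idea only; the card values, the chunk size of 64 cards, and the names ToyCardTable and merge_entry are assumptions rather than HotSpot definitions, and the prefetch cache from do_card is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint8_t kCleanCard = 0xFF;      // assumed encoding for this sketch
constexpr uint8_t kDirtyCard = 0x00;
constexpr size_t kCardsPerChunk = 64;     // assumed chunk size

// Toy card table: one byte per card, plus a per-chunk "has dirty card" flag
// so that a later scan can skip chunks without any dirty card.
struct ToyCardTable {
  std::vector<uint8_t> cards;
  std::vector<bool> chunk_dirty;

  explicit ToyCardTable(size_t num_cards)
      : cards(num_cards, kCleanCard),
        chunk_dirty((num_cards + kCardsPerChunk - 1) / kCardsPerChunk, false) {}

  // Same shape as mark_card: only a clean -> dirty transition updates the
  // chunk summary.
  bool mark(size_t card_idx) {
    assert(card_idx < cards.size());
    if (cards[card_idx] != kCleanCard) return false;
    cards[card_idx] = kDirtyCard;
    chunk_dirty[card_idx / kCardsPerChunk] = true;
    return true;
  }
};

// Merge one remembered-set entry (a source region plus cards inside it) and
// return how many cards became dirty.
inline size_t merge_entry(ToyCardTable& ct, size_t region_idx,
                          size_t cards_per_region,
                          const std::vector<uint32_t>& cards_in_region) {
  size_t newly_dirty = 0;
  const size_t region_base = region_idx * cards_per_region;
  for (uint32_t c : cards_in_region) {
    if (ct.mark(region_base + c)) newly_dirty++;
  }
  return newly_dirty;
}
```

The region base plus the in-region index plays the role of _region_base_idx + card_idx in do_card.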

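Iterating over the CSet

The iteration logic is wrapped in G1CombinedClosure, and the method called is do_heap_region. G1MergeCardSetClosure's do_heap_region adds the region to a set in _scan_state for later use and then iterates over the region's remembered set. G1ClearBitmapClosure's do_heap_region clears the region's bitmap, which is used to record objects that failed to be moved.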

bool do_heap_region(G1HeapRegion* hr) {
  // _closure1 is a G1MergeCardSetClosure, _closure2 is a G1ClearBitmapClosure
  return _closure1->do_heap_region(hr) || _closure2->do_heap_region(hr);
}

virtual bool do_heap_region(G1HeapRegion* r) { //G1MergeCardSetClosure
  _scan_state->add_all_dirty_region(r->hrm_index());
  merge_card_set_for_region(r);
  return false;
}

bool do_heap_region(G1HeapRegion* hr) { // G1ClearBitmapClosure
  if (should_clear_region(hr)) {
    _g1h->clear_bitmap_for_region(hr);
    _g1h->concurrent_mark()->reset_top_at_mark_start(hr);
  }
  return false;
}

root scan

G1ParScanThreadState

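The _closures field of G1ParScanThreadState wraps the processing logic for oops (ordinary object pointers). While objects are traversed, different logic is used depending on where the oop is located.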

 _closures = G1EvacuationRootClosures::
     create_root_closures(_g1h,this,collection_set->only_contains_young_regions());

res = new G1EvacuationClosures(g1h, pss, process_only_dirty_klasses);

class G1EvacuationClosures : public G1EvacuationRootClosures {
  G1SharedClosures<false> _closures;

  public:
    G1EvacuationClosures(G1CollectedHeap* g1h,
                       G1ParScanThreadState* pss,
                       bool in_young_gc) :
      _closures(g1h, pss, in_young_gc) {}
    OopClosure* strong_oops() { return &_closures._oops; }
    CLDClosure* weak_clds()             { return &_closures._clds; }
    CLDClosure* strong_clds()           { return &_closures._clds; }
    NMethodClosure* strong_nmethods()   { return &_closures._nmethods; }
    NMethodClosure* weak_nmethods()     { return &_closures._nmethods; }
}

template <bool should_mark>
class G1SharedClosures {
public:
  G1ParCopyClosure<G1BarrierNone, should_mark> _oops;
  G1ParCopyClosure<G1BarrierCLD,  should_mark> _oops_in_cld;
  G1ParCopyClosure<G1BarrierNoOptRoots, should_mark> _oops_in_nmethod;

  G1CLDScanClosure                _clds;
  G1NMethodClosure                _nmethods;

  G1SharedClosures(G1CollectedHeap* g1h, G1ParScanThreadState* pss, bool process_only_dirty) :
    _oops(g1h, pss),
    _oops_in_cld(g1h, pss),
    _oops_in_nmethod(g1h, pss),
    _clds(&_oops_in_cld, process_only_dirty),
    _nmethods(pss->worker_id(), &_oops_in_nmethod, should_mark) {}
}

scan root

G1RootProcessor wraps the GC root traversal logic. It is run through the work method of G1EvacuateRegionsBaseTask, the parent class of G1EvacuateRegionsTask.

void G1YoungCollector::evacuate_initial_collection_set(G1ParScanThreadStateSet* per_thread_states,......) {
    G1RootProcessor root_processor(_g1h, num_workers);
    G1EvacuateRegionsTask g1_par_task(_g1h,per_thread_states,task_queues(),
                                      &root_processor,num_workers,
                                      has_optional_evacuation_work);
    task_time = run_task_timed(&g1_par_task);
}

//G1EvacuateRegionsBaseTask
void work(uint worker_id) {
    start_work(worker_id);
    {
      ResourceMark rm;
      G1ParScanThreadState* pss = _per_thread_states->state_for_worker(worker_id);
      pss->set_ref_discoverer(_g1h->ref_processor_stw());
      scan_roots(pss, worker_id);
      evacuate_live_objects(pss, worker_id);
    }
    end_work(worker_id);
}

void scan_roots(G1ParScanThreadState* pss, uint worker_id) {
    _root_processor->evacuate_roots(pss, worker_id);
    _g1h->rem_set()->scan_heap_roots(pss, worker_id, G1GCPhaseTimes::ScanHR,G1GCPhaseTimes::ObjCopy, _has_optional_evacuation_work);
    _g1h->rem_set()->scan_collection_set_regions(pss, worker_id, G1GCPhaseTimes::ScanHR, G1GCPhaseTimes::CodeRoots, G1GCPhaseTimes::ObjCopy);
}

evacuate roots

process_java_roots handles roots at the Java level, mainly objects referenced from thread stacks. process_vm_roots handles roots at the JVM level. This article focuses on process_java_roots.

void G1RootProcessor::evacuate_roots(G1ParScanThreadState* pss, uint worker_id) {

  G1EvacuationRootClosures* closures = pss->closures();
  process_java_roots(closures, phase_times, worker_id);

  process_vm_roots(closures, phase_times, worker_id);
}

process java roots

G1 iterates over all threads and eventually calls Thread::oops_do to traverse the stack and non-stack parts.

// closures->strong_oops() is G1ParCopyClosure<G1BarrierNone, should_mark> _oops;
// closures->strong_nmethods() is G1NMethodClosure _nmethods;
Threads::possibly_parallel_oops_do(is_par, closures->strong_oops(), closures->strong_nmethods());

void Thread::oops_do(OopClosure* f, NMethodClosure* cf) {
  oops_do_no_frames(f, cf);
  oops_do_frames(f, cf);
}

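This article covers how G1 traverses the stack, taking JavaThread as the example. It first walks the stack frames and handles each according to its type; in the end every path calls the corresponding method of closures.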

void JavaThread::oops_do_frames(OopClosure* f, NMethodClosure* cf) {
  // Traverse the execution stack
  for (StackFrameStream fst(this, true /* update */, false /* process_frames */); !fst.is_done(); fst.next()) {
    fst.current()->oops_do(f, cf, fst.register_map());
  }
}

Object handling

Readers can look up many of the details themselves. Next, consider the logic of closures->strong_oops(), that is, G1ParCopyClosure<G1BarrierNone, should_mark> _oops.

class G1ParCopyClosure : public G1ParCopyHelper {
  virtual void do_oop(oop* p)       { do_oop_work(p); }
  virtual void do_oop(narrowOop* p) { do_oop_work(p); }
};

void G1ParCopyClosure<barrier, should_mark>::do_oop_work(T* p) {
   T heap_oop = RawAccess<>::oop_load(p);
   oop obj = CompressedOops::decode_not_null(heap_oop);
   const G1HeapRegionAttr state = _g1h->region_attr(obj);
   if (state.is_in_cset()) {
    oop forwardee;
    markWord m = obj->mark();
    if (m.is_forwarded()) {
      forwardee = m.forwardee();
    } else {
      forwardee = _par_scan_state->copy_to_survivor_space(state, obj, m);
    }
    
    RawAccess<IS_NOT_NULL>::oop_store(p, forwardee);
    if (barrier == G1BarrierCLD) {
      do_cld_barrier(forwardee);
    }
  }
}

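do_oop_work first uses the pointer to obtain the actual object and the region that contains it. If the region is in the collection set, the object is copied to a survivor region. The old object stays where it is until the other pointers that still refer to it have been updated.

The figure below shows references from different GC roots to the same object during a GC. gc root1 still points to the old object; the new object can be found through the forwarding pointer in the old object's mark word, and the root is then updated to point to it. gc root2 already points directly to the new object.

Question: where are the object's hash code and generational age stored at this point? In the new object.

Object copying

First get the size of the object, then allocate space, then copy the old object into the new space, and finally make the old object point to the new one.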

// copy_to_survivor_space -> do_copy_to_survivor_space
oop G1ParScanThreadState::do_copy_to_survivor_space(G1HeapRegionAttr const region_attr, oop const old, markWord const old_mark) {
  Klass* klass = old->klass();
  const size_t word_sz = old->size_given_klass(klass);
  // Other code omitted....
  HeapWord* obj_ptr = _plab_allocator->plab_allocate(dest_attr, word_sz, node_index);

  // PLAB allocations should succeed most of the time, so we'll
  // normally check against null once and that's it.
  if (obj_ptr == nullptr)  obj_ptr = allocate_copy_slow(&dest_attr, old, word_sz, age, node_index);
  Copy::aligned_disjoint_words(cast_from_oop<HeapWord*>(old), obj_ptr, word_sz);
  const oop obj = cast_to_oop(obj_ptr);
  const oop forward_ptr = old->forward_to_atomic(obj, old_mark, memory_order_relaxed);
}
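
The copy-then-forward protocol can be illustrated with a toy object whose header word doubles as the forwarding pointer. The encoding here (low bit set means forwarded, age stored in the bits above it) is invented for the sketch and is not HotSpot's mark word layout. ToyObj and evacuate are hypothetical names, and the memory orderings are chosen conservatively rather than copied from the source.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Toy object: the header holds either (age << 1) or, once the object has
// been moved, the address of the copy with the low tag bit set.
struct ToyObj {
  std::atomic<uintptr_t> header;
  int payload;
  ToyObj(uintptr_t h, int p) : header(h), payload(p) {}
};

constexpr uintptr_t kForwardedTag = 1;

inline bool is_forwarded(uintptr_t h) { return (h & kForwardedTag) != 0; }

inline ToyObj* forwardee(uintptr_t h) {
  return reinterpret_cast<ToyObj*>(h & ~kForwardedTag);
}

// Copy old_obj into to_space and try to publish the copy. If another worker
// installed a forwarding pointer first, return that worker's copy instead.
inline ToyObj* evacuate(ToyObj* old_obj, ToyObj* to_space) {
  uintptr_t h = old_obj->header.load(std::memory_order_acquire);
  if (is_forwarded(h)) return forwardee(h);

  // The header (and with it the age) travels with the copy.
  to_space->header.store(h, std::memory_order_relaxed);
  to_space->payload = old_obj->payload;

  const uintptr_t fwd = reinterpret_cast<uintptr_t>(to_space) | kForwardedTag;
  if (old_obj->header.compare_exchange_strong(h, fwd,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire)) {
    return to_space;
  }
  // Lost the race: h now holds the winner's forwarding word.
  assert(is_forwarded(h));
  return forwardee(h);
}
```

Calling evacuate twice on the same object with different destination slots returns the first copy both times, which is how several roots that point to one old object end up agreeing on a single new location.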

Promotion-Local Allocation Buffers (PLABs) are an optimization for object promotion.

next_region_attr chooses the destination region type according to the object's age.

G1HeapRegionAttr G1ParScanThreadState::next_region_attr(G1HeapRegionAttr const region_attr, markWord const m, uint& age) {
  if (region_attr.is_young()) {
    age = !m.has_displaced_mark_helper() ? m.age()
                                         : m.displaced_mark_helper().age();
    if (age < _tenuring_threshold) return region_attr;
  }
  // young-to-old (promotion) or old-to-old; destination is old in both cases.
  return G1HeapRegionAttr::Old;
}

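The parameter p of G1ScanEvacuatedObjClosure::do_oop_work is a field of obj. If the object that p refers to is in the collection set, p is pushed onto the processing queue. The logic so far is: an object directly reachable from a GC root is copied to a survivor region if it is in the collection set, and the objects its fields point to are added to the pending queue.

// _scanner wraps the logic that updates the card table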
obj->oop_iterate_backwards(&_scanner, klass);

inline void G1ScanEvacuatedObjClosure::do_oop_work(T* p) {
  T heap_oop = RawAccess<>::oop_load(p);

  if (CompressedOops::is_null(heap_oop)) {
    return;
  }
  oop obj = CompressedOops::decode_not_null(heap_oop);
  const G1HeapRegionAttr region_attr = _g1h->region_attr(obj);
  if (region_attr.is_in_cset()) {
      // Push onto the processing queue
    prefetch_and_push(p, obj);
  } else if (!G1HeapRegion::is_in_same_region(p, obj)) {
    handle_non_cset_obj_common(region_attr, p, obj);
    assert(_skip_card_enqueue != Uninitialized, "Scan location has not been initialized.");
    if (_skip_card_enqueue == True) {
      return;
    }
    _par_scan_state->enqueue_card_if_tracked(region_attr, p, obj);
  }
}

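The code below increments the object's age and accumulates the total size of objects of each age. Recall the rule everyone memorizes for interviews: when an object's age exceeds a threshold (15 by default), or the total size of objects of the same age exceeds half of the survivor space, objects at or above that age are promoted to the old generation.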

if (dest_attr.is_young()) {
  if (age < markWord::max_age) {
    age++;
    obj->incr_age();
  }
  _age_table.add(age, word_sz);
}
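
To make the role of the age table concrete, here is a small sketch of how a tenuring threshold could be derived from per-age sizes: walk the ages in order, keep a running total, and stop at the first age where the total exceeds the desired survivor size. It is modeled on the general idea only; compute_threshold and its parameters are hypothetical, and the rule HotSpot applies may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// sizes_by_age[a] = total words of surviving objects whose age is a
// (index 0 is unused). Returns the age at which the running total first
// exceeds desired_survivor_words, capped at max_threshold.
inline unsigned compute_threshold(const std::vector<size_t>& sizes_by_age,
                                  size_t desired_survivor_words,
                                  unsigned max_threshold) {
  assert(!sizes_by_age.empty());
  size_t total = 0;
  unsigned age = 1;
  while (age < sizes_by_age.size()) {
    total += sizes_by_age[age];
    if (total > desired_survivor_words) break;
    age++;
  }
  return age < max_threshold ? age : max_threshold;
}
```

With sizes of 100, 200 and 300 words for ages 1 to 3 and a desired survivor size of 250 words, the running total passes the limit at age 2, so objects reaching that age would be promoted in the next collection.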

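handle_evacuation_failure_par handles copy failures; it is discussed later when it comes into play.

scan heap root

Dirty regions were added while the remembered set was being traversed, and they now need to be scanned. The scanning logic is wrapped in G1ScanHRForRegionClosure.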

G1ScanHRForRegionClosure cl(_scan_state, pss, worker_id, scan_phase, remember_already_scanned_cards);
  _scan_state->iterate_dirty_regions_from(&cl, worker_id);
  
void add_dirty_region(uint const region) {
    _next_dirty_regions->add_dirty_region(region);
}

G1 iterates over all dirty cards in a region. G1ScanCardClosure wraps the per-card logic.

//do_heap_region->scan_heap_roots->do_claimed_block->scan_memregion
HeapWord* scan_memregion(uint region_idx_for_card, MemRegion mr) {
    G1HeapRegion* const card_region = _g1h->region_at(region_idx_for_card);
    G1ScanCardClosure card_cl(_g1h, _pss, _heap_roots_found);

    HeapWord* const scanned_to = card_region->oops_on_memregion_seq_iterate_careful<true>(mr, &card_cl);
    return scanned_to;
}

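When a dirty card range is scanned, the parsable and unparsable parts must be distinguished. Objects in the unparsable part are not laid out contiguously, so a bitmap is needed to find them. Objects in the parsable part are laid out contiguously and can be walked directly by object size. oop_iterate visits all fields of an object.

// oops_on_memregion_seq_iterate_careful -> oops_on_memregion_iterate
// Walk the part where objects are not contiguous
cur = oops_on_memregion_iterate_in_unparsable<Closure>(mr_in_unparsable, cur, cl);

// Walk the part where objects are contiguous
while (true) {
    oop obj = cast_to_oop(cur);
    cur += obj->size();
    obj->oop_iterate(cl);
    // Other code omitted....
}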
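
Walking contiguous objects by size, as in the loop above, can be sketched on a toy heap in which the first word of every object holds its size in words. That layout is an assumption made for the example (real objects derive their size from class metadata), and walk_objects is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap: an array of words; each object starts with a header word that
// holds the object's total size in words, header included.
// Visits every object that starts inside [from, to) and returns the first
// position at or past `to`.
template <typename Visitor>
size_t walk_objects(const std::vector<size_t>& heap, size_t from, size_t to,
                    Visitor visit) {
  assert(to <= heap.size());
  size_t cur = from;
  while (cur < to) {
    const size_t size = heap[cur];
    assert(size > 0);  // a zero size would never advance
    visit(cur, size);
    cur += size;
  }
  return cur;
}
```

This only works when `from` is the start of an object and there are no gaps, which is exactly why the unparsable part needs a bitmap to locate object starts instead.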

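Note: the objects above belong to regions recorded in the remembered set. They lie in dirty cards, and the objects their fields point to may be in the collection set.

G1ScanCardClosure handles the objects that those fields point to. do_oop_work pushes every scanned reference whose target is in the CSet onto the processing queue, to be handled in the later steps.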

inline void G1ScanCardClosure::do_oop_work(T* p) {
  T o = RawAccess<>::oop_load(p);
  oop obj = CompressedOops::decode_not_null(o);

  const G1HeapRegionAttr region_attr = _g1h->region_attr(obj);
  if (region_attr.is_in_cset()) {
    prefetch_and_push(p, obj);
    _heap_roots_found++;
  }
}

_par_scan_state->push_on_queue(ScannerTask(p));
// ||
//  V
_task_queue->push(task);

scan collection set regions

The logic for iterating over the CSet is wrapped in G1ScanCollectionSetRegionClosure; go straight to its do_heap_region method. This code mainly handles objects associated with JIT-compiled methods and ends up calling G1ParCopyClosure<barrier, should_mark>::do_oop_work, the same as for Java roots, so it is not repeated here.

void G1RemSet::scan_collection_set_regions(......){
    G1ScanCollectionSetRegionClosure cl(_scan_state, pss, worker_id, scan_phase, coderoots_phase);
    _g1h->collection_set_iterate_increment_from(&cl, worker_id);
}

bool do_heap_region(G1HeapRegion* r) {
    // Rest omitted
    { // Scan code root remembered sets.
      G1ScanAndCountNMethodClosure cl(_pss->closures()->weak_nmethods());
      // Scan the code root list attached to the current region
      r->code_roots_do(&cl);
    }
}

evacuate live objects

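The scan root step has already pushed the objects indirectly reachable from GC roots onto the processing queues. What remains is to process the queued entries in a loop, pushing the objects they reference in turn, until the task queues are empty.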

void evacuate_live_objects(G1ParScanThreadState* pss, uint worker_id) {
    G1EvacuateRegionsBaseTask::evacuate_live_objects(pss, worker_id, ....);
}

{
    G1ParEvacuateFollowersClosure cl(_g1h, pss, _task_queues, &_terminator, objcopy_phase);
    cl.do_void();
}

void do_void() {
    G1ParScanThreadState* const pss = par_scan_state();
    pss->trim_queue();
    do {
      pss->steal_and_trim_queue(queues());
    } while (!offer_termination());
}

void G1ParScanThreadState::steal_and_trim_queue(G1ScannerTasksQueueSet* task_queues) {
  ScannerTask stolen_task;
  while (task_queues->steal(_worker_id, stolen_task)) {
    dispatch_task(stolen_task);
    // Processing stolen task may have added tasks to our queue.
    trim_queue();
  }
}

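steal_and_trim_queue is the core of the processing: it steals a task from another worker's queue and then processes it. Every worker thread has its own task queue and handles its local queue first. Object handling is the same as described earlier, so the code is listed without further comment.

void G1ParScanThreadState::do_oop_evac(T* p) {
  oop obj = RawAccess<IS_NOT_NULL>::oop_load(p);
  const G1HeapRegionAttr region_attr = _g1h->region_attr(obj);
  if (!region_attr.is_in_cset()) {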
    return;
  }
  markWord m = obj->mark();
  if (m.is_forwarded()) {
    obj = m.forwardee();
  } else {
    obj = do_copy_to_survivor_space(region_attr, obj, m);
  }
  RawAccess<IS_NOT_NULL>::oop_store(p, obj);
  write_ref_field_post(p, obj);
}
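
The drain-then-steal loop described in this section can be sketched as a single-threaded simulation. Each worker owns a queue, works from the back of its own queue, and takes from the front of another worker's queue when its own is empty. A task is a node id in a small object graph, and processing a node pushes its unvisited successors, which stands in for pushing the fields of a freshly copied object. ToyWorkers and its members are hypothetical, and the termination protocol (offer_termination) is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct ToyWorkers {
  std::vector<std::deque<int>> queues;         // one queue per worker
  const std::vector<std::vector<int>>& edges;  // object graph: node -> successors
  std::vector<bool> visited;
  std::vector<int> processed_by;               // which worker handled each node

  ToyWorkers(size_t workers, const std::vector<std::vector<int>>& g)
      : queues(workers), edges(g), visited(g.size(), false),
        processed_by(g.size(), -1) {}

  void push(size_t worker, int node) {
    if (!visited[node]) {
      visited[node] = true;
      queues[worker].push_back(node);
    }
  }

  void process(size_t worker, int node) {
    processed_by[node] = static_cast<int>(worker);
    for (int next : edges[node]) push(worker, next);
  }

  // Owner side: take from the back of the local queue until it is empty.
  void trim_queue(size_t worker) {
    while (!queues[worker].empty()) {
      const int node = queues[worker].back();
      queues[worker].pop_back();
      process(worker, node);
    }
  }

  // Thief side: take from the front of some other worker's queue.
  bool steal(size_t thief, int& out) {
    for (size_t v = 0; v < queues.size(); v++) {
      if (v != thief && !queues[v].empty()) {
        out = queues[v].front();
        queues[v].pop_front();
        return true;
      }
    }
    return false;
  }

  void run(size_t worker) {
    trim_queue(worker);
    int stolen = -1;
    while (steal(worker, stolen)) {
      process(worker, stolen);
      trim_queue(worker);  // the stolen task may have refilled our queue
    }
  }
};
```

If worker 1 finishes its own work first, it steals worker 0's root and processes everything reachable from it, so the total work gets done regardless of how the roots were initially distributed.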

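Summary

This article started with the remembered set, covering its structure, iteration, and the optimization in the latest code. It then went through the scanning of the different kinds of GC roots, including Java roots, VM roots, RSet roots (the remembered set), and nmethod roots (JIT-compiled code), and finally described how G1 processes the task queues. G1's worker threads run in parallel throughout the whole process.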