Go Basics: Garbage Collection (GC), Part 1 | Doubao MarsCode AI Practice

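1. What is garbage collection?

Garbage collection (GC) is a technique modern programming languages use to manage memory automatically. Its main job is to identify and free memory the program no longer uses, which prevents memory leaks and improves memory utilization. In languages without a garbage collector, developers allocate and free memory by hand, which adds complexity and makes it easy to leak memory through an oversight.

A garbage collector usually works by tracking the reference relationships between objects in the application. When no valid reference points to an object, that object is considered unreachable, or "dead", and can be reclaimed safely. Different languages and runtimes use different algorithms, such as reference counting, mark-and-sweep, and copying collection. Each has its own trade-offs, and the choice of implementation weighs factors such as performance overhead and pause time.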

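Reference counting: every object keeps a counter of how many references point to it. As soon as the counter drops to zero, the object's memory is reclaimed. The approach is simple and immediate, but it struggles with reference cycles.
Mark-and-sweep: works in two phases. The mark phase walks the whole object graph from the roots and marks every reachable object. The sweep phase then treats every unmarked object as garbage and reclaims it. This handles reference cycles well, but it can fragment memory.
Copying collection: splits the available memory into two equally sized regions and uses only one at a time. When that region fills up, the surviving objects are copied into the free region and the old half is cleared. There is no fragmentation, but half of the memory is always unused.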
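To make the mark-and-sweep description concrete, here is a minimal sketch on a toy object graph. The `object` type and the `mark`, `collect`, and `demo` functions are made up for this illustration and have nothing to do with Go's real collector. The point is only that the unreachable cycle `b <-> c`, which reference counting could not free, is reclaimed here.

```go
package main

import "fmt"

// object is a node in a toy heap; refs holds its outgoing references.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark walks the graph from obj and flags everything reachable from it.
func mark(obj *object) {
	if obj == nil || obj.marked {
		return
	}
	obj.marked = true
	for _, r := range obj.refs {
		mark(r)
	}
}

// collect runs one mark-and-sweep pass over heap, starting from roots.
// It returns the surviving objects and the names of the freed ones.
func collect(heap, roots []*object) (live []*object, freed []string) {
	// Mark phase: everything reachable from a root survives.
	for _, r := range roots {
		mark(r)
	}
	// Sweep phase: unmarked objects are garbage.
	for _, o := range heap {
		if o.marked {
			o.marked = false // reset the flag for the next cycle
			live = append(live, o)
		} else {
			freed = append(freed, o.name)
		}
	}
	return live, freed
}

// demo builds root -> a, plus an unreachable cycle b <-> c.
func demo() (liveNames, freed []string) {
	root := &object{name: "root"}
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"}
	root.refs = []*object{a}
	b.refs = []*object{c}
	c.refs = []*object{b}

	live, freed := collect([]*object{root, a, b, c}, []*object{root})
	for _, o := range live {
		liveNames = append(liveNames, o.name)
	}
	return liveNames, freed
}

func main() {
	live, freed := demo()
	fmt.Println("live: ", live)  // live:  [root a]
	fmt.Println("freed:", freed) // freed: [b c]
}
```

With reference counting, `b` and `c` would each keep the other's counter at one forever. Tracing from the roots sidesteps that problem.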

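There are also hybrid strategies that combine the strengths of the approaches above. As hardware and software engineering practice evolve, garbage collectors keep being tuned for higher efficiency and lower latency across a wide range of workloads.

In short, garbage collection (GC) is the automatic memory management mechanism a language provides. The GC frees memory objects that are no longer needed and gives the memory back, without the programmer doing anything by hand. Many modern languages support GC, and the quality of a language's GC is one of the measures used to judge the language.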

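2. When GC is triggered in Go

Now that we know what garbage collection is, an obvious question follows: we rarely think about memory reclamation while writing code, so when exactly does the program invoke the collector?

Go ships with a fairly complete garbage collector. Its triggers fall into two broad categories: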

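Manual trigger: the program calls runtime.GC() explicitly to start a collection.
Runtime trigger: the runtime starts a collection on its own when one of its built-in conditions is met, which keeps the program healthy without any action from the developer.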
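As a quick illustration of the manual path, the sketch below calls `runtime.GC()` and uses the `NumGC` field of `runtime.MemStats` to count how many cycles completed during the call. The helper names `completedCycles` and `forceGC` are invented for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// completedCycles returns how many GC cycles the runtime has finished so far.
func completedCycles() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

// forceGC triggers a collection by hand and reports how many cycles
// completed while the call was in progress (at least one).
func forceGC() uint32 {
	before := completedCycles()
	runtime.GC() // blocks until a full cycle has completed
	return completedCycles() - before
}

func main() {
	fmt.Println("cycles completed during runtime.GC():", forceGC())
}
```

The count can be larger than one if a runtime-triggered cycle was already in flight when `runtime.GC()` was called.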

3. Runtime-triggered GC

For runtime-triggered GC there are roughly three kinds of trigger. The corresponding source is in `runtime/mgc.go`:

type gcTriggerKind int

const (
	// gcTriggerHeap indicates that a cycle should be started when
	// the heap size reaches the trigger heap size computed by the
	// controller.
	gcTriggerHeap gcTriggerKind = iota

	// gcTriggerTime indicates that a cycle should be started when
	// it's been more than forcegcperiod nanoseconds since the
	// previous GC cycle.
	gcTriggerTime

	// gcTriggerCycle indicates that a cycle should be started if
	// we have not yet started cycle number gcTrigger.n (relative
	// to work.cycles).
	gcTriggerCycle
)

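gcTriggerHeap: fires when the allocated heap size reaches the threshold (the trigger heap size computed by the controller).
gcTriggerTime: fires when more than a set amount of time has passed since the previous GC cycle. The period comes from the runtime.forcegcperiod variable, which defaults to 2 minutes.
gcTriggerCycle: starts a cycle if cycle number n has not been started yet. This is the kind used by the manually triggered runtime.GC().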
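The three conditions can be written down as a small user-level model. This is only a sketch loosely following the comments in the constants above, not the runtime's code: `gcState`, `trigger`, `sampleState`, and all the numbers are assumptions made for illustration, and the real check also looks at things this model ignores, such as whether a GC is already running.

```go
package main

import (
	"fmt"
	"time"
)

type triggerKind int

const (
	triggerHeap triggerKind = iota
	triggerTime
	triggerCycle
)

// gcState is a hypothetical snapshot of the values the checks depend on.
type gcState struct {
	heapLive    uint64        // bytes currently counted as live heap
	heapTrigger uint64        // threshold computed by the controller
	lastGC      time.Time     // when the previous cycle finished
	forcePeriod time.Duration // stand-in for forcegcperiod
	cycles      uint32        // number of cycles started so far
}

// trigger mirrors the shape of a trigger request: a kind plus its argument.
type trigger struct {
	kind triggerKind
	now  time.Time // used by triggerTime
	n    uint32    // used by triggerCycle
}

// test reports whether the trigger condition holds for state s.
func (t trigger) test(s gcState) bool {
	switch t.kind {
	case triggerHeap:
		return s.heapLive >= s.heapTrigger
	case triggerTime:
		return !s.lastGC.IsZero() && t.now.Sub(s.lastGC) > s.forcePeriod
	case triggerCycle:
		// True only if cycle n has not started yet; the signed
		// difference tolerates counter wraparound.
		return int32(t.n-s.cycles) > 0
	}
	return false
}

// sampleState returns illustrative values and a "now" three minutes
// after the last GC.
func sampleState() (gcState, time.Time) {
	last := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	s := gcState{
		heapLive:    6 << 20,
		heapTrigger: 4 << 20,
		lastGC:      last,
		forcePeriod: 2 * time.Minute,
		cycles:      7,
	}
	return s, last.Add(3 * time.Minute)
}

func main() {
	s, now := sampleState()
	fmt.Println("heap:   ", trigger{kind: triggerHeap}.test(s))          // true
	fmt.Println("time:   ", trigger{kind: triggerTime, now: now}.test(s)) // true
	fmt.Println("cycle 8:", trigger{kind: triggerCycle, n: 8}.test(s))    // true
	fmt.Println("cycle 7:", trigger{kind: triggerCycle, n: 7}.test(s))    // false
}
```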

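4. Manual triggering

Calling `runtime.GC()` starts a collection by hand. There is not much to explain about the call itself. In Go, a manual GC is usually a way to tune performance or to work around a specific memory-management problem. The scenarios below are the ones where it may be worth considering: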

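4.1 After a memory usage peak

+ After the program has gone through a period of high memory usage, a manual GC can free the memory that is no longer used and bring the footprint back down.

4.2 Long-running services

+ For long-running services (such as web servers), triggering a GC at a chosen moment (for example during off-peak hours) reclaims accumulated garbage when the extra work hurts least. It does not fix a genuine leak, because objects that are still referenced are never collected.

4.3 A change in allocation pattern

+ If the program's allocation pattern changes, for example from many small objects to a few large ones, a manual GC at the transition can reclaim what the old pattern left behind.

4.4 Debugging and performance analysis

+ During performance analysis, a manual GC lets developers compare the state before and after a collection, which helps with debugging and tuning.

4.5 Application-specific requirements

+ Some applications have strict memory requirements, and developers can choose to trigger a GC at specific moments to meet them.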

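Caveats

+ **Performance impact**: a manual GC can cause a short drop in performance, so use it with care.
+ **Do not rely on it**: Go's garbage collector is designed to be fairly efficient, and frequent manual triggering is rarely necessary.

In short, trigger a GC by hand only when the situation calls for it, typically when you have special memory-management requirements.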
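The "after a memory usage peak" scenario can be observed with `runtime.MemStats`. The sketch below allocates a large buffer, drops the only reference, calls `runtime.GC()`, and compares `HeapAlloc` before and after. The names `held`, `heapAlloc`, and `releaseAndCollect` and the 64 MiB size are arbitrary choices for this illustration. Exact numbers vary from run to run, so none are shown.

```go
package main

import (
	"fmt"
	"runtime"
)

// held keeps the buffer reachable through a package-level variable,
// so the allocation has to live on the heap.
var held []byte

// heapAlloc returns the bytes of heap objects currently allocated.
func heapAlloc() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// releaseAndCollect allocates size bytes, drops the only reference,
// forces a GC, and returns HeapAlloc measured before and after.
func releaseAndCollect(size int) (before, after uint64) {
	held = make([]byte, size)
	held[0] = 1 // touch the buffer so it is clearly in use

	before = heapAlloc() // the buffer is still reachable here

	held = nil   // the buffer is now garbage
	runtime.GC() // returns after marking and sweeping have finished
	after = heapAlloc()
	return before, after
}

func main() {
	before, after := releaseAndCollect(64 << 20)
	fmt.Printf("HeapAlloc before: %d MiB, after: %d MiB\n", before>>20, after>>20)
}
```

Note that `HeapAlloc` dropping does not mean the memory went back to the operating system right away. It only means the heap objects were freed.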

5. The GC flow in detail

Again we can turn to `runtime/mgc.go`, this time to the `GC()` function, for a quick look at the basic flow:

// GC runs a garbage collection and blocks the caller until the
// garbage collection is complete. It may also block the entire
// program.
func GC() {
	// We consider a cycle to be: sweep termination, mark, mark
	// termination, and sweep. This function shouldn't return
	// until a full cycle has been completed, from beginning to
	// end. Hence, we always want to finish up the current cycle
	// and start a new one. That means:
	//
	// 1. In sweep termination, mark, or mark termination of cycle
	// N, wait until mark termination N completes and transitions
	// to sweep N.
	//
	// 2. In sweep N, help with sweep N.
	//
	// At this point we can begin a full cycle N+1.
	//
	// 3. Trigger cycle N+1 by starting sweep termination N+1.
	//
	// 4. Wait for mark termination N+1 to complete.
	//
	// 5. Help with sweep N+1 until it's done.
	//
	// This all has to be written to deal with the fact that the
	// GC may move ahead on its own. For example, when we block
	// until mark termination N, we may wake up in cycle N+2.

	// Wait until the current sweep termination, mark, and mark
	// termination complete.
	n := work.cycles.Load()
	gcWaitOnMark(n)

	// We're now in sweep N or later. Trigger GC cycle N+1, which
	// will first finish sweep N if necessary and then enter sweep
	// termination N+1.
	gcStart(gcTrigger{kind: gcTriggerCycle, n: n + 1})

	// Wait for mark termination N+1 to complete.
	gcWaitOnMark(n + 1)

	// Finish sweep N+1 before returning. We do this both to
	// complete the cycle and because runtime.GC() is often used
	// as part of tests and benchmarks to get the system into a
	// relatively stable and isolated state.
	for work.cycles.Load() == n+1 && sweepone() != ^uintptr(0) {
		Gosched()
	}

	// Callers may assume that the heap profile reflects the
	// just-completed cycle when this returns (historically this
	// happened because this was a STW GC), but right now the
	// profile still reflects mark termination N, not N+1.
	//
	// As soon as all of the sweep frees from cycle N+1 are done,
	// we can go ahead and publish the heap profile.
	//
	// First, wait for sweeping to finish. (We know there are no
	// more spans on the sweep queue, but we may be concurrently
	// sweeping spans, so we have to wait.)
	for work.cycles.Load() == n+1 && !isSweepDone() {
		Gosched()
	}

	// Now we're really done with sweeping, so we can publish the
	// stable heap profile. Only do this if we haven't already hit
	// another mark termination.
	mp := acquirem()
	cycle := work.cycles.Load()
	if cycle == n+1 || (gcphase == _GCmark && cycle == n+2) {
		mProf_PostSweep()
	}
	releasem(mp)
}

With most of the comments stripped out, the flow is fairly easy to follow. Let's walk through it step by step.

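func GC() {
    // Read the current GC cycle number into n.
    // work.cycles is an atomic counter that tracks GC cycles.
    n := work.cycles.Load()

    // Wait until the mark phase of cycle n has finished.
    // During marking, the collector walks the object graph and marks
    // every object that is still referenced.
    gcWaitOnMark(n)

    // Trigger a new GC cycle.
    // gcTriggerCycle means "start cycle n+1 if it has not started yet";
    // n + 1 is the number of the next cycle.
    gcStart(gcTrigger{kind: gcTriggerCycle, n: n + 1})

    // Wait for the mark phase of the cycle we just triggered, n+1, to finish.
    gcWaitOnMark(n + 1)

    // Keep helping with the sweep of cycle n+1 until nothing is left.
    // sweepone() sweeps one span; it returns ^uintptr(0) when there is
    // nothing more to sweep.
    // Gosched() yields the current G (goroutine) so that other goroutines
    // can run, which keeps the program responsive during the GC.
    for work.cycles.Load() == n+1 && sweepone() != ^uintptr(0) {
        Gosched()
    }

    // Keep waiting until sweeping has definitely finished, because other
    // goroutines may still be sweeping spans concurrently.
    // isSweepDone() reports whether sweeping is complete.
    for work.cycles.Load() == n+1 && !isSweepDone() {
        Gosched()
    }

    // acquirem() returns the current M (machine, an OS thread) and
    // disables preemption of this goroutine until releasem is called.
    // It is not a mutex; it keeps the goroutine on its M while the
    // heap profile is published.
    mp := acquirem()

    // Read the cycle number again so the check below uses fresh data.
    cycle := work.cycles.Load()

    // If the cycle is still n+1 (the one we just completed), or if we
    // are in the mark phase of cycle n+2 (the GC may have moved on by
    // itself while we waited for the sweep), call mProf_PostSweep()
    // to publish the stable heap profile.
    if cycle == n+1 || (gcphase == _GCmark && cycle == n+2) {
        mProf_PostSweep()
    }
    // Release the M, which re-enables preemption.
    releasem(mp)
}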

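6. How the GC triggers are implemented

6.1 The monitor thread

When the Go runtime initializes, it starts a goroutine dedicated to GC-related work.

The code is in runtime/proc.go. Its main job is to wait for a signal and then trigger a garbage collection: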

// start forcegc helper goroutine
func init() {
	go forcegchelper()
}

func forcegchelper() {
	forcegc.g = getg()
	lockInit(&forcegc.lock, lockRankForcegc)
	for {
		lock(&forcegc.lock)
		if forcegc.idle.Load() {
			throw("forcegc: phase error")
		}
		forcegc.idle.Store(true)
		goparkunlock(&forcegc.lock, waitReasonForceGCIdle, traceBlockSystemGoroutine, 1)
		// this goroutine is explicitly resumed by sysmon
		if debug.gctrace > 0 {
			println("GC forced")
		}
		// Time-triggered, fully concurrent.
		gcStart(gcTrigger{kind: gcTriggerTime, now: nanotime()})
	}
}

goparkunlock(&forcegc.lock, waitReasonForceGCIdle, traceBlockSystemGoroutine, 1): unlocks and parks the current goroutine. The call releases forcegc.lock and puts the goroutine into the waiting state until something else wakes it up, in this case the system monitor, sysmon.

//go:nowritebarrierrec
func sysmon() {
    ...
    ...
    ...

		// check if we need to force a GC
		if t := (gcTrigger{kind: gcTriggerTime, now: now}); t.test() && forcegc.idle.Load() {
			lock(&forcegc.lock)
			forcegc.idle.Store(false)
			var list gList
			list.push(forcegc.g)
			injectglist(&list)
			unlock(&forcegc.lock)
		}
		if debug.schedtrace > 0 && lasttrace+int64(debug.schedtrace)*1000000 <= now {
			lasttrace = now
			schedtrace(debug.scheddetail > 0)
		}
		unlock(&sched.sysmonlock)
	}
}

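The core of this code is a for loop that keeps building a gcTriggerTime trigger from the current time, now, and calling its test() method to check whether enough time has passed since the last GC (2 minutes by default).

If the condition holds and the helper is idle, forcegc.g is put on the global run queue to be scheduled again, which wakes up the forcegchelper shown above.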
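The park-and-wake handshake between the helper and the monitor can be imitated in ordinary user code. The sketch below is only an analogy built with channels, a ticker, and `sync/atomic` types (Go 1.19 or later). The `forcer` type, its methods, `runDemo`, `demo`, and the millisecond periods are all invented for illustration, and the runtime itself uses `goparkunlock` and `injectglist` rather than channels.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// forcer is a hypothetical user-level stand-in for the forcegc state.
type forcer struct {
	idle   atomic.Bool   // true while the helper is parked
	wake   chan struct{} // monitor -> helper wake-up signal
	lastGC atomic.Int64  // UnixNano timestamp of the last forced GC
	forced atomic.Int32  // number of forced collections so far
}

// helper parks until the monitor wakes it, then runs a collection.
func (f *forcer) helper() {
	for {
		f.idle.Store(true)
		<-f.wake // parked here, like goparkunlock in forcegchelper
		runtime.GC()
		f.lastGC.Store(time.Now().UnixNano())
		f.forced.Add(1)
	}
}

// monitor polls on a ticker and wakes the helper once more than
// period has elapsed since the last forced GC.
func (f *forcer) monitor(poll, period time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(poll)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case now := <-ticker.C:
			overdue := now.UnixNano()-f.lastGC.Load() > int64(period)
			// Wake the helper only if it is parked, and clear the
			// idle flag so that it is woken once per period.
			if overdue && f.idle.CompareAndSwap(true, false) {
				f.wake <- struct{}{}
			}
		}
	}
}

// runDemo runs helper and monitor for total and returns the number
// of collections the helper was woken up to perform.
func runDemo(poll, period, total time.Duration) int32 {
	f := &forcer{wake: make(chan struct{}, 1)}
	f.lastGC.Store(time.Now().UnixNano())
	stop := make(chan struct{})
	go f.helper() // never exits, like the runtime's helper goroutine
	go f.monitor(poll, period, stop)
	time.Sleep(total)
	close(stop)
	return f.forced.Load()
}

// demo uses millisecond periods so the effect shows up quickly.
func demo() int32 {
	return runDemo(5*time.Millisecond, 20*time.Millisecond, 200*time.Millisecond)
}

func main() {
	fmt.Println("forced collections:", demo())
}
```

The idle flag plays the same role as `forcegc.idle`: the monitor only wakes a helper that is actually parked, and it clears the flag before waking it.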

6.2 Heap allocation

The code for heap allocation lives in `runtime/malloc.go`:

// Allocate an object of size bytes.
// Small objects are allocated from the per-P cache's free lists.
// Large objects (> 32 kB) are allocated straight from the heap.
func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
    ...
    ...
    ...
	if size <= maxSmallSize-mallocHeaderSize {
		if noscan && size < maxTinySize {

        ...
        ...
			size = uintptr(class_to_size[sizeclass])
			spc := makeSpanClass(sizeclass, noscan)
			span = c.alloc[spc]
			v := nextFreeFast(span)
			if v == 0 {
				v, span, shouldhelpgc = c.nextFree(spc)
			}

        ...
        ...
		}
	} else {
		shouldhelpgc = true
    ...
    ...
	if shouldhelpgc {
		if t := (gcTrigger{kind: gcTriggerHeap}); t.test() {
			gcStart(t)
		}
	}

    ...
    ...

	return x
}

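Small objects: when a small object is requested and the cached span has no free slot left, nextFree is called to obtain a new usable object, and this may trigger a GC.

Large objects: requesting an object larger than 32 KB may trigger a GC.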
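The effect of the allocation path is easy to observe from the outside: allocate enough garbage and the runtime starts collections by itself, with no call to `runtime.GC()`. The sketch below counts completed cycles through `MemStats.NumGC`. The names `sink`, `numGC`, and `churn` and the sizes are arbitrary choices for this illustration, and the code pins GOGC to 100 for the duration so that the heap trigger is known to be enabled.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// sink is package-level so every buffer has to be heap-allocated.
var sink []byte

// numGC returns the number of GC cycles completed so far.
func numGC() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

// churn allocates rounds short-lived buffers of size bytes each and
// returns how many GC cycles completed while doing so.
func churn(rounds, size int) uint32 {
	// Assumption for this demo: use GOGC=100 while measuring, then
	// restore whatever setting was active before.
	prev := debug.SetGCPercent(100)
	defer debug.SetGCPercent(prev)

	before := numGC()
	for i := 0; i < rounds; i++ {
		sink = make([]byte, size) // the previous buffer becomes garbage
	}
	sink = nil
	return numGC() - before
}

func main() {
	// 512 MiB of garbage in each case, with no explicit runtime.GC() call.
	fmt.Println("small objects (512 B):", churn(1<<20, 512), "GC cycles")
	fmt.Println("large objects (1 MiB):", churn(512, 1<<20), "GC cycles")
}
```

The exact counts depend on the Go version and the machine, which is why no expected output is given.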

Reference: "Go 什么时候会触发 GC？" (When does Go trigger GC?) by 煎鱼 (EDDYCJY)
