Go map source code analysis

Basic structure

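The bucket array is created according to hint, and each element of the array is a bucket (see the figure below for the basic layout). If B < 4, the array holds $2^B$ buckets. Otherwise it holds $2^B + 2^{B-4}$ buckets, where the extra $2^{B-4}$ are preallocated overflow buckets, which resolve collisions by being chained to a regular bucket as a linked list.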
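
The sizing rule can be sketched as follows. This is a minimal illustration, not the runtime code: bucketArrayLen is a hypothetical helper, it assumes the extra overflow buckets are added once B reaches 4, and it ignores the rounding up to a memory size class that the runtime also performs.

```go
package main

import "fmt"

// bucketArrayLen is a hypothetical helper that returns how many buckets are
// allocated up front for a given B: 1<<B regular buckets, plus 1<<(B-4)
// preallocated overflow buckets once B >= 4.
// Assumption: size-class rounding done by the real allocator is ignored.
func bucketArrayLen(b uint8) uintptr {
	n := uintptr(1) << b
	if b >= 4 {
		n += uintptr(1) << (b - 4)
	}
	return n
}

func main() {
	for _, b := range []uint8{0, 3, 4, 5, 10} {
		fmt.Printf("B=%d -> %d buckets\n", b, bucketArrayLen(b))
	}
}
```

For example, B = 3 gives 8 buckets, while B = 4 gives 16 + 1 = 17.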

Growth

Growth conditions

Too many elements

const (
	// Maximum average load of a bucket that triggers growth is 6.5.
	// Represent as loadFactorNum/loadFactorDen, to allow integer math.
	loadFactorNum = 13
	loadFactorDen = 2
)

func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}

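If the average number of elements per bucket exceeds 6.5 (and the map holds more than bucketCnt elements in total), the map is considered to hold too many elements and must grow to reduce collisions. In this case the growth is a doubling growth: the bucket array doubles in size.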
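
To make the threshold concrete, here is a small standalone sketch of the same check. bucketShift is simplified to a plain shift and bucketCnt is assumed to be 8 (the number of slots per bucket); neither definition is copied from the runtime.

```go
package main

import "fmt"

const (
	bucketCnt     = 8  // assumed number of slots per bucket
	loadFactorNum = 13 // 13/2 = 6.5
	loadFactorDen = 2
)

// bucketShift returns 1<<b. Simplified stand-in for the runtime helper.
func bucketShift(b uint8) uintptr { return uintptr(1) << b }

// overLoadFactor reports whether count elements in 1<<B buckets exceed
// an average of 6.5 elements per bucket.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}

func main() {
	// With B = 3 there are 8 buckets, so the threshold is 13*(8/2) = 52.
	fmt.Println(overLoadFactor(52, 3)) // false: exactly at the threshold
	fmt.Println(overLoadFactor(53, 3)) // true: one element over
	fmt.Println(overLoadFactor(8, 0))  // false: a single bucket is not yet full
}
```

Integer math is the reason for the numerator/denominator form: 6.5 * 8 is computed as 13 * (8 / 2) without any floating point.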

Too many overflow buckets

// tooManyOverflowBuckets reports whether noverflow buckets is too many for a map with 1<<B buckets.
// Note that most of these overflow buckets must be in sparse use;
// if use was dense, then we'd have already triggered regular map growth.
func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
	// If the threshold is too low, we do extraneous work.
	// If the threshold is too high, maps that grow and shrink can hold on to lots of unused memory.
	// "too many" means (approximately) as many overflow buckets as regular buckets.
	// See incrnoverflow for more details.
	if B > 15 {
		B = 15
	}
	// The compiler doesn't see here that B < 16; mask B to generate shorter shift code.
	return noverflow >= uint16(1)<<(B&15)
}

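When the number of overflow buckets is approximately equal to the size of the bucket array, the overflow chains contain too many buckets. (If the map simply held too many elements, the doubling growth would already have been triggered first.) In this case a same-size growth is performed.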
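
The way the two conditions combine can be sketched like this. growthKind is a hypothetical helper written for illustration; it assumes the load factor is tested against count+1 (the element about to be inserted) and that the load factor check takes priority over the overflow check.

```go
package main

import "fmt"

const bucketCnt = 8 // assumed number of slots per bucket

func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > 13*((uintptr(1)<<B)/2)
}

func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
	if B > 15 {
		B = 15
	}
	return noverflow >= uint16(1)<<(B&15)
}

// growthKind is a hypothetical helper that names the growth an insertion
// would trigger: "double", "same-size", or "none".
func growthKind(count int, noverflow uint16, B uint8) string {
	switch {
	case overLoadFactor(count+1, B):
		return "double" // too many elements: B becomes B+1
	case tooManyOverflowBuckets(noverflow, B):
		return "same-size" // sparse overflow chains: B is unchanged
	default:
		return "none"
	}
}

func main() {
	fmt.Println(growthKind(52, 0, 3)) // double
	fmt.Println(growthKind(20, 8, 3)) // same-size
	fmt.Println(growthKind(20, 7, 3)) // none
}
```

A same-size growth reallocates an array of the same length and repacks the entries, which shortens the overflow chains without using more regular buckets.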

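The noverflow field of hmap records the number of overflow buckets. Its type is uint16 (chosen to keep hmap small), so once B reaches 16 it can no longer count up to $2^B$ exactly. From that point on it holds only an approximate count.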

// incrnoverflow increments h.noverflow.
// noverflow counts the number of overflow buckets.
// This is used to trigger same-size map growth.
// See also tooManyOverflowBuckets.
// To keep hmap small, noverflow is a uint16.
// When there are few buckets, noverflow is an exact count.
// When there are many buckets, noverflow is an approximate count.
func (h *hmap) incrnoverflow() {
	// We trigger same-size map growth if there are
	// as many overflow buckets as buckets.
	// We need to be able to count to 1<<h.B.
	if h.B < 16 {
		h.noverflow++
		return
	}
	// Increment with probability 1/(1<<(h.B-15)).
	// When we reach 1<<15 - 1, we will have approximately
	// as many overflow buckets as buckets.
	mask := uint32(1)<<(h.B-15) - 1
	// Example: if h.B == 18, then mask == 7,
	// and fastrand & 7 == 0 with probability 1/8.
	if fastrand()&mask == 0 {
		h.noverflow++
	}
}

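The base array has $2^B$ buckets, and the largest power of two that fits in a uint16 is $2^{15}$. If each new overflow bucket increments noverflow with probability $1/2^{B-15}$, then by the time noverflow has reached $2^{15}$, roughly $2^{15} \cdot 2^{B-15} = 2^B$ overflow buckets have actually been created.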
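
A quick simulation illustrates the sampling argument. This is only a sketch: simulate is a hypothetical function, and math/rand with a fixed seed stands in for the runtime's fastrand.

```go
package main

import (
	"fmt"
	"math/rand"
)

// simulate is a hypothetical function that replays n overflow-bucket
// creations for a map with the given B and returns the resulting counter.
// Assumption: math/rand replaces the runtime's fastrand.
func simulate(B uint8, n int, seed int64) uint16 {
	rng := rand.New(rand.NewSource(seed))
	var noverflow uint16
	for i := 0; i < n; i++ {
		if B < 16 {
			noverflow++ // exact count for small maps
			continue
		}
		// Increment with probability 1/(1<<(B-15)).
		mask := uint32(1)<<(B-15) - 1
		if rng.Uint32()&mask == 0 {
			noverflow++
		}
	}
	return noverflow
}

func main() {
	const B = 18
	got := simulate(B, 1<<B, 1)
	fmt.Printf("B=%d: %d overflow buckets counted as %d (threshold %d)\n",
		B, 1<<B, got, 1<<15)
}
```

With B = 18, each of the $2^{18}$ overflow buckets is counted with probability 1/8, so the counter should end up close to $2^{15} = 32768$, which is the threshold used by tooManyOverflowBuckets.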

How growth proceeds

Incremental growth

func growWork(t *maptype, h *hmap, bucket uintptr) {
	// make sure we evacuate the oldbucket corresponding
	// to the bucket we're about to use
	evacuate(t, h, bucket&h.oldbucketmask())

	// evacuate one more oldbucket to make progress on growing
	if h.growing() {
		evacuate(t, h, h.nevacuate)
	}
}

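On every assign or delete, the old bucket that the current key maps to is evacuated, together with all of its overflow buckets.
In addition, evacuation proceeds in order starting from the first bucket: each call evacuates one more bucket together with all of its overflow buckets, and the field hmap.nevacuate records the progress. When hmap.nevacuate reaches the end of the old array, the evacuation is complete.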
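
The two rules can be exercised with a toy model. Everything here (toyMap, writesUntilDone) is hypothetical and heavily simplified: it only tracks which old buckets have been moved, and it advances the progress mark on every evacuation, which may differ in detail from the runtime.

```go
package main

import "fmt"

// toyMap is a hypothetical stand-in for hmap during growth.
type toyMap struct {
	nold      uintptr // number of old buckets (a power of two)
	evacuated []bool  // evacuated[i]: old bucket i has been moved
	nevacuate uintptr // every old bucket below this index has been moved
}

func (m *toyMap) growing() bool { return m.nevacuate < m.nold }

// evacuate marks one old bucket as moved, then advances nevacuate past
// every bucket that is already done.
func (m *toyMap) evacuate(old uintptr) {
	m.evacuated[old] = true
	for m.nevacuate < m.nold && m.evacuated[m.nevacuate] {
		m.nevacuate++
	}
}

// growWork follows the shape of the runtime function: first the old bucket
// behind the bucket being written, then one more to guarantee progress.
func (m *toyMap) growWork(bucket uintptr) {
	m.evacuate(bucket & (m.nold - 1))
	if m.growing() {
		m.evacuate(m.nevacuate)
	}
}

// writesUntilDone counts how many writes, cycling through the given bucket
// indexes, are needed before the whole old array is evacuated.
func writesUntilDone(nold uintptr, buckets []uintptr) int {
	m := &toyMap{nold: nold, evacuated: make([]bool, nold)}
	writes := 0
	for m.growing() {
		m.growWork(buckets[writes%len(buckets)])
		writes++
	}
	return writes
}

func main() {
	// Every write hits bucket 5, yet all 8 old buckets still get drained.
	fmt.Println(writesUntilDone(8, []uintptr{5})) // 7
}
```

Because each write moves at least one bucket at the nevacuate mark, growth finishes after at most as many writes as there are old buckets, even if all writes target the same key.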

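Where a bucket's entries move

In a doubling growth, each element may move to either the first half or the second half of the new array. With B denoting the value after growth, compute $hash \& 2^{B-1} \neq 0$: if it is true, the element goes to the second half of the new array, otherwise to the first half. Put simply, the highest of the low B bits of the hash decides the destination: if that bit is 1 the element lands in the second half of the array, otherwise in the first half.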
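
A short sketch of this rule, using a hypothetical helper newBucketIndex that takes the hash and the value of B after growth (assumed to be at least 1):

```go
package main

import "fmt"

// newBucketIndex is a hypothetical helper: given a key's hash and the map's
// B after a doubling growth (newB >= 1), it returns the bucket the key
// lands in and whether that bucket is in the second half of the new array.
func newBucketIndex(hash uintptr, newB uint8) (idx uintptr, secondHalf bool) {
	newbit := uintptr(1) << (newB - 1) // equals the number of old buckets
	oldIdx := hash & (newbit - 1)      // bucket index before growth
	if hash&newbit != 0 {
		return oldIdx + newbit, true
	}
	return oldIdx, false
}

func main() {
	// Growing from B=2 (4 buckets) to B=3 (8 buckets).
	for _, h := range []uintptr{0b0110, 0b1010, 0b0101} {
		idx, second := newBucketIndex(h, 3)
		fmt.Printf("hash=%04b old=%d new=%d secondHalf=%v\n", h, h&3, idx, second)
	}
}
```

An element in old bucket i therefore ends up in either bucket i or bucket i + $2^{B-1}$, which is the same as taking the low B bits of the hash as the new index.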
