Go 1.20 Maps in Detail

1. Preface

I am using Go 1.20. Source code: go/map.go at master · golang/go (github.com)

A map is a data structure that stores an unordered collection of key-value pairs.

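2. Internal Implementation

In Go, map is a built-in data structure for storing key-value pairs. It is implemented as a hash table.

A classic hash table is built from an array plus linked lists. When a key-value pair is added to a map, Go feeds the key to a hash function and uses the result to select a slot in the array. In Go's implementation each slot is a bucket that holds up to 8 key-value pairs. If the selected bucket still has a free slot, the pair is stored there; otherwise it goes into an overflow bucket chained to that bucket.

When a value is read from a map, Go hashes the key again and uses the result to locate the bucket. That bucket may contain several key-value pairs, because different keys can map to the same index. Go therefore walks the bucket and its overflow chain, comparing keys until it finds the one that matches.

Note that different keys can end up in the same position of the table. This is called a hash collision. Go handles collisions with a form of separate chaining: all pairs that map to the same bucket are kept in that bucket and its chain of overflow buckets, so a lookup only has to search that one chain.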
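The chaining idea described above can be sketched with a toy table. This is only an illustration, not the runtime's code: the `chainedTable` and `entry` types are hypothetical, FNV-1a stands in for Go's internal hash function, and each chain node holds a single pair rather than the 8-slot buckets the runtime uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// entry is one key-value pair in a bucket's chain (hypothetical type).
type entry struct {
	key, value string
	next       *entry
}

// chainedTable is a toy hash table with a fixed number of buckets.
type chainedTable struct {
	buckets []*entry
}

func newChainedTable(n int) *chainedTable {
	return &chainedTable{buckets: make([]*entry, n)}
}

// index hashes the key with FNV-1a (an assumption for this sketch)
// and reduces the hash to a bucket index.
func (t *chainedTable) index(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(len(t.buckets)))
}

// Put updates the key if it is already in the chain,
// otherwise it prepends a new entry to the chain.
func (t *chainedTable) Put(key, value string) {
	i := t.index(key)
	for e := t.buckets[i]; e != nil; e = e.next {
		if e.key == key {
			e.value = value
			return
		}
	}
	t.buckets[i] = &entry{key: key, value: value, next: t.buckets[i]}
}

// Get walks the chain of the key's bucket until it finds a matching key.
func (t *chainedTable) Get(key string) (string, bool) {
	for e := t.buckets[t.index(key)]; e != nil; e = e.next {
		if e.key == key {
			return e.value, true
		}
	}
	return "", false
}

func main() {
	// Only 2 buckets, so 3 keys are guaranteed to collide somewhere.
	t := newChainedTable(2)
	t.Put("Red", "#da1337")
	t.Put("Orange", "#e95a22")
	t.Put("Coral", "#ff7F50")

	v, ok := t.Get("Coral")
	fmt.Println(v, ok) // #ff7F50 true
}
```

With only two buckets, at least two of the three keys share a chain, yet every lookup still finds its own key because `Get` compares keys while walking the chain.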

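2.1 Hash Table

2.2 The Hash Function in Map

Just remember one thing: a map is an unordered collection of key-value pairs.

2.2.1 Hash Collisions

There are two common ways to resolve hash collisions. Go's map uses a variant of the second:

Open Addressing
Separate Chaining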

2.3 Source Code Implementation

2.3.1 Memory Model

In the source code, the struct that represents a map is hmap:

// A header for a Go map.
type hmap struct {
	// Note: the format of the hmap is also encoded in cmd/compile/internal/reflectdata/reflect.go.
	// Make sure this stays in sync with the compiler's definition.
	count     int // # live cells == size of map.  Must be first (used by len() builtin)
	flags     uint8
	B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
	noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
	hash0     uint32 // hash seed

	buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
	oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
	nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)

	extra *mapextra // optional fields
}

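Field descriptions:

count: the number of key-value pairs currently stored; this is what len() returns
B: the base-2 logarithm of the number of buckets. The buckets array has length 2^B, so a bucket can be selected with a bitwise AND (hash & (2^B - 1))
hash0: the hash seed, passed in when the hash of a key is computed
buckets: pointer to the buckets array, which holds 2^B buckets

The following fields are related to growth:

oldbuckets: pointer to the old bucket array
nevacuate: migration progress of the old buckets; it records the index of the next old bucket to migrate

Scrolling further down the source, you will find the bmap struct.

bmap is the struct that the buckets field of hmap points to. It is what we usually call a "bucket".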

// A bucket for a Go map.
type bmap struct {
	// tophash generally contains the top byte of the hash value
	// for each key in this bucket. If tophash[0] < minTopHash,
	// tophash[0] is a bucket evacuation state instead.
	tophash [bucketCnt]uint8
	// Followed by bucketCnt keys and then bucketCnt elems.
	// NOTE: packing all the keys together and then all the elems together makes the
	// code a bit more complicated than alternating key/elem/key/elem/... but it allows
	// us to eliminate padding which would be needed for, e.g., map[int64]int8.
	// Followed by an overflow pointer.
}

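However, this bmap is not its final form. At compile time Go fills in additional fields, conceptually like this:

type bmap struct {
	tophash [bucketCnt]uint8

	// fields added by the compiler:
	keys        [bucketCnt]keyType
	values      [bucketCnt]valueType
	pad         uintptr
	overflowPtr uintptr
}

overflowPtr: pointer to the overflow bucket. Overflow buckets exist to reduce the number of times the map has to grow.

[Figure omitted; image credit: 幼麟实验室]

When a bucket is full, new entries are stored in an overflow bucket.

Question: when exactly is an overflow bucket used, and when does the map grow?

Go uses both mechanisms together. An overflow bucket absorbs new entries when a single bucket fills up, while growth is triggered when the map as a whole exceeds the load factor or has accumulated too many overflow buckets. This keeps both existing keys and newly inserted keys efficient to handle.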

2.3.2 When Does a Map Grow?

Go 1.20 changed how the growth threshold (the load factor) is expressed in the source.

Map source before 1.20:

// Maximum average load of a bucket that triggers growth is 6.5.
// Represent as loadFactorNum/loadFactorDen, to allow integer math. 
loadFactorNum = 13
loadFactorDen = 2

Version 1.20:

// Maximum average load of a bucket that triggers growth is bucketCnt*13/16 (about 80% full)
// Because of minimum alignment rules, bucketCnt is known to be at least 8.
// Represent as loadFactorNum/loadFactorDen, to allow integer math.
   loadFactorDen = 2
   loadFactorNum = (bucketCnt * 13 / 16) * loadFactorDen

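As the comments show, the load factor used to be the constant 13/2 = 6.5: the map grows once it holds, on average, more than 6.5 key-value pairs per bucket. With 8 slots per bucket, that is roughly 81% of capacity.

Go 1.20 derives the threshold from the bucket size instead, using (bucketCnt * 13 / 16) * loadFactorDen, which the source comment describes as about 80% full. Because this is integer constant arithmetic, with bucketCnt = 8 the term 8 * 13 / 16 truncates to 6, so the expression as written yields 12/2 = 6 pairs per bucket.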
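To make the arithmetic concrete, the sketch below evaluates both forms of the constants. It assumes bucketCnt is 8, and `growThreshold` is a hypothetical helper written for this article, not a runtime function; it only mirrors the `num * (buckets / den)` shape of the check.

```go
package main

import "fmt"

const (
	bucketCnt = 8 // slots per bucket (assumed value for this sketch)

	// Constants as written before Go 1.20.
	oldNum, oldDen = 13, 2

	// Constants as written in Go 1.20; 8*13/16 is integer division.
	newDen = 2
	newNum = (bucketCnt * 13 / 16) * newDen
)

// growThreshold returns the largest entry count a map with 2^B buckets
// can hold before the load-factor check asks for growth.
// Hypothetical helper, for illustration only.
func growThreshold(num, den int, B uint8) int {
	buckets := 1 << B
	return num * (buckets / den)
}

func main() {
	fmt.Println(newNum)                           // 12
	fmt.Println(growThreshold(oldNum, oldDen, 3)) // 52
	fmt.Println(growThreshold(newNum, newDen, 3)) // 48
}
```

For a map with 8 buckets (B = 3), the older constants allow 52 entries before growth while the 1.20 expression allows 48.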

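2.3.3 The Growth Process

When a map grows, it does not move every key-value pair from the old buckets to the new ones in a single step, because that would be expensive for a large map.

To solve this, Go grows the map incrementally, which is why hmap has these two fields:

oldbuckets: pointer to the old bucket array
nevacuate: migration progress of the old buckets; it records the index of the next old bucket to migrate

On each write to the map (insert, update, or delete), if the map is in the middle of growing, a small part of the migration is performed. Growth is only truly complete once all data in oldbuckets has been migrated.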
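The incremental migration described above can be illustrated with a toy table. Everything here is a simplified sketch under stated assumptions: the `table` and `pair` types are hypothetical, keys are non-negative ints with `key % len` standing in for hashing, and the method names only loosely echo the runtime's. Each write migrates the old bucket it touches plus one more, so progress is guaranteed.

```go
package main

import "fmt"

type pair struct {
	key int
	val string
}

// table is a toy hash table that grows incrementally (hypothetical type).
type table struct {
	buckets    [][]pair // current bucket array
	oldbuckets [][]pair // non-nil only while growing
	nevacuate  int      // next old bucket to migrate
}

// startGrow doubles the bucket array but moves nothing yet.
func (t *table) startGrow() {
	t.oldbuckets = t.buckets
	t.buckets = make([][]pair, 2*len(t.oldbuckets))
	t.nevacuate = 0
}

// evacuate moves every pair of one old bucket into the new array.
func (t *table) evacuate(old int) {
	for _, p := range t.oldbuckets[old] {
		i := p.key % len(t.buckets)
		t.buckets[i] = append(t.buckets[i], p)
	}
	t.oldbuckets[old] = nil
}

// growWork migrates the old bucket this write touches, plus one more.
func (t *table) growWork(key int) {
	t.evacuate(key % len(t.oldbuckets))
	t.evacuate(t.nevacuate)
	t.nevacuate++
	if t.nevacuate == len(t.oldbuckets) {
		t.oldbuckets = nil // growth is complete
	}
}

// Put does a little migration work first, then writes to the new array.
func (t *table) Put(key int, val string) {
	if t.oldbuckets != nil {
		t.growWork(key)
	}
	i := key % len(t.buckets)
	for j := range t.buckets[i] {
		if t.buckets[i][j].key == key {
			t.buckets[i][j].val = val
			return
		}
	}
	t.buckets[i] = append(t.buckets[i], pair{key, val})
}

// Get checks the not-yet-migrated old bucket first, then the new one.
func (t *table) Get(key int) (string, bool) {
	if t.oldbuckets != nil {
		for _, p := range t.oldbuckets[key%len(t.oldbuckets)] {
			if p.key == key {
				return p.val, true
			}
		}
	}
	for _, p := range t.buckets[key%len(t.buckets)] {
		if p.key == key {
			return p.val, true
		}
	}
	return "", false
}

func main() {
	t := &table{buckets: make([][]pair, 2)}
	for k := 0; k < 4; k++ {
		t.Put(k, fmt.Sprintf("v%d", k))
	}
	t.startGrow()
	t.Put(4, "v4")
	fmt.Println(t.oldbuckets != nil) // true: still growing
	t.Put(5, "v5")
	fmt.Println(t.oldbuckets != nil) // false: migration finished
}
```

Reads never move data in this sketch; they only know to look in the old bucket while it has not been migrated yet.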

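Summary

When the load factor is exceeded, the map grows to twice its size.
When the load factor is not exceeded but there are too many overflow buckets, the map performs a same-size growth: the data is migrated from the old buckets into a new bucket array of the same size. After many deletions the data in the buckets becomes sparse, and same-size growth packs it tightly again, which reduces the number of growths needed.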
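The two triggers in the summary can be written as a small decision function. This is a simplified sketch modeled loosely on the checks in the runtime, not a copy of them: `growthKind` is a hypothetical helper, bucketCnt is assumed to be 8, and the overflow threshold of roughly one overflow bucket per regular bucket (capped at 2^15) is stated here as an assumption.

```go
package main

import "fmt"

const (
	bucketCnt     = 8 // slots per bucket (assumed)
	loadFactorDen = 2
	loadFactorNum = (bucketCnt * 13 / 16) * loadFactorDen
)

// overLoadFactor reports whether count entries in 2^B buckets exceed the load factor.
func overLoadFactor(count int, B uint8) bool {
	buckets := 1 << B
	return count > bucketCnt && count > loadFactorNum*(buckets/loadFactorDen)
}

// tooManyOverflowBuckets reports whether there are about as many
// overflow buckets as regular buckets (assumed threshold).
func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
	if B > 15 {
		B = 15
	}
	return noverflow >= uint16(1)<<(B&15)
}

// growthKind is a hypothetical helper that names the growth the summary describes.
func growthKind(count int, B uint8, noverflow uint16) string {
	switch {
	case overLoadFactor(count, B):
		return "double"
	case tooManyOverflowBuckets(noverflow, B):
		return "same-size"
	default:
		return "none"
	}
}

func main() {
	fmt.Println(growthKind(100, 3, 0)) // double
	fmt.Println(growthKind(10, 3, 8))  // same-size
	fmt.Println(growthKind(10, 3, 1))  // none
}
```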

2.3.4 Locating a Key

Now that we know how the map is implemented and how it grows, let's look at how a key is located.
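Based on the field comments quoted earlier (B selects one of 2^B buckets, and tophash holds the top byte of the hash), key location can be sketched as splitting the hash into two parts. The `locate` and `hashKey` functions are hypothetical, a 64-bit hash is assumed, and FNV-1a stands in for the runtime's seeded hash function.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Smaller tophash values are reserved as bucket state markers,
// as the bmap comment hints ("If tophash[0] < minTopHash ...").
const minTopHash = 5

// hashKey hashes a string key with FNV-1a (an assumption for this sketch).
func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

// locate splits a 64-bit hash into a bucket index (low B bits)
// and a tophash byte (high 8 bits) used to pre-filter slots in the bucket.
func locate(hash uint64, B uint8) (bucket uint64, top uint8) {
	bucket = hash & (uint64(1)<<B - 1)
	top = uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash // keep clear of the reserved marker values
	}
	return bucket, top
}

func main() {
	for _, key := range []string{"Red", "Orange", "Coral"} {
		bucket, top := locate(hashKey(key), 3)
		fmt.Printf("%-6s -> bucket %d, tophash %#02x\n", key, bucket, top)
	}
}
```

Inside the chosen bucket, a lookup would first compare the tophash byte of each slot and only compare the full key when the byte matches, which makes scanning a bucket cheap.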

3. Map Operations

3.1 Creating a Map

// Create a map with string keys and int values
dict1 := make(map[string]int)
// Create a map whose keys and values are both strings,
// initialized with two key-value pairs
dict2 := map[string]string{"Red": "#da1337", "Orange": "#e95a22"}
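A related detail worth illustrating is the difference between a map that was only declared and one created with make. The sketch below shows that a nil map can be read but not written; `tryWrite` is a hypothetical helper that uses recover to report the panic instead of crashing.

```go
package main

import "fmt"

// tryWrite reports whether writing to m succeeded without panicking.
// Hypothetical helper, for illustration only.
func tryWrite(m map[string]int, key string, value int) (ok bool) {
	defer func() {
		if recover() != nil {
			ok = false
		}
	}()
	m[key] = value
	return true
}

func main() {
	var nilMap map[string]int // declared but never initialized

	fmt.Println(nilMap == nil)          // true
	fmt.Println(nilMap["missing"])      // 0: reading a nil map is allowed
	fmt.Println(tryWrite(nilMap, "a", 1)) // false: writing to a nil map panics

	// make accepts an optional size hint so the map can pre-allocate buckets.
	sized := make(map[string]int, 100)
	fmt.Println(tryWrite(sized, "a", 1)) // true
	fmt.Println(len(sized))              // 1
}
```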

3.2 Assigning Values to a Map

// Create an empty map to store colors and their hex codes
colors := map[string]string{}
// Add the code for Red to the map
colors["Red"] = "#da1337"

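3.3 Getting a Value and Checking Whether a Key Exists

// Get the value for the key Blue
value, exists := dict1["Blue"]
// Does the key exist?
if exists {
	fmt.Println(value)
}

// Get the value for the key Blue
value := colors["Blue"]
// Does the key exist? (this check only works if "" is never stored as a value)
if value != "" {
	fmt.Println(value)
}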
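The difference between the two checks above shows up when the zero value is itself a legitimate value. The sketch below uses a hypothetical `lookup` helper and an "Unset" key invented for the example.

```go
package main

import "fmt"

// lookup returns the value for key and whether the key is present.
func lookup(m map[string]string, key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

func main() {
	// "Unset" is present but maps to the empty string.
	colors := map[string]string{"Red": "#da1337", "Unset": ""}

	v, ok := lookup(colors, "Unset")
	fmt.Printf("%q %v\n", v, ok) // "" true

	v, ok = lookup(colors, "Blue")
	fmt.Printf("%q %v\n", v, ok) // "" false
}
```

Both lookups return the empty string, so only the second return value tells a stored empty value apart from a missing key.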

3.4 Iterating over a Map with range

	// Create a map to store colors and their hex codes
	colors := map[string]string{
		"AliceBlue":   "#f0f8ff",
		"Coral":       "#ff7F50",
		"DarkGray":    "#a9a9a9",
		"ForestGreen": "#228b22",
	}
	// Print every color in the map
	for key, value := range colors {
		fmt.Printf("Key: %s Value: %s\n", key, value)
	}
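Because a map is unordered, the loop above may print the colors in a different order on each run. When a stable order is needed, a common approach is to collect the keys, sort them, and iterate over the sorted slice. The `sortedKeys` helper below is a name chosen for this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	colors := map[string]string{
		"AliceBlue":   "#f0f8ff",
		"Coral":       "#ff7F50",
		"DarkGray":    "#a9a9a9",
		"ForestGreen": "#228b22",
	}
	// Iterate in key order instead of map order.
	for _, key := range sortedKeys(colors) {
		fmt.Printf("Key: %s Value: %s\n", key, colors[key])
	}
}
```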

3.5 Deleting an Entry from a Map

	// Delete the key-value pair whose key is Coral
	delete(colors, "Coral")
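Two properties of delete are easy to check in a short sketch: deleting a key that is not in the map is a no-op, and entries may be deleted while ranging over the map. The `deleteIf` helper and the length-based predicate are made up for this example.

```go
package main

import "fmt"

// deleteIf removes every entry for which pred returns true
// and reports how many entries were removed.
func deleteIf(m map[string]string, pred func(key, value string) bool) int {
	removed := 0
	for k, v := range m {
		if pred(k, v) {
			delete(m, k) // deleting during range is permitted
			removed++
		}
	}
	return removed
}

func main() {
	colors := map[string]string{
		"AliceBlue": "#f0f8ff",
		"Coral":     "#ff7F50",
		"DarkGray":  "#a9a9a9",
	}

	delete(colors, "NoSuchColor") // key is absent: nothing happens
	fmt.Println(len(colors))      // 3

	n := deleteIf(colors, func(key, _ string) bool { return len(key) > 5 })
	fmt.Println(n, len(colors)) // 2 1
}
```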

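4 Passing Maps Between Functions

Passing a map to a function does not create a copy of the map. When a function modifies a map it received, every reference to that map sees the change.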

4.1 Deleting in a Function

package maplearn

import (
	"fmt"
	"testing"
)

func TestMapDemo(t *testing.T) {

	// Create a map to store colors and their hex codes
	colors := map[string]string{
		"AliceBlue": "#f0f8ff",
		"Coral":     "#ff7F50",
		"DarkGray":  "#a9a9a9",
	}
	// Print every color in the map
	for key, value := range colors {
		fmt.Printf("Key: %s Value: %s\n", key, value)
	}
	// Call the function to remove the given key
	removeColor(colors, "Coral")
	// Print every color in the map
	for key, value := range colors {
		fmt.Printf("after Key: %s Value: %s\n", key, value)
	}

}

// removeColor deletes the given key from the map
func removeColor(colors map[string]string, key string) {
	delete(colors, key)
}

Key: AliceBlue Value: #f0f8ff
Key: Coral Value: #ff7F50
Key: DarkGray Value: #a9a9a9
after Key: AliceBlue Value: #f0f8ff
after Key: DarkGray Value: #a9a9a9

4.2 Adding in a Function

package maplearn

import (
	"fmt"
	"testing"
)

func TestMapDemo(t *testing.T) {

	// Create a map to store colors and their hex codes
	colors := map[string]string{
		"AliceBlue": "#f0f8ff",
		"Coral":     "#ff7F50",
		"DarkGray":  "#a9a9a9",
	}
	// Print every color in the map
	for key, value := range colors {
		fmt.Printf("Key: %s Value: %s\n", key, value)
	}
	// Call the function to add the given key
	addColor(colors, "Tom")
	// Print every color in the map
	for key, value := range colors {
		fmt.Printf("after Key: %s Value: %s\n", key, value)
	}

}

// addColor adds the given key to the map, using the key as its value
func addColor(colors map[string]string, key string) {
	colors[key] = key
}

Key: AliceBlue Value: #f0f8ff
Key: Coral Value: #ff7F50
Key: DarkGray Value: #a9a9a9
after Key: AliceBlue Value: #f0f8ff
after Key: Coral Value: #ff7F50
after Key: DarkGray Value: #a9a9a9
after Key: Tom Value: Tom

4.3 Modifying in a Function

package maplearn

import (
	"fmt"
	"testing"
)

func TestMapDemo(t *testing.T) {

	// Create a map to store colors and their hex codes
	colors := map[string]string{
		"AliceBlue": "#f0f8ff",
		"Coral":     "#ff7F50",
		"DarkGray":  "#a9a9a9",
	}
	// Print every color in the map
	for key, value := range colors {
		fmt.Printf("Key: %s Value: %s\n", key, value)
	}

	updateColor(colors, "Coral", "null")

	// Print every color in the map
	for key, value := range colors {
		fmt.Printf("after Key: %s Value: %s\n", key, value)
	}

}

// updateColor changes the value stored for a key in the map
func updateColor(colors map[string]string, key string, value string) {
	colors[key] = value
}

Key: AliceBlue Value: #f0f8ff
Key: Coral Value: #ff7F50
Key: DarkGray Value: #a9a9a9
after Key: DarkGray Value: #a9a9a9
after Key: AliceBlue Value: #f0f8ff
after Key: Coral Value: null

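This shows a difference between map and slice: adding to, deleting from, or updating a map inside a function affects the original map, whereas elements appended to a slice inside a function are not visible to the caller. It also confirms that passing a map to a function does not pass a copy of it.

Reference book: 《Go语言实战》 (Go in Action)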
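The contrast with slices can be seen in a short sketch. The `appendColor` and `setColor` functions are hypothetical helpers written for this comparison. Note that a function can still change existing elements of a slice it receives; what the caller does not see is the new length after an append.

```go
package main

import "fmt"

// appendColor appends to its local copy of the slice header,
// so the caller's slice keeps its original length.
func appendColor(colors []string, c string) {
	colors = append(colors, c)
}

// setColor writes through to the map the caller also refers to.
func setColor(colors map[string]string, key, value string) {
	colors[key] = value
}

func main() {
	list := []string{"Red"}
	appendColor(list, "Blue")
	fmt.Println(len(list)) // 1: the append is not visible here

	table := map[string]string{"Red": "#da1337"}
	setColor(table, "Blue", "#0000ff")
	fmt.Println(len(table)) // 2: the insert is visible here
}
```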