sync.Pool

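sync.Pool is a concurrency-safe object pool: multiple goroutines can store and retrieve objects from it at the same time. It is typically used when creating an object is expensive. Strictly speaking it is not a cache, though, because the caller has no control over when pooled objects are released; from the user's point of view it is a black box.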

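Reusing objects through sync.Pool is safe, and it cuts down on repeated memory allocations, which in turn reduces how often the GC has to run.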

Using sync.Pool

// Create an instance
p := &sync.Pool{}

A Pool stores values of type interface{}, so objects of any type can be put into it. Its API is also very small: just two methods.

// Put stores an object, Get takes one out
p.Put(1)
p.Get()

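Note that the caller can neither assume how many objects are in the Pool nor query that number. Pooled objects may be removed at any time without notification (in practice, during garbage collection), and Get returns an arbitrary object from the Pool, so there is no guaranteed relationship between what was Put and what Get returns.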

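When Get() is called on an empty Pool, it calls the user-defined Pool.New function to create an object and returns it; if New is nil, Get returns nil. Note that sync.Pool does nothing to make New concurrency-safe. New can be called from several goroutines at once, so it is up to you to make sure it is safe.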
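
The Get/New/Put cycle described above can be sketched as follows. This is a minimal illustration, not a recommended layout: the buffer type and the newBufferPool helper are hypothetical names invented for the example, and the 64-byte capacity is an arbitrary choice.

```go
package main

import (
	"fmt"
	"sync"
)

// buffer is a hypothetical pooled type used only for this illustration.
type buffer struct {
	data []byte
}

// newBufferPool returns a pool whose New function builds a fresh *buffer.
// New touches no shared state, so concurrent calls to it are safe.
func newBufferPool() *sync.Pool {
	return &sync.Pool{
		New: func() interface{} {
			return &buffer{data: make([]byte, 0, 64)}
		},
	}
}

func main() {
	p := newBufferPool()

	// The pool is empty, so Get falls back to New.
	b := p.Get().(*buffer)
	b.data = append(b.data, "hello"...)
	fmt.Println(len(b.data), cap(b.data))

	// Clear the contents before handing the object back to the pool.
	b.data = b.data[:0]
	p.Put(b)

	// Get may return the object stored above or a brand-new one from New;
	// correct code must work in both cases.
	b2 := p.Get().(*buffer)
	fmt.Println(len(b2.data))
}
```

Because the object is cleared before Put, the second Get sees an empty buffer whether it received the recycled object or a new one.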

Use cases

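It raises the reuse rate of temporary objects; when a high-concurrency service runs into GC problems, it can be used to reduce the GC load.
It is not suitable for objects that carry state, because which object gets released and which one Get returns are both unpredictable.
The number of pooled objects cannot be controlled.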

In practice

As an object factory

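Initialize a sync.Pool instance and give it a concurrency-safe New function
Obtain objects with pool.Get() instead of constructing them directly
After Get, use defer to return the object with pool.Put()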

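Make no assumptions about the objects in the pool: reset the object (the Go equivalent of a memset) either before calling Pool.Put or right after calling Pool.Get.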
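
A minimal sketch of this pattern, assuming a pool of *bytes.Buffer values. The names bufPool and greet are hypothetical and exist only for the example; the reset is done in the deferred function, just before Put.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out *bytes.Buffer values. New is stateless,
// so it is safe to call from many goroutines at once.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// greet is a hypothetical helper that builds a string in a pooled buffer.
func greet(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	// Reset before Put so the next user never sees leftover bytes.
	defer func() {
		buf.Reset()
		bufPool.Put(buf)
	}()

	buf.WriteString("hello, ")
	buf.WriteString(name)
	// String copies the bytes, so the result stays valid after the reset.
	return buf.String()
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			fmt.Println(greet(fmt.Sprintf("worker-%d", i)))
		}(i)
	}
	wg.Wait()
}
```

Resetting on the way in (right after Get) works equally well; what matters is that the code never relies on the previous contents of a pooled object.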

The gopool goroutine pool

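C++ and Java generally have dedicated libraries for thread pools, but Go, thanks to goroutines, does not ship a goroutine pool component. ByteDance's open-source gopkg repository contains a goroutine pool implementation called gopool that is simple and quite interesting.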

It is easy to adopt; a simple substitution is all that is needed:

go func() { /* ... */ }()
// can be replaced with:
gopool.Go(func() { /* ... */ })

Let's first look at how Pool is defined in pool.go:

// The Pool interface; the concrete pool type implements this set of methods
type Pool interface {
	// Name returns the corresponding pool name.
	Name() string
	// SetCap sets the goroutine capacity of the pool.
	// (upper limit on the number of goroutines in the pool)
	SetCap(cap int32)
	// Go executes f.
	// (runs the function passed in)
	Go(f func())
	// CtxGo executes f and accepts the context.
	// (the actual execution logic; Go calls it with context.Background() when the caller has no context)
	CtxGo(ctx context.Context, f func())
	// SetPanicHandler sets the panic handler.
	SetPanicHandler(f func(context.Context, interface{}))
	// WorkerCount returns the number of running workers
	// (number of goroutines currently working)
	WorkerCount() int32
}

Next, let's look at the definition of the pool struct:

type pool struct {
	// The name of the pool
	name string

	// capacity of the pool, the maximum number of goroutines that are actually working
	cap int32
	// Configuration information
	config *Config
	// linked list of task
	// (the linked list holds the pending tasks)
	taskHead  *task
	taskTail  *task
	// (the mutex keeps concurrent access to the list safe across goroutines)
	taskLock  sync.Mutex
	// (number of pending tasks)
	taskCount int32

	// Record the number of running workers
	// (number of worker goroutines)
	workerCount int32

	// This method will be called when the worker panic
	panicHandler func(context.Context, interface{})
}

// task holds the func passed in by the user
type task struct {
	ctx context.Context
	f   func()

	next *task
}

// The next two methods are worth a closer look.
// zero clears every field of the task.
func (t *task) zero() {
	t.ctx = nil
	t.f = nil
	t.next = nil
}

// Recycle clears the task first and then puts it back into the sync.Pool.
// taskPool is the object pool itself: var taskPool sync.Pool
func (t *task) Recycle() {
	t.zero()
	taskPool.Put(t)
}

The pool struct shown earlier is what actually executes tasks behind the exported gopool.Go. Next, let's look at gopool.go:

// defaultPool is the global default pool.
var defaultPool Pool

var poolMap sync.Map

func init() {
	defaultPool = NewPool("gopool.DefaultPool", math.MaxInt32, NewConfig())
}

// Go is an alternative to the go keyword, which is able to recover panic.
// gopool.Go(func(arg interface{}){
//     ...
// }(nil))
func Go(f func()) {
	CtxGo(context.Background(), f)
}

// CtxGo is preferred than Go.
func CtxGo(ctx context.Context, f func()) {
	defaultPool.CtxGo(ctx, f)
}

As you can see, both Go() and CtxGo() end up calling the default pool's CtxGo() method.

How gopool.Go() works

Next, let's follow the user's path starting from CtxGo() and see what happens behind the scenes.

func (p *pool) CtxGo(ctx context.Context, f func()) {
	// taskPool is a sync.Pool used to cache and reuse task objects;
	// take a task from it and fill in its fields
	t := taskPool.Get().(*task)
	t.ctx = ctx
	t.f = f
	// lock before touching the task linked list
	p.taskLock.Lock()
	if p.taskHead == nil {
		p.taskHead = t
		p.taskTail = t
	} else {
		p.taskTail.next = t
		p.taskTail = t
	}
	p.taskLock.Unlock()
	// atomically increment the task counter
	atomic.AddInt32(&p.taskCount, 1)
	// The following two conditions are met:
	// 1. the number of tasks is greater than the threshold.
	// 2. The current number of workers is less than the upper limit p.cap.
	// or there are currently no workers.
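	// If the number of pending tasks has reached the threshold and the number of
	// workers is below the cap, or there are no workers at all:
	// take a worker from workerPool (another sync.Pool, used to cache and reuse
	// workers), point it at this pool, and start its run function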
	if (atomic.LoadInt32(&p.taskCount) >= p.config.ScaleThreshold && p.WorkerCount() < atomic.LoadInt32(&p.cap)) || p.WorkerCount() == 0 {
		p.incWorkerCount()
		w := workerPool.Get().(*worker)
		w.pool = p
		w.run()
	}
}
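
The scaling condition is easier to read when pulled out into a pure function. The sketch below restates the if expression from CtxGo; shouldSpawn and its parameter names are hypothetical, and the sample numbers are made up for illustration.

```go
package main

import "fmt"

// shouldSpawn restates the condition in CtxGo: start a new worker when the
// backlog has reached the threshold and there is still room under the cap,
// or when no worker is running at all.
func shouldSpawn(taskCount, threshold, workers, capacity int32) bool {
	return (taskCount >= threshold && workers < capacity) || workers == 0
}

func main() {
	// No workers yet: always spawn, even below the threshold.
	fmt.Println(shouldSpawn(1, 5, 0, 8))
	// Workers exist and the backlog is below the threshold: reuse them.
	fmt.Println(shouldSpawn(1, 5, 2, 8))
	// Backlog has reached the threshold and there is room: scale up.
	fmt.Println(shouldSpawn(5, 5, 2, 8))
	// Backlog is large but the pool is already at its cap: do not spawn.
	fmt.Println(shouldSpawn(9, 5, 8, 8))
}
```

Note that the comparison is >=, so a worker is added as soon as the backlog reaches the threshold, and that the workers == 0 branch does not consult the cap.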

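A worker here is really just a wrapper around a pointer to a pool. In other words, different workers share the same task list by holding pointers to the same pool. At this point we can roughly guess what the run function does: take tasks off the task queue and execute them.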

func (w *worker) run() {
	// start a goroutine with the go keyword that executes tasks in a loop
	go func() {
		for {
			var t *task
			// lock first
			w.pool.taskLock.Lock()
			// pop one task off the list and decrement the task counter
			if w.pool.taskHead != nil {
				t = w.pool.taskHead
				w.pool.taskHead = w.pool.taskHead.next
				atomic.AddInt32(&w.pool.taskCount, -1)
			}
			// if the task queue is empty:
			// decrement the worker count,
			// then clear the worker's fields and put it back into its sync.Pool
			if t == nil {
				// if there's no task to do, exit
				w.close()
				w.pool.taskLock.Unlock()
				w.Recycle()
				return
			}
			w.pool.taskLock.Unlock()
			func() {
				// run the task and recover from any panic it raises
				defer func() {
					if r := recover(); r != nil {
						// panicHandler is also user-defined
						if w.pool.panicHandler != nil {
							w.pool.panicHandler(t.ctx, r)
						} else {
							msg := fmt.Sprintf("GOPOOL: panic in pool: %s: %v: %s", w.pool.name, r, debug.Stack())
							logger.CtxErrorf(t.ctx, msg)
						}
					}
				}()
				// run the function passed in
				t.f()
			}()
			// clear the task's fields and cache it in the sync.Pool
			t.Recycle()
		}
	}()
}

That covers the key logic of this goroutine pool. Let's briefly summarize.

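gopool.Go is used the same way as the native go statement, and both are asynchronous. gopool.Go only appends the task to the task list; the actual execution happens in a goroutine started inside run(), which is what makes it asynchronous.
It supports a configurable threshold (a new goroutine is started only once the number of pending tasks reaches it) and an upper limit on the number of goroutines.
When there are no tasks left in the pool, the worker goroutine exits; two sync.Pools are used to cache and reuse tasks and workers.
Every time an object is put back into a sync.Pool, it is reset first (the memset-style zeroing).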
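
To tie the summary together, here is a deliberately simplified sketch of the same idea using only the standard library. It is not gopool's code: miniPool, work, and runAll are hypothetical names, the scale threshold and the sync.Pool reuse of tasks and workers are left out, all counters are guarded by a single mutex, and the cap is assumed to be at least 1.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// task is one node of the pending-task linked list.
type task struct {
	f    func()
	next *task
}

// miniPool is a stripped-down pool: a locked task list plus a capped
// number of worker goroutines. All fields are guarded by mu.
type miniPool struct {
	mu         sync.Mutex
	head, tail *task
	workers    int32 // workers currently running
	cap        int32 // upper bound on running workers (assumed >= 1)
	peak       int32 // highest worker count seen, kept for the demo
}

// Go appends f to the list and starts a worker if the cap allows it.
func (p *miniPool) Go(f func()) {
	t := &task{f: f}
	p.mu.Lock()
	if p.head == nil {
		p.head, p.tail = t, t
	} else {
		p.tail.next = t
		p.tail = t
	}
	spawn := p.workers < p.cap
	if spawn {
		p.workers++
		if p.workers > p.peak {
			p.peak = p.workers
		}
	}
	p.mu.Unlock()
	if spawn {
		go p.work()
	}
}

// work pops tasks until the list is empty, then exits.
func (p *miniPool) work() {
	for {
		p.mu.Lock()
		t := p.head
		if t == nil {
			p.workers-- // decremented under the lock, so no task is stranded
			p.mu.Unlock()
			return
		}
		p.head = t.next
		p.mu.Unlock()
		func() {
			// A panicking task must not take the worker down with it.
			defer func() {
				if r := recover(); r != nil {
					fmt.Println("task panicked:", r)
				}
			}()
			t.f()
		}()
	}
}

// runAll submits n counting tasks, waits for them, and reports how many
// ran and the peak number of workers.
func runAll(capacity int32, n int) (done, peak int32) {
	p := &miniPool{cap: capacity}
	var wg sync.WaitGroup
	var count int32
	for i := 0; i < n; i++ {
		wg.Add(1)
		p.Go(func() {
			defer wg.Done()
			atomic.AddInt32(&count, 1)
		})
	}
	wg.Wait()
	p.mu.Lock()
	defer p.mu.Unlock()
	return atomic.LoadInt32(&count), p.peak
}

func main() {
	done, peak := runAll(4, 100)
	fmt.Println("tasks done:", done, "peak workers:", peak)
}
```

Submission returns immediately in the same way as gopool.Go, every submitted task runs exactly once, and the number of live workers never exceeds the cap. The real implementation layers the threshold check and object reuse on top of this skeleton.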
