The source code discussed in this article is from Go 1.17.3.

Getting to know Context

type Context interface {
   Deadline() (deadline time.Time, ok bool)

   Done() <-chan struct{}

   Err() error

   Value(key interface{}) interface{}
}

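Context is just an interface; any type that implements its four methods can be used as a Context:

Deadline: returns the time at which the context will be canceled automatically, together with true. If no deadline has been set, it returns the zero time.Time and false.

Done: returns a receive-only channel that is closed when the context is canceled. If the context can never be canceled, Done may return nil.

Err: returns nil while Done is not yet closed; once the context has been canceled, it returns the reason (Canceled or DeadlineExceeded).

Value: returns the value associated with key in this context or one of its ancestors, or nil if there is none. It is read-only; a new value is attached by deriving a child context with WithValue.

All four methods are idempotent in the sense that successive calls return the same result (for Done, the same channel; for Err, the same error once it is non-nil).

The main uses of Context:

Controlling the lifetime of the goroutines that carry a context during concurrent work, by means of a timeout or a cancellation signal.

Passing request-scoped information through Value.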
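
The two uses listed above can be sketched together as follows. This is only an illustration: worker, runWorker and requestIDKey are hypothetical names that do not appear in the original article.

```go
package main

import (
	"context"
	"fmt"
)

// requestIDKey is a hypothetical key type used only for this illustration.
type requestIDKey struct{}

// worker blocks until ctx is canceled, then reports the request ID it
// found in ctx together with the cancellation reason.
func worker(ctx context.Context, out chan<- string) {
	<-ctx.Done()
	id, _ := ctx.Value(requestIDKey{}).(string)
	out <- fmt.Sprintf("%s stopped: %v", id, ctx.Err())
}

// runWorker starts one worker, cancels it and returns what it reported.
func runWorker() string {
	ctx, cancel := context.WithCancel(context.Background())
	ctx = context.WithValue(ctx, requestIDKey{}, "req-1")

	out := make(chan string, 1)
	go worker(ctx, out)

	cancel() // signal the goroutine to stop
	return <-out
}

func main() {
	fmt.Println(runWorker())
}
```

The goroutine never receives an explicit "stop" argument; it only observes the context it was handed.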

Built-in Context types:

emptyCtx:

type emptyCtx int

func (*emptyCtx) Deadline() (deadline time.Time, ok bool) {
   return
}

func (*emptyCtx) Done() <-chan struct{} {
   return nil
}

func (*emptyCtx) Err() error {
   return nil
}

func (*emptyCtx) Value(key interface{}) interface{} {
   return nil
}

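As you can see, emptyCtx is defined on top of int, and every method returns a zero value or nil. In other words, it is a context that is never canceled, has no deadline and carries no values.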

Background() & TODO()

emptyCtx is mainly the type of the root Context. The ctx returned by the familiar context.Background() is an emptyCtx:

var (
   background = new(emptyCtx)
   todo       = new(emptyCtx)
)

func Background() Context {
   return background
}

func TODO() Context {
   return todo
}

As the source shows, context.Background() returns the package-level variable background (a pointer to an emptyCtx).

Let's print it:

ctx := context.Background()
fmt.Printf("ctx: %+v", ctx)

One might expect an address such as 0xc233333333 to be printed, but the actual output is:

ctx: context.Background

The reason: emptyCtx implements the String() method.

func (e *emptyCtx) String() string {
   switch e {
   case background:
      return "context.Background"
   case todo:
      return "context.TODO"
   }
   return "unknown empty Context"
}

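This effectively hides the fact that background is a pointer to an emptyCtx.

The background context is usually obtained in main, at the start of an incoming request, or in tests, and is then passed downstream.

The TODO context is a placeholder for cases where it is unclear which context to use, or where the surrounding function has not yet been extended to accept a context parameter.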

valueCtx:

type valueCtx struct {
   Context
   key, val interface{}
}

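Unlike emptyCtx, which is based on int, valueCtx is a struct. It embeds its parent Context, so Deadline, Done and Err are promoted from the parent; it overrides only Value and additionally stores a key and a value.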

WithValue()

First, how a valueCtx is created:

func WithValue(parent Context, key, val interface{}) Context {
   if parent == nil {
      panic("cannot create context from nil parent")
   }
   if key == nil {
      panic("nil key")
   }
   if !reflectlite.TypeOf(key).Comparable() {
      panic("key is not comparable")
   }
   return &valueCtx{parent, key, val}
}

The parent ctx must not be nil, the key must not be nil, and the key's type must be comparable; otherwise WithValue panics.
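
A small sketch of these checks, assuming a hypothetical helper tryWithValue that turns the panic into an error and a hypothetical key type userKey:

```go
package main

import (
	"context"
	"fmt"
)

// userKey is a hypothetical unexported key type; a dedicated type avoids
// collisions with keys defined in other packages.
type userKey struct{}

// tryWithValue calls context.WithValue and converts a panic into an error.
func tryWithValue(key, val interface{}) (ctx context.Context, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("WithValue panicked: %v", r)
		}
	}()
	return context.WithValue(context.Background(), key, val), nil
}

func main() {
	_, err := tryWithValue(userKey{}, "alice") // comparable key: accepted
	fmt.Println(err)

	_, err = tryWithValue([]int{1}, "x") // slices are not comparable
	fmt.Println(err)

	_, err = tryWithValue(nil, "x") // nil key
	fmt.Println(err)
}
```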

Value()

Next, the Value() method implemented by valueCtx:

func (c *valueCtx) Value(key interface{}) interface{} {
   if c.key == key {
      return c.val
   }
   return c.Context.Value(key)
}

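If the given key equals this valueCtx's key, its value is returned directly; otherwise the lookup continues in the parent ctx. In other words, values are inherited along the chain of contexts, and the nearest matching key wins.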
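
The lookup order can be observed with a short chain of contexts. The key type and the lookups function are hypothetical names used only in this sketch:

```go
package main

import (
	"context"
	"fmt"
)

type key string // hypothetical key type for this sketch

// lookups builds Background -> parent(a=1) -> child(b=2) -> shadow(a=3)
// and returns a few Value results.
func lookups() (fromParent, missingInParent, shadowed interface{}) {
	parent := context.WithValue(context.Background(), key("a"), 1)
	child := context.WithValue(parent, key("b"), 2)
	shadow := context.WithValue(child, key("a"), 3)

	fromParent = child.Value(key("a"))       // not in child, found in parent
	missingInParent = parent.Value(key("b")) // a parent never sees child values
	shadowed = shadow.Value(key("a"))        // the nearest match wins
	return
}

func main() {
	fmt.Println(lookups())
}
```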

String()

Then there are the two remaining functions, which are used for printing:

func stringify(v interface{}) string {
   switch s := v.(type) {
   case stringer:
      return s.String()
   case string:
      return s
   }
   return "<not Stringer>"
}

func (c *valueCtx) String() string {
   return contextName(c.Context) + ".WithValue(type " +
      reflectlite.TypeOf(c.key).String() +
      ", val " + stringify(c.val) + ")"
}

Together they implement the Stringer behaviour of valueCtx (printing the parent ctx, the type of the key and the value), so we won't go into more detail here.

cancelCtx:

type cancelCtx struct {
   Context

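   mu       sync.Mutex            // protects following fields
   done     atomic.Value          // of chan struct{}, created lazily, closed by first cancel call
   children map[canceler]struct{} // set to nil by the first cancel call
   err      error                 // set to non-nil by the first cancel call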
}

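Besides embedding Context, cancelCtx has the following four fields:

mu: the mutex that protects the three fields below.

done: an atomic.Value holding a chan struct{}. The channel is created lazily on the first call to Done() and is closed on the first cancel.

children: the set of cancelable child contexts registered with this cancelCtx. It is set to nil once the children have been canceled during the first cancel.

err: set to a non-nil value on the first cancel; it records why the context was canceled.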

WithCancel()

First, the constructor of cancelCtx:

func WithCancel(parent Context) (ctx Context, cancel CancelFunc) {
   if parent == nil {
      panic("cannot create context from nil parent")
   }
   c := newCancelCtx(parent)
   propagateCancel(parent, &c)
   return &c, func() { c.cancel(true, Canceled) }
}

As with valueCtx, a non-nil ctx must be passed in as the parent of the new cancelCtx.

Next, newCancelCtx():

func newCancelCtx(parent Context) cancelCtx {
   return cancelCtx{Context: parent} // a value rather than &cancelCtx{...}: odd for a newXXX function, but timerCtx embeds cancelCtx by value
}

All this does is create a cancelCtx whose embedded Context field is the parent ctx.

propagateCancel()

Then comes propagateCancel(), whose main job is to register the new cancelCtx in the children of a parent ctx:

// propagateCancel arranges for child to be canceled when parent is.
func propagateCancel(parent Context, child canceler) {
   done := parent.Done()
   if done == nil {
      return // parent is never canceled
   }

   select {
   case <-done:
      // parent is already canceled
      child.cancel(false, parent.Err())
      return
   default:
   }

   if p, ok := parentCancelCtx(parent); ok {
      p.mu.Lock()
      if p.err != nil {
         // parent has already been canceled
         child.cancel(false, p.err)
      } else {
         if p.children == nil {
            p.children = make(map[canceler]struct{})
         }
         p.children[child] = struct{}{}
      }
      p.mu.Unlock()
   } else {
      atomic.AddInt32(&goroutines, +1)
      go func() {
         select {
         case <-parent.Done():
            child.cancel(false, parent.Err())
         case <-child.Done():
         }
      }()
   }
}

Let's go through it step by step:

done := parent.Done()
if done == nil {
   return 
}

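First, parent.Done() fetches the parent's done channel. This probes whether the parent can be canceled at all: if it cannot (Done() returns nil), there is no point in registering the new cancelCtx as a child.

Note that "can the parent be canceled" is not only about the immediate parent; the cancelable context may be a grandparent or any ancestor further up. For example, take a chain in which ctx0 is a cancelCtx, ctx1 is a valueCtx derived from ctx0, and ctx2 is derived from ctx1.

Although ctx1, the parent of ctx2, is not itself a cancelable ctx, the Done() channel obtained through it is not nil.

This is because valueCtx does not implement Done(). A call to Done() on ctx1 is promoted to the embedded Context, i.e. ctx1.Context.Done(), which is ctx0.Done(); since ctx0 is a cancelCtx, the result is never nil.

So the done == nil check can be read as asking whether any ancestor on this branch of the context tree is cancelable.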
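
The promotion of Done() through a valueCtx can be verified by comparing channels. A minimal sketch, with sameDone and key as hypothetical names:

```go
package main

import (
	"context"
	"fmt"
)

type key string // hypothetical key type for this sketch

// sameDone reports whether a valueCtx derived from a cancelable context
// exposes exactly the same Done channel as that context.
func sameDone() bool {
	ctx0, cancel := context.WithCancel(context.Background())
	defer cancel()

	ctx1 := context.WithValue(ctx0, key("k"), "v") // not cancelable by itself

	return ctx1.Done() != nil && ctx1.Done() == ctx0.Done()
}

func main() {
	fmt.Println(sameDone())
}
```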

Next, inspect the done channel just obtained:

select {
case <-done:
   child.cancel(false, parent.Err())
   return
default:
}

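If the parent's channel is already closed, the select takes the case <-done: branch and the child is canceled immediately through its cancel() method (covered later), with the parent's error as the reason.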

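After these two steps we know that a cancelable ctx exists among the ancestors and that it has not been canceled yet, so the new ctx must be added to the children of that cancelable ancestor.

If the branch contains several cancelable ancestors, the one nearest to the new ctx has to be found: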

p, ok := parentCancelCtx(parent)

parentCancelCtx() (covered below) returns the nearest built-in cancelCtx.

If one is found:

p.mu.Lock()
if p.err != nil {
   child.cancel(false, p.err)
} else {
   if p.children == nil {
      p.children = make(map[canceler]struct{})
   }
   p.children[child] = struct{}{}
}
p.mu.Unlock()

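First check whether that parent has already been canceled; if so, the child cancels itself by calling cancel(). (The check is repeated here, under the lock, because the earlier check was made without it and another goroutine may have canceled the parent in the meantime.)

If the parent is still not canceled, the child is stored in the parent's children map, with the child as the key and an empty struct as the value.

If none is found: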

atomic.AddInt32(&goroutines, +1)
go func() {
   select {
   case <-parent.Done():
      child.cancel(false, parent.Err())
   case <-child.Done():
   }
}()

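In this case a dedicated goroutine watches both the parent and the child. If the parent is canceled first, the child cancels itself; if the child is canceled first, nothing needs to be done. The case <-child.Done() branch exists so that the goroutine can exit instead of blocking forever when the parent is never canceled.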
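
The goroutine branch is taken when the parent is a custom context with its own done channel. The sketch below uses a hypothetical customCtx type and errCustom error, neither of which appears in the original; it only shows that cancellation still reaches the child.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errCustom = errors.New("custom parent stopped") // hypothetical reason

// customCtx is a hypothetical context with its own done channel. It is not
// a built-in cancelCtx, so children derived from it are watched by a goroutine.
type customCtx struct {
	context.Context
	done chan struct{}
}

func (c *customCtx) Done() <-chan struct{} { return c.done }

// Err must be non-nil once done is closed, because the child is canceled
// with parent.Err() as its reason.
func (c *customCtx) Err() error {
	select {
	case <-c.done:
		return errCustom
	default:
		return nil
	}
}

// childFollowsCustomParent closes the custom parent's channel and waits
// (with a safety timeout) for the child to be canceled.
func childFollowsCustomParent() error {
	parent := &customCtx{Context: context.Background(), done: make(chan struct{})}
	child, cancel := context.WithCancel(parent)
	defer cancel()

	close(parent.done) // "cancel" the custom parent

	select {
	case <-child.Done():
		return child.Err()
	case <-time.After(time.Second):
		return errors.New("child was not canceled in time")
	}
}

func main() {
	fmt.Println(childFollowsCustomParent())
}
```

Unlike the children map path, this propagation is asynchronous, which is why the sketch waits on child.Done() rather than checking it immediately.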

parentCancelCtx()

Now let's see how a ctx finds its nearest parent cancelCtx:

func parentCancelCtx(parent Context) (*cancelCtx, bool) {
   done := parent.Done()
   if done == closedchan || done == nil {
      return nil, false
   }
   p, ok := parent.Value(&cancelCtxKey).(*cancelCtx)
   if !ok {
      return nil, false
   }
   pdone, _ := p.done.Load().(chan struct{})
   if pdone != done {
      return nil, false
   }
   return p, true
}

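The function first checks whether the parent's done channel is closedchan (the nearest cancelCtx has already been canceled) or nil (no cancelable ctx exists on this branch). If either condition holds, no cancelable parent is reported.

closedchan is a reusable package-level channel that is closed when the context package is initialized; it comes up again in cancel():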

var closedchan = make(chan struct{})

func init() {
   close(closedchan)
}

Next, look up the Value stored under the key &cancelCtxKey:

p, ok := parent.Value(&cancelCtxKey).(*cancelCtx)

cancelCtxKey and the corresponding Value() lookup are defined as follows:

var cancelCtxKey int

func (c *cancelCtx) Value(key interface{}) interface{} {
   if key == &cancelCtxKey {
      return c
   }
   return c.Context.Value(key)
}

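cancelCtx overrides Value() and intercepts the key &cancelCtxKey: when the key is the address of the package-level variable cancelCtxKey, it returns the cancelCtx itself.

Why does parent.Value(&cancelCtxKey) yield the nearest cancelable parent?

An example:

Suppose ctx4 is the cancelCtx being created, its parent ctx3 is a valueCtx, and the parent of ctx3 is ctx2, a cancelCtx. The parent passed to parentCancelCtx() is then ctx3.

The lookup for &cancelCtxKey starts at ctx3. ctx3 does not hold that key (cancelCtxKey is unexported, so a user key will practically never equal &cancelCtxKey), so Value() continues in its parent ctx2, as explained in the Value() section above. ctx2 is a cancelCtx and intercepts &cancelCtxKey by returning itself, so ctx2 is the cancelCtx nearest to ctx4.

If the lookup walks through all ancestors without finding a value for &cancelCtxKey, no nearest cancelCtx exists. This can only happen when the only cancelable contexts in the chain are custom ones.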
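
The lookup trick can be reproduced in a toy model outside the context package. Everything below (valuer, nearestKey, cancelNode, valueNode, nearestCancelNode, demo) is a hypothetical stand-in written for this sketch, not code from the standard library:

```go
package main

import "fmt"

// valuer is the only part of Context this toy model needs.
type valuer interface {
	Value(key interface{}) interface{}
}

// nearestKey plays the role of cancelCtxKey; only its address matters.
var nearestKey int

// root ends every chain, like emptyCtx.
type root struct{}

func (root) Value(interface{}) interface{} { return nil }

// cancelNode mimics cancelCtx: it answers &nearestKey with itself.
type cancelNode struct {
	valuer
	name string
}

func (c *cancelNode) Value(key interface{}) interface{} {
	if key == &nearestKey {
		return c
	}
	return c.valuer.Value(key)
}

// valueNode mimics valueCtx: it only knows its own key.
type valueNode struct {
	valuer
	key, val interface{}
}

func (v *valueNode) Value(key interface{}) interface{} {
	if v.key == key {
		return v.val
	}
	return v.valuer.Value(key)
}

// nearestCancelNode returns the name of the closest cancelNode at or above v.
func nearestCancelNode(v valuer) (string, bool) {
	c, ok := v.Value(&nearestKey).(*cancelNode)
	if !ok {
		return "", false
	}
	return c.name, true
}

// demo builds ctx0(cancel) -> ctx1(value) -> ctx2(cancel) -> ctx3(value)
// and asks ctx3 for its nearest cancelNode.
func demo() (string, bool) {
	ctx0 := &cancelNode{valuer: root{}, name: "ctx0"}
	ctx1 := &valueNode{valuer: ctx0, key: "a", val: 1}
	ctx2 := &cancelNode{valuer: ctx1, name: "ctx2"}
	ctx3 := &valueNode{valuer: ctx2, key: "b", val: 2}
	return nearestCancelNode(ctx3)
}

func main() {
	fmt.Println(demo())
}
```

Because each node only forwards unknown keys to its parent, the first cancelNode on the way up answers, and the more distant ctx0 is never reached.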

If a nearest cancelCtx p was found, the remaining logic is:

pdone, _ := p.done.Load().(chan struct{})
if pdone != done {
   return nil, false
}
return p, true

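The done channel of that cancelCtx is loaded and compared with the channel returned earlier by parent.Done():

If the two are not the same channel, the function also reports that no nearest cancelCtx was found. This case is special: a nearest cancelable ctx does exist, but it is a custom cancelable ctx. For example, let ctx4 again be the cancelCtx being created, with ctx3 as its parent.

Here ctx2, an ancestor of ctx3, is a custom cancelable ctx (call it a custom cancelCtx), and ctx0 is a built-in cancelCtx further up, so parent.Done() is in fact ctx2's done channel. The lookup for &cancelCtxKey, however, ends at ctx0, so p.done in the code above is ctx0's done channel.

That makes it clear: ctx2.done != ctx0.done

So it is not that no cancelable ctx was found; rather, the nearest cancelable ctx is a custom one (custom cancelCtx).

The context package is designed this way so that a custom implementation providing its own done channel is not bypassed: the child is then tied to parent.Done() through the watcher goroutine instead of being registered with the more distant cancelCtx.

Finally, if the two channels are the same, the nearest cancelable ctx has been found and it is a built-in cancelCtx.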

cancel()

The function returned when a cancelCtx is created already contains the cancel logic:

func WithCancel(parent Context) (ctx Context, cancel CancelFunc) {
   c := newCancelCtx(parent)
   propagateCancel(parent, &c)
   return &c, func() { c.cancel(true, Canceled) }
}

var Canceled = errors.New("context canceled")

The CancelFunc is returned to the caller; invoking it is equivalent to calling the unexported cancel() method inside the context package.

Now let's see what cancelCtx's cancel() actually does:

func (c *cancelCtx) cancel(removeFromParent bool, err error) {
   if err == nil {
      // a cancel reason (the error) is mandatory; otherwise panic
      panic("context: internal error: missing cancel error")
   }
   // the fields of cancelCtx are modified below; a context must be safe for concurrent use, so take the lock
   c.mu.Lock()
   if c.err != nil {
      // this ctx has already been canceled
      c.mu.Unlock()
      return
   }
   c.err = err
   d, _ := c.done.Load().(chan struct{})
   if d == nil {
      c.done.Store(closedchan)
   } else {
      close(d)
   }
   for child := range c.children {
      // NOTE: acquiring the child's lock while holding parent's lock.
      child.cancel(false, err)
   }
   c.children = nil
   c.mu.Unlock()

   if removeFromParent {
      removeChild(c.Context, c)
   }
}

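In summary, the steps are:

->lock

->record err

->close the done channel (or store closedchan if Done() was never called)

->iterate over the child contexts and cancel each of them

->set children to nil

->unlock

->if the ctx has to be removed from its parent, call removeChild()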
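
The effect of these steps is that cancellation flows downward through the tree but never upward. A minimal sketch using only the public API (canceled and cascade are hypothetical helper names):

```go
package main

import (
	"context"
	"fmt"
)

// canceled reports whether ctx has been canceled, without blocking.
func canceled(ctx context.Context) bool {
	select {
	case <-ctx.Done():
		return true
	default:
		return false
	}
}

// cascade checks both directions: downward is true if canceling a root also
// cancels its child and grandchild; upward is true if canceling a child
// also cancels its parent.
func cascade() (downward, upward bool) {
	root, cancelRoot := context.WithCancel(context.Background())
	child, cancelChild := context.WithCancel(root)
	defer cancelChild()
	grandchild, cancelGrandchild := context.WithCancel(child)
	defer cancelGrandchild()

	cancelRoot()
	downward = canceled(child) && canceled(grandchild)

	parent, cancelParent := context.WithCancel(context.Background())
	defer cancelParent()
	_, cancelKid := context.WithCancel(parent)
	cancelKid()
	upward = canceled(parent)
	return
}

func main() {
	fmt.Println(cascade())
}
```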

removeChild()

Finally, how a child ctx is removed from its parent:

func removeChild(parent Context, child canceler) {
   p, ok := parentCancelCtx(parent)
   if !ok {
      return
   }
   p.mu.Lock()
   if p.children != nil {
      delete(p.children, child)
   }
   p.mu.Unlock()
}

First find the nearest cancelCtx (only a built-in one qualifies, not a custom ctx).

If one is found, delete the child from its children map.

timerCtx:

First, the structure of timerCtx:

type timerCtx struct {
   cancelCtx
   timer *time.Timer // Under cancelCtx.mu.

   deadline time.Time
}

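As the struct shows, timerCtx is also a cancelable ctx (it is additionally canceled automatically when the deadline is reached). By embedding cancelCtx it reuses cancelCtx's Done and Err methods.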

WithDeadline()

A timerCtx can be created in two ways: WithTimeout() and WithDeadline().

WithTimeout() simply calls WithDeadline():

func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
   return WithDeadline(parent, time.Now().Add(timeout))
}

So only WithDeadline() needs to be examined:

func WithDeadline(parent Context, d time.Time) (Context, CancelFunc) {
   if parent == nil {
      panic("cannot create context from nil parent")
   }
   if cur, ok := parent.Deadline(); ok && cur.Before(d) {
      return WithCancel(parent)
   }
   c := &timerCtx{
      cancelCtx: newCancelCtx(parent),
      deadline:  d,
   }
   propagateCancel(parent, c)
   dur := time.Until(d)
   if dur <= 0 {
      c.cancel(true, DeadlineExceeded) // deadline has already passed
      return c, func() { c.cancel(false, Canceled) }
   }
   c.mu.Lock()
   defer c.mu.Unlock()
   if c.err == nil {
      c.timer = time.AfterFunc(dur, func() {
         c.cancel(true, DeadlineExceeded)
      })
   }
   return c, func() { c.cancel(true, Canceled) }
}

Breaking it down:

if cur, ok := parent.Deadline(); ok && cur.Before(d) {
   return WithCancel(parent)
}

If the parent already has a deadline and it is earlier than the requested one, the new deadline could never take effect, so a plain cancelCtx is created instead of a timerCtx.
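
From the caller's side this shows up as the child reporting the parent's deadline. A short sketch (effectiveDeadline is a hypothetical name):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// effectiveDeadline derives a child with a much later timeout than its
// parent and returns the deadlines both contexts report.
func effectiveDeadline() (parentDL, childDL time.Time) {
	parent, cancelParent := context.WithTimeout(context.Background(), time.Second)
	defer cancelParent()

	// Requested deadline is about one hour away, later than the parent's.
	child, cancelChild := context.WithTimeout(parent, time.Hour)
	defer cancelChild()

	parentDL, _ = parent.Deadline()
	childDL, _ = child.Deadline()
	return
}

func main() {
	p, c := effectiveDeadline()
	fmt.Println(p.Equal(c))
}
```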

Next:

c := &timerCtx{
   cancelCtx: newCancelCtx(parent),
   deadline:  d,
}
propagateCancel(parent, c)

First the timerCtx is created, with the parent ctx stored as c.cancelCtx.Context.

Then the timerCtx is registered in the children of its parent (see the propagateCancel() section under cancelCtx).

dur := time.Until(d)
if dur <= 0 {
   c.cancel(true, DeadlineExceeded) // deadline has already passed
   return c, func() { c.cancel(false, Canceled) }
}

If the deadline has already passed, timerCtx's cancel() is called right away with DeadlineExceeded (this is timerCtx's own override, covered below), and the already canceled ctx is returned.

Finally:

if c.err == nil {
   c.timer = time.AfterFunc(dur, func() {
      c.cancel(true, DeadlineExceeded)
   })
}
return c, func() { c.cancel(true, Canceled) }

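c.err here is really c.cancelCtx.err; nil means the ctx has not been canceled yet.

In that case a timer is set up that calls timerCtx's cancel() when the deadline arrives, and finally the ctx is returned.

Note that a CancelFunc is returned here as well, so a timerCtx can also be canceled manually. The value of ctx.Err() tells whether the ctx was canceled manually or by the timer:

"context deadline exceeded" (DeadlineExceeded): canceled by the timer

"context canceled" (Canceled): canceled manually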

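Rather than comparing message strings, callers can compare against the exported error values. The sketch below assumes two hypothetical helpers, reason and demo:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// reason classifies the state of ctx based on ctx.Err().
func reason(ctx context.Context) string {
	switch err := ctx.Err(); {
	case errors.Is(err, context.DeadlineExceeded):
		return "timer"
	case errors.Is(err, context.Canceled):
		return "manual"
	default:
		return "active"
	}
}

// demo returns the classification before cancel, after a manual cancel,
// and after a short timeout has fired.
func demo() (before, manual, timedOut string) {
	a, cancelA := context.WithTimeout(context.Background(), time.Hour)
	before = reason(a)
	cancelA()
	manual = reason(a)

	b, cancelB := context.WithTimeout(context.Background(), time.Millisecond)
	defer cancelB()
	<-b.Done() // wait for the timer to fire
	timedOut = reason(b)
	return
}

func main() {
	fmt.Println(demo())
}
```
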
cancel()

timerCtx does not use cancelCtx's cancel directly; it overrides cancel() as follows:

func (c *timerCtx) cancel(removeFromParent bool, err error) {
   c.cancelCtx.cancel(false, err)
   if removeFromParent {
      // Remove this timerCtx from its parent cancelCtx's children.
      removeChild(c.cancelCtx.Context, c)
   }
   c.mu.Lock()
   if c.timer != nil {
      c.timer.Stop()
      c.timer = nil
   }
   c.mu.Unlock()
}

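The method first calls the embedded cancelCtx's cancel, which cancels this ctx itself and all of its children. removeFromParent is passed as false here, because the parent's children map holds the timerCtx, not the embedded cancelCtx.

Next, depending on removeFromParent, the timerCtx is removed from its parent's children.

Finally the timer is stopped.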

Miscellaneous

Cancelable ctx:

"Cancelable ctx" here refers to context types that implement the following interface:

type canceler interface {
   cancel(removeFromParent bool, err error)
   Done() <-chan struct{}
}

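Any type that implements these two methods counts as a cancelable ctx (duck typing). Note, however, that cancel is unexported, so only types inside the context package can actually satisfy canceler.

So the cancelable contexts are:

cancelCtx

timerCtx

Custom contexts that provide their own Done channel. They are cancelable in the broader sense but cannot implement canceler; contexts derived from them go through the goroutine branch of propagateCancel().

children of a cancelable ctx

Both cancelCtx and timerCtx have children. The map holds the cancelCtx and timerCtx descendants for which this ctx is the nearest built-in cancelable ancestor.

A custom cancelable ctx does not need a children map, but it still has to make sure its descendants are canceled when it is. For contexts derived through the context package, closing the channel returned by its Done() and returning a non-nil Err() is enough; anything beyond that is up to the implementation.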
