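Preface

I have recently been studying how Go's locking primitives are implemented under the hood. The design has some really clever parts, and I have picked up a few insights along the way that I want to write down, so I will take it one piece at a time.

I am skipping the prerequisite theory (data races, critical sections, critical resources, semaphores). None of it is especially hard, and if you are interested you can look it up yourself.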

Mutex

Let's start with the mutex's data structure:

// A Mutex is a mutual exclusion lock.
// The zero value for a Mutex is an unlocked mutex.
//
// A Mutex must not be copied after first use.
type Mutex struct {
   state int32
   sema  uint32
}

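As you can see, the structure is extremely simple: an int32 that holds the lock state and a uint32 that serves as the semaphore.

The state field packs together whether the mutex is held, whether a waiter has been woken, whether the mutex is in starvation mode, and how many goroutines are blocked on it. Let's look at how all of that fits into a single 32-bit word.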

const (
   mutexLocked = 1 << iota // mutex is locked
   mutexWoken
   mutexStarving
   mutexWaiterShift = iota

   // Mutex fairness.
   //
   // Mutex can be in 2 modes of operations: normal and starvation.
   // In normal mode waiters are queued in FIFO order, but a woken up waiter
   // does not own the mutex and competes with new arriving goroutines over
   // the ownership. New arriving goroutines have an advantage -- they are
   // already running on CPU and there can be lots of them, so a woken up
   // waiter has good chances of losing. In such case it is queued at front
   // of the wait queue. If a waiter fails to acquire the mutex for more than 1ms,
   // it switches mutex to the starvation mode.
   //
   // In starvation mode ownership of the mutex is directly handed off from
   // the unlocking goroutine to the waiter at the front of the queue.
   // New arriving goroutines don't try to acquire the mutex even if it appears
   // to be unlocked, and don't try to spin. Instead they queue themselves at
   // the tail of the wait queue.
   //
   // If a waiter receives ownership of the mutex and sees that either
   // (1) it is the last waiter in the queue, or (2) it waited for less than 1 ms,
   // it switches mutex back to normal operation mode.
   //
   // Normal mode has considerably better performance as a goroutine can acquire
   // a mutex several times in a row even if there are blocked waiters.
   // Starvation mode is important to prevent pathological cases of tail latency.
   starvationThresholdNs = 1e6
)

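mutexLocked: the lowest bit of state. 0 means the mutex is free, 1 means it is held; a goroutine that fails to get the lock blocks until an unlock releases the semaphore and wakes it;
mutexWoken: the second bit of state. 0 means no waiter has been woken, 1 means a goroutine is already awake (woken or spinning), so Unlock does not need to wake another one;
mutexStarving: the third bit of state. 0 means the mutex is in normal mode, 1 means it has entered starvation mode;
mutexWaiterShift: its value is 3, the number of bits the waiter count is shifted by. The number of goroutines blocked waiting for the lock is stored in the remaining 29 bits, i.e. state >> mutexWaiterShift;
starvationThresholdNs: 1e6, in nanoseconds, i.e. one millisecond. It is the waiting-time threshold after which a blocked goroutine switches the mutex into starvation mode; details follow below;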
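To make the bit layout concrete, here is a minimal sketch that unpacks a raw state word into its four logical fields. The constants are copied from the listing above; the mutexInfo type and the decodeState helper are names I made up for this illustration and do not exist in the sync package.

```go
package main

import "fmt"

// Same layout as the constants quoted from sync/mutex.go above.
const (
	mutexLocked      = 1 << iota // bit 0
	mutexWoken                   // bit 1
	mutexStarving                // bit 2
	mutexWaiterShift = iota      // 3: the waiter count lives in bits 3..31
)

// mutexInfo is a hypothetical type used only for this illustration.
type mutexInfo struct {
	Locked, Woken, Starving bool
	Waiters                 int32
}

// decodeState unpacks a raw state word into its four logical fields.
func decodeState(state int32) mutexInfo {
	return mutexInfo{
		Locked:   state&mutexLocked != 0,
		Woken:    state&mutexWoken != 0,
		Starving: state&mutexStarving != 0,
		Waiters:  state >> mutexWaiterShift,
	}
}

func main() {
	// A mutex that is held, in starvation mode, with two blocked waiters.
	state := int32(mutexLocked | mutexStarving | 2<<mutexWaiterShift)
	fmt.Printf("%d -> %+v\n", state, decodeState(state))
}
```

Because the waiter count sits above the three flag bits, adding a waiter is just state += 1 << mutexWaiterShift, which is exactly what lockSlow does.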

Lock Fairness

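A mutex operates in one of two modes: 1. normal mode; 2. starvation mode.

In normal mode, waiting goroutines queue up in FIFO order. When the lock is released, however, the waiter that gets woken does not own the mutex; it has to compete with newly arriving goroutines. The new arrivals have a big advantage: they are already running on a CPU, and there may be many of them, so a waiter that has just been woken is quite likely to lose the race. Left unchecked, this causes goroutine starvation, where a waiter goes a long time without getting the resource it needs.

Go's mutex solves this by providing a starvation mode.

As mentioned above, in normal mode newly arriving goroutines and woken waiters compete for the lock together. Once a waiter has failed to acquire the lock for more than 1 millisecond, it switches the mutex into starvation mode.

In starvation mode, when the lock is released its ownership is handed directly to the waiter at the head of the queue. Newly arriving goroutines neither spin nor try to grab the lock, even if it looks unlocked; they go straight to the tail of the wait queue.

When a waiter receives the lock and sees that either 1) it waited for less than 1 millisecond, or 2) it is the last goroutine in the wait queue, it switches the mutex back to normal mode.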

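Each mode has its own trade-offs:

Normal mode performs considerably better, because a goroutine that is already running on a CPU is likely to get the lock and can even acquire it several times in a row;
Starvation mode provides fairness: every goroutine eventually gets its turn, which prevents pathological tail latency;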

Using a mutex

Now for the main event: how a mutex is locked and unlocked.

Using a mutex in Go is very concise:

lock := sync.Mutex{}
lock.Lock()
ids += strconv.Itoa(int(id)) + ","
lock.Unlock()

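Step 1: declare a mutex;

Step 2: call Lock() when entering the critical section; if the lock is not available, the goroutine blocks until another goroutine releases it;

Step 3: call Unlock() when leaving the critical section, which wakes a blocked goroutine if there is one;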

Lock flow

Here is the implementation of the Lock method:

// Lock locks m.
// If the lock is already in use, the calling goroutine
// blocks until the mutex is available.
func (m *Mutex) Lock() {
   // Fast path: grab unlocked mutex.
   if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
      if race.Enabled {
         race.Acquire(unsafe.Pointer(m))
      }
      return
   }
   // Slow path (outlined so that the fast path can be inlined)
   m.lockSlow()
}

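CompareAndSwapInt32, which Lock calls, is an atomic operation: in a single atomic step it compares a variable against an expected value and, only if they are equal, stores a new value.

Here it compares m.state with 0 and, if m.state == 0, sets m.state to mutexLocked.

In other words, m.state being 0 means the mutex 1) is not held, 2) has no woken waiter, 3) is not in starvation mode, and 4) has no blocked goroutines. In that state the calling goroutine can take the lock directly, so the state is set to locked and Lock returns immediately.

If the CAS fails (m.state != 0), the goroutine did not get the lock and moves on to lockSlow, which handles spinning and the acquire loop.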
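The fast path is easy to try out in isolation. The sketch below is a toy, not the real Mutex: tryFastPath is a hypothetical helper that performs only the CAS from Lock and reports whether it succeeded.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	mutexLocked      = 1 // bit 0, as in sync.Mutex
	mutexWaiterShift = 3 // the waiter count starts at bit 3
)

// tryFastPath performs the same CAS as the first line of Lock:
// it succeeds only when the whole state word is 0.
func tryFastPath(state *int32) bool {
	return atomic.CompareAndSwapInt32(state, 0, mutexLocked)
}

func main() {
	var state int32
	fmt.Println(tryFastPath(&state)) // true: the state was 0, now it is mutexLocked
	fmt.Println(tryFastPath(&state)) // false: already locked

	// Unlocked, but one waiter is registered: the state is not 0,
	// so the fast path fails and the real Lock would fall into lockSlow.
	state = 1 << mutexWaiterShift
	fmt.Println(tryFastPath(&state)) // false
}
```

The last case is worth noticing: the fast path requires the entire word to be zero, not just the locked bit.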

func (m *Mutex) lockSlow() {
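   // First, initialize some per-goroutine bookkeeping
   // waitStartTime: when this goroutine started waiting for the lock
   var waitStartTime int64
   // whether this goroutine is starving
   starving := false
   // whether this goroutine has been woken
   awoke := false
   // number of spin iterations so far
   iter := 0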
   old := m.state
   for {
      // Don't spin in starvation mode, ownership is handed off to waiters
      // so we won't be able to acquire the mutex anyway.
      
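      // In starvation mode goroutines do not spin; ownership goes straight to the waiter at the head of the queue
      // runtime_canSpin allows spinning only if:
      // 1. this goroutine has spun fewer than 4 times
      // 2. the machine has more than one CPU
      // 3. GOMAXPROCS > 1 and at least one other P is running
      // 4. the current P's local run queue is empty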
      if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
         // Active spinning makes sense.
         // Try to set mutexWoken flag to inform Unlock
         // to not wake other blocked goroutines.
         
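         // While spinning, if the woken flag is not set, this goroutine has not set it itself, and other goroutines are waiting, set mutexWoken on the lock and mark this goroutine as awoke
         // Setting mutexWoken tells Unlock not to wake any other blocked goroutine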
         if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
            atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
            awoke = true
         }
         
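         // runtime_doSpin calls procyield, a small assembly routine (not a system call) that executes the CPU PAUSE instruction in a loop on x86, so the goroutine busy-waits on the CPU instead of being parked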
         runtime_doSpin()
         iter++
         old = m.state
         continue
      }
      new := old
      // Don't try to acquire starving mutex, new arriving goroutines must queue.
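      // If the mutex is not in starvation mode, try to set the locked bit
      if old&mutexStarving == 0 {
         new |= mutexLocked
      }
      // If the mutex is already locked or in starvation mode, this goroutine will have to wait, so increment the waiter count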
      if old&(mutexLocked|mutexStarving) != 0 {
         new += 1 << mutexWaiterShift
      }
      // The current goroutine switches mutex to starvation mode.
      // But if the mutex is currently unlocked, don't do the switch.
      // Unlock expects that starving mutex has waiters, which will not
      // be true in this case.
      // If this goroutine is starving and the mutex is currently locked, switch the mutex to starvation mode
      if starving && old&mutexLocked != 0 {
         new |= mutexStarving
      }
      if awoke {
         // The goroutine has been woken from sleep,
         // so we need to reset the flag in either case.
         if new&mutexWoken == 0 {
            throw("sync: inconsistent mutex state")
         }
         
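         // Reaching here means this goroutine is awake. It will either 1) acquire the lock or 2) fail and block again, so in both cases the woken flag is cleared in the new state, which lets the next Unlock wake a waiter from the queue.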
         new &^= mutexWoken
      }
      
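      // Commit the new state. If the CAS succeeds, no other goroutine changed the state in the meantime: this goroutine either took the lock right away or goes on to wait on the semaphore. In starvation mode it is eventually handed the lock; in normal mode, once woken it goes back to spinning and competing. If the CAS fails, another goroutine changed the state, so the acquire loop runs again.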
      if atomic.CompareAndSwapInt32(&m.state, old, new) {
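         // If the old state was neither locked nor starving, the CAS itself acquired the lock, so exit the loop.
         // This is what spinning is for: wait briefly each round, see whether the lock has been released, and grab it directly if so.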
         if old&(mutexLocked|mutexStarving) == 0 {
            break // locked the mutex with CAS
         }
         // If we were already waiting before, queue at the front of the queue.
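         // Reaching here means the goroutine did not get the lock and has to wait for an unlock to wake it.
         // waitStartTime != 0 means this goroutine was already waiting and has been woken before, so queueLifo is true and it is queued at the front of the wait queue
         // waitStartTime == 0 means the goroutine is new, so it is queued at the tail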
         queueLifo := waitStartTime != 0
         if waitStartTime == 0 {
            // record when this goroutine started waiting for the lock
            waitStartTime = runtime_nanotime()
         }
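         // Block until an unlock releases the semaphore; the waiter at the head of the queue is the one woken
         runtime_SemacquireMutex(&m.sema, queueLifo, 1)
         // After being woken, decide whether this goroutine is now starving:
         // it is if it has been waiting for more than 1ms
         starving = starving || runtime_nanotime()-waitStartTime > starvationThresholdNs
         // Reload the lock state, since other goroutines may have changed it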
         old = m.state
         if old&mutexStarving != 0 {
            // If this goroutine was woken and mutex is in starvation mode,
            // ownership was handed off to us but mutex is in somewhat
            // inconsistent state: mutexLocked is not set and we are still
            // accounted as waiter. Fix that.
            if old&(mutexLocked|mutexWoken) != 0 || old>>mutexWaiterShift == 0 {
               throw("sync: inconsistent mutex state")
            }
            delta := int32(mutexLocked - 1<<mutexWaiterShift)
            
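            // This is where starvation mode is exited: if the goroutine that was handed the lock is not starving (it waited less than 1ms), or it is the last one in the wait queue, the mutex leaves starvation mode.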
            if !starving || old>>mutexWaiterShift == 1 {
               // Exit starvation mode.
               // Critical to do it here and consider wait time.
               // Starvation mode is so inefficient, that two goroutines
               // can go lock-step infinitely once they switch mutex
               // to starvation mode.
               delta -= mutexStarving
            }
            // update the lock state
            atomic.AddInt32(&m.state, delta)
            break
         }
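         // In normal mode a goroutine can only take the lock while it is free. Reaching here means it was woken but does not hold the lock yet, so reset the spin counter and go back to spinning and competing.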
         awoke = true
         iter = 0
      } else {
         // The CAS failed, so another goroutine changed the state; reload it and run the acquire loop again.
         old = m.state
      }
   }

   if race.Enabled {
      race.Acquire(unsafe.Pointer(m))
   }
}

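Put simply, locking a mutex falls into roughly three cases:

No contention: the lock is taken directly with a single CAS.
Contention: the goroutine spins, and if the lock is free after a round of spinning, it takes it directly.
Contention, with spinning no longer allowed: depending on waitStartTime, the goroutine is placed at the head or the tail of the wait queue. If the mutex is in starvation mode, unlocking hands ownership straight to the waiter at the head of the queue; otherwise the woken goroutine starts spinning and competing for the lock again.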
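The starvation-mode handoff at the end of lockSlow is the least obvious piece of bit arithmetic, so here it is as a worked example. The constants and the delta computation are copied from the listing; afterHandoff is a hypothetical wrapper I added so the arithmetic can be run on plain integers instead of a live mutex.

```go
package main

import "fmt"

const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

// afterHandoff returns the state a woken waiter leaves behind after being
// handed a starving mutex. `starving` is the waiter's own starving flag.
func afterHandoff(old int32, starving bool) int32 {
	// Set the locked bit and remove this goroutine from the waiter count
	// in one addition: +1 for mutexLocked, -8 for one waiter.
	delta := int32(mutexLocked - 1<<mutexWaiterShift)
	if !starving || old>>mutexWaiterShift == 1 {
		// Not starving, or last waiter: also clear the starving bit.
		delta -= mutexStarving
	}
	return old + delta
}

func main() {
	// Starving mutex, lock bit clear (it is mid-handoff), three waiters.
	old := int32(mutexStarving | 3<<mutexWaiterShift)
	fmt.Printf("%05b -> %05b (waiter still starving)\n", old, afterHandoff(old, true))
	fmt.Printf("%05b -> %05b (waiter not starving)\n", old, afterHandoff(old, false))
}
```

In the first case the result is locked, still starving, with two waiters left; in the second the starving bit is dropped as well, so the mutex is back in normal mode.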


Unlock flow

Now that we have seen how a mutex is locked, let's look at how it is unlocked:

// Unlock unlocks m.
// It is a run-time error if m is not locked on entry to Unlock.
//
// A locked Mutex is not associated with a particular goroutine.
// It is allowed for one goroutine to lock a Mutex and then
// arrange for another goroutine to unlock it.
func (m *Mutex) Unlock() {
   if race.Enabled {
      _ = m.state
      race.Release(unsafe.Pointer(m))
   }

   // Fast path: drop lock bit.
   // If the state is just "locked" (no waiters, not starving, no woken goroutine), dropping the lock bit finishes the unlock and we return; otherwise unlockSlow has to run
   new := atomic.AddInt32(&m.state, -mutexLocked)
   if new != 0 {
      // Outlined slow path to allow inlining the fast path.
      // To hide unlockSlow during tracing we skip one extra frame when tracing GoUnblock.
      m.unlockSlow(new)
   }
}

unlockSlow mainly handles waking waiters in normal mode and in starvation mode.

func (m *Mutex) unlockSlow(new int32) {
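   // If the mutex was not locked before the decrement, the caller is unlocking an unlocked mutex; this is reported with throw, a fatal error that cannot be recovered like an ordinary panic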
   if (new+mutexLocked)&mutexLocked == 0 {
      throw("sync: unlock of unlocked mutex")
   }
   // Normal mode takes this branch
   if new&mutexStarving == 0 {
      old := new
      for {
         // If there are no waiters or a goroutine has already
         // been woken or grabbed the lock, no need to wake anyone.
         // In starvation mode ownership is directly handed off from unlocking
         // goroutine to the next waiter. We are not part of this chain,
         // since we did not observe mutexStarving when we unlocked the mutex above.
         // So get off the way.
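         // If there are no waiters, or the mutex has already been locked again, a goroutine has already been woken, or the mutex has turned starving, there is no need to wake anyone: return and finish the unlock.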
         if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken|mutexStarving) != 0 {
            return
         }
         // Grab the right to wake someone.
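         // Reaching here means this normal-mode mutex has to wake a waiter
         // Build the new state: waiter count minus 1, with the woken flag set
         new = (old - 1<<mutexWaiterShift) | mutexWoken
         // If the CAS succeeds, release the semaphore to wake one waiter (the one at the head of the queue, not a random one), then return and finish the unlock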
         if atomic.CompareAndSwapInt32(&m.state, old, new) {
            runtime_Semrelease(&m.sema, false, 1)
            return
         }
         old = m.state
      }
   } else {
      // Starving mode: handoff mutex ownership to the next waiter, and yield
      // our time slice so that the next waiter can start to run immediately.
      // Note: mutexLocked is not set, the waiter will set it after wakeup.
      // But mutex is still considered locked if mutexStarving is set,
      // so new coming goroutines won't acquire it.
      // In starvation mode, just hand the lock off directly to the waiter at the head of the queue
      runtime_Semrelease(&m.sema, true, 1)
   }
}

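Unlocking is fairly simple and comes down to two cases:

If the lock can simply be dropped, drop it; nobody needs to be woken because there are no waiters and the woken flag is not set;
Otherwise, check the state: in starvation mode, hand the lock directly to the waiter at the head of the queue; in normal mode, decide whether a waiter needs to be woken and, if so, wake the one at the head of the queue, and the flow ends.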
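The branches of Unlock and unlockSlow can be condensed into one pure function over the state word. This is a single-pass model under my own simplifications: unlockAction and the action strings are hypothetical, and the CAS retry loop of the real normal-mode branch is left out.

```go
package main

import "fmt"

const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

// Hypothetical labels for the possible outcomes of an unlock.
const (
	actFatal   = "fatal: unlock of unlocked mutex"
	actFast    = "fast path: nobody to wake"
	actHandoff = "starvation mode: hand off to head waiter"
	actNoWake  = "normal mode: no wake needed"
	actWake    = "normal mode: wake one waiter"
)

// unlockAction reports what unlocking would do for a given state,
// where state is the value observed just before Unlock is called.
func unlockAction(state int32) string {
	next := state - mutexLocked // what atomic.AddInt32 would return
	if (next+mutexLocked)&mutexLocked == 0 {
		return actFatal // the locked bit was not set to begin with
	}
	if next == 0 {
		return actFast
	}
	if next&mutexStarving != 0 {
		return actHandoff
	}
	if next>>mutexWaiterShift == 0 || next&(mutexLocked|mutexWoken|mutexStarving) != 0 {
		return actNoWake
	}
	return actWake
}

func main() {
	fmt.Println(unlockAction(mutexLocked))
	fmt.Println(unlockAction(mutexLocked | 1<<mutexWaiterShift))
	fmt.Println(unlockAction(mutexLocked | mutexWoken | 1<<mutexWaiterShift))
	fmt.Println(unlockAction(mutexLocked | mutexStarving | 1<<mutexWaiterShift))
}
```

The third call shows why spinning goroutines set mutexWoken: with that bit set, Unlock skips the wake-up entirely because someone is already awake and about to take the lock.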

That is the low-level logic behind locking and unlocking Go's mutex.
