Go: Deadlocks, Data Races, and Goroutine Leaks


When writing concurrent programs in Go, you commonly run into three problems: deadlocks, data races, and goroutine leaks.

Deadlocks

Deadlocks are generally considered easy to detect. For example:

package main

import (
	"fmt"
	"log"
)

func main() {
	ch := make(chan int) // unbuffered channel
	ch <- 7              // main blocks here forever: nobody is receiving
	if n, ok := <-ch; ok {
		fmt.Println(n)
	} else {
		log.Fatal("read failed")
	}
	}
}

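Running the program fails with "fatal error: all goroutines are asleep - deadlock!". The reason is that ch is unbuffered, so ch <- 7 blocks the main goroutine until another goroutine receives the value. No other goroutine exists, so main can never continue. The simple fix is to change ch <- 7 to: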

go func() {
	ch <- 7
}()

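For this kind of problem, work out which goroutine is blocked, where it is blocked, and what condition would unblock it. Once you know that, the fix follows.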
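Putting the fix together, a complete version of the program might look like the sketch below. The helper name sendAndReceive is hypothetical and exists only to keep the example self-contained. The send runs in its own goroutine, so the receive in the calling goroutine can pair with it.

```go
package main

import "fmt"

// sendAndReceive is a hypothetical helper (not from the original program).
// It sends on an unbuffered channel from a separate goroutine, so the
// receive below has a partner and the calling goroutine does not deadlock.
func sendAndReceive(v int) int {
	ch := make(chan int)
	go func() {
		ch <- v // blocks only this goroutine until the receive runs
	}()
	return <-ch
}

func main() {
	fmt.Println(sendAndReceive(7)) // prints 7
}
```

Making the channel buffered, for example make(chan int, 1), would also let the send complete without a second goroutine.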

Data races

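A data race occurs when two goroutines write the same data concurrently, or when one goroutine reads data while another writes it, without any synchronization. Data races are a major problem in production: they usually stay hidden under light traffic and only surface under heavy load.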

Consider the following code:

...
func main() {
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			counter++ // unsynchronized read-modify-write: data race
			wg.Done()
		}()
	}
	wg.Wait()
	fmt.Println(counter)
}

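counter++ is not an atomic operation. It consists of three steps: load the value from memory into the CPU, compute the result in the CPU, and write the result back to memory. When goroutines interleave these steps, updates get lost, so the output may well be less than 1000. To fix this, you can use a lock, or use the atomic operations that Go provides in sync/atomic.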

Using a lock

func main() {
	counter := 0
	var wg sync.WaitGroup
	lk := sync.Mutex{}

	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			lk.Lock()
			counter++
			lk.Unlock()
			wg.Done()
		}()
	}
	wg.Wait()
	fmt.Println(counter)
}

Using the sync/atomic package

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var counter int32 = 0
	var wg sync.WaitGroup

	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			atomic.AddInt32(&counter, 1)
			wg.Done()
		}()
	}
	wg.Wait()
	fmt.Println(counter)
}
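Since Go 1.19, sync/atomic also provides typed values such as atomic.Int32, which remove the need to pass pointers to the package-level functions. The sketch below shows the same counter written that way, assuming Go 1.19 or later. The function name countWithAtomicType is made up for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countWithAtomicType is a hypothetical helper: it starts n goroutines that
// each increment a shared atomic.Int32 once, then returns the final value.
func countWithAtomicType(n int) int32 {
	var counter atomic.Int32 // zero value is ready to use (Go 1.19+)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.Add(1)
		}()
	}
	wg.Wait()
	return counter.Load()
}

func main() {
	fmt.Println(countWithAtomicType(1000)) // prints 1000
}
```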

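A brief look at some of the sync/atomic functions. Only the int32 and int64 variants are shown here; the package also has equivalents for uint32, uint64, uintptr, and pointers.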

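AddInt32, AddInt64: addition (used the most); returns the new value.
CompareAndSwapInt32, CompareAndSwapInt64: compare-and-swap. If the variable's current value equals the second argument, it is set to the third argument; the return value reports whether the swap happened.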

success := atomic.CompareAndSwapInt32(&counter, 0, 1)

LoadInt32, LoadInt64: atomically read the current value.

value := atomic.LoadInt32(&counter)

StoreInt32, StoreInt64: atomically write a new value.

atomic.StoreInt32(&counter, 1)

SwapInt32, SwapInt64: atomically write a new value and return the old one.

oldValue := atomic.SwapInt32(&counter, 1)
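To see how these functions behave together, here is a small single-goroutine sketch that applies each of them to one int32 and reports the intermediate results. The function name atomicTour and the specific values are made up for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// atomicTour is a hypothetical helper that runs each operation once on a
// single int32 and returns the intermediate results in order.
func atomicTour() (added int32, swapped bool, loaded int32, old int32, final int32) {
	var counter int32 // starts at 0

	// counter: 0 -> 5; AddInt32 returns the new value.
	added = atomic.AddInt32(&counter, 5)
	// counter equals 5, so it is replaced with 10 and true is returned.
	swapped = atomic.CompareAndSwapInt32(&counter, 5, 10)
	// Reads the current value, 10.
	loaded = atomic.LoadInt32(&counter)
	// counter: 10 -> 20.
	atomic.StoreInt32(&counter, 20)
	// counter: 20 -> 30; SwapInt32 returns the old value, 20.
	old = atomic.SwapInt32(&counter, 30)
	// Reads the current value, 30.
	final = atomic.LoadInt32(&counter)
	return
}

func main() {
	fmt.Println(atomicTour()) // prints: 5 true 10 20 30
}
```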

Goroutine leaks

A rough way to check for a leak is to look at the slope at which the number of goroutines grows. The code is as follows:

package main

import (
	"fmt"
	"log"
	"runtime"
	"time"
)

func consumer(ch <-chan int) {
	fmt.Println("consumer ")
	data, _ := <-ch
	println(data)
}

func DetectLeak() {
	fmt.Println("DetectLeak")
	record := make([]int, 0)
	// Use the slope to decide whether there is a leak

	for {
		record = append(record, runtime.NumGoroutine())
		time.Sleep(time.Second)
		if len(record) > 5 {
			// if record[len(record)-1]-record[len(record)-2] > 0 {
			// 	fmt.Println("leak")
			// 	break
			// } This approach is wrong: we need to detect a rise in the overall slope

			currentNumofGoroutine := float32(record[len(record)-1])
			if currentNumofGoroutine/float32(len(record)) > 0.5 {
				log.Fatal("leak Happened")
			}
		}

	}
}

func main() {
	ch := make(chan int)
	go DetectLeak()
	for i := 0; i < 100; i++ {
		time.Sleep(time.Second)
		go consumer(ch)
	}
}
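The check above divides the latest goroutine count by the number of samples, which is only a rough proxy for growth. A slightly more direct sketch is to compute the average change per sample between the first and last readings. The function name growthPerSample and the sample values are assumptions for illustration and are not part of the original detector.

```go
package main

import "fmt"

// growthPerSample is a hypothetical helper: it returns the average change
// in goroutine count per sampling interval, using the first and last
// samples. It returns 0 when fewer than two samples are available.
func growthPerSample(samples []int) float64 {
	if len(samples) < 2 {
		return 0
	}
	first, last := samples[0], samples[len(samples)-1]
	return float64(last-first) / float64(len(samples)-1)
}

func main() {
	// Counts like those runtime.NumGoroutine() might report once per second.
	// The values are made up for this example.
	leaking := []int{2, 3, 4, 5, 6, 7}
	steady := []int{4, 5, 4, 4, 5, 4}

	fmt.Println(growthPerSample(leaking)) // prints 1
	fmt.Println(growthPerSample(steady))  // prints 0
}
```

A detector could compare this value against a threshold of its choosing. Like the original, it is only a heuristic: a program that legitimately starts more goroutines over time would also show a positive slope.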

We can also detect leaks with the leaktest library. All it takes is:

import "github.com/fortytw2/leaktest"

If leaktest is not installed, the terminal prints the command needed to install it; just copy and run it.

File layout for the official leaktest usage example: leak_test.go and go.mod

Note that the test file name must end with _test.go, the test function name must start with Test, and its only parameter must be t *testing.T. For example:

// Default "Check" will poll for 5 seconds to check that all
// goroutines are cleaned up
func TestPool(t *testing.T) {
    defer leaktest.Check(t)()

    go func() {
        for {
            time.Sleep(time.Second)
        }
    }()
}

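Note that leaktest.Check polls for up to 5 seconds by default, waiting for goroutines started during the test to exit. Any that are still running after that are reported as leaked. Here the goroutine sleeps in an endless loop, so it never exits. Run go test in the terminal to run this test; it reports a goroutine leak.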

Here is a modified version of the example:

func TestPool(t *testing.T) {
	defer leaktest.Check(t)()

	// ch := make(chan int)
	count := 0
	fmt.Println("start")
	go func() {
		for {
			count++
			time.Sleep(1 * time.Second)
			fmt.Println(count)
		}
	}()
}

Output:

start
1
2
3
4
5
--- FAIL: TestPool (5.05s)
    leaktest.go:132: leaktest: timed out checking goroutines
    leaktest.go:150: leaktest: leaked goroutine: goroutine 20 [sleep]:
...

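Next, define a function that only reads from a channel. Nothing is ever written to the channel, so it waits forever. This is a leak: the goroutine never terminates and stays blocked, and the garbage collector does not reclaim it.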

package main

import (
	"fmt"
	"testing"

	"github.com/fortytw2/leaktest"
)

func consumer(ch <-chan int) {
	fmt.Println("consumer ")
	data, _ := <-ch
	fmt.Println(data)
}

func TestPool(t *testing.T) {
	defer leaktest.Check(t)()

	ch := make(chan int)
	go func() {
		consumer(ch)
	}()
}

Result:

consumer 
--- FAIL: TestPool (5.06s)
    leaktest.go:132: leaktest: timed out checking goroutines
    leaktest.go:150: leaktest: leaked goroutine: goroutine 20 [chan receive]:
        Test.consumer(0x0?)
                C:/Users/饿了没/Desktop/Go/Test/leak_test.go:12 +0x6f
        Test.TestPool.func1()
                C:/Users/饿了没/Desktop/Go/Test/leak_test.go:21 +0x1d
        created by Test.TestPool
                C:/Users/饿了没/Desktop/Go/Test/leak_test.go:20 +0xa5
FAIL
exit status 1
FAIL    Test    5.339s

After 5 seconds the goroutine is found to be still blocked, and a leak error is reported.
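One way to fix this kind of leak is to give the blocked goroutine a way out, for example by closing the channel once no more data will arrive. The sketch below is an assumption about how such a consumer could be repaired and is not taken from the original test. The names consumeUntilClosed and runConsumer are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// consumeUntilClosed is a hypothetical variant of consumer: it drains the
// channel and returns as soon as the channel is closed, so it cannot stay
// blocked forever.
func consumeUntilClosed(ch <-chan int) int {
	sum := 0
	for v := range ch { // the loop ends when ch is closed
		sum += v
	}
	return sum
}

// runConsumer starts the consumer, sends the given values, closes the
// channel, and waits for the goroutine to finish before returning its result.
func runConsumer(values ...int) int {
	ch := make(chan int)
	var wg sync.WaitGroup
	var result int

	wg.Add(1)
	go func() {
		defer wg.Done()
		result = consumeUntilClosed(ch)
	}()

	for _, v := range values {
		ch <- v
	}
	close(ch) // signals the consumer that no more data is coming
	wg.Wait() // the goroutine has exited; nothing is left behind
	return result
}

func main() {
	fmt.Println(runConsumer())        // prints 0: no data, but the consumer still returns
	fmt.Println(runConsumer(1, 2, 3)) // prints 6
}
```

In a leaktest-based test, the same idea would be to call close(ch) before the test function returns, so that the deferred check finds no leftover goroutine.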