一、性能考察
- CPU利用率/负载
- 内存的消耗
- 当前协程数量
- GC的次数、频率
二、常见的问题
- 代码执行时间过长
- 内存占用过高【错误的用法】
- 意外阻塞【锁竞争、协程阻塞】
三、数据采集手段
-
net/http/pprof : 适用于Web环境,直接通过访问http请求获取数据
-
runtime/pprof : 代码中埋点记录监控数据
-
运行基准测试阶段:测试阶段生成采样数据
使用的分析工具
go tool pprof
案例:
一、使用基准测试分析内存使用情况
案例中写了两个常见字符串连接的方式“+”,“Sprintf”
func stringAdd() (str string) {
for i := 0; i < 20000; i++ {
str += strconv.Itoa(i)
}
return
}
func stringAddSprintf() (str string) {
for i := 0; i < 20000; i++ {
str = fmt.Sprintf("%s%d", str, i)
}
return
}
func BenchmarkStrAdd(b *testing.B) {
b.ResetTimer()
for i := 0; i < b.N; i++ {
_ = stringAdd()
}
b.StopTimer()
}
func BenchmarkStrAddSprintf(b *testing.B) {
b.ResetTimer()
for i := 0; i < b.N; i++ {
_ = stringAddSprintf()
}
b.StopTimer()
}
1、 首先收集内存数据
go test -bench=. -run=none -memprofile mem.out
goos: windowsgoarch: amd64pkg: github.com/oyjjpp/blog/action/performanceBenchmarkStrAdd-4 7 160143057 ns/opBenchmarkStrAddSprintf-4 5 242425080 ns/opPASSok github.com/oyjjpp/blog/action/performance 5.867s
$ lltotal 3265-rw-r--r-- 1 ouyangjun 197121 1886 4月 30 22:37 index.go-rw-r--r-- 1 ouyangjun 197121 146 4月 30 23:07 mem.out-rw-r--r-- 1 ouyangjun 197121 546 4月 30 23:06 performace_test.go-rwxr-xr-x 1 ouyangjun 197121 3334144 4月 30 23:07 performance.test.exe*
2、通过go tool pprof 分析生成的文件,查看当前内存占用情况
$ go tool pprof mem.out
Type: alloc_space
Time: Apr 30, 2020 at 11:16pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
// 选择内存消耗前十条的数据
(pprof) top
Showing nodes accounting for 27571.31MB, 99.88% of 27603.17MB total
Dropped 19 nodes (cum <= 138.02MB)
Showing top 10 nodes out of 14
flat flat% sum% cum cum%
11750.42MB 42.57% 42.57% 11750.92MB 42.57% github.com/oyjjpp/blog/action/performance.stringAdd
11029.21MB 39.96% 82.53% 15849.75MB 57.42% fmt.Sprintf
4789.67MB 17.35% 99.88% 4789.67MB 17.35% fmt.(*buffer).writeString
2MB 0.0072% 99.88% 15851.75MB 57.43% github.com/oyjjpp/blog/action/performance.stringAddV2
0 0% 99.88% 4789.67MB 17.35% fmt.(*fmt).fmtS
0 0% 99.88% 4789.67MB 17.35% fmt.(*fmt).padString
0 0% 99.88% 4803.03MB 17.40% fmt.(*pp).doPrintf
0 0% 99.88% 4789.67MB 17.35% fmt.(*pp).fmtString
0 0% 99.88% 4803.03MB 17.40% fmt.(*pp).printArg
0 0% 99.88% 11750.92MB 42.57% github.com/oyjjpp/blog/action/performance.BenchmarkStrAdd
(pprof) list stringAdd
Total: 26.96GB
ROUTINE ======================== github.com/oyjjpp/blog/action/performance.stringAdd in D:\golang\blog\action\performance\index.go
11.48GB 11.48GB (flat, cum) 42.57% of Total
. . 64: }
. . 65:}
. . 66:
. . 67:func stringAdd() (str string) {
. . 68: for i := 0; i < 20000; i++ {
11.48GB 11.48GB 69: str += strconv.Itoa(i)
. . 70: }
. . 71: return
. . 72:}
. . 73:
. . 74:func stringAddV2() (str string) {
ROUTINE ======================== github.com/oyjjpp/blog/action/performance.stringAddV2 in D:\golang\blog\action\performance\index.go
2MB 15.48GB (flat, cum) 57.43% of Total
. . 71: return
. . 72:}
. . 73:
. . 74:func stringAddV2() (str string) {
. . 75: for i := 0; i < 20000; i++ {
2MB 15.48GB 76: str = fmt.Sprintf("%s%d", str, i)
. . 77: }
. . 78: return
. . 79:}
. . 80:
. . 81:func stringAddV3() string {
还可以通过图形页面查看调用栈信息,需要事先安装graphviz,通过界面浏览;
3、通过top、list两条命令很明显看到了内存消耗的情况及代码调用的位置,常见Golang中字符串拼接使用“+”,因为Golang字符串是不可编辑的,每次拼接都相当于新创建一个字符串,对于频繁拼接字符串的情况我们还可以使用bytes.Buffer、strings.Builder,借助这种可变大小字节缓冲的方式进行拼接;
4、接下来我们采用bytes.Buffer,strings.Builder对代码进行优化
func stringAddBuffer() string {
var str bytes.Buffer
for i := 0; i < 20000; i++ {
str.WriteString(strconv.Itoa(i))
}
return str.String()
}
func stringAddBuilder() string {
var str strings.Builder
for i := 0; i < 20000; i++ {
str.WriteString(strconv.Itoa(i))
}
return str.String()
}
5、重新采集数据
$ go test -bench=. -run=none -memprofile mem.out
goos: windows
goarch: amd64
pkg: github.com/oyjjpp/blog/action/performance
BenchmarkStrAddBuffer-4 1090 1073421 ns/op
BenchmarkStrAddBuilder-4 1224 986927 ns/op
PASS
ok github.com/oyjjpp/blog/action/performance 3.073s
$ go tool pprof mem.out
Type: alloc_space
Time: May 1, 2020 at 12:16am (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 1084.51MB, 100% of 1084.51MB total
Showing top 10 nodes out of 14
flat flat% sum% cum cum%
490.03MB 45.18% 45.18% 490.03MB 45.18% strings.(*Builder).WriteString
306.18MB 28.23% 73.42% 306.18MB 28.23% bytes.makeSlice
198.50MB 18.30% 91.72% 198.50MB 18.30% strconv.formatBits
89.79MB 8.28% 100% 89.79MB 8.28% bytes.(*Buffer).String
0 0% 100% 306.18MB 28.23% bytes.(*Buffer).WriteString
0 0% 100% 306.18MB 28.23% bytes.(*Buffer).grow
0 0% 100% 493.48MB 45.50% github.com/oyjjpp/blog/action/performance.BenchmarkStrAddBuffer
0 0% 100% 591.03MB 54.50% github.com/oyjjpp/blog/action/performance.BenchmarkStrAddBuilder
0 0% 100% 493.48MB 45.50% github.com/oyjjpp/blog/action/performance.stringAddBuffer
0 0% 100% 591.03MB 54.50% github.com/oyjjpp/blog/action/performance.stringAddBuilder
6、通过分析优化后的代码明显看到申请的内存在下降,平时在写单元测试的时候如果发现运行时间较长的可以试着进行一些性能数据采集,进行分析,查看是否出现了大量内存申请的情况。
二、使用net/http/pprof分析CPU使用情况
1、源码
func CPUData(ctx *gin.Context) { fmt.Println(ctx.Request.URL) fib := 0 for i := 0; i < 5000; i++ { fib = Fib(20) fmt.Println("fib = ", fib) } str := RandomStr(RandomInt(100, 500)) str = fmt.Sprintf("Fib = %d; String = %s", fib, str) ctx.JSON(http.StatusOK, str)}
2、收集数据
首先通过wrk压测跑30秒程序
wrk -t12 -c400 -d30s http://domain/blog/performance/cpu\_data
3、通过go tool pprof进行在线分析CPU使用情况
(pprof) $ go tool pprof http://47.98.161.8:8080/debug/pprof/profile
Fetching profile over HTTP from http://47.98.161.8:8080/debug/pprof/profile
Saved profile in C:\Users\ouyangjun\pprof\pprof.main.samples.cpu.005.pb.gz
File: main
Type: cpu
Time: May 5, 2020 at 11:16pm (CST)
Duration: 30s, Total samples = 11.59s (38.63%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 11.40s, 98.36% of 11.59s total
Dropped 28 nodes (cum <= 0.06s)
Showing top 10 nodes out of 17
flat flat% sum% cum cum%
10.70s 92.32% 92.32% 10.70s 92.32% github.com/oyjjpp/blog/action/performance.Fib
0.64s 5.52% 97.84% 0.67s 5.78% syscall.Syscall
0.03s 0.26% 98.10% 0.07s 0.6% fmt.(*pp).printArg
0.03s 0.26% 98.36% 0.70s 6.04% syscall.write
0 0% 98.36% 0.07s 0.6% fmt.(*pp).doPrintln
0 0% 98.36% 0.87s 7.51% fmt.Fprintln
0 0% 98.36% 0.87s 7.51% fmt.Println
0 0% 98.36% 11.59s 100% github.com/gin-gonic/gin.(*Context).Next
0 0% 98.36% 11.59s 100% github.com/gin-gonic/gin.(*Engine).ServeHTTP
0 0% 98.36% 11.59s 100% github.com/gin-gonic/gin.(*Engine).handleHTTPRequest
(pprof) top5
Showing nodes accounting for 11.40s, 98.36% of 11.59s total
Dropped 28 nodes (cum <= 0.06s)
Showing top 5 nodes out of 17
flat flat% sum% cum cum%
10.70s 92.32% 92.32% 10.70s 92.32% github.com/oyjjpp/blog/action/performance.Fib
0.64s 5.52% 97.84% 0.67s 5.78% syscall.Syscall
0.03s 0.26% 98.10% 0.07s 0.6% fmt.(*pp).printArg
0.03s 0.26% 98.36% 0.70s 6.04% syscall.write
0 0% 98.36% 0.07s 0.6% fmt.(*pp).doPrintln
(pprof) list Fib
Total: 11.59s
ROUTINE ======================== github.com/oyjjpp/blog/action/performance.Fib in \data\project\blog\action\performance\util.go
10.70s 17.04s (flat, cum) 147.02% of Total
. . 1:package performance
. . 2:
3.93s 3.93s 3:func Fib(n int) int {
. . 4: switch n {
1.14s 1.14s 5: case 0:
. . 6: return 0
230ms 230ms 7: case 1:
230ms 230ms 8: return 1
360ms 360ms 9: case 2:
450ms 450ms 10: return 2
. . 11: default:
4.36s 10.70s 12: return Fib(n-1) + Fib(n-2)
. . 13: }
. . 14:}
(pprof)
通过查看以上分析结果,消耗CPU的主要是Fib函数
接下来会逐步根据CPU、内存、GC、协程、阻塞等各个点进行分析