Profiling with pprof

pprof is Go's built-in profiling tool. It helps pinpoint CPU hot spots, memory leaks, blocked goroutines, and similar problems.

Collecting profiles

Method 1: HTTP endpoint (recommended for services)

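import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the pprof handlers on http.DefaultServeMux
)

func main() {
    // Expose pprof on a separate, local-only port (never expose it to the public internet!)
    go func() {
        // ListenAndServe only returns on failure, so log the error
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // Your service code
    startServer()
}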

// Endpoints:
// http://localhost:6060/debug/pprof/
// http://localhost:6060/debug/pprof/heap
// http://localhost:6060/debug/pprof/goroutine
// http://localhost:6060/debug/pprof/profile?seconds=30
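
The blank import above attaches the handlers to http.DefaultServeMux. As a minimal sketch, assuming the public server uses its own mux, the same handlers can instead be registered on a dedicated mux. The helper names newPprofMux, servePprof, and status are hypothetical; the pprof.Index, pprof.Cmdline, pprof.Profile, pprof.Symbol, and pprof.Trace handlers come from net/http/pprof.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newPprofMux returns a mux that serves only the pprof endpoints.
// Note: importing net/http/pprof still registers the same handlers on
// http.DefaultServeMux, so the public server should not use that mux.
func newPprofMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index) // also serves heap, goroutine, block, ...
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// servePprof blocks while serving the pprof mux on addr (hypothetical helper).
// In a service it would typically run in its own goroutine, for example
// go servePprof("localhost:6060").
func servePprof(addr string) error {
	return http.ListenAndServe(addr, newPprofMux())
}

// status sends an in-memory GET request to h and returns the status code.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newPprofMux()
	fmt.Println("index:", status(mux, "/debug/pprof/"))
	fmt.Println("heap:", status(mux, "/debug/pprof/heap"))
	fmt.Println("other:", status(mux, "/not-pprof"))
}
```

main only exercises the mux in memory so the example terminates; a real service would call servePprof instead.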

Method 2: collect directly in code

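import (
    "log"
    "os"
    "runtime"
    "runtime/pprof"
)

// CPU profile
f, err := os.Create("cpu.prof")
if err != nil {
    log.Fatal(err)
}
defer f.Close()
if err := pprof.StartCPUProfile(f); err != nil {
    log.Fatal(err)
}
defer pprof.StopCPUProfile()

// Run the code you want to profile...

// Heap profile
f2, err := os.Create("mem.prof")
if err != nil {
    log.Fatal(err)
}
defer f2.Close()
runtime.GC() // run a GC first so the heap statistics are up to date
if err := pprof.WriteHeapProfile(f2); err != nil {
    log.Fatal(err)
}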
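
The CPU part of the snippet above can be wrapped in a helper so that the profile is always stopped and the file is always closed. This is a minimal sketch, not a canonical pattern; profileCPU, busyWork, and fileSize are hypothetical names, and the output path is an arbitrary choice.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime/pprof"
)

// profileCPU records a CPU profile to path while fn runs.
func profileCPU(path string, fn func()) (err error) {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	// Close the file last and report a close error if nothing else failed.
	defer func() {
		if cerr := f.Close(); err == nil {
			err = cerr
		}
	}()
	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	defer pprof.StopCPUProfile() // runs before the close above and flushes the profile
	fn()
	return nil
}

// busyWork burns some CPU so the profile has something to sample.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// fileSize returns the size of the file at path, or -1 if it cannot be read.
func fileSize(path string) int64 {
	info, err := os.Stat(path)
	if err != nil {
		return -1
	}
	return info.Size()
}

func main() {
	path := filepath.Join(os.TempDir(), "cpu.prof")
	if err := profileCPU(path, func() { busyWork(50_000_000) }); err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Println("wrote", path, "bytes:", fileSize(path))
}
```

The resulting file can be opened with go tool pprof in the same way as a profile fetched over HTTP.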

Method 3: collect during tests

bash

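# CPU profile (the profile flags accept only one package at a time, so use . rather than ./...)
go test -cpuprofile=cpu.prof -bench=BenchmarkMyFunc .

# Memory profile
go test -memprofile=mem.prof -bench=BenchmarkMyFunc .

# Analyze the result
go tool pprof cpu.prof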

Analyzing CPU hot spots

bash

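# Collect a 30-second CPU profile (quote the URL so the shell does not interpret "?")
go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"

# Commands at the interactive prompt
(pprof) top10          # the 10 functions using the most CPU
(pprof) top10 -cum     # sort by cumulative time
(pprof) list myFunc    # line-by-line CPU time for a function
(pprof) web            # open the call graph in a browser (requires graphviz)
(pprof) png            # write the call graph as a PNG image (requires graphviz)

# Open the web UI, which includes a flame graph (recommended)
go tool pprof -http=:8081 "http://localhost:6060/debug/pprof/profile?seconds=30"
# Open http://localhost:8081 in a browser and choose Flame Graph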

Analyzing memory

bash

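# Fetch a heap profile
go tool pprof http://localhost:6060/debug/pprof/heap

(pprof) top10                         # functions holding the most in-use memory
(pprof) sample_index=inuse_objects    # switch to counting live objects
(pprof) top10                         # now sorted by number of objects
(pprof) list myFunc                   # line-by-line allocations

# Compare two points in time (to find a memory leak)
curl -o before.prof http://localhost:6060/debug/pprof/heap
# ... let the service run for a while ...
curl -o after.prof http://localhost:6060/debug/pprof/heap

go tool pprof -base before.prof after.prof
(pprof) top10  # shows the difference between the two profiles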
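
The same before/after comparison can be approximated inside the process. The sketch below is illustrative only: heapSnapshot, liveHeapBytes, leak, and the retained slice are hypothetical names, and the 16 MiB of retained buffers stands in for a real leak. It uses pprof.Lookup("heap") for the snapshot and runtime.ReadMemStats for a rough live-heap figure.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// retained stands in for a data structure that keeps growing.
var retained [][]byte

// leak appends n buffers of 1 MiB each to retained.
func leak(n int) {
	for i := 0; i < n; i++ {
		retained = append(retained, make([]byte, 1<<20))
	}
}

// heapSnapshot returns the current heap profile in pprof's binary format,
// the same data that /debug/pprof/heap serves.
func heapSnapshot() ([]byte, error) {
	runtime.GC() // bring the heap statistics up to date
	var buf bytes.Buffer
	if err := pprof.Lookup("heap").WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// liveHeapBytes reports the bytes of heap objects still allocated after a GC.
func liveHeapBytes() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

func main() {
	before := liveHeapBytes()
	leak(16)
	after := liveHeapBytes()
	fmt.Printf("live heap grew by about %d MiB\n", (int64(after)-int64(before))>>20)

	snap, err := heapSnapshot()
	if err != nil {
		fmt.Println("snapshot failed:", err)
		return
	}
	fmt.Println("heap profile size in bytes:", len(snap))
}
```

Writing two such snapshots to files at different times gives the same inputs that go tool pprof -base expects.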

Analyzing goroutines

bash

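# Fetch the goroutine profile (the stacks of all goroutines)
go tool pprof http://localhost:6060/debug/pprof/goroutine

(pprof) top10    # functions with the most goroutines
(pprof) traces   # print the full stack of every goroutine

# View directly as text (quote the URL so the shell does not interpret "?")
curl "http://localhost:6060/debug/pprof/goroutine?debug=2"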
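
A goroutine leak usually shows up as a count that keeps climbing. The following sketch reads that count from the same goroutine profile that the HTTP endpoint serves; startBlocked and goroutineCount are hypothetical helpers, and the five blocked goroutines are an arbitrary stand-in for a real leak.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// startBlocked launches n goroutines that block until stop is closed,
// standing in for goroutines stuck on a channel that nobody writes to.
func startBlocked(n int) (stop chan struct{}) {
	stop = make(chan struct{})
	for i := 0; i < n; i++ {
		go func() { <-stop }()
	}
	return stop
}

// goroutineCount returns the number of goroutines in the goroutine profile.
func goroutineCount() int {
	return pprof.Lookup("goroutine").Count()
}

func main() {
	before := goroutineCount()
	stop := startBlocked(5)
	defer close(stop)
	fmt.Println("goroutines added:", goroutineCount()-before)

	// Debug level 1 groups identical stacks with a count, like ?debug=1 over HTTP.
	if err := pprof.Lookup("goroutine").WriteTo(os.Stdout, 1); err != nil {
		fmt.Fprintln(os.Stderr, "dump failed:", err)
	}
}
```

In the text dump, the blocked goroutines appear as one entry with a count, which is often the quickest way to spot where they pile up.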

Analyzing blocking and lock contention

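// Collection must be enabled first; both profiles are off by default
runtime.SetBlockProfileRate(1)     // record every blocking event
runtime.SetMutexProfileFraction(1) // record every mutex contention event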

bash

# Blocking profile
go tool pprof http://localhost:6060/debug/pprof/block

# Mutex contention profile
go tool pprof http://localhost:6060/debug/pprof/mutex
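
To see what these two profiles capture, the sketch below enables both, creates one mutex wait and one channel wait, and then reads the number of recorded stacks. The helpers contend and profileCounts are hypothetical, and the 50 ms sleep is an assumption that gives both goroutines time to block; it is a demonstration, not a benchmark.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

// contend makes one goroutine wait on a held mutex and another wait on an
// empty channel, so both profiles have something to record.
func contend() {
	var mu sync.Mutex
	var wg sync.WaitGroup
	ch := make(chan struct{})

	mu.Lock()
	wg.Add(2)
	go func() { // waits for the mutex held by the caller
		defer wg.Done()
		mu.Lock()
		mu.Unlock()
	}()
	go func() { // waits for the channel to be closed
		defer wg.Done()
		<-ch
	}()

	time.Sleep(50 * time.Millisecond) // give both goroutines time to block
	mu.Unlock()
	close(ch)
	wg.Wait()
}

// profileCounts enables both profiles, runs fn, and returns the number of
// stack records in the block and mutex profiles.
func profileCounts(fn func()) (block, mutex int) {
	runtime.SetBlockProfileRate(1)
	runtime.SetMutexProfileFraction(1)
	defer runtime.SetBlockProfileRate(0)
	defer runtime.SetMutexProfileFraction(0)
	fn()
	return pprof.Lookup("block").Count(), pprof.Lookup("mutex").Count()
}

func main() {
	block, mutex := profileCounts(contend)
	fmt.Println("block profile records:", block)
	fmt.Println("mutex profile records:", mutex)
}
```

A rate of 1 records every event and adds overhead, so a larger value may be a better fit for a busy production service.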

Hands-on: finding a memory leak

// Simulated memory leak
var cache = make(map[string][]byte)
var mu sync.Mutex

func leakyHandler(w http.ResponseWriter, r *http.Request) {
    key := r.URL.Query().Get("key")
    mu.Lock()
    cache[key] = make([]byte, 1024*1024)  // 1 MB per key, never released
    mu.Unlock()
    w.Write([]byte("ok"))
}

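// Steps:
// 1. Start the service with pprof enabled
// 2. Send a large number of requests with distinct keys
// 3. Collect a heap profile: go tool pprof -http=:8081 http://localhost:6060/debug/pprof/heap
// 4. Open the Flame Graph view and find the function that allocates the most
// 5. Check top and find leakyHandler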
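
Once the profile points at the unbounded map, one possible fix is to cap the number of entries. The sketch below is an assumption about how such a cap might look, not part of the original handler: boundedCache, newBoundedCache, and the first-in-first-out eviction policy are all hypothetical choices.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedCache keeps at most limit entries and evicts the oldest key first.
type boundedCache struct {
	mu    sync.Mutex
	limit int
	order []string // keys in insertion order, oldest first
	data  map[string][]byte
}

// newBoundedCache returns a cache holding at most limit entries (minimum 1).
func newBoundedCache(limit int) *boundedCache {
	if limit < 1 {
		limit = 1
	}
	return &boundedCache{limit: limit, data: make(map[string][]byte)}
}

// Put stores val under key, evicting the oldest entry when the cache is full.
func (c *boundedCache) Put(key string, val []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.data[key]; !exists {
		if len(c.order) >= c.limit {
			oldest := c.order[0]
			c.order = c.order[1:]
			delete(c.data, oldest)
		}
		c.order = append(c.order, key)
	}
	c.data[key] = val
}

// Get returns the value stored under key, if any.
func (c *boundedCache) Get(key string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	val, ok := c.data[key]
	return val, ok
}

// Len returns the number of entries currently held.
func (c *boundedCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.data)
}

func main() {
	cache := newBoundedCache(10)
	for i := 0; i < 100; i++ {
		cache.Put(fmt.Sprintf("key-%d", i), make([]byte, 1024))
	}
	fmt.Println("entries kept:", cache.Len()) // prints: entries kept: 10
}
```

After a change like this, repeating the heap comparison should show the handler's in-use memory levelling off instead of growing with every new key.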

Hands-on: finding a CPU hot spot

// Inefficient example
func slowStringConcat(n int) string {
    result := ""
    for i := 0; i < n; i++ {
        result += fmt.Sprintf("%d", i)  // allocates a new string on every iteration
    }
    return result
}

// Optimized version
func fastStringConcat(n int) string {
    var sb strings.Builder
    for i := 0; i < n; i++ {
        fmt.Fprintf(&sb, "%d", i)
    }
    return sb.String()
}

// In a pprof profile, slowStringConcat shows far more CPU time than fastStringConcat
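
Before comparing profiles, it is worth confirming that the optimized version returns the same string. A small check, reusing the two functions from above in a self-contained program:

```go
package main

import (
	"fmt"
	"strings"
)

// slowStringConcat builds the result with repeated string concatenation.
func slowStringConcat(n int) string {
	result := ""
	for i := 0; i < n; i++ {
		result += fmt.Sprintf("%d", i)
	}
	return result
}

// fastStringConcat builds the same result with a strings.Builder.
func fastStringConcat(n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		fmt.Fprintf(&sb, "%d", i)
	}
	return sb.String()
}

func main() {
	slow, fast := slowStringConcat(5), fastStringConcat(5)
	fmt.Println(slow, slow == fast) // prints: 01234 true
}
```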

Benchmarks + pprof

func BenchmarkSlowConcat(b *testing.B) {
    for i := 0; i < b.N; i++ {
        slowStringConcat(1000)
    }
}

func BenchmarkFastConcat(b *testing.B) {
    for i := 0; i < b.N; i++ {
        fastStringConcat(1000)
    }
}

bash

# Run the benchmark and write CPU and memory profiles
go test -bench=BenchmarkSlowConcat -cpuprofile=slow.prof -memprofile=slow_mem.prof

# Analyze in the web UI
go tool pprof -http=:8081 slow.prof
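
For a quick allocation count without opening a profile, testing.AllocsPerRun from the standard library can be called directly. The sketch below repeats the two functions so it is self-contained; allocsPerCall is a hypothetical helper, and the exact counts depend on the Go version, so only their relative size is meaningful.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// slowStringConcat builds the result with repeated string concatenation.
func slowStringConcat(n int) string {
	result := ""
	for i := 0; i < n; i++ {
		result += fmt.Sprintf("%d", i)
	}
	return result
}

// fastStringConcat builds the same result with a strings.Builder.
func fastStringConcat(n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		fmt.Fprintf(&sb, "%d", i)
	}
	return sb.String()
}

// allocsPerCall reports the average number of heap allocations per call of fn(n).
func allocsPerCall(fn func(int) string, n int) float64 {
	return testing.AllocsPerRun(10, func() { _ = fn(n) })
}

func main() {
	fmt.Printf("slow: %.0f allocs per call\n", allocsPerCall(slowStringConcat, 1000))
	fmt.Printf("fast: %.0f allocs per call\n", allocsPerCall(fastStringConcat, 1000))
}
```

Inside a benchmark, calling b.ReportAllocs() or passing -benchmem to go test reports the same kind of figure alongside the timing.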

Profiling workflow

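1. Use go test -bench first to confirm that there is a performance problem
2. Use pprof to locate the hot functions
3. Use the list command to find the specific hot lines
4. After optimizing, rerun the benchmark to verify the improvement
5. Don't optimize prematurely: make the code correct first, then make it fast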
