I have a Go binary that I’m trying to profile, and I’m getting surprising results. The code has the following (truncated) in main.go, and the rest of the code is in the package monte:
package main
import (
"monte"
"runtime/pprof"
)
var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")
func main() {
flag.Parse()
if *cpuprofile != "" {
f, err := os.Create(*cpuprofile)
if err != nil {
log.Fatal(err)
}
pprof.StartCPUProfile(f)
}
monte.ExpensiveOperation()
pprof.StopCPUProfile()
}
I build my executable with go build src/main.go, then run it with ./main -cpuprofile=monte.prof. When I examine the output with go tool pprof main monte.prof, I get the following output:
(pprof) top10 --cum
Total: 346 samples
280 80.9% 80.9% 343 99.1% time.Time.Format
0 0.0% 80.9% 341 98.6% runtime.interhash
0 0.0% 80.9% 341 98.6% runtime.main
0 0.0% 80.9% 251 72.5% strconv.Unquote
13 3.8% 84.7% 31 9.0% strconv.roundShortest
11 3.2% 87.9% 18 5.2% strconv.fmtE
9 2.6% 90.5% 9 2.6% runtime.markallocated
1 0.3% 90.8% 8 2.3% math/rand.Float64
2 0.6% 91.3% 8 2.3% runtime.FixAlloc_Free
7 2.0% 93.4% 8 2.3% time.nextStdChunk
The function that has the largest cumulative time is time.Time.Format, which seems wrong to me (shouldn’t it be main?) and there’s no mention of monte at all, despite the fact that the “expensive operation” takes about 10 seconds to complete, plenty of time for the sampler to see it. If I pass the --focus=monte flag to go tool pprof, no samples are shown at all. I assume I’m missing some flag somewhere; does anyone have any ideas? Thanks!
It looks like a CPU-only profiler, so if your
ExpensiveOperationtakes time by doing I/O or sleep or something like that, it will be invisible.(That’s the problem with “cpu” profilers.)
In terms of what the numbers mean, it looks like there were 346 total samples.
It’s not too surprising that the numbers are a bit squishy, depending on how the profiler works, but if it were a true stack sampler the numbers would mean this:
341/346 samples had
mainandinterhashon the stack. You would expect all samples to have main on the stack, but that’s the squishy part.343/346 samples had
Formaton the stack. (Why there’s more of those thanmain, who knows?)251/346 samples had
Unquoteon the stack. So of those 251 samples, they also probably hadmain,interhash, andFormaton the stack.With this kind of detective work, you can slowly begin to piece together what the samples are telling you.
Of course, if you could actually see the stack samples, you wouldn’t have to see very many of them before you would know exactly what’s going on.
More about all that.