I couldn’t really get an answer to this question, so I’ll attempt to write a custom, although simple, profiler. Just to get started: suppose I need to find out, without recompiling, which core is running my code and how much of that core it uses. Suppose also I’d like to catch when a given function is executed. Finally, any thoughts about dealing with threads? Any other tips on how to start? C is my language of choice, and I’m running Linux. Thanks.
Edit: Oprofile, CallGrind, Helgrind, gprof, papi, tau, and others I’ve analyzed seem not to match my needs.
You should try Linux’s perf: https://perf.wiki.kernel.org/index.php/Tutorial
This tool has direct support from the kernel and knows about page faults, CPU migrations, and context switches (e.g. look at the output of perf stat). These stats can be aggregated per process or per CPU. perf record can be used like oprofile.
For adding your own simple profiling you can use setitimer (the sampling signal is process-wide) or timer_create (the timer signal can be delivered to a specific thread). You can’t directly get information about the physical CPU number used by a thread, but at every sample you can read per-thread run times with getrusage and RUSAGE_THREAD.
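As a rough illustration of the setitimer plus getrusage approach, here is a minimal sketch, assuming Linux with glibc. The helper names (start_sampling, stop_sampling, thread_cpu_seconds, burn_cpu) are made up for this example and are not part of any library. The signal handler only bumps a counter, because getrusage is not on the POSIX list of async-signal-safe functions; the CPU time is read from normal code instead.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* exposes RUSAGE_THREAD on glibc */
#endif
#include <signal.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/time.h>

#ifndef RUSAGE_THREAD
#define RUSAGE_THREAD 1          /* Linux value; fallback if _GNU_SOURCE came too late */
#endif

/* Number of SIGPROF samples seen so far (process-wide). */
static volatile sig_atomic_t sample_count = 0;

/* Keep the handler trivial: only async-signal-safe work is allowed here. */
static void on_sigprof(int signo)
{
    (void)signo;
    sample_count++;
}

/* Hypothetical helper: deliver SIGPROF every interval_usec of consumed CPU time. */
static int start_sampling(long interval_usec)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigprof;
    sa.sa_flags = SA_RESTART;    /* restart interrupted system calls */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0)
        return -1;

    struct itimerval it;
    it.it_interval.tv_sec = interval_usec / 1000000;
    it.it_interval.tv_usec = interval_usec % 1000000;
    it.it_value = it.it_interval;
    return setitimer(ITIMER_PROF, &it, NULL);
}

/* Hypothetical helper: disarm the profiling timer. */
static int stop_sampling(void)
{
    struct itimerval it;
    memset(&it, 0, sizeof it);
    return setitimer(ITIMER_PROF, &it, NULL);
}

/* User + system CPU time consumed by the calling thread, in seconds. */
static double thread_cpu_seconds(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1.0;
    return (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6
         + (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
}

/* Stand-in workload: spin until at least min_samples samples have arrived. */
static double burn_cpu(int min_samples)
{
    volatile double x = 0.0;
    while (sample_count < min_samples) {
        for (int i = 0; i < 100000; i++)
            x += i * 0.5;
    }
    return x;
}
```

A caller would run start_sampling(10000) for a 10 ms interval, do its work, then compare sample_count with thread_cpu_seconds() before calling stop_sampling(). For per-thread signals, timer_create would replace setitimer here; that variant is not shown.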