I am writing a C++ benchmarking program, which involves timing a number of function

Question

Asked: May 23, 2026 (2026-05-23T16:54:03+00:00)

I am writing a C++ benchmarking program that times a number of function calls. The functions are called repeatedly, and each time is recorded for later statistical analysis. The functions must run simultaneously on multiple threads, so to keep the benchmark accurate and fair it runs on a real-time OS with controlled scheduling behavior. My concerns are:

Are there deterministic ways of collecting the timing data? I have looked at printf and stringstream but neither seems to have deterministic behavior due to memory & buffer operations. They also do not perform in O(1) for the same reason, am I right? Currently I am using a large char array and a custom strcat function so that each time value can be collected in O(1). This array is then printed at the end of the test, when all data has been collected.

I am using clock_gettime for timings and clock_getres gives me a resolution of 1ns. Can this value be trusted?

Am I doing things right so far, and are there any other issues that I should be aware of when writing the benchmark?

1 Answer

Editorial Team · Answer 1 · 2026-05-23T16:54:04+00:00

Calling high-frequency timers and writing samples into an output stream is a perfectly sensible way to get performance data. But there are a few tricky gotchas to be careful of.

Indeed you shouldn’t use printf and stringstream — not only because their execution time is variable and poorly defined, but also because they’re just darn slow, especially if you’re formatting your perf data into strings every microsecond! It’s much better to write binary data into a preallocated buffer, like an array of structures, and then format them later after your test is done. That will be faster and give you a more consistent write overhead.
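clock_gettime with a high-resolution clock should be reliable if the person who wrote your kernel wasn’t a dunce. Pick the clock to match what you want to measure: CLOCK_MONOTONIC for elapsed wall-clock time, or CLOCK_THREAD_CPUTIME_ID for CPU time consumed by the calling thread. Be careful with CLOCK_PROCESS_CPUTIME_ID in a multithreaded benchmark, because it reports CPU time summed over every thread in the process. Also note that clock_getres reports the clock’s nominal resolution, not the cost or jitter of the call itself, so a reported 1 ns does not mean your measurements are accurate to 1 ns. You can look into the Performance Application Programming Interface (PAPI) library if you want to query the CPU counters directly, but that shouldn’t be necessary.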
Multithreading can be inherently chaotic (in the determinism sense) because the threads are fighting each other for CPU cache and memory bandwidth. You can get stochastically varying results depending on whether simultaneously scheduled threads happen to be touching the same memory, or are evicting each other’s work from data caches all the time — and this will vary from run to run depending on exactly how the data is laid out in memory and which threads are running. But that’s fine: lots of processes in engineering are stochastic. Just run your benchmark many times and get a statistically significant average and deviation for your perf numbers.
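
To make the first two points concrete, here is a minimal sketch of recording binary samples into a preallocated buffer and formatting them only after the test. It assumes a POSIX system with clock_gettime and CLOCK_MONOTONIC; the names Sample, SampleBuffer, now_ns, time_call, smallest_observed_step_ns and dump are hypothetical and not from the question. It also assumes one buffer per thread, so that threads never share a write index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <time.h>
#include <vector>

// Hypothetical record type: one fixed-size binary sample per timed call.
struct Sample {
    std::uint32_t function_id;  // which function was timed (illustrative)
    std::uint64_t elapsed_ns;   // duration of the call in nanoseconds
};

// Hypothetical fixed-capacity recorder. All storage is allocated (and
// zero-filled, so its pages are touched) in the constructor; record() is a
// bounds check plus one store and never allocates.
class SampleBuffer {
public:
    explicit SampleBuffer(std::size_t capacity) : samples_(capacity), count_(0) {}

    // Returns false and drops the sample once the buffer is full.
    bool record(std::uint32_t function_id, std::uint64_t elapsed_ns) {
        if (count_ == samples_.size()) return false;
        samples_[count_++] = Sample{function_id, elapsed_ns};
        return true;
    }

    std::size_t size() const { return count_; }

    const Sample& at(std::size_t i) const {
        assert(i < count_);
        return samples_[i];
    }

private:
    std::vector<Sample> samples_;
    std::size_t count_;
};

// Read CLOCK_MONOTONIC as a single nanosecond count.
inline std::uint64_t now_ns() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ull +
           static_cast<std::uint64_t>(ts.tv_nsec);
}

// Time one call of f and store the result; no formatting happens here.
template <typename F>
bool time_call(SampleBuffer& buf, std::uint32_t function_id, F&& f) {
    const std::uint64_t start = now_ns();
    f();
    const std::uint64_t stop = now_ns();
    return buf.record(function_id, stop - start);
}

// Smallest nonzero difference seen between back-to-back clock reads.
// Returns 0 if no nonzero difference was observed.
inline std::uint64_t smallest_observed_step_ns(std::size_t iterations) {
    std::uint64_t smallest = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        const std::uint64_t a = now_ns();
        const std::uint64_t b = now_ns();
        const std::uint64_t delta = b - a;
        if (delta != 0 && (smallest == 0 || delta < smallest)) smallest = delta;
    }
    return smallest;
}

// Format everything once the measurement phase is over.
inline void dump(const SampleBuffer& buf, std::FILE* out) {
    for (std::size_t i = 0; i < buf.size(); ++i) {
        std::fprintf(out, "%u %llu\n",
                     static_cast<unsigned>(buf.at(i).function_id),
                     static_cast<unsigned long long>(buf.at(i).elapsed_ns));
    }
}
```

A thread would construct its SampleBuffer before the timed phase, call time_call(buf, id, [] { /* function under test */ }) in its loop, and call dump(buf, stdout) after all threads have finished. Comparing smallest_observed_step_ns against the value from clock_getres gives a rough idea of how much of the reported resolution is usable in practice; the measured value includes the overhead of the clock call itself.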

Or, if you truly need to have 100% determinism, you’ll need to ensure that your threads schedule in the same order, run for the same quanta, and put their data in the same memory addresses for each run.
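
As a partial illustration of controlling where and how threads are scheduled, the sketch below pins the calling thread to a single CPU and requests a fixed-priority real-time policy. It assumes Linux with glibc, which may not match the real-time OS in the question; pthread_setaffinity_np and pthread_getaffinity_np are GNU extensions, and the helper names first_allowed_cpu, pin_current_thread and request_fifo are hypothetical. It does nothing about memory layout, which would have to be handled separately.

```cpp
#include <cassert>
#include <pthread.h>
#include <sched.h>

// Assumption: Linux with glibc. g++ defines _GNU_SOURCE by default, which
// exposes the *_np affinity calls and the CPU_* macros.

// Return the lowest-numbered CPU the calling thread may run on, or -1.
inline int first_allowed_cpu() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) return cpu;
    }
    return -1;
}

// Restrict the calling thread to one CPU.
// Returns 0 on success or an errno-style error number.
inline int pin_current_thread(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

// Ask for SCHED_FIFO at the given priority for the calling thread.
// Returns 0 on success or an error number; without the right privileges
// this commonly fails with EPERM.
inline int request_fifo(int priority) {
    sched_param param{};
    param.sched_priority = priority;
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
}
```

Each benchmark thread would call pin_current_thread with a distinct CPU and then request_fifo before entering its timed loop, checking both return values. Under SCHED_FIFO a thread runs until it blocks, yields, or is preempted by a higher-priority thread, which removes time-slice boundaries as one source of variation, but cache and memory-bandwidth contention between CPUs remains.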
