Are you running on OS 3.0? I saw the same…

Question

0

Asked: May 11, 20262026-05-11T21:42:36+00:00 2026-05-11T21:42:36+00:00

I trying to get started with Google Perf Tools to profile some CPU intensive

0

I trying to get started with Google Perf Tools to profile some CPU intensive applications. It’s a statistical calculation that dumps each step to a file using `ofstream’. I’m not a C++ expert so I’m having troubling finding the bottleneck. My first pass gives results:

Total: 857 samples
     357  41.7%  41.7%      357  41.7% _write$UNIX2003
     134  15.6%  57.3%      134  15.6% _exp$fenv_access_off
     109  12.7%  70.0%      276  32.2% scythe::dnorm
     103  12.0%  82.0%      103  12.0% _log$fenv_access_off
      58   6.8%  88.8%       58   6.8% scythe::const_matrix_forward_iterator::operator*
      37   4.3%  93.1%       37   4.3% scythe::matrix_forward_iterator::operator*
      15   1.8%  94.9%       47   5.5% std::transform
      13   1.5%  96.4%      486  56.7% SliceStep::DoStep
      10   1.2%  97.5%       10   1.2% 0x0002726c
       5   0.6%  98.1%        5   0.6% 0x000271c7
       5   0.6%  98.7%        5   0.6% _write$NOCANCEL$UNIX2003

This is surprising, since all the real calculation occurs in SliceStep::DoStep. The “_write$UNIX2003” (where can I find out what this is?) appears to be coming from writing the output file. Now, what confuses me is that if I comment out all the outfile << "text" statements and run pprof, 95% is in SliceStep::DoStep and `_write$UNIX2003′ goes away. However my application does not speed up, as measured by total time. The whole thing speeds up less than 1 percent.

What am I missing?

Added:
The pprof output without the outfile << statements is:

Total: 790 samples
     205  25.9%  25.9%      205  25.9% _exp$fenv_access_off
     170  21.5%  47.5%      170  21.5% _log$fenv_access_off
     162  20.5%  68.0%      437  55.3% scythe::dnorm
      83  10.5%  78.5%       83  10.5% scythe::const_matrix_forward_iterator::operator*
      70   8.9%  87.3%       70   8.9% scythe::matrix_forward_iterator::operator*
      28   3.5%  90.9%       78   9.9% std::transform
      26   3.3%  94.2%       26   3.3% 0x00027262
      12   1.5%  95.7%       12   1.5% _write$NOCANCEL$UNIX2003
      11   1.4%  97.1%      764  96.7% SliceStep::DoStep
       9   1.1%  98.2%        9   1.1% 0x00027253
       6   0.8%  99.0%        6   0.8% 0x000274a6

This looks like what I’d expect, except I see no visible increase in performance (.1 second on a 10 second calculation). The code is essentially:

ofstream outfile("out.txt");
for loop:
  SliceStep::DoStep()
  outfile << 'result'
outfile.close()

Update: I timing using boost::timer, starting where the profiler starts and ending where it ends. I do not use threads or anything fancy.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-11T21:42:36+00:00

From my comments:

The numbers you get from your profiler say, that the program should be around 40% faster without the print statements.

The runtime, however, stays nearly the same.

Obviously one of the measurements must be wrong. That means you have to do more and better measurements.

First I suggest starting with another easy tool: the time command. This should get you a rough idea where your time is spend.

If the results are still not conclusive you need a better testcase:

Use a larger problem
Do a warmup before measuring. Do some loops and start any measurement afterwards (in the same process).

Tiristan: It’s all in user. What I’m doing is pretty simple, I think… Does the fact that the file is open the whole time mean anything?

That means the profiler is wrong.

Printing 100000 lines to the console using python results in something like:

for i in xrange(100000):
    print i

To console:

time python print.py
[...]
real    0m2.370s
user    0m0.156s
sys     0m0.232s

Versus:

time python test.py > /dev/null

real    0m0.133s
user    0m0.116s
sys     0m0.008s

My point is:
Your internal measurements and time show you do not gain anything from disabling output. Google Perf Tools says you should. Who’s wrong?

How to approach applying for a job at a company ...

How to handle personal stress caused by utterly incompetent and ...

What is a programmer’s life like?

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions