The Archive Base

Editorial Team
Asked: May 11, 2026

I’m trying to get started with Google Perf Tools to profile some CPU intensive


I’m trying to get started with Google Perf Tools to profile some CPU-intensive applications. It’s a statistical calculation that dumps each step to a file using `ofstream`. I’m not a C++ expert, so I’m having trouble finding the bottleneck. My first pass gives these results:

Total: 857 samples
     357  41.7%  41.7%      357  41.7% _write$UNIX2003
     134  15.6%  57.3%      134  15.6% _exp$fenv_access_off
     109  12.7%  70.0%      276  32.2% scythe::dnorm
     103  12.0%  82.0%      103  12.0% _log$fenv_access_off
      58   6.8%  88.8%       58   6.8% scythe::const_matrix_forward_iterator::operator*
      37   4.3%  93.1%       37   4.3% scythe::matrix_forward_iterator::operator*
      15   1.8%  94.9%       47   5.5% std::transform
      13   1.5%  96.4%      486  56.7% SliceStep::DoStep
      10   1.2%  97.5%       10   1.2% 0x0002726c
       5   0.6%  98.1%        5   0.6% 0x000271c7
       5   0.6%  98.7%        5   0.6% _write$NOCANCEL$UNIX2003

This is surprising, since all the real calculation occurs in SliceStep::DoStep. The `_write$UNIX2003` symbol (where can I find out what this is?) appears to come from writing the output file. What confuses me is that if I comment out all the outfile << "text" statements and run pprof, 95% of the samples are in SliceStep::DoStep and `_write$UNIX2003` goes away. However, my application does not speed up, as measured by total time: the whole thing speeds up by less than 1 percent.

What am I missing?

Added:
The pprof output without the outfile << statements is:

Total: 790 samples
     205  25.9%  25.9%      205  25.9% _exp$fenv_access_off
     170  21.5%  47.5%      170  21.5% _log$fenv_access_off
     162  20.5%  68.0%      437  55.3% scythe::dnorm
      83  10.5%  78.5%       83  10.5% scythe::const_matrix_forward_iterator::operator*
      70   8.9%  87.3%       70   8.9% scythe::matrix_forward_iterator::operator*
      28   3.5%  90.9%       78   9.9% std::transform
      26   3.3%  94.2%       26   3.3% 0x00027262
      12   1.5%  95.7%       12   1.5% _write$NOCANCEL$UNIX2003
      11   1.4%  97.1%      764  96.7% SliceStep::DoStep
       9   1.1%  98.2%        9   1.1% 0x00027253
       6   0.8%  99.0%        6   0.8% 0x000274a6

This looks like what I’d expect, except I see no visible increase in performance (0.1 second on a 10 second calculation). The code is essentially:

ofstream outfile("out.txt");
for loop:
  SliceStep::DoStep()
  outfile << 'result'
outfile.close()

Update: I am timing using boost::timer, starting where the profiler starts and ending where it ends. I do not use threads or anything fancy.


1 Answer

  1. Editorial Team
    Added an answer on May 11, 2026 at 9:42 pm

    From my comments:

    The numbers from your profiler say that the program should be around 40% faster without the print statements.

    The runtime, however, stays nearly the same.

    Obviously one of the measurements must be wrong. That means you have to do more and better measurements.
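To make the mismatch concrete, here is a minimal Python sketch of the arithmetic. The sample counts are copied from the first pprof listing in the question, and the 10 second runtime and 0.1 second saving are the figures the asker reports. The variable names are only illustrative.

```python
# Sample counts copied from the first pprof listing in the question.
total_samples = 857
write_samples = 357  # attributed to _write$UNIX2003

# Timings reported by the asker: roughly 10 s per run, and about 0.1 s
# saved after commenting out the output statements.
measured_runtime_s = 10.0
measured_saving_s = 0.1

# If the profile were an accurate picture of the total runtime, removing
# the writes should save this fraction of the run.
write_fraction = write_samples / total_samples
predicted_saving_s = measured_runtime_s * write_fraction

print(f"profile attributes {write_fraction:.1%} of samples to _write")
print(f"predicted saving: {predicted_saving_s:.2f} s")
print(f"measured saving:  {measured_saving_s:.2f} s")
```

The predicted and measured savings differ by more than an order of magnitude, which is why at least one of the two measurements cannot be trusted as it stands.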

    First, I suggest starting with another easy tool: the time command. This should give you a rough idea of where your time is spent.

    If the results are still not conclusive, you need a better test case:

    • Use a larger problem
    • Do a warm-up before measuring: run some loops first and start any measurement afterwards (in the same process).
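The two suggestions above can be sketched as a small timing harness. This is a minimal Python sketch, not the asker's C++ code: `do_step` and `time_run` are hypothetical names, and the loop body is a placeholder workload standing in for the real calculation. It records both wall-clock time (`time.perf_counter`) and CPU time of the process (`time.process_time`), runs a warm-up first, and keeps several samples rather than a single number.

```python
import time


def do_step(n=20_000):
    """Hypothetical stand-in for one step of the real calculation."""
    total = 0.0
    for i in range(1, n):
        total += (i % 7) * 0.5
    return total


def time_run(steps):
    """Return (wall_seconds, cpu_seconds) for `steps` calls of do_step."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(steps):
        do_step()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)


# Warm-up: run a few steps in the same process and discard the timings.
time_run(steps=5)

# Measure several times and keep every sample so the spread is visible.
samples = [time_run(steps=50) for _ in range(3)]
for wall, cpu in samples:
    print(f"wall {wall:.4f} s   cpu {cpu:.4f} s")
```

Raising `steps` (or the problem size inside `do_step`) is the equivalent of "use a larger problem": the longer each sample runs, the less start-up noise matters.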

    Tiristan: It’s all in user. What I’m doing is pretty simple, I think… Does the fact that the file is open the whole time mean anything?

    That means the profiler is wrong.

    Printing 100000 lines to the console using Python gives something like:

    for i in xrange(100000):
        print i
    

    To console:

    time python print.py
    [...]
    real    0m2.370s
    user    0m0.156s
    sys     0m0.232s
    

    Versus:

    time python test.py > /dev/null
    
    real    0m0.133s
    user    0m0.116s
    sys     0m0.008s
    

    My point is:
    Your internal measurements and time show you do not gain anything from disabling output. Google Perf Tools says you should. Who’s wrong?
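One way to put the two views side by side from inside a program is to record wall-clock time and CPU time around the same work, once with real output and once with output discarded. The following is a minimal Python sketch under stated assumptions: `run` is a hypothetical helper, the temporary file stands in for the asker's out.txt, and the 100000 lines mirror the print example above. `time.perf_counter` corresponds roughly to the `real` figure from the time command, and `time.process_time` to `user` plus `sys`.

```python
import os
import tempfile
import time


def run(lines, sink):
    """Write `lines` numbered lines to `sink`; return (wall, cpu) seconds."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for i in range(lines):
        sink.write(f"{i}\n")
    sink.flush()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)


# Same workload, two destinations: a real temporary file and the null device.
with tempfile.TemporaryFile("w") as real_file:
    file_wall, file_cpu = run(100_000, real_file)
with open(os.devnull, "w") as null_sink:
    null_wall, null_cpu = run(100_000, null_sink)

print(f"file:    wall {file_wall:.3f} s  cpu {file_cpu:.3f} s")
print(f"devnull: wall {null_wall:.3f} s  cpu {null_cpu:.3f} s")
```

If the wall-clock and CPU figures barely change between the two destinations, the output is not where the time goes, whatever the sampling profile suggests.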

© 2021 The Archive Base. All Rights Reserved