The Archive Base

Editorial Team
Asked: May 17, 2026

I have a Perl script that takes about 30 minutes to run, so of course I run Devel::NYTProf. Great profiler. For many of my subs, I’m getting some data that doesn’t make sense to me.

I’m running with perl 5.10.0 on Linux using the default NYTProf settings.

In the HTML output, each of the subs has a summary section stating how much time is spent in the sub and its children and then goes on to give me line information.

The line statistics don’t add up to the total spent in the function. What gives?

For example, I have a function that’s reported to use 233s (57+166). The line-by-line number report has one line that uses 20s, another that uses 4 and one that uses 2. The other lines are <1s and the function is not that long.

What can I do to resolve this mismatch?

I could move to Perl 5.12 but that would take some work to install the dependencies. I’m happy to run it in a slower mode. Is there a way to increase the sampling frequency? Run on a slower machine?

In a sample of my NYTProf output, the sub is reported to use 225 seconds, but adding all of the numbers yields 56 seconds. This run had optimization turned off:

setenv NYTPROF optimize=0:file=nytprof.optout

Update: I’ve rerun with Perl 5.12 using the findcaller=1 option as suggested, with more or less the same results (I ran on a different dataset).

Update: Tim B is right. I have changed some of my key subs to do their own caching instead of using Memoize, and the NYTProf results are useful again. Thank you, Tim.
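
The hand-rolled caching mentioned in the update could look roughly like the sketch below. The sub name `expensive_lookup` and its body are hypothetical placeholders, not the original code, and the approach assumes a single scalar argument and a result that depends only on that argument.

```perl
use strict;
use warnings;

my %cache;            # results keyed by argument
my $real_calls = 0;   # counts how often the slow path actually runs

# expensive_lookup() is a hypothetical stand-in for a slow, pure function.
sub expensive_lookup {
    my ($key) = @_;

    # Cache hit: an ordinary return, with no "goto &sub" involved.
    return $cache{$key} if exists $cache{$key};

    $real_calls++;
    my $result = length($key) * 2;   # placeholder for the real work
    return $cache{$key} = $result;
}

expensive_lookup('abc') for 1 .. 3;
print "real calls: $real_calls\n";   # prints "real calls: 1"
```

Because the cache check lives inside the sub itself, every call and return is an ordinary one, so the profiler has no wrapper sub exiting via goto to misplace.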

1 Answer

  1. Editorial Team
    Added an answer on May 17, 2026 at 11:17 pm

    I’ve just added this to the NYTProf documentation:

    =head2 If Statement and Subroutine Timings Don’t Match

    NYTProf has two profilers: a statement
    profiler that’s invoked when perl
    moves from one perl statement to
    another, and a subroutine profiler
    that’s invoked when perl calls or
    returns from a subroutine.

    The individual statement timings for a
    subroutine usually add up to slightly
    less than the exclusive time for the
    subroutine. That’s because the
    handling of the subroutine call and
    return overheads is included in the
    exclusive time for the subroutine. The
    difference may only be a few
    microseconds but that may become
    noticeable for subroutines that are
    called hundreds of thousands of times.

    The statement profiler keeps track of how
    much time was spent on overheads, like
    writing statement profile data to
    disk. The subroutine profiler
    subtracts the overheads that have
    accumulated between entering and
    leaving the subroutine in order to
    give a more accurate profile. The
    statement profiler is generally very
    fast because most writes get buffered
    for zip compression so the profiler
    overhead per statement tends to be
    very small, often a single ‘tick’. The
    result is that the accumulated
    overhead is quite noisy. This becomes
    more significant for subroutines that
    are called frequently and are also
    fast. This may be another, smaller,
    contribution to the discrepancy
    between statement time and exclusive
    times.

    That probably explains the difference between the sum of the statement time column (31.7s) and the exclusive time reported for the subroutine (57.2s). The difference amounts to approximately 100 microseconds per call (which seems a little high, but not unreasonably so).

    The statement profiler keeps track of how much time was spent on overheads, like writing statement profile data to disk. The subroutine profiler subtracts the difference in overheads between entering and leaving the subroutine in order to give a more accurate profile.

    The statement profiler is generally very fast because most writes get buffered for zip compression so the profiler overhead per statement tends to be very small, often a single ‘tick’. The result is that the accumulated overhead is quite noisy. This becomes more significant for subroutines that are called frequently and are also fast (in this case 250303 calls at 899µs/call). So I suspect this is another, smaller, contribution to the discrepancy between statement time and exclusive times.
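
The per-call figures quoted above can be checked with a few lines of arithmetic. This is only a sanity check using the numbers given in this thread (57.2s exclusive, 31.7s statement total, 250303 calls, 899µs/call); the variable names are illustrative.

```perl
use strict;
use warnings;

my $exclusive_s     = 57.2;      # exclusive time reported for the subroutine
my $statement_sum_s = 31.7;      # sum of the statement time column
my $calls           = 250_303;   # number of calls to the subroutine
my $per_call_us     = 899;       # reported time per call, in microseconds

# Gap between exclusive time and statement time, spread over all calls.
my $gap_us = ($exclusive_s - $statement_sum_s) / $calls * 1e6;
printf "gap per call: %.0f microseconds\n", $gap_us;

# Total time implied by the per-call figure.
my $total_s = $calls * $per_call_us / 1e6;
printf "calls x time per call: %.0f seconds\n", $total_s;
```

The first figure comes out at roughly 100 microseconds per call, matching the estimate above. The second lands close to the 225 seconds the question reports for the sub, which suggests the 899µs/call figure is the inclusive time per call.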

    More importantly, I’ve also added this section:

    =head2 If Headline Subroutine Timings Don’t Match the Called Subs

    Overall subroutine times are reported
    with a headline like “spent 10s (2+8)
    within …”. In this example, 10
    seconds were spent inside the
    subroutine (the “inclusive time”) and,
    of that, 8 seconds were spent in
    subroutines called by this one. That
    leaves 2 seconds as the time spent in
    the subroutine code itself (the
    “exclusive time”, sometimes also
    called the “self time”).

    The report shows the source code of
    the subroutine. Lines that make calls
    to other subroutines are annotated
    with details of the time spent in
    those calls.

    Sometimes the sum of the times for
    calls made by the lines of code in the
    subroutine is less than the
    inclusive-exclusive time reported in
    the headline (10-2 = 8 seconds in the
    example above).

    What’s happening here is that calls to
    other subroutines are being made but
    NYTProf isn’t able to determine the
    calling location correctly so the
    calls don’t appear in the report in
    the correct place.

    Using an old version of perl is one
    cause (see below). Another is calling
    subroutines that exit via “goto
    &sub;” – most frequently encountered
    in AUTOLOAD subs and code using the Memoize
    module.

    In general the overall subroutine
    timing is accurate and should be
    trusted more than the sum of statement
    or nested sub call timings.
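
As a rough illustration of the bookkeeping described in that section, the sketch below parses the example headline and compares the called-sub time with the per-line call annotations. The helper `parse_headline` is hypothetical, it assumes all three figures are in plain seconds, and the per-line call times are invented for the example.

```perl
use strict;
use warnings;

# Parse a headline of the form "spent 10s (2+8) within ...".
# Assumption: all figures are plain seconds; real reports may use other units.
sub parse_headline {
    my ($headline) = @_;
    my ($incl, $excl, $called) =
        $headline =~ /spent\s+([\d.]+)s\s+\(([\d.]+)\+([\d.]+)\)/
        or return;
    return { inclusive => $incl, exclusive => $excl, called => $called };
}

my $t = parse_headline('spent 10s (2+8) within ...');

# Hypothetical call times annotated on individual source lines, in seconds.
my @line_call_times = (3, 1);
my $attributed = 0;
$attributed += $_ for @line_call_times;

# Called-sub time that no line annotation accounts for.
my $missing = $t->{called} - $attributed;
printf "%gs of called-sub time is not attributed to any line\n", $missing;
```

A non-zero remainder like this is the symptom the section describes: the calls happened, but their location was recorded somewhere other than the line that made them.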

    The Memoize module is the primary cause of the discrepancy in your report. The calls to Memoize::__ANON__[...] execute a sub generated by Memoize that looks like sub { unshift @_, $cref; goto &_memoizer; }. That goto &_memoizer is implemented by perl as a kind of return to the caller followed by a call to the specified sub, and that’s the way NYTProf profiles it.

    The confusion is caused by the fact that, although add_bit_to_map is being recorded as the caller of _memoizer so the time in the call gets added to add_bit_to_map, the file and line number location of the call is recorded as the location of the goto.
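
The effect of goto &sub on the apparent caller can be seen in plain Perl, without a profiler. The sketch below illustrates Perl's own semantics only, not NYTProf internals; all sub names in it are made up for the demonstration.

```perl
use strict;
use warnings;

# Reports the name of the sub that Perl considers to be our caller.
sub who_called_me {
    my $name = (caller(1))[3];
    return $name;
}

# Same shape as the Memoize-generated wrapper quoted above, minus the caching.
my $via_goto = sub { goto &who_called_me };

# An ordinary call through a wrapper, for comparison.
my $via_call = sub { return who_called_me() };

sub outer_goto { return $via_goto->() }
sub outer_call { return $via_call->() }

print outer_goto(), "\n";   # prints "main::outer_goto"
print outer_call(), "\n";   # prints "main::__ANON__"
```

With goto the wrapper's stack frame is replaced, so the target sees the outer sub as its caller, much as add_bit_to_map is recorded as the caller of _memoizer in the report. With an ordinary call the anonymous wrapper stays visible on the stack.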

    It may be possible to improve this in a future release.

    Thank you for prompting me to investigate this and improve the documentation.

    Tim Bunce.

    p.s. I recommend asking questions about NYTProf on the mailing list.

© 2021 The Archive Base. All Rights Reserved