Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6540975
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 25, 20262026-05-25T11:02:48+00:00 2026-05-25T11:02:48+00:00

I am trying to optimize a bit of code, and am puzzled about information

  • 0

I am trying to optimize a bit of code, and am puzzled about information from summaryRprof(). In particular, it looks like a number of calls are made to external C programs, but I’m not able to pin down which C program, from which R function. I am planning to resolve this through a bunch of slicing and dicing of the code, but wondered if I am overlooking some better way to interpret the profiling data.

The highest-consuming function is .Call, which is apparently a generic description for calls to C code; the next leading functions appear to be assignment operations:

$by.self
                             self.time self.pct total.time total.pct
".Call"                        2281.0    54.40     2312.0     55.14
"[.data.frame"                  145.0     3.46      218.5      5.21
"initialize"                    123.5     2.95      217.5      5.19
"$<-.data.frame"                121.5     2.90      121.5      2.90
"as.vector"                     110.5     2.64      416.0      9.92

I decided to focus on the .Call to see how this arises. I looked through the profiling file to find those entries with .Call in the call stack, and the following are the top entries in the call stack (by count of # of appearances):

13640 "eval"
11252 "["
7044 "standardGeneric"
4691 "<Anonymous>"
4658 "tryCatch"
4654 "tryCatchList"
4652 "tryCatchOne"
4648 "doTryCatch"

This list is as clear as mud: I have <Anonymous> and standardGeneric in there.

I believe this is due to calls to functions in the Matrix package, but that’s because I’m looking at the code and that package appears to be the only possible source of C code. However, a lot of different functions from Matrix are called in this package, and it seems very difficult to determine which function is consuming this time.

So, my question is pretty basic: is there some way of deciphering and attributing these calls (e.g. .Call, <Anonymous>, etc.) in some other way? The plot of the call graph for this code is rather tricky to render, given the # of functions involved.

The fallback tactics I see are to either (1) comment out bits of code (and hack around to make the code work with this) to see where the time consumption occurs, or to (2) wrap certain operations inside of other functions and see when those functions appear on the call stack. The latter is inelegant, but it seems like it’s the best way to add a tag to the call stack. The former is unpleasant because it takes quite some time to run the code, and iteratively uncommenting code and rerunning is unpleasant.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-25T11:02:48+00:00Added an answer on May 25, 2026 at 11:02 am

    It seems that the short answer is “No” and the long answer is “Yes, but you’re not going to enjoy this.” Even answering this question is going to take some time (so stick around, I may be updating it).

    There are several basic things to get one’s head around when working with profiling in R:

    First, there are many different ways to think about profiling. It is quite typical to think in terms of a call stack. At any given instant, this is the sequence of function calls that are active, essentially nested within each other (subroutines, if you will). This is quite useful for understanding the state of evaluations, where functions will return, and lots of other things that are important for seeing things as the computer / interpreter / OS may see them. Rprof does call stack profiling.

    Second, a different perspective is that I’ve got a bunch of code and a particular call is taking a long time: which line in my code caused that call to be made? This is line profiling. R doesn’t have line profiling, as far as I can tell. This is in contrast with Python and Matlab, which both have line profilers.

    Third, the map from from lines to calls is surjective, but it is not bijective: given a particular call stack we cannot guarantee that we can map it back to the code. In fact, call stack analyses often summarize the calls completely out of context of the whole stack (i.e. cumulative times are reported no matter where that call was on all of the different stacks in which it occurred).

    Fourth, even though we have these constraints, we can put on our statistical hats and analyze the call stack data carefully and see what we can make of it. The call stack information is data and we like data, don’t we? 🙂

    Just a quick intro to a call stack. Let’s just assume that our call stack looked like this:

    "C" "B" "A"
    

    This means that function A called B which then calls C (the order is reversed), and the call stack is 3 levels deep. In my code, the call stack gets to as many as 41 levels deep. Since the stacks can be so deep and are presented in reverse order, this is more interpretable by software than by a human. Naturally, we begin cleaning and transforming this data. 🙂

    Now, our data really comes along looking like:

    ".Call" "subCsp_cols" "[" "standardGeneric" "[" "eval" "eval" "callGeneric"
    "[" "standardGeneric" "[" "myFunc2" "myFunc1" "eval" "eval" "doTryCatch"
    "tryCatchOne" "tryCatchList" "tryCatch" "FUN" "lapply" "mclapply"
    "<Anonymous>" "%dopar%"
    

    Miserable, isn’t it? It even has duplicates of things like eval, some guy called <Anonymous> – probably some darn hacker. (Anonymous is legion, by the way. :-))

    The first step in transforming this into something useful was to split each line of Rprof() output and reverse the entries (via strsplit and rev). The first 12 entries (last 12 if you look at the raw call stack, rather than the post-rev version) were the same for every line (of which there were about 12000, the sampling interval was 0.5 seconds – so about 100 minutes of profiling), and these can be discarded.

    Remember, we’re still interested in knowing which line(s) led to .Call, which took so much time. Before we get to that question, we put on our statistical caps: the profiling reports, e.g. from summaryRprof, profr, ggplot, etc., only reflect the cumulative time spent for a given call or for calls beneath a given call. What does this cumulative information not tell us? Bingo: whether that call was made many times, or a few, and whether the time spent was constant over all invocations of that call or whether there are some outliers. A particular function might be executed 100 times or 100K times, but all of the cost may come from a single invocation (it shouldn’t, but we don’t know until we look at the data).

    This only begins to describe the fun. The A->B->C example doesn’t reflect the way things may really appear, such as A->B->C->D->B->E. Now, “B” may be counted a couple of times. What’s more, suppose that a lot of time is spent in the C level, but we never sample at precisely that level, only seeing its child calls in the stack. We may see a sizable time for “total.time”, but not for “self.time”. If there are lots of different child calls under C, we may lose sight of what to optimize – should we take out C altogether or tweak the children, B, D, and E?

    Just to account for the time spent, I took the sequences and ran them through digest, storing counts for the digested values, via hash. I also split up the sequences, storing {(A),(A,B), (A,B,C), etc.}. This doesn’t seem so interesting, but removing singletons from the counts helps a lot in cleaning up the data. We can also store the time spent in each call by using rle(). This is useful for analyzing the distribution of time spent for a given call.

    Still we’re nowhere closer to finding the actual time spent per line of code. We’ll never get lines of code from the call stack. A simpler way to do this is to store a list of times throughout the code, which stores the output of proc.time(), for a given invocation. Taking the difference of these times reveals which lines or sections of code are taking a long time. (Hint: that’s what we’re really looking for, not the actual calls.)

    However, we have this call stack and we might as well do something useful. Going up the stack is somewhat interesting, but if we rewind the profile information to a little earlier, we can find which calls tend to precede the longer running calls. This allows us to look for landmarks in the call stack – positions where we can tie a call to a particular line of code. This makes it a bit easier to map more calls back to code, if all we have is the call stack, rather than instrumented code. (As I keep mentioning: out of context, there isn’t a 1:1 mapping, but at a fine enough granularity, especially in repeatedly hit calls that are distinctive, you may be able to find landmarks in the calls that map to code.)

    Altogether, I was able to find which calls were taking a lot of time, whether that was based on 1 long interval or many small ones, what the distribution of time spent was like, and, with some effort, I was able to map the most important & time consuming calls back to the code and discover which parts of the code could benefit the most from rewriting or from a change in algorithms.

    Statistical analyses of the call stack is loads of fun, but investigating a particular call based on cumulative time consumption is not a very good way to go. The cumulative time consumed by a call is informative on a relative basis, but it doesn’t enlighten us as whether one or many calls consumed this time, nor the depth of the call in the stack, nor the section of code responsible for the invocations. The first two things can be addressed via a bit more R code, while the latter is best pursued through instrumented code.

    As R doesn’t yet have line profilers like Python and Matlab, the simplest way to handle this is to just instrument one’s code.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

While trying to optimize a code, I'm a bit puzzled by differences in profiles
I'm working on a bit of code and I'm trying to optimize it as
I am trying to learn a bit about optimizing my jQuery code... Is there
I am trying to optimize a piece of .NET 2.0 C# code that looks
I'm trying to optimize some code where I have a large number of arrays
I'm trying to optimize my C++ code. I've searched the internet on using dynamically
I'm trying to optimize some bit packing and unpacking routines. In order to do
I am trying to optimize a piece of code that clones an object: #region
I have been trying to optimize some code which handles raw pixel data. Currently
I am trying to optimize my mysql table a little bit in order to

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.