The Archive Base

Editorial Team
Asked: May 25, 2026

As a toy example, suppose that we have a function called ‘my_func’ (the code is below) that takes two parameters ‘n’ and ‘p’. Our function, ‘my_func’, will generate a random matrix ‘x’ with ‘n’ rows and ‘p’ columns and do something expensive in both runtime and memory usage, such as computing the sum of the singular values of ‘x’. (Of course, the function is a one-liner, but I am shooting for readability here.)

my_func <- function(n, p) {
  x <- replicate(p, rnorm(n))
  sum(svd(x)$d)
}

If we wish to compute ‘my_func’ for several values of ‘n’, and for each value of ‘n’ we have several values of ‘p’, then vectorizing the function and applying it to every combination of ‘n’ and ‘p’ is straightforward:

n <- 10 * seq_len(5)
p <- 100 * seq_len(10)
grid <- expand.grid(n = n, p = p)
my_func <- Vectorize(my_func)
set.seed(42)
do.call(my_func, grid)
[1]   98.61785  195.50822  292.21575  376.79186  468.13570  145.18359
[7]  280.67456  421.03196  557.87138  687.75040  168.42994  340.42452
[13]  509.65528  683.69883  851.29063  199.08474  400.25584  595.18311
[19]  784.21508  982.34591  220.73215  448.23698  669.02622  895.34184
[25] 1105.48817  242.52422  487.56694  735.67588  976.93840 1203.25949

Notice that each call to ‘my_func’ can be painfully slow for large ‘n’ and ‘p’ (try n = 1000 and p = 2000 for starters).

Now, in my actual application with a similarly constructed function, the number of rows in ‘grid’ is much larger than given here. Hence, I am trying to understand vectorizing in R a little better.

First question: In the above example, are the calls to ‘my_func’ performed sequentially so that the memory used in one call is garbage collected before the next call? I use vectorization often but have never stopped to ask this question.

Second question: (This question may depend on the first.) Assuming that the number of calls is large enough and that ‘my_func’ is slow enough, is parallelization warranted here? I am presuming yes. My real question is: is parallelization warranted here if instead ‘my_func’ had the same large matrix passed to it for each call? For the sake of argument, assume the matrix is called ‘y’, has 1000 rows and 5000 columns, and is calculated on-the-fly. Of course, passing the matrix ‘y’ to each of the parallel nodes will incur some lag.

I understand that the answer to the second question may be “It depends on…” If that is the case, please let me know, and I will try to give more detail.

Also, I appreciate any advice, feedback, or OMFG WTF N00B YOU HAVEN’T SEEN THIS OTHER OBSCURE SOMEWHAT RELEVANT DISCUSSION??!!!111oneone1

1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 10:52 pm

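    The answer to the first question is pretty clearly yes: almost everything in R is serial by default. Vectorize() is a wrapper around mapply(), so the calls run one after another, and the matrix built inside one call becomes eligible for garbage collection once that call returns. (A very few things internally are starting to use OpenMP, but R as an engine will likely remain single-threaded.)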
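
    One way to check the call order yourself is sketched below. The function 'logged' and the vector 'calls' are hypothetical names made up for this illustration; the only assumption about R itself is that Vectorize() is built on mapply(), which evaluates the argument pairs one at a time.

    ```r
    # Hypothetical helper: records each (n, p) pair as it is evaluated,
    # then returns a cheap result in place of the expensive SVD.
    calls <- character(0)
    logged <- function(n, p) {
      calls <<- c(calls, paste(n, p))
      n * p
    }

    # Vectorize() wraps the function in a call to mapply().
    v_logged <- Vectorize(logged)
    res <- v_logged(c(1, 2, 3), c(10, 20, 30))

    print(calls)  # argument pairs, in the order they were evaluated
    print(res)    # one result per argument pair
    ```

    The log fills in the same order as the inputs, which is what you would expect from a serial loop. It says nothing about when the garbage collector actually runs, only that no two calls overlap.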

    So for the second question: yes, do try that. I don’t use Vectorize() much, but I do like the *apply() family. Solve it with lapply(), then load the multicore package and use mclapply() to run it over as many cores as you have. Here is an example:

    R> system.time(res <- lapply(1:nrow(grid), 
    +                            function(i) my_func(grid[i,1],grid[i,2])))
       user  system elapsed 
      0.470   0.000   0.459 
    R> system.time(res <- mclapply(1:nrow(grid), 
    +                              function(i) my_func(grid[i,1], grid[i,2])))
       user  system elapsed 
      0.610   0.140   0.135 
    R> 
    

    Notice how elapsed time is now about 29% (= 0.135/0.459) of the original.
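
    The same idea can be sketched with mcmapply(), which keeps the two-argument shape of the original Vectorize() call. This assumes a current R installation, where mclapply() and mcmapply() ship in the bundled 'parallel' package; the core count of 2 and the choice of random number generator are assumptions for the sake of the example, not part of the timings above.

    ```r
    library(parallel)

    # Same computation as the question's my_func, building the matrix directly.
    my_func <- function(n, p) {
      x <- matrix(rnorm(n * p), nrow = n, ncol = p)
      sum(svd(x)$d)
    }

    grid <- expand.grid(n = 10 * seq_len(5), p = 100 * seq_len(10))

    # Forking is not available on Windows, where mc.cores must be 1.
    n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

    # Assumption: L'Ecuyer-CMRG is the generator the 'parallel' documentation
    # suggests when forked workers should draw from reproducible streams.
    RNGkind("L'Ecuyer-CMRG")
    set.seed(42)

    # One call per row of the grid, spread over n_cores forked workers.
    res <- mcmapply(my_func, grid$n, grid$p, mc.cores = n_cores)
    length(res)  # one value per row of 'grid'
    ```

    Because the workers are forked from the running session, they see 'my_func' and 'grid' without any explicit export step.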

    From here you can generalize further with parallel execution across several machines; the Task View on High-Performance Computing with R has further pointers. R 2.14.0, due October 31, will have a new package ‘parallel’ which combines parts of multicore and snow.
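
    For the case raised in the question, where every call needs the same large matrix 'y', a snow-style cluster from the 'parallel' package lets you ship the matrix to each worker once instead of once per task. The following is only a sketch under stated assumptions: 'y' is a small stand-in for the 1000 x 5000 matrix, 'task' is a hypothetical function invented for the illustration, and two local workers stand in for real nodes.

    ```r
    library(parallel)

    # Small stand-in for the large matrix 'y' (kept tiny so the sketch runs fast).
    y <- matrix(rnorm(100 * 50), nrow = 100, ncol = 50)

    # Hypothetical task: sum of the singular values of the first k columns of y.
    task <- function(k) sum(svd(y[, seq_len(k), drop = FALSE])$d)

    cl <- makeCluster(2)  # two local worker processes

    # Send y to every worker once, up front; envir = environment() exports
    # from wherever this script is being run.
    clusterExport(cl, "y", envir = environment())

    # Each task now only needs to transmit the small argument k.
    res <- parSapply(cl, c(10, 20, 30, 40, 50), task)

    stopCluster(cl)
    print(res)
    ```

    Whether this pays off depends on how the one-time transfer of 'y' compares with the total compute time. On a single Unix-like machine, the forked workers used by mclapply() inherit the parent's memory, so 'y' does not need to be exported at all.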
