
The Archive Base

Editorial Team
Asked: June 7, 2026

According to Creating an R dataframe row-by-row, it’s not ideal to append to a data.frame using rbind, as it creates a copy of the whole data.frame each time. How do I accumulate data in R resulting in a data.frame without incurring this penalty? The intermediate format doesn’t need to be a data.frame.

1 Answer

  1. Editorial Team
    Added an answer on June 7, 2026 at 1:33 pm

    First approach

    I tried accessing each element of a pre-allocated data.frame:

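    # Pre-allocate numeric columns (NA_real_ avoids a logical-to-double coercion)
    res <- data.frame(x = rep(NA_real_, 1000), y = rep(NA_real_, 1000))
    tracemem(res)  # report every time res is duplicated
    for (i in 1:1000) {
      res[i, "x"] <- runif(1)
      res[i, "y"] <- rnorm(1)
    }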
    

    But tracemem reports a copy on every assignment (i.e. the data.frame is being copied to a new address each time).
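
    One way to sidestep these copies, sketched here as an assumption rather than as something the answer benchmarked, is to accumulate into pre-allocated atomic vectors and build the data.frame once at the end. The question allows this, since the intermediate format need not be a data.frame. The size n is illustrative.

    ```r
    n <- 1000  # illustrative size, not taken from the answer

    # Pre-allocate one plain numeric vector per column
    x <- numeric(n)
    y <- numeric(n)

    for (i in seq_len(n)) {
      x[i] <- runif(1)
      y[i] <- rnorm(1)
    }

    # Assemble the data.frame a single time, after the loop
    res <- data.frame(x = x, y = y)
    ```

    The loop touches only the two vectors, so the data.frame machinery is involved once instead of on every iteration.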

    Alternative approach (doesn’t work either)

    One approach (not sure it’s faster as I haven’t benchmarked yet) is to create a list of data.frames, then stack them all together:

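    # Build one single-row data.frame per observation
    makeRow <- function() data.frame(x = runif(1), y = rnorm(1))
    res <- replicate(1000, makeRow(), simplify = FALSE)  # returns a list of data.frames
    library(taRifx)  # loaded for the stack() call on a list below
    res.df <- stack(res)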
    

    Unfortunately in creating the list I think you will be hard-pressed to pre-allocate. For instance:

    > tracemem(res)
    [1] "<0x79b98b0>"
    > res[[2]] <- data.frame()
    tracemem[0x79b98b0 -> 0x71da500]: 
    

    In other words, replacing an element of the list causes the list to be copied. I assume the whole list, but it’s possible it’s only that element of the list. I’m not intimately familiar with the details of R’s memory management.
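
    For what it is worth, a list can be given its final length up front with vector("list", n). Whether filling it avoids every copy depends on the R version and on how many references to the list exist, so the sketch below makes no claim about tracemem output. It only shows, under those caveats, how a pre-sized list of plain lists (cheaper to create than one-row data.frames) can be turned into a data.frame in one step. The names rows and n are illustrative.

    ```r
    n <- 1000  # illustrative size

    # Pre-size the list; every slot starts as NULL
    rows <- vector("list", n)

    for (i in seq_len(n)) {
      # Store each record as a plain named list, not a data.frame
      rows[[i]] <- list(x = runif(1), y = rnorm(1))
    }

    # Pull each field out column-wise and build the data.frame once
    res.df <- data.frame(
      x = vapply(rows, function(r) r[["x"]], numeric(1)),
      y = vapply(rows, function(r) r[["y"]], numeric(1))
    )
    ```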

    Probably the best approach

    As with many speed- or memory-limited processes these days, the best approach may well be to use data.table instead of a data.frame. Since data.table has the := assign-by-reference operator, it can update values in place without copying the table:

    library(data.table)
    dt <- data.table(x=rep(0,1000), y=rep(0,1000))
    tracemem(dt)
    for(i in 1:1000) {
      dt[i,x := runif(1)]
      dt[i,y := rnorm(1)]
    }
    # note no message from tracemem
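
    Besides watching for tracemem messages, the in-place behaviour can be checked by comparing the table's memory address before and after the loop. This is a small sketch that assumes the data.table package is installed and uses its address() helper; the table size of 10 is arbitrary.

    ```r
    library(data.table)  # assumes the data.table package is installed

    dt <- data.table(x = rep(0, 10), y = rep(0, 10))

    before <- address(dt)  # memory address of the table before updating
    for (i in 1:10) {
      dt[i, x := runif(1)]
      dt[i, y := rnorm(1)]
    }
    after <- address(dt)   # memory address after the updates

    # TRUE means the table was modified where it sits, not copied elsewhere
    same_object <- identical(before, after)
    print(same_object)
    ```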
    

    But as @MatthewDowle points out, set() is the appropriate way to do this inside a loop. Doing so makes it faster still:

    library(data.table)
    n <- 10^6
    dt <- data.table(x=rep(0,n), y=rep(0,n))
    
    dt.colon <- function(dt) {
      for(i in 1:n) {
        dt[i,x := runif(1)]
        dt[i,y := rnorm(1)]
      }
    }
    
    dt.set <- function(dt) {
      for(i in 1:n) {
        set(dt,i,1L, runif(1) )
        set(dt,i,2L, rnorm(1) )
      }
    }
    
    library(microbenchmark)
    m <- microbenchmark(dt.colon(dt), dt.set(dt),times=2)
    

    (Results shown below)

    Benchmarking

    With the loop run 10,000 times, data.table is almost a full order of magnitude faster:

    Unit: seconds
              expr        min         lq     median         uq        max
    1    test.df()  523.49057  523.49057  524.52408  525.55759  525.55759
    2    test.dt()   62.06398   62.06398   62.98622   63.90845   63.90845
    3 test.stack() 1196.30135 1196.30135 1258.79879 1321.29622 1321.29622
    

    [Figure: benchmarks]

    And comparison of := with set():

    > m
    Unit: milliseconds
              expr       min        lq    median       uq      max
    1 dt.colon(dt) 654.54996 654.54996 656.43429 658.3186 658.3186
    2   dt.set(dt)  13.29612  13.29612  15.02891  16.7617  16.7617
    

    Note that n here is 10^6 not 10^5 as in the benchmarks plotted above. So there’s an order of magnitude more work, and the result is measured in milliseconds not seconds. Impressive indeed.
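
    The functions test.df(), test.dt() and test.stack() timed above are not shown in the answer. As a rough, hypothetical stand-in, the base-R sketch below times two of the strategies discussed: growing a data.frame with rbind (the pattern the question wants to avoid) and filling a pre-allocated data.frame (the first approach). The function names grow_rbind and fill_df and the small n are assumptions chosen so the sketch runs quickly; absolute timings will vary by machine and R version.

    ```r
    n <- 500  # deliberately small, hypothetical size

    # Grow a data.frame one row at a time with rbind
    grow_rbind <- function(n) {
      res <- data.frame(x = numeric(0), y = numeric(0))
      for (i in seq_len(n)) {
        res <- rbind(res, data.frame(x = runif(1), y = rnorm(1)))
      }
      res
    }

    # Fill a pre-allocated data.frame element by element
    fill_df <- function(n) {
      res <- data.frame(x = rep(NA_real_, n), y = rep(NA_real_, n))
      for (i in seq_len(n)) {
        res[i, "x"] <- runif(1)
        res[i, "y"] <- rnorm(1)
      }
      res
    }

    # system.time() returns user, system and elapsed seconds; keep elapsed
    t_rbind <- system.time(a <- grow_rbind(n))[["elapsed"]]
    t_fill  <- system.time(b <- fill_df(n))[["elapsed"]]
    print(c(rbind = t_rbind, prealloc = t_fill))
    ```

    For more stable numbers, the same two calls can be passed to microbenchmark(), as the answer does for := and set().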

© 2021 The Archive Base. All Rights Reserved