I have a dataframe running into about 500,000 rows. One of these columns contains

Asked: June 7, 2026 (2026-06-07T06:53:15+00:00)

I have a data frame with about 500,000 rows. One of its columns, say column A, contains positive integer values. Let there be another column, B.

I now need to create a second data frame with a number of rows equal to sum(dataframe$A). This is done.

A question of performance arises when I need to fill this new data frame with data. I am trying to create a column A2 for this second data frame as follows:

A2<-vector() 
for (i in 1:nrow(dataframe)){
  A2<-c(A2,rep(dataframe$B[i],dataframe$A[i]))
}

The explicit loop is obviously very slow for the large number of rows being processed. Any suggestions on how to achieve this task with faster processing?

Thanks for responses

1 Answer

Editorial Team · Answer 1 · 2026-06-07T06:53:16+00:00

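You simply do not need the loop at all. rep is already vectorized: when its second argument is a vector of the same length as the first, each element is repeated the corresponding number of times.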

A2 <- rep(dataframe$B, dataframe$A)

This should work. As a reproducible example, here is your loop approach applied to the built-in mtcars dataset:

x <- vector()
for(i in 1:nrow(mtcars)) {x <- c(x, rep(mtcars$cyl[i], mtcars$gear[i]))}
> x
  [1] 6 6 6 6 6 6 6 6 4 4 4 4 6 6 6 8 8 8 6 6 6 8 8 8 4 4 4 4 4 4 4 4 6 6 6 6 6
 [38] 6 6 6 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 8
 [75] 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 8 8 8 8 8 6 6 6 6 6 8 8
[112] 8 8 8 4 4 4 4

And here is the vectorized version, which gives the same result:

x2 <- rep(mtcars$cyl, mtcars$gear)
> x2
  [1] 6 6 6 6 6 6 6 6 4 4 4 4 6 6 6 8 8 8 6 6 6 8 8 8 4 4 4 4 4 4 4 4 6 6 6 6 6
 [38] 6 6 6 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 8
 [75] 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 4 4 4 4 4 4 4 4 4 8 8 8 8 8 6 6 6 6 6 8 8
[112] 8 8 8 4 4 4 4

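The vectorized call will be orders of magnitude faster than the loop on large inputs, because growing a vector with c() inside a loop copies the whole accumulated vector on every iteration.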
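To get a rough feel for the difference on your own machine, something like the following sketch could be used. The data frame df, its size n, and the contents of its columns are made up for illustration and only stand in for your real data; actual timings will depend on your hardware and on the real data.

```r
# Hypothetical toy data standing in for the asker's data frame:
# A holds positive integer counts, B holds the values to repeat.
set.seed(1)
n <- 10000
df <- data.frame(A = sample(1:5, n, replace = TRUE),
                 B = sample(letters, n, replace = TRUE),
                 stringsAsFactors = FALSE)

# Loop version from the question: the result is regrown on every iteration.
loop_time <- system.time({
  A2_loop <- vector()
  for (i in seq_len(nrow(df))) {
    A2_loop <- c(A2_loop, rep(df$B[i], df$A[i]))
  }
})

# Vectorized version: a single call to rep with a vector of counts.
vec_time <- system.time({
  A2_vec <- rep(df$B, df$A)
})

# Both approaches should produce the same vector of length sum(df$A).
same_result <- identical(A2_loop, A2_vec)

print(loop_time)
print(vec_time)
print(same_result)
```

Note that rep expects the counts to be non-negative and free of NA values, so it may be worth checking column A before relying on this.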
