I am running a simulation trying to find the probability of something taking place


Asked: May 27, 2026


I am running a simulation to estimate the probability of an event occurring in a number of binomial trials. I start by specifying the data:

iter <- 5000
data <- data.frame(prob = runif(300), value = runif(300))
# resample the 300 rows with replacement to get iter simulation rows
data <- data[sample(nrow(data), iter, replace = TRUE), ]

Then I add the trial columns:

cols <- c("one","two","three","four","five","six",
          "seven","eight","nine","ten","eleven","twelve")
data[,cols] <- NA

Column one contains the result of a single binomial trial, two contains the results of two binomial trials, and so on up to twelve. If the event occurs in at least one of a column's trials, the cell is marked 1, else 0.

Then I run the trials for iter=5000 simulations

for (col in 3:14) {
  for (i in 1:iter) if (sum(rbinom((col-2),1,data[i,1]))>0) data[i,col]<-1 else data[i,col]<-0
}

Then I evaluate mean(data$value[data$one==0]) through mean(data$value[data$twelve==0]).

My problem is that the simulation code takes forever for iter>15000.

  for (col in 3:14) {
    for (i in 1:iter)
      data[i,col] <- if (sum(rbinom((col-2),1,data[i,1]))>0) 1 else 0
  }

Any ideas?


1 Answer

Editorial Team · Answer 1 · 2026-05-27T11:33:01+00:00

sim2 <- function(iter) {
    dat <- data.frame(prob=runif(300), value=runif(300))
    dat <- dat[sample(nrow(dat), iter, replace=TRUE),]
    cols <- c("one","two","three","four","five","six",
              "seven","eight","nine","ten","eleven","twelve")
    dat[,cols] <- 0

    for (col in 3:14) {
        dat[,col] <- as.numeric(vapply(dat[,1],
                                       function(p) {sum(rbinom((col-2), 1, p))>0},
                                       FUN.VALUE = TRUE))
    }
    vapply(3:14, function(col) {mean(dat$value[dat[,col]==0])}, FUN.VALUE=1)
}

For iter of 16000, this runs in 2.29s on my machine, compared to an estimated 1781s for the loop in your original code. In general, don’t assign individual elements of a data frame when you can assign a whole column at once. There may be more improvements possible, but I’ll stop at a >750x speedup (and at changing the running time from O(n^2) to O(n)).
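One possible further improvement, offered as a sketch rather than a measured result: the sum of k Bernoulli(p) draws is a Binomial(k, p) count, and rbinom() accepts a vector of probabilities, so each column can be drawn in a single call with no per-row function. The name sim3 is hypothetical and not part of the answer above; it skips storing the twelve columns and returns the twelve means directly. It has not been timed here.

```r
# Sketch only: sim3 is a hypothetical name mirroring sim2 above.
sim3 <- function(iter) {
    dat <- data.frame(prob = runif(300), value = runif(300))
    dat <- dat[sample(nrow(dat), iter, replace = TRUE), ]
    # For each k in 1..12, draw one Binomial(k, p) count per row in one call;
    # a count > 0 means at least one success in k trials.
    vapply(1:12, function(k) {
        hit <- rbinom(iter, size = k, prob = dat$prob) > 0
        # mean of value over the rows where no trial succeeded
        mean(dat$value[!hit])
    }, FUN.VALUE = 1)
}

set.seed(1)          # fixed seed only so the sketch is repeatable
res <- sim3(16000)
```

To compare on your own machine, wrap each call in system.time(), for example system.time(sim2(16000)) and system.time(sim3(16000)).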
