I have a problem appending values to a data frame when using parallel processing.
I have a function that does some calculations, one of which is a random sampling, and returns a data frame.
So what I did is:
randomizex <- function(testdf)
{
  foreach(ind=1:1000) %dopar%
  {
    testdf$X = sample(testdf$X, nrow(testdf), replace=FALSE)
    fit = lm(X ~ Y, testdf)
    newdf <- rbind(newdf, data.frame(pc=ind, err=sum(residuals(fit)^2)))
  }
  return(newdf)
}
resdf = randomizex(mydf)
When I view resdf, it's empty.
If I replace %dopar% with %do%, the result is calculated correctly, but it's too slow.
Is there any way to speed this up a bit?
I think you need to read the docs for foreach. Your code block should compute a single part, then you should use the .combine option to say how to join them all together. Look at the examples in help(foreach) for more guidance. It's not a straight replacement for a for loop. For example:
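Here is a minimal sketch of that approach. It assumes the doParallel backend with two workers, and it uses made-up data for mydf; neither is in the question. The n_iter argument is also added here purely for illustration.

```r
library(foreach)
library(doParallel)  # assumption: any registered foreach backend would work

randomizex <- function(testdf, n_iter = 1000) {
  # Each iteration returns a one-row data frame.
  # .combine = rbind stacks those rows into the final result.
  foreach(ind = seq_len(n_iter), .combine = rbind) %dopar% {
    shuffled <- testdf
    shuffled$X <- sample(shuffled$X, nrow(shuffled), replace = FALSE)
    fit <- lm(X ~ Y, data = shuffled)
    data.frame(pc = ind, err = sum(residuals(fit)^2))
  }
}

# Register a parallel backend (two workers is an arbitrary choice).
cl <- makeCluster(2)
registerDoParallel(cl)

# Hypothetical stand-in for the asker's mydf.
mydf <- data.frame(X = rnorm(50), Y = rnorm(50))
resdf <- randomizex(mydf, n_iter = 100)

stopCluster(cl)
```

The likely reason the original came back empty is that with %dopar% each iteration runs in a separate worker, so the assignment to newdf there never reaches the calling session. Returning the row as the value of the block and letting .combine assemble the pieces avoids relying on shared state.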