I have a list of data.frames. Within each data.frame I want to split

Asked: June 18, 2026

I have a list of data.frames. Within each data.frame I want to split by a grouping variable (z), run a function on each piece, put the results of the nested lapply back together in a data.frame, and then flatten the resulting list of data.frames into one data.frame.

library(plyr)
df <- data.frame(x = sample(1:200, 30000, replace = TRUE), 
                y = sample(1:200, 30000, replace = TRUE), 
                z = sample(LETTERS, 30000, replace = TRUE))

alist <- list(df,df,df) # longer in real life
answer <- lapply(alist, function(q) {
    a <- split(q,q$z)
    result.1 <- lapply(a, function(w) {
        neww <- cbind(w[,1],w[,2])
        result.2 <- colSums(neww)
    })
    ldply(result.1)
})
# colSums(neww) can actually be a variety of foos; I just use colSums() for easy reproducibility
ldply(answer)

This uses a lot of memory and is also slow. Thanks to @Andrie I know how to clear my workspace before I start, like this:

 rm(list=setdiff(ls(), "alist"))

But is there a way to modify my approach, such as discarding w in the second lapply, to reduce memory usage and speed things up? In this case foo wants a matrix, so data.table won't be my answer. For other foos I will need all of w, and its class will need to be data.frame.

1 Answer

Editorial Team · Answer 1 · 2026-06-18T03:06:27+00:00

Try something like this:

ldply(alist, ddply, "z", summarize, xy.foo = foo(x, y))

If you want x and y to show up in your final data.frame, replace summarize with transform. Also, looking at your foo usage, you might have to replace (x, y) with cbind(x, y).
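For illustration, here is a minimal sketch of the three variants described above. It assumes plyr is installed, uses a smaller data set than the question's, and substitutes cor() for foo, since the real foo is not shown. The seed and the result names (res_summary, res_matrix, res_full) are hypothetical and only there to make the sketch reproducible.

```r
library(plyr)

set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
n <- 3000     # smaller than the question's 30000 to keep the sketch quick
df <- data.frame(x = sample(1:200, n, replace = TRUE),
                 y = sample(1:200, n, replace = TRUE),
                 z = sample(LETTERS, n, replace = TRUE))
alist <- list(df, df, df)

# One row per (list element, z group); cor(x, y) stands in for foo(x, y).
res_summary <- ldply(alist, ddply, "z", summarize, xy.foo = cor(x, y))

# If foo expects a matrix, build it with cbind() inside the call.
res_matrix <- ldply(alist, ddply, "z", summarize,
                    xy.foo = cor(cbind(x, y))[1, 2])

# transform keeps the original x and y columns and repeats the
# per-group value on every row of that group.
res_full <- ldply(alist, ddply, "z", transform, xy.foo = cor(x, y))

head(res_summary)
```

With summarize the result has one row per group per list element; with transform it has as many rows as the input data.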

Also, I would recommend you profile your code. In the end, foo might be what is slowing you down, not the split/combine part.
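As a rough first check before reaching for a full profiler such as Rprof(), you could time the split step and the foo step separately with system.time(). The sketch below uses base R only; the foo defined here is a hypothetical stand-in built on colSums(), and the seed is an assumption added for reproducibility.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
df <- data.frame(x = sample(1:200, 30000, replace = TRUE),
                 y = sample(1:200, 30000, replace = TRUE),
                 z = sample(LETTERS, 30000, replace = TRUE))
alist <- list(df, df, df)

# Hypothetical stand-in for the real foo, which expects a matrix.
foo <- function(m) colSums(m)

# Time the split step on its own.
t_split <- system.time(
  pieces <- lapply(alist, function(q) split(q, q$z))
)

# Time foo on the already-split pieces.
t_foo <- system.time(
  results <- lapply(pieces, function(p) {
    lapply(p, function(w) foo(cbind(w$x, w$y)))
  })
)

# Elapsed seconds for each stage, side by side.
c(split = t_split[["elapsed"]], foo = t_foo[["elapsed"]])
```

If the foo stage dominates, rewriting the split/combine part is unlikely to help much; if the split stage dominates, that is where to focus.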
