I am having a terrible time running ‘ddply’ over two variables in what seems

Question

Editorial Team

Asked: June 15, 2026 (2026-06-15T22:15:03+00:00)

I am having a terrible time running ‘ddply’ over two variables in what seems like it should be a simple command.

Sample data (df):

Brand  Day  Rev       RVP
A      1    2535.00   195.00
B      1    1785.45   43.55
C      1    1730.87   32.66
A      2    920.00    230.00
B      2    248.22    48.99
C      3    16466.00  189.00
A      1    2535.00   195.00
B      3    1785.45   43.55
C      3    1730.87   32.66
A      4    920.00    230.00
B      5    248.22    48.99
C      4    16466.00  189.00

I am using the command:

df2<-ddply(df, .(Brand, Day), summarize, Rev=mean(Rev), RVP=sum(RVP))

My dataframe has about 2600 observations, and there are 45 levels of “Brand” and up to 300 levels of “Day” (which is coded using ‘difftime’).

I am able to easily use ‘ddply’ when simply grouping by “Day,” but when I also try to group by “Brand,” my computer freezes up.

Thoughts?

1 Answer

Editorial Team · Answer 1 · 2026-06-15T22:15:04+00:00

You should read through the help pages for aggregate, by, ave, and tapply, paying close attention to the names of the arguments each one expects and to the types those arguments must have. Then run all of the examples, or demo(). The main thing @hadley did with pkg:plyr and reshape/reshape2 was to impose some degree of regularity, but it came at the expense of speed. I do understand why he did it, especially when I try to use the base::reshape function, and also when I forget, as I repeatedly do, which of these functions requires a list, which requires the FUN= argument label, and which needs interaction() for the grouping variable, since they are all somewhat different.
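To make those differences concrete, here is a minimal sketch of how tapply and ave take their grouping arguments, using the sample rows from the question. It assumes Day is stored as a plain number rather than a difftime, and the result names (rev_tab, rev_by_cell, rev_mean_per_row) are made up for illustration.

```r
# Sample rows from the question; Day is plain numeric here (an assumption:
# the original column is a difftime).
df <- data.frame(
  Brand = rep(c("A", "B", "C"), times = 4),
  Day   = c(1, 1, 1, 2, 2, 3, 1, 3, 3, 4, 5, 4),
  Rev   = rep(c(2535.00, 1785.45, 1730.87, 920.00, 248.22, 16466.00), times = 2),
  RVP   = rep(c(195.00, 43.55, 32.66, 230.00, 48.99, 189.00), times = 2)
)

# tapply: several grouping variables must be wrapped in a list.
# The result is a Brand x Day matrix, with NA for combinations that never occur.
rev_tab <- tapply(df$Rev, list(df$Brand, df$Day), mean)

# tapply with interaction(): one combined grouping factor, so the result is a
# named vector ("A.1", "B.1", ...) holding only the combinations that occur.
rev_by_cell <- tapply(df$Rev, interaction(df$Brand, df$Day, drop = TRUE), mean)

# ave: grouping variables are passed as separate unnamed arguments, so FUN
# must be given by name. The result has one value per input row.
rev_mean_per_row <- ave(df$Rev, df$Brand, df$Day, FUN = mean)
```

All three compute the same group means of Rev; they differ only in how the groups are passed in and in the shape of what comes back.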

> aggregate(df[3:4], df[1:2], function(d) mean(d) )
   Brand Day       Rev    RVP
1      A   1  2535.000 195.00
2      B   1  1785.450  43.55
3      C   1  1730.870  32.66
4      A   2   920.000 230.00
5      B   2   248.220  48.99
6      B   3  1785.450  43.55
7      C   3  9098.435 110.83
8      A   4   920.000 230.00
9      C   4 16466.000 189.00
10     B   5   248.220  48.99
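
Note that the call above takes the mean of both columns, whereas the ddply call in the question takes the mean of Rev and the sum of RVP. One way to get that with base R is sketched below: aggregate each column with its own function, then merge on the grouping columns. The names rev_mean and rvp_sum are hypothetical. The as.numeric() step is only a guess at what may help with a difftime Day column; it has not been shown to be the cause of the freeze.

```r
# Sample rows from the question; Day is plain numeric here (an assumption:
# the original column is a difftime).
df <- data.frame(
  Brand = rep(c("A", "B", "C"), times = 4),
  Day   = c(1, 1, 1, 2, 2, 3, 1, 3, 3, 4, 5, 4),
  Rev   = rep(c(2535.00, 1785.45, 1730.87, 920.00, 248.22, 16466.00), times = 2),
  RVP   = rep(c(195.00, 43.55, 32.66, 230.00, 48.99, 189.00), times = 2)
)

# For a difftime column this drops the class and keeps the numbers;
# for the numeric column used here it changes nothing.
df$Day <- as.numeric(df$Day)

# One aggregate call per summary function, using the formula interface.
rev_mean <- aggregate(Rev ~ Brand + Day, data = df, FUN = mean)
rvp_sum  <- aggregate(RVP ~ Brand + Day, data = df, FUN = sum)

# Join the two summaries on the grouping columns.
df2 <- merge(rev_mean, rvp_sum, by = c("Brand", "Day"))
```

df2 then has one row per Brand and Day combination that occurs in the data, with columns Brand, Day, Rev (mean) and RVP (sum).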
