I have a data.frame with 20 columns. The first two are factors, and the

Question

Asked: May 20, 2026 (2026-05-20T23:51:04+00:00)

I have a data.frame with 20 columns. The first two are factors, and the rest are numeric. I’d like to use the first two columns as split variables and then apply mean() to each of the remaining columns.

This seems like a quick and easy job for ddply(); however, the output data.frame is not what I am looking for. Here is a minimal example with just one column of data:

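library(plyr)  # provides ddply() and .()

Aa <- rep(c("A", "a"), each = 20)
Bb <- rep(c("B", "b", "B", "b"), each = 10)
x <- runif(40)
df1 <- data.frame(Aa, Bb, x)

# mean() receives each whole piece (a data.frame), not just the x column
ddply(df1, .(Aa, Bb), mean)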

The output is:

  Aa Bb         x
1 NA NA 0.5193275
2 NA NA 0.4491907
3 NA NA 0.4848128
4 NA NA 0.4717899
Warning messages:
1: In mean.default(X[[1L]], ...) :
  argument is not numeric or logical: returning NA

The warning is repeated 8 times, presumably once for each call to mean(). I’m guessing this comes from trying to take the mean of a factor. I could write this as:

ddply(df1, .(Aa, Bb), function(df1) mean(df1$x))

or

ddply(df1, .(Aa, Bb), summarize, x = mean(x))

both of which do work (not giving NAs), but I would rather avoid writing out 18 such x = mean(x) statements, one for each of my numeric columns.

Is there a general solution? I’m not wedded to ddply if there is a better answer elsewhere.

1 Answer

Editorial Team · Answer 1 · 2026-05-20T23:51:04+00:00

Since you are reducing the number of rows, you need to use summarise:

> ddply(df1, .(Aa, Bb), summarise, mean_x = mean(x))
  Aa Bb    mean_x
1  a  b 0.3790675
2  a  B 0.4242922
3  A  b 0.5622329
4  A  B 0.4574471
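
To avoid writing one x = mean(x) term per column, one option is plyr's numcolwise(), which wraps a function so that it is applied to every numeric column of each piece. The following is only a sketch: df2, its second numeric column y, and the seed are made up for illustration and are not part of the original data.

```r
library(plyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical data: two factor columns plus two numeric columns
df2 <- data.frame(
  Aa = rep(c("A", "a"), each = 20),
  Bb = rep(c("B", "b", "B", "b"), each = 10),
  x  = runif(40),
  y  = runif(40),
  stringsAsFactors = TRUE
)

# numcolwise(mean) returns a function that takes the mean of each
# numeric column in the piece it is given; ddply() supplies the pieces
res <- ddply(df2, .(Aa, Bb), numcolwise(mean))
res
```

The same call should work unchanged with 18 numeric columns, since the columns are selected by type rather than by name.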

It’s just as easy to use aggregate in this instance. Let’s say you had two variables:

> aggregate(df1[-(1:2)], df1[1:2], mean)
  Aa Bb         x         y
1  a  b 0.4249121 0.4639192
2  A  b 0.6127175 0.4639192
3  a  B 0.4522292 0.4826715
4  A  B 0.5201965 0.4826715
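
For reference, here is a self-contained sketch of the aggregate() approach with two numeric columns, together with the equivalent formula interface. The data frame df2, the column y, and the seed are assumptions for illustration only.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical data: two factor columns plus two numeric columns
df2 <- data.frame(
  Aa = rep(c("A", "a"), each = 20),
  Bb = rep(c("B", "b", "B", "b"), each = 10),
  x  = runif(40),
  y  = runif(40),
  stringsAsFactors = TRUE
)

# Positional form: every column except the first two, grouped by the first two
by_index <- aggregate(df2[-(1:2)], df2[1:2], mean)

# Formula form: "." stands for all columns not named on the right-hand side
by_formula <- aggregate(. ~ Aa + Bb, data = df2, FUN = mean)

by_formula
```

Both forms give the same group means here. Note that the formula interface drops rows containing NA by default (na.action = na.omit), so the two can differ on data with missing values.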