I am pretty sure I am complicating things. I have a data frame with p variables (here: v1 to v3) and two factor variables (here: sex and unemp):
> head(df)
  sex unemp v1 v2 v3
1   0     0  2  4  4
2   0     0  2  1  1
3   1     0  3  3  5
4   1     1  2  3  5
5   0     0  1  2  5
6   1     0  3  5  4
I would now like to summarize my data (i.e. compute the median and mean and then rearrange the summary table) in such a way that the resulting data frame looks like this (for men or women):
> df.res.men
   median.unemp.1 median.unemp.0 mean.unemp.1 mean.unemp.0
v1            2.0            2.0     2.666667     2.391304
v2            2.0            3.5     2.500000     3.369565
v3            4.5            3.0     4.166667     2.956522
Here is the full code:
library(plyr)
## generate data
set.seed(1)
df <- data.frame(sex=rbinom(100, 1, 0.5),
                 unemp=rbinom(100, 1, 0.2),
                 v1=sample(1:5, 100, replace=TRUE),
                 v2=sample(1:5, 100, replace=TRUE),
                 v3=sample(1:5, 100, replace=TRUE)
                 )
head(df)
## compute mean and median for all variables by sex and unemp
df.mean <- ddply(df, .(unemp, sex), .fun=colMeans, na.rm=TRUE)
df.mean
df.median <- ddply(df, .(unemp, sex), .fun=function(x)apply(x,2,median, na.rm=TRUE))
df.median
## rearrange summary table
df.res.men <- cbind(t(subset(df.median, sex==0 & unemp==1)),
                    t(subset(df.median, sex==0 & unemp==0)),
                    t(subset(df.mean, sex==0 & unemp==1)),
                    t(subset(df.mean, sex==0 & unemp==0)))
## drop the first two rows (the grouping variables unemp and sex)
df.res.men <- df.res.men[-c(1:2),]
colnames(df.res.men) <- c("median.unemp.1", "median.unemp.0",
                          "mean.unemp.1", "mean.unemp.0")
df.res.men
Here is one approach
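A minimal sketch in base R, offered as one possible way to get the layout asked for in the question rather than a definitive solution. It assumes, as the question's code does, that sex == 0 codes men and that both unemp levels occur in the subset. The helper name summarise_by_unemp and its vars argument are made up for illustration.

```r
## regenerate the example data from the question
set.seed(1)
df <- data.frame(sex   = rbinom(100, 1, 0.5),
                 unemp = rbinom(100, 1, 0.2),
                 v1 = sample(1:5, 100, replace = TRUE),
                 v2 = sample(1:5, 100, replace = TRUE),
                 v3 = sample(1:5, 100, replace = TRUE))

## hypothetical helper: rows are variables, columns are statistic/unemp pairs
summarise_by_unemp <- function(d, vars = c("v1", "v2", "v3")) {
  stat_cols <- function(f, label) {
    # split the value columns by unemp, then apply f to every column
    out <- sapply(split(d[vars], d$unemp),
                  function(g) sapply(g, f, na.rm = TRUE))
    # order columns as unemp = 1, then unemp = 0 (assumes both levels exist)
    out <- out[, c("1", "0"), drop = FALSE]
    colnames(out) <- paste(label, "unemp", colnames(out), sep = ".")
    out
  }
  cbind(stat_cols(median, "median"), stat_cols(mean, "mean"))
}

## men only; use sex == 1 for the other group
df.res.men <- summarise_by_unemp(subset(df, sex == 0))
df.res.men
```

Splitting once per statistic avoids the four separate subset()/t() calls and the manual row dropping, and the same helper can be reused for the other sex by changing the subset.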