I am pretty sure I am complicating things. I have a data frame with

Question

Asked: May 27, 2026 (2026-05-27T10:59:40+00:00)

I am pretty sure I am overcomplicating things. I have a data frame with p variables (here: v1 to v3) and two factor variables (here: sex and unemp):

> head(df)
  sex unemp v1 v2 v3
1   0     0  2  4  4
2   0     0  2  1  1
3   1     0  3  3  5
4   1     1  2  3  5
5   0     0  1  2  5
6   1     0  3  5  4

I would now like to compute the median and mean of each variable by group and rearrange the summary table so that the resulting data frame looks like this (one table each for men and women):

> df.res.men
   median.unemp.1 median.unemp.0 mean.unemp.1 mean.unemp.0
v1            2.0            2.0     2.666667     2.391304
v2            2.0            3.5     2.500000     3.369565
v3            4.5            3.0     4.166667     2.956522

Here is the full code:

library(plyr)
## generate data
set.seed(1)
df <- data.frame(sex=rbinom(100, 1, 0.5),
                 unemp=rbinom(100, 1, 0.2),
                 v1=sample(1:5, 100, replace=TRUE),
                 v2=sample(1:5, 100, replace=TRUE),
                 v3=sample(1:5, 100, replace=TRUE)
                 )
head(df)

## compute mean and median for all variables by sex and unemp
df.mean <- ddply(df, .(unemp, sex), .fun=colMeans, na.rm=TRUE)
df.mean
df.median <- ddply(df, .(unemp, sex), .fun=function(x) apply(x, 2, median, na.rm=TRUE))
df.median

## rearrange summary table (sex == 0 is taken to be men here)
df.res.men <- cbind(t(subset(df.median, sex==0 & unemp==1)),
                    t(subset(df.median, sex==0 & unemp==0)),
                    t(subset(df.mean, sex==0 & unemp==1)),
                    t(subset(df.mean, sex==0 & unemp==0)))
## drop the first two rows (unemp and sex), keeping only v1 to v3
df.res.men <- df.res.men[-c(1:2),]
colnames(df.res.men) <- c("median.unemp.1", "median.unemp.0", 
                          "mean.unemp.1", "mean.unemp.0")
df.res.men

1 Answer

Editorial Team · Answer 1 · 2026-05-27T10:59:40+00:00

Here is one approach, using plyr and reshape2:

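library(plyr); library(reshape2)
## long format: one row per observation of each variable, keyed by sex and unemp
dfm <- melt(df, id = c('sex', 'unemp'))
## mean and median of each variable within each unemp/sex group
df2 <- ddply(dfm, .(variable, unemp, sex), summarize, 
  avg = mean(value), med = median(value))

## melt again so the summary function becomes a column, then cast wide for sex == 0
df2m <- melt(df2, id = 1:3, variable.name = 'sum_fun')
df_0 <- dcast(df2m, sex + variable ~ sum_fun + unemp, subset = .(sex == 0))
df_0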

  sex variable    avg_0  avg_1 med_0 med_1
1   0       v1 2.794872 3.0000     3   3.5
2   0       v2 3.102564 2.8750     3   3.0
3   0       v3 3.205128 3.1875     3   4.0
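
If you want exactly the layout from the question (variables as rows, with the four columns median.unemp.1, median.unemp.0, mean.unemp.1, mean.unemp.0), a base R alternative might look like the sketch below. It is not the plyr/reshape2 approach above; it swaps in split() and sapply(). It assumes that sex == 0 codes men (as the question's df.res.men suggests) and that both unemp groups are non-empty for that sex. The names vars, men and by_unemp are only illustrative.

```r
## Base R sketch; the data-generation step is copied from the question.
set.seed(1)
df <- data.frame(sex   = rbinom(100, 1, 0.5),
                 unemp = rbinom(100, 1, 0.2),
                 v1 = sample(1:5, 100, replace = TRUE),
                 v2 = sample(1:5, 100, replace = TRUE),
                 v3 = sample(1:5, 100, replace = TRUE))

## columns to summarise (illustrative name)
vars <- c("v1", "v2", "v3")

## Assumption: sex == 0 codes men, as in the question's df.res.men.
men <- df[df$sex == 0, ]

## Split the value columns by unemployment status:
## a list of two data frames named "0" and "1".
by_unemp <- split(men[vars], men$unemp)

## sapply() over a data frame applies the function column by column,
## returning a numeric vector named v1, v2, v3; cbind() turns those
## names into row names and the argument names into column names.
df.res.men <- cbind(
  median.unemp.1 = sapply(by_unemp[["1"]], median, na.rm = TRUE),
  median.unemp.0 = sapply(by_unemp[["0"]], median, na.rm = TRUE),
  mean.unemp.1   = sapply(by_unemp[["1"]], mean,   na.rm = TRUE),
  mean.unemp.0   = sapply(by_unemp[["0"]], mean,   na.rm = TRUE)
)
df.res.men
```

The result is a numeric matrix, not a data frame; wrap it in as.data.frame() if you need one. For women, filter on sex == 1 instead. The exact numbers depend on your R version, because the default behaviour of sample() changed in R 3.6.0.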
