Background Now, I have a data frame shaped like: example = structure(list(sid = c(39,

Question

Editorial Team

Asked: June 15, 2026 (2026-06-15T19:50:56+00:00)

Background

Now, I have a data frame shaped like:

example = structure(list(sid = c(39, 40, 41, 42, 42, 43, 43, 44, 45, 45, 
46, 46, 47, 48, 49, 49, 50, 51, 52, 52, 53), monthday = c("42", 
"44", "46", "410", "428", "423", "49", "411", "416", "430", "418", 
"426", "419", "420", "420", "53", "421", "424", "425", "53", 
"511")), .Names = c("sid", "monthday"), row.names = c(301L, 300L, 
298L, 296L, 282L, 288L, 297L, 295L, 294L, 281L, 293L, 285L, 292L, 
291L, 290L, 278L, 289L, 287L, 286L, 279L, 270L), class = "data.frame")

In other words, it is tall:

sid   monthday  
39     42        
40     44         
41     46        
42    410        
42    428
43    423        
43     49

Ultimately, I would like to make it into a wide format:

sid   monthday1  monthday2
39     42         NA
40     44         NA
41     46         NA
42    410        428
43    423        49

etc

I’ve been trying things with the reshape and reshape2 packages, and also with aggregate, like:

library(reshape2)
temp = melt(example,id.vars=c("sid"))
data.wide <- dcast(temp, sid ~ variable, value.var="value")

But I can’t wrap my brain around it. It occurs to me that if I could identify the occurrence of each sid, I could solve my problem.

Immediate Problem

So how can I take the sid column of the tall data above and make a new variable that indicates the occurrence of each sid:

sid   occur 
39     1   
40     1   
41     1   
42     1
42     2 
43     1
43     2

The occur variable indicates that sid values 39, 40, and 41 appear only once, while 42 and 43 have first and second instances. If I only ever had two instances, I could use duplicated() and convert that to numeric, but what is a solution that generalizes to an arbitrary number of instances?
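
To illustrate the limitation just described, here is a minimal sketch on a made-up vector (sid_toy and occur_dup are hypothetical names, not part of the data above). With three instances of the same sid, duplicated() cannot tell the second from the third:

```r
# Hypothetical toy vector: 42 appears three times
sid_toy <- c(39, 42, 42, 42, 43)

# duplicated() is FALSE for the first instance of a value
# and TRUE for every later instance, so adding 1 gives 1 or 2 only
occur_dup <- as.numeric(duplicated(sid_toy)) + 1
occur_dup
```

Both the second and the third 42 end up with the value 2, which is why this only works when no sid appears more than twice.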

1 Answer

Editorial Team · Answer 1 · 2026-06-15T19:50:58+00:00

You can use ave to generate your “times”:

example$time <- ave(example$sid, example$sid, FUN = seq_along)
head(example)
#     sid monthday time
# 301  39       42    1
# 300  40       44    1
# 298  41       46    1
# 296  42      410    1
# 282  42      428    2
# 288  43      423    1
reshape(example, direction = "wide", idvar="sid", timevar="time")
#    sid monthday.1 monthday.2
# 301  39         42       <NA>
# 300  40         44       <NA>
# 298  41         46       <NA>
# 296  42        410        428
# 288  43        423         49
# 295  44        411       <NA>
# 294  45        416        430
# 293  46        418        426
# 292  47        419       <NA>
# 291  48        420       <NA>
# 290  49        420         53
# 289  50        421       <NA>
# 287  51        424       <NA>
# 286  52        425         53
# 270  53        511       <NA>
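
To see why this generalizes beyond two instances, here is a small sketch on a made-up vector (the values and the name sid_toy are assumptions for illustration only). ave() splits its first argument by the grouping vector, applies FUN within each group, and puts the results back in the original positions, so seq_along simply counts 1, 2, 3, ... within each sid, even when the rows for a sid are not adjacent:

```r
# Hypothetical toy vector: 42 appears three times, 43 twice, not all adjacent
sid_toy <- c(39, 42, 42, 43, 42, 43)

# Within each distinct sid value, number the rows in order of appearance
occur <- ave(sid_toy, sid_toy, FUN = seq_along)

data.frame(sid = sid_toy, occur = occur)
```

The third 42 should get occur 3, so the same call covers any number of repeats.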

Or, with dcast from “reshape2” after adding your time variable:

library(reshape2)
dcast(example, sid ~ time, value.var = "monthday")
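
If you want the column names from the question (monthday1, monthday2) rather than monthday.1, monthday.2, one option is the sep argument of base reshape(). The sketch below is self-contained and uses a cut-down copy of the data (toy and wide are hypothetical names), so treat it as an illustration rather than the only way to do it:

```r
# Cut-down, hypothetical copy of the question's data
toy <- data.frame(
  sid = c(39, 42, 42, 43, 43),
  monthday = c("42", "410", "428", "423", "49"),
  stringsAsFactors = FALSE
)

# Occurrence counter per sid, as in the answer above
toy$time <- ave(toy$sid, toy$sid, FUN = seq_along)

# sep = "" joins the variable name and the time value without a dot
wide <- reshape(toy, direction = "wide", idvar = "sid",
                timevar = "time", sep = "")
names(wide)
```

With dcast the new columns are named after the time values ("1", "2"), so you would rename them afterwards if you need a prefix.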
