In a data management step of my analysis I ran into the following problem.
Each id is recorded up to 5 times, and I have a time-varying variable of interest, tv, which takes the values 1, 2, 3, 4. Suppose my data are:
dat <- read.table(text = "
id tv
1 2
1 2
1 1
1 4
2 4
2 1
2 4
3 1
3 2
3 3
3 3
3 2",
header=TRUE)
What I need to do is create two new sets of variables from tv, in order to obtain:
id tv tv1 tv2 tv3 tv4 tv5 dur1 dur2 dur3 dur4 dur5
1 2 2 1 4 0 0 2 1 1 0 0
1 2 2 1 4 0 0 2 1 1 0 0
1 1 2 1 4 0 0 2 1 1 0 0
1 4 2 1 4 0 0 2 1 1 0 0
2 4 4 1 4 0 0 1 1 1 0 0
2 1 4 1 4 0 0 1 1 1 0 0
2 4 4 1 4 0 0 1 1 1 0 0
3 1 1 2 3 2 0 1 1 2 1 0
3 2 1 2 3 2 0 1 1 2 1 0
3 3 1 2 3 2 0 1 1 2 1 0
3 3 1 2 3 2 0 1 1 2 1 0
3 2 1 2 3 2 0 1 1 2 1 0
For each id, tv1–tv5 hold the ordered sequence of tv values with consecutive repeats collapsed (so a value can reappear later, as 4 does for id 2), and dur1–dur5 hold the number of consecutive records in the original dataset dat that each of those values spans. Unused slots are filled with 0.
I really don’t know how to proceed here. Any help will be greatly appreciated.
This should do it:
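One possible approach, as a minimal sketch in base R: run `rle()` on tv within each id, pad the run values and run lengths with zeros up to 5 slots, and attach the result to every row of that id. The helper name `runs_per_id` and its `max_runs` argument are made up for illustration, and the sketch assumes that no id has more than 5 runs and that the rows of dat are already in time order within each id.

```r
# Example data from the question
dat <- read.table(text = "
id tv
1 2
1 2
1 1
1 4
2 4
2 1
2 4
3 1
3 2
3 3
3 3
3 2",
header = TRUE)

# Hypothetical helper: run-length encode tv for one id, then pad the
# run values and run lengths with zeros up to max_runs slots.
# Assumes the id has at most max_runs runs (otherwise rep() fails).
runs_per_id <- function(tv, max_runs = 5) {
  r <- rle(tv)
  pad <- rep(0, max_runs - length(r$values))
  out <- c(r$values, pad, r$lengths, pad)
  names(out) <- c(paste0("tv", seq_len(max_runs)),
                  paste0("dur", seq_len(max_runs)))
  out
}

# One row per id with columns tv1..tv5 and dur1..dur5
wide <- as.data.frame(t(sapply(split(dat$tv, dat$id), runs_per_id)))

# Repeat each id's row for every original record, keeping the row order of dat
idx <- match(as.character(dat$id), rownames(wide))
res <- cbind(dat, wide[idx, ])
rownames(res) <- NULL

print(res)
```

Using `match()` rather than `merge()` keeps the rows in the same order as dat, which matters here because the order of tv within each id is what defines the runs.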