I have data arranged like this in R:
indv time val
A 6 5
A 10 10
A 12 7
B 8 4
B 10 3
B 15 9
For each individual (indv) at each time, I want to calculate the change in value (val) from the initial time. So I would end up with something like this:
indv time val val_1 val_change
A 6 5 5 0
A 10 10 5 5
A 12 7 5 2
B 8 4 4 0
B 10 3 4 -1
B 15 9 4 5
Can anyone tell me how I might do this? I can use
ddply(df, .(indv), function(x)x[which.min(x$time), ])
to get a table like
indv time val
A 6 5
B 8 4
However, I cannot figure out how to make a column val_1 in which each individual's value at its earliest time is repeated on every row for that individual. If I can do that, I should be able to add column val_change using something like:
df['val_change'] = df['val'] - df['val_1']
EDIT: Two excellent methods were posted below; however, both rely on my time column being sorted so that earlier times appear above later ones. I’m not sure this will always be the case with my data. (I know I can sort first in Excel, but I’m trying to avoid that.) How could I handle a table that looks like this?
indv time val
A 10 10
A 6 5
A 12 7
B 8 4
B 10 3
B 15 9
Here’s a plyr solution using ddply.
To get your second table, try this:
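A minimal sketch of what this could look like, assuming the data frame is called df as in the question. It uses val[which.min(time)] to pick each individual's value at its earliest time, so this particular version does not depend on row order; the name result is just for illustration.

```r
library(plyr)

# Example data from the question
df <- data.frame(
  indv = c("A", "A", "A", "B", "B", "B"),
  time = c(6, 10, 12, 8, 10, 15),
  val  = c(5, 10, 7, 4, 3, 9)
)

# For each individual, take val at the earliest time as the baseline (val_1)
# and subtract it from every val in that group.
# transform() cannot refer to val_1 while creating it, so the baseline
# expression is repeated in val_change.
result <- ddply(df, .(indv), transform,
                val_1      = val[which.min(time)],
                val_change = val - val[which.min(time)])

print(result)
```

If you prefer to write val_change = val - val_1, plyr's mutate can be used in place of transform, since it evaluates the new columns one after another.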
Edit 1
To deal with unsorted data, like the table you posted in your edit, try sorting the dataframe by indv and time first.
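One way to do that sort, sketched here in base R with order(); the name sorted_df is only illustrative, and the column is called val to match the first table.

```r
# Unsorted example data from the question's edit
df <- data.frame(
  indv = c("A", "A", "A", "B", "B", "B"),
  time = c(10, 6, 12, 8, 10, 15),
  val  = c(10, 5, 7, 4, 3, 9)
)

# Order rows by individual, then by time within each individual
sorted_df <- df[order(df$indv, df$time), ]

# Reset row names so they run 1..n again after reordering
rownames(sorted_df) <- NULL

print(sorted_df)
```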
Now you can apply the procedure described above to this sorted dataframe.
Edit 2
A shorter way to sort your dataframe is using the sortBy function from the doBy package.
Edit 3
You can even sort your df using ddply.
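A hedged sketch of how that might look: ddply splits df by indv, and an anonymous function reorders each piece by time. The name sorted_df is an assumption for illustration.

```r
library(plyr)

# Unsorted example data from the question's edit
df <- data.frame(
  indv = c("A", "A", "A", "B", "B", "B"),
  time = c(10, 6, 12, 8, 10, 15),
  val  = c(10, 5, 7, 4, 3, 9)
)

# Split by individual, sort each piece by time, and bind the pieces back together
sorted_df <- ddply(df, .(indv), function(x) x[order(x$time), ])

print(sorted_df)
```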