I would like to categorize a column in a data frame based on a comparison of consecutive rows.
For example, given:
DF <- data.frame(respondent=rep(letters[1:2], each=5), response=c(1,1,2,2,1,3,1,1,1,1))
   respondent response
1           a        1
2           a        1
3           a        2
4           a        2
5           a        1
6           b        3
7           b        1
8           b        1
9           b        1
10          b        1
I would like to add a new column (e.g. check) that is 1 if the response in a row equals the response in the next row, and 0 if they differ. This should be done separately for each respondent.
This would give me:
   respondent response check
1           a        1     1
2           a        1     0
3           a        2     1
4           a        2     0
5           a        1
6           b        3     0
7           b        1     1
8           b        1     1
9           b        1     1
10          b        1
I think I can figure this out with a for loop, but it seems to be a suitable problem for ddply … I just do not see how to handle comparisons between adjacent rows …
This is a good candidate for plyr, since it splits the data, applies a function to each piece, and then returns the combined result. In this case, you need to consider the whole
response vector and a similar vector shifted by one. The way I have approached this problem in the past is:
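A minimal sketch of the shifted-vector idea, not necessarily the exact code used before. It assumes the plyr package is installed, and check_next is a hypothetical helper name introduced only for this illustration:

```r
library(plyr)  # assumption: plyr is installed

DF <- data.frame(respondent = rep(letters[1:2], each = 5),
                 response   = c(1, 1, 2, 2, 1, 3, 1, 1, 1, 1))

# Hypothetical helper: compare each element with the one after it.
# head(x, -1) drops the last element, tail(x, -1) drops the first,
# so the two vectors line up as (current, next) pairs.
# The last element has no successor, so it gets NA.
check_next <- function(x) {
  same <- head(x, -1) == tail(x, -1)
  c(as.integer(same), NA)
}

# Split by respondent, add the check column to each piece, recombine.
result <- ddply(DF, .(respondent), transform, check = check_next(response))
result
```

The last row of each respondent gets NA, which corresponds to the blank entries in the desired output. The comparison relies on the rows already being in the intended order within each respondent.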