I have a large data frame that has three identifiers. For example:
df <- data.frame(year=c(1999,1999,2000,2000,2000), country=c('K','K','M','M','S'),
site=c('di','se','di','di','di'))
This produces a data frame like this:
year country site
1999 K di
1999 K se
2000 M di
2000 M di
2000 S di
I want to add a column to the data frame that holds a ‘unique id’, assigned from the combination of the entries for ‘year’, ‘country’, and ‘site’. It would look something like this:
year country site unique_id
1999 K di 1
1999 K se 2
2000 M di 3
2000 M di 3
2000 S di 4
Any suggestions on how to do this would be greatly appreciated. I’m thinking it could somehow be done using the plyr package?
This should work quite nicely. (It takes advantage of the fact that the unique levels of a factor are each stored as integers, and uses as.numeric() to extract those integer values.)
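The snippet that originally went with this explanation is not shown here, so what follows is only a minimal base R sketch of the idea, not necessarily the exact code the answer used. It assumes the three identifiers are combined into one factor with interaction(); the names key and id_by_level are made up for illustration. Note that the integer codes of a factor follow its level order, so an extra match() step is used to number the groups in order of first appearance, as in the desired output.

```r
df <- data.frame(year    = c(1999, 1999, 2000, 2000, 2000),
                 country = c("K", "K", "M", "M", "S"),
                 site    = c("di", "se", "di", "di", "di"))

# Combine the three identifiers into a single factor (hypothetical name `key`).
# drop = TRUE discards combinations that never occur in the data.
key <- interaction(df$year, df$country, df$site, drop = TRUE)

# The integer codes behind the factor already identify each group uniquely,
# but they are numbered by factor level order, not by row order.
df$id_by_level <- as.numeric(key)

# Renumber so that ids follow the order in which each combination first appears.
df$unique_id <- match(key, unique(key))

print(df)
```

If the numbering order does not matter, the as.numeric() column on its own is enough; the match() step is only there to reproduce ids that count up from the first row.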