I have data like this:
ID  ATTRIBUTE  START       END
1   A          01-01-2000  15-03-2010
1   B          05-11-2001  06-02-2002
2   B          01-02-2002  08-05-2008
2   B          01-06-2008  01-07-2008
I now want to count the number of different IDs having a certain attribute per year.
A result could look like this:
YEAR  count(A)  count(B)
2000  1         0
2001  1         1
2002  1         2
2003  1         1
2004  1         1
2005  1         1
2006  1         1
2007  1         1
2008  1         1
2009  1         0
2010  1         0
I think the second step, counting the occurrences, is probably easy.
But how would I split my data into years?
Thank you in advance!
Here is an approach using a few of Hadley’s packages.
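A minimal sketch of what such an approach could look like, assuming `plyr` and `reshape2` are installed. The names `df`, `START_YEAR`, `END_YEAR`, `expanded`, `distinct_ids` and `counts` are just illustrative choices, not anything from the question. The idea is to expand every row into one row per calendar year it touches, drop duplicate ID/attribute/year combinations, and then count.

```r
library(plyr)      # adply
library(reshape2)  # dcast

# Sample data from the question (dates kept as dd-mm-yyyy strings)
df <- data.frame(
  ID        = c(1, 1, 2, 2),
  ATTRIBUTE = c("A", "B", "B", "B"),
  START     = c("01-01-2000", "05-11-2001", "01-02-2002", "01-06-2008"),
  END       = c("15-03-2010", "06-02-2002", "08-05-2008", "01-07-2008"),
  stringsAsFactors = FALSE
)

# Helper columns (hypothetical names): first and last calendar year of each interval
df$START_YEAR <- as.integer(format(as.Date(df$START, format = "%d-%m-%Y"), "%Y"))
df$END_YEAR   <- as.integer(format(as.Date(df$END,   format = "%d-%m-%Y"), "%Y"))

# Expand each input row into one row per year it covers
expanded <- adply(df, 1, function(r) {
  data.frame(
    ID        = r$ID,
    ATTRIBUTE = r$ATTRIBUTE,
    YEAR      = r$START_YEAR:r$END_YEAR,
    stringsAsFactors = FALSE
  )
})

# Keep each ID/attribute/year combination once, so an ID with two
# intervals in the same year is only counted once
distinct_ids <- unique(expanded[, c("ID", "ATTRIBUTE", "YEAR")])

# One row per year, one column per attribute, cells = number of distinct IDs
counts <- dcast(distinct_ids, YEAR ~ ATTRIBUTE,
                value.var = "ID", fun.aggregate = length)
print(counts)
```

For the sample data this should give one row per year from 2000 to 2010, with the columns `A` and `B` holding the number of distinct IDs per attribute.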
EDIT: If the original data.frame is large, then adply might take a lot of time. A useful alternate in such cases is to use the data.table package. Here is how we can replace the adply call using data.table.
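A rough sketch of how the row expansion might be done with `data.table` instead, assuming the package is installed. The names `dt`, `ROW`, `START_YEAR`, `END_YEAR`, `expanded`, `counts` and `wide` are made up for illustration. Grouping by a per-row id lets each row expand into one row per covered year without a per-row function call from `plyr`.

```r
library(data.table)

# Sample data from the question (dates kept as dd-mm-yyyy strings)
dt <- data.table(
  ID        = c(1, 1, 2, 2),
  ATTRIBUTE = c("A", "B", "B", "B"),
  START     = c("01-01-2000", "05-11-2001", "01-02-2002", "01-06-2008"),
  END       = c("15-03-2010", "06-02-2002", "08-05-2008", "01-07-2008")
)

# Helper columns (hypothetical names): first and last year, plus a row id
dt[, START_YEAR := as.integer(format(as.Date(START, format = "%d-%m-%Y"), "%Y"))]
dt[, END_YEAR   := as.integer(format(as.Date(END,   format = "%d-%m-%Y"), "%Y"))]
dt[, ROW := .I]

# One group per input row; each group expands to one row per covered year
expanded <- dt[, .(ID = ID, ATTRIBUTE = ATTRIBUTE, YEAR = START_YEAR:END_YEAR),
               by = ROW]

# Count distinct IDs per year and attribute (long format)
counts <- unique(expanded[, .(ID, ATTRIBUTE, YEAR)])[, .N, by = .(YEAR, ATTRIBUTE)]

# Reshape to one column per attribute, filling missing combinations with 0
wide <- dcast(counts, YEAR ~ ATTRIBUTE, value.var = "N", fill = 0L)
print(wide)
```

The counting and reshaping steps at the end are only there to make the sketch complete; the part that stands in for the `adply` call is the `by = ROW` expansion.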