I have data like this: ID ATTRIBUTE START END 1 A 01-01-2000 15-03-2010 1

Question

Asked: May 26, 2026 (2026-05-26T12:25:11+00:00)

I have data like this:

ID    ATTRIBUTE        START          END
 1            A   01-01-2000   15-03-2010
 1            B   05-11-2001   06-02-2002
 2            B   01-02-2002   08-05-2008
 2            B   01-06-2008   01-07-2008

I now want to count the number of different IDs having a certain attribute per year.

A result could look like this:

YEAR    count(A)    count(B)
2000          1           0
2001          1           1
2002          1           2
2003          1           1
2004          1           1
2005          1           1
2006          1           1
2007          1           1
2008          1           1
2009          1           0
2010          1           0

I think the second step, counting the occurrences, is probably easy.

But how would I split my data into years?

Thank you in advance!

1 Answer

Editorial Team · Answer 1 · 2026-05-26T12:25:12+00:00

Here is an approach using a few of Hadley’s packages.

library(lubridate); library(reshape2); library(plyr)

# extract years from start and end dates after converting them to date
dfr2 = transform(dfr, START = year(dmy(START)), END = year(dmy(END)))

# for every row, construct a sequence of years from start to end
dfr2 = adply(dfr2, 1, transform, YEAR = START:END)

# create pivot table of year vs. attribute with number of unique values of ID
# (current reshape2 releases name the argument value.var, not value_var)
dcast(dfr2, YEAR ~ ATTRIBUTE, fun.aggregate = function(x) length(unique(x)), value.var = 'ID')
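If you would rather avoid extra packages, the same three steps (extract years, expand each row into one row per year, count unique IDs) can be sketched in base R. This is only an illustration built on the sample data from the question; the names `start_year`, `end_year`, `long` and `counts` are my own and not part of any package.

```r
# Sample data from the question (dates are day-month-year strings)
dfr <- data.frame(
  ID        = c(1, 1, 2, 2),
  ATTRIBUTE = c("A", "B", "B", "B"),
  START     = c("01-01-2000", "05-11-2001", "01-02-2002", "01-06-2008"),
  END       = c("15-03-2010", "06-02-2002", "08-05-2008", "01-07-2008"),
  stringsAsFactors = FALSE
)

# Step 1: extract the year from each date
start_year <- as.integer(format(as.Date(dfr$START, format = "%d-%m-%Y"), "%Y"))
end_year   <- as.integer(format(as.Date(dfr$END,   format = "%d-%m-%Y"), "%Y"))

# Step 2: expand every row into one row per year it covers
n_years <- end_year - start_year + 1
long <- data.frame(
  ID        = rep(dfr$ID, n_years),
  ATTRIBUTE = rep(dfr$ATTRIBUTE, n_years),
  YEAR      = unlist(Map(seq, start_year, end_year))
)

# Step 3: keep each (ID, ATTRIBUTE, YEAR) combination once, then cross-tabulate
long   <- unique(long)
counts <- table(YEAR = long$YEAR, ATTRIBUTE = long$ATTRIBUTE)
print(counts)
```

The `unique()` call matters because ID 2 has two overlapping rows for attribute B in 2008; without it that ID would be counted twice for that year.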

EDIT: If the original data.frame is large, the adply call can be slow because it processes one row at a time. A useful alternative in such cases is the data.table package. Here is how to replace the adply call using data.table.

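require(data.table)
# group by START and END as well, so that an ID with several rows for the
# same ATTRIBUTE gets every one of its START:END ranges expanded
dfr2 = data.table(dfr2)[, list(YEAR = START:END), by = 'ID,ATTRIBUTE,START,END']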
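For completeness, here is a rough sketch of the whole pipeline done in data.table alone, including the pivot step. It assumes a reasonably recent data.table release (one that provides `uniqueN` and its own `dcast` method) and uses the sample data from the question; the object names `dt`, `long` and `wide` are just illustrative.

```r
library(data.table)

# Sample data from the question (dates are day-month-year strings)
dt <- data.table(
  ID        = c(1, 1, 2, 2),
  ATTRIBUTE = c("A", "B", "B", "B"),
  START     = c("01-01-2000", "05-11-2001", "01-02-2002", "01-06-2008"),
  END       = c("15-03-2010", "06-02-2002", "08-05-2008", "01-07-2008")
)

# Replace the date strings with their years
dt[, `:=`(
  START = year(as.IDate(START, format = "%d-%m-%Y")),
  END   = year(as.IDate(END,   format = "%d-%m-%Y"))
)]

# One row per year covered; grouping by START and END keeps each range separate
long <- dt[, .(YEAR = START:END), by = .(ID, ATTRIBUTE, START, END)]

# Pivot: number of distinct IDs per year and attribute
wide <- dcast(long, YEAR ~ ATTRIBUTE, value.var = "ID", fun.aggregate = uniqueN)
print(wide)
```

Using `uniqueN` as the aggregate means overlapping ranges for the same ID are only counted once per year.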
