I have a data.frame from this code:
my_df = data.frame("read_time" = c("2010-02-15", "2010-02-15",
"2010-02-16", "2010-02-16",
"2010-02-16", "2010-02-17"),
"OD" = c(0.1, 0.2, 0.1, 0.2, 0.4, 0.5) )
which produces this:
> my_df
read_time OD
1 2010-02-15 0.1
2 2010-02-15 0.2
3 2010-02-16 0.1
4 2010-02-16 0.2
5 2010-02-16 0.4
6 2010-02-17 0.5
I want to average the OD column over each distinct read_time (notice some are replicated others are not) and I also would like to calculate the standard deviation, producing a table like this:
> my_df
read_time OD stdev
1 2010-02-15 0.15 0.05
5 2010-02-16 0.3 0.1
6 2010-02-17 0.5 0
Which are the best functions to deal with concatenating such values in a data.frame?
The plyr package is popular for this, but the base functions
by()andaggregate()will also help.You can add the missing bit to return 0 instead of NA for the last std.dev.
Also, you don’t need the quotes (on the variables) you had in the data.frame construction.