I have a dataset with two columns: one holds dates and the other holds flow data.
I was able to read the data in as dates and flow values. I used the following code:
creek <- read.csv("creek.csv")
library(ggplot2)
creek[1:10,]
colnames(creek) <- c("date","flow")
creek$date <- as.Date(creek$date, "%m/%d/%Y")
The link to my data is https://www.dropbox.com/s/eqpena3nk82x67e/creek.csv
Now I want to compute a summary for each year. In particular, I want to know the mean, median, maximum, etc.
Thanks.
Regards,
Jdbaba
Base R
Here are two methods from base R.
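As a rough sketch of the two base R approaches described next: the data frame below is a made-up stand-in for creek.csv (random flow values over three years), so the numbers are illustrative only, and the variable names by_year, summaries and agg are just placeholders.

```r
# Hypothetical stand-in for creek.csv: three years of daily flow values.
set.seed(1)
creek <- data.frame(
  date = seq(as.Date("1999-01-01"), as.Date("2001-12-31"), by = "day")
)
creek$flow <- runif(nrow(creek), min = 0, max = 10)

# Method 1: cut() the dates into yearly bins, split() the flow values by bin,
# then lapply() summary() over each piece. The result is a named list.
by_year <- split(creek$flow, cut(creek$date, breaks = "year"))
summaries <- lapply(by_year, summary)
summaries[["2000-01-01"]]  # summary for the year starting 2000-01-01

# Method 2: aggregate() with a formula. The summary statistics come back
# packed into a single matrix column named "flow", one row per year.
agg <- aggregate(flow ~ cut(date, breaks = "year"), data = creek, FUN = summary)
str(agg)
```

The list names are the first day of each yearly bin, which is how cut() labels date intervals.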
The first uses cut(), split() and lapply() along with summary(). This creates a list. You can view the summaries of different years by accessing the corresponding list index or name.
The second uses aggregate(). Be careful with the aggregate() solution though: All of the summary information is a single matrix. View str() on the output to see what I mean.
xts
There are, of course, other ways to do this. One way is to use the xts package.
First, convert your data to xts. Then, use apply.yearly() and whatever functions you are interested in.
Here is the yearly mean:
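A minimal sketch of the conversion and the yearly mean, assuming the xts package is installed. The creek data frame here is a hypothetical stand-in for the real file. The object name creekx matches the combined call shown further down, and yearly_mean is only a placeholder.

```r
library(xts)

# Hypothetical stand-in for creek.csv: three years of daily flow values.
set.seed(1)
dates <- seq(as.Date("1999-01-01"), as.Date("2001-12-31"), by = "day")
creek <- data.frame(date = dates, flow = runif(length(dates), 0, 10))

# Convert to xts: the flow values become the data, the dates become the index.
creekx <- xts(creek$flow, order.by = creek$date)

# Apply mean() to each calendar year; the result has one row per year.
yearly_mean <- apply.yearly(creekx, mean)
yearly_mean
```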
And the yearly maximum:
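The maximum should follow the same pattern with max() in place of mean(). In this sketch creekx is rebuilt from hypothetical data so that the snippet runs on its own, and yearly_max is just a placeholder name.

```r
library(xts)

# Hypothetical stand-in data, converted to xts as before.
set.seed(1)
dates <- seq(as.Date("1999-01-01"), as.Date("2001-12-31"), by = "day")
creekx <- xts(runif(length(dates), 0, 10), order.by = dates)

# Apply max() to each calendar year.
yearly_max <- apply.yearly(creekx, max)
yearly_max
```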
Or, put them together like this:
apply.yearly(creekx, function(x) cbind(mean(x), sum(x), max(x)))
data.table
The data.table package may also be of interest for you, particularly if you are dealing with a lot of data. Here’s a data.table approach. The key is to use as.IDate() on your “date” column while you are reading your data in:
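A sketch of what this could look like, assuming the data.table package is installed. The table below is a hypothetical stand-in for creek.csv with dates stored as m/d/Y text, as in the question. It converts the column right after the table is built rather than during a file read, and the name yearly is only a placeholder.

```r
library(data.table)

# Hypothetical stand-in for creek.csv, with dates as "%m/%d/%Y" text.
set.seed(1)
dates <- seq(as.Date("1999-01-01"), as.Date("2001-12-31"), by = "day")
creek <- data.table(date = format(dates, "%m/%d/%Y"),
                    flow = runif(length(dates), 0, 10))

# Convert the text dates to data.table's integer-backed IDate class.
creek[, date := as.IDate(date, format = "%m/%d/%Y")]

# Group by calendar year and compute the statistics of interest.
yearly <- creek[, .(mean = mean(flow), median = median(flow), max = max(flow)),
                by = .(year = year(date))]
yearly
```

With the real file, the same conversion and grouping steps would apply after reading creek.csv into a data.table.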