Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 9242887
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 18, 20262026-06-18T08:40:59+00:00 2026-06-18T08:40:59+00:00

I have a data with two columns. In one column it is date and

  • 0

I have a data with two columns. In one column it is date and in another column it is flow data.

I was able to read the data as date and flow data. I used the following code:

creek <- read.csv("creek.csv")
library(ggplot2)
creek[1:10,]
colnames(creek) <- c("date","flow")
creek$date <- as.Date(creek$date, "%m/%d/%Y")

The link to my data is https://www.dropbox.com/s/eqpena3nk82x67e/creek.csv

Now, I want to find the summary of each year. I want to especially know mean, median, maximum etc.

Thanks.

Regards,
Jdbaba

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-18T08:41:00+00:00Added an answer on June 18, 2026 at 8:41 am

    Base R

    Here are two methods from base R.

    The first uses cut, split and lapply along with summary.

    creekFlowSummary <- lapply(split(creek, cut(creek$date, "1 year")), 
                               function(x) summary(x[2]))
    

    This creates a list. You can view the summaries of different years by accessing the corresponding list index or name.

    creekFlowSummary[1]
    # $`1999-01-01`
    #       flow       
    #  Min.   :0.3187  
    #  1st Qu.:0.3965  
    #  Median :0.4769  
    #  Mean   :0.6366  
    #  3rd Qu.:0.5885  
    #  Max.   :7.2560  
    # 
    creekFlowSummary["2000-01-01"]
    # $`2000-01-01`
    #       flow       
    #  Min.   :0.1370  
    #  1st Qu.:0.1675  
    #  Median :0.2081  
    #  Mean   :0.2819  
    #  3rd Qu.:0.2837  
    #  Max.   :2.3800  
    

    The second uses aggregate:

    aggregate(flow ~ cut(date, "1 year"), creek, summary)
    #    cut(date, "1 year") flow.Min. flow.1st Qu. flow.Median flow.Mean flow.3rd Qu. flow.Max.
    # 1           1999-01-01    0.3187       0.3965      0.4770    0.6366       0.5885    7.2560
    # 2           2000-01-01    0.1370       0.1675      0.2081    0.2819       0.2837    2.3800
    # 3           2001-01-01    0.1769       0.2062      0.2226    0.2950       0.2574    2.9220
    # 4           2002-01-01    0.1279       0.1781      0.2119    0.5346       0.4966   14.3900
    # 5           2003-01-01    0.3492       0.4761      0.7173    1.0350       1.0840   10.1500
    # 6           2004-01-01    0.4178       0.5379      0.6524    0.9691       0.9020   11.7100
    # 7           2005-01-01    0.4722       0.6094      0.7279    1.2340       1.0900   17.7200
    # 8           2006-01-01    0.2651       0.3275      0.4282    0.5459       0.5758    3.3510
    # 9           2007-01-01    0.2784       0.3557      0.4041    0.6331       0.6125    9.6290
    # 10          2008-01-01    0.4131       0.5430      0.6477    0.8792       0.9540    4.5960
    # 11          2009-01-01    0.3877       0.4572      0.5945    0.8465       0.8309    6.3830
    

    Be careful with the aggregate solution though: All of the summary information is a single matrix. View str on the output to see what I mean.

    xts

    There are, of course other ways to do this. One way is to use the xts package.

    First, convert your data to xts:

    library(xts)
    creekx <- xts(creek$flow, order.by=creek$date)
    

    Then, use apply.yearly and whatever functions you are interested in.

    Here is the yearly mean:

    apply.yearly(creekx, mean)
    #                 [,1]
    # 1999-12-31 0.6365604
    # 2000-12-31 0.2819057
    # 2001-12-31 0.2950348
    # 2002-12-31 0.5345666
    # 2003-12-31 1.0351742
    # 2004-12-31 0.9691180
    # 2005-12-31 1.2338066
    # 2006-12-31 0.5458652
    # 2007-12-31 0.6331271
    # 2008-12-31 0.8792396
    # 2009-09-30 0.8465300
    

    And the yearly maximum:

    apply.yearly(creekx, max)
    #              [,1]
    # 1999-12-31  7.256
    # 2000-12-31  2.380
    # 2001-12-31  2.922
    # 2002-12-31 14.390
    # 2003-12-31 10.150
    # 2004-12-31 11.710
    # 2005-12-31 17.720
    # 2006-12-31  3.351
    # 2007-12-31  9.629
    # 2008-12-31  4.596
    # 2009-09-30  6.383
    

    Or, put them together like this: apply.yearly(creekx, function(x) cbind(mean(x), sum(x), max(x)))

    data.table

    The data.table package may also be of interest for you, particularly if you are dealing with a lot of data. Here’s a data.table approach. The key is to use as.IDate on your “date” column while you are reading your data in:

    library(data.table)
    DT <- data.table(date = as.IDate(creek$date), creek[-1])
    DT[, list(mean = mean(flow),
              tot = sum(flow),
              max = max(flow)), 
       by = year(date)]
    #     year      mean      tot    max
    #  1: 1999 0.6365604 104.3959  7.256
    #  2: 2000 0.2819057 103.1775  2.380
    #  3: 2001 0.2950348 107.6877  2.922
    #  4: 2002 0.5345666 195.1168 14.390
    #  5: 2003 1.0351742 377.8386 10.150
    #  6: 2004 0.9691180 354.6972 11.710
    #  7: 2005 1.2338066 450.3394 17.720
    #  8: 2006 0.5458652 199.2408  3.351
    #  9: 2007 0.6331271 231.0914  9.629
    # 10: 2008 0.8792396 321.8017  4.596
    # 11: 2009 0.8465300 231.1027  6.383
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have two files. file1 has the data like belowing containing only one column.
I have two data frames with two columns each. The first column is timestamps
I have a data frame with two columns. First column contains categories such as
I have two columns in a data frame, and I have been able to
I have a table containing several columns, one of which is a date column
I have two tables. One contains default data for 3 columns, (value is 1
I have three sets of data and each has two columns. The first column
If I have a data capture form with either tapstrips and or two columns
I have some data I am querying. The table is composed of two columns
I have two columns. ColA and ColB contains char(10) with data 20090520 and 20090521.

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.