Editorial Team
Asked: June 10, 2026


I would like to use data.table to calculate a summary statistic, and then based on that result, calculate a statistic on a second column.

Here is an example using the Air Quality data.

Set up the data

(pretend it came this way)

library(data.table)
dt <- as.data.table(airquality)
## Label months 5-7 as 'Summer' and months 8-9 as 'Fall'
dt[, Season := ifelse(Month > 7, 'Fall', 'Summer')]

Some months have high wind

## The range of monthly Wind values
dt[ , list(MinWind=min(Wind), MaxWind=max(Wind)), 
        by=c('Season', 'Month')]

---- R OUTPUT:
   Season Month MinWind MaxWind
1: Summer     5     5.7    20.1
2: Summer     6     1.7    20.7
3: Summer     7     4.1    14.9
4:   Fall     8     2.3    15.5
5:   Fall     9     2.8    16.6

Goal: Calculate the average Solar Radiation (Solar.R) per season, with months split by whether any Wind reading in that month exceeded 20.

Can I do this in one step? At the moment I do it in two:

## Add a column to indicate if it was a high wind month
dt[, HighWind:=any(Wind>20), by=Month]
## Aggregate based on both HighWind and Season
dt[, list(AveSolarR=mean(Solar.R, na.rm=TRUE)), by=c("HighWind","Season")]

---- R OUTPUT:
   HighWind Season AveSolarR
1:     TRUE Summer  185.9649
2:    FALSE Summer  216.4839
3:    FALSE   Fall  169.5690
1 Answer

  1. Editorial Team
    Added an answer on June 10, 2026 at 8:42 pm

    Why not combine both into one list?

    dt[, list(HighWind=any(Wind>20), AveSolarR=mean(Solar.R, na.rm=TRUE)), by=Month]
       Month HighWind AveSolarR
    1:     5     TRUE  181.2963
    2:     6     TRUE  190.1667
    3:     7    FALSE  216.4839
    4:     8    FALSE  171.8571
    5:     9    FALSE  167.4333
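
If the per-month table is then rolled up to seasons, the monthly means cannot simply be averaged, because months differ in how many non-missing Solar.R values they contain. The following is a minimal sketch of a two-stage aggregation that carries a count along and uses it as a weight. The names monthly, seasonal and N are illustrative and not part of the original question.

```r
library(data.table)

dt <- as.data.table(airquality)
dt[, Season := ifelse(Month > 7, "Fall", "Summer")]

## Stage 1: one row per month, keeping the number of non-missing Solar.R values
monthly <- dt[, list(HighWind  = any(Wind > 20),
                     AveSolarR = mean(Solar.R, na.rm = TRUE),
                     N         = sum(!is.na(Solar.R))),
              by = c("Season", "Month")]

## Stage 2: roll months up to HighWind x Season,
## weighting each monthly mean by its count of observations
seasonal <- monthly[, list(AveSolarR = weighted.mean(AveSolarR, N)),
                    by = c("HighWind", "Season")]
print(seasonal)
```

Weighting by N should reproduce the pooled mean over the underlying rows, which an unweighted mean of the monthly means would not.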
    

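    For the modified problem, the HighWind calculation has to move into the by argument, which I think makes the call more convoluted. The expression Month%in%Month[Wind>20] is evaluated on the whole table before grouping, so it flags every row whose month contains at least one Wind reading above 20.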

    dt[, list(AveSolarR=mean(Solar.R, na.rm=TRUE)),
      by=list(HighWind=Month%in%Month[Wind>20], Season)]
       HighWind Season AveSolarR
    1:     TRUE Summer  185.9649
    2:    FALSE Summer  216.4839
    3:    FALSE   Fall  169.5690
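
As a rough check on the by expression above, the sketch below compares the flag it produces with the per-month HighWind column from the question, and also writes the question's two steps as one chained statement, which may read more easily. The variable names flag, same and chained are illustrative only.

```r
library(data.table)

dt <- as.data.table(airquality)
dt[, Season := ifelse(Month > 7, "Fall", "Summer")]

## The grouping flag from the by= expression, evaluated on the full table
flag <- dt[, Month %in% Month[Wind > 20]]

## The same flag built per month, as in the two-step version
dt[, HighWind := any(Wind > 20), by = Month]

## Expected to be TRUE here, since Wind has no missing values in airquality
same <- identical(flag, dt$HighWind)
print(same)

## The two steps written as a single chained statement
chained <- dt[, HighWind := any(Wind > 20), by = Month][
  , list(AveSolarR = mean(Solar.R, na.rm = TRUE)), by = c("HighWind", "Season")]
print(chained)
```

The chained form still adds HighWind to dt by reference as a side effect, whereas the by-expression form leaves dt unchanged. Which one is preferable depends on whether the extra column is wanted afterwards.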
    