Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 4581236
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 21, 20262026-05-21T20:53:49+00:00 2026-05-21T20:53:49+00:00

I have a data frame as follows, > mydata date station treatment subject par

  • 0

I have a data frame as follows,

> mydata
date  station  treatment  subject   par
A       a         0         R1      1.3    
A       a         0         R1      1.4    
A       a         1         R2      1.4   
A       a         1         R2      1.1    
A       b         0         R1      1.5    
A       b         0         R1      1.8     
A       b         1         R2      2.5     
A       b         1         R2      9.5    
B       a         0         R1      0.3    
B       a         0         R1      8.2    
B       a         1         R2      7.3    
B       a         1         R2      0.2    
B       b         0         R1      9.4    
B       b         0         R1      3.2    
B       b         1         R2      3.5    
B       b         1         R2      2.4 
....

where:

date is a factor with 2 levels A/B;
station is a factor with 2 levels a/b;
treatment is a factor with 2 levels 0/1;

subject are the replicates R1 to R20 assigned to treatment (10 to treatment 0 and 10 to treatment 1);

and
par is my parameter, which is a repeated measurement of particle size for each subject at at each date and station

What i need to do is:
divide par in 10 equal bins and count the number in each bin. This has to be done in subsets of mydata definded by a combination of date station and subject. The final outcome has to be a daframe myres as follow:

> myres
    date  station  treatment  bin.centre  freq
    A       a         0         1.2        4 
    A       a         0         1.3        3    
    A       a         0         1.4        2 
    A       a         0         1.5        1    
    A       a         1         1.2        4    
    A       a         1         1.3        3    
    A       a         1         1.4        2     
    A       a         1         1.5        1    
    B       b         0         2.3        5   
    B       b         0         2.4        4    
    B       b         0         2.5        3    
    B       b         0         2.6        2   
    B       b         1         2.3        5   
    B       b         1         2.4        4   
    B       b         1         2.5        3   
    B       b         1         2.6        2
    ....

this is what i’ve done so far:

#define the number of bins
num.bins<-10

#define the width of each bins
bin.width<-(max(par)-min(par))/num.bins

#define the lower and upper boundaries of each bins
bins<-seq(from=min(par), to=max(par), by=bin.width)

#define the centre of each bins
bin.centre<-c(seq(min(bins)+bin.width/2,max(bins)-bin.width/2,by=bin.width))

#create a vector to store the frequency in each bins

  freq<-numeric(length(length(bins-1)))

 # this is the loop that counts the frequency of particles between the lower and upper boundaries
 of each bins and store the result in freq

 for(i in 1:10){
    freq[i]<-length(which(par>=bins[i] &
    par<bins[i+1]))
     }

 #create the data frame with the results
 res<-data.frame(bin.centre,res)

my first approach was to subset mydata manually, using subset(),for each combination of subject station and date, and apply the above sequence of commands for each subsets, then build the final dataframe combining each single res using rbind(), but this procedure was very convoluted and subject to the propagation of errors.
What i would like to do, is to automate the above procedure so that it calculates the binned frequency distribution for each subject. My intuition is that the best way to do this is by creating a function for estimating this particle distribution, and then applying it to each subject via a for loop. However, I am not sure of how to do this. Any suggestions would be really appreciated.

thanks
matteo.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-21T20:53:50+00:00Added an answer on May 21, 2026 at 8:53 pm

    You can do this in a few steps using the functionality in the plyr package. This allows you to split your data into the desired chunks, apply a statistic to each chunk, and combine the results.

    First I set up some dummy data:

    set.seed(1)
    n <- 100
    dat <- data.frame(
        date=sample(LETTERS[1:2], n, replace=TRUE),
        station=sample(letters[1:2], n, replace=TRUE),
        treatment=sample(0:1, n, replace=TRUE),
        subject=paste("R", sample(1:2, n, replace=TRUE), sep=""),
        par=runif(n, 0, 5)
    )
    head(dat)
    
      date station treatment subject       par
    1    A       b         0      R2 3.2943880
    2    A       a         0      R1 0.9253498
    3    B       a         1      R1 4.7718907
    4    B       b         0      R1 4.4892425
    5    A       b         0      R1 4.7184853
    6    B       a         1      R2 3.6184538
    

    Now I use the function in base called cut to divide your par into equal sized bins:

    dat$bin <- cut(dat$par, breaks=10)
    

    Now for the fun bit. Load package plyr and use the function ddply to split, apply and combine. Because you want a frequency count, we can use the function length to count the number of times each replicate appeared in that bin:

    library(plyr)
    res <- ddply(dat, .(date, station, treatment, bin), 
      summarise, freq=length(treatment))
    head(res)
    
      date station treatment             bin freq
    1    A       a         0 (0.00422,0.501]    1
    2    A       a         0   (0.501,0.998]    2
    3    A       a         0      (1.5,1.99]    4
    4    A       a         0     (1.99,2.49]    2
    5    A       a         0     (2.49,2.99]    2
    6    A       a         0     (2.99,3.48]    1
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a date column in a data frame in chr format as follows:
I have a data.frame as follows: dat <- structure(list(id = 1:4, date = structure(list(sec
I have data frame with some numerical variables and some categorical factor variables. The
In R, I have a data frame with columns for Seat (factor), Party (factor)
I have a data frame that looks as follows (8 columns - the myPOSIX
If I have data like this DF <- data.frame( date = seq(Sys.Date()-1000, len=1000, by=1
I have two data.frames as follows: df1 <- data.frame(A=c(lee,eeu,ees), B=c(lee,ggu,1su), C=c(1,1,1) A B C
Suppose I have: R> str(data) 'data.frame': 4 obs. of 2 variables: $ datetime: Factor
I have a data.frame with 2 columns: Node A, Node B. Each entry in
I have a data frame with two columns. First column contains categories such as

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.