I would like to analyze a dataset by group. The data is set up like this:
Group Result cens
A 1.3 1
A 2.4 0
A 2.1 0
B 1.2 1
B 1.7 0
B 1.9 0
I have a function that calculates the following:
sumStats = function(obs, cens) {
  detects = obs[cens == 0]
  nondetects = obs[cens == 1]
  mean.detects = mean(detects)
  return(mean.detects)
}
This is of course a simple function for illustration purposes. Is there a function in R that will allow me to use this home-made function, which needs two input variables, to analyze the data by group?
I looked into the `by` function, but it seems to take in only one column of data at a time.
Import your data:
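One way to do this, as a minimal sketch, is `read.table` with its `text` argument, pasting in the sample rows from the question. The name `test` is just the data frame name assumed in the rest of this answer.

```r
# Build the example data frame from the question's sample rows.
test <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Group Result cens
A 1.3 1
A 2.4 0
A 2.1 0
B 1.2 1
B 1.7 0
B 1.9 0
")

# Inspect the column types: Group is character, Result and cens are numeric.
str(test)
```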
Though there are many ways to do this, using `by` specifically you could apply a function to each group of your data frame (assumed here to be called `test`) to get the mean of all the `Result` values within each group which have `cens == 1`.
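A sketch of what such a call might look like is below. The data frame is rebuilt inline so the snippet runs on its own, and the anonymous function and the name `res` are illustrative rather than anything fixed.

```r
# Same sample data as in the question, rebuilt so this snippet is self-contained.
test <- data.frame(
  Group  = c("A", "A", "A", "B", "B", "B"),
  Result = c(1.3, 2.4, 2.1, 1.2, 1.7, 1.9),
  cens   = c(1, 0, 0, 1, 0, 0)
)

# For each Group, take the mean of Result over the rows where cens == 1.
res <- by(test, test$Group, function(x) mean(x$Result[x$cens == 1]))

# Each group has exactly one row with cens == 1, so A gives 1.3 and B gives 1.2.
res
```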
To help you understand how this might work with your function, consider this: if you just ask the `by` statement to `return` the contents of each group, you get what is actually 2 data frames containing only the rows for each group, stored as a list.
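To see this for yourself, something along these lines should work. The name `groups` is only for illustration, and the data frame is rebuilt so the snippet stands alone.

```r
# Same sample data as in the question, rebuilt so this snippet is self-contained.
test <- data.frame(
  Group  = c("A", "A", "A", "B", "B", "B"),
  Result = c(1.3, 2.4, 2.1, 1.2, 1.7, 1.9),
  cens   = c(1, 0, 0, 1, 0, 0)
)

# Return each group's rows unchanged to see what `by` hands to the function.
groups <- by(test, test$Group, function(x) return(x))
groups

# The result behaves like a list holding one data frame per group.
is.list(groups)
groups[["A"]]   # only the rows where Group is "A"
groups[["B"]]   # only the rows where Group is "B"
```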
This means you can access parts of the data frames for each group as you would before they were split up. The `x` in the functions given to `by` refers to the whole sub-data frame for each of the groups, i.e. you can use individual variables as part of `x` to pass to functions.
Now, to finally get around to answering your specific query!
If you take an example function which gets the mean of two inputs separately, you could call it using `by` to get the mean of both `Result` and `cens` by passing each column of `x` as an argument.
Hope that is helpful.
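For reference, here is a rough sketch of that pattern. The function name `meanBoth` is hypothetical, and `sumStats` is the question's function with `==` used in the subset; neither is the only way to write it.

```r
# Same sample data as in the question, rebuilt so this snippet is self-contained.
test <- data.frame(
  Group  = c("A", "A", "A", "B", "B", "B"),
  Result = c(1.3, 2.4, 2.1, 1.2, 1.7, 1.9),
  cens   = c(1, 0, 0, 1, 0, 0)
)

# Hypothetical two-input function: returns the mean of each input separately.
meanBoth <- function(a, b) {
  c(mean.a = mean(a), mean.b = mean(b))
}

# Pass individual columns of each sub-data frame `x` as the two arguments.
by(test, test$Group, function(x) meanBoth(x$Result, x$cens))

# The same pattern with the question's function (mean of the rows with cens == 0).
sumStats <- function(obs, cens) {
  detects <- obs[cens == 0]
  mean(detects)
}
detectMeans <- by(test, test$Group, function(x) sumStats(x$Result, x$cens))

# Group A: mean of 2.4 and 2.1 is 2.25; group B: mean of 1.7 and 1.9 is 1.8.
detectMeans
```

The wrapper `function(x) ...` is what lets a function with two arguments work with `by`, since `by` itself only hands over one sub-data frame per group.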