Apologies if this has been answered before, but I find it very difficult to get answers for my R problems!
My problem relates to how I can store the results of multiple ANOVAs in a useful way.
I am performing ANOVAs on subsets of a data frame using ‘aov’, comparing two data frames at a time, using the function below:
doAnova = function(first, second) {
  aov(number ~ factor1 + factor2, data = rbind(first, second))
}
This is used to compare each subset against a ‘base’ case, to check for significant differences. To perform this over the multiple datasets, I use it in a loop:
for (name in names) {
  result = summary(doAnova(base, subject))
}
I want this result to be stored in a data frame with each row containing the ‘name’ and the ‘result’ values.
So far I have tried both storing lists and vectors of the names and results, and then trying to create data frames from those, but haven’t managed to get this right.
I know this is probably pretty simple, but is anyone able to help solve this?
Thanks
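One way to get a data frame with the name alongside the result values is sketched below. This is only a sketch under assumptions: the loop in the question never ties ‘subject’ to ‘name’, so here the subsets are assumed to live in a named list called `subsets` (a hypothetical name), and `makeData` is a made-up helper that generates toy data with the same column names as in the question. The idea is to pull the ANOVA table out of each summary (it is the first element of the summary object), prepend the name, and bind the pieces together.

```r
# Toy data, purely illustrative: 'makeData' and 'subsets' are hypothetical.
set.seed(1)
makeData = function(n = 12) {
  data.frame(
    number  = rnorm(n),
    factor1 = factor(rep(c("a", "b"), length.out = n)),
    factor2 = factor(rep(c("x", "y", "z"), each = n / 3))
  )
}
base    = makeData()
subsets = list(s1 = makeData(), s2 = makeData())  # assumed: a named list of data frames

# Same function as in the question.
doAnova = function(first, second) {
  aov(number ~ factor1 + factor2, data = rbind(first, second))
}

# Build one small data frame per subset, then stack them.
rows = lapply(names(subsets), function(name) {
  # summary() of an aov fit is a list; its first element is the ANOVA table.
  tab = as.data.frame(summary(doAnova(base, subsets[[name]]))[[1]])
  out = cbind(name = name, term = trimws(rownames(tab)), tab)
  rownames(out) = NULL
  out
})
results = do.call(rbind, rows)

print(results)
```

Each row of `results` then holds the subset name, the model term, and the Df, Sum Sq, Mean Sq, F value and Pr(>F) columns for that term. If only the p-values are needed, something like `results[results$term != "Residuals", c("name", "term", "Pr(>F)")]` should narrow it down.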
You seem to be doing an end-around on the more standard practice of analyzing all the data and then doing post-hoc testing to examine subset comparisons. Statisticians would generally consider this to be unprincipled data dredging. Also, the help page for aov says:
"Note: aov is designed for balanced designs, and the results can be hard to interpret without balance: beware that missing values in the response(s) will likely lose the balance."
So I think you should be coding your subsets with identifying factor variables and using the facilities that R provides for analysis of unbalanced designs, namely lm. Only after you have examined the estimated effects in a global fashion should you be turning to appropriate post-hoc tests that allow a principled correction for the multiple comparisons issues.
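A rough sketch of what that could look like follows. The data are invented, the column name `subset` and the helper `makeData` are hypothetical, and adding the subset as a plain main effect is only one possible reading of the comparison being asked for. The Holm correction via `p.adjust` is likewise just one choice of multiple-comparison adjustment, picked here because it needs no extra packages.

```r
# Toy data, purely illustrative: each data frame gets an identifying label.
set.seed(2)
makeData = function(label, n = 12) {
  data.frame(
    number  = rnorm(n),
    factor1 = factor(rep(c("a", "b"), length.out = n)),
    factor2 = factor(rep(c("x", "y", "z"), each = n / 3)),
    subset  = label
  )
}
allData = rbind(makeData("base"), makeData("s1"), makeData("s2"))

# Make 'base' the reference level so each coefficient is "subset vs base".
allData$subset = relevel(factor(allData$subset), ref = "base")

# One global model fitted to all the data.
fit = lm(number ~ factor1 + factor2 + subset, data = allData)
print(anova(fit))  # sequential ANOVA table for the global fit

# Pull out the subset-vs-base contrasts and adjust their p-values together.
coefs      = summary(fit)$coefficients
subsetRows = grep("^subset", rownames(coefs))
comparisons = data.frame(
  name     = sub("^subset", "", rownames(coefs)[subsetRows]),
  estimate = coefs[subsetRows, "Estimate"],
  p.raw    = coefs[subsetRows, "Pr(>|t|)"],
  row.names = NULL
)
comparisons$p.adjusted = p.adjust(comparisons$p.raw, method = "holm")

print(comparisons)
```

This gives one row per subset with its estimated difference from the base case and a p-value corrected across all the comparisons, which is also the name-plus-result layout the question was after.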