R package randomForest reports mean squared errors for each tree in the forest. I need, however, a measure of confidence for each case in the data. Since randomForest calculates the casewise predictions by averaging the predictions of the single trees, I guess that it should also be possible to calculate a casewise standard error and thus a confidence interval. Can this be done using the output randomForest object (if so: how?) or do I have to dig into the source code?
No need to dig into the source code. You only need to read the documentation.
?predict.randomForest states that one of its arguments is called predict.all. So setting that to TRUE will keep a prediction for each case, for each tree, which you can then use to calculate a standard error for each case.

I have recently been made aware of this paper by Stefan Wager, Trevor Hastie and Brad Efron, which investigates more rigorously the idea of standard errors for the predictions generated by random forests (and other bagged predictors).
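As a minimal sketch of the predict.all approach, something along these lines should work. The airquality data set, the seed, ntree = 500 and the names aq, fit, pred, tree_sd and tree_se are my own choices for illustration, not part of the question. Dividing by sqrt(ntree) treats the trees as independent draws, which they are not, so read the result as a rough measure of how much the trees disagree rather than as a calibrated standard error.

```r
library(randomForest)

set.seed(42)

# Illustrative data only: built-in airquality, with incomplete rows dropped
aq <- na.omit(airquality)
fit <- randomForest(Ozone ~ ., data = aq, ntree = 500)

# With predict.all = TRUE, predict() returns a list with two components:
#   $aggregate   the usual forest prediction (mean over trees for regression)
#   $individual  a matrix with one row per case and one column per tree
pred <- predict(fit, newdata = aq, predict.all = TRUE)

# Spread of the per-tree predictions for each case
tree_sd <- apply(pred$individual, 1, sd)

# Naive standard error of the mean, ASSUMING independent trees
tree_se <- tree_sd / sqrt(fit$ntree)

head(data.frame(prediction = pred$aggregate,
                tree_sd    = tree_sd,
                tree_se    = tree_se))
```

Note that predicting on the training data, as here, includes trees that saw each case during fitting; for new cases, pass them as newdata instead.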