I’m trying to use the rfcv function in the randomForest package. I’m getting an error message as follows:
> rfcv1 <- rfcv(x[1:18750,], testClass[1:18750], cv.fold=2)
Error in cut.default(trainy, c(-Inf, quantile(trainy, 1:4/5), Inf)) :
'breaks' are not unique
> nrow(unique(x[1:18750,]))
[1] 18719
> length(unique(testClass[1:18750])) ## just 0's and 1's
[1] 2
> head(x)
rfPred prediction
3 0.34776664 0.30138045
5 0.22345507 0.11159273
7 0.03478699 0.02156816
17 0.01008994 0.01071626
24 0.01738253 0.01546157
25 0.01143016 0.01278491
> range(x)
[1] 0.003907361 0.966005867
Does anything seem off? I tried shrinking the data so that the number of unique values was divisible by 5, but I still get the same message. I also tried various cv.fold= values, without effect.
I’m just guessing here, but the code for rfcv shows that if you’re doing classification (that is, if trainy is a factor), it just uses your trainy argument as is; otherwise it tries to cut the variable into bins, which is the cut.default call in your error message. So my guess is that you have a vector of integer 0’s and 1’s that you need to convert to a factor.
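To illustrate the guess above, here is a minimal sketch in base R that reproduces the same kind of failure without randomForest. The vector y is a made-up 0/1 response standing in for testClass, not the poster's data; the breaks expression is copied from the error message. Treat it as a sketch of the likely cause, not a confirmed diagnosis.

```r
# Hypothetical 0/1 response standing in for testClass (not the poster's data)
y <- rep(c(0L, 1L), times = c(80, 20))

# The breaks from the error message: most quantiles of a 0/1 vector coincide
breaks <- c(-Inf, quantile(y, 1:4/5), Inf)
print(breaks)

# cut() rejects duplicated breaks; capture the message instead of stopping
err <- tryCatch(
  { cut(y, breaks); NA_character_ },
  error = function(e) conditionMessage(e)
)
print(err)

# Converting the response to a factor marks it as a classification target
y_factor <- factor(y)
print(levels(y_factor))

# With the poster's objects, the corresponding call would be (not run here):
# rfcv1 <- rfcv(x[1:18750, ], factor(testClass[1:18750]), cv.fold = 2)
```

If the factor conversion is the fix, rfcv should skip the binning step entirely, so the number of unique rows in x and the choice of cv.fold would not matter for this error.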