I am using the crossvalind function on a very small data set, but I observe that it gives me incorrect results. Is this supposed to happen?
I have MATLAB R2012a, and here is my output:
crossvalind('KFold',1:1:11,5)
ans =
2
5
1
3
2
1
5
3
5
1
5
Notice the absence of set 4. Is this a bug? I expected at least 2 elements per set, but one set gets 0. This happens a lot: the values are not uniformly distributed across the sets.
The help for crossvalind says that the form you are using is:
crossvalind(METHOD, GROUP, ...). In this case, GROUP is, e.g., the class labels of your data. So 1:11 as the second argument is confusing here, because it suggests no two examples have the same label. I think this is sufficiently unusual that you shouldn't be surprised if the function does something strange.
I tried doing:
and it reliably gave 5 as a result, which is what I would expect; my example would correspond to a two-class problem. I would guess that, as a general rule, you'd want something like numel(unique(group)) <= numel(group) / folds. My hypothesis is that it tries to have one example of each class in the Kth fold, and at least 2 examples in every other, with a difference between fold sizes of no more than 1, but I haven't looked in the code to verify this.
It is possible that you mean to do:
which would compute 5 folds for 11 data points – this doesn’t attempt to do anything clever with labels, so you would be sure that there will be K folds.
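To make the difference concrete, here is a minimal sketch. It assumes the Bioinformatics Toolbox is installed and that the call meant here is the crossvalind(METHOD, N, K) form, with the number of observations as the second argument. The variable names are mine, and the last two lines only apply the rule of thumb I guessed at, which I have not verified against the code.

```matlab
% Sketch: K-fold indices computed from the number of observations,
% not from a label vector.
N = 11;   % number of data points (from the question)
K = 5;    % number of folds

indices = crossvalind('Kfold', N, K);   % vector of fold ids in 1..K

% Count how many observations landed in each fold.
foldSizes = accumarray(indices(:), 1, [K 1]);
disp(foldSizes');

% The guessed rule of thumb, applied to the label-vector call from the
% question (group = 1:11): 11 distinct labels versus 11/5 = 2.2.
group = 1:11;
ruleHolds = numel(unique(group)) <= numel(group) / K;   % false here
```

With N = 11 and K = 5, I would expect every fold to be populated with 2 or 3 points. For group = 1:11 the rule of thumb fails, which is consistent with a fold coming back empty.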
However, in your problem, if you really have very few data points, then it is probably better to do leave-one-out cross validation, which you could do with:
although a better method would be:
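As a hedged sketch (my assumption about a reasonable way to do it, not necessarily the exact calls meant above), leave-one-out over N points can be set up by making the number of folds equal to N in the 'Kfold' form, so that each fold should hold exactly one observation. The commented-out data, fit and evaluate steps are hypothetical placeholders.

```matlab
% Sketch: leave-one-out as N-fold cross-validation (assumed approach).
N = 11;                                  % number of data points
indices = crossvalind('Kfold', N, N);    % K == N: one point per fold

for k = 1:N
    testMask  = (indices == k);          % logical mask of the held-out point
    trainMask = ~testMask;               % the remaining N-1 points
    % Hypothetical: fit on data(trainMask, :), evaluate on data(testMask, :)
    fprintf('fold %d: test point %d, %d training points\n', ...
            k, find(testMask), nnz(trainMask));
end
```

Going through the folds this way means every point is held out exactly once, which is what you want when there are only a handful of observations.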