K-fold cross-validation
Hi all, I have a small data set of 90 rows. I am using cross-validation in my process, but I am unsure how to decide on the number of folds k.
I tried 3, 5, and 10, and 3-fold cross-validation performed best. Could you please help me choose k? I am a little biased towards choosing 3 because it is small.
Answers
Hi,
Please post your process XML and describe the problem and the data a bit. Your question is too vague to answer right now or to be useful to other people.
Best,
Sebastian
Hi @k_vishnu772
Keep in mind that cross-validation is intended to estimate the average performance of a model; choosing a different k will not by itself make your model perform better. I think it is always better to look at the performance on a held-out test set, but I am also afraid that a data set of only 90 rows is still too small to get a good performance estimate.
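To make the point concrete, here is a minimal sketch in plain Python of what changing k does on 90 rows. The labels are synthetic and the majority-label "model" is a hypothetical stand-in (neither comes from the original poster's process), and the helper names `kfold_indices` and `cross_validate` are made up for illustration. The model is the same in every run; only the number and size of the test folds change, and with them the averaged estimate.

```python
import random
from statistics import mean, stdev

def kfold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k nearly equal test folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_label(labels):
    """Hypothetical stand-in model: always predict the most common training label."""
    return max(set(labels), key=labels.count)

def cross_validate(labels, k, seed=0):
    """Return the per-fold accuracy of the majority-label baseline."""
    scores = []
    for test_idx in kfold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        pred = majority_label(train_labels)
        scores.append(mean(1.0 if labels[i] == pred else 0.0 for i in test_idx))
    return scores

# Synthetic labels for 90 rows (made up for illustration, not real data).
rng = random.Random(42)
labels = [1 if rng.random() < 0.6 else 0 for _ in range(90)]

for k in (3, 5, 10):
    scores = cross_validate(labels, k)
    print(f"k={k:2d}  test rows per fold={90 // k:2d}  "
          f"mean accuracy={mean(scores):.3f}  fold spread={stdev(scores):.3f}")
```

With 90 rows, each test fold holds 30, 18, or 9 rows for k = 3, 5, and 10, so a single fold score rests on very few examples, and a difference between the averaged estimates for different k can easily be noise.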
Hi!
I agree with kypexin. Cross-validation is not about getting the best performance.
Going with 10 is a good approach, as it has been shown again and again to give stable results (the performance estimate differs little from what you would get with 9 or 11 folds).
If you have that little data, you could even try leave-one-out validation, which on your data set amounts to 90-fold cross-validation. It won't give you the best performance figure (the one from 3-fold validation came out best by chance), but it will give you the most stable and reliable estimate of the performance you can expect.
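As a rough illustration of leave-one-out on 90 rows, here is a minimal sketch in plain Python. The one-feature data is synthetic and the nearest-mean classifier is a hypothetical stand-in for whatever learner the process actually uses; `leave_one_out` and `nearest_mean_predict` are names made up for this example. Each row is held out exactly once, so there are 90 folds with 89 training rows each.

```python
import random

def leave_one_out(n_rows):
    """Yield (train_indices, test_index); each row is held out exactly once."""
    for i in range(n_rows):
        yield [j for j in range(n_rows) if j != i], i

def nearest_mean_predict(train_x, train_y, x):
    """Hypothetical stand-in model: pick the class whose training mean is closest to x."""
    means = {}
    for label in set(train_y):
        values = [v for v, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(means[label] - x))

# Synthetic one-feature data for 90 rows (made up for illustration).
rng = random.Random(0)
ys = [i % 2 for i in range(90)]
xs = [rng.gauss(2.0 * y, 1.0) for y in ys]

hits = 0
n_folds = 0
for train_idx, test_i in leave_one_out(len(xs)):
    pred = nearest_mean_predict([xs[j] for j in train_idx],
                                [ys[j] for j in train_idx],
                                xs[test_i])
    hits += int(pred == ys[test_i])
    n_folds += 1

loo_accuracy = hits / n_folds
print(f"folds={n_folds}  training rows per fold={len(xs) - 1}  "
      f"accuracy={loo_accuracy:.3f}")
```

The model is retrained 90 times, which is cheap at this data size, and unlike 3-, 5-, or 10-fold validation the result does not depend on how the rows happen to be shuffled into folds.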
Regards,
Balázs