Is it logical that testing error be lower than training error?

njasaj New Altair Community Member
edited November 5 in Community Q&A
Hi Rapidminer Community,
I used the SVM (LibSVM) operator to build a regression model. After training with 10-fold cross-validation, the resulting correlation coefficient was 84 and the RMSE was 0.048. Applying this model to the test data set, I got a correlation coefficient of 88.5 and an RMSE of 0.037. Now I need to know: is it possible or logical that the testing error is lower than the training error?
Thanks.

Answers

  • fras New Altair Community Member
    Hi,
    yes, this is possible. Keep in mind that using only one test set does
    _not_ deliver representative results. That's why we use cross-validation, which
    averages over more than one test set. So trust cross-validation for choosing the
    right SVM parameters, and finally train your model on the full data.
    Cheers, Frank
  • njasaj New Altair Community Member
    Thanks for your reply. Would you mind explaining this a bit more? I thought cross-validation was only for the training data set, and that after finding the model parameters you just apply the model to the test data set. As I understand it, you recommend using cross-validation on the test data? So what happens to the other part of the data that was split off by CV for training?
    Thank you.
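To make the point about fold-to-fold variability concrete: the same situation can be sketched in Python with scikit-learn (RapidMiner itself is GUI-based, so this is only an illustrative analogue using a synthetic data set, not the original data or operators). The 10-fold CV estimate is an average over ten different test folds, and the individual fold errors spread around that mean. A single held-out test set produces just one number, which can easily land below the CV mean by chance.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVR

# Synthetic regression data standing in for the original set (assumption).
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = SVR(kernel="rbf", C=10.0)

# 10-fold CV on the training set: the reported RMSE is a mean over ten folds,
# and the per-fold values spread around it.
cv_rmse = -cross_val_score(
    model, X_train, y_train, cv=10, scoring="neg_root_mean_squared_error"
)
print(f"CV RMSE: min={cv_rmse.min():.3f}  mean={cv_rmse.mean():.3f}  "
      f"max={cv_rmse.max():.3f}")

# A single held-out test set gives only one number; it may fall anywhere
# within (or even outside) the spread of the CV folds.
model.fit(X_train, y_train)
test_rmse = np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2))
print(f"Single test-set RMSE: {test_rmse:.3f}")
```

Comparing the single test-set RMSE against the min/mean/max of the CV folds shows why one test-set result below the CV average is unremarkable: it is one draw from the same distribution the folds are sampling.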