Production model vs Model

User36964
User36964 New Altair Community Member
edited November 5 in Community Q&A
When I searched for the difference between the model and the production model, I found this: "The ‘production model’ is using exactly the same preprocessing, feature sets, optimized parameters etc. - but it uses ALL labeled data for training. This is the model you should use in production and it makes use of all available information."

But if we use all the labeled data in the training phase, how can we tell whether the model overfits? As far as I know, the reason for not using all the labeled data for training is to avoid overfitting, and of course to be able to measure the model's prediction performance.

Best Answer

  • BalazsBarany
    BalazsBarany New Altair Community Member
    Answer ✓
    Hi!

    The general assumption behind cross validation is that a model built from all the data is not worse than the average of the models built from the training subsets. With 10-fold cross validation you build a model on 90 % of the data and validate it on the remaining 10 %, then repeat this with a different held-out subset until every subset has been used for validation once. An overfitted model would give you worse results in this scenario than a non-overfitted one, so the cross-validated performance already shows whether the modeling approach overfits.

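    When doing 10-fold cross validation and connecting the mod (model) output, an eleventh model is built on all the data. This is the "production model". Its performance is not measured separately; the estimate from the ten validation runs is used for it.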

    Regards,
    Balázs 
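
The procedure described in this answer can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not what the Cross Validation operator does internally: the data are synthetic, a simple least-squares line stands in for whatever learner is used, and the names `fit_line`, `mse` and `cross_validate` are made up for this example.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit y = a*x + b (a stand-in for any learner)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def mse(model, xs, ys):
    """Mean squared error of a fitted line on the given examples."""
    a, b = model
    return statistics.fmean((a * x + b - y) ** 2 for x, y in zip(xs, ys))

def cross_validate(xs, ys, k=10, seed=0):
    """Return (production_model, per-fold validation errors)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint held-out subsets

    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Train on the other k-1 folds, validate on the held-out fold.
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        fold_errors.append(
            mse(model, [xs[i] for i in held_out], [ys[i] for i in held_out])
        )

    # The k fold models are discarded; only their errors are kept.
    # The "eleventh" model is trained on ALL labeled data.
    production_model = fit_line(xs, ys)
    return production_model, fold_errors

# Synthetic data (assumption): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

production_model, fold_errors = cross_validate(xs, ys, k=10)
estimated_mse = statistics.fmean(fold_errors)
print("estimated MSE from the 10 folds:", estimated_mse)
print("production model (slope, intercept):", production_model)
```

The overfitting check comes from `fold_errors`, which are all measured on data the respective fold model never saw. The production model is never scored on its own; it reuses the same learner and settings and simply gets the remaining 10 % of the data as well.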
