Achieved decent accuracy with random dep variable values

tkaiser New Altair Community Member
edited November 5 in Community Q&A

I had a gradient boosted tree (GBT) classification model, generated using Auto Model, that produced a 70% f-measure for a given dependent variable value. I then replaced the dependent variable with random numbers and ran a GBT model again on exactly the same example data, and the f-measure was 65%. That is much closer than I had expected, and I am wondering how that can be the case. Thank you.

Answers

  • MartinLiebig
    Altair Employee

    Hi,

    What is the f-measure if you run a Default Model operator?

    BR,
    Martin
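For readers unfamiliar with the idea behind this question, here is a minimal Python sketch (not RapidMiner code) of a default-model style baseline. It assumes the baseline always predicts the most frequent class; the labels, the 7/3 class split, and the helper names `f_measure` and `majority_baseline` are hypothetical and chosen only for illustration. The point is that a model with no information at all can still score a fairly high f-measure on the majority class, so a GBT result is best judged against such a baseline.

```python
from collections import Counter


def f_measure(y_true, y_pred, positive):
    """F1 score for one class, computed from true/false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_baseline(y_train, n_predictions):
    """Predict the most frequent training label for every example."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n_predictions


# Hypothetical, imbalanced labels: 7 "yes" and 3 "no".
labels = ["yes"] * 7 + ["no"] * 3
predictions = majority_baseline(labels, len(labels))

baseline_f_yes = f_measure(labels, predictions, positive="yes")
baseline_f_no = f_measure(labels, predictions, positive="no")
print(f"baseline f-measure for 'yes': {baseline_f_yes:.3f}")
print(f"baseline f-measure for 'no':  {baseline_f_no:.3f}")
```

Whether this explains the 65% obtained with random labels depends on the class distribution and on which class the f-measure is reported for, neither of which is stated in the thread.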

  • tkaiser
    New Altair Community Member

    Sorry, but I am not sure where that would go in the Auto Model process.

    I have also uncovered a second, perhaps more pressing, problem. Auto Model ran a 3-fold cross-validation, which should validate the future accuracy of the predictive model, since it guarantees there is no overlap between training and test sets. F-measure was about 70% and accuracy 90%. I then did a manual hold-out: I gave 90% of my original data set to Auto Model (GBT again), intending to test the resulting model on the remaining 10%. The performance Auto Model reported on the 90% was a little lower than, but close to, the original measures. But when I applied the model to the 10% hold-out test set, it performed terribly. I would very much appreciate some guidance, as I have now lost confidence in my model's ability to predict future data.
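The evaluation protocol described above can be sketched in plain Python as follows. This is only an illustration of the index bookkeeping, not what Auto Model does internally; the function names `split_holdout` and `kfold`, the row count of 100, and the seed are all hypothetical. It sets aside 10% of the rows first, builds 3 folds from the remaining 90%, and checks that no row is shared between any training set, its test fold, and the hold-out.

```python
import random


def split_holdout(n_rows, holdout_fraction=0.1, seed=0):
    """Shuffle row indices and return (development, holdout) index lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_holdout = int(round(n_rows * holdout_fraction))
    return indices[n_holdout:], indices[:n_holdout]


def kfold(indices, k=3):
    """Yield (train, test) index lists; every index is tested exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 100 rows: 90 for cross-validation, 10 held out.
development, holdout = split_holdout(100, holdout_fraction=0.1, seed=42)
folds = list(kfold(development, k=3))

for train, test in folds:
    # Within cross-validation, training and test rows never overlap.
    assert set(train).isdisjoint(test)
    # The hold-out rows are never seen during cross-validation.
    assert set(train).isdisjoint(holdout) and set(test).isdisjoint(holdout)

print(len(development), len(holdout), [len(test) for _, test in folds])
```

If the splits are disjoint like this and the hold-out score is still far below the cross-validated one, a common suspect is some step (for example feature engineering or selection) that was fitted on all rows before the split. Whether that applies here cannot be determined from the thread.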