Achieved decent accuracy with random dep variable values
I had a gradient boosted tree classification model, generated using Auto Model, that produced a 70% f-measure for a given dependent variable. But then I replaced the dependent variable with random values and ran a GBT model again on the exact same example data, and the f-measure was 65%. That is much closer than I expected, and I am wondering how that can be the case. Thank you.
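One way to see how this can happen, outside Auto Model, is to repeat the experiment on a small synthetic data set: when the classes are imbalanced, a GBT trained on shuffled labels can still reach a respectable weighted f-measure simply by predicting the majority class. A minimal sketch, assuming scikit-learn and a roughly 90/10 class split (both of which are assumptions, not details from the original post):

```python
# Hypothetical sanity check (not the Auto Model process itself): shuffle the
# label and see what F-measure a GBT still reaches. On an imbalanced data set
# the weighted score can stay high because predicting the majority class is
# already "mostly right".
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic, imbalanced stand-in for the real data (assumed ~90/10 split)
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

gbt = GradientBoostingClassifier(random_state=0)
real = cross_val_score(gbt, X, y, cv=3, scoring="f1_weighted").mean()

rng = np.random.default_rng(0)
y_shuffled = rng.permutation(y)   # destroys any real relationship to X
fake = cross_val_score(gbt, X, y_shuffled, cv=3, scoring="f1_weighted").mean()

print(f"F-measure with real labels:     {real:.2f}")
print(f"F-measure with shuffled labels: {fake:.2f}")
```

If the two numbers are close, the score is being driven mostly by the class distribution rather than by any learned signal, which is exactly what a baseline comparison is meant to reveal.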
Answers
Hi,
what is the f-measure if you run a Default Model operator?
BR,
Martin
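For readers not familiar with RapidMiner: the Default Model operator serves as a trivial baseline, for example always predicting the majority class. A rough stand-in for that check, sketched with scikit-learn's DummyClassifier on synthetic data (an assumption for illustration, not part of Auto Model):

```python
# Hypothetical baseline check, analogous in spirit to a Default Model:
# always predict the majority class. If the GBT's F-measure is close to
# this baseline, the model has learned little beyond the class ratio.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

baseline = DummyClassifier(strategy="most_frequent")
score = cross_val_score(baseline, X, y, cv=3, scoring="f1_weighted").mean()
print(f"Majority-class baseline F-measure: {score:.2f}")
```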
Sorry, but I am not sure where that Default Model operator would go in the Auto Model process.
And I have now uncovered a second, perhaps more pressing, problem. Auto Model ran a 3-fold cross-validation, which is meant to estimate the model's future accuracy and guarantees there is no overlap between training and test sets. The f-measure was about 70% and accuracy about 90%. I then did a manual hold-out: I gave 90% of my original data set to Auto Model (GBT again), intending to test the model on the remaining 10%. The performance Auto Model reported was a little lower, but close to the original measures. But when I applied the model to the hold-out test set, it performed terribly. I would very much appreciate some guidance, as I have now lost confidence in my model's ability to predict future data.
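One way to narrow down why the hold-out behaves so differently is to rebuild the split outside Auto Model and make sure the hold-out rows never influence training in any way (no re-sampling or preprocessing fitted on the full data). A minimal sketch of that check, assuming numeric features and a pandas DataFrame with a label column named "label" (the file name, column name, and feature types are all assumptions):

```python
# Hypothetical hold-out check: split first, fit everything on the training
# part only, then score once on the untouched hold-out. A large gap between
# the cross-validation score and the hold-out score often points to leakage,
# a shifted class distribution, or a hold-out set too small to be stable.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split, cross_val_score

df = pd.read_csv("my_data.csv")          # placeholder file name
X = df.drop(columns=["label"])
y = df["label"]

# 90/10 split, stratified so the hold-out keeps the same class ratio
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0)

gbt = GradientBoostingClassifier(random_state=0)

# Cross-validation estimate computed on the training part only
cv_f1 = cross_val_score(gbt, X_train, y_train, cv=3,
                        scoring="f1_weighted").mean()

# Final check on the untouched hold-out
gbt.fit(X_train, y_train)
hold_f1 = f1_score(y_hold, gbt.predict(X_hold), average="weighted")

print(f"3-fold CV F-measure: {cv_f1:.2f}")
print(f"Hold-out F-measure:  {hold_f1:.2f}")
```

If the hold-out score collapses only in the original workflow but not in a clean split like this, the likely culprits are leakage between training and test rows, inconsistent preprocessing of the hold-out data, or a hold-out set that is too small or not representative.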