"AdaBoost performance on new data (test dataset) MUCH worse than without AdaBoost"
miaque
New Altair Community Member
Hello,
I have the following problem:
I am working with a dataset for a digit recognition classification problem.
The dataset has 64 regular attributes plus one for the class. It contains nearly 5000 examples and is split into a training set (30 digit writers) and a test set (14 other, new writers).
For my study project I am required to use the meta-learning operators. The problem is this: without the AdaBoost operator, accuracy is approx. 85% on the training set (X-Validation) and approx. 80% on the test set (new data). When I add AdaBoost, the X-Validation result on the training set improves to approx. 90%, but the result on the new data gets MUCH WORSE - only 20% accuracy!
Does anyone know what the issue could be?
Thank you!
Answers
Seems like you are overtraining, right?
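To put numbers on that, here is a small sketch that computes the drop from X-Validation accuracy to test accuracy for both setups, using only the approximate figures from the question. The function name `generalization_gap` and the chance-level line are my own additions for illustration; the chance level assumes 10 roughly balanced digit classes, which the post does not actually state.

```python
def generalization_gap(cv_accuracy, test_accuracy):
    """Drop from cross-validation accuracy to test accuracy, in percentage points."""
    return cv_accuracy - test_accuracy


# Approximate accuracies (in %) as reported in the question: (X-Validation, new-writer test set)
results = {
    "base learner": (85.0, 80.0),
    "AdaBoost": (90.0, 20.0),
}

gaps = {name: generalization_gap(cv, test) for name, (cv, test) in results.items()}

# Assumption: 10 digit classes of roughly equal size, so random guessing gives about 10%.
chance_level = 100.0 / 10

for name, gap in gaps.items():
    print(f"{name}: accuracy drops by {gap:.0f} points on the new writers")
print(f"chance level (assumed balanced classes): {chance_level:.0f}%")
```

A small gap like the one for the base learner is what you would normally expect when moving to new writers. The AdaBoost gap is much larger and leaves the test accuracy not far above the assumed chance level, so it is worth checking the boosted setup closely.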