AdaBoost individual model performance
Thiru
New Altair Community Member
I'm using AdaBoost + k-NN on my data, which gives an accuracy of 77.24%, along with precision and recall.
AdaBoost is configured with 10 iterations.
Is there any way in RapidMiner to view the performance of the model in each iteration, and the weights assigned in successive iterations?
Please let me know. Thanks.
Regards,
Thiru
Answers
-
Hello @Thiru
Is this what you are looking for? The image is from inside the AdaBoost operator: we calculate the training performance and store it for each iteration using the Store operator. The naming convention used for the Store operator is "Adaboost_%{execution_count}". The %{execution_count} macro helps store the performance at each iteration. I am not sure whether we can extract the AdaBoost weights.
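The thread does this inside RapidMiner with the Store operator, but for intuition, here is a minimal Python/scikit-learn sketch of the same idea (an assumed analogue, not the RapidMiner process itself): it records the ensemble's training accuracy after each boosting iteration and also reads out the per-iteration estimator weights the poster asked about. Note that scikit-learn's AdaBoost requires a base learner that supports sample weights, so this uses the default decision stumps rather than k-NN; the synthetic dataset is a stand-in for the thread's data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic data standing in for the thread's dataset
X, y = make_classification(n_samples=500, random_state=0)

# 10 boosting iterations, mirroring the AdaBoost operator's setting
clf = AdaBoostClassifier(n_estimators=10, random_state=0).fit(X, y)

# staged_score yields the ensemble's accuracy after each boosting iteration,
# which is what storing "Adaboost_1" ... "Adaboost_10" captures in RapidMiner
train_curve = list(clf.staged_score(X, y))
for i, acc in enumerate(train_curve, start=1):
    print(f"Adaboost_{i}: training accuracy = {acc:.4f}")

# The weight assigned to each weak learner in the final weighted vote
print("estimator weights:", clf.estimator_weights_)
```

So in scikit-learn the per-iteration weights are directly accessible as `estimator_weights_`, whereas in RapidMiner (as noted above) it is unclear how to extract them.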
Do let us know if this helps.
-
Hello @varunm1,
Thanks for your reply. Could you please elaborate on how to use the Store operator together with macros to get the performance during each iteration? I'm relatively new to RapidMiner. In the process, I've tried the Set Macro and Generate Macro operators, but that doesn't help. Awaiting your reply. Thank you.
-
Hello @Thiru
You don't need to generate a macro. There are predefined macros; in this case I used the %{execution_count} macro in the name of the Store operator. The reason is that AdaBoost iterates 10 times, which means you can get 10 training performances. Since you need all 10 performances, you need to save them under a dynamic name that updates after every iteration. To do this, I used "Adaboost_%{execution_count}" as the name for storing the performance. %{execution_count} counts the number of times a particular operator executes; since the Store operator is located inside AdaBoost, it runs 10 times and names the performances Adaboost_1, Adaboost_2, Adaboost_3, ...
Please find the attached .rmp file. Import it to RM and check inside Adaboost operator.
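The dynamic-naming trick can be sketched in plain Python (a hypothetical analogue of `Adaboost_%{execution_count}`, not RapidMiner code): an execution counter builds a fresh key for each iteration's result, so no stored performance overwrites the previous one. A dict stands in for the repository.

```python
from itertools import count

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=300, random_state=1)
clf = AdaBoostClassifier(n_estimators=10, random_state=1).fit(X, y)

# Mimic the Store operator's "Adaboost_%{execution_count}" naming:
# each stored result gets a unique, auto-incremented key.
execution_count = count(start=1)
repository = {}  # stands in for the RapidMiner repository
for acc in clf.staged_score(X, y):
    repository[f"Adaboost_{next(execution_count)}"] = acc

print(list(repository))  # Adaboost_1 through Adaboost_10
```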
-
Hello @varunm1, thanks for your reply. Where can I view the performance of all 10 models? Do you mean the output of the validation operator? We get that with AdaBoost + Decision Tree, the case you used, but if I go for AdaBoost + k-NN, I can't view all 10 models. Could you please look into this? Thanks.
Regards,
Thiru
-
Hello @Thiru
You can't view them directly; you need to store them first using the Store operator. That is what I did in the attached process. You need to change the store location, as the earlier one is linked to my repository, and name the results in Store with the macro as described in my earlier post. Once you run the process, the Store operator will store the results as adaboost_1, adaboost_2, ... in the repository you specified.
Attach the Store operator as I did, point it to a repository location, give it the name Adaboost_%{execution_count}, then run the process and check that repository location; you will find the results there.
-
Hello @varunm1,
Thanks for your reply. I had only checked the file you sent.
OK, I got it. I retrieved the stored results in a new process and viewed them.
1. The Adaboost_1 performance shows 89.04% accuracy, and Adaboost_10 shows 99.13%, but the overall model performance is only 67.74%. Is that because Adaboost_1 to Adaboost_10 are measured on the training data rather than the test data, while 67.74% is from the test data?
2. The file you sent shows the count going from Adaboost_1 to Adaboost_20, whereas the number of iterations in the AdaBoost operator is set to 10. How do we get 20?
Awaiting your reply on the above. Thanks.
Regards,
Thiru
-
Hello @Thiru
1. AdaBoost tries to improve an algorithm by taking the misclassified samples in each iteration to build the next classifier, so this happens on the training side. The outcome of this training is an ensemble of decision trees, which is then applied to the test data to check how well the trained model performs. So Adaboost_1 to Adaboost_10 are training performances; you can see the trained model improving across iterations. But the test performance is only 67%, which means you still need to tweak parameters, or the model is overfitting.
2. Yes, you will have 20 if the "mod" port of the validation operator is connected. The reason is that the split validation operator runs the training side twice when its "mod" port is connected to any other operator or result: once on the 70% training data (in the case of a 70:30 split), and once more on the whole data after validation is complete. To avoid this, just remove the connection from the "mod" port. If you do want to keep it, the results are simple to distinguish: the first 10 performances relate to the 70% training data, and performances 11 to 20 relate to the whole data.
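The train/test gap described in point 1 can be reproduced in a small Python/scikit-learn sketch (an assumed analogue of the RapidMiner process, using a 70:30 split and the default tree-based weak learner, with label noise added to make overfitting visible): the training accuracy climbs across boosting iterations while the held-out accuracy lags behind, which is the signal that the model is overfitting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# flip_y injects label noise so the train/test gap shows up clearly
X, y = make_classification(n_samples=1000, flip_y=0.2, random_state=2)

# 70:30 split, as in the thread's validation setup
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=2)

clf = AdaBoostClassifier(n_estimators=10, random_state=2).fit(X_tr, y_tr)

# Compare training vs test accuracy after every boosting iteration
train_scores = list(clf.staged_score(X_tr, y_tr))
test_scores = list(clf.staged_score(X_te, y_te))
for i, (tr, te) in enumerate(zip(train_scores, test_scores), start=1):
    print(f"iter {i:2d}: train={tr:.3f}  test={te:.3f}  gap={tr - te:+.3f}")
```

A widening gap between the two curves is the per-iteration view of exactly the situation in the thread: Adaboost_10 at 99% on training data versus 67.74% on the test split.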
Thanks, that clarifies it.
Regards,
Thiru