Classifier accuracy with Grid Search differs from accuracy without Grid Search

Safa
New Altair Community Member
Hello everyone, I'm using Grid Search to tune Random Forest parameters. When the process ends, it gives me the set of best parameters along with the accuracy they achieved. My question: when I run the process without Grid Search, setting the Random Forest parameters to the values Grid Search returned, I get a lower accuracy. Can anyone explain the difference? Both approaches are the same; the only difference is that the first runs inside Grid Search and the second runs without it.
I have included screenshots of my process.
My dataset is Glass Type with 214 samples. It contains 1 duplicate row and 6 imbalanced classes. I run my process as follows:
Send the dataset into the Optimize Parameters (Grid) operator.
Inside the Optimize Parameters (Grid) operator:
1- Remove duplicates
2- Normalize
3- Split the data 80:20
4- Apply SMOTE to the training data only
5- Train RF
6- Evaluate the model
Answers
-
Hi @Safa,
This is abnormal behavior if you are using the same datasets, so make sure that is the case. For example, I saw that you are using the Split operator; depending on its parameters, the training and test datasets may vary between runs. Try the process with fixed training and test datasets and check again.
Best
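To illustrate the point about the split, here is a minimal Python sketch (not RapidMiner code) of how a random 80:20 split only repeats when the seed is fixed. The function `split_indices` and the seed values are hypothetical and chosen for illustration; the row count of 213 assumes the 214 Glass samples minus the one duplicate row.

```python
import random

def split_indices(n_rows, train_fraction=0.8, seed=None):
    """Shuffle row indices and cut them into a train part and a test part."""
    rng = random.Random(seed)  # seed=None gives a different shuffle on each call
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = int(n_rows * train_fraction)
    return indices[:cut], indices[cut:]

# Assumed row count: 214 samples minus 1 duplicate row.
train_a, test_a = split_indices(213, seed=42)
train_b, test_b = split_indices(213, seed=42)  # same seed as the first run
train_c, test_c = split_indices(213, seed=7)   # different seed

print(test_a == test_b)  # same seed, so the test set is identical
print(test_a == test_c)  # different seed, so the test set differs
```

If the two runs being compared do not share the same test rows, their accuracies are not measured on the same data, so some difference is expected.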
-
Hi @ceaperez
I did as you said: I split the data and stored the results in two separate files.
After that, I ran Grid Search and got the best parameters and accuracy.
Then I tested without Grid Search, but I still get a lower accuracy.
Please check my screenshots and tell me if I'm doing something wrong.
-
Hi @Safa,
One of the most beautiful things about RapidMiner is that you have a complete view of your pipeline and can explore your model step by step. I saw in your model that the two accuracies are closer now than before. That is because we eliminated one source of randomness. The SMOTE operator is another one: if you apply the SMOTE operator to the same dataset twice, you will not obtain the same dataset. I invite you to explore your model using the pipeline, breakpoints, and the Compare Distributions operator from the Smile extension.
Best,
Cesar
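As a rough illustration of why SMOTE is a source of randomness, the Python sketch below implements a simplified SMOTE-style oversampler. It is an assumption-laden toy, not the RapidMiner operator: the function `smote_like`, the sample points, and the seeds are all hypothetical. Each synthetic point is placed at a random position between a minority sample and one of its nearest minority neighbours, so the output only repeats when the seed is the same.

```python
import random

def smote_like(minority, n_new, k=2, seed=None):
    """Simplified SMOTE: interpolate between a minority point and one of
    its k nearest minority neighbours, n_new times."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # random position on the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature minority class.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]

run_1 = smote_like(minority, n_new=3, seed=42)
run_2 = smote_like(minority, n_new=3, seed=42)  # same seed as run_1
run_3 = smote_like(minority, n_new=3, seed=43)  # different seed

print(run_1 == run_2)  # same seed reproduces the same synthetic points
print(run_1 == run_3)  # a different seed produces different synthetic points
```

A model trained on `run_1` and one trained on `run_3` see different training data, so their accuracies can differ even with identical Random Forest parameters.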
-
Hi @ceaperez, thank you for answering my question. I really appreciate it, and I have learned a few things from you.
I used SMOTE only once.
I also removed SMOTE and tested again without the Split operator, and I still get a lower accuracy. I think the Performance operator gives slightly different results inside Grid Search than outside it. Anyhow, thanks.
Best regards