Student Dataset is giving different classification accuracy using cross validation on RapidMiner 9.6

VikasRattan
VikasRattan New Altair Community Member
edited November 2024 in Community Q&A
The Student dataset gives a different classification accuracy on different runs when using cross-validation in RapidMiner 9.6 (educational version).

Best Answer

  • varunm1
    varunm1 New Altair Community Member
    Answer ✓
    Try running the attached process several times without changing anything and compare the results. I enabled the random seed for SMOTE, Cross-Validation, and Random Tree. You can import this process by going to File --> Import Process. You need to set a random seed for every operator that has that option. A fixed seed makes an operator generate the same data every time, and in Random Tree it makes the randomization identical on each run. Both are critical to producing reproducible results.

    Let me know if you still have issues.
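To illustrate why every randomized step needs its own seed, here is a minimal sketch in plain Python. It is not RapidMiner code: `make_data`, `shuffled_folds`, `train_random_stump`, and `cross_validate` are hypothetical stand-ins for a dataset, the fold split, a randomized learner, and the validation loop. Only the seed value 1922 comes from this thread.

```python
import random

def make_data(n=200, seed=0):
    """Build a small synthetic two-class dataset (hypothetical stand-in data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x + rng.gauss(0, 0.2) > 0.5 else 0
        rows.append((x, label))
    return rows

def shuffled_folds(rows, k, seed=None):
    """Shuffle the rows and deal them into k folds.

    With seed=None the generator is seeded from system entropy,
    so the folds differ from run to run.
    """
    rng = random.Random(seed)
    rows = rows[:]
    rng.shuffle(rows)
    return [rows[i::k] for i in range(k)]

def train_random_stump(train, seed=None, tries=5):
    """Pick the best of a few randomly drawn thresholds.

    A toy stand-in for a randomized learner such as a random tree.
    """
    rng = random.Random(seed)
    candidates = [rng.random() for _ in range(tries)]

    def accuracy(t):
        return sum((x > t) == bool(y) for x, y in train) / len(train)

    return max(candidates, key=accuracy)

def cross_validate(rows, k=5, split_seed=None, model_seed=None):
    """Return the mean accuracy over k folds."""
    folds = shuffled_folds(rows, k, split_seed)
    scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        t = train_random_stump(train, model_seed)
        scores.append(sum((x > t) == bool(y) for x, y in test) / len(test))
    return sum(scores) / k

data = make_data()

# Both random steps seeded: every run gives the same accuracy.
fully_seeded = [cross_validate(data, split_seed=1922, model_seed=1922) for _ in range(3)]

# Only the split seeded: the learner is still random, so accuracy may vary.
partly_seeded = [cross_validate(data, split_seed=1922) for _ in range(3)]

print("split and model seeded:", fully_seeded)
print("only split seeded:     ", partly_seeded)
```

The first list always contains three identical values. The second can vary from run to run because one random step was left unseeded, which is the same situation as seeding Cross-Validation but not SMOTE or Random Tree.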

Answers

  • varunm1
    varunm1 New Altair Community Member
    edited October 2020
    Hello @VikasRattan

    Did you set the "Random Seed" option in Cross-Validation? If not, your folds might be divided differently on each run. Also, which algorithm are you using inside Cross-Validation?


  • VikasRattan
    VikasRattan New Altair Community Member
    Varun Ji, I observed it for Random Forest, Random Tree, k-NN, and Naive Bayes. I tried it both with the random seed set (to 1922) and without setting a random seed. In both cases I got different accuracy on different runs. Even when I used stratified sampling, shuffled sampling, and linear sampling, I got different accuracy when executing at different points in time.
  • varunm1
    varunm1 New Altair Community Member
    Can you share your .rmp file? You can go to File --> Export Process and then attach that process here.
  • VikasRattan
    VikasRattan New Altair Community Member
    Sure Sir

    File is attached.