"Different performance from Backward Elimination when not using the operator"

aphongme New Altair Community Member
edited November 2024 in Community Q&A

I used the Backward Elimination operator to optimize the AUC of my logistic regression by eliminating some attributes. However, when I stop using the Backward Elimination operator and eliminate the same attributes myself with the Select Attributes operator (based on the Backward Elimination operator's results), the resulting AUC/performance is not the same (it is lower). The same happens with many optimization operators (Optimize Parameters (Grid), Forward Selection).

How do these optimization operators work, and how are they different from doing it manually (without an optimization operator)?

My data has 2030 instances with 33 features and 1 binary dependent variable.

Answers

  • lionelderkrikor
    lionelderkrikor New Altair Community Member

    Hi @aphongme,

     

    I'm not a specialist in feature selection algorithms, so I don't know why you don't obtain the same AUC manually as you do with the feature selection algorithms.

    However, for a partial answer about how these algorithms work, you can find a resource (especially part 1 / part 2) by following this link:

    https://community.rapidminer.com/t5/RapidMiner-Studio-Knowledge-Base/Multi-Objective-Feature-Selection-Part-1-The-Basics/ta-p/45775/jump-to/first-unread-message

     

    I hope it helps.

     

    Regards,

     

    Lionel

  • aphongme
    aphongme New Altair Community Member

    This also happens when I use the Optimize Parameters (Grid) operator. When I run the process with the parameters it returned, but without the operator, the AUC decreases significantly.

  • lionelderkrikor
    lionelderkrikor New Altair Community Member

    Hi again @aphongme,

     

    Can you verify your XML process and share it? (The process you shared in the other topic is broken.)

    One line of investigation would be to first build the ROC curves in the two cases (case 1: manual selection; case 2: feature selection or Optimize Parameters operators) and compare these curves (for example, with the Compare ROCs operator).

     

    Regards,

     

     

    Lionel
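
To make the ROC comparison suggested above concrete outside RapidMiner, here is a minimal sketch in plain Python. It is not how the Compare ROCs operator is implemented; it only shows how a ROC curve and its AUC follow from labels and confidence scores. The names `roc_points`, `auc`, `manual_scores` and `operator_scores` are hypothetical, and the six-row data set is a toy example, not the poster's data.

```python
def roc_points(labels, scores):
    """Return the ROC curve as (false positive rate, true positive rate) points."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by descending score so thresholds go from strictest to loosest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Consume every example tied at this score before emitting a point.
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (illustrative only): same true labels, two sets of model scores.
labels = [1, 1, 0, 1, 0, 0]
manual_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]    # case 1: manual selection
operator_scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2]  # case 2: inside the operator

manual_auc = auc(roc_points(labels, manual_scores))
operator_auc = auc(roc_points(labels, operator_scores))
print(manual_auc, operator_auc)
```

Plotting the two point lists on the same axes shows where the curves diverge, which is usually more informative than the two AUC numbers alone.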

  • Telcontar120
    Telcontar120 New Altair Community Member

    Are you making sure to use a specific random seed to ensure reproducible results?
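
As an illustration of why the seed question matters, here is a small sketch in plain Python. It does not reproduce RapidMiner's validation operators. It assumes a shuffled 10-fold split over the 2030 rows mentioned in the question; the function name `kfold_indices` and the seed values are hypothetical. With the same seed the folds are identical, and with a different seed the rows land in different folds, so a model evaluated on them can report a different AUC even with the same attributes and parameters.

```python
import random


def kfold_indices(n_rows, k, seed):
    """Shuffle row indices with a fixed seed and deal them into k folds."""
    rng = random.Random(seed)  # local generator, so the seed fully determines the split
    indices = list(range(n_rows))
    rng.shuffle(indices)
    # Fold i takes every k-th index starting at position i.
    return [indices[i::k] for i in range(k)]


# 2030 rows as in the question; the seeds are arbitrary example values.
folds_a = kfold_indices(2030, 10, seed=1992)
folds_b = kfold_indices(2030, 10, seed=1992)
folds_c = kfold_indices(2030, 10, seed=7)

print("same seed, same folds:", folds_a == folds_b)
print("different seed, same folds:", folds_a == folds_c)
```

If the manual run and the run inside the optimization operator do not share both the seed and the validation setup, some difference in AUC is expected from the split alone.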