How to find the most important features in a dataset?

User: "Christos_Karapapas"
New Altair Community Member
Updated by Jocelyn
I have a dataset in CSV format with more than 500 columns. I have imported it into a database, marking every column as polynominal since they all hold different types of information, and now I want to find which of those columns are the most important.

So far, I have managed to get a table of features and their weights using a "Weight by X" operator. The problem is that in the results every feature-value pair appears on its own row. What I want instead is to aggregate by feature and get a single weight for each one. I tried using the Aggregate operator, but with no luck.

As an example, this is what I get:
feature01-value05, weight:0,71
feature01-value13, weight:0,69
feature09-value03, weight:0,55

Instead I want something like this:
feature01, weight:0,7
feature09, weight:0,55
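
For illustration, here is a minimal sketch of that aggregation done outside RapidMiner, in plain Python. It assumes the weights table has been exported as (name, weight) pairs, that names follow the `feature-value` pattern shown above, and that averaging is the intended way to combine the weights (the example above is consistent with a mean, but that is an assumption). The function name `aggregate_weights` and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows mirroring the example output above
# (decimal commas converted to decimal points).
rows = [
    ("feature01-value05", 0.71),
    ("feature01-value13", 0.69),
    ("feature09-value03", 0.55),
]

def aggregate_weights(rows, sep="-"):
    """Group per-value weights by feature name and average them."""
    grouped = defaultdict(list)
    for name, weight in rows:
        # Everything before the last separator is treated as the feature name.
        feature = name.rsplit(sep, 1)[0]
        grouped[feature].append(weight)
    return {feature: mean(weights) for feature, weights in grouped.items()}

feature_weights = aggregate_weights(rows)
```

If another summary makes more sense for your data, `mean` can be swapped for `max` or `sum` without changing the rest.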
    User: "lionelderkrikor"
    New Altair Community Member
    Hi @chris_skg,

    I'm not able to reproduce the results you obtained.
    Here are the results I get by applying the Weight by Information Gain operator to the Golf dataset:

    So that we can reproduce what you observe and understand what's going on, can you please share:
     - your process XML or your process file (.rmp file)
     - your data

    Regards,

    Lionel


    User: "Christos_Karapapas"
    New Altair Community Member
    OP
    Accepted Answer
    Thank you so much Lionel! 

    I finally managed to figure it out. I was getting an ArrayIndexOutOfBoundsException from the Weight by Information Gain operator because of some missing values in my dataset, so I was trying various (wrong) operators to work around the problem. One of those was Nominal to Numerical, which apparently caused this behavior. Once I replaced it with the Replace Missing Values operator (obviously the right one for this job), everything worked as expected.
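
For readers working outside RapidMiner, the same order of operations (fill missing values first, then compute an information-gain weight) can be sketched in plain Python. This is only an illustration under stated assumptions: missing values are represented as `None`, they are filled with the column's most frequent value, and the toy `outlook`/`play` data is invented. It is not a description of how the RapidMiner operators are implemented.

```python
from collections import Counter
from math import log2

def replace_missing(column, missing=None):
    """Fill missing entries with the most frequent observed value (the mode)."""
    observed = [v for v in column if v is not missing]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is missing else v for v in column]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(column, labels):
    """Label entropy minus the weighted entropy after splitting on the column."""
    total = len(labels)
    remainder = 0.0
    for value in set(column):
        subset = [lab for v, lab in zip(column, labels) if v == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical toy data: one nominal attribute with a missing value, plus a label.
outlook = ["sunny", None, "rain", "rain", "sunny", "rain"]
play = ["no", "yes", "yes", "yes", "no", "yes"]

cleaned = replace_missing(outlook)      # the None becomes "rain", the mode
gain = information_gain(cleaned, play)  # weight of the cleaned attribute
```

Keeping the attribute nominal and only filling the gaps preserves one weight per attribute, which is the result the original question was after.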
    User: "lionelderkrikor"
    New Altair Community Member
    OK, @chris_skg,

    Glad that you finally found a solution!

    Regards,

    Lionel