"Concerning Feature Selection Implementation"
AnneG
I was wondering how the feature selection is implemented in RapidMiner. In the documentation of the feature selection operator I found the step
"Evaluate the attribute sets and select only the best k."
(in the forward selection description). Does that mean a classification is performed for each attribute set, and the best attribute sets are chosen depending on that performance? Or is some additional criterion, such as information gain, applied beforehand?
And does the Feature Selection operator really remove all redundant attributes? I found a hint in the Javadoc, but I would like to know how this is done, just to be sure.
Tagged: AI Studio, Feature Selection
land
Hi Anne,
Both methods are available in RapidMiner. The FeatureSelection operator represents what is called the "wrapper" approach, if I remember correctly. As you said, at least one learning step is involved for each combination, since you will normally use cross-validation to estimate the performance. The filter approach uses heuristics like information gain to select a number of attributes that seem to be best, but this does not necessarily match the learner's capabilities. If you have the computational power, I would recommend the wrapper approach.
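To make the distinction concrete, here is a minimal sketch of both approaches in scikit-learn. This is my own illustration with an arbitrary dataset, learner, and k, not RapidMiner's internals:

# Filter vs. wrapper feature selection, sketched in scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import (SelectKBest, SequentialFeatureSelector,
                                       mutual_info_classif)
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Filter approach: rank attributes by a heuristic (here mutual information,
# closely related to information gain) without training the final learner.
filter_sel = SelectKBest(score_func=mutual_info_classif, k=10).fit(X, y)

# Wrapper approach: forward selection, where every candidate attribute set
# is scored by cross-validating the actual learner.
learner = LogisticRegression(max_iter=5000)
wrapper_sel = SequentialFeatureSelector(
    learner, n_features_to_select=10, direction="forward", cv=5
).fit(X, y)

The wrapper run is far more expensive, since each candidate set costs a full cross-validated training, which is why the filter approach is the usual fallback when compute is tight.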
And no, as far as I know, there is no extra code that removes redundant attributes deterministically. They will probably be dropped by the forward selection anyway, since their information is already known. At least that holds in theory, unless you have a learner like Naive Bayes, which might actually profit from the double occurrence of the same information because it effectively reweights the attributes. So you never really know what's redundant.
But if you regard highly correlated attributes as redundant, you might start with the RemoveUselessAttributes and RemoveCorrelatedAttributes operators. This will at least lower the computational cost of a following FeatureSelection, since the number of attributes is reduced.
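For illustration, a rough Python equivalent of that correlation-based pruning step. The pairwise-correlation logic and the 0.95 threshold are my assumptions, not RapidMiner's exact defaults:

# Drop one attribute from each highly correlated pair before the
# (expensive) wrapper selection runs.
import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    corr = df.corr().abs()
    # Look only at the upper triangle so each pair is checked once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)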
Greetings,
Sebastian
AnneG
Thanks a lot for your quick answer. I am quite new to the topic of feature selection, and recent papers point out that correlation does not necessarily mean that attributes are redundant. I will explore the operators you named and see if this helps me. Once again, thank you.
Kind regards,
Anne
land
Hi,
that's of course correct: Correlation is no causality. And vice versa highly correlated features does not have to be redundant. But if two attributes are only linear combinations of each other, some learner like the LinearRegression will not use both.
Greetings,
Sebastian