Feature importance operators fail on datasets with features without any data

User: "yzan"
New Altair Community Member
Updated by Jocelyn
When an ExampleSet contains even a single feature that consists only of missing values, the following operators:
  • Weight by Information Gain Ratio
  • Weight by Information Gain
  • Weight by Gini
  • Weight by Uncertainty
fail with:
Exception: java.lang.ArrayIndexOutOfBoundsException
Message: 0
Similarly, Weight by Rules fails with:
Exception: com.rapidminer.example.AttributeTypeException
Message: Cannot map index of nominal attribute to nominal value: index 0 is out of bounds!
Known workaround: Run Remove Useless Attributes first.

Expected result: Zero weight for features without any data.
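
To make the expected result concrete, here is a minimal sketch of an information-gain weight that returns zero for a feature without any data instead of raising. It is written in plain Python and is not RapidMiner's implementation: the function names, the use of `None` as the missing-value marker, and the choice to ignore rows with a missing feature value are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature, labels):
    """Information gain of a nominal feature; None marks a missing value.

    Assumption: rows with a missing feature value are ignored, and a
    feature with no observed values at all gets weight 0.0.
    """
    observed = [(f, y) for f, y in zip(feature, labels) if f is not None]
    if not observed:
        # Feature without any data: report zero weight instead of failing.
        return 0.0
    groups = {}
    for f, y in observed:
        groups.setdefault(f, []).append(y)
    # Weighted entropy of the label within each observed feature value.
    remainder = sum(len(g) / len(observed) * entropy(g) for g in groups.values())
    return entropy([y for _, y in observed]) - remainder


labels = ["a", "a", "b", "b"]
empty_weight = information_gain([None, None, None, None], labels)
useful_weight = information_gain(["x", "x", "y", "y"], labels)
```

The only special handling is the early return when nothing was observed; a feature with data goes through the usual computation.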

Justification:
  1. Sometimes I want to report the relevance of all the features in the dataset.
  2. I dislike it when a time-consuming process fails because an unlucky random seed in cross-validation leaves a feature without any data...
Proposed action: Add a parameterized test that checks whether every feature weighting operator can handle a feature without any data (be it a nominal, numerical or date column).
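
The shape of such a test could look roughly like the sketch below: every weighting function is run against an all-missing column of every type, and any exception or non-zero weight is reported. This is plain Python rather than RapidMiner's Java test suite, and everything in it is hypothetical: the two stand-in weighters, the `WEIGHTERS` registry, and the missing-value markers (`None` for nominal and date columns, `NaN` for numerical columns).

```python
import itertools
import math


def is_missing(value):
    """True for None and for float NaN (the missing markers assumed here)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def weight_by_distinct_ratio(column, labels):
    """Hypothetical stand-in weighter: share of distinct observed values."""
    observed = [v for v in column if not is_missing(v)]
    if not observed:
        return 0.0  # feature without any data
    return len(set(observed)) / len(observed)


def weight_by_purity(column, labels):
    """Hypothetical stand-in weighter: majority-class share per feature value."""
    groups = {}
    for value, label in zip(column, labels):
        if not is_missing(value):
            groups.setdefault(value, []).append(label)
    if not groups:
        return 0.0  # feature without any data
    observed = sum(len(g) for g in groups.values())
    majority = sum(max(g.count(l) for l in set(g)) for g in groups.values())
    return majority / observed


# Hypothetical registry; a real suite would list every weighting operator.
WEIGHTERS = {
    "distinct_ratio": weight_by_distinct_ratio,
    "purity": weight_by_purity,
}

LABELS = ["yes", "no", "yes", "no"]

# One all-missing column per attribute type named in the proposal.
EMPTY_COLUMNS = {
    "nominal": [None] * 4,
    "numerical": [float("nan")] * 4,
    "date": [None] * 4,
}


def check_all_missing_features(weighters=WEIGHTERS):
    """Run every weighter on every all-missing column and collect failures."""
    failures = []
    cases = itertools.product(weighters.items(), EMPTY_COLUMNS.items())
    for (name, weighter), (kind, column) in cases:
        try:
            weight = weighter(column, LABELS)
        except Exception as exc:  # a crash is exactly what the test should catch
            failures.append((name, kind, repr(exc)))
            continue
        if weight != 0.0:
            failures.append((name, kind, f"expected 0.0, got {weight}"))
    return failures


failures = check_all_missing_features()
```

An empty `failures` list means every weighter returned zero for every column type. Each (weighter, column type) pair is an independent case, so one failing operator does not hide the others.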

Reasoning: I didn't test all the operators, and there is a good chance that other operators share the same "halt the world" trait.






