Feature importance operators fail on datasets with features without any data
When an ExampleSet contains even a single feature that consists only of missing values, the following operators:
- Weight by Information Gain Ratio
- Weight by Information Gain
- Weight by Gini
- Weight by Uncertainty
fail with:
Exception: java.lang.ArrayIndexOutOfBoundsException
Message: 0
Similarly, Weight by Rules fails with:
Exception: com.rapidminer.example.AttributeTypeException
Message: Cannot map index of nominal attribute to nominal value: index 0 is out of bounds!
Known workaround: Apply Remove Useless Attributes first.
Expected result: Zero weight for features without any data.
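To illustrate the expected result, here is a minimal sketch of an information gain computation that returns zero weight for a feature without any data. It is not RapidMiner's implementation: the class `ZeroWeightSketch`, the method names, and the convention that `null` stands for a missing value are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not RapidMiner code.
class ZeroWeightSketch {

    // Shannon entropy (base 2) of a count distribution; 0.0 when there are no examples.
    static double entropy(Map<String, Integer> counts, int total) {
        if (total == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / total;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Information gain of a nominal feature with respect to a nominal label.
    // Assumption: a null entry in `feature` is a missing value and is skipped.
    static double informationGain(String[] feature, String[] label) {
        Map<String, Integer> labelCounts = new HashMap<>();
        Map<String, Integer> valueTotals = new HashMap<>();
        Map<String, Map<String, Integer>> labelCountsPerValue = new HashMap<>();
        int known = 0;
        for (int i = 0; i < feature.length; i++) {
            if (feature[i] == null) {
                continue;
            }
            known++;
            labelCounts.merge(label[i], 1, Integer::sum);
            valueTotals.merge(feature[i], 1, Integer::sum);
            labelCountsPerValue
                    .computeIfAbsent(feature[i], k -> new HashMap<>())
                    .merge(label[i], 1, Integer::sum);
        }
        // Guard: a feature without any data carries no information, so its weight is zero.
        if (known == 0) {
            return 0.0;
        }
        double gain = entropy(labelCounts, known);
        for (Map.Entry<String, Map<String, Integer>> e : labelCountsPerValue.entrySet()) {
            int n = valueTotals.get(e.getKey());
            gain -= ((double) n / known) * entropy(e.getValue(), n);
        }
        return gain;
    }

    public static void main(String[] args) {
        String[] label = {"yes", "no", "yes"};
        String[] allMissing = {null, null, null};
        System.out.println(informationGain(allMissing, label));
    }
}
```

The only part that matters for this report is the `known == 0` guard: the operator returns a weight instead of indexing into an empty set of observed values.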
Justification:
- Sometimes I want to report the relevance of all the features in the dataset.
- I dislike it when a time-consuming process fails because an unlucky random seed in cross-validation leaves a training fold in which some feature has no data.
Proposed action: Add a parameterized test that checks whether every feature weighting operator can handle a feature without any data (be it a nominal, numerical or date column).
Reasoning: I didn't test all the operators, and there is a good chance that other operators share the same "halt the world" trait.
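The shape of such a test could look roughly like the following table-driven sketch in plain Java. Everything in it is hypothetical: `AllMissingColumnCheck`, `guardedWeight` and `unguardedWeight` are stand-ins for real weighting operators, and `null` stands for a missing value. In an actual test suite the two nested loops would map onto the parameters of a parameterized test (for example JUnit 5's `@ParameterizedTest`).

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical sketch, not RapidMiner code.
class AllMissingColumnCheck {

    // Stand-in weighter that handles a column without any data.
    static double guardedWeight(Object[] column, String[] label) {
        int known = 0;
        for (Object v : column) {
            if (v != null) {
                known++;
            }
        }
        if (known == 0) {
            return 0.0; // expected result: zero weight
        }
        return (double) known / column.length; // placeholder score
    }

    // Stand-in weighter that fails on an all-missing column, purely for illustration.
    static double unguardedWeight(Object[] column, String[] label) {
        List<Object> observed = new ArrayList<>();
        for (Object v : column) {
            if (v != null) {
                observed.add(v);
            }
        }
        observed.get(0); // throws IndexOutOfBoundsException when nothing was observed
        return (double) observed.size() / column.length;
    }

    // Runs every weighter against every all-missing column type and collects the failures.
    static List<String> failures(Map<String, BiFunction<Object[], String[], Double>> weighters) {
        String[] label = {"yes", "no", "yes"};
        Map<String, Object[]> columns = new LinkedHashMap<>();
        columns.put("nominal", new String[3]);   // all entries are null
        columns.put("numerical", new Double[3]);
        columns.put("date", new Date[3]);

        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, BiFunction<Object[], String[], Double>> w : weighters.entrySet()) {
            for (Map.Entry<String, Object[]> c : columns.entrySet()) {
                String testCase = w.getKey() + " on " + c.getKey() + " column";
                try {
                    double weight = w.getValue().apply(c.getValue(), label);
                    if (weight != 0.0) {
                        failures.add(testCase + ": expected weight 0.0, got " + weight);
                    }
                } catch (RuntimeException e) {
                    failures.add(testCase + ": " + e.getClass().getSimpleName());
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, BiFunction<Object[], String[], Double>> weighters = new LinkedHashMap<>();
        weighters.put("guarded", AllMissingColumnCheck::guardedWeight);
        weighters.put("unguarded", AllMissingColumnCheck::unguardedWeight);
        for (String failure : failures(weighters)) {
            System.out.println(failure);
        }
    }
}
```

Collecting failures instead of stopping at the first exception means a single run lists every operator and column type combination that still breaks.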