discard attribute with more than x% missing values operator
dan_agape
New Altair Community Member
A suggestion: the operator named in the subject seems to be missing from RM. It is very useful in the data pre-processing step, and this simple but essential function is offered in almost every popular DM suite.
Best
Dan
Answers
Hi,
you are right, such an operator would be nice. I have uploaded a process with our new Community Extension which performs exactly the desired task. It is called "Discard Attribute with More than x% Missing Values (Loops + Macros)" and you can download and execute the process with a few clicks after having installed our new myExperiment Community Extension from the help menu of RapidMiner.
This process loops over all attributes and calculates the fraction of missing values for each attribute. If this fraction is larger than the threshold defined in the first "Set Macro" operator (macro: max_unknown), the attribute is removed from the example set.
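For readers who want to see the logic outside of RapidMiner, the loop described above can be sketched in plain Python. This is only an illustration, not the uploaded process itself: the function name `discard_sparse_attributes`, the list-of-dicts representation of the example set, and the use of `None` to mark a missing value are all assumptions made for the sketch. Only the name `max_unknown` is taken from the macro mentioned above.

```python
from typing import Any, Dict, List


def discard_sparse_attributes(
    rows: List[Dict[str, Any]], max_unknown: float
) -> List[Dict[str, Any]]:
    """Drop every attribute whose fraction of missing values exceeds max_unknown.

    Hypothetical helper: rows is a list of dicts (one per example),
    and a missing value is assumed to be represented by None.
    """
    if not rows:
        return []

    # Attribute names are taken from the first example (an assumption
    # that all examples share the same attributes).
    attributes = list(rows[0].keys())
    n_examples = len(rows)

    kept = []
    for attr in attributes:
        # Count the examples in which this attribute is missing.
        n_missing = sum(1 for row in rows if row.get(attr) is None)
        fraction_missing = n_missing / n_examples
        # Keep the attribute unless its missing fraction is larger
        # than the threshold.
        if fraction_missing <= max_unknown:
            kept.append(attr)

    # Rebuild the example set with only the kept attributes.
    return [{attr: row.get(attr) for attr in kept} for row in rows]


example_set = [
    {"a": 1, "b": None, "c": 2.0},
    {"a": 2, "b": None, "c": None},
    {"a": 3, "b": None, "c": 1.5},
    {"a": 4, "b": 7, "c": 0.5},
]

# With a threshold of 0.5, "b" (75% missing) is dropped,
# while "a" (0%) and "c" (25%) are kept.
filtered = discard_sparse_attributes(example_set, max_unknown=0.5)
```

Note that an attribute whose missing fraction is exactly equal to the threshold is kept here, which matches the "larger than" wording above.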
Cheers,
Ingo
Hi Ingo,
Thanks for the prompt reply. The RM team does a great job, and we, the users, thank you for that.
BTW, it is excellent that most of the Weka algorithms are included via a plug-in component. However, it might be useful to include all of Weka's pre-processing functionality as well, even though RM is already very strong in this area. In particular, the operator from the subject would then have been included.
Best
Dan