ARFF files with ? for nominal data

Kazmin
Kazmin New Altair Community Member
edited November 5 in Community Q&A
Hi all,
I'm new to RapidMiner, so I apologize in advance for any stupid comments that I make.

I have an ARFF file on which I am trying to run a Decision Tree. The problem is that one of my nominal variables has only "?" as values, and the decision tree algorithm fails with an error message. If I remove that variable beforehand, it finishes correctly with the right result. Is there any way to work around that problem? I am going to automatically process a lot of these ARFF files, which are also generated automatically, so it would be great if there were a way to handle the situation more gracefully.

Thank you very much for the help, it is highly appreciated.
Nikolay

Answers

  • land
    land New Altair Community Member
    Hi,
    there are several possible solutions:
    As a starter, you can use the Remove Useless Attributes operator to filter out all attributes that always have the same value. This also covers attributes whose values are unknown all the time.
    Another solution would be to use the Replace Missing Values or the Impute Missing Values operator. You could take a look at their documentation for more information.
    Last but not least, you could simply filter out attributes that have missing values with the Select Attributes operator.

    Which of these solutions suits you best depends on your task and on what you are going to do with the generated model.

    Greetings,
      Sebastian
  • Kazmin
    Kazmin New Altair Community Member
    Hey Sebastian,
    thank you very much for the quick and helpful reply; it was exactly what I needed.
    Best Regards,
    Nikolay
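
Since the files in question are generated and processed automatically, the first suggestion above (dropping attributes whose values are always unknown) could also be applied as a preprocessing step before the files reach RapidMiner. The following is only a minimal sketch of that idea in plain Python, not the RapidMiner operator itself. The function name `drop_all_missing_attributes` and the sample data are hypothetical, and the parser assumes dense, well-formed ARFF rows with no quoted values containing commas.

```python
def drop_all_missing_attributes(arff_text):
    """Return ARFF text without attributes whose every value is '?'.

    Assumption: dense ARFF data, one instance per line, and no quoted
    values that contain commas.
    """
    header, attributes, rows = [], [], []
    in_data = False
    for raw in arff_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@attribute"):
            attributes.append(line)
        elif lower.startswith("@data"):
            in_data = True
        else:
            header.append(line)  # e.g. the @relation line

    # Keep an attribute if there is no data at all, or if at least one
    # instance has a known value for it.
    keep = [
        i for i in range(len(attributes))
        if not rows or any(row[i] != "?" for row in rows)
    ]

    out = header + [attributes[i] for i in keep] + ["@data"]
    out += [",".join(row[i] for i in keep) for row in rows]
    return "\n".join(out) + "\n"


# Hypothetical sample: the attribute "note" has only unknown values.
sample = """@relation demo
@attribute outlook {sunny,rainy}
@attribute note {a,b}
@attribute play {yes,no}
@data
sunny,?,yes
rainy,?,no
"""

cleaned = drop_all_missing_attributes(sample)
print(cleaned)
```

In this sample the `note` attribute is removed from both the header and the data rows, while attributes with only some missing values would be left untouched for a later Replace Missing Values step.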