Cannot map index of nominal attribute to nominal value

Legacy User
Legacy User New Altair Community Member
edited November 5 in Community Q&A
Hi,

since I recently updated my RapidMiner installation (branch Zaniah), the
AttributeSubsetPreprocessing operator fails.

This is my model:

<operator name="Root" class="Process" expanded="yes">
    <operator name="CSVExampleSource" class="CSVExampleSource">
        <parameter key="filename" value="examples.csv"/>
        <parameter key="label_name" value="result"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="result"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <parameter key="process_special_attributes" value="true"/>
        <operator name="UserBasedDiscretization" class="UserBasedDiscretization">
            <list key="classes">
              <parameter key="no" value="1000000.0"/>
              <parameter key="yes" value="99.0"/>
            </list>
        </operator>
    </operator>
    <operator name="RandomForest" class="RandomForest">
    </operator>
</operator>
The dataset consists of numerical and nominal features, while the labels
are percentage values. To use them for classification, I preprocess
my data by replacing all labels (named result) >= 99.0 with "yes" and setting
the other labels to "no". This worked fine with RapidMiner 4.2. With the
recent version I get the error message:

AttributeTypeException
Process failed Message:
Cannot map index of nominal attribute to nominal value: index -1 is out of bounds!
Any ideas what is wrong?

Regards,
Paul

Answers

  • land
    land New Altair Community Member
    Hi Paul,
    did you try setting the upper bound of the label "no" to "Infinity"? This might help if values above 1000000 occur.
    But I must admit that the operator info states that an additional class will be introduced in that case, and for some reason this doesn't happen. I will check that, but I probably won't get it done before next year.

    Greetings,
      Sebastian
  • Legacy User
    Legacy User New Altair Community Member
    Hi Sebastian,

    yes, I've just tried it, but it didn't help. Also, in my case the
    values are never larger than 160.0, so the value range should
    not be exceeded.

    It would be nice if you could check it. Until then, I will switch back
    to RM 4.2.

    Regards,
    Paul
  • Legacy User
    Legacy User New Altair Community Member
    Hi Sebastian,

    I've found the problem: in one example, the label was missing. So
    this is not a bug in RapidMiner. Sorry. :o

    Can such problems be avoided in the future, i.e. is there a way
    to check the dataset for invalid examples with missing labels?
    Or must this be done in advance by the user?

    Regards,
    Paul
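
For anyone hitting the same error: one way to catch missing labels before the process runs is to scan the CSV file outside RapidMiner. The following is only a minimal Python sketch, not a RapidMiner feature. It assumes a comma-separated file with a header row and a label column named result, as in the process above. The function name find_missing_labels and the list of tokens treated as "missing" are assumptions made for illustration.

```python
import csv
import io


def find_missing_labels(lines, label_name="result", missing_tokens=("", "?", "nan")):
    """Return the 1-based data-row numbers whose label is missing or not numeric.

    `lines` is any iterable of CSV lines (for example an open file).
    `missing_tokens` is an assumed list of placeholders for a missing value.
    """
    reader = csv.DictReader(lines)
    bad_rows = []
    for row_number, row in enumerate(reader, start=1):
        # Short rows yield None for absent columns, so fall back to "".
        raw = (row.get(label_name) or "").strip()
        if raw.lower() in missing_tokens:
            bad_rows.append(row_number)
            continue
        try:
            float(raw)  # the labels are expected to be percentage values
        except ValueError:
            bad_rows.append(row_number)
    return bad_rows


# Small in-memory demo; rows 2 and 4 have no usable label.
sample = io.StringIO("feature,result\n1.0,99.5\n2.0,\n3.0,42\n4.0,?\n")
print(find_missing_labels(sample))
```

For a real file, something like `find_missing_labels(open("examples.csv", newline=""))` would list the rows to fix or drop before the file is loaded into the process.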