MultipleLabelIterator: how to specify positive/negative attribute values?

Legacy User
edited November 5 in Community Q&A
I'm using the MultipleLabelIterator, following the sample 07_Meta/05_MultipleLabelLearning.xml. However, I'm pulling my data from a database using DatabaseExampleSource. Then I apply ChangeAttributeRole operators to each attribute to make it of type 'label1', 'label2', and so on. The result looks like the sample dataset with 'positive' or 'negative' nominal features depending on whether each row exemplifies the given feature.

When run, RapidMiner fails on the AverageBuilder operator: "Cannot build average for different positive classes (positive/negative)."

Looking at the datasets I see that in the sample data, the Range for each label# feature is always "positive(##), negative(##)". In my dataset, I see that some features are listed as "negative(##), positive(##)".

It seems that RapidMiner is not matching the values 'positive' and 'negative' by name but is instead using their positions, which are not consistent across attributes when the data is loaded.

Is there a way to tell RapidMiner which nominal value is the positive classname? Or another way to work around this error?

Thanks,
Gary

Answers

  • haddock
    The error is your own; check out the last entry here: http://rapid-i.com/rapidforum/index.php/topic,776.0.html

    Just to prove the point again: using the "classes" parameter of DatabaseExampleSource avoids the error.
    <operator name="Root" class="Process" expanded="yes">
        <operator name="MultipleLabelGenerator" class="MultipleLabelGenerator">
        </operator>
        <operator name="NoiseGenerator" class="NoiseGenerator">
            <list key="noise">
            </list>
        </operator>
        <operator name="DatabaseExampleSetWriter" class="DatabaseExampleSetWriter">
            <parameter key="database_system" value="Microsoft SQL Server (Microsoft)"/>
            <parameter key="database_url" value="jdbc:sqlserver://localhost:1433;databaseName=Tradestation"/>
            <parameter key="username" value="sa"/>
            <parameter key="password" value="wL8/6ZO7YrXKa8XgQd4v7g=="/>
            <parameter key="table_name" value="Table1"/>
            <parameter key="overwrite_mode" value="overwrite"/>
        </operator>
        <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
            <parameter key="database_system" value="Microsoft SQL Server (Microsoft)"/>
            <parameter key="database_url" value="jdbc:sqlserver://localhost:1433;databaseName=Tradestation"/>
            <parameter key="username" value="sa"/>
            <parameter key="password" value="wL8/6ZO7YrXKa8XgQd4v7g=="/>
            <parameter key="table_name" value="Table1"/>
            <parameter key="label_attribute" value="label1"/>
            <parameter key="classes" value="positive negative"/>
        </operator>
        <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
            <parameter key="name" value="label2"/>
            <parameter key="target_role" value="label"/>
        </operator>
        <operator name="MultipleLabelIterator" class="MultipleLabelIterator" expanded="yes">
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="DecisionTree" class="DecisionTree">
                    <parameter key="minimal_size_for_split" value="10"/>
                    <parameter key="minimal_leaf_size" value="5"/>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="Performance" class="Performance">
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="AverageBuilder" class="AverageBuilder">
        </operator>
    </operator>
  • IngoRM
    Hello Gary,

    in the latest developer version there is a new operator called "InternalBinominalRemapping" which can be used for exactly this, i.e. for defining the positive class. Until it is released (if you do not want to check it out and compile it yourself), you could drop the MultipleLabelIterator and instead loop over all features: for each one, reload the data from the database using the current feature as a macro for the label column, and define the classes in the corresponding parameter of the DatabaseExampleSource.

    Cheers,
    Ingo
  • Legacy User

    @haddock: Thanks, that's a good tip to know. It doesn't seem to solve the problem in this case, however. The classes parameter appears to affect only the attribute named as the 'label_attribute'. I can see the values reverse in the Range column of the Data Table view as I pick different labels in the database source operator, but only for one attribute at a time. I need to change several of them.

    @Ingo: I don't see the InternalBinominalRemapping operator in the community CVS repository I've downloaded and updated. Does it have a different name or is it in the Enterprise repository?

    Thanks!
  • IngoRM
    Hi again,

    as I said, this operator was introduced in the latest developer version, which is currently the branch "Wasat". Here is more information about CVS access:

    http://rapid-i.com/rapidforum/index.php/topic,294.0.html

    Cheers,
    Ingo
  • Legacy User

    Thanks, Ingo. I have it working, now...

    Notes for others:
    • The new operator didn't show up in RapidMiner after I switched to the Wasat branch and reran. I had to redo 'ant copy-resources'.
    • The operator has a checkbox for 'apply to special attributes'. The original 'label' attribute and the new ones changed to 'label1', 'label2', etc. are special, so check this box.
    • The 'attributes' field in the operator takes a regular expression, so 'label\d+' (without quotes) works if your attributes are 'label1', 'label2', etc.
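
    A minimal sketch of how these notes might look in the process XML, placed after the ChangeAttributeRole operators and before the MultipleLabelIterator. The parameter keys (attributes, apply_to_special_features, negative_value, positive_value) are assumptions inferred from the GUI labels described above and are not confirmed against the Wasat branch, so check the operator's parameter list in your own build.

    ```xml
    <!-- Sketch only: the parameter keys below are assumed, verify them in your build -->
    <operator name="InternalBinominalRemapping" class="InternalBinominalRemapping">
        <!-- Regular expression matching label1, label2, ... -->
        <parameter key="attributes" value="label\d+"/>
        <!-- The label attributes are special, so they must be included explicitly -->
        <parameter key="apply_to_special_features" value="true"/>
        <!-- Assumed keys for fixing which nominal value is treated as which class -->
        <parameter key="negative_value" value="negative"/>
        <parameter key="positive_value" value="positive"/>
    </operator>
    ```

    With the mapping fixed the same way for every label attribute, the Range column should list the values in the same order for all of them.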
    Thanks,
    Gary