"Bug in Feature Generation: side effects"

User: "steffen"
Updated by Jocelyn
Hello RapidMiner Team

I am using the latest CVS version and tried to implement the ZTransformation. That means calculating the mean and standard deviation from the input ExampleSet and then applying a series of RM operators by calling them from within my code. While trying some preprocessing steps before my operator, I stumbled over strange behaviour of the FeatureGenerationOperator, which I also use. I then reproduced the code as a process using only built-in RM operators, and the strange behaviour happened again. Two notes regarding the following setups:
  • The seemingly "useless" renaming is there because I (originally) wanted to use an attribute name containing a "(" within FeatureGeneration (confidence...)
  • In the following setups I used the dataset described by golf.aml, which is delivered with the RM distribution.
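For readers unfamiliar with the z-transformation, here is a minimal Python sketch of the arithmetic the process is meant to perform. It is not RapidMiner code. The constants 73.571 and 6.3326 are copied from the FeatureGeneration expression in the setups; the function name `z_transform`, the sample values, and the use of the population standard deviation when none is supplied are assumptions made for illustration.

```python
import math

# Constants copied from the FeatureGeneration expression in the post.
MEAN = 73.571
STD = 6.3326


def z_transform(values, mean=None, std=None):
    """Return (x - mean) / std for every x in values.

    If mean or std is omitted it is estimated from the values themselves.
    The estimate uses the population standard deviation (divide by n),
    which is an assumption of this sketch.
    """
    values = list(values)
    if mean is None:
        mean = sum(values) / len(values)
    if std is None:
        std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    if std == 0:
        raise ValueError("standard deviation is zero; z-transform is undefined")
    return [(x - mean) / std for x in values]


# Hypothetical temperature values, not read from golf.aml.
sample = [64.0, 70.0, 85.0]

fixed = z_transform(sample, MEAN, STD)  # uses the constants from the post
estimated = z_transform(sample)         # estimates mean and std from the sample
```

With fixed constants the result depends only on each individual value, so the order of the examples should not change any transformed value.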
1. Here is my basic setup, which works:

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="golf.aml"/>
    </operator>
    <operator name="Temperature->ijon" class="ChangeAttributeName">
        <parameter key="new_name" value="ijon"/>
        <parameter key="old_name" value="Temperature"/>
    </operator>
    <operator name="apply_ztrans" class="FeatureGeneration">
        <list key="functions">
          <parameter key="tichy" value="/(-(ijon,const[73.571]()),const[6.3326]())"/>
        </list>
        <parameter key="keep_all" value="true"/>
    </operator>
    <operator name="skip_ijon" class="FeatureNameFilter">
        <parameter key="filter_special_features" value="true"/>
        <parameter key="skip_features_with_name" value="ijon"/>
    </operator>
    <operator name="tichy->Temperature" class="ChangeAttributeName">
        <parameter key="new_name" value="Temperature"/>
        <parameter key="old_name" value="tichy"/>
    </operator>
</operator>

2. But when I accidentally used the wrong attribute name within FeatureGeneration, no error message appeared, only a wrong result.

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="golf.aml"/>
    </operator>
    <operator name="Temperature->ijon" class="ChangeAttributeName">
        <parameter key="new_name" value="ijon"/>
        <parameter key="old_name" value="Temperature"/>
    </operator>
    <operator name="apply_ztrans" class="FeatureGeneration">
        <list key="functions">
          <parameter key="tichy" value="/(-(Temperature,const[73.571]()),const[6.3326]())"/>
        </list>
        <parameter key="keep_all" value="true"/>
    </operator>
    <operator name="skip_ijon" class="FeatureNameFilter">
        <parameter key="filter_special_features" value="true"/>
        <parameter key="skip_features_with_name" value="ijon"/>
    </operator>
    <operator name="tichy->Temperature" class="ChangeAttributeName">
        <parameter key="new_name" value="Temperature"/>
        <parameter key="old_name" value="tichy"/>
    </operator>
</operator>
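
In infix form, the expression in setup 1 reads (ijon - 73.571) / 6.3326, while setup 2 refers to Temperature, a name that no longer exists after the renaming. The following Python sketch shows the behaviour one might expect in that case, namely a loud failure on the unknown name. It uses a plain dict as a stand-in for a single example; the helper `apply_ztrans` and the row values are hypothetical, and the sketch says nothing about how FeatureGeneration resolves names internally.

```python
def apply_ztrans(example, source, mean=73.571, std=6.3326):
    """Compute (example[source] - mean) / std, failing loudly on an unknown name."""
    if source not in example:
        raise KeyError(f"unknown attribute: {source!r}")
    return (example[source] - mean) / std


# Hypothetical example after the Temperature -> ijon renaming; values are made up.
row = {"Outlook": "sunny", "ijon": 85.0}

# Setup 1: the expression refers to the renamed attribute and succeeds.
tichy = apply_ztrans(row, "ijon")

# Setup 2: the expression refers to the old name, so the lookup is rejected.
try:
    apply_ztrans(row, "Temperature")
    lookup_failed = False
except KeyError:
    lookup_failed = True
```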

3. Setting the correct names, but applying the Sorting operator beforehand, produces the same result as in step 2.

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="golf.aml"/>
    </operator>
    <operator name="sort_temperature" class="Sorting">
        <parameter key="attribute_name" value="Temperature"/>
    </operator>
    <operator name="Temperature->ijon" class="ChangeAttributeName">
        <parameter key="new_name" value="ijon"/>
        <parameter key="old_name" value="Temperature"/>
    </operator>
    <operator name="apply_ztrans" class="FeatureGeneration">
        <list key="functions">
          <parameter key="tichy" value="/(-(ijon,const[73.571]()),const[6.3326]())"/>
        </list>
        <parameter key="keep_all" value="true"/>
    </operator>
    <operator name="skip_ijon" class="FeatureNameFilter">
        <parameter key="filter_special_features" value="true"/>
        <parameter key="skip_features_with_name" value="ijon"/>
    </operator>
    <operator name="tichy->Temperature" class="ChangeAttributeName">
        <parameter key="new_name" value="Temperature"/>
        <parameter key="old_name" value="tichy"/>
    </operator>
</operator>

At this point I came to the conclusion that the problem must lurk deep in the RapidMiner entrails ...

I hope this error description is of some help.

greetings

Steffen
