no label after application of Series2WindowExamples in RM 4.4 CVS

oheering
oheering New Altair Community Member
edited November 5 in Community Q&A
Hi all,

Something that worked in RapidMiner 4.3 unfortunately no longer works in version 4.4. The scenario is as follows:

(Note: when I say RapidMiner 4.4, I actually mean the most recent CVS version. I cannot install the officially released 4.4 because it wants to update my existing 4.3 installation, which I still need.)
  • time series data read with ExampleSource (1 example, 4000 attributes, you might know it: sales_series.aml/dat)
  • immediately after this: Series2WindowExamples with series_representation = encode_series_by_attributes, horizon = 1, window_size = 100, step_size = 5
With these settings, RapidMiner 4.3 generated a windowed ExampleSet consisting of 100 attributes (window_size) and 1 label, which is obviously needed for training. RapidMiner 4.4, on the other hand, does not produce the label attribute. A look at UnivariateSeries2WindowExamples.createLabel() reveals:

if (representation == SERIES_AS_EXAMPLES) {
    Attribute seriesAttribute = exampleSet.getAttributes().iterator().next();
    int valueType = seriesAttribute.getValueType();
    return AttributeFactory.createAttribute("label", valueType);
} else {
    return null;
}
Since I have representation == SERIES_AS_ATTRIBUTES, it is consistent with this code that no label is created. The code leads me to believe that there is a good reason for this behavior which I am missing. I do remember the help recommending that the series be encoded as examples because that uses memory more efficiently. Maybe RapidMiner 4.4 now forces me to use the other encoding by simply omitting the label attribute? ;-) It is no problem at all to transpose the data before applying Series2WindowExamples; I just want to know what is behind all this. Maybe I am missing some important point here.
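
For reference, the windowing I expect from the settings above can be sketched in plain Java. This is only an illustration, not RapidMiner's implementation: the class and method names are made up, and it assumes that the label is the value `horizon` steps after the last value in the window, which may differ from the operator's exact convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: hypothetical names, not RapidMiner's implementation.
class SeriesWindowingSketch {

    // One windowed example: the window values (attributes) plus one label.
    static final class WindowedExample {
        final double[] window;
        final double label;

        WindowedExample(double[] window, double label) {
            this.window = window;
            this.label = label;
        }
    }

    // Slides a window over the series. Assumption: the label is the value
    // `horizon` steps after the last value inside the window.
    static List<WindowedExample> window(double[] series, int windowSize, int horizon, int stepSize) {
        if (windowSize < 1 || horizon < 1 || stepSize < 1) {
            throw new IllegalArgumentException("windowSize, horizon and stepSize must be >= 1");
        }
        List<WindowedExample> examples = new ArrayList<>();
        // Stop as soon as the label index would fall outside the series.
        for (int start = 0; start + windowSize - 1 + horizon < series.length; start += stepSize) {
            double[] values = Arrays.copyOfRange(series, start, start + windowSize);
            double label = series[start + windowSize - 1 + horizon];
            examples.add(new WindowedExample(values, label));
        }
        return examples;
    }

    public static void main(String[] args) {
        // Stand-in for the 4000-value sales series: value i at position i.
        double[] series = new double[4000];
        for (int i = 0; i < series.length; i++) {
            series[i] = i;
        }
        List<WindowedExample> examples = window(series, 100, 1, 5);
        WindowedExample first = examples.get(0);
        System.out.println(examples.size() + " examples, "
                + first.window.length + " attributes each, first label = " + first.label);
    }
}
```

Under these assumptions every generated example carries 100 window values plus one label, which is the shape that 4.3 produced and that a learner needs.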

Thanks in advance,
Oliver

Answers

  • IngoRM
    IngoRM New Altair Community Member
    Hi,

    the preferred way to transform series data into windowed examples is now the operator "MultivariateSeries2WindowExamples", for both the univariate and the multivariate case (we seem to have forgotten to mark the univariate variant as deprecated). With it, the following process should still work:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="sum"/>
            <parameter key="number_examples" value="1"/>
            <parameter key="number_of_attributes" value="100"/>
        </operator>
        <operator name="AttributeFilter" class="AttributeFilter">
            <parameter key="condition_class" value="attribute_name_filter"/>
            <parameter key="parameter_string" value="label"/>
            <parameter key="invert_filter" value="true"/>
            <parameter key="apply_on_special" value="true"/>
        </operator>
        <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
            <parameter key="series_representation" value="encode_series_by_attributes"/>
            <parameter key="horizon" value="1"/>
            <parameter key="window_size" value="10"/>
            <parameter key="label_dimension" value="0"/>
        </operator>
    </operator>

    The reason for the change was the new operators "WindowExamples2ModelingData" and "WindowExamples2OriginalData", which can additionally apply a relative value transformation that usually delivers much better results for time series predictions. This, however, only works in the "encoding_as_examples" mode. Here is an example:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="sum"/>
            <parameter key="number_of_attributes" value="1"/>
        </operator>
        <operator name="AttributeFilter" class="AttributeFilter">
            <parameter key="condition_class" value="attribute_name_filter"/>
            <parameter key="parameter_string" value="label"/>
            <parameter key="invert_filter" value="true"/>
            <parameter key="apply_on_special" value="true"/>
        </operator>
        <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
            <parameter key="window_size" value="10"/>
        </operator>
        <operator name="WindowExamples2ModelingData" class="WindowExamples2ModelingData">
            <parameter key="label_name_stem" value="att1"/>
        </operator>
    </operator>
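
    To make the idea of a relative value transformation concrete, here is a minimal Java sketch. It is not the code of the two operators: the names are hypothetical, and it assumes that "relative" means "relative to the last value in the window", so the learner sees differences instead of absolute levels and the prediction is shifted back afterwards.

    ```java
    import java.util.Arrays;

    // Illustrative sketch only: hypothetical names, not the operators' actual code.
    class RelativeWindowSketch {

        // Assumption: values are made relative to the last value in the window.
        static double[] toRelative(double[] window) {
            double base = window[window.length - 1];
            double[] relative = new double[window.length];
            for (int i = 0; i < window.length; i++) {
                relative[i] = window[i] - base;
            }
            return relative;
        }

        // The label is shifted by the same base value as the window.
        static double toRelativeLabel(double[] window, double label) {
            return label - window[window.length - 1];
        }

        // Inverse step: map a relative prediction back to the original scale.
        static double toOriginal(double[] window, double relativePrediction) {
            return relativePrediction + window[window.length - 1];
        }

        public static void main(String[] args) {
            double[] window = {10.0, 12.0, 15.0};
            double label = 18.0;
            System.out.println(Arrays.toString(toRelative(window)));
            System.out.println(toRelativeLabel(window, label));
            System.out.println(toOriginal(window, toRelativeLabel(window, label)));
        }
    }
    ```

    In this reading, the first two methods correspond to the modeling step and the last one to the step back to the original data.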

    Cheers,
    Ingo
  • erk
    erk New Altair Community Member
    Hi,

    I am not sure MultivariateSeries2WindowExamples resolves this. In RM 4.4 I am observing a similar problem with MultivariateSeries2WindowExamples: as long as the attribute that should become the label after windowing is a special attribute, the operator seems to fail to create the label attribute properly. The process below demonstrates what I mean. I have just submitted a bug report about this.

    Cheers,

    Erk

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="sum"/>
            <parameter key="number_of_attributes" value="10"/>
        </operator>
        <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
            <parameter key="horizon" value="1"/>
            <parameter key="window_size" value="10"/>
            <parameter key="label_attribute" value="label"/>
        </operator>
        <operator name="LinearRegression" class="LinearRegression">
        </operator>
    </operator>
  • land
    land New Altair Community Member
    Hi Erk,
    thank you for submitting the bug report and attaching the process. I will look into it.

    Greetings,
      Sebastian