How can RM identify sequences in a dataset?

New Altair Community Member

May 7, 2009

Good evening!

I think the MultivariateSeries2WindowExamples operator may be what you need. Here's an example of this bad boy at work on a mock-up of your problem: 8800 entries, windowed into 176 rows that each hold 50 consecutive values per attribute.

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function"	value="random"/>
        <parameter key="number_examples"	value="8800"/>
        <parameter key="number_of_attributes"	value="3"/>
    </operator>
    <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
        <parameter key="window_size"	value="50"/>
        <parameter key="step_size"	value="50"/>
    </operator>
</operator>
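
For readers without RapidMiner at hand, the windowing idea in the process above can be sketched in plain Python. This is only an illustration of non-overlapping windows (step size equal to window size), not the operator's actual implementation; the function name `window_rows` is hypothetical, and the operator's attribute naming and ordering may differ.

```python
def window_rows(values, window_size, step_size):
    """Cut a flat series into rows of window_size consecutive values.

    A new window starts every step_size values; incomplete trailing
    windows are dropped.
    """
    rows = []
    for start in range(0, len(values) - window_size + 1, step_size):
        rows.append(values[start:start + window_size])
    return rows


# Mock series with 8800 entries, as in the example process.
series = list(range(8800))
rows = window_rows(series, window_size=50, step_size=50)

print(len(rows), len(rows[0]))  # 176 50
```

Each row is one independent block of 50 values, so the second row starts at the 51st entry of the original series.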

New Altair Community Member

Thanks for the advice, but it seems that (after the preprocessing) the values are still not grouped into "sequences".
Besides, there is another problem: RM complains that an attribute must have the same type of value throughout, and that is not the case in my data :-(
Any other suggestions?

thank you anyway!

A.Florio

New Altair Community Member

"so how can I tell RM that
each 50 rows represent an independent group?"

What exactly do you mean by "group"?

New Altair Community Member

A group like this:
Series1:
50 elements of attr 1
50 elements of attr 2
50 elements of attr 3.

Series2:
|
|
SeriesN

So that RM can apply its algorithm not to ALL values at once, but to each single series.
Example: find patterns with Apriori inside each series, and afterwards maybe compare them.
I know that my problem is not so easy to understand, but I am trying to explain it as best I can.

New Altair Community Member

Hmm, the previous example produces 176 rows, each containing 50 consecutive values for each of the 3 attributes, on the assumption that each 50-row clump is distinct, just like your series. If you meant instead that each example should be made up of the last 50 values of each attribute, then change the step size to one, like this. Here we just look for sequence patterns in att3.


<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function"	value="random"/>
        <parameter key="number_examples"	value="8800"/>
        <parameter key="number_of_attributes"	value="3"/>
    </operator>
    <operator name="FeatureNameFilter" class="FeatureNameFilter">
        <parameter key="skip_features_with_name"	value="att1|att2"/>
    </operator>
    <operator name="BinDiscretization" class="BinDiscretization">
        <parameter key="range_name_type"	value="short"/>
    </operator>
    <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
        <parameter key="window_size"	value="50"/>
        <parameter key="step_size"	value="1"/>
    </operator>
    <operator name="W-Apriori" class="W-Apriori">
    </operator>
</operator>
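
The effect of a step size of one can be sketched in plain Python as well. Again this is an illustration under assumptions, not RapidMiner's code: `sliding_windows` is a hypothetical helper, and the number of examples the operator really produces may differ depending on settings such as the horizon.

```python
def sliding_windows(values, window_size):
    """Return every run of window_size consecutive values (step size 1)."""
    last_start = len(values) - window_size
    return [values[i:i + window_size] for i in range(last_start + 1)]


# Mock single-attribute series with 8800 entries.
series = list(range(8800))
windows = sliding_windows(series, window_size=50)

print(len(windows))    # 8751
print(windows[1][:3])  # [1, 2, 3]
```

Neighbouring windows overlap in 49 of their 50 values, which is the difference from the step-size-50 case, where every window is a separate block.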

New Altair Community Member

May 9, 2009

I know that I'm close (thanks to your help), but it is still not quite right.
Let me put it in a simple way: I have 1 attribute with 150 elements (rows),
and in result mode, in the 'data view', I want to see Series1, Series2, Series3,
each with 50 values of the attribute under it.

If I do this:

<operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes"	value="~/minim.aml"/>
    </operator>
    <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples" breakpoints="after">
        <parameter key="horizon"	value="1"/>
        <parameter key="window_size"	value="50"/>
        <parameter key="step_size"	value="50"/>
        <parameter key="add_incomplete_windows"	value="true"/>
    </operator>
</operator>

then the output (in result mode -> data view) is 3 examples with 50 attributes each, which is wrong: I have 1 attribute with 3x50 values.
I tried other series preprocessing operators such as "index series" or "Single2series", but the result is still not what I want.
Meanwhile, I want to say that I really appreciate your help.
A.Florio

New Altair Community Member

May 9, 2009

"Let me put it in a simple way: I have 1 attribute with 150 elements (rows),
and in result mode, in the 'data view', I want to see Series1, Series2, Series3,
each with 50 values of the attribute under it."

Does the following do it?

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function"	value="random"/>
        <parameter key="number_examples"	value="150"/>
        <parameter key="number_of_attributes"	value="1"/>
    </operator>
    <operator name="FeatureNameFilter" class="FeatureNameFilter" breakpoints="after">
        <parameter key="filter_special_features"	value="true"/>
        <parameter key="skip_features_with_name"	value="label"/>
    </operator>
    <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
        <parameter key="window_size"	value="3"/>
        <parameter key="step_size"	value="3"/>
    </operator>
    <operator name="ChangeAttributeNamesReplace" class="ChangeAttributeNamesReplace">
        <parameter key="replace_what"	value="att.*-"/>
        <parameter key="replace_by"	value="Series_"/>
    </operator>
</operator>

Hope so! Good weekend.

New Altair Community Member

This way I get 3 columns (ok), but the first one doesn't contain
the first 50 values of my dataset. The values are spread out row by row, like
a matrix index (1st row, 2nd row, ...). How can I tell it to take the first 50 values and
put them in the 1st column (1st series), then put the second 50 values in the 2nd column (2nd series), and so on?
Thanks a lot for your help.

A.Florio

New Altair Community Member

OK, now I see what you mean, at least I hope so! What about this?


<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function"	value="random"/>
        <parameter key="number_examples"	value="150"/>
        <parameter key="number_of_attributes"	value="1"/>
    </operator>
    <operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
        <parameter key="window_size"	value="50"/>
        <parameter key="step_size"	value="50"/>
    </operator>
    <operator name="ExampleSetTranspose" class="ExampleSetTranspose">
    </operator>
    <operator name="ChangeAttributeNamesReplace" class="ChangeAttributeNamesReplace">
        <parameter key="replace_what"	value="att"/>
        <parameter key="replace_by"	value="Series"/>
        <parameter key="apply_on_special"	value="false"/>
    </operator>
</operator>
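
The window-then-transpose idea in the process above can be sketched in plain Python. This is a sketch under assumptions rather than what the operators do internally: `series_columns` is a hypothetical helper, and it assumes the number of values is an exact multiple of the series length.

```python
def series_columns(values, series_length):
    """Split values into consecutive blocks and return them as columns.

    Column k of the result holds values k*series_length up to
    (k+1)*series_length - 1 of the input.
    """
    # One row per block of series_length consecutive values.
    blocks = [values[i:i + series_length]
              for i in range(0, len(values), series_length)]
    # Transposing turns each block (row) into a column.
    return [list(row) for row in zip(*blocks)]


# Mock single attribute with 150 values.
values = list(range(150))
table = series_columns(values, series_length=50)

print(len(table), len(table[0]))  # 50 3
print(table[0])                   # [0, 50, 100]
```

Reading down the first column gives values 0 to 49, i.e. the first 50 entries of the original attribute, which is the layout asked for: Series1, Series2 and Series3 side by side.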

New Altair Community Member

I get this error message when I use my simple dataset with just 1 column (only 1 attribute):
AttributeTypeException
Process failed Message: Cannot map index of nominal attribute to nominal value: index 0 is out of bounds!
Even after a few changes to my dataset, I always get the same error, and it doesn't tell me where exactly in the operator tree it occurs.
What does it mean?

New Altair Community Member

Without seeing the data there is not much I can say.

New Altair Community Member

This is just a piece of one attribute of my dataset.
To make things easier, I have ignored the other attributes (for now).
It is a series of operations, numerical and nominal, nothing special.

[attachment deleted by admin]
