Difference operator?

Legacy User
edited November 5 in Community Q&A
I have a table with N rows of data. Is there an operator that would convert it into a table of N-1 rows, where each row is the difference of two consecutive rows of the original table?

That is, input:

X  Y
a  b
c  d
e  f

Desired output:

X      Y
c-a  d-b
e-c  f-d

If there is no such operator, can you treat this as a feature request?

Thank you!



Answers

  • steffen
    Hello Victor

    As far as I know, there is no such operator. Nevertheless, here is a workaround (thanks to RapidMiner for its little operators, which can be combined in powerful ways), tested with the Iris dataset, which also ships with RM:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="load_data" class="ExampleSource">
            <parameter key="attributes" value="iris.aml"/>
        </operator>
        <operator name="remove_label" class="FeatureNameFilter">
            <parameter key="filter_special_features" value="true"/>
            <parameter key="skip_features_with_name" value="label"/>
        </operator>
        <operator name="copy_data" class="IOMultiplier">
            <parameter key="io_object" value="ExampleSet"/>
        </operator>
        <operator name="process_minuend" class="OperatorChain" expanded="no">
            <operator name="remove_first_row" class="ExampleFilter">
                <parameter key="condition_class" value="attribute_value_filter"/>
                <parameter key="invert_filter" value="true"/>
                <parameter key="parameter_string" value="id=id_1"/>
            </operator>
            <operator name="remove_ID" class="FeatureNameFilter">
                <parameter key="filter_special_features" value="true"/>
                <parameter key="skip_features_with_name" value="id"/>
            </operator>
            <operator name="addID" class="IdTagging">
            </operator>
        </operator>
        <operator name="select_second_set" class="IOSelector">
            <parameter key="io_object" value="ExampleSet"/>
            <parameter key="select_which" value="2"/>
        </operator>
        <operator name="process_subtrahend" class="OperatorChain" expanded="yes">
            <operator name="remove_last_row" class="ExampleFilter">
                <parameter key="condition_class" value="attribute_value_filter"/>
                <parameter key="invert_filter" value="true"/>
                <parameter key="parameter_string" value="id=id_150"/>
            </operator>
            <operator name="remove_ID (2)" class="FeatureNameFilter">
                <parameter key="filter_special_features" value="true"/>
                <parameter key="skip_features_with_name" value="id"/>
            </operator>
            <operator name="addID (2)" class="IdTagging">
            </operator>
        </operator>
        <operator name="ExampleSetJoin" class="ExampleSetJoin">
            <parameter key="remove_double_attributes" value="false"/>
        </operator>
        <operator name="FeatureGeneration" class="FeatureGeneration">
            <list key="functions">
              <parameter key="new_a1" value="-(a1,a1_from_ES2)"/>
              <parameter key="new_a2" value="-(a2,a2_from_ES2)"/>
            </list>
        </operator>
    </operator>
    Unfortunately, you have to define the functions manually (in "FeatureGeneration"), so this is only practical for a small number of attributes.

    Note: To make this work in your situation, you must change the dataset-specific parameters (primarily the names of the attributes).
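
    For readers who find the operator chain hard to follow, the same idea can be sketched in a few lines of plain Python. This is only an illustration of the logic, not RapidMiner code: the function name `consecutive_differences` and the sample table are made up, and pairing rows by position stands in for the "re-tag IDs, then join" steps of the process above.

```python
# Hypothetical illustration in plain Python; this is not RapidMiner code.

def consecutive_differences(rows):
    """Return N-1 rows, each the element-wise difference rows[i+1] - rows[i]."""
    minuend = rows[1:]      # copy without the first row (cf. "remove_first_row")
    subtrahend = rows[:-1]  # copy without the last row (cf. "remove_last_row")
    # zip pairs the two copies by position, which plays the role of
    # assigning fresh IDs to both copies and joining on them.
    return [
        [later - earlier for later, earlier in zip(m, s)]
        for m, s in zip(minuend, subtrahend)
    ]


# Made-up sample data: three rows, two numeric attributes.
table = [[1.0, 10.0], [4.0, 12.0], [9.0, 20.0]]
print(consecutive_differences(table))  # [[3.0, 2.0], [5.0, 8.0]]
```

    Unlike the "FeatureGeneration" step, this sketch handles any number of numeric columns without listing them one by one.
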
    I hope this setup is self-explanatory; if not, feel free to ask.

    Hope this was helpful.

    Steffen
  • TobiasMalbrecht
    Hi Victor, hi Steffen,

    Wow, what a process! I did not imagine this was even possible ;) ... just joking!

    Well, there is indeed no operator that accomplishes this task. Although Ingo wrote a meta operator, RelativeRegression, which allows regressing on the difference of label values, there is not yet a general operator for computing differences of attribute values. For time series models, this would certainly be a nice-to-have operator. So, at some point one of us will certainly write such an operator... which, by the way, should not be all too complicated!

    Regards,
    Tobias

  • Legacy User
    Hmm... There is no way I could come up with this sequence by myself.

    If you are going to add this operator, could you give it a choice of function:

    1. Differences
    2. Ratios
    3. Ratios - 1
    4. ln(ratios)

    This seems to cover all typical cases.
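
    To make the four variants concrete, here is a minimal sketch in plain Python (not RapidMiner). The names `TRANSFORMS` and `row_changes` and the sample values are made up for illustration. The ratio-based variants assume the earlier value is non-zero, and ln(ratios) additionally assumes the ratio is positive.

```python
import math

# Hypothetical names; the four modes mirror the list above.
TRANSFORMS = {
    "difference": lambda cur, prev: cur - prev,
    "ratio": lambda cur, prev: cur / prev,
    "ratio_minus_one": lambda cur, prev: cur / prev - 1.0,
    "log_ratio": lambda cur, prev: math.log(cur / prev),
}


def row_changes(rows, mode="difference"):
    """Apply the chosen function to each pair of consecutive rows (N-1 results)."""
    fn = TRANSFORMS[mode]
    return [
        [fn(cur, prev) for cur, prev in zip(cur_row, prev_row)]
        for prev_row, cur_row in zip(rows, rows[1:])
    ]


# Made-up sample data: three rows, two numeric attributes.
values = [[100.0, 2.0], [110.0, 4.0], [121.0, 4.0]]
for mode in TRANSFORMS:
    print(mode, row_changes(values, mode))
```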

    Thank you for the great product and the fast response!
    Victor

  • TobiasMalbrecht
    Hi Victor,
    Victor wrote:

        If you are going to add this operator, could you give it a choice of function:

        1. Differences
        2. Ratios
        3. Ratios - 1
        4. ln(ratios)

        This seems to cover all typical cases.

    As I tried to imply, there is quite a lot on our schedule at the moment, so we will not have enough time in the short term. But we will keep this in mind. I think you nicely summarized the requirements for such an operator. Thanks!

    Regards,
    Tobias
  • fjcuberos
    I have an operator (part of a plugin) that computes the difference of adjacent attributes.
    I don't know if it is too late, but I can send you the sources if you want to extend it to examples.

    F.J. Cuberos

  • Kolodziej
    Hello,
    I have the same problem. I need the difference of two rows.
    Is there an operator that can do this? If not, how can I solve this problem?

  • MariusHelf
    Hi,

    You can use the Differentiate operator from the Series extension.

    Best, Marius