"Where can I find MAPE (Mean Absolute Percentage Error) in RapidMiner"

deva New Altair Community Member
edited November 5 in Community Q&A
I'm trying to measure prediction performance on a time series with the k-NN method, and I would like to measure it with MAPE. I'm already using the Performance (Regression) operator with RMSE checked, so I know how to use the performance operator, but I would just like to find MAPE  :)

Is it even possible?

Answers

  • wessel
    wessel New Altair Community Member
    Hey,

    In my research I have done a lot of experimenting with different error measures for time series data.
    Maybe this post should be converted into a blog post to be useful to a larger audience, but I have very little experience with blogging.
    So here is a triple post instead.

    You can code MAPE yourself using the Execute Script operator.
    Unfortunately, this is not a pretty solution.
    Posted below is a process that includes a script operator performing error analysis for time series data.
    It can easily be modified to produce MAPE, but that is not shown here because MAPE produces weird results for this data.
    The script operator takes as input a data set generated with the "Predict Series" operator, including the attributes:
    - label
    - prediction(label)
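
For reference, MAPE over n time points is commonly written as below. The symbols are chosen here for illustration: y_t stands for label at time point t and \hat{y}_t for prediction(label). Many tools multiply the result by 100 to report a percentage.

```latex
\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
```

Because each term divides by y_t, the measure is undefined for a zero label and grows very large for labels near zero.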
  • wessel
    wessel New Altair Community Member
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
        <process expanded="true" height="409" width="840">
          <operator activated="true" class="generate_data" compatibility="5.1.006" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="target_function" value="spiral cluster"/>
            <parameter key="number_examples" value="1000"/>
            <parameter key="number_of_attributes" value="2"/>
            <parameter key="attributes_lower_bound" value="0.0"/>
            <parameter key="attributes_upper_bound" value="1.0"/>
          </operator>
          <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing" width="90" x="180" y="30">
            <parameter key="horizon" value="24"/>
            <parameter key="create_label" value="true"/>
            <parameter key="label_attribute" value="att1"/>
          </operator>
          <operator activated="true" class="series:predict_series" compatibility="5.1.002" expanded="true" height="60" name="Predict Series" width="90" x="315" y="30">
            <parameter key="window_width" value="12"/>
            <parameter key="horizon" value="24"/>
            <parameter key="max_training_set_size" value="1000"/>
            <process expanded="true" height="409" width="835">
              <operator activated="true" class="k_nn" compatibility="5.1.006" expanded="true" height="76" name="k-NN" width="90" x="112" y="30"/>
              <connect from_port="window example set" to_op="k-NN" to_port="training set"/>
              <connect from_op="k-NN" from_port="model" to_port="prediction model"/>
              <portSpacing port="source_window example set" spacing="0"/>
              <portSpacing port="sink_prediction model" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.1.006" expanded="true" height="76" name="Select Attributes" width="90" x="450" y="30">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="|prediction(label)|label"/>
            <parameter key="regular_expression" value=".*label.*"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="rename" compatibility="5.1.006" expanded="true" height="76" name="Rename" width="90" x="585" y="30">
            <parameter key="old_name" value="prediction(label)"/>
            <parameter key="new_name" value="pred"/>
            <list key="rename_additional_attributes"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="5.1.006" expanded="true" height="76" name="Generate Attributes" width="90" x="720" y="30">
            <list key="function_descriptions">
              <parameter key="abs" value="abs(label-pred)"/>
            </list>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="5.1.006" expanded="true" height="76" name="Filter Examples" width="90" x="45" y="120">
            <parameter key="condition_class" value="no_missing_attributes"/>
          </operator>
          <operator activated="true" class="execute_script" compatibility="5.1.006" expanded="true" height="76" name="Execute Script (2)" width="90" x="179" y="120">
            <parameter key="script" value="import com.rapidminer.operator.Operator;&#10;import com.rapidminer.Process;&#10;import com.rapidminer.MacroHandler;&#10;import com.rapidminer.tools.Ontology&#10;&#10;ExampleSet exampleSet = operator.getInput(ExampleSet.class);&#10;&#10;Attribute sum = AttributeFactory.createAttribute(&quot;sum&quot;, Ontology.REAL);&#10;exampleSet.getExampleTable().addAttribute(sum);&#10;exampleSet.getAttributes().addRegular(sum);&#10;&#10;Attribute avg = AttributeFactory.createAttribute(&quot;avg&quot;, Ontology.REAL);&#10;exampleSet.getExampleTable().addAttribute(avg);&#10;exampleSet.getAttributes().addRegular(avg);&#10;&#10;last = 0;&#10;n = 0;&#10;&#10;for (Example e : exampleSet) {&#9;&#10;    &#9;e[&quot;sum&quot;] = e[&quot;abs&quot;] + last;&#10;    &#9;last = e[&quot;sum&quot;];&#10;    &#9;n++;&#10;    &#9;e[&quot;avg&quot;] = last / n; &#10;}&#10;&#10;return exampleSet&#10;&#10;&#10;"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Windowing" to_port="example set input"/>
          <connect from_op="Windowing" from_port="example set output" to_op="Predict Series" to_port="example set"/>
          <connect from_op="Predict Series" from_port="example set" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Rename" to_port="example set input"/>
          <connect from_op="Rename" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
          <connect from_op="Filter Examples" from_port="example set output" to_op="Execute Script (2)" to_port="input 1"/>
          <connect from_op="Execute Script (2)" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="90"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • wessel
    wessel New Altair Community Member
    This is the data, generated with the "Generate Data" operator (target function "spiral cluster").
    The predictions made by k-NN on this data are plotted as well.
    image

    Here is the error analysis:
    abs = absolute error at time point t, so abs(label(t) - prediction(label(t)))
    sum = the sum of absolute error up to time point t
    avg = the average of the sum, so basically sum / t
    image

    Similar to the first figure, but now with Linear Regression predictions added.
    A different plotter is used for no good reason other than to show it is possible.
    image

    Plotting how the sum of errors of both KNN and Linear Regression progresses in time.
    image
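
The abs, sum and avg columns described in the post above can be sketched outside RapidMiner in a few lines of Python. This is only an illustration of the arithmetic: the function name `running_error` and the toy numbers are made up here and are not taken from the thread's data.

```python
from itertools import accumulate


def running_error(labels, preds):
    """Return per-step absolute error, its running sum, and its running average."""
    # abs: absolute error at each time point t
    abs_err = [abs(y - p) for y, p in zip(labels, preds)]
    # sum: sum of absolute errors up to and including time point t
    run_sum = list(accumulate(abs_err))
    # avg: running sum divided by the number of points seen so far
    run_avg = [s / (t + 1) for t, s in enumerate(run_sum)]
    return abs_err, run_sum, run_avg


# Made-up toy values, not the spiral-cluster data from the thread.
labels = [1.0, 2.0, 4.0, 8.0]
preds = [1.5, 2.0, 3.0, 10.0]
abs_err, run_sum, run_avg = running_error(labels, preds)
print(abs_err, run_sum, run_avg)
```

The last entry of the running average is the mean absolute error over the whole series.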
  • deva
    deva New Altair Community Member
    Thx.

    That was the info I needed  :)

    I managed to calculate MAPE. MAPE is problematic because label values can be zero, which makes the percentage error undefined. Maybe that is why you didn't manage to calculate it.

    All I did was modify the script (code is below) and change your Generate Attributes operator to produce diff = label-pred:

    import com.rapidminer.operator.Operator;
    import com.rapidminer.Process;
    import com.rapidminer.MacroHandler;
    import com.rapidminer.tools.Ontology;

    ExampleSet exampleSet = operator.getInput(ExampleSet.class);

    // running sum of absolute percentage errors
    Attribute sum = AttributeFactory.createAttribute("sum", Ontology.REAL);
    exampleSet.getExampleTable().addAttribute(sum);
    exampleSet.getAttributes().addRegular(sum);

    // running average of that sum, i.e. MAPE up to the current row
    Attribute avg = AttributeFactory.createAttribute("avg", Ontology.REAL);
    exampleSet.getExampleTable().addAttribute(avg);
    exampleSet.getAttributes().addRegular(avg);

    last = 0;      // running sum so far
    n = 0;         // number of rows with a non-zero label
    absPerc = 0;
    MAPE = 0;      // MAPE as a fraction (multiply by 100 for percent)

    for (Example e : exampleSet) {
        if (e["label"] != 0) {
            absPerc = (e["diff"] / e["label"]).abs();
            e["sum"] = absPerc + last;
            last = e["sum"];
            n++;
            e["avg"] = last / n;
            MAPE = e["avg"];
        } else {
            // zero label: the percentage error is undefined, so carry the running values forward
            e["sum"] = last;
            // avoid dividing by zero when no non-zero label has been seen yet
            e["avg"] = n > 0 ? last / n : Double.NaN;
        }
    }

    return exampleSet;

    As you can see, my code uses a MAPE variable, and I would like to return only that value, not the whole exampleSet. Do you know how I can do that? Also, do you know where I can find some information and help on how to program scripts? Scripts are based on Groovy syntax, but I didn't find any information about the RapidMiner framework.

    Thx.
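
The calculation in the post above, including the zero-label guard, can also be written as a small standalone function that returns only the MAPE value. This is a sketch in plain Python, not RapidMiner code: the name `mape`, the choice to skip zero labels, and the sample numbers are assumptions made for illustration.

```python
def mape(labels, preds):
    """Mean absolute percentage error as a fraction, skipping zero labels.

    Returns (value, used): value is None when every label is zero,
    and used is the number of rows that entered the mean.
    """
    total = 0.0
    used = 0
    for y, p in zip(labels, preds):
        if y == 0:
            continue  # percentage error is undefined for a zero label
        total += abs((y - p) / y)
        used += 1
    return (total / used if used else None), used


# Made-up toy values; the second row has a zero label and is skipped.
labels = [2.0, 0.0, 4.0, 8.0]
preds = [1.0, 0.5, 5.0, 8.0]
value, used = mape(labels, preds)
print(value, used)
```

Skipping zero labels mirrors the `if (e["label"] != 0)` branch in the script. Returning the number of rows used makes it visible how many points were left out of the average.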
  • wessel
    wessel New Altair Community Member
    Hey,

    You could modify the script to print MAPE to console.

    Or, what I mostly do, is simply look at the average of an attribute in the Meta Data View.

    If you create an attribute APE, which is abs((true - pred) / true),
    then you can look at its average:

    import com.rapidminer.operator.Operator;
    import com.rapidminer.Process;
    import com.rapidminer.MacroHandler;
    import com.rapidminer.tools.Ontology;

    ExampleSet exampleSet = operator.getInput(ExampleSet.class);

    // running sum of absolute percentage errors
    Attribute sum = AttributeFactory.createAttribute("sum", Ontology.REAL);
    exampleSet.getExampleTable().addAttribute(sum);
    exampleSet.getAttributes().addRegular(sum);

    // running average of that sum
    Attribute avg = AttributeFactory.createAttribute("avg", Ontology.REAL);
    exampleSet.getExampleTable().addAttribute(avg);
    exampleSet.getAttributes().addRegular(avg);

    // absolute percentage error of each row
    Attribute APE = AttributeFactory.createAttribute("APE", Ontology.REAL);
    exampleSet.getExampleTable().addAttribute(APE);
    exampleSet.getAttributes().addRegular(APE);

    last = 0;      // running sum so far
    n = 0;         // number of rows with a non-zero label
    absPerc = 0;
    MAPE = 0;

    for (Example e : exampleSet) {
        if (e["label"] != 0) {
            absPerc = (e["diff"] / e["label"]).abs();
            e["sum"] = absPerc + last;
            last = e["sum"];
            n++;
            e["avg"] = last / n;
            e["APE"] = absPerc;
            MAPE = e["avg"];
        } else {
            e["sum"] = last;
            // avoid dividing by zero when no non-zero label has been seen yet
            e["avg"] = n > 0 ? last / n : Double.NaN;
            // APE is undefined for a zero label; NaN marks it as missing
            // instead of repeating the previous row's value
            e["APE"] = Double.NaN;
        }
    }

    return exampleSet;

    image

    image