"Problem with Loop Operator"

Stefan_E
Stefan_E New Altair Community Member
edited November 5 in Community Q&A
I do the following:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
    <process expanded="true" height="521" width="681">
      <operator activated="true" class="read_aml" compatibility="5.0.8" expanded="true" height="60" name="Read AML" width="90" x="45" y="30">
        <parameter key="attributes" value="C:\Users\eichenbe\Documents\Backup\Laptop\LiveCopy\Software\RM5_tja1055\SigmaTable_Real.aml"/>
      </operator>
      <operator activated="true" class="loop" compatibility="5.0.8" expanded="true" height="94" name="Loop" width="90" x="179" y="30">
        <parameter key="iterations" value="2"/>
        <process expanded="true" height="771" width="867">
          <operator activated="true" class="decision_stump" compatibility="5.0.8" expanded="true" height="76" name="Decision Stump" width="90" x="45" y="30">
            <parameter key="criterion" value="accuracy"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model (2)" width="90" x="179" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.0.8" expanded="true" height="94" name="Multiply" width="90" x="112" y="255"/>
          <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples" width="90" x="246" y="255">
            <parameter key="condition_class" value="wrong_predictions"/>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (2)" width="90" x="380" y="255">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="Label=0"/>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (4)" width="90" x="246" y="345">
            <parameter key="condition_class" value="correct_predictions"/>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (3)" width="90" x="380" y="345">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="Label=1"/>
          </operator>
          <operator activated="true" class="append" compatibility="5.0.8" expanded="true" height="94" name="Append" width="90" x="514" y="255"/>
          <operator activated="true" class="select_attributes" compatibility="5.0.8" expanded="true" height="76" name="Select Attributes" width="90" x="648" y="255">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Label|ID|t_92027|t_92026|t_92025|t_92024|t_92023|t_92022|t_92021|t_92020|t_92019|t_92018|t_92017|t_92016|t_92015|t_92014|t_92013|t_92012|t_92011|t_92010|t_92009|t_92008|t_92007|t_92006|t_92005|t_92004|t_92003|t_92002|t_92001|t_91027|t_91026|t_91025|t_91024|t_91023|t_91022|t_91021|t_91020|t_91019|t_91018|t_91017|t_91016|t_91015|t_91014|t_91013|t_91012|t_91011|t_91010|t_91009|t_91008|t_91007|t_91006|t_91005|t_91004|t_91003|t_91002|t_91001|t_90027|t_90026|t_90025|t_90024|t_90023|t_90022|t_90021|t_90020|t_90019|t_90018|t_90017|t_90016|t_90015|t_90014|t_90013|t_90012|t_90011|t_90010|t_90009|t_90008|t_90007|t_90006|t_90005|t_90004|t_90003|t_90002|t_90001|t_82027|t_82026|t_82025|t_82024|t_82023|t_82022|t_82021|t_82020|t_82019|t_82018|t_82017|t_82016|t_82015|t_82014|t_82013|t_82012|t_82011|t_82010|t_82007|t_82006|t_82005|t_82004|t_82003|t_82002|t_82001|t_81027|t_81026|t_81025|t_81024|t_81023|t_81022|t_81021|t_81020|t_81019|t_81018|t_81017|t_81016|t_81015|t_81014|t_81013|t_81012|t_81011|t_81010|t_81007|t_81006|t_81005|t_81004|t_81003|t_81002|t_81001|t_80027|t_80026|t_80025|t_80024|t_80023|t_80022|t_80021|t_80020|t_80019|t_80018|t_80017|t_80016|t_80015|t_80014|t_80013|t_80012|t_80011|t_80010|t_80007|t_80006|t_80005|t_80004|t_80003|t_80002|t_80001"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <connect from_port="input 1" to_op="Decision Stump" to_port="training set"/>
          <connect from_op="Decision Stump" from_port="model" to_op="Apply Model (2)" to_port="model"/>
          <connect from_op="Decision Stump" from_port="exampleSet" to_op="Apply Model (2)" to_port="unlabelled data"/>
          <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Multiply" to_port="input"/>
          <connect from_op="Apply Model (2)" from_port="model" to_port="output 2"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Filter Examples" to_port="example set input"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Filter Examples (4)" to_port="example set input"/>
          <connect from_op="Filter Examples" from_port="example set output" to_op="Filter Examples (2)" to_port="example set input"/>
          <connect from_op="Filter Examples (2)" from_port="example set output" to_op="Append" to_port="example set 1"/>
          <connect from_op="Filter Examples (4)" from_port="example set output" to_op="Filter Examples (3)" to_port="example set input"/>
          <connect from_op="Filter Examples (3)" from_port="example set output" to_op="Append" to_port="example set 2"/>
          <connect from_op="Append" from_port="merged set" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
          <portSpacing port="sink_output 3" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Read AML" from_port="output" to_op="Loop" to_port="input 1"/>
      <connect from_op="Loop" from_port="output 1" to_port="result 1"/>
      <connect from_op="Loop" from_port="output 2" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
This creates two IO collections, one for the models and one for the example sets.
However, the two models and the two example sets, which correspond to the two iterations of the loop, look exactly identical.

If I unroll this process, explicitly laying out the two iterations as in the second process below, I get the expected results.

Hence, I must conclude that the Loop operator is broken?  >:(
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
    <process expanded="true" height="746" width="891">
      <operator activated="true" class="read_aml" compatibility="5.0.8" expanded="true" height="60" name="Read AML" width="90" x="45" y="30">
        <parameter key="attributes" value="C:\Users\eichenbe\Documents\Backup\Laptop\LiveCopy\Software\RM5_tja1055\SigmaTable_Real.aml"/>
      </operator>
      <operator activated="true" class="decision_stump" compatibility="5.0.8" expanded="true" height="76" name="Decision Stump" width="90" x="179" y="30">
        <parameter key="criterion" value="accuracy"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model" width="90" x="313" y="30">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="5.0.8" expanded="true" height="94" name="Multiply (2)" width="90" x="179" y="165"/>
      <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (7)" width="90" x="313" y="255">
        <parameter key="condition_class" value="correct_predictions"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (8)" width="90" x="447" y="255">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="Label=1"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (5)" width="90" x="313" y="165">
        <parameter key="condition_class" value="wrong_predictions"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (6)" width="90" x="447" y="165">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="Label=0"/>
      </operator>
      <operator activated="true" class="append" compatibility="5.0.8" expanded="true" height="94" name="Append (2)" width="90" x="581" y="210"/>
      <operator activated="true" class="select_attributes" compatibility="5.0.8" expanded="true" height="76" name="Select Attributes (2)" width="90" x="715" y="210">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="Label|ID|t_92027|t_92026|t_92025|t_92024|t_92023|t_92022|t_92021|t_92020|t_92019|t_92018|t_92017|t_92016|t_92015|t_92014|t_92013|t_92012|t_92011|t_92010|t_92009|t_92008|t_92007|t_92006|t_92005|t_92004|t_92003|t_92002|t_92001|t_91027|t_91026|t_91025|t_91024|t_91023|t_91022|t_91021|t_91020|t_91019|t_91018|t_91017|t_91016|t_91015|t_91014|t_91013|t_91012|t_91011|t_91010|t_91009|t_91008|t_91007|t_91006|t_91005|t_91004|t_91003|t_91002|t_91001|t_90027|t_90026|t_90025|t_90024|t_90023|t_90022|t_90021|t_90020|t_90019|t_90018|t_90017|t_90016|t_90015|t_90014|t_90013|t_90012|t_90011|t_90010|t_90009|t_90008|t_90007|t_90006|t_90005|t_90004|t_90003|t_90002|t_90001|t_82027|t_82026|t_82025|t_82024|t_82023|t_82022|t_82021|t_82020|t_82019|t_82018|t_82017|t_82016|t_82015|t_82014|t_82013|t_82012|t_82011|t_82010|t_82007|t_82006|t_82005|t_82004|t_82003|t_82002|t_82001|t_81027|t_81026|t_81025|t_81024|t_81023|t_81022|t_81021|t_81020|t_81019|t_81018|t_81017|t_81016|t_81015|t_81014|t_81013|t_81012|t_81011|t_81010|t_81007|t_81006|t_81005|t_81004|t_81003|t_81002|t_81001|t_80027|t_80026|t_80025|t_80024|t_80023|t_80022|t_80021|t_80020|t_80019|t_80018|t_80017|t_80016|t_80015|t_80014|t_80013|t_80012|t_80011|t_80010|t_80007|t_80006|t_80005|t_80004|t_80003|t_80002|t_80001"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="5.0.8" expanded="true" height="94" name="Multiply" width="90" x="179" y="390"/>
      <operator activated="true" class="decision_stump" compatibility="5.0.8" expanded="true" height="76" name="Decision Stump (2)" width="90" x="313" y="390">
        <parameter key="criterion" value="accuracy"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model (2)" width="90" x="447" y="390">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="5.0.8" expanded="true" height="94" name="Multiply (3)" width="90" x="179" y="525"/>
      <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (4)" width="90" x="313" y="615">
        <parameter key="condition_class" value="correct_predictions"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (9)" width="90" x="447" y="615">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="Label=1"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (2)" width="90" x="313" y="525">
        <parameter key="condition_class" value="wrong_predictions"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples (3)" width="90" x="447" y="525">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="Label=0"/>
      </operator>
      <operator activated="true" class="append" compatibility="5.0.8" expanded="true" height="94" name="Append (3)" width="90" x="581" y="570"/>
      <operator activated="true" class="select_attributes" compatibility="5.0.8" expanded="true" height="76" name="Select Attributes (3)" width="90" x="715" y="570">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="Label|ID|t_92027|t_92026|t_92025|t_92024|t_92023|t_92022|t_92021|t_92020|t_92019|t_92018|t_92017|t_92016|t_92015|t_92014|t_92013|t_92012|t_92011|t_92010|t_92009|t_92008|t_92007|t_92006|t_92005|t_92004|t_92003|t_92002|t_92001|t_91027|t_91026|t_91025|t_91024|t_91023|t_91022|t_91021|t_91020|t_91019|t_91018|t_91017|t_91016|t_91015|t_91014|t_91013|t_91012|t_91011|t_91010|t_91009|t_91008|t_91007|t_91006|t_91005|t_91004|t_91003|t_91002|t_91001|t_90027|t_90026|t_90025|t_90024|t_90023|t_90022|t_90021|t_90020|t_90019|t_90018|t_90017|t_90016|t_90015|t_90014|t_90013|t_90012|t_90011|t_90010|t_90009|t_90008|t_90007|t_90006|t_90005|t_90004|t_90003|t_90002|t_90001|t_82027|t_82026|t_82025|t_82024|t_82023|t_82022|t_82021|t_82020|t_82019|t_82018|t_82017|t_82016|t_82015|t_82014|t_82013|t_82012|t_82011|t_82010|t_82007|t_82006|t_82005|t_82004|t_82003|t_82002|t_82001|t_81027|t_81026|t_81025|t_81024|t_81023|t_81022|t_81021|t_81020|t_81019|t_81018|t_81017|t_81016|t_81015|t_81014|t_81013|t_81012|t_81011|t_81010|t_81007|t_81006|t_81005|t_81004|t_81003|t_81002|t_81001|t_80027|t_80026|t_80025|t_80024|t_80023|t_80022|t_80021|t_80020|t_80019|t_80018|t_80017|t_80016|t_80015|t_80014|t_80013|t_80012|t_80011|t_80010|t_80007|t_80006|t_80005|t_80004|t_80003|t_80002|t_80001"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <connect from_op="Read AML" from_port="output" to_op="Decision Stump" to_port="training set"/>
      <connect from_op="Decision Stump" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Decision Stump" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Multiply (2)" to_port="input"/>
      <connect from_op="Apply Model" from_port="model" to_port="result 1"/>
      <connect from_op="Multiply (2)" from_port="output 1" to_op="Filter Examples (5)" to_port="example set input"/>
      <connect from_op="Multiply (2)" from_port="output 2" to_op="Filter Examples (7)" to_port="example set input"/>
      <connect from_op="Filter Examples (7)" from_port="example set output" to_op="Filter Examples (8)" to_port="example set input"/>
      <connect from_op="Filter Examples (8)" from_port="example set output" to_op="Append (2)" to_port="example set 2"/>
      <connect from_op="Filter Examples (5)" from_port="example set output" to_op="Filter Examples (6)" to_port="example set input"/>
      <connect from_op="Filter Examples (6)" from_port="example set output" to_op="Append (2)" to_port="example set 1"/>
      <connect from_op="Append (2)" from_port="merged set" to_op="Select Attributes (2)" to_port="example set input"/>
      <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Decision Stump (2)" to_port="training set"/>
      <connect from_op="Multiply" from_port="output 2" to_port="result 3"/>
      <connect from_op="Decision Stump (2)" from_port="model" to_op="Apply Model (2)" to_port="model"/>
      <connect from_op="Decision Stump (2)" from_port="exampleSet" to_op="Apply Model (2)" to_port="unlabelled data"/>
      <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Multiply (3)" to_port="input"/>
      <connect from_op="Apply Model (2)" from_port="model" to_port="result 2"/>
      <connect from_op="Multiply (3)" from_port="output 1" to_op="Filter Examples (2)" to_port="example set input"/>
      <connect from_op="Multiply (3)" from_port="output 2" to_op="Filter Examples (4)" to_port="example set input"/>
      <connect from_op="Filter Examples (4)" from_port="example set output" to_op="Filter Examples (9)" to_port="example set input"/>
      <connect from_op="Filter Examples (9)" from_port="example set output" to_op="Append (3)" to_port="example set 2"/>
      <connect from_op="Filter Examples (2)" from_port="example set output" to_op="Filter Examples (3)" to_port="example set input"/>
      <connect from_op="Filter Examples (3)" from_port="example set output" to_op="Append (3)" to_port="example set 1"/>
      <connect from_op="Append (3)" from_port="merged set" to_op="Select Attributes (3)" to_port="example set input"/>
      <connect from_op="Select Attributes (3)" from_port="example set output" to_port="result 4"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
      <portSpacing port="sink_result 5" spacing="0"/>
    </process>
  </operator>
</process>
Regards,
Stefan

Answers

  • cherokee
    cherokee New Altair Community Member
    Hi Stefan_E,

    As far as I remember, the Loop operator worked for me. And we cannot check your process setup because something important is missing: the data!

    Best regards,
    chero
  • haddock
    haddock New Altair Community Member
    Hi there,

    Like Chero, I'm OK with the Loop operator in most of its guises (see http://rapid-i.com/rapidforum/index.php/topic,2251.msg9179.html#msg9179). As I remember it, finding out that 'append' flattens collections helped. I ran your code with data generators and have to ask what you are trying to achieve with it.
  • Stefan_E
    Stefan_E New Altair Community Member
    Thanks for your answers. They don't help yet, though, as I still don't understand why the two processes should give different results. I'm also not sure what Haddock means by the remark on 'append'.

    What do I want to do? Something similar to a boosted decision stump learner:
    • I want to separate the data set in a tree-like fashion, but want to make sure that each attribute is used at most twice, that is, with a single upper and a single lower bound for separation.
    • In the process I accept misclassified good examples, but want to eventually find all bad examples.
    • Hence, after each iteration, I build a new data set consisting of all misclassified bad examples and all correctly classified good examples, then apply Decision Stump anew.
    So far, results look pretty good. Unfortunately, I had to unroll the loop into separate sub-processes, which I then instantiate many times with 'Execute Process'.
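
The procedure described in the list above can be sketched in plain Python. This is only an illustration of the idea, not RapidMiner code: the `Stump` class, `fit_stump`, `next_training_set`, `run` and the toy data are all hypothetical names invented here, and the sketch assumes that `Label=0` marks bad examples and `Label=1` good ones, as the filters in the posted process suggest. It does not enforce the "each attribute at most twice" constraint.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    attribute: int      # index of the attribute the stump splits on
    threshold: float    # split point
    label_above: int    # label predicted when value > threshold

    def predict(self, row):
        above = row[self.attribute] > self.threshold
        return self.label_above if above else 1 - self.label_above


def fit_stump(rows, labels):
    """Exhaustively pick the split with the highest training accuracy."""
    best, best_correct = None, -1
    for attribute in range(len(rows[0])):
        for threshold in sorted({row[attribute] for row in rows}):
            for label_above in (0, 1):
                stump = Stump(attribute, threshold, label_above)
                correct = sum(stump.predict(r) == y for r, y in zip(rows, labels))
                if correct > best_correct:
                    best, best_correct = stump, correct
    return best


def next_training_set(stump, rows, labels):
    """Keep misclassified bad examples (Label=0) and correct good ones (Label=1)."""
    kept = [
        (r, y) for r, y in zip(rows, labels)
        if (y == 0 and stump.predict(r) != y) or (y == 1 and stump.predict(r) == y)
    ]
    return [r for r, _ in kept], [y for _, y in kept]


def run(rows, labels, iterations):
    """Each iteration trains on the data set produced by the previous one."""
    stumps = []
    for _ in range(iterations):
        if not rows:
            break
        stump = fit_stump(rows, labels)
        stumps.append(stump)
        rows, labels = next_training_set(stump, rows, labels)
    return stumps, rows, labels


# Toy data (hypothetical): two attributes, three bad and three good examples.
rows = [(1, 5), (2, 5), (6, 1), (5, 5), (7, 6), (8, 7)]
labels = [0, 0, 0, 1, 1, 1]
stumps, remaining_rows, remaining_labels = run(rows, labels, iterations=2)
print([s.attribute for s in stumps], remaining_labels)
```

On this toy data the first stump splits on attribute 0 and misses one bad example, and the second stump, trained on the reduced set, catches it by splitting on attribute 1, so only good examples remain. The important point is that `run` passes the output of one iteration into the next.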

    Stefan
  • cherokee
    cherokee New Altair Community Member
    Soooo,

    we have the typical case of documentation not matching code. The Loop operator does not deliver its output as new input for the next iteration. It just runs n times on the original input and collects the output.

    So the solution to your problem is not trivial. Try using the operators Remember and Recall within the loop operator and do not use any input directly.
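
The difference between the two behaviours can be modelled in a few lines of Python. This is a sketch of the semantics described above, not of RapidMiner's implementation: the function names are made up for illustration, and the `store` dictionary merely stands in for what Remember and Recall would do inside the loop.

```python
def loop_without_feedback(body, original_input, iterations):
    """Model of the observed behaviour: every pass sees the original input."""
    return [body(original_input) for _ in range(iterations)]


def loop_with_store(body, original_input, iterations):
    """Model of the workaround: a named store carries data between passes."""
    store = {"data": original_input}      # like Remember before the loop
    outputs = []
    for _ in range(iterations):
        current = store["data"]           # like Recall at the start of a pass
        result = body(current)
        store["data"] = result            # like Remember at the end of a pass
        outputs.append(result)
    return outputs


def drop_smallest(values):
    """Stand-in for the loop body: remove the smallest element."""
    return sorted(values)[1:]


same_each_time = loop_without_feedback(drop_smallest, [3, 1, 2], 2)
fed_forward = loop_with_store(drop_smallest, [3, 1, 2], 2)
print(same_each_time)   # [[2, 3], [2, 3]]
print(fed_forward)      # [[2, 3], [3]]
```

The first variant returns two identical results, which matches the identical models and example sets reported in the question. The second returns a different result per pass because each pass starts from what the previous one stored.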

    Best regards,
    chero
  • haddock
    haddock New Altair Community Member
    Greets,

    Chero's absolutely right: a collection gets made at each pass. To process the collections as one example set, you need to use the Append operator (which you'd expect nearby on the menu).

  • Stefan_E
    Stefan_E New Altair Community Member
    cherokee wrote:

    we have the typical case of documentation not matching code. The Loop operator does not deliver its output as new input for the next iteration. It just runs n times on the original input and collects the output.

    Hi Chero,

    You hit the nail on the head... It works perfectly with Remember/Recall.

    I hope the RapidMiner team reads along here and works on the documentation. It helps neither when it doesn't match the code nor when it's trivial, as it is most of the time... (e.g. "Minimal size for split - the minimal size of a node in order to allow a split")  ::)

    Stefan