Create new dataset from logical event?

Mark_Knecht New Altair Community Member
edited November 5 in Community Q&A
Hi all,
  I'm completely new to RapidMiner and working through the demo videos and online tutorials, but at the same time trying to piece together how I'd use RapidMiner to look for certain things in a time series. I've got a lot to learn.

  I have a goal with this code:

1) Read some SPY data
2) Take a 5-period moving average of the close (MA_Fast)
3) Take a 4-period moving average of MA_Fast (MA_Slow)
4) Each time MA_Fast crosses above or below MA_Slow, build a new data set of the SPY data on the next cycle which includes all the data columns in the original set (Date, Time, Open, High

  I'm attaching a little bit of code which _might_ do something like steps 1-3, but I'm clueless as to how to do step 4. Can someone give me a hint? The multipliers, in my mind, create result outputs that would feed some other piece of logic I don't know how to build yet; they are not the actual output of the design.

  I'm also not sure I've defined labels in the original SPY data correctly. The columns are correctly named, but both Date and Time seem like the label to me, and so far it seems RM will only accept a single label. I'm probably doing something wrong there as well.

Thanks,
Mark


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <process expanded="true" height="586" width="413">
      <operator activated="true" class="retrieve" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="//NewLocalRepository/Mark1/data/SPY_Daily"/>
      </operator>
      <operator activated="true" class="multiply" expanded="true" height="94" name="Multiply (2)" width="90" x="179" y="30"/>
      <operator activated="true" class="series:moving_average" expanded="true" height="76" name="Moving Average" width="90" x="112" y="165">
        <parameter key="attribute_name" value="Close"/>
      </operator>
      <operator activated="true" class="multiply" expanded="true" height="94" name="Multiply" width="90" x="112" y="255"/>
      <operator activated="true" class="series:moving_average" expanded="true" height="76" name="Moving Average (3)" width="90" x="246" y="300">
        <parameter key="attribute_name" value="Close"/>
      </operator>
      <connect from_op="Retrieve" from_port="output" to_op="Multiply (2)" to_port="input"/>
      <connect from_op="Multiply (2)" from_port="output 1" to_port="result 1"/>
      <connect from_op="Multiply (2)" from_port="output 2" to_op="Moving Average" to_port="example set input"/>
      <connect from_op="Moving Average" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_port="result 2"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Moving Average (3)" to_port="example set input"/>
      <connect from_op="Moving Average (3)" from_port="example set output" to_port="result 3"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="144"/>
      <portSpacing port="sink_result 3" spacing="90"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>

Answers

  • SebastianLoh
    SebastianLoh New Altair Community Member
    Hi Mark,

    Take a look at this process (it uses the Golf sample data, with Temperature standing in for your Close column):
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="572" width="681">
          <operator activated="true" class="retrieve" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="generate_id" expanded="true" height="76" name="Generate ID" width="90" x="179" y="30">
            <parameter key="create_nominal_ids" value="true"/>
          </operator>
          <operator activated="true" class="multiply" expanded="true" height="94" name="Multiply" width="90" x="313" y="30"/>
          <operator activated="true" class="select_attributes" expanded="true" height="76" name="Select Attributes" width="90" x="447" y="120">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Temperature|id"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="series:moving_average" expanded="true" height="76" name="Moving Average" width="90" x="45" y="165">
            <parameter key="attribute_name" value="Temperature"/>
          </operator>
          <operator activated="true" class="series:moving_average" expanded="true" height="76" name="Moving Average (3)" width="90" x="179" y="165">
            <parameter key="attribute_name" value="moving_average(Temperature)"/>
            <parameter key="window_width" value="4"/>
          </operator>
          <operator activated="true" class="rename" expanded="true" height="76" name="Rename" width="90" x="45" y="300">
            <parameter key="old_name" value="moving_average(Temperature)"/>
            <parameter key="new_name" value="MA_Fast"/>
          </operator>
          <operator activated="true" class="rename" expanded="true" height="76" name="Rename (2)" width="90" x="179" y="300">
            <parameter key="old_name" value="moving_average(moving_average(Temperature))"/>
            <parameter key="new_name" value="MA_Slow"/>
          </operator>
          <operator activated="true" class="generate_attributes" expanded="true" height="76" name="Generate Attributes" width="90" x="313" y="300">
            <list key="function_descriptions">
              <parameter key="larger" value="if (MA_Fast  &gt; MA_Slow, 1, 0) "/>
            </list>
          </operator>
          <operator activated="true" class="select_attributes" expanded="true" height="76" name="Select Attributes (2)" width="90" x="447" y="300">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="id|larger"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="materialize_data" expanded="true" height="76" name="Materialize Data" width="90" x="45" y="435"/>
          <operator activated="true" class="series:windowing" expanded="true" height="76" name="Windowing" width="90" x="179" y="435">
            <parameter key="window_size" value="2"/>
          </operator>
          <operator activated="true" class="rename" expanded="true" height="76" name="Rename (3)" width="90" x="313" y="435">
            <parameter key="old_name" value="larger-0"/>
            <parameter key="new_name" value="first"/>
          </operator>
          <operator activated="true" class="rename" expanded="true" height="76" name="Rename (4)" width="90" x="447" y="435">
            <parameter key="old_name" value="larger-1"/>
            <parameter key="new_name" value="second"/>
          </operator>
          <operator activated="true" class="generate_attributes" expanded="true" height="76" name="Generate Attributes (2)" width="90" x="581" y="435">
            <list key="function_descriptions">
              <parameter key="crossing" value="first != second"/>
            </list>
          </operator>
          <operator activated="true" class="join" expanded="true" height="76" name="Join" width="90" x="581" y="30"/>
          <connect from_op="Retrieve" from_port="output" to_op="Generate ID" to_port="example set input"/>
          <connect from_op="Generate ID" from_port="example set output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Join" to_port="left"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Moving Average" to_port="example set input"/>
          <connect from_op="Moving Average" from_port="example set output" to_op="Moving Average (3)" to_port="example set input"/>
          <connect from_op="Moving Average (3)" from_port="example set output" to_op="Rename" to_port="example set input"/>
          <connect from_op="Rename" from_port="example set output" to_op="Rename (2)" to_port="example set input"/>
          <connect from_op="Rename (2)" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Select Attributes (2)" to_port="example set input"/>
          <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Materialize Data" to_port="example set input"/>
          <connect from_op="Materialize Data" from_port="example set output" to_op="Windowing" to_port="example set input"/>
          <connect from_op="Windowing" from_port="example set output" to_op="Rename (3)" to_port="example set input"/>
          <connect from_op="Rename (3)" from_port="example set output" to_op="Rename (4)" to_port="example set input"/>
          <connect from_op="Rename (4)" from_port="example set output" to_op="Generate Attributes (2)" to_port="example set input"/>
          <connect from_op="Generate Attributes (2)" from_port="example set output" to_op="Join" to_port="right"/>
          <connect from_op="Join" from_port="join" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="414"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Ciao Sebastian
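
For readers who want to check the logic outside RapidMiner, the idea behind the process above can be sketched in plain Python. This is only a sketch under assumptions, not a translation of the operators: the function names (`moving_average`, `crossing_flags`, `rows_after_crossings`) are made up for illustration, the moving average is assumed to be a trailing simple average that is undefined until a full window is available, and "the next cycle" from step 4 is read as "the row immediately after the one where the crossing is detected". Which row RapidMiner's Windowing operator attaches the crossing flag to has not been verified here.

```python
def moving_average(values, width):
    """Trailing simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        window = values[i - width + 1 : i + 1] if i >= width - 1 else []
        if window and all(v is not None for v in window):
            out.append(sum(window) / width)
        else:
            out.append(None)
    return out


def crossing_flags(fast, slow):
    """True at index i when (fast > slow) differs between index i - 1 and i."""
    # Same role as the 'larger' attribute in the process: is fast above slow?
    larger = [None if f is None or s is None else f > s for f, s in zip(fast, slow)]
    flags = [False] * len(larger)
    for i in range(1, len(larger)):
        # Compare each value with its predecessor, like a window of size 2.
        if larger[i] is not None and larger[i - 1] is not None:
            flags[i] = larger[i] != larger[i - 1]
    return flags


def rows_after_crossings(rows, close_key="Close", fast_width=5, slow_width=4):
    """Return the full original rows that directly follow each crossing."""
    closes = [row[close_key] for row in rows]
    ma_fast = moving_average(closes, fast_width)      # step 2
    ma_slow = moving_average(ma_fast, slow_width)     # step 3 (average of MA_Fast)
    flags = crossing_flags(ma_fast, ma_slow)
    # Step 4 (assumed reading): keep the row one position after each crossing.
    return [rows[i + 1] for i, crossed in enumerate(flags) if crossed and i + 1 < len(rows)]
```

Each element of `rows` is assumed to be a dict holding all original columns (Date, Time, Close and so on), so the returned list keeps every column of the selected rows. In RapidMiner the equivalent last step would be some form of filtering on the `crossing` attribute after the Join, which the process above does not yet include.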