"How to get the last row after windowing"
WinKad
New Altair Community Member
Hi everybody,
I use the following test data (as you can see, every value is written in matrix form as row,column):
1,1;1,2;1,3;1,4;1,5
2,1;2,2;2,3;2,4;2,5
3,1;3,2;3,3;3,4;3,5
4,1;4,2;4,3;4,4;4,5
5,1;5,2;5,3;5,4;5,5
6,1;6,2;6,3;6,4;6,5
7,1;7,2;7,3;7,4;7,5
8,1;8,2;8,3;8,4;8,5
9,1;9,2;9,3;9,4;9,5
10,1;10,2;10,3;10,4;10,5
After windowing with window size = 3 for processing, I want to get the last row of the data windowed with window size = 2 and use it as the feed (unlabelled data) for the process.
Perhaps this question has already been posted in another form, but I didn't find it.
Here is my code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.0.11" expanded="true" name="Process">
<process expanded="true" height="521" width="415">
<operator activated="true" class="read_csv" compatibility="5.0.11" expanded="true" height="60" name="Read CSV" width="90" x="45" y="30">
<parameter key="file_name" value="D:\Eigene Dateien\Meine Projekte\Lotto\Rapidminer\Test.csv"/>
<parameter key="encoding" value="windows-1252"/>
<parameter key="trim_lines" value="true"/>
<parameter key="use_first_row_as_attribute_names" value="false"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="attribute_0.true.1.regular"/>
<parameter key="1" value="attribute_1.true.1.regular"/>
<parameter key="2" value="attribute_2.true.1.regular"/>
<parameter key="3" value="attribute_3.true.1.regular"/>
<parameter key="4" value="attribute_4.true.1.regular"/>
</list>
<parameter key="attribute_names_already_defined" value="true"/>
</operator>
<operator activated="true" class="rename_by_replacing" compatibility="5.0.11" expanded="true" height="76" name="Rename by Replacing" width="90" x="179" y="30">
<parameter key="replace_what" value="(attribute_)"/>
<parameter key="replace_by" value="Z"/>
</operator>
<operator activated="true" class="multiply" compatibility="5.0.11" expanded="true" height="112" name="Multiply" width="90" x="45" y="120"/>
<operator activated="true" class="series:windowing" compatibility="5.0.2" expanded="true" height="76" name="Windowing" width="90" x="179" y="165">
<parameter key="window_size" value="3"/>
</operator>
<operator activated="true" class="series:windowing" compatibility="5.0.2" expanded="true" height="76" name="Windowing (2)" width="90" x="179" y="255">
<parameter key="window_size" value="2"/>
</operator>
<connect from_op="Read CSV" from_port="output" to_op="Rename by Replacing" to_port="example set input"/>
<connect from_op="Rename by Replacing" from_port="example set output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Windowing" to_port="example set input"/>
<connect from_op="Multiply" from_port="output 3" to_op="Windowing (2)" to_port="example set input"/>
<connect from_op="Windowing" from_port="example set output" to_port="result 2"/>
<connect from_op="Windowing (2)" from_port="example set output" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="90"/>
<portSpacing port="sink_result 2" spacing="36"/>
<portSpacing port="sink_result 3" spacing="162"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
Is there a special operator for this?
Answers
-
What do you want to achieve with this last row?
Make a prediction for the last window in your data?
-
Hello wessel,
yes, that's what I want.
But when I tried to apply both windowing outputs together, one with window size = 3 and the other with window size = 2, RM refused. I suppose there is a problem with the names and/or the order of the columns.
I have just looked at the output: the row 9,1 10,1 9,2 10,2 9,3 10,3 ... 9,5 10,5 is exactly what I want to get. Do I have to rename the columns (with a macro iterator)?
Ciao
Winkad
-
I did something very similar but I was not happy with my solution, so I hope someone else can suggest something better.
Say you have a dataset:
x
1
2
3
4
5
6
7
8
9
and you have windowSize = 3, horizon = 2, you get
x-2 x-1 x-0 label (where label is x+2)
1 2 3 5
2 3 4 6
3 4 5 7
4 5 6 8
5 6 7 9
So what you want is
7 8 9 ?
You can get this with Filter Example Range, examples 7 to 9,
which gives
x
7
8
9
If you do windowing on this dataset without a horizon, you get
x-2 x-1 x-0
7 8 9
RapidMiner automatically adds the label attribute; it will give a warning that the label is missing, but it will work.
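The idea above can be checked outside RapidMiner with a small plain-Python sketch. It is not the Windowing operator itself; the function names are made up for illustration, and it assumes a step size of 1 and a label taken `horizon` steps after the last value in the window, as in the table above.

```python
def windowed_training_rows(series, window_size, horizon):
    """Return (window, label) pairs; the label lies `horizon` steps after the window's last value."""
    rows = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = series[start:start + window_size]
        label = series[start + window_size - 1 + horizon]
        rows.append((window, label))
    return rows


def last_window(series, window_size):
    """Return the final window; it has no label yet, so it is the row to predict for."""
    return series[-window_size:]


x = list(range(1, 10))  # the series 1..9 from the example
training = windowed_training_rows(x, window_size=3, horizon=2)
unlabelled = last_window(x, window_size=3)
```

Here `training` holds the labelled rows and `unlabelled` is the window 7 8 9. Both use the same window size, so they share the same window columns.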
-
Hi everybody,
Oh, how stupid of me. I thought that Filter Example Range referred to the content of the rows...
Here is what I have found out so far.
The 'Naive Bayes' operator makes no sense here, but I want to check whether the filtered row is accepted by the 'Apply Model' operator.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
<process expanded="true" height="386" width="480">
<operator activated="true" class="subprocess" compatibility="5.0.10" expanded="true" height="130" name="Subprocess" width="90" x="45" y="30">
<parameter key="parallelize_nested_chain" value="true"/>
<process expanded="true" height="431" width="567">
<operator activated="true" class="read_csv" compatibility="5.0.10" expanded="true" height="60" name="Read CSV" width="90" x="45" y="30">
<parameter key="file_name" value="D:\Eigene Dateien\Meine Projekte\Lotto\Rapidminer\Test.csv"/>
<parameter key="encoding" value="windows-1252"/>
<parameter key="trim_lines" value="true"/>
<parameter key="use_first_row_as_attribute_names" value="false"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="attribute_0.true.1.regular"/>
<parameter key="1" value="attribute_1.true.1.regular"/>
<parameter key="2" value="attribute_2.true.1.regular"/>
<parameter key="3" value="attribute_3.true.1.regular"/>
<parameter key="4" value="attribute_4.true.1.regular"/>
</list>
<parameter key="attribute_names_already_defined" value="true"/>
</operator>
<operator activated="true" class="rename_by_replacing" compatibility="5.0.11" expanded="true" height="76" name="Rename by Replacing" width="90" x="179" y="30">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="attribute_4|attribute_3|attribute_2|attribute_1|attribute_0"/>
<parameter key="regular_expression" value="(attribute_)"/>
<parameter key="replace_what" value="(attribute_)"/>
<parameter key="replace_by" value="col"/>
</operator>
<operator activated="true" class="multiply" compatibility="5.0.11" expanded="true" height="94" name="Multiply" width="90" x="45" y="165"/>
<operator activated="true" class="series:windowing" compatibility="5.0.2" expanded="true" height="76" name="Windowing2" width="90" x="179" y="255">
<parameter key="window_size" value="2"/>
</operator>
<operator activated="true" class="filter_example_range" compatibility="5.0.11" expanded="true" height="76" name="Filter Example Range" width="90" x="313" y="255">
<parameter key="first_example" value="4"/>
<parameter key="last_example" value="4"/>
</operator>
<operator activated="true" class="series:windowing" compatibility="5.0.2" expanded="true" height="76" name="Windowing3" width="90" x="179" y="165">
<parameter key="window_size" value="3"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.0.11" expanded="true" height="76" name="Set Role" width="90" x="313" y="165">
<parameter key="name" value="col0-0"/>
<parameter key="target_role" value="label"/>
</operator>
<operator activated="true" class="naive_bayes" compatibility="5.0.11" expanded="true" height="76" name="Naive Bayes" width="90" x="447" y="165"/>
<operator activated="true" class="apply_model" compatibility="5.0.11" expanded="true" height="76" name="Apply Model" width="90" x="447" y="255">
<list key="application_parameters"/>
</operator>
<connect from_op="Read CSV" from_port="output" to_op="Rename by Replacing" to_port="example set input"/>
<connect from_op="Rename by Replacing" from_port="example set output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Windowing3" to_port="example set input"/>
<connect from_op="Multiply" from_port="output 2" to_op="Windowing2" to_port="example set input"/>
<connect from_op="Windowing2" from_port="example set output" to_op="Filter Example Range" to_port="example set input"/>
<connect from_op="Filter Example Range" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Windowing3" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Naive Bayes" to_port="training set"/>
<connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_port="out 1"/>
<connect from_op="Apply Model" from_port="model" to_port="out 2"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="sink_out 1" spacing="234"/>
<portSpacing port="sink_out 2" spacing="0"/>
<portSpacing port="sink_out 3" spacing="0"/>
<portSpacing port="sink_out 4" spacing="0"/>
<portSpacing port="sink_out 5" spacing="0"/>
</process>
</operator>
<connect from_op="Subprocess" from_port="out 1" to_port="result 1"/>
<connect from_op="Subprocess" from_port="out 2" to_port="result 2"/>
<connect from_op="Subprocess" from_port="out 3" to_port="result 3"/>
<connect from_op="Subprocess" from_port="out 4" to_port="result 4"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
<portSpacing port="sink_result 5" spacing="0"/>
</process>
</operator>
</process>
But now, what is the meaning of the following warnings?
PM WARNING: SimpleDistribution: The number of regular attributes of the given example set does not fit the number of attributes of the training example set, training: 14, application: 10
PM WARNING: SimpleDistribution: The given example set does not contain a regular attribute with name 'col0-2'. This might cause problems for some models depending on this particular attribute.
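The counts in these warnings can be reproduced with a short plain-Python sketch. It assumes the windowed attributes are named `<column>-<lag>`, a pattern inferred from the names `col0-0` and `col0-2` seen in this thread, and it models only the attribute names, not the Windowing operator.

```python
def windowed_attribute_names(columns, window_size):
    """Build names in the '<column>-<lag>' pattern seen in the thread, e.g. 'col0-2'."""
    return [
        f"{col}-{lag}"
        for col in columns
        for lag in range(window_size - 1, -1, -1)
    ]


columns = [f"col{i}" for i in range(5)]  # five columns after renaming

training = windowed_attribute_names(columns, window_size=3)
# 'col0-0' is turned into the label by Set Role in the posted process,
# so it no longer counts as a regular attribute.
training_regular = [name for name in training if name != "col0-0"]
application = windowed_attribute_names(columns, window_size=2)

# Attributes the model was trained on but the application set lacks.
missing = sorted(set(training_regular) - set(application))
```

Under these assumptions the training set has 14 regular attributes and the application set has 10. Every `-2` attribute, including `col0-2`, is absent from the window-size-2 data, which matches both warnings.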
-
Additional question: how can I get the number of the last row?
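As plain arithmetic, and assuming a step size of 1 and no horizon, windowing n rows with window size w leaves n - w + 1 rows, so that is also the 1-based number of the last row. A minimal Python sketch (the function name is hypothetical and this is not a RapidMiner operator):

```python
def last_windowed_row_number(n_rows, window_size):
    """1-based number of the last row after windowing with step size 1 and no horizon."""
    count = n_rows - window_size + 1
    if count < 1:
        raise ValueError("window_size is larger than the number of rows")
    return count


# Ten rows windowed with window size 2 leave nine rows, so the last row is number 9.
last_row = last_windowed_row_number(10, 2)
```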
-
Hi,
Note: I suspect there is an error. Let's see...
Windowing with window size = 3 on an original data set with 2 columns, labelled C0 and C1, gives the following (cell values in Excel notation):
C0-2 C0-1 C0-0 C1-2 C1-1 C1-0
A1 A2 A3 B1 B2 B3
A2 A3 A4 B2 B3 B4
A3 A4 A5 B3 B4 B5
Windowing with window size = 2 gives:
C0-1 C0-0 C1-1 C1-0
A1 A2 B1 B2
A2 A3 B2 B3
A3 A4 B3 B4
A4 A5 B4 B5
Using these two example sets with Apply Model, the second one as unlabelled data, does not work: the attributes don't match.
It's a great pity!
I cannot make head or tail of it.
Ciao
WinKad