Select inside Loop Parameters module not working
Aj
New Altair Community Member
Hello,
I am trying to run the following simulation:
1) Execute a model (SVM or any other model) with different combinations of parameters.
2) Get the model for each parameter set as output for the given training data.
3) Get the predicted outcome on the test data for each model, i.e. for each parameter combination.
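The three steps above can be sketched in plain Python (not RapidMiner), purely to make the intended behaviour concrete: one result per parameter combination. The parameter values are the ones set on "Loop Parameters" in the process XML; `parameter_combinations`, `run_loop`, and the echoing loop body are hypothetical stand-ins, not RapidMiner operators.

```python
from itertools import product

# Parameter values as configured on "Loop Parameters" in the process XML.
grid = {
    "SVM.kernel_type": ["radial", "polynomial"],
    "SVM.C": [0.01, 0.1],
}

def parameter_combinations(grid):
    """Return one dict per combination of parameter values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

def run_loop(grid, body):
    """Call body(params) once per combination and keep every result."""
    return [body(params) for params in parameter_combinations(grid)]

# Hypothetical stand-in for "train SVM, then Apply Model":
# it only echoes the parameters it was called with.
results = run_loop(
    grid,
    lambda params: ("prediction for", params["SVM.kernel_type"], params["SVM.C"]),
)

for result in results:
    print(result)
```

The point of the sketch is the return value of `run_loop`: a list with as many entries as there are combinations, which is what I expect to receive from the loop for the test data.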
I used the "Loop Parameters" module with an SVM module inside it. With this setup I get one model per parameter combination, as expected. (This corresponds to the "Loop Parameters" module in the attached XML code.)
When I apply the model generated by SVM to the test data using "Apply Model" inside the "Loop Parameters (2)" module, I get the test data output for only one model and not for each of the models. (This setup corresponds to "Loop Parameters (2)" in the attached XML code.)
I tried a different approach to get output for each generated model, one per parameter combination: passing the models "Collection" as input to "Loop Parameters", selecting each model using "Select", and then passing that model to "Apply Model". Still, I get only one output and not as many as there are loop iterations. (This setup corresponds to the "Loop Parameters (3)" module in the attached XML code.)
I noticed that the single predicted output coming out of the "Loop Parameters" module corresponds to the model number set as the default for "Select". This means that "Select.index" is not being changed. Accepting that "Loop Parameters" will deliver only one output, I tried writing to a file, a database, etc. from inside the "Loop Parameters" module. That does not work either, i.e. only one output gets written rather than one for each generated model.
The only way to get the test data output for multiple parameter sets out of the "Loop Parameters" module seems to be an "IOObjectCollection". But I cannot find a way to convert an "IOObjectCollection" to an "ExampleSet". Does anyone know of a way to do this conversion?
Could someone please point out what I am doing wrong, or suggest something to try or an alternative way of solving the problem?
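To make the conversion question above concrete, here is a minimal Python sketch of what I mean by turning a collection into a single table. It is not RapidMiner code: `flatten_collection` and the `model_index` column are hypothetical names, and the rows are made-up placeholders that only show the shape of the result.

```python
def flatten_collection(collection):
    """Merge a list of tables (each a list of row dicts) into one table.

    A 'model_index' column (hypothetical name) records which entry of the
    collection each row came from, counting from 1.
    """
    merged = []
    for index, table in enumerate(collection, start=1):
        for row in table:
            merged.append({"model_index": index, **row})
    return merged

# Made-up placeholder rows: one single-row table per model.
collection = [
    [{"id": 1, "prediction": "p1"}],
    [{"id": 1, "prediction": "p2"}],
]

merged = flatten_collection(collection)
for row in merged:
    print(row)
```

What I am looking for is the equivalent inside the process: one table that holds the predictions of every model, with something that tells the rows apart.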
Thanks,
Ajay
My XML code is as follows:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
<process expanded="true" height="404" width="642">
<operator activated="true" class="read_excel" compatibility="5.1.006" expanded="true" height="60" name="Read Excel" width="90" x="45" y="30">
<parameter key="excel_file" value="/home/Ajay/learnRapidMiner/learnData.xls"/>
<parameter key="imported_cell_range" value="A1:H21"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="multiply" compatibility="5.1.006" expanded="true" height="94" name="Multiply" width="90" x="179" y="75"/>
<operator activated="true" class="loop_parameters" compatibility="5.1.006" expanded="true" height="76" name="Loop Parameters" width="90" x="380" y="30">
<list key="parameters">
<parameter key="SVM.kernel_type" value="radial,polynomial"/>
<parameter key="SVM.C" value="0.01,0.1"/>
</list>
<process expanded="true" height="341" width="660">
<operator activated="true" class="support_vector_machine" compatibility="5.1.006" expanded="true" height="112" name="SVM" width="90" x="170" y="47">
<parameter key="kernel_type" value="polynomial"/>
<parameter key="C" value="0.1"/>
</operator>
<connect from_port="input 1" to_op="SVM" to_port="training set"/>
<connect from_op="SVM" from_port="model" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="read_excel" compatibility="5.1.006" expanded="true" height="60" name="Read Excel (2)" width="90" x="38" y="163">
<parameter key="excel_file" value="/home/Ajay/learnRapidMiner/learnData.xls"/>
<parameter key="sheet_number" value="2"/>
<parameter key="imported_cell_range" value="A1:H21"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="multiply" compatibility="5.1.006" expanded="true" height="94" name="Multiply (2)" width="90" x="511" y="30"/>
<operator activated="true" class="multiply" compatibility="5.1.006" expanded="true" height="94" name="Multiply (3)" width="90" x="179" y="210"/>
<operator activated="true" class="loop_parameters" compatibility="5.1.006" expanded="true" height="94" name="Loop Parameters (2)" width="90" x="380" y="165">
<list key="parameters">
<parameter key="SVM.kernel_type" value="radial,polynomial"/>
<parameter key="SVM.C" value="0.01,0.1"/>
</list>
<process expanded="true" height="341" width="660">
<operator activated="true" class="support_vector_machine" compatibility="5.1.006" expanded="true" height="112" name="SVM (2)" width="90" x="179" y="30"/>
<operator activated="true" class="apply_model" compatibility="5.1.006" expanded="true" height="76" name="Apply Model" width="90" x="374" y="101">
<list key="application_parameters"/>
</operator>
<connect from_port="input 1" to_op="SVM (2)" to_port="training set"/>
<connect from_port="input 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="SVM (2)" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="source_input 3" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="loop_parameters" compatibility="5.1.006" expanded="true" height="94" name="Loop Parameters (3)" width="90" x="514" y="300">
<list key="parameters">
<parameter key="Select.index" value="[1.0;4;4;linear]"/>
</list>
<process expanded="true" height="341" width="642">
<operator activated="true" class="select" compatibility="5.1.006" expanded="true" height="60" name="Select" width="90" x="112" y="30"/>
<operator activated="true" class="apply_model" compatibility="5.1.006" expanded="true" height="76" name="Apply Model (2)" width="90" x="352" y="39">
<list key="application_parameters"/>
</operator>
<connect from_port="input 1" to_op="Select" to_port="collection"/>
<connect from_port="input 2" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="Select" from_port="selected" to_op="Apply Model (2)" to_port="model"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="source_input 3" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Loop Parameters" to_port="input 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Loop Parameters (2)" to_port="input 1"/>
<connect from_op="Loop Parameters" from_port="result 1" to_op="Multiply (2)" to_port="input"/>
<connect from_op="Read Excel (2)" from_port="output" to_op="Multiply (3)" to_port="input"/>
<connect from_op="Multiply (2)" from_port="output 1" to_port="result 1"/>
<connect from_op="Multiply (2)" from_port="output 2" to_op="Loop Parameters (3)" to_port="input 1"/>
<connect from_op="Multiply (3)" from_port="output 1" to_op="Loop Parameters (2)" to_port="input 2"/>
<connect from_op="Multiply (3)" from_port="output 2" to_op="Loop Parameters (3)" to_port="input 2"/>
<connect from_op="Loop Parameters (2)" from_port="result 1" to_port="result 2"/>
<connect from_op="Loop Parameters (3)" from_port="result 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
The training data is as follows:
id label a b c d e f
1 0 0.25 0.85 0.65 0.45 0.95 0.12
2 1 0.75 0.85 0.35 0.55 0.95 0.86
3 1 0.75 0.85 0.65 0.55 0.95 0.12
4 1 0.75 0.85 0.65 0.55 0.685 0.12
5 1 0.75 0.85 0.65 0.45 0.95 0.12
6 0 0.75 0.85 0.65 0.55 0.95 0.86
7 1 0.75 0.15 0.65 0.45 0.95 0.12
8 1 0.25 0.85 0.35 0.45 0.685 0.86
9 0 0.75 0.85 0.65 0.45 0.685 0.12
10 1 0.75 0.85 0.65 0.55 0.95 0.86
11 1 0.75 0.85 0.65 0.55 0.95 0.86
12 1 0.75 0.85 0.65 0.55 0.685 0.12
13 0 0.75 0.85 0.35 0.55 0.95 0.12
14 1 0.75 0.85 0.65 0.45 0.95 0.12
15 0 0.75 0.85 0.65 0.45 0.95 0.86
16 1 0.75 0.85 0.65 0.45 0.95 0.86
17 0 0.75 0.85 0.65 0.45 0.95 0.12
18 1 0.25 0.15 0.35 0.45 0.95 0.12
19 1 0.75 0.85 0.65 0.55 0.685 0.86
20 1 0.25 0.85 0.65 0.45 0.95 0.12
The test data is as follows:
id label a b c d e f
1 1 0.25 0.15 0.35 0.45 0.95 0.12
Answers
Hi,
Check whether the XML code below returns the results you expect.
Best regards
Marcin
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
<process expanded="true" height="533" width="705">
<operator activated="true" class="read_csv" compatibility="5.1.008" expanded="true" height="60" name="Read Train Data" width="90" x="45" y="30">
<parameter key="csv_file" value="/home/marcin/temp/forum-question/train.csv"/>
<parameter key="column_separators" value="\s+"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="read_csv" compatibility="5.1.008" expanded="true" height="60" name="Read Test Data" width="90" x="45" y="120">
<parameter key="csv_file" value="/home/marcin/temp/forum-question/test.csv"/>
<parameter key="column_separators" value="\s+"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="loop_parameters" compatibility="5.1.008" expanded="true" height="94" name="Learn Model With Different Parameters" width="90" x="246" y="30">
<list key="parameters">
<parameter key="SVM.kernel_type" value="radial,polynomial"/>
<parameter key="SVM.C" value="0.01,0.1"/>
</list>
<process expanded="true" height="533" width="547">
<operator activated="true" class="support_vector_machine" compatibility="5.1.008" expanded="true" height="112" name="SVM" width="90" x="112" y="30">
<parameter key="kernel_type" value="polynomial"/>
<parameter key="C" value="0.1"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="179" y="255">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="multiply" compatibility="5.1.008" expanded="true" height="94" name="Multiply" width="90" x="313" y="75"/>
<operator activated="true" class="performance" compatibility="5.1.008" expanded="true" height="76" name="Performance" width="90" x="447" y="30"/>
<connect from_port="input 1" to_op="SVM" to_port="training set"/>
<connect from_port="input 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="SVM" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Performance" to_port="labelled data"/>
<connect from_op="Multiply" from_port="output 2" to_port="result 1"/>
<connect from_op="Performance" from_port="performance" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="source_input 3" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<connect from_op="Read Train Data" from_port="output" to_op="Learn Model With Different Parameters" to_port="input 1"/>
<connect from_op="Read Test Data" from_port="output" to_op="Learn Model With Different Parameters" to_port="input 2"/>
<connect from_op="Learn Model With Different Parameters" from_port="result 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Hello Marcin,
Thanks for looking into my problem and for replying.
I copied the process you sent and ran the simulation. I now get output for more than one model. The module that really seems to make the difference in getting the predicted output of more than one model displayed is "Multiply"; the "Performance" module does not seem to make any difference in this regard.
The "Multiply" module seems to act like some sort of buffer that produces the predicted output as many times as there are parameter combinations. But on closer inspection, the predicted output of alternate models is the same, i.e. the predicted output repeats across different parameter combinations. For example, with 8 models for 8 parameter combinations, there are only two distinct predicted values among the 8.
To find out whether the predicted output is really like that, I generated a data set with more examples and a label that is a more complex function. I modified the simulation so that the output for the data set is predicted with "Apply Model" inside "Loop Parameters". I also passed the generated models outside "Loop Parameters" and looped over them with "Loop Collection". Also, to make any subtleties visible, I increased the number of parameter combinations to 8.
With the simulation set up as above, only 2 distinct predicted values come out of "Loop Parameters" for the 8 models. For "Loop Collection", there are 6 distinct values in total for the 8 models, which seem more like the weights generated by the different models.
The "Loop Collection" module seems to be a workaround for "Loop Parameters" not acting on all of its inner modules, i.e. the loop seems to work only on the first inner module. Please note that "Loop Parameter Parallel" does not work even on the first inner module.
Please let me know if I am doing something wrong or if you have any suggestions for solving this problem.
Thanks,
Ajay
The output of my simulation is as follows:
Loop Collection
0.996
-9.082
0.964
-26.026
0.996
74.939
0.964
116.079
Learn Model With Different Parameters
0.964
116.079
0.964
116.079
0.964
116.079
0.964
116.079
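The counts quoted above (6 distinct values from "Loop Collection", 2 from "Loop Parameters") can be checked with a few lines of Python. This is only a sketch; it assumes that each line of the listing is the prediction of one model, in the order posted. The helper `distinct_in_order` is a name made up for this check.

```python
# Values copied from the output listing, in the order posted.
loop_collection = [0.996, -9.082, 0.964, -26.026, 0.996, 74.939, 0.964, 116.079]
loop_parameters = [0.964, 116.079, 0.964, 116.079, 0.964, 116.079, 0.964, 116.079]

def distinct_in_order(values):
    """Return the distinct values, keeping the order of first appearance."""
    seen = []
    for value in values:
        if value not in seen:
            seen.append(value)
    return seen

print(len(distinct_in_order(loop_collection)))  # distinct values from "Loop Collection"
print(len(distinct_in_order(loop_parameters)))  # distinct values from the parameter loop
```

One more thing the listing shows: the two values that the parameter loop keeps repeating (0.964 and 116.079) are exactly the last two values of the "Loop Collection" listing.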
The training data that I used for the simulation is as follows:
id label a b c d e f
1 0 0.75 0.85 0.65 0.45 0.95 0.12
2 1 0.75 0.85 0.65 0.55 0.685 0.86
3 1 0.75 0.85 0.65 0.45 0.95 0.12
4 1 0.75 0.85 0.65 0.55 0.95 0.12
5 1 0.75 0.85 0.65 0.45 0.95 0.12
6 0 0.75 0.15 0.65 0.55 0.685 0.12
7 1 0.75 0.85 0.35 0.45 0.685 0.12
8 0 0.25 0.15 0.35 0.45 0.95 0.12
9 1 0.75 0.85 0.35 0.45 0.685 0.12
10 1 0.25 0.15 0.35 0.55 0.95 0.12
11 1 0.25 0.85 0.35 0.45 0.685 0.12
12 0 0.75 0.85 0.65 0.45 0.685 0.12
13 1 0.75 0.15 0.65 0.55 0.685 0.12
14 1 0.75 0.85 0.35 0.55 0.95 0.12
15 0 0.75 0.85 0.35 0.45 0.95 0.12
16 0 0.75 0.15 0.65 0.45 0.95 0.12
17 1 0.25 0.85 0.65 0.45 0.95 0.12
18 1 0.25 0.85 0.35 0.45 0.95 0.12
19 1 0.75 0.85 0.65 0.55 0.95 0.86
20 0 0.75 0.85 0.65 0.45 0.685 0.86
21 0 0.75 0.85 0.65 0.45 0.685 0.86
22 1 0.75 0.85 0.65 0.45 0.685 0.86
23 0 0.75 0.85 0.65 0.45 0.685 0.86
24 0 0.75 0.85 0.65 0.45 0.685 0.86
25 0 0.75 0.85 0.65 0.45 0.685 0.86
26 1 0.75 0.85 0.65 0.45 0.685 0.86
27 0 0.75 0.85 0.65 0.45 0.685 0.86
28 1 0.75 0.85 0.65 0.45 0.685 0.86
29 0 0.75 0.85 0.65 0.45 0.685 0.86
30 1 0.75 0.85 0.65 0.45 0.685 0.86
31 1 0.75 0.85 0.65 0.45 0.685 0.86
32 1 0.75 0.85 0.65 0.45 0.685 0.86
33 1 0.75 0.85 0.65 0.45 0.685 0.86
34 0 0.75 0.85 0.65 0.45 0.685 0.86
35 0 0.75 0.85 0.65 0.45 0.685 0.86
36 0 0.75 0.85 0.65 0.45 0.685 0.86
37 0 0.75 0.85 0.65 0.45 0.685 0.86
38 1 0.75 0.85 0.65 0.45 0.685 0.86
39 1 0.75 0.85 0.65 0.45 0.685 0.86
40 1 0.75 0.85 0.65 0.45 0.685 0.86
41 0 0.75 0.85 0.65 0.45 0.685 0.86
42 1 0.75 0.85 0.65 0.45 0.685 0.86
43 1 0.75 0.85 0.65 0.45 0.685 0.86
44 1 0.75 0.85 0.65 0.45 0.685 0.86
45 1 0.75 0.85 0.65 0.45 0.685 0.86
46 0 0.75 0.85 0.65 0.45 0.685 0.86
47 1 0.75 0.85 0.65 0.45 0.685 0.86
48 1 0.75 0.85 0.65 0.45 0.685 0.86
49 1 0.75 0.85 0.65 0.45 0.685 0.86
50 1 0.75 0.85 0.65 0.45 0.685 0.86
The test data is as follows:
id label a b c d e f
1 0 0.12 0.95 1 0.75 2 0.3
The XML generated by RapidMiner for my simulation is as follows:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
<process expanded="true" height="533" width="705">
<operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" height="60" name="Read Train Data" width="90" x="45" y="30">
<parameter key="csv_file" value="/home/Ajay/learnRapidMiner/train.csv"/>
<parameter key="column_separators" value=","/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" height="60" name="Read Test Data" width="90" x="45" y="120">
<parameter key="csv_file" value="/home/Ajay/learnRapidMiner/test.csv"/>
<parameter key="column_separators" value=","/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="loop_parameters" compatibility="5.1.006" expanded="true" height="94" name="Learn Model With Different Parameters" width="90" x="246" y="30">
<list key="parameters">
<parameter key="SVM.kernel_type" value="radial,polynomial"/>
<parameter key="SVM.C" value="0.01,0.1"/>
<parameter key="SVM.kernel_degree" value="3,4"/>
<parameter key="SVM.convergence_epsilon" value="0.001"/>
</list>
<process expanded="true" height="533" width="547">
<operator activated="true" class="support_vector_machine" compatibility="5.1.006" expanded="true" height="112" name="SVM" width="90" x="112" y="30">
<parameter key="kernel_type" value="polynomial"/>
<parameter key="kernel_degree" value="4"/>
<parameter key="C" value="0.1"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.1.006" expanded="true" height="76" name="Apply Model" width="90" x="179" y="255">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="multiply" compatibility="5.1.006" expanded="true" height="94" name="Multiply" width="90" x="313" y="75"/>
<operator activated="true" class="performance" compatibility="5.1.006" expanded="true" height="76" name="Performance" width="90" x="447" y="30"/>
<connect from_port="input 1" to_op="SVM" to_port="training set"/>
<connect from_port="input 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="SVM" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Multiply" to_port="input"/>
<connect from_op="Apply Model" from_port="model" to_port="result 2"/>
<connect from_op="Multiply" from_port="output 1" to_op="Performance" to_port="labelled data"/>
<connect from_op="Multiply" from_port="output 2" to_port="result 1"/>
<connect from_op="Performance" from_port="performance" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="source_input 3" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
<operator activated="true" class="multiply" compatibility="5.1.006" expanded="true" height="94" name="Multiply (2)" width="90" x="380" y="75"/>
<operator activated="true" class="loop_collection" compatibility="5.1.006" expanded="true" height="76" name="Loop Collection" width="90" x="514" y="165">
<process expanded="true" height="311" width="689">
<operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" height="60" name="Read Test Data (2)" width="90" x="23" y="94">
<parameter key="csv_file" value="/home/Ajay/learnRapidMiner/test.csv"/>
<parameter key="column_separators" value=","/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="apply_model" compatibility="5.1.006" expanded="true" height="76" name="Apply Model (2)" width="90" x="246" y="30">
<list key="application_parameters"/>
</operator>
<connect from_port="single" to_op="Apply Model (2)" to_port="model"/>
<connect from_op="Read Test Data (2)" from_port="output" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_port="output 1"/>
<portSpacing port="source_single" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
</operator>
<connect from_op="Read Train Data" from_port="output" to_op="Learn Model With Different Parameters" to_port="input 1"/>
<connect from_op="Read Test Data" from_port="output" to_op="Learn Model With Different Parameters" to_port="input 2"/>
<connect from_op="Learn Model With Different Parameters" from_port="result 1" to_port="result 1"/>
<connect from_op="Learn Model With Different Parameters" from_port="result 2" to_op="Multiply (2)" to_port="input"/>
<connect from_op="Multiply (2)" from_port="output 1" to_op="Loop Collection" to_port="collection"/>
<connect from_op="Multiply (2)" from_port="output 2" to_port="result 2"/>
<connect from_op="Loop Collection" from_port="output 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
Hey Aj,
I am in a hurry right now, but I will come back to you later.
To clarify the matter, can you explain what you want to do and where exactly your problem is? As far as I have understood, you want to:
- Learn X different models with X different parameter sets (using an SVM for regression)
- Apply each of these X models to the test data
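Just to pin down what I mean by those two steps, here is a rough sketch in Python (not RapidMiner code). The parameter values are the ones from the "Learn Model With Different Parameters" operator in the posted XML; `train`, `apply_model` and `run_grid` are hypothetical stand-ins I made up for the SVM and Apply Model operators, and the toy rows are placeholders, not your data. The point is only that each parameter combination gets its own model and its own scored test set, collected in a list.

```python
from itertools import product

# Parameter grid copied from the loop_parameters operator in the posted process.
grid = {
    "kernel_type": ["radial", "polynomial"],
    "C": [0.01, 0.1],
    "kernel_degree": [3, 4],
    "convergence_epsilon": [0.001],
}

def train(params, train_rows):
    # Hypothetical stand-in for the SVM operator: it only records its settings.
    return {"params": params, "n_train": len(train_rows)}

def apply_model(model, test_rows):
    # Hypothetical stand-in for Apply Model: pairs each test row with the model's settings.
    return [(row, model["params"]) for row in test_rows]

def run_grid(train_rows, test_rows):
    # One (parameters, model, scored test data) triple per parameter combination.
    results = []
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        model = train(params, train_rows)
        results.append((params, model, apply_model(model, test_rows)))
    return results

# Placeholder rows, only to make the sketch runnable.
results = run_grid(train_rows=[1, 2, 3], test_rows=[10, 20])
print(len(results))  # 8 combinations: 2 kernels x 2 C values x 2 degrees x 1 epsilon
```

If that matches your intent, the expected result is a collection with eight scored copies of the test data, one per combination.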
Marcin