Select inside Loop Parameters module not working
Hello,
I am trying to run the following simulation:
1) Execute a model (SVM or any other model) with different combinations of parameters.
2) Get the model for each parameter set as output for the given training data.
3) Get the predicted outcome of the test data for each model corresponding to a parameter set combination.
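The three steps above amount to the loop sketched below. This is plain Python rather than RapidMiner, intended only to pin down the expected result: one model and one set of predictions per parameter combination. The parameter values are the ones set on "Loop Parameters" in the XML further down; `train` and `apply_model` are hypothetical placeholders, not real SVM code.

```python
from itertools import product

# Parameter values taken from the "Loop Parameters" settings in the process XML.
param_grid = {
    "kernel_type": ["radial", "polynomial"],
    "C": [0.01, 0.1],
}

def train(training_data, params):
    """Hypothetical stand-in for the SVM operator: returns a dummy 'model' record."""
    return {"params": params, "n_train": len(training_data)}

def apply_model(model, test_data):
    """Hypothetical stand-in for Apply Model: one dummy prediction row per test row."""
    return [{"id": row["id"], "model_params": model["params"]} for row in test_data]

def run_grid(training_data, test_data):
    """Train and apply one model per parameter combination, keeping every result."""
    names = list(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        model = train(training_data, params)
        predictions = apply_model(model, test_data)
        results.append((params, model, predictions))
    return results

# Dummy data with the same row counts as the train (20 rows) and test (1 row) sheets.
training_data = [{"id": i} for i in range(1, 21)]
test_data = [{"id": 1}]

results = run_grid(training_data, test_data)
for params, _model, predictions in results:
    print(params, len(predictions))
```

With two kernel types and two values of C, the desired outcome is four models and four prediction sets, not one.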
I used the "Loop Parameters" module with an SVM module inside it. With this setup I get one model output for each combination of parameters, as expected. (This corresponds to the "Loop Parameters" module in the attached XML code.)
When I apply the model generated by the SVM to the test data using "Apply Model" inside the "Loop Parameters (2)" module, I get the test data output for only one model and not for each of the models. (This setup corresponds to "Loop Parameters (2)" in the attached XML code.)
I also tried a different approach to get an output for each generated model: I pass the "Collection" of models as input to "Loop Parameters", pick each model in turn using "Select", and pass that model to "Apply Model". I still get only one output instead of one per loop iteration. (This setup corresponds to the "Loop Parameters (3)" module in the attached XML code.)
I noticed that the single prediction output coming out of the "Loop Parameters" module corresponds to the model at the default index of "Select". This suggests that "Select.index" is not being changed by the loop. Accepting that only one output will come out of the "Loop Parameters" module, I tried writing the results to a file, a database, etc. from inside the module. That does not work either: only one output gets written, not one for each generated model.
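For reference, "Select.index" is configured in the XML below with the grid value `[1.0;4;4;linear]`. The Python sketch below shows how such a grid might expand, under the assumption that the format is `[min;max;steps;scale]` and that a linear grid yields `steps + 1` evenly spaced values. That reading has not been confirmed against RapidMiner 5.1 here, and `expand_linear_grid` is a hypothetical helper, not a RapidMiner function.

```python
def expand_linear_grid(spec):
    """Expand a grid string like "[1.0;4;4;linear]" into a list of values.

    Assumption: the fields are min;max;steps;scale and a linear grid
    produces steps + 1 evenly spaced values from min to max inclusive.
    """
    lo, hi, steps, scale = spec.strip("[]").split(";")
    if scale != "linear":
        raise ValueError("only the linear scale is handled in this sketch")
    lo, hi, steps = float(lo), float(hi), int(steps)
    return [lo + (hi - lo) * i / steps for i in range(steps + 1)]

values = expand_linear_grid("[1.0;4;4;linear]")
print(values)
```

If that assumption holds, the grid contains non-integer values between the four model indices, which would be worth checking when looking at how "Select" is driven by the loop.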
The only way to get the test data output for multiple parameter sets out of the "Loop Parameters" module seems to be an "IOObjectCollection". However, I cannot find a way to convert an "IOObjectCollection" to an "ExampleSet". Does anyone know how to do this conversion?
Could someone please point out what I am doing wrong, or suggest something to try or an alternative way of solving the problem?
Thanks,
Ajay
My XML code is as follows:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
<process expanded="true" height="404" width="642">
<operator activated="true" class="read_excel" compatibility="5.1.006" expanded="true" height="60" name="Read Excel" width="90" x="45" y="30">
<parameter key="excel_file" value="/home/Ajay/learnRapidMiner/learnData.xls"/>
<parameter key="imported_cell_range" value="A1:H21"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="multiply" compatibility="5.1.006" expanded="true" height="94" name="Multiply" width="90" x="179" y="75"/>
<operator activated="true" class="loop_parameters" compatibility="5.1.006" expanded="true" height="76" name="Loop Parameters" width="90" x="380" y="30">
<list key="parameters">
<parameter key="SVM.kernel_type" value="radial,polynomial"/>
<parameter key="SVM.C" value="0.01,0.1"/>
</list>
<process expanded="true" height="341" width="660">
<operator activated="true" class="support_vector_machine" compatibility="5.1.006" expanded="true" height="112" name="SVM" width="90" x="170" y="47">
<parameter key="kernel_type" value="polynomial"/>
<parameter key="C" value="0.1"/>
</operator>
<connect from_port="input 1" to_op="SVM" to_port="training set"/>
<connect from_op="SVM" from_port="model" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="read_excel" compatibility="5.1.006" expanded="true" height="60" name="Read Excel (2)" width="90" x="38" y="163">
<parameter key="excel_file" value="/home/Ajay/learnRapidMiner/learnData.xls"/>
<parameter key="sheet_number" value="2"/>
<parameter key="imported_cell_range" value="A1:H21"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="id.true.integer.id"/>
<parameter key="1" value="label.true.integer.label"/>
<parameter key="2" value="a.true.real.attribute"/>
<parameter key="3" value="b.true.real.attribute"/>
<parameter key="4" value="c.true.real.attribute"/>
<parameter key="5" value="d.true.real.attribute"/>
<parameter key="6" value="e.true.real.attribute"/>
<parameter key="7" value="f.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="multiply" compatibility="5.1.006" expanded="true" height="94" name="Multiply (2)" width="90" x="511" y="30"/>
<operator activated="true" class="multiply" compatibility="5.1.006" expanded="true" height="94" name="Multiply (3)" width="90" x="179" y="210"/>
<operator activated="true" class="loop_parameters" compatibility="5.1.006" expanded="true" height="94" name="Loop Parameters (2)" width="90" x="380" y="165">
<list key="parameters">
<parameter key="SVM.kernel_type" value="radial,polynomial"/>
<parameter key="SVM.C" value="0.01,0.1"/>
</list>
<process expanded="true" height="341" width="660">
<operator activated="true" class="support_vector_machine" compatibility="5.1.006" expanded="true" height="112" name="SVM (2)" width="90" x="179" y="30"/>
<operator activated="true" class="apply_model" compatibility="5.1.006" expanded="true" height="76" name="Apply Model" width="90" x="374" y="101">
<list key="application_parameters"/>
</operator>
<connect from_port="input 1" to_op="SVM (2)" to_port="training set"/>
<connect from_port="input 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="SVM (2)" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="source_input 3" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="loop_parameters" compatibility="5.1.006" expanded="true" height="94" name="Loop Parameters (3)" width="90" x="514" y="300">
<list key="parameters">
<parameter key="Select.index" value="[1.0;4;4;linear]"/>
</list>
<process expanded="true" height="341" width="642">
<operator activated="true" class="select" compatibility="5.1.006" expanded="true" height="60" name="Select" width="90" x="112" y="30"/>
<operator activated="true" class="apply_model" compatibility="5.1.006" expanded="true" height="76" name="Apply Model (2)" width="90" x="352" y="39">
<list key="application_parameters"/>
</operator>
<connect from_port="input 1" to_op="Select" to_port="collection"/>
<connect from_port="input 2" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="Select" from_port="selected" to_op="Apply Model (2)" to_port="model"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="source_input 3" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Loop Parameters" to_port="input 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Loop Parameters (2)" to_port="input 1"/>
<connect from_op="Loop Parameters" from_port="result 1" to_op="Multiply (2)" to_port="input"/>
<connect from_op="Read Excel (2)" from_port="output" to_op="Multiply (3)" to_port="input"/>
<connect from_op="Multiply (2)" from_port="output 1" to_port="result 1"/>
<connect from_op="Multiply (2)" from_port="output 2" to_op="Loop Parameters (3)" to_port="input 1"/>
<connect from_op="Multiply (3)" from_port="output 1" to_op="Loop Parameters (2)" to_port="input 2"/>
<connect from_op="Multiply (3)" from_port="output 2" to_op="Loop Parameters (3)" to_port="input 2"/>
<connect from_op="Loop Parameters (2)" from_port="result 1" to_port="result 2"/>
<connect from_op="Loop Parameters (3)" from_port="result 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
The training data is as follows:
id label a b c d e f
1 0 0.25 0.85 0.65 0.45 0.95 0.12
2 1 0.75 0.85 0.35 0.55 0.95 0.86
3 1 0.75 0.85 0.65 0.55 0.95 0.12
4 1 0.75 0.85 0.65 0.55 0.685 0.12
5 1 0.75 0.85 0.65 0.45 0.95 0.12
6 0 0.75 0.85 0.65 0.55 0.95 0.86
7 1 0.75 0.15 0.65 0.45 0.95 0.12
8 1 0.25 0.85 0.35 0.45 0.685 0.86
9 0 0.75 0.85 0.65 0.45 0.685 0.12
10 1 0.75 0.85 0.65 0.55 0.95 0.86
11 1 0.75 0.85 0.65 0.55 0.95 0.86
12 1 0.75 0.85 0.65 0.55 0.685 0.12
13 0 0.75 0.85 0.35 0.55 0.95 0.12
14 1 0.75 0.85 0.65 0.45 0.95 0.12
15 0 0.75 0.85 0.65 0.45 0.95 0.86
16 1 0.75 0.85 0.65 0.45 0.95 0.86
17 0 0.75 0.85 0.65 0.45 0.95 0.12
18 1 0.25 0.15 0.35 0.45 0.95 0.12
19 1 0.75 0.85 0.65 0.55 0.685 0.86
20 1 0.25 0.85 0.65 0.45 0.95 0.12
The test data is as follows:
id label a b c d e f
1 1 0.25 0.15 0.35 0.45 0.95 0.12