Hi,
I was testing the features of RM5 over the weekend and got a process working that is intended to optimize
the parameters of a neural network for prediction. (As an example I used this topic, comparing the RM4 tree model with the one I used in RM5:
http://rapid-i.com/rapidforum/index.php/topic,615.msg2350.html#msg2350 )
First of all, it should optimize the learning rate (LR) and momentum (M) for the MLP network.
If I choose a "strange" combination of grid ranges for LR and M, the process stops during execution and an error is thrown saying "Cannot reset network to a smaller learning rate". (I have already submitted a bug report for that.)
BTW, for the RBF network it works just fine!
1. It would be nice to know whether I picked a wrong combination of parameter ranges (0.1-0.8 for both LR and M, with 10 steps) and whether that causes this error.
2. Can I optimize the number of neurons as well? (In the parameter wizard the "hidden_layer" is grayed out)
3. I replace the missing values in front of the grid optimization and not within the nested process. RapidMiner flags that as an error in the Problems tab at the bottom, but it seems to work. Is it bad practice to place preprocessing operators outside of the nested operators?
4. I haven't quite understood how I can use the "optimized parameters" further. Where are these optimized parameters stored, and how can I access them? During the optimization the process does keep the best parameters found so far for later use, doesn't it?
There is a Set Parameters operator, but it only has an input port and no output. I tried to use its name_map parameter to map the parameters from the learner inside the optimization process to a learner outside, but it does not really seem to work. Maybe I have not understood the concept behind it. Can someone explain it for beginners?
Which model does the parameter optimization operator return: the last model that was evaluated (at the maximum of the parameter range) or the one with the optimized parameters?
I am sorry to ask, but I am simply not clear on this.
Any answer is appreciated! Thanks in advance. The process XML follows:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
<context>
<input>
<location/>
</input>
<output>
<location/>
<location/>
<location/>
</output>
<macros/>
</context>
<operator activated="true" class="process" expanded="true" name="Process">
<process expanded="true" height="558" width="913">
<operator activated="true" class="retrieve" expanded="true" height="60" name="Retrieve" width="90" x="72" y="41">
<parameter key="repository_entry" value="VoidsBGA2007"/>
</operator>
<operator activated="true" class="nominal_to_numerical" expanded="true" height="94" name="Nominal to Numerical" width="90" x="179" y="120"/>
<operator activated="true" class="select_attributes" expanded="true" height="76" name="Select Attributes" width="90" x="112" y="255">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="BGA Standoff Soldered|BGA Standoff assembled|BGA Void mean|Solder paste|Soldering|Alloy compound|Wetting inner area mean|Wetting paste height"/>
</operator>
<operator activated="true" class="filter_examples" expanded="true" height="76" name="Filter Examples" width="90" x="112" y="435">
<parameter key="condition_class" value="missing_labels"/>
<parameter key="invert_filter" value="true"/>
</operator>
<operator activated="true" class="replace_missing_values" expanded="true" height="94" name="Replace Missing Values" width="90" x="246" y="435">
<list key="columns"/>
</operator>
<operator activated="true" class="multiply" expanded="true" height="94" name="Multiply" width="90" x="380" y="435"/>
<operator activated="true" class="optimize_parameters_grid" expanded="true" height="130" name="Optimize Parameters (Grid)" width="90" x="447" y="255">
<list key="parameters">
<parameter key="TrainBPNN local.learning_rate" value="[0.1;0.8;10;linear]"/>
<parameter key="TrainBPNN local.momentum" value="[0.1;0.8;10;linear]"/>
</list>
<parameter key="parallelize_optimization_process" value="true"/>
<process expanded="true" height="576" width="931">
<operator activated="true" class="x_validation" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
<parameter key="sampling_type" value="shuffled sampling"/>
<process expanded="true" height="576" width="440">
<operator activated="true" class="neural_net" expanded="true" height="76" name="TrainBPNN local" width="90" x="179" y="30">
<list key="hidden_layers">
<parameter key="H1" value="20"/>
</list>
<parameter key="learning_rate" value="0.66"/>
<parameter key="momentum" value="0.1"/>
</operator>
<connect from_port="training" to_op="TrainBPNN local" to_port="training set"/>
<connect from_op="TrainBPNN local" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="576" width="440">
<operator activated="true" class="apply_model" expanded="true" height="76" name="testBPNN local" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance" expanded="true" height="76" name="Performance" width="90" x="268" y="30">
<parameter key="use_example_weights" value="false"/>
</operator>
<connect from_port="model" to_op="testBPNN local" to_port="model"/>
<connect from_port="test set" to_op="testBPNN local" to_port="unlabelled data"/>
<connect from_op="testBPNN local" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<connect from_port="input 1" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="model" to_port="result 2"/>
<connect from_op="Validation" from_port="training" to_port="result 1"/>
<connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
<operator activated="true" class="set_parameters" expanded="true" height="60" name="Set Parameters" width="90" x="514" y="120">
<list key="name_map">
<parameter key="testBPNN local" value="BPNN Opt"/>
</list>
</operator>
<operator activated="true" class="neural_net" expanded="true" height="76" name="BPNN Opt" width="90" x="715" y="435">
<list key="hidden_layers">
<parameter key="H1" value="5"/>
</list>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
<connect from_op="Nominal to Numerical" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Replace Missing Values" to_port="example set input"/>
<connect from_op="Replace Missing Values" from_port="example set output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="BPNN Opt" to_port="training set"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_op="Set Parameters" to_port="parameter set"/>
<connect from_op="BPNN Opt" from_port="model" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
PS: I read the related topic here
http://rapid-i.com/rapidforum/index.php/topic,1279.msg4911.html#msg4911
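
A possible variant of the Set Parameters block from the process above, as an untested sketch. It assumes that each name_map key has to be the name of an operator whose parameters are part of the optimized parameter set (here TrainBPNN local, the learner whose learning_rate and momentum are varied), and that the value is the operator outside the optimization that should receive them (BPNN Opt). In the process above the key is testBPNN local, which is the Apply Model operator and has no optimized parameters. It is also an assumption that Set Parameters has to be executed before BPNN Opt for the values to take effect.

```xml
<operator activated="true" class="set_parameters" expanded="true" height="60" name="Set Parameters" width="90" x="514" y="120">
  <list key="name_map">
    <!-- Assumption: key = operator optimized inside the grid search, value = target learner outside it -->
    <parameter key="TrainBPNN local" value="BPNN Opt"/>
  </list>
</operator>
```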