Hi,
I'd like to perform a parameter optimization where the quality of the model and its parameters
are not evaluated by the standard performance values like "absolute_error" but by a customized
measure which is generated by an external tool.
My idea looks as follows (not yet complete; based on the first example in the sample directory "07_Meta"):
<operator name="Root" class="Process" expanded="yes">
  <operator name="Input" class="ExampleSource">
    <parameter key="attributes" value="../data/polynomial.aml"/>
  </operator>
  <operator name="ParameterOptimization" class="GridParameterOptimization" expanded="yes">
    <list key="parameters">
      <parameter key="Training.C" value="50,100,150,200,250"/>
      <parameter key="Training.degree" value="1,2,3,4,5"/>
    </list>
    <operator name="Validation" class="XValidation" expanded="yes">
      <parameter key="sampling_type" value="shuffled sampling"/>
      <operator name="Training" class="LibSVMLearner">
        <parameter key="C" value="50"/>
        <parameter key="degree" value="1"/>
        <parameter key="epsilon" value="0.01"/>
        <parameter key="kernel_type" value="poly"/>
        <parameter key="svm_type" value="epsilon-SVR"/>
      </operator>
      <operator name="ApplierChain" class="OperatorChain" expanded="yes">
        <operator name="ModelWriter" class="ModelWriter">
          <parameter key="model_file" value="/tmp/mymodel.mod"/>
          <parameter key="output_type" value="XML"/>
        </operator>
        <operator name="CommandLineOperator" class="CommandLineOperator">
          <parameter key="command" value="INVOKE EXTERNAL TOOL"/>
        </operator>
        <operator name="Replace1" class="ModelApplier">
          <list key="application_parameters">
          </list>
        </operator>
        <operator name="Replace2" class="RegressionPerformance">
          <parameter key="absolute_error" value="true"/>
          <parameter key="normalized_absolute_error" value="true"/>
          <parameter key="root_mean_squared_error" value="true"/>
          <parameter key="squared_error" value="true"/>
        </operator>
      </operator>
    </operator>
  </operator>
  <operator name="ParameterSetWriter" class="ParameterSetWriter">
    <parameter key="parameter_file" value="/tmp/parameters.par"/>
  </operator>
</operator>
A short explanation:
The idea is to write each model (trained with the parameters currently under consideration) to a file ("ModelWriter"),
where an external application (invoked via "CommandLineOperator") uses it to generate a
performance value. I'll skip the details: just assume that the performance of the current model
is dumped into a file from which RapidMiner should read it. The cross-validation is then performed
on these values. So basically RapidMiner's "ModelApplier" and the performance operator that generates the performance
vector must be replaced by a call to an external tool.
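To make the hand-off concrete, here is a minimal sketch in Python of the file contract only, not of RapidMiner itself. Everything in it is an assumption made for illustration: the function names, the dictionary layout standing in for a performance criterion, and the convention that the external tool writes exactly one integer as text. It just shows what would have to be read back, and that the value is to be minimized when parameter settings are compared.

```python
from pathlib import Path


def write_score(path, score):
    """Hypothetical stand-in for the external tool: dump one integer to a text file."""
    Path(path).write_text(f"{int(score)}\n")


def read_score(path):
    """Read the single integer back; int() raises ValueError on anything else."""
    return int(Path(path).read_text().strip())


def to_criterion(score, name="external_score"):
    """Wrap the score as a named criterion (assumed layout) where smaller is better."""
    return {"name": name, "value": float(score), "minimize": True}


def best_setting(results):
    """Return the parameter setting with the smallest mean score over its folds.

    `results` maps a parameter setting to the list of per-fold scores.
    """
    return min(results, key=lambda params: sum(results[params]) / len(results[params]))
```

In this sketch each cross-validation fold would call `read_score` once after the external tool has run, and the grid search would then pick the setting returned by `best_setting`.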
My question:
How can I integrate the results of an external tool into the XValidation chain? Obviously, the two operators named
"Replace1/2" must be replaced by an operator that reads the custom performance value (an integer, the
smaller the better) from a file generated by the external tool and transforms it into a valid "PerformanceVector",
which is then used by the XValidation operator. Reading from a file and translating the integer value into
a performance vector are the two issues I haven't been able to solve yet. :-)
Do you have any ideas how to accomplish this?
Thank you in advance.