Grid Parameter Optimization
lexusboy
New Altair Community Member
Hello,
I am a little confused by the behavior of the Grid Parameter Optimization operator which, as I understand it, performs a grid search over a list of parameter values for a particular machine learning algorithm (e.g. LibSVM) on a particular set of data, and gives you the best (or optimal) parameter value for that data.
When I ran my tests, I also used the ProcessLog operator to track the values as the tests went on. Below is the output from one such test: (1) the ProcessLog output and (2) the parameter set written by the grid parameter operator. From the output you can clearly see that the C parameter values 128 and 0.5 yield the best results, but strangely the grid parameter operator gives "64" as the optimal parameter value. This is just one example; I have many more test outputs that show similar results. Could somebody please explain this? Thanks!
1)
# Generated by ProcessLog[com.rapidminer.operator.visualization.ProcessLogOperator]
# ClassificationAccuracy C Parameter
NaN 0.03125
0.4875 0.0625
0.5 0.125
0.5 0.25
0.7125 0.5
0.65 1.0
0.5875 2.0
0.6 4.0
0.6625 8.0
0.6375 16.0
0.6625 32.0
0.625 64.0
0.725 128.0
0.6625 256.0
0.625 512.0
0.6625 1024.0
0.6875 2048.0
0.6375 4096.0
0.6375 8192.0
0.6625 16384.0
0.6875 32768.0
2)
<?xml version="1.0" encoding="windows-1252"?>
<parameterset version="4.6">
<parameter operator="LibSVMLearner" key="C" value="64"/>
</parameterset>
Answers
Hi there,
Sounds interesting, perhaps you could post the XML for the process?
Hi Haddock,
Here is the XML of the process:
<operator name="Root" class="Process" expanded="yes">
  <operator name="ExampleSource" class="ExampleSource">
    <parameter key="attributes" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\test_2_svm_negative_out_200_tf_idf.aml"/>
  </operator>
  <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="yes">
    <list key="parameters">
      <parameter key="LibSVMLearner.C" value="0.03125,0.0625,0.125,0.25,0.5,1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,32768"/>
    </list>
    <operator name="ProcessLog" class="ProcessLog">
      <parameter key="filename" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\processLog.log"/>
      <list key="log">
        <parameter key="ClassificationAccuracy" value="operator.ClassificationPerformance.value.accuracy"/>
        <parameter key="C Parameter" value="operator.LibSVMLearner.parameter.C"/>
      </list>
      <parameter key="persistent" value="true"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
      <parameter key="number_of_validations" value="5"/>
      <operator name="LibSVMLearner" class="LibSVMLearner">
        <parameter key="C" value="32768"/>
        <list key="class_weights">
        </list>
      </operator>
      <operator name="OperatorChain" class="OperatorChain" expanded="yes">
        <operator name="ModelApplier" class="ModelApplier">
          <list key="application_parameters">
          </list>
        </operator>
        <operator name="ClassificationPerformance" class="ClassificationPerformance">
          <parameter key="main_criterion" value="accuracy"/>
          <parameter key="accuracy" value="true"/>
          <parameter key="classification_error" value="true"/>
          <parameter key="weighted_mean_recall" value="true"/>
          <parameter key="weighted_mean_precision" value="true"/>
          <parameter key="absolute_error" value="true"/>
          <parameter key="relative_error" value="true"/>
          <list key="class_weights">
          </list>
        </operator>
      </operator>
    </operator>
  </operator>
  <operator name="ParameterSetWriter" class="ParameterSetWriter">
    <parameter key="parameter_file" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\gridParameters.par"/>
  </operator>
</operator>
Thanks for looking into this.
Hola lexusboy,
I think you just have the log in the wrong place: it needs to come after the validation rather than before it. That is why your earlier version did not provide a performance figure for the first pass:
# ClassificationAccuracy C Parameter
NaN 0.03125
To save you the bother, here is the process with the log moved:
<operator name="Root" class="Process" expanded="yes">
  <operator name="ExampleSource" class="ExampleSource" activated="no">
    <parameter key="attributes" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\test_2_svm_negative_out_200_tf_idf.aml"/>
  </operator>
  <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
    <parameter key="target_function" value="random"/>
  </operator>
  <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
    <parameter key="condition_class" value="attribute_name_filter"/>
    <parameter key="parameter_string" value="label"/>
    <parameter key="attribute_name_regex" value="label"/>
    <parameter key="process_special_attributes" value="true"/>
    <operator name="BinDiscretization" class="BinDiscretization">
      <parameter key="range_name_type" value="short"/>
    </operator>
  </operator>
  <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="yes">
    <list key="parameters">
      <parameter key="LibSVMLearner.C" value="0.03125,0.0625,0.125,0.25,0.5,1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,32768"/>
    </list>
    <operator name="XValidation" class="XValidation" expanded="no">
      <parameter key="number_of_validations" value="5"/>
      <operator name="LibSVMLearner" class="LibSVMLearner">
        <parameter key="C" value="32768"/>
        <list key="class_weights">
        </list>
      </operator>
      <operator name="OperatorChain" class="OperatorChain" expanded="yes">
        <operator name="ModelApplier" class="ModelApplier">
          <list key="application_parameters">
          </list>
        </operator>
        <operator name="ClassificationPerformance" class="ClassificationPerformance">
          <parameter key="main_criterion" value="accuracy"/>
          <parameter key="accuracy" value="true"/>
          <parameter key="classification_error" value="true"/>
          <parameter key="weighted_mean_recall" value="true"/>
          <parameter key="weighted_mean_precision" value="true"/>
          <parameter key="absolute_error" value="true"/>
          <parameter key="relative_error" value="true"/>
          <list key="class_weights">
          </list>
        </operator>
      </operator>
    </operator>
    <operator name="ProcessLog" class="ProcessLog">
      <parameter key="filename" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\processLog.log"/>
      <list key="log">
        <parameter key="ClassificationAccuracy" value="operator.ClassificationPerformance.value.accuracy"/>
        <parameter key="C Parameter" value="operator.LibSVMLearner.parameter.C"/>
      </list>
      <parameter key="persistent" value="true"/>
    </operator>
  </operator>
  <operator name="ParameterSetWriter" class="ParameterSetWriter">
    <parameter key="parameter_file" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\gridParameters.par"/>
  </operator>
</operator>
Hi Haddock,
Yes, that's what was wrong. Thanks for your help.
Cheers
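For anyone landing here with the same puzzle, the original log can be checked against the reported optimum with a small script. This is only a sketch, and it rests on one assumption that the thread does not state outright: when ProcessLog runs before XValidation, it records the current C together with the accuracy left over from the previous grid step, so every accuracy in the log belongs to the C value one row above it. The helper names `realign` and `best` are made up for this illustration and are not part of RapidMiner.

```python
import math

# (logged accuracy, C) pairs copied from the ProcessLog output in the question.
logged = [
    (math.nan, 0.03125), (0.4875, 0.0625), (0.5, 0.125), (0.5, 0.25),
    (0.7125, 0.5), (0.65, 1.0), (0.5875, 2.0), (0.6, 4.0),
    (0.6625, 8.0), (0.6375, 16.0), (0.6625, 32.0), (0.625, 64.0),
    (0.725, 128.0), (0.6625, 256.0), (0.625, 512.0), (0.6625, 1024.0),
    (0.6875, 2048.0), (0.6375, 4096.0), (0.6375, 8192.0),
    (0.6625, 16384.0), (0.6875, 32768.0),
]


def realign(rows):
    """Pair each C with the accuracy logged one row later (assumed shift)."""
    accuracies = [acc for acc, _ in rows]
    c_values = [c for _, c in rows]
    # The accuracy for the last C value was never written to the log,
    # so that C value is dropped from the realigned result.
    return list(zip(accuracies[1:], c_values[:-1]))


def best(rows):
    """Return the (accuracy, C) pair with the highest non-NaN accuracy."""
    return max((r for r in rows if not math.isnan(r[0])), key=lambda r: r[0])


print("as logged:", best(logged))
print("realigned:", best(realign(logged)))
```

Read as logged, the best row is C = 128 with accuracy 0.725. After shifting the accuracy column up by one row, the same 0.725 lines up with C = 64, which is the value the grid operator wrote to the parameter set. Under the stated assumption the operator was right all along and only the log was off by one step.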