"[SOLVED][HELP]How to implement user-based performance criteria??"
xiazhang3
New Altair Community Member
Hi,
How can I implement user-defined performance criteria? For example, I would like the sensitivity to be as high as possible, provided the false positive rate is no larger than 80%.
Or, I would like to use the weighted sum (0.8*sensitivity + 0.2*specificity) as my performance criterion.
Thanks a lot!!!
Xia
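For reference, the two criteria in the question can be written down outside RapidMiner. The following is only a minimal Python sketch computed from confusion-matrix counts. The function names and the counts are made up for illustration, and returning 0 when the false positive rate exceeds the limit is one possible way (an assumption, not something from this thread) to encode the constraint as a single number.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """True negative rate: share of actual negatives that were predicted negative."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def weighted_criterion(tp, fn, tn, fp, w_sens=0.8, w_spec=0.2):
    """Weighted sum w_sens * sensitivity + w_spec * specificity."""
    return w_sens * sensitivity(tp, fn) + w_spec * specificity(tn, fp)


def constrained_sensitivity(tp, fn, tn, fp, max_fpr=0.8):
    """Sensitivity if the false positive rate is within the limit, else 0 (assumed penalty)."""
    fpr = 1.0 - specificity(tn, fp)
    return sensitivity(tp, fn) if fpr <= max_fpr else 0.0


# Made-up counts, for illustration only.
counts = dict(tp=90, fn=10, tn=30, fp=70)
print(weighted_criterion(**counts))
print(constrained_sensitivity(**counts))
```

Either function yields one number per model, which is the form an optimizer needs in order to compare parameter settings.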
Answers
Hello
The "Extract Performance" operator may help.
At the risk of shameless self-promotion, here is a process that uses it.
http://rapidminernotes.blogspot.co.uk/search/label/ClusterValidity
regards
Andrew
You may also want to try "Combine Performances". See the attached process.
Best, Marius
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
<process expanded="true" height="296" width="681">
<operator activated="true" class="generate_data" compatibility="5.2.006" expanded="true" height="60" name="Generate Data" width="90" x="45" y="75">
<parameter key="target_function" value="random classification"/>
</operator>
<operator activated="true" class="naive_bayes" compatibility="5.2.006" expanded="true" height="76" name="Naive Bayes" width="90" x="179" y="75"/>
<operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="313" y="75">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="5.2.006" expanded="true" height="76" name="Performance (2)" width="90" x="447" y="75"/>
<operator activated="true" class="combine_performances" compatibility="5.2.006" expanded="true" height="60" name="Performance" width="90" x="581" y="75">
<list key="criteria_weights">
<parameter key="recall" value="0.8"/>
<parameter key="precision" value="0.2"/>
</list>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Naive Bayes" to_port="training set"/>
<connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Naive Bayes" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
<connect from_op="Performance (2)" from_port="performance" to_op="Performance" to_port="performance"/>
<connect from_op="Performance" from_port="performance" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
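One caveat about the weights in the process above: recall is the same quantity as sensitivity, but precision is not specificity, so weighting recall and precision does not give the 0.8*sensitivity + 0.2*specificity criterion from the original question. Presumably the weights would need to be keyed on the sensitivity and specificity criteria instead, though that is an assumption about how Combine Performances names its criteria and has not been checked here. A small Python sketch with made-up counts shows how far apart the two weightings can be:

```python
# Made-up confusion-matrix counts, for illustration only.
tp, fn, tn, fp = 80, 20, 50, 50

recall = tp / (tp + fn)        # identical to sensitivity
precision = tp / (tp + fp)     # share of predicted positives that are correct
specificity = tn / (tn + fp)   # share of actual negatives that are correct

as_posted = 0.8 * recall + 0.2 * precision    # weights used in the posted process
as_asked = 0.8 * recall + 0.2 * specificity   # criterion from the question

print(f"recall/precision weighting:        {as_posted:.4f}")
print(f"sensitivity/specificity weighting: {as_asked:.4f}")
```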
Thanks all for your input!
The issue with "Combine Performances" is that the resulting output does not contain the "performanceVector", the "parameterVector", and the "kernel model (SVM)". Is there any way to show them in the "results" window, or to store them in a file, after the optimization process?
The following is my process without the "Combine Performances" operator:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
<process expanded="true" height="595" width="815">
<operator activated="true" class="retrieve" compatibility="5.2.006" expanded="true" height="60" name="Retrieve" width="90" x="75" y="77">
<parameter key="repository_entry" value="../../dataset/10530_213_data"/>
</operator>
<operator activated="true" class="replace_missing_values" compatibility="5.2.000" expanded="true" height="94" name="Replace Missing Values" width="90" x="313" y="75">
<list key="columns"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.2.006" expanded="true" height="94" name="Normalize" width="90" x="45" y="345"/>
<operator activated="true" class="free_memory" compatibility="5.2.006" expanded="true" height="76" name="Free Memory" width="90" x="179" y="480"/>
<operator activated="true" class="remap_binominals" compatibility="5.2.006" expanded="true" height="76" name="Remap Binominals" width="90" x="380" y="390">
<parameter key="include_special_attributes" value="true"/>
<parameter key="negative_value" value="inactive"/>
<parameter key="positive_value" value="active"/>
</operator>
<operator activated="true" class="optimize_parameters_grid" compatibility="5.2.006" expanded="true" height="112" name="Optimize Parameters (Grid)" width="90" x="715" y="300">
<list key="parameters">
<parameter key="SVM.C" value="5,2,3,4,10,40,1"/>
</list>
<process expanded="true" height="540" width="471">
<operator activated="true" class="x_validation" compatibility="5.2.006" expanded="true" height="112" name="Validation" width="90" x="112" y="120">
<process expanded="true" height="558" width="219">
<operator activated="true" class="support_vector_machine" compatibility="5.2.006" expanded="true" height="112" name="SVM" width="90" x="45" y="30">
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="0.0010"/>
<parameter key="C" value="1"/>
<parameter key="convergence_epsilon" value="0.0010"/>
<parameter key="L_neg" value="5000.0"/>
<parameter key="balance_cost" value="true"/>
</operator>
<connect from_port="training" to_op="SVM" to_port="training set"/>
<connect from_op="SVM" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="701" width="279">
<operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="5.2.006" expanded="true" height="76" name="Performance" width="90" x="179" y="435">
<parameter key="main_criterion" value="sensitivity"/>
<parameter key="accuracy" value="false"/>
<parameter key="sensitivity" value="true"/>
<parameter key="specificity" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log" compatibility="5.2.006" expanded="true" height="76" name="Log" width="90" x="313" y="255">
<parameter key="filename" value="/project/itdd/zhangx2/rapidminer_project/logfiles/log_Lpostive"/>
<list key="log">
<parameter key="C" value="operator.SVM.parameter.C"/>
<parameter key="gamma" value="operator.SVM.parameter.kernel_gamma"/>
<parameter key="performance" value="operator.Performance.value.performance"/>
</list>
</operator>
<connect from_port="input 1" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="model" to_port="result 1"/>
<connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Replace Missing Values" to_port="example set input"/>
<connect from_op="Replace Missing Values" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Free Memory" to_port="through 1"/>
<connect from_op="Free Memory" from_port="through 1" to_op="Remap Binominals" to_port="example set input"/>
<connect from_op="Remap Binominals" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 2"/>
<connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
I've bookmarked your blog, though I still don't quite understand how to use that operator to solve my problem. Thanks!
awchisholm wrote:
Hello
The "Extract Performance" operator may help
At the risk of shameless self promotion here is a process that uses it.
http://rapidminernotes.blogspot.co.uk/search/label/ClusterValidity
regards
Andrew
Hi, for me this is working (I had to replace your Retrieve operator with Generate Data, since I obviously can't retrieve your data). Does this problem still exist in the latest version of RapidMiner?
xiazhang3 wrote:
The issue with "combine performances" is that the resulting output would not contain the "performanceVector", "parameterVector", and "kernel model (SVM)". Is there any way to show them up in the "results" window or store them as a file somehow after the optimization process?
Best,
Marius
Thanks for your reminder! I just updated to the latest version. It can now show the parameter set, the performance vector, and the kernel model (SVM).
However, under "PerformanceVector", it only shows weighted_performance, as follows:
weighted_performance: 0.961 +/- 0.012 (mikro: 0.961)
Is there any way to show the confusion matrix, or true positives vs. predicted positives and true negatives vs. predicted negatives? That would be very helpful.
Thanks again for your kind help!
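On reading an output line such as "0.961 +/- 0.012 (mikro: 0.961)": as far as I understand it (an assumption, not confirmed in this thread), the first number is the average of the per-fold values, the second is their spread, and "mikro" is the value computed once from the counts pooled over all folds. The Python sketch below illustrates that reading with hypothetical per-fold counts. Using the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev

# Hypothetical (true positives, false negatives) per cross-validation fold.
folds = [(48, 2), (19, 1), (9, 1)]

# Macro view: compute sensitivity per fold, then average the fold values.
per_fold = [tp / (tp + fn) for tp, fn in folds]
macro = mean(per_fold)
spread = pstdev(per_fold)

# Micro view: pool the counts over all folds, then compute sensitivity once.
total_tp = sum(tp for tp, _ in folds)
total_fn = sum(fn for _, fn in folds)
micro = total_tp / (total_tp + total_fn)

print(f"{macro:.3f} +/- {spread:.3f} (micro: {micro:.3f})")
```

The two averages agree when every fold has the same number of positive examples and can drift apart when fold sizes differ, as in the made-up counts here.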
Combine Performances does not output the confusion matrix. You have to connect the output of the standard Performance operator to the process output, too, e.g. with a Multiply. Please have a look at the attached process.
Best, Marius
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
<process expanded="true" height="296" width="949">
<operator activated="true" class="generate_data" compatibility="5.2.006" expanded="true" height="60" name="Generate Data" width="90" x="45" y="75">
<parameter key="target_function" value="random classification"/>
</operator>
<operator activated="true" class="naive_bayes" compatibility="5.2.006" expanded="true" height="76" name="Naive Bayes" width="90" x="179" y="75"/>
<operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="313" y="75">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="5.2.006" expanded="true" height="76" name="Performance (2)" width="90" x="447" y="75"/>
<operator activated="true" class="multiply" compatibility="5.2.006" expanded="true" height="94" name="Multiply" width="90" x="581" y="120"/>
<operator activated="true" class="combine_performances" compatibility="5.2.006" expanded="true" height="60" name="Performance" width="90" x="715" y="30">
<list key="criteria_weights">
<parameter key="recall" value="0.8"/>
<parameter key="precision" value="0.2"/>
</list>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Naive Bayes" to_port="training set"/>
<connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Naive Bayes" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
<connect from_op="Performance (2)" from_port="performance" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Performance" to_port="performance"/>
<connect from_op="Multiply" from_port="output 2" to_port="result 2"/>
<connect from_op="Performance" from_port="performance" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
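For readers who want to check the numbers by hand, the confusion matrix that the standard Performance operator reports can be rebuilt from the labelled example set. This is a minimal Python sketch, not RapidMiner code. The function name and the sample values are made up, and "active" is taken as the positive class as in the Remap Binominals settings earlier in the thread.

```python
from collections import Counter


def confusion_counts(labels, predictions, positive="active"):
    """Count tp, fn, fp, tn for a two-class problem."""
    pairs = Counter(
        (label == positive, pred == positive)
        for label, pred in zip(labels, predictions)
    )
    return {
        "tp": pairs[(True, True)],    # actual positive, predicted positive
        "fn": pairs[(True, False)],   # actual positive, predicted negative
        "fp": pairs[(False, True)],   # actual negative, predicted positive
        "tn": pairs[(False, False)],  # actual negative, predicted negative
    }


# Made-up labels and predictions, for illustration only.
labels = ["active", "active", "inactive", "inactive", "active"]
predictions = ["active", "inactive", "inactive", "active", "active"]
print(confusion_counts(labels, predictions))
```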
Hi Marius,
Marius wrote:
Combine performances does not output the confusion matrix. You have to connect the output of the standard Performance operator to the process output, too, e.g. with a Multiply. Please have a look at the attached process.
Best, Marius
This is great! Thanks so much for your kind help!
I have another silly question. It seems to me that the results differ between the case where I check "sensitivity" alone in the Performance operator and the case where I choose "sensitivity" as the main criterion and also check specificity and AUC. Are there any places or documents where I could learn in more detail how the Performance operator works? Or could you explain a little more?
Thanks a lot!
Xia
Unfortunately, most operators don't document their algorithms in detail, because they implement well-known algorithms and concepts.
The performance should not change, however. Can you please post a process where this is the case?
Best,
Marius
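One way to see why extra criteria should not matter: assuming the optimizer ranks parameter settings by the main criterion alone (my reading of how Optimize Parameters behaves, not something confirmed in this thread), checking additional criteria only adds more numbers to the report. A toy Python sketch with hypothetical results for two values of C:

```python
# Hypothetical cross-validation results per value of C, for illustration only.
results = {
    1: {"sensitivity": 0.91, "specificity": 0.40},
    5: {"sensitivity": 0.88, "specificity": 0.55},
}


def best_setting(results, main_criterion):
    """Return the parameter value whose main criterion is highest."""
    return max(results, key=lambda c: results[c][main_criterion])


# Reporting sensitivity alone or together with specificity picks the same C.
only_sens = {c: {"sensitivity": r["sensitivity"]} for c, r in results.items()}
print(best_setting(only_sens, "sensitivity"))
print(best_setting(results, "sensitivity"))

# Changing the main criterion itself, however, can change the winner.
print(best_setting(results, "specificity"))
```

If the reported sensitivity itself differs between two runs, the cause is more likely to lie elsewhere in the process than in the extra checkboxes.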
Sure.
Sensitivity as the main criterion and the only criterion.
code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
<process expanded="true" height="595" width="815">
<operator activated="true" class="retrieve" compatibility="5.2.006" expanded="true" height="60" name="Retrieve" width="90" x="75" y="77">
<parameter key="repository_entry" value="../../dataset/10530_213_data"/>
</operator>
<operator activated="true" class="replace_missing_values" compatibility="5.2.000" expanded="true" height="94" name="Replace Missing Values" width="90" x="313" y="75">
<list key="columns"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.2.006" expanded="true" height="94" name="Normalize" width="90" x="45" y="345"/>
<operator activated="true" class="free_memory" compatibility="5.2.006" expanded="true" height="76" name="Free Memory" width="90" x="179" y="480"/>
<operator activated="true" class="remap_binominals" compatibility="5.2.006" expanded="true" height="76" name="Remap Binominals" width="90" x="380" y="390">
<parameter key="include_special_attributes" value="true"/>
<parameter key="negative_value" value="inactive"/>
<parameter key="positive_value" value="active"/>
</operator>
<operator activated="true" class="optimize_parameters_grid" compatibility="5.2.006" expanded="true" height="112" name="Optimize Parameters (Grid)" width="90" x="715" y="300">
<list key="parameters">
<parameter key="SVM.C" value="5,1"/>
</list>
<process expanded="true" height="540" width="471">
<operator activated="true" class="x_validation" compatibility="5.2.006" expanded="true" height="112" name="Validation" width="90" x="112" y="120">
<process expanded="true" height="558" width="219">
<operator activated="true" class="support_vector_machine" compatibility="5.2.006" expanded="true" height="112" name="SVM" width="90" x="45" y="30">
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="0.0010"/>
<parameter key="C" value="5"/>
<parameter key="convergence_epsilon" value="0.0010"/>
<parameter key="L_neg" value="5000.0"/>
<parameter key="balance_cost" value="true"/>
</operator>
<connect from_port="training" to_op="SVM" to_port="training set"/>
<connect from_op="SVM" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="558" width="232">
<operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="5.2.006" expanded="true" height="76" name="Performance" width="90" x="45" y="210">
<parameter key="main_criterion" value="sensitivity"/>
<parameter key="accuracy" value="false"/>
<parameter key="sensitivity" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log" compatibility="5.2.006" expanded="true" height="76" name="Log" width="90" x="313" y="255">
<parameter key="filename" value="/project/itdd/zhangx2/rapidminer_project/logfiles/log_Lpostive"/>
<list key="log">
<parameter key="C" value="operator.SVM.parameter.C"/>
<parameter key="gamma" value="operator.SVM.parameter.kernel_gamma"/>
<parameter key="performance" value="operator.Performance.value.performance"/>
</list>
</operator>
<connect from_port="input 1" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="model" to_port="result 1"/>
<connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Replace Missing Values" to_port="example set input"/>
<connect from_op="Replace Missing Values" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Free Memory" to_port="through 1"/>
<connect from_op="Free Memory" from_port="through 1" to_op="Remap Binominals" to_port="example set input"/>
<connect from_op="Remap Binominals" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 2"/>
<connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
Sensitivity as the main criterion, with specificity also checked in the Performance operator:
Code:
<process version="5.2.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
<process expanded="true" height="595" width="815">
<operator activated="true" class="retrieve" compatibility="5.2.006" expanded="true" height="60" name="Retrieve" width="90" x="75" y="77">
<parameter key="repository_entry" value="../../dataset/10530_213_data"/>
</operator>
<operator activated="true" class="replace_missing_values" compatibility="5.2.000" expanded="true" height="94" name="Replace Missing Values" width="90" x="313" y="75">
<list key="columns"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.2.006" expanded="true" height="94" name="Normalize" width="90" x="45" y="345"/>
<operator activated="true" class="free_memory" compatibility="5.2.006" expanded="true" height="76" name="Free Memory" width="90" x="179" y="480"/>
<operator activated="true" class="remap_binominals" compatibility="5.2.006" expanded="true" height="76" name="Remap Binominals" width="90" x="380" y="390">
<parameter key="include_special_attributes" value="true"/>
<parameter key="negative_value" value="inactive"/>
<parameter key="positive_value" value="active"/>
</operator>
<operator activated="true" class="optimize_parameters_grid" compatibility="5.2.006" expanded="true" height="112" name="Optimize Parameters (Grid)" width="90" x="715" y="300">
<list key="parameters">
<parameter key="SVM.C" value="5,1"/>
</list>
<process expanded="true" height="540" width="471">
<operator activated="true" class="x_validation" compatibility="5.2.006" expanded="true" height="112" name="Validation" width="90" x="112" y="120">
<process expanded="true" height="558" width="219">
<operator activated="true" class="support_vector_machine" compatibility="5.2.006" expanded="true" height="112" name="SVM" width="90" x="45" y="30">
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="0.0010"/>
<parameter key="C" value="1"/>
<parameter key="convergence_epsilon" value="0.0010"/>
<parameter key="L_neg" value="5000.0"/>
<parameter key="balance_cost" value="true"/>
</operator>
<connect from_port="training" to_op="SVM" to_port="training set"/>
<connect from_op="SVM" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="701" width="279">
<operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="5.2.006" expanded="true" height="76" name="Performance" width="90" x="179" y="435">
<parameter key="main_criterion" value="sensitivity"/>
<parameter key="accuracy" value="false"/>
<parameter key="sensitivity" value="true"/>
<parameter key="specificity" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log" compatibility="5.2.006" expanded="true" height="76" name="Log" width="90" x="313" y="255">
<parameter key="filename" value="/project/itdd/zhangx2/rapidminer_project/logfiles/log_Lpostive"/>
<list key="log">
<parameter key="C" value="operator.SVM.parameter.C"/>
<parameter key="gamma" value="operator.SVM.parameter.kernel_gamma"/>
<parameter key="performance" value="operator.Performance.value.performance"/>
</list>
</operator>
<connect from_port="input 1" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="model" to_port="result 1"/>
<connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Replace Missing Values" to_port="example set input"/>
<connect from_op="Replace Missing Values" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Free Memory" to_port="through 1"/>
<connect from_op="Free Memory" from_port="through 1" to_op="Remap Binominals" to_port="example set input"/>
<connect from_op="Remap Binominals" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 2"/>
<connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
ocess expanded="true" height="540" width="471">
<operator activated="true" class="x_validation" compatibility="5.2.006" expanded="true" height="112" name="Validation" width="90" x="112" y="120">
<process expanded="true" height="558" width="219">
<operator activated="true" class="support_vector_machine" compatibility="5.2.006" expanded="true" height="112" name="SVM" width="90" x="45" y="30">
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="0.0010"/>
<parameter key="C" value="5"/>
<parameter key="convergence_epsilon" value="0.0010"/>
<parameter key="L_neg" value="5000.0"/>
<parameter key="balance_cost" value="true"/>
</operator>
<connect from_port="training" to_op="SVM" to_port="training set"/>
<connect from_op="SVM" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="558" width="232">
<operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="5.2.006" expanded="true" height="76" name="Performance" width="90" x="45" y="210">
<parameter key="main_criterion" value="sensitivity"/>
<parameter key="accuracy" value="false"/>
<parameter key="sensitivity" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log" compatibility="5.2.006" expanded="true" height="76" name="Log" width="90" x="313" y="255">
<parameter key="filename" value="/project/itdd/zhangx2/rapidminer_project/logfiles/log_Lpostive"/>
<list key="log">
<parameter key="C" value="operator.SVM.parameter.C"/>
<parameter key="gamma" value="operator.SVM.parameter.kernel_gamma"/>
<parameter key="performance" value="operator.Performance.value.performance"/>
</list>
</operator>
<connect from_port="input 1" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="model" to_port="result 1"/>
<connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Replace Missing Values" to_port="example set input"/>
<connect from_op="Replace Missing Values" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Free Memory" to_port="through 1"/>
<connect from_op="Free Memory" from_port="through 1" to_op="Remap Binominals" to_port="example set input"/>
<connect from_op="Remap Binominals" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 2"/>
<connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>0 -
I just re-ran the two processes above, and this time they gave me the same results.
The last time I ran them, however, they gave two different results: one with sensitivity above 95%, and the other with about 83%.
Maybe that is because I have updated RapidMiner in the meantime?
Anyway, thanks for helping with my problems!!! I really appreciate this software and your help!
Xia
-
Maybe you accidentally set a parameter incorrectly in your first tries. It happens even to the best.
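If it helps to track down a stray parameter, one option is to diff the parameters of the two process files outside RapidMiner. Here is a minimal Python sketch of that idea, not a RapidMiner feature. The helper names `operator_parameters` and `diff_parameters` are made up for illustration, and `FIRST` and `SECOND` are heavily trimmed stand-ins for the two processes posted above (only the SVM and Performance operators are kept). It assumes operator names are unique within a process and it ignores `<list>` entries.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-ins for the two posted processes (illustration only).
FIRST = """
<process expanded="true">
  <operator class="support_vector_machine" name="SVM">
    <parameter key="kernel_type" value="radial"/>
    <parameter key="C" value="5"/>
  </operator>
  <operator class="performance_binominal_classification" name="Performance">
    <parameter key="main_criterion" value="sensitivity"/>
    <parameter key="sensitivity" value="true"/>
  </operator>
</process>
"""

SECOND = """
<process expanded="true">
  <operator class="support_vector_machine" name="SVM">
    <parameter key="kernel_type" value="radial"/>
    <parameter key="C" value="1"/>
  </operator>
  <operator class="performance_binominal_classification" name="Performance">
    <parameter key="main_criterion" value="sensitivity"/>
    <parameter key="sensitivity" value="true"/>
    <parameter key="specificity" value="true"/>
  </operator>
</process>
"""


def operator_parameters(process_xml):
    """Map (operator name, parameter key) -> value for every operator."""
    root = ET.fromstring(process_xml)
    params = {}
    for op in root.iter("operator"):
        # Only direct <parameter> children, so nested operators are not mixed in.
        for p in op.findall("parameter"):
            params[(op.get("name"), p.get("key"))] = p.get("value")
    return params


def diff_parameters(xml_a, xml_b):
    """Return {(operator, key): (value_a, value_b)} for every differing entry."""
    a = operator_parameters(xml_a)
    b = operator_parameters(xml_b)
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(a.keys() | b.keys())
        if a.get(k) != b.get(k)
    }


for (op, key), (va, vb) in diff_parameters(FIRST, SECOND).items():
    print(f"{op}.{key}: {va!r} -> {vb!r}")
```

A value of `None` on one side means the parameter is not written in that file at all, which normally means it is left at its default there. For real exported files you would read the XML from disk instead of from these inline strings.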