split results into bins for evaluation
bobdobbs
New Altair Community Member
Hi,
Another fun puzzle for the RM team.
I've run a model that outputs confidence values for an SVM class. The values range from 0 to 1 (as expected for this type of model).
One very common evaluation method I've seen in papers is to break the results into "bins" or "groups" by confidence range and then report the accuracy within each range.
Something like this:

Range (%) | # predicted | # correct | % correct
60-65     | 35          | 21        | 60
55-60     | 130         | 75        | 57.6

Any way to do this sort of analysis in RM?

Thanks!
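For reference, the same binned-accuracy table can be computed outside RapidMiner in a few lines. A minimal pandas sketch, assuming the scored data has columns named confidence, label and prediction (these names and the file name are placeholders):

import pandas as pd

# Assumed input: one row per scored example, with a "confidence" score
# for the predicted class, the true "label" and the "prediction".
scored = pd.read_csv("scored_examples.csv")

# Bin the confidences as in the table above: 0.50-0.55, 0.55-0.60, ..., 0.95-1.00.
edges = [x / 100 for x in range(50, 101, 5)]
scored["bin"] = pd.cut(scored["confidence"], bins=edges, include_lowest=True)

# Per-bin counts and accuracy.
scored["correct"] = scored["label"] == scored["prediction"]
report = scored.groupby("bin", observed=True).agg(
    n_predicted=("correct", "size"),
    n_correct=("correct", "sum"),
)
report["pct_correct"] = 100.0 * report["n_correct"] / report["n_predicted"]
print(report)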
Answers
Hello bobdobbs
I do not have the time to create a perfect, fully tested process, but this non-executable sketch should give you an idea; I hope you can take it from here:
<operator name="Root" class="Process" expanded="yes">
<operator name="yourdata" class="ExampleSource">
</operator>
<operator name="select_confidence_attribute" class="AttributeSubsetPreprocessing" expanded="yes">
<parameter key="condition_class" value="attribute_name_filter"/>
<parameter key="attribute_name_regex" value="your-confidence-column"/>
<operator name="define_confidence_discretization_here" class="UserBasedDiscretization">
<list key="classes">
<parameter key="last" value="Infinity"/>
</list>
</operator>
</operator>
<operator name="iterate_over_discretized_confidence_attribute" class="ValueSubgroupIterator" expanded="yes">
<list key="attributes">
<parameter key="discretized_confidence" value="all"/>
</list>
<operator name="your-measure" class="BinominalClassificationPerformance">
<parameter key="false_positive" value="true"/>
<parameter key="true_positive" value="true"/>
<parameter key="positive_predictive_value" value="true"/>
</operator>
<operator name="Macro2Log" class="Macro2Log">
<parameter key="macro_name" value="loop_value"/>
</operator>
<operator name="ProcessLog" class="ProcessLog">
<list key="log">
<parameter key="value" value="operator.Macro2Log.value.macro_value"/>
<parameter key="performance" value="operator.your-measure.value.positive_predictive_value"/>
</list>
</operator>
</operator>
</operator>
regards,
Steffen
Just a small note: besides calculating this yourself as indicated by Steffen, the lift chart operators might also be interesting to you.
Cheers,
Ingo
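For readers without the lift chart operators at hand, the core of a lift table is likewise just binning by confidence. A small sketch with the same assumed columns as above, and additionally assuming a binary label (1 = positive class):

import pandas as pd

# Assumed columns as in the earlier sketch: "confidence" for the positive
# class and a binary "label" (1 = positive class).
scored = pd.read_csv("scored_examples.csv")

# Split the examples into deciles by confidence.
scored["decile"] = pd.qcut(scored["confidence"], q=10, labels=False, duplicates="drop")

lift = scored.groupby("decile")["label"].agg(["size", "mean"])
lift.columns = ["n_examples", "positive_rate"]
# Lift = positive rate within the decile relative to the overall positive rate.
lift["lift"] = lift["positive_rate"] / scored["label"].mean()
print(lift.sort_index(ascending=False))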
Thanks Stefan!!!
That's a great way to solve this problem.
I'm getting good predictions, but the confidence "score" seems out of alignment. For example, my 90%-100% confidence bin is correct about 70% of the time, my 80-90% bin about 60%, etc. So I guess the model is good, but the scores are just "scores" and not reliable probabilities.
Thanks!!!!!
-B
Hello bobdobbs
Nice to hear. As a quick idea, you could use the Platt Scaling operator to calibrate the scores, i.e. to make them a better approximation of the true probabilities.
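Roughly speaking, Platt scaling fits a sigmoid P(y=1|s) = 1 / (1 + exp(A*s + B)) to the raw scores, which is just a one-dimensional logistic regression. A minimal sketch of that idea outside RapidMiner, with synthetic scores and labels standing in for held-out SVM output:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for held-out SVM scores and the true binary labels.
rng = np.random.default_rng(0)
scores = rng.normal(loc=0.0, scale=2.0, size=500)
labels = (rng.random(500) < 1.0 / (1.0 + np.exp(-0.8 * scores))).astype(int)

# Fitting a 1-D logistic regression is (up to sklearn's mild default
# regularization) the Platt sigmoid fit.
platt = LogisticRegression()
platt.fit(scores.reshape(-1, 1), labels)

# Calibrated probabilities for new raw scores.
new_scores = np.array([[-2.0], [0.0], [2.0]])
print(platt.predict_proba(new_scores)[:, 1])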
kind regards,
Steffen
Stefan,
A great idea as usual!
Question: Where in the process would I put the Platt operator? Do I do it after training, or testing, or both?
Thanks!
Hello again ... weekend break is over ...
I suggest something like the process below. Additional remark: the XVPrediction is used to produce a separate basis of confidences for the calibration, which prevents overfitting as suggested in Platt's original paper (a rough sketch of that out-of-fold idea, outside RapidMiner, follows this process).
<operator name="Root" class="Process" expanded="yes">
<operator name="generate_set" class="OperatorChain" expanded="no">
<operator name="ExampleSetGenerator" class="ExampleSetGenerator">
<parameter key="target_function" value="random"/>
</operator>
<operator name="IdTagging" class="IdTagging">
</operator>
<operator name="FeatureNameFilter" class="FeatureNameFilter">
<parameter key="filter_special_features" value="true"/>
<parameter key="skip_features_with_name" value="label"/>
</operator>
<operator name="NominalExampleSetGenerator" class="NominalExampleSetGenerator">
<parameter key="number_of_attributes" value="1"/>
<parameter key="number_of_values" value="2"/>
</operator>
<operator name="FeatureNameFilter (2)" class="FeatureNameFilter">
<parameter key="filter_special_features" value="true"/>
<parameter key="skip_features_with_name" value="att1"/>
</operator>
<operator name="IdTagging (2)" class="IdTagging">
</operator>
<operator name="ExampleSetJoin" class="ExampleSetJoin">
</operator>
</operator>
<operator name="XVal" class="XValidation" expanded="no">
<parameter key="sampling_type" value="shuffled sampling"/>
<operator name="training" class="OperatorChain" expanded="yes">
<operator name="Training" class="LibSVMLearner">
<parameter key="keep_example_set" value="true"/>
<parameter key="kernel_type" value="poly"/>
<parameter key="C" value="1000.0"/>
<list key="class_weights">
</list>
</operator>
<operator name="train_platt_model" class="XVPrediction" expanded="no">
<parameter key="number_of_validations" value="3"/>
<operator name="Training (2)" class="LibSVMLearner">
<parameter key="keep_example_set" value="true"/>
<parameter key="kernel_type" value="poly"/>
<parameter key="C" value="1000.0"/>
<list key="class_weights">
</list>
</operator>
<operator name="OperatorChain" class="OperatorChain" expanded="yes">
<operator name="ModelApplier" class="ModelApplier">
<list key="application_parameters">
</list>
</operator>
<operator name="dummy" class="ClassificationPerformance">
<parameter key="keep_example_set" value="true"/>
<parameter key="accuracy" value="true"/>
<list key="class_weights">
</list>
</operator>
</operator>
</operator>
<operator name="PlattScaling" class="PlattScaling">
</operator>
</operator>
<operator name="ApplierChain" class="OperatorChain" expanded="yes">
<operator name="Test" class="ModelApplier">
<list key="application_parameters">
</list>
</operator>
<operator name="ClassificationPerformance" class="ClassificationPerformance">
<parameter key="accuracy" value="true"/>
<list key="class_weights">
</list>
</operator>
</operator>
</operator>
</operator>
regards,
Steffen
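As an aside, here is the promised rough sketch of the out-of-fold idea, outside RapidMiner: collect SVM scores via cross-validation so that the calibration never sees scores from a model trained on the same examples. The data and parameters below are placeholders; scikit-learn's CalibratedClassifierCV with method="sigmoid" wraps the same recipe:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

# Placeholder data standing in for the real training set.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# 1) SVM scores from cross-validation: every score comes from a model
#    that never saw that example, so the calibration is not overfitted.
svm = SVC(kernel="poly", C=1000.0)
oof_scores = cross_val_predict(svm, X, y, cv=3, method="decision_function")

# 2) Fit the Platt sigmoid on those out-of-fold scores.
platt = LogisticRegression()
platt.fit(oof_scores.reshape(-1, 1), y)

# 3) Train the final SVM on all data; at apply time, pipe its raw scores
#    through the fitted sigmoid to obtain calibrated probabilities.
svm.fit(X, y)
probabilities = platt.predict_proba(svm.decision_function(X).reshape(-1, 1))[:, 1]
print(probabilities[:5])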
Stefan,
VERY clever application.
I don't entirely understand why you train an SVM and then train 3 more SVMs with Platt inside the XVPrediction?? Does the XVPrediction somehow deliver the "best" Platt from the XV tests?
Also, does the Platt model from within the XVPrediction pass through to the model that we eventually apply? If so, would it be safe to assume that we could write that model and it would include the Platt as well?
Ok, I see that we need more remarks:

> I don't entirely understand why you train an SVM and then train 3 more SVMs with Platt inside the XVPrediction?? Does the XVPrediction somehow deliver the "best" Platt from the XV tests?

Aaarrgh. My fault. Just remove the operator chain (and hence all its child operators) named "train_platt_model". What I told you above regarding the prevention of overfitting is correct, but RapidMiner's special implementation of Platt Scaling does not allow this strategy (I noted this issue before, see here: http://rapid-i.com/rapidforum/index.php/topic,447.0.html , but maybe I was too picky).

> Also, does the Platt model from within the XVPrediction pass through to the model that we eventually apply? If so, would it be safe to assume that we could write that model and it would include the Platt as well?

Platt Scaling combines the classification model and the calibration model into one. Hence writing and reading should cause no problems.

hope this was helpful,

regards,
Steffen

PS: Sorry, the name is "Steffen", not "Stefan". This is an important difference here in Germany.