confusion matrix results

warwick
New Altair Community Member
Hi,
I am new to data mining and RapidMiner.
I have built a k-NN model inside an X-Validation on categorical data. The confusion matrix I am getting doesn't look standard, and I am having trouble understanding it. Could someone explain what these numbers represent?
PerformanceVector:
accuracy: 72.73%
ConfusionMatrix:
True: US CA FR ES IT GB NL AU DE PT other
US: 0.027 0.028 0.062 0.039 0.041 0.038 0.012 0.017 0.019 0.012 0.016
CA: 0.000 0.107 0.001 0.000 0.001 0.001 0.000 0 0.000 0 0.000
FR: 0.001 0.001 0.056 0.002 0.002 0.002 0.000 0.000 0.000 0.001 0.001
ES: 0.000 0.001 0.002 0.087 0.001 0.001 0.001 0.001 0.000 0 0.000
IT: 0.000 0.001 0.003 0.002 0.082 0.002 0.000 0 0.001 0 0.001
GB: 0.000 0.000 0.002 0.001 0.001 0.087 0 0 0.001 0.001 0.001
NL: 0.000 0 0.001 0.001 0.001 0.000 0.117 0 0 0 0.000
AU: 0.000 0 0.001 0.000 0.000 0.000 0 0.115 0.000 0 0.000
DE: 0.000 0 0.001 0.001 0.001 0.001 0 0 0.114 0 0.000
PT: 0.000 0.000 0.000 0 0 0 0 0 0 0.115 0.000
other: 0.002 0.001 0.008 0.004 0.006 0.004 0 0 0.001 0 0.053
absolute_error: 0.273 +/- 0.000
thanks
Warwick
Answers
Please have a look at this video: http://docs.rapidminer.com/studio/getting-started/5-evaluating-model.html starting at ~12:30
~Martin
Hi Martin,
Thanks for replying. I looked at the video, and I understand the concepts behind the confusion table. My question is: why are the values in my confusion table so small? In the video, the confusion table shows the number of examples that fall into each category (True True, True False, etc.). My values are very small, e.g. 0.027, so this is obviously not the case in my situation. What is it displaying instead?
Thanks
Warwick
Without being able to see your process, I'm betting that you use example weighting, right?
See the example below, which uses the Generate Weight (Stratification) operator to make a confusion matrix similar to yours. Try putting the weighting inside the training side of the XValidation.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.0.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.0.001" expanded="true" height="68" name="Golf" width="90" x="45" y="30">
<parameter key="repository_entry" value="//Samples/data/Golf"/>
</operator>
<operator activated="true" class="generate_id" compatibility="7.0.001" expanded="true" height="82" name="Generate ID" width="90" x="179" y="34"/>
<operator activated="true" class="generate_weight_stratification" compatibility="7.0.001" expanded="true" height="82" name="Generate Weight (Stratification)" width="90" x="246" y="187">
<description align="center" color="yellow" colored="true" width="126">Cunning use of weights for a confusing confusion matrix.</description>
</operator>
<operator activated="true" class="x_validation" compatibility="7.0.001" expanded="true" height="124" name="Validation" width="90" x="313" y="34">
<parameter key="number_of_validations" value="3"/>
<parameter key="sampling_type" value="linear sampling"/>
<process expanded="true">
<operator activated="true" class="parallel_decision_tree" compatibility="7.0.001" expanded="true" height="82" name="Decision Tree" width="90" x="179" y="34"/>
<connect from_port="training" to_op="Decision Tree" to_port="training set"/>
<connect from_op="Decision Tree" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="7.0.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_classification" compatibility="7.0.001" expanded="true" height="82" name="Performance (2)" width="90" x="180" y="30">
<list key="class_weights"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
<connect from_op="Performance (2)" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="126"/>
</process>
</operator>
<connect from_op="Golf" from_port="output" to_op="Generate ID" to_port="example set input"/>
<connect from_op="Generate ID" from_port="example set output" to_op="Generate Weight (Stratification)" to_port="example set input"/>
<connect from_op="Generate Weight (Stratification)" from_port="example set output" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="model" to_port="result 1"/>
<connect from_op="Validation" from_port="training" to_port="result 2"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
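To make the weighting idea concrete, here is a rough Python sketch of how a confusion matrix could end up with fractional cells when each example contributes its weight instead of a count of 1. This is not RapidMiner's implementation: the helper names `weighted_confusion` and `weighted_accuracy` and the toy weights are made up for illustration. The last part only checks, under that same assumption, whether the diagonal share of the matrix posted above is in line with the reported accuracy.

```python
from collections import defaultdict


def weighted_confusion(true_labels, predicted_labels, weights):
    """Sum example weights (instead of counting examples) per (predicted, true) cell."""
    cells = defaultdict(float)
    for t, p, w in zip(true_labels, predicted_labels, weights):
        cells[(p, t)] += w
    return dict(cells)


def weighted_accuracy(cells):
    """Share of the total weight that sits on the diagonal (predicted == true)."""
    total = sum(cells.values())
    correct = sum(w for (p, t), w in cells.items() if p == t)
    return correct / total


# Toy data with hypothetical weights: every example carries a small weight,
# so every cell is a small fraction rather than a whole number.
true_labels      = ["yes", "yes", "yes", "yes", "no", "no"]
predicted_labels = ["yes", "yes", "no",  "yes", "no", "yes"]
weights          = [0.125, 0.125, 0.125, 0.125, 0.25, 0.25]

cells = weighted_confusion(true_labels, predicted_labels, weights)
accuracy = weighted_accuracy(cells)

# The matrix from the question (rows = predicted, columns = true), copied as posted.
posted = [
    [0.027, 0.028, 0.062, 0.039, 0.041, 0.038, 0.012, 0.017, 0.019, 0.012, 0.016],
    [0.000, 0.107, 0.001, 0.000, 0.001, 0.001, 0.000, 0.000, 0.000, 0.000, 0.000],
    [0.001, 0.001, 0.056, 0.002, 0.002, 0.002, 0.000, 0.000, 0.000, 0.001, 0.001],
    [0.000, 0.001, 0.002, 0.087, 0.001, 0.001, 0.001, 0.001, 0.000, 0.000, 0.000],
    [0.000, 0.001, 0.003, 0.002, 0.082, 0.002, 0.000, 0.000, 0.001, 0.000, 0.001],
    [0.000, 0.000, 0.002, 0.001, 0.001, 0.087, 0.000, 0.000, 0.001, 0.001, 0.001],
    [0.000, 0.000, 0.001, 0.001, 0.001, 0.000, 0.117, 0.000, 0.000, 0.000, 0.000],
    [0.000, 0.000, 0.001, 0.000, 0.000, 0.000, 0.000, 0.115, 0.000, 0.000, 0.000],
    [0.000, 0.000, 0.001, 0.001, 0.001, 0.001, 0.000, 0.000, 0.114, 0.000, 0.000],
    [0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.115, 0.000],
    [0.002, 0.001, 0.008, 0.004, 0.006, 0.004, 0.000, 0.000, 0.001, 0.000, 0.053],
]

# Diagonal weight divided by total weight, using the rounded values as posted.
posted_accuracy = sum(posted[i][i] for i in range(len(posted))) / sum(map(sum, posted))
print(cells, accuracy, round(posted_accuracy, 3))
```

If the matrix really holds summed weights, the diagonal share of the posted values should land close to the reported 72.73% accuracy, allowing for the rounding to three decimals.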