"cluster performance evaluation - negative (?) average of distances"
dan_agape
New Altair Community Member
Hi,
I have just tested the operators that measure clustering performance, in particular for a centroid-based scheme. As its measure of clustering quality, the Cluster Distance Performance operator reported negative (?) averages of the distances from the centroids to the instances within the respective clusters. Here is an example process that uses a clustering model built by the first process in http://rapid-i.com/rapidforum/index.php/topic,2608.0.html
Regards,
Dan
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
    <process expanded="true" height="404" width="599">
      <operator activated="true" class="generate_data" compatibility="5.0.10" expanded="true" height="60" name="Generate Data (2)" width="90" x="45" y="165">
        <parameter key="number_examples" value="2000"/>
        <parameter key="use_local_random_seed" value="true"/>
        <parameter key="local_random_seed" value="20090"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="5.0.10" expanded="true" height="76" name="Select Attributes" width="90" x="45" y="255">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="label"/>
        <parameter key="invert_selection" value="true"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="retrieve" compatibility="5.0.10" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
        <parameter key="repository_entry" value="//NewLocalRepository/models/tmp_kmeans_mod"/>
      </operator>
      <operator activated="true" class="cluster_distance_performance" compatibility="5.0.10" expanded="true" height="94" name="Performance" width="90" x="246" y="165"/>
      <connect from_op="Generate Data (2)" from_port="output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Performance" to_port="example set"/>
      <connect from_op="Retrieve" from_port="output" to_op="Performance" to_port="cluster model"/>
      <connect from_op="Performance" from_port="performance" to_port="result 1"/>
      <connect from_op="Performance" from_port="example set" to_port="result 3"/>
      <connect from_op="Performance" from_port="cluster model" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>
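For reference, the quantity described in the question (the average distance from each centroid to the instances in its cluster) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Euclidean distance, a hypothetical function name `avg_within_centroid_distance`, and made-up toy data. It is not the operator's actual implementation.

```python
import math

def avg_within_centroid_distance(points, centroids, assignments):
    """Hypothetical helper: per-cluster mean Euclidean distance
    from each point to the centroid it is assigned to."""
    totals = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for point, k in zip(points, assignments):
        totals[k] += math.dist(point, centroids[k])  # distances are >= 0
        counts[k] += 1
    # Empty clusters get 0.0 here; that convention is an assumption.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]

# Toy data (invented for illustration): two clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
centroids = [(0.0, 1.0), (10.0, 2.0)]
assignments = [0, 0, 1, 1]

per_cluster = avg_within_centroid_distance(points, centroids, assignments)
print(per_cluster)  # [1.0, 2.0]
```

Computed this way, an average of distances cannot be negative, which is why the reported values look odd. One possible explanation, not confirmed anywhere in this thread, is that the value is negated so that a larger number means a better clustering.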
Hi,
This seems strange, but I cannot execute your process because of missing data. Could you please file a bug report for this, too?
With kind regards,
Sebastian Land