Within Cluster Distances in ClusterCentroidEvaluator Vs ClusterDensityEvaluator

Shubha New Altair Community Member
edited November 5 in Community Q&A
Hi,

Code and output of ClusterCentroidEvaluator:
<operator name="Root" class="Process" expanded="yes">
   <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
       <parameter key="target_function" value="sum classification"/>
       <parameter key="number_examples" value="200"/>
       <parameter key="number_of_attributes" value="2"/>
   </operator>
   <operator name="KMeans" class="KMeans">
   </operator>
   <operator name="ClusterCentroidEvaluator" class="ClusterCentroidEvaluator">
   </operator>
</operator>
PerformanceVector:
avg_within_distance: -41.825
avg_within_distance_cluster_0: -38.826
avg_within_distance_cluster_1: -44.707
DaviesBouldin: -1.162

Code and output of ClusterDensityEvaluator:
<operator name="Root" class="Process" expanded="yes">
   <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
       <parameter key="target_function" value="sum classification"/>
       <parameter key="number_examples" value="200"/>
       <parameter key="number_of_attributes" value="2"/>
   </operator>
   <operator name="ExampleSet2Similarity" class="ExampleSet2Similarity">
   </operator>
   <operator name="KMeans" class="KMeans">
   </operator>
   <operator name="ClusterDensityEvaluator" class="ClusterDensityEvaluator">
   </operator>
</operator>
PerformanceVector:
Avg. within cluster distance: -804.184
Avg. within cluster distance for cluster 0: -760.009
Avg. within cluster distance for cluster 1: -846.627

Both outputs report an average within-cluster distance, but the values differ. What do these values represent, and why are they different? Thanks for the help...

Shubha

Answers

  • land
    land New Altair Community Member
    Hi Shubha,
    The difference is that the ClusterCentroidEvaluator computes the within-cluster distance by averaging the distance between each example in a cluster and that cluster's centroid, while the ClusterDensityEvaluator averages the distances between all pairs of examples in a cluster.

    Greetings,
      Sebastian
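
To make the two definitions in the answer above concrete, here is a minimal Python sketch that computes both quantities for one small, made-up cluster. The function names and the sample points are hypothetical, and the sketch assumes plain Euclidean distance. It is not RapidMiner's implementation and does not try to reproduce the sign or scaling of the PerformanceVector values shown in the question; it only shows why the two averages generally differ.

```python
import math
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def avg_distance_to_centroid(points):
    """Average Euclidean distance from each example to the cluster centroid."""
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)


def avg_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs of examples."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


# Hypothetical cluster: the four corners of a 4 x 3 rectangle.
cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]

print(avg_distance_to_centroid(cluster))  # 2.5
print(avg_pairwise_distance(cluster))     # 4.0
```

Even on the same four points, the centroid-based average (2.5) and the pairwise average (4.0) disagree, so the two evaluators should not be expected to report the same number for the same clustering.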