What does this cluster plot explain?

kayvanjoo
kayvanjoo New Altair Community Member
edited November 5 in Community Q&A

Hello, I am doing clustering using X-Means, which yields 4 clusters. In my results I have a centroid table and also a plot option, which looks as shown in the picture.

Can someone kindly explain what the plot is describing? I couldn't really figure it out at first glance. I guess it shows the features that were used for clustering and their ranges, but that doesn't make sense given its shape, so I don't know.

Thanks a lot!

[Attached image: Plot First Cluster.png]

Best Answer

  • IngoRM
    IngoRM New Altair Community Member
    Answer ✓

    Hi,

     

    No.  Each line in the plot shows the centroid values of one of your clusters.  Think about how k-means (and other centroid-based clustering mechanisms) work.  They determine the centroids for each of the k clusters and assign all data points to their nearest centroids.  In this sense the centroids can be seen as prototypical for your clusters.
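
A minimal sketch of the two steps described above (assign each point to its nearest centroid, then recompute each centroid as the mean of its members). The data, the function names, and the use of Euclidean distance are illustrative assumptions, not taken from the original process, and this is not how any particular tool implements it:

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

def update_centroids(points, labels, k):
    """Recompute each centroid as the per-attribute mean of its assigned points."""
    centroids = []
    for cluster in range(k):
        members = [p for p, label in zip(points, labels) if label == cluster]
        # Assumes every cluster has at least one member in this toy example.
        centroids.append([sum(col) / len(members) for col in zip(*members)])
    return centroids

# Made-up data: four points with two attributes each, and two starting centroids.
points = [(1.0, 2.0), (1.5, 1.8), (8.0, 8.0), (9.0, 9.5)]
centroids = [(1.0, 1.0), (9.0, 9.0)]

labels = [nearest_centroid(p, centroids) for p in points]
new_centroids = update_centroids(points, labels, k=2)
print(labels)
print(new_centroids)
```

Each row of `new_centroids` corresponds to one line in the plot: one value per attribute for that cluster's prototype.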

     

    The plot now shows for all your columns (in a so-called "parallel plot") where those cluster centroids are located.  This allows you to understand things like

     

    1. where do the clusters differ most (which attributes are important for which cluster)
    2. where do the clusters not differ (all clusters have basically the same values for certain attributes)
    3. how "complex" are the differences between the clusters, i.e. do you need a lot of attributes to differentiate the clusters or only a few
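
The three questions above can also be checked numerically from the centroid table. The sketch below ranks attributes by how far apart the centroids are on each one. The centroid values and attribute names (`attr_a`, `attr_b`, `attr_c`) are hypothetical, and the comparison is only meaningful if the attributes are on comparable scales:

```python
# Hypothetical centroid table: one row per cluster, one value per attribute.
centroids = {
    "cluster_0": {"attr_a": 0.2, "attr_b": 5.0, "attr_c": 1.0},
    "cluster_1": {"attr_a": 0.3, "attr_b": 5.1, "attr_c": 9.0},
    "cluster_2": {"attr_a": 0.2, "attr_b": 4.9, "attr_c": 4.0},
}

def centroid_spread(centroids):
    """Range (max - min) of the centroid values for each attribute."""
    attributes = next(iter(centroids.values())).keys()
    spread = {}
    for attr in attributes:
        values = [row[attr] for row in centroids.values()]
        spread[attr] = max(values) - min(values)
    return spread

spread = centroid_spread(centroids)
# Attributes with the largest spread separate the clusters most.
ranked = sorted(spread, key=spread.get, reverse=True)
print(ranked)
```

An attribute near the top of `ranked` is one where the lines in the plot fan out; an attribute near the bottom is one where they nearly coincide.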

    Hope this helps,

    Ingo

Answers

  • Thomas_Ott
    Thomas_Ott New Altair Community Member

    My initial review of the plot shows that your cluster model isn't that great. I think you're suffering from a scaling issue, because all the other attributes look very flat. Try rescaling all the values (maybe use a Normalize operator with z-transformation). The only thing that jumps out at me is that Cluster 3's basal volume is very different from all the rest.
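
To illustrate the scaling point, here is a small sketch of a z-transformation in plain Python. It is not the Normalize operator itself; the sample values are made up, and the use of the population standard deviation is an assumption (a tool may use the sample standard deviation instead):

```python
import statistics

def z_transform(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std dev; an assumption
    return [(v - mean) / std for v in values]

# Two made-up attributes on very different scales.
large_scale = [1000.0, 2000.0, 3000.0]
small_scale = [0.1, 0.2, 0.3]

z_large = z_transform(large_scale)
z_small = z_transform(small_scale)
print(z_large)
print(z_small)
```

Before rescaling, the first attribute would dominate a shared plot axis and make the second look flat. After rescaling, both attributes cover the same range, so their centroid differences become comparable.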

     

  • kayvanjoo
    kayvanjoo New Altair Community Member

    Yes, that is true. I am already aware that my data need normalization, but could you please tell me what this plot explains? How can I interpret it? Is it just saying that my clustering was done using only three attributes? And is it showing only the maximum attribute value in each cluster, or is it based on the average?

    Thank you

  • AustinT
    AustinT New Altair Community Member

    To tack on here: if I have z-score normalized an attribute like "Duration" in my Example Set, and its centroid value in Cluster 1 is calculated as -0.5, does this indicate that the Cluster 1 centroid for Duration lies 0.5 standard deviations below the mean (i.e. to the left of it)?

  • Kenshinn
    Kenshinn New Altair Community Member
    Good explanation and easy to understand. Thank you very much.