"k-means and its centroid table values SOLVED"

John_Davis
John_Davis New Altair Community Member
edited November 5 in Community Q&A
Hi,

The k-means operator in RapidMiner gives us a centroid table in which each cluster contains items and their corresponding values. What are these values: TF-IDF, chi-squared, information gain, ...?

Yours

John Davis

Answers

  • MariusHelf
    MariusHelf New Altair Community Member
    Hi John,

    Those are probably the columns (attributes) that were present in your data.

    k-Means defines each cluster by its central data point, i.e. the average of all elements in the cluster. These so-called centroids are listed in the centroid table, where each column contains the attribute values of one centroid.

    Best regards,
    Marius
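
To make the "average of all elements" point concrete, here is a minimal sketch in plain Python. It is not RapidMiner code: the `centroid` function and the sample data are made up for illustration, and it only shows how one column of the centroid table could be derived from the examples assigned to a cluster.

```python
def centroid(rows):
    """Return the attribute-wise mean of a list of equal-length numeric rows."""
    n = len(rows)
    # zip(*rows) groups the values of each attribute across all examples
    return [sum(col) / n for col in zip(*rows)]

# Hypothetical cluster with three examples and two attributes
cluster = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(centroid(cluster))  # [3.0, 4.0]
```

Each entry of the result is the mean of one attribute over the cluster's members, which is what one column of the centroid table holds.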
  • John_Davis
    John_Davis New Altair Community Member
    Hello,

    I think I was not so clear in my first post.

    I understand that when using the k-means operator, one can look through the example set at each cluster's centroid (i.e. the attribute values of each cluster's centroid). My question is about the values given in the k-means centroid table. For example, when applying k-means to textual data (k=3 clusters), one could end up with a centroid table like:

    ATTRIBUTE    cluster_1    cluster_2    cluster_3
    word x       0.2          0.01         0.2
    word y       0.4          0.3          0.01
    word z       0            0.03         0.002

    What are the values in each column?

    Yours

    John
                                                                       
  • MariusHelf
    MariusHelf New Altair Community Member
    Hi John,

    Do you mean how to interpret the values, or what they represent? They are the normalized TF-IDF values of the centroids. The TF-IDF values are created by the Process Documents operator, and you will find plenty of information if you search for TF-IDF. Basically, it is a kind of smart counting of the words in the documents.

    Best regards,
    Marius
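
As a rough illustration of where such numbers can come from, the sketch below computes textbook TF-IDF weights, L2-normalizes each document vector, and averages the vectors. The weighting formula (term frequency times log(N/df)), the function name `tfidf_vectors`, and the toy documents are all assumptions for this example; the exact formula and normalization used by the Process Documents operator may differ.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for tokenized docs.

    Assumes every document contains at least one token.
    """
    vocab = sorted({w for doc in docs for w in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each word
    df = Counter(w for doc in docs for w in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Textbook weighting (an assumption): relative frequency * log(N / df)
        vec = [(counts[w] / len(doc)) * math.log(n_docs / df[w]) for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec))
        # Scale to unit length; leave an all-zero vector unchanged
        vectors.append([v / norm if norm > 0 else 0.0 for v in vec])
    return vocab, vectors

# Hypothetical documents, treated here as the members of a single cluster
docs = [["apple", "banana"], ["apple", "cherry"], ["banana", "banana", "cherry"]]
vocab, vectors = tfidf_vectors(docs)

# The centroid is the word-wise mean of the normalized vectors
centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
for word, value in zip(vocab, centroid):
    print(f"{word}\t{value:.3f}")
```

Under these assumptions, each printed value corresponds to one cell of the centroid table: the average normalized TF-IDF weight of a word over the documents in that cluster.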
  • John_Davis
    John_Davis New Altair Community Member
    Thanks a lot. I am familiar with this statistic.

    Yours
    John