"k-means and its centroid table values SOLVED"
John_Davis
New Altair Community Member
Hi,
The k-means operator in RapidMiner gives us a centroid table in which each cluster contains items and corresponding values. What are these values: TF-IDF, chi-squared, information rate, ...?
Yours
John Davis
Answers
Hi John,
those are probably the columns that were present in your data.
k-Means defines clusters by their central data point, i.e. the average of all elements in the cluster. These so-called centroids are listed in the centroid table, where each column contains the attribute values of one centroid.
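To make the "average of all elements" idea concrete, here is a minimal Python sketch. It is not RapidMiner code; the function name `centroid` and the sample rows are made up for illustration.

```python
def centroid(points):
    """Return the attribute-wise mean of a list of equal-length numeric rows."""
    n = len(points)
    # zip(*points) groups the values of each attribute (column) together.
    return [sum(column) / n for column in zip(*points)]


# Hypothetical cluster with three examples and two attributes.
cluster = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
cluster_centroid = centroid(cluster)
print(cluster_centroid)
```

Each entry of the result corresponds to one row of the centroid table for that cluster.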
Best regards,
Marius
Hello,
I think I was not so clear in my first post.
I understand that when using the k-means operator, one can look at each cluster's centroid (i.e. the attribute values of each cluster's centroid) through the example set. My question is about the values that are given in the k-means spreadsheet. For example, when applying k-means to textual data (k=3 clusters), one could end up with a k-means spreadsheet like:
ATTRIBUTE  cluster_1  cluster_2  cluster_3
word x     0.2        0.01       0.2
word y     0.4        0.3        0.01
word z     0          0.03       0.002
What are the values in each column?
Yours
John
Hi John,
do you mean how to interpret the values, or what they represent? They are the normalized TF-IDF values of the centroids. The TF-IDF values are created by the Process Documents operator, and you will find plenty of information if you search for TF-IDF. Basically, it is a kind of smart counting of words in the documents.
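As a rough illustration of that "smart counting", here is a small Python sketch. It is only one common TF-IDF variant and an assumption on my side, not necessarily the exact formula RapidMiner uses: term frequency is count divided by document length, inverse document frequency is ln(N / df), and each document vector is scaled to unit length. The function name `tfidf_vectors` and the toy documents are made up.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, unit-length TF-IDF vectors) for tokenized documents."""
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents does each word occur?
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Assumed weighting: tf = count / doc length, idf = ln(N / df).
        raw = [(counts[w] / len(doc)) * math.log(n_docs / df[w]) for w in vocab]
        # Assumed normalization: scale each vector to Euclidean length 1.
        norm = math.sqrt(sum(v * v for v in raw))
        vectors.append([v / norm if norm else 0.0 for v in raw])
    return vocab, vectors


# Toy corpus, already tokenized.
docs = [["cat", "sat"], ["cat", "ran"], ["dog", "ran", "ran"]]
vocab, vectors = tfidf_vectors(docs)

# A centroid is the attribute-wise mean of the vectors in one cluster,
# here hypothetically the first two documents.
members = vectors[:2]
centroid = [sum(col) / len(members) for col in zip(*members)]
for word, value in zip(vocab, centroid):
    print(word, round(value, 3))
```

Under these assumptions, a value in the centroid table is the mean normalized TF-IDF weight of that word over all documents in the cluster.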
Best regards,
Marius
Thanks a lot. I'm familiar with this numerical statistic.
Yours
John