Validation of k-means Clustering
tiramisusann
New Altair Community Member
Hi everybody,
I need to validate my k-means clustering by internal measures, but I am not quite sure how to do that.
First, I want to compute the Davies-Bouldin index for different values of k and compare the results to choose the best k. But why is the DB index negative? Does it mean that a DB of "-5" is better than a DB of "-1", since I have to choose the smallest DB index for an optimal clustering?
What other possibilities do I have in RM to check validity? I read about the sum of squares, which I can obtain through "Item Distribution Performance". But I am not sure whether I am getting the total, between, or within sum of squares.
Does anybody know? I would really appreciate your help and your ideas.
Best,
tiramisusann
Answers
Hello tiramisusann
This link might help...
rapidminernotes.blogspot.com/search/label/Clustering
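The values are negative because some operators work by trying to maximise performance. The Davies-Bouldin index is a measure where smaller is better, so a negated value that tends to 0 fits this requirement. In reality the absolute value is the one to use, so a DB of "-1" (absolute value 1) is better than a DB of "-5" (absolute value 5).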
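To make the sign convention concrete, here is a minimal sketch in plain Python of the textbook Davies-Bouldin index with Euclidean distance. It is not RapidMiner's implementation, which may differ in detail; the toy points, the two hand-made label assignments and the names `davies_bouldin`, `reported_good` and `reported_bad` are assumptions for illustration only. The index itself can never be negative, and negating it only turns "smaller is better" into "larger is better".

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(points, labels):
    """Textbook Davies-Bouldin index: for each cluster take the worst
    (largest) ratio (s_i + s_j) / d_ij against any other cluster, then
    average over clusters. Always >= 0; lower means better separation."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    ids = sorted(clusters)
    cents = {c: centroid(clusters[c]) for c in ids}
    # s_i: mean distance of a cluster's points to its own centroid
    scatter = {
        c: sum(math.dist(p, cents[c]) for p in clusters[c]) / len(clusters[c])
        for c in ids
    }
    total = 0.0
    for i in ids:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in ids
            if j != i
        )
    return total / len(ids)


# Toy data (made up): two tight, well-separated groups of four points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
good_labels = [0, 0, 0, 0, 1, 1, 1, 1]  # matches the two groups
bad_labels = [0, 1, 0, 1, 0, 1, 0, 1]   # mixes the two groups

db_good = davies_bouldin(points, good_labels)
db_bad = davies_bouldin(points, bad_labels)

# Negated values, as a tool that maximises performance might report them:
# the better clustering now has the value closer to 0.
reported_good = -db_good
reported_bad = -db_bad

print("DB (good labels):", db_good, "reported as", reported_good)
print("DB (bad labels): ", db_bad, "reported as", reported_bad)
```

When comparing several values of k, the same rule applies: pick the k whose reported value is closest to 0, i.e. whose absolute value is smallest.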
regards,
Andrew