Hello,
I want to compare clusterings. Which operators should I use to evaluate them? And how do I find the optimal parameters for each clustering method?
Thanks
Hi,
finding optimal settings for clustering is indeed a bit tricky.
But RapidMiner offers performance measures for clustering or segmentation tasks.
In the Operator list under Validation -> Segmentation you'll find the corresponding Operators.
If you have a subset of your data where you know exactly which cluster each example belongs to, you can also try setting the cluster Attribute as the prediction and optimizing the classification performance instead.
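To illustrate the labeled-subset idea outside of RapidMiner, here is a minimal Python sketch. It assumes you already have one cluster id and one known label per example; the function name `cluster_accuracy` and the toy data are made up for illustration. Because cluster ids are arbitrary, each cluster is first mapped to its most frequent known label, and the share of examples matching that label is then counted like a classification accuracy.

```python
from collections import Counter

def cluster_accuracy(cluster_ids, true_labels):
    """Share of examples whose known label equals the majority label of their cluster."""
    counts_per_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        counts_per_cluster.setdefault(cluster, Counter())[label] += 1
    # For each cluster, count the examples that carry its most frequent label.
    correct = sum(c.most_common(1)[0][1] for c in counts_per_cluster.values())
    return correct / len(true_labels)

# Toy data (hypothetical): six examples, two clusters, two known classes.
cluster_ids = [0, 0, 0, 1, 1, 1]
true_labels = ["a", "a", "b", "b", "b", "b"]
print(cluster_accuracy(cluster_ids, true_labels))
```

A score like this can then be compared across parameter settings (for example different values of k), with higher being better. Note that it trivially rises as the number of clusters grows, so it is best used to compare settings with a similar number of clusters.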
Best,
David
Hello,
What do these values mean?
avg within centroid distance: -1.0876
davies bouldin: -5.675
I also used Silhouette. What do these results show? Please guide me.
Thanks
Hi again,
I guess the Silhouette performance comes from a third-party extension, so I can't say much about it. But Wikipedia has an entry about it:
https://en.wikipedia.org/wiki/Silhouette_(clustering)
In short, it measures how similar an Example is to the rest of its own cluster compared with the other clusters. The value is normalized between -1 and +1, and a higher value indicates higher similarity.
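As a rough illustration of the silhouette definition from the Wikipedia entry (not of the extension's actual implementation, which I don't know), here is a small pure-Python sketch. It assumes Euclidean distance and at least two clusters; `silhouette_values` and the toy points are hypothetical names and data. For each point, a is the mean distance to the other points in its own cluster, b is the smallest mean distance to any other cluster, and the silhouette is (b - a) / max(a, b).

```python
import math

def silhouette_values(points, labels):
    """Return one silhouette value per point (Euclidean distance, >= 2 clusters)."""
    clusters = sorted(set(labels))
    values = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself.
        mean_to = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_to[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_to:
            # p is alone in its cluster; by convention its silhouette is 0.
            values.append(0.0)
            continue
        a = mean_to[own]
        b = min(v for c, v in mean_to.items() if c != own)
        values.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return values

# Toy data (hypothetical): two tight groups far apart.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_values(points, [0, 0, 1, 1])  # clusters match the groups
bad = silhouette_values(points, [0, 1, 0, 1])   # clusters cut across the groups
print(sum(good) / len(good), sum(bad) / len(bad))
```

With the matching assignment all values are close to +1, while the assignment that cuts across the groups gives negative values. Averaging the values per cluster gives a per-cluster silhouette.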
The Davies–Bouldin criterion is also explained quite well on Wikipedia:
https://en.wikipedia.org/wiki/Davies%E2%80%93Bouldin_index
The idea is to maximize the inter-cluster distance (the difference between the different clusters) and minimize the intra-cluster distances (the points within each cluster should be close together). Here a lower index is better.
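To make the index concrete, here is a minimal pure-Python sketch of the textbook Davies–Bouldin definition. It assumes Euclidean distance and at least two clusters with distinct centroids; `davies_bouldin` and the toy points are hypothetical. The textbook index is never negative, so the sign and scale of what a particular tool prints may differ from what this sketch returns.

```python
import math

def davies_bouldin(points, labels):
    """Textbook Davies-Bouldin index (Euclidean distance); lower is better."""
    clusters = sorted(set(labels))
    centroids, scatter = {}, {}
    for c in clusters:
        members = [p for p, l in zip(points, labels) if l == c]
        centroid = tuple(sum(coord) / len(members) for coord in zip(*members))
        centroids[c] = centroid
        # Scatter: average distance of the cluster's points to its centroid.
        scatter[c] = sum(math.dist(p, centroid) for p in members) / len(members)
    total = 0.0
    for c in clusters:
        # Worst-case ratio of combined scatter to centroid separation.
        total += max(
            (scatter[c] + scatter[d]) / math.dist(centroids[c], centroids[d])
            for d in clusters if d != c
        )
    return total / len(clusters)

# Toy data (hypothetical): two tight groups far apart.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(davies_bouldin(points, [0, 0, 1, 1]))  # compact, well separated: small value
print(davies_bouldin(points, [0, 1, 0, 1]))  # spread out, overlapping: large value
```

The per-cluster `scatter` computed here is the average distance of a cluster's points to their centroid, so a small scatter combined with well-separated centroids yields a low index.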
Hello,
Many thanks.
The criterion AVG within centroid distance is -1.043. What does this mean? And what does the Silhouette of each cluster show in the first photo?