
How to evaluate clustering

User: "ahootanha"
New Altair Community Member
Updated by Jocelyn

Hello,
I want to compare and evaluate clusterings. Which operators should I use?
Also, how do I find the optimal parameters for each clustering method?
Thanks

    User: "David_A"
    New Altair Community Member

    Hi,

     

    Finding optimal settings for clustering is indeed a bit tricky, but RapidMiner offers performance measures for clustering or segmentation tasks.

    In the Operators list under Validation -> Segmentation you'll find the corresponding operators.
    If you have a subset of your data for which you know exactly which cluster each example belongs to, you can also set the cluster attribute as the prediction and optimize the classification performance instead.

    (Attached screenshot: cluster_performance.png)
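
    Outside RapidMiner, the labeled-subset idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function names `majority_mapping` and `cluster_accuracy` and the toy data are hypothetical, and mapping each cluster to its most frequent known label is one simple way to turn cluster ids into predictions, not necessarily what any RapidMiner operator does internally.

```python
from collections import Counter, defaultdict

def majority_mapping(cluster_ids, true_labels):
    """Map each cluster id to the known label that occurs most often in it."""
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster[cluster][label] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}

def cluster_accuracy(cluster_ids, true_labels):
    """Treat the mapped cluster id as a prediction and compute plain accuracy."""
    mapping = majority_mapping(cluster_ids, true_labels)
    hits = sum(mapping[c] == t for c, t in zip(cluster_ids, true_labels))
    return hits / len(true_labels)

# Hypothetical toy data: six examples with known labels "a" / "b".
cluster_ids = ["cluster_0", "cluster_0", "cluster_0",
               "cluster_1", "cluster_1", "cluster_1"]
true_labels = ["a", "a", "b", "b", "b", "b"]

print(majority_mapping(cluster_ids, true_labels))
print(cluster_accuracy(cluster_ids, true_labels))
```

    Repeating this for different parameter settings (for example different values of k) and keeping the setting with the highest accuracy is one way to use the labeled subset for tuning.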

     

     

     

    Best,
    David

    User: "ahootanha"
    New Altair Community Member
    OP

    Hello,
    What do these values mean?
    avg within centroid distance: -1.0876
    davies bouldin: -5.675

    User: "ahootanha"
    New Altair Community Member
    OP

    I used Silhouette.
    What do these results show?
    Please advise.
    Thanks

    (Attached images: مهم.JPG, مهم۲.JPG)

    User: "David_A"
    New Altair Community Member

    Hi again,

     

    I guess the Silhouette performance comes from a third-party extension, so I can't say much about it. But Wikipedia has an entry about it:

    https://en.wikipedia.org/wiki/Silhouette_(clustering)

    In short, it measures how similar an example is to the rest of its own cluster compared with the nearest other cluster. The value is normalized between -1 and +1, and a higher value indicates that the example fits its cluster better.

     

    The Davies–Bouldin criterion is also explained quite well on Wikipedia:

    https://en.wikipedia.org/wiki/Davies%E2%80%93Bouldin_index

    The idea is to maximize the inter-cluster distance (the separation between the different clusters) and minimize the intra-cluster distances (the points within each cluster should be close together). Here a lower index is better.
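
    To make both measures concrete, here is a minimal plain-Python sketch that follows the definitions on the Wikipedia pages linked above. It is not the RapidMiner or extension implementation; the function names, the Euclidean distance, and the toy points are assumptions for illustration, and it expects at least two clusters.

```python
import math
from collections import defaultdict

def group_by_cluster(points, labels):
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters

def silhouette_values(points, labels):
    """Per-example silhouette s = (b - a) / max(a, b), always in [-1, +1]."""
    clusters = group_by_cluster(points, labels)
    values = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            values.append(0.0)  # usual convention for single-example clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(math.dist(point, q) for q in other) / len(other)
                for key, other in clusters.items() if key != label)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values

def davies_bouldin(points, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance."""
    clusters = group_by_cluster(points, labels)
    centroids = {k: tuple(sum(c) / len(v) for c in zip(*v))
                 for k, v in clusters.items()}
    scatter = {k: sum(math.dist(p, centroids[k]) for p in v) / len(v)
               for k, v in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)

# Hypothetical toy data: two tight, well-separated groups of four points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
good_labels = [0, 0, 0, 0, 1, 1, 1, 1]   # matches the two groups
bad_labels = [0, 1, 0, 1, 0, 1, 0, 1]    # mixes the two groups

for name, labels in [("good", good_labels), ("bad", bad_labels)]:
    sil = silhouette_values(points, labels)
    print(name, "mean silhouette:", sum(sil) / len(sil),
          "Davies-Bouldin:", davies_bouldin(points, labels))
```

    The clustering that matches the two groups should give a mean silhouette close to +1 and a small Davies–Bouldin index, while the mixed labeling should do worse on both.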

     

     

    Best,
    David

    User: "ahootanha"
    New Altair Community Member
    OP

    Hello,
    Many thanks.
    What does the criterion "avg within centroid distance: -1.043" mean?
    And what does the Silhouette of each cluster show in the first photo?