Problem with hierarchical clustering

elena20 New Altair Community Member
edited November 5 in Community Q&A

Hello. I used the Process Documents from Data operator with TF-IDF,
then the Top Down Clustering and Agglomerative Clustering operators.
How do I optimize the number of clusters?
And how do I evaluate them?
Can I use the Cluster Distance Performance operator?
Please help.
Thank you.

Answers

  • MartinLiebig
    Altair Employee

    Hi @elena20,

    Please have a look at the operator "Flatten Clustering". It reduces the hierarchy to n leaves, that is, a flat clustering with n clusters. Afterwards you can proceed with the usual cluster performance measures.
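
For readers who want to see the same idea outside RapidMiner, here is a minimal Python sketch. It is not the "Flatten Clustering" operator itself: it assumes SciPy's `fcluster` as a stand-in for flattening and scikit-learn's `davies_bouldin_score` as one of the usual performance measures. The data are synthetic points, not real TF-IDF vectors, and Ward linkage is an arbitrary choice.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import davies_bouldin_score

# Synthetic stand-in for a document matrix: three well-separated groups of points.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

# Build the full agglomerative hierarchy (Ward linkage is an assumption here).
Z = linkage(X, method="ward")

# "Flatten" the hierarchy: cut the tree so that at most n_clusters flat clusters remain.
n_clusters = 3
labels = fcluster(Z, t=n_clusters, criterion="maxclust")

# With flat labels, standard clustering metrics apply; a lower DB index is better.
db_index = davies_bouldin_score(X, labels)
print(len(set(labels)), round(db_index, 3))
```

Once the tree is cut into flat labels, the hierarchical result can be scored exactly like a k-means result.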

     

    Best,

    Martin

  • elena20
    elena20 New Altair Community Member

    Thank you very much.
    But how can I evaluate the hierarchical clustering? Do you send a sample without wounding?
    Thank you.

  • Telcontar120
    Telcontar120 New Altair Community Member

    I don't understand your last question at all, but you can use any standard clustering performance metric, such as the Davies-Bouldin (DB) index. However, since clustering is unsupervised, I would say your own use case should guide your evaluation at least as much as any formal metric. What are you clustering, and for what purpose? Based on that purpose, how many clusters are reasonable and how many are too many?
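
One rough way to combine a formal metric with the "how many clusters" question is to cut the same hierarchy at several values of k and compare the DB index for each. The Python sketch below assumes SciPy and scikit-learn rather than RapidMiner operators, uses synthetic data, and picks an arbitrary range of k. Treat the k with the lowest score as a candidate to check against your use case, not as a final answer.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import davies_bouldin_score

# Synthetic data with four planted groups (an assumption for illustration only).
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 8.0]])
X = np.vstack([c + rng.normal(scale=0.4, size=(25, 2)) for c in centers])

# The hierarchy is built once; each k is just a different cut of the same tree.
Z = linkage(X, method="ward")

scores = {}
for k in range(2, 9):
    labels = fcluster(Z, t=k, criterion="maxclust")
    scores[k] = davies_bouldin_score(X, labels)  # lower is better

best_k = min(scores, key=scores.get)
for k, score in scores.items():
    print(k, round(score, 3))
print("lowest DB index at k =", best_k)
```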

     

  • elena20
    elena20 New Altair Community Member

    Hello, and thank you so much.
    I want to do hierarchical clustering on Twitter data and then compare it with k-means clustering. Is that possible?
    Which operator can I use to evaluate the hierarchical results?
    The Cluster Distance Performance operator gives an error.
    Thank you.
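
As a rough illustration of the comparison described here, the following Python sketch clusters a few TF-IDF vectors with both an agglomerative method and k-means, then scores both with the same metric. It assumes scikit-learn rather than RapidMiner operators. The six texts are made up and only stand in for tweets, and k = 2 is chosen by hand.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Made-up short texts standing in for tweets (hypothetical data).
docs = [
    "football match goal tonight",
    "great goal in the football match",
    "football team wins the match",
    "new phone battery lasts long",
    "phone camera and battery review",
    "battery life of the new phone",
]

# TF-IDF vectors; dense array because Ward linkage needs dense input.
X = TfidfVectorizer(stop_words="english").fit_transform(docs).toarray()

k = 2
hier_labels = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X)
km_labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# Score both flat clusterings with the same metric (higher silhouette is better).
results = {
    "hierarchical": silhouette_score(X, hier_labels),
    "kmeans": silhouette_score(X, km_labels),
}
# Adjusted Rand index: how closely the two partitions agree (1.0 means identical).
agreement = adjusted_rand_score(hier_labels, km_labels)

print(results)
print("agreement:", agreement)
```

Because both methods end up as flat label vectors, any metric that works for k-means can be applied to the hierarchical result as well.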