Problem with hierarchical clustering
Hello. I used the Process Documents from Data operator with TF-IDF.
I then used the Top Down Clustering and Agglomerative Clustering operators.
How do I optimize the number of clusters?
And how do I evaluate them?
Can I use the Cluster Distance Performance operator?
Please help.
Thank you
Answers
Hi @elena20,
please have a look at the operator "Flatten Clustering". This reduces the hierarchy to n leaves, i.e. a flat clustering with n clusters. Afterwards you can proceed with the usual cluster performance measures.
Best,
Martin
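For readers who want to see what flattening a hierarchy to n clusters means outside of RapidMiner, here is a minimal sketch in plain Python. It is not the implementation behind "Flatten Clustering"; the function name `agglomerative_flat`, the choice of average linkage, and the toy 2-D points are all assumptions made for illustration. With real text data, each point would be a TF-IDF vector instead.

```python
import math


def agglomerative_flat(points, n_clusters):
    """Naive average-linkage agglomerative clustering, stopped at n_clusters.

    Stopping the merge loop early gives the same groups as building the
    full dendrogram and cutting it where exactly n_clusters remain.
    """
    # Start with every point in its own cluster (stored as lists of indices).
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge the second into the first.
        _, i, j = min(
            ((linkage(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda t: t[0],
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    # Convert the groups into one flat label per point.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


# Toy data (hypothetical): two tight groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (9.0, 0.0)]
labels = agglomerative_flat(points, n_clusters=3)
```

Once every example has a single flat label like this, the result can be scored with the same measures used for k-means.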
Thank you very much.
But how can I evaluate the hierarchical clustering results? Do you send a sample without wounding?
Thank you
I don't understand your last question at all, but you can use any standard clustering performance metric, such as the Davies-Bouldin (DB) index. However, since clustering is unsupervised, I would say your own use case should guide your evaluation at least as much as any formal metric. What are you clustering, and for what purpose? Based on that purpose, how many clusters are reasonable, and how many are too many? Etc.
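To make the Davies-Bouldin idea concrete, here is a rough sketch in plain Python that scores a few candidate flat clusterings and picks the number of clusters with the lowest index (lower is better). The helper name `davies_bouldin`, the toy points, and the hand-written candidate labelings are assumptions for illustration, not output from RapidMiner. Libraries such as scikit-learn offer an equivalent (`sklearn.metrics.davies_bouldin_score`).

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index for a flat clustering (lower is better)."""
    # Group the points by their cluster label.
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    if len(groups) < 2:
        raise ValueError("need at least two clusters")

    # Centroid and mean distance-to-centroid (scatter) for each cluster.
    centroids, scatter = {}, {}
    for lab, members in groups.items():
        c = tuple(sum(coord) / len(members) for coord in zip(*members))
        centroids[lab] = c
        scatter[lab] = sum(math.dist(p, c) for p in members) / len(members)

    # For each cluster, keep its worst (largest) scatter-to-separation ratio.
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in groups if j != i)
        for i in groups
    ]
    return sum(worst) / len(worst)


# Toy data (hypothetical): two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Candidate flat clusterings, e.g. one hierarchy flattened to 2 and 3 clusters.
candidates = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 2, 2, 2],
}

scores = {k: davies_bouldin(points, labs) for k, labs in candidates.items()}
best_k = min(scores, key=scores.get)  # number of clusters with the lowest index
```

The same scoring function can be applied to k-means labels, which gives one way to compare the two methods on equal terms. As noted above, a formal metric like this is only one input; the purpose of the clustering should still guide the final choice.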
Hello,
Thank you so much.
I want to run hierarchical clustering on Twitter data and then compare it with k-means clustering. Is he honey
Which operator should I use to evaluate the hierarchical results?
The Cluster Distance Performance operator gives an error.
Thank you