optimal number of clusters in fuzzy c-means
farzane
New Altair Community Member
Hi
I'm using fuzzy c-means to cluster some text data. How can I find the optimal number of clusters? Is intar_cluster_distance a good measure?
Answers
-
I assume that you are talking about the Fuzzy C-Means operator from the Information Selection extension? The key to finding an optimum k is to create an optimisation loop, e.g. using Optimize Parameters (Grid), which varies the number of clusters and records some performance measure for each value.
If you are interested only in the final cluster allocation, there are several possible solutions. However, because Fuzzy C-Means does not return a centroid table (as k-Means does), you will not be able to use the Davies-Bouldin measure from Cluster Distance Performance. Instead, you can rely on the commonly used Item Distribution Performance (e.g. the Sum of Squares measure) and plot it against k to find the "optimum" cluster number with the "elbow method". Alternatively, you could use a combination of Data to Similarity and Cluster Density Performance to optimise the average cluster density.
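Outside RapidMiner, the same "vary k and look for the elbow" loop can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `fuzzy_c_means` and `hard_sse` are hypothetical helpers written in plain NumPy, not the Information Selection operator, the data is synthetic, and the score is the ordinary within-cluster sum of squared distances after assigning each example to its strongest cluster, which is not necessarily what Item Distribution Performance computes.

```python
import numpy as np

def fuzzy_c_means(X, k, m=2.0, n_iter=200, tol=1e-5, seed=0):
    """Plain NumPy fuzzy c-means sketch; returns (centroids, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every example to every centroid (n x k).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute centroids so they match the final memberships.
    W = U ** m
    centroids = (W.T @ X) / W.sum(axis=0)[:, None]
    return centroids, U

def hard_sse(X, centroids, U):
    """Sum of squared distances after assigning each example to its strongest cluster."""
    labels = U.argmax(axis=1)
    return float(((X - centroids[labels]) ** 2).sum())

# Synthetic stand-in for a document-vector table: three well-separated blobs.
rng = np.random.default_rng(42)
centres = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centres])

sse_by_k = {}
for k in range(1, 7):
    centroids, U = fuzzy_c_means(X, k)
    sse_by_k[k] = hard_sse(X, centroids, U)
    print(k, round(sse_by_k[k], 1))
```

Plotting `sse_by_k` against k and looking for the point where the curve flattens gives the elbow. For real text data you would replace `X` with your TF-IDF (or similar) vectors.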
Note, however, that the whole point of using Fuzzy C-Means is to utilise the fuzzy membership of examples in each cluster. If your aim is to consider all possible cluster memberships, there are no obvious performance measures available in RapidMiner. You could create your own measure by weighting different clustering performance indicators with the cluster membership confidence factors.
The Information Selection extension also provides two performance operators worth investigating here. One calculates the within-cluster distance variance; unfortunately, it does not take the fuzzy cluster membership into consideration.
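To make the idea of a membership-weighted measure concrete, here is a minimal Python sketch. It is not a RapidMiner operator: the function names are my own, the tiny data set is made up, and I am assuming you can export the membership confidences as a matrix `U` (one row per example, one column per cluster) and compute centroids yourself. The first function weights each squared distance by the membership raised to the fuzzifier m; the second is the partition coefficient, a standard fuzziness index that ranges from 1/k (completely fuzzy) to 1 (crisp).

```python
import numpy as np

def fuzzy_within_cluster_sse(X, centroids, U, m=2.0):
    """Within-cluster sum of squares, weighted by memberships raised to m."""
    # Squared distance from every example to every centroid (n x k).
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(((U ** m) * d2).sum())

def partition_coefficient(U):
    """Mean of squared memberships: 1/k when fully fuzzy, 1.0 when crisp."""
    return float((U ** 2).sum() / U.shape[0])

# Made-up example: four points, two clusters.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])
centroids = np.array([[0.0, 0.5], [4.0, 0.5]])
crisp = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
fuzzy = np.full((4, 2), 0.5)  # every point belongs equally to both clusters

print(fuzzy_within_cluster_sse(X, centroids, crisp), partition_coefficient(crisp))
print(fuzzy_within_cluster_sse(X, centroids, fuzzy), partition_coefficient(fuzzy))
```

Both numbers can be logged for each k inside the optimisation loop. A lower weighted sum of squares and a higher partition coefficient indicate tighter, less ambiguous clusters, although both tend to shift with k on their own, so they are best read as curves rather than as single values.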
Jacob
@jacobcybulski
Thank you so much. The problem has been solved.
Hi, @farzane
Which solution did you use? Could you explain it to me, please?
You can mention me in this discussion or send it to my email, endirizal.f@gmail.com.
Thank you for your help.
Endirizalf