K-means cluster with text data

New Altair Community Member

Nov 22, 2019

Updated Nov 5, 2024 by Jocelyn

Hello experts!

I'd like to run k-means clustering on text data. My data is saved in a single Excel file with only one column, and each cell contains one word. I'm not sure whether I'm doing it correctly (picture attached), because the output looks like the list below, with cluster 3 holding 4889 of the 4997 items:

Cluster 0: 20 items
Cluster 1: 18 items
Cluster 2: 20 items
Cluster 3: 4889 items
Cluster 4: 20 items
Cluster 5: 10 items
Cluster 6: 10 items
Cluster 7: 10 items
Total number of items: 4997

Image: https://us.v-cdn.net/6030995/uploads/editor/89/5rhn66xvmsgn.png
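
A possible explanation for the lopsided result, offered as an assumption since the process is only visible in the screenshot: if each cell is turned into a term vector, a one-word document becomes a vector with a single non-zero entry. Any two different words are then exactly the same distance apart, so k-means has no real structure to find. The sketch below illustrates this in plain Python with a hypothetical word list (not the data from the post).

```python
import math
from itertools import combinations

# Hypothetical sample; the real spreadsheet column is not shown in the post.
words = ["apple", "banana", "cherry", "apple", "grape"]

# One dimension per distinct word, as a bag-of-words encoding would produce.
vocab = sorted(set(words))
index = {w: i for i, w in enumerate(vocab)}


def one_hot(word):
    """Return the term vector of a document that contains exactly one word."""
    vec = [0.0] * len(vocab)
    vec[index[word]] = 1.0
    return vec


vectors = [one_hot(w) for w in words]

# Euclidean distance between every pair of one-word documents.
pair_distances = [
    math.dist(vectors[i], vectors[j])
    for i, j in combinations(range(len(words)), 2)
]

# Only two values occur: 0 for identical words, sqrt(2) for different words.
distinct = {round(d, 6) for d in pair_distances}
print(distinct)
```

If this assumption matches the process, an encoding that captures similarity between words (for example character n-grams or word embeddings) would likely be needed before clustering can separate them meaningfully.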

Also, is it possible to use something like silhouette scores to determine the ideal number of clusters? Thank you!
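
To make the silhouette idea concrete, here is a minimal, tool-independent sketch in plain Python. It is not how AI Studio computes anything; the toy 2-D points, the `kmeans` helper with its deterministic farthest-point initialization, and the candidate range of k are all assumptions chosen for illustration. For each k it clusters the points, computes the mean silhouette score, and keeps the k with the highest score.

```python
import math


def kmeans(points, k, max_iter=50):
    """Tiny k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centers[i])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, l in zip(points, labels) if l == i]
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:  # convention: singleton clusters score 0
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != l
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (an assumption): three well-separated groups of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]

# Score each candidate k and pick the one with the highest mean silhouette.
scores = {k: silhouette_score(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(k, round(s, 3))
print("best k:", best_k)
```

Scores close to 1 indicate compact, well-separated clusters, while scores near 0 or below suggest overlapping or arbitrary ones. On real text data the same loop applies once the documents have been converted to numeric vectors.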

Tags: AI Studio, k-Means Clustering, Text Mining + NLP
