clustering
laurab
Hi,
I am using a very large dataset to train a model. I want to improve the prediction results by breaking the dataset down into smaller groups that have similar trends, rather than keeping one large group with lots of different trends. The data is so large and complex, and I am not familiar enough with it to break it down into suitable subgroups by hand, so I have to use a clustering model.
I am using k-means clustering. I am also using the EvolutionaryParameterOptimizer to establish the optimum number of clusters k. The problem is that I cannot see any distinguishing or correlating aspects between the clusters. What should I be looking for?
Am I using the best model for the task?
Thanks
Laura
land
Hi Laura,
I'm not sure this approach is suitable after all. KMeans groups the most similar examples into the same cluster. Often these examples then belong to the same class, which makes correct prediction more difficult because of the class imbalance problem.
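One way to check whether this is happening is to look at the class distribution inside each cluster. The following is only a minimal sketch in plain Python, not something taken from your process: the function name `class_share_per_cluster` and the toy cluster ids and labels are made up for illustration, and you would replace them with the cluster assignment and label columns exported from your own data.

```python
from collections import Counter, defaultdict

def class_share_per_cluster(cluster_ids, class_labels):
    """Return, for every cluster, the fraction of its examples in each class."""
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        counts[cluster][label] += 1
    return {
        cluster: {label: n / sum(cnt.values()) for label, n in cnt.items()}
        for cluster, cnt in counts.items()
    }

# Hypothetical toy data: one cluster id and one class label per example.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
class_labels = ["yes", "yes", "yes", "no", "no", "no", "no", "no", "yes", "no"]

shares = class_share_per_cluster(cluster_ids, class_labels)
for cluster in sorted(shares):
    print(cluster, shares[cluster])
```

A cluster whose examples almost all share one class (cluster 1 in the toy data) leaves a model trained on that cluster very little to learn from.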
This might work if you have very inhomogeneous data and keep the number of clusters small enough. But since clustering does not provide a clear criterion to optimize, you will have to either guess the number of clusters or include the subsequent classification in the optimization. The latter, however, might take a huge amount of computing power.
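To illustrate what including the classification in the optimization could look like, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the data is synthetic, the k-means is a bare-bones Lloyd's algorithm, and the "model" per cluster is just a majority-class vote standing in for whatever learner you actually use. The point is only that each candidate k is scored by the prediction accuracy on held-out data rather than by a clustering measure.

```python
import random
from collections import Counter

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Bare-bones Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def downstream_accuracy(k, train, valid):
    """Cluster the training rows, fit one trivial model per cluster
    (majority class), and return its accuracy on the validation rows."""
    centroids = kmeans([x for x, _ in train], k)
    votes = [Counter() for _ in range(k)]
    for x, y in train:
        votes[nearest(x, centroids)][y] += 1
    overall = Counter(y for _, y in train).most_common(1)[0][0]
    model = [v.most_common(1)[0][0] if v else overall for v in votes]
    hits = sum(model[nearest(x, centroids)] == y for x, y in valid)
    return hits / len(valid)

# Synthetic data (assumption): four 2-D blobs, two classes.
rng = random.Random(42)
centers = {(0.0, 0.0): "a", (5.0, 5.0): "b", (0.0, 5.0): "b", (5.0, 0.0): "a"}
rows = [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
        for (cx, cy), label in centers.items() for _ in range(50)]
rng.shuffle(rows)
train, valid = rows[:150], rows[150:]

# Score every candidate k by the downstream prediction quality.
results = {k: downstream_accuracy(k, train, valid) for k in (1, 2, 3, 4, 6)}
best_k = max(results, key=results.get)
print(results, "best k:", best_k)
```

Note that every candidate k needs a full round of clustering plus model training, which is where the computational cost comes from. Including k = 1 among the candidates also tells you whether splitting the data helps at all compared with a single model.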
Greetings,
Sebastian