newbie requires advice to select a clustering algorithm

New Altair Community Member

Feb 21, 2010

Updated Nov 5, 2024 by Jocelyn

Hello,

I discovered RapidMiner yesterday after several hours of research into data clustering (it looks very nice and friendly). I need a little bit of help in selecting an algorithm for what is most likely a simple case.

I have a series of events that happened at irregular intervals in time. I would like to determine which of those events form clusters, where in my case a cluster is made up of adjacent events that are closer together in time than a given threshold. The time span of the cluster does not matter (so 3 events 10 seconds apart or 20 events at 1-minute intervals are both valid clusters; I only care about the distance between two successive events).
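To make the definition concrete, here is a minimal Python sketch of that rule (not RapidMiner code). The function name `split_by_gap`, the sample timestamps, and the 90-second threshold are all made up for illustration; it assumes timestamps are plain numbers in seconds and that "closer than" means a strictly smaller gap.

```python
def split_by_gap(timestamps, threshold):
    """Group event times into runs whose successive gaps are < threshold.

    Hypothetical helper: a new group starts whenever the gap to the
    previous event is not strictly below the threshold.
    """
    groups = []
    for t in sorted(timestamps):
        if groups and t - groups[-1][-1] < threshold:
            groups[-1].append(t)   # close enough: extend the current run
        else:
            groups.append([t])     # gap too large (or first event): new run
    return groups


# Illustrative event times in seconds (invented data).
events = [0, 10, 20, 300, 360, 420, 480, 2000]
groups = split_by_gap(events, threshold=90)

# Runs with a single event are isolated events rather than clusters.
clusters = [g for g in groups if len(g) > 1]
print(len(clusters), clusters)
```

The number of clusters falls out of the data and the threshold; it is never supplied as an input.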

From what I've read so far, k-means and its variants are not appropriate since they require the user to specify how many clusters are desired. I don't know how many there are and, in this case, their number is in fact an output of the analysis, not an input.

Any guidance is appreciated.

Thanks,
-jl

land

New Altair Community Member

Feb 22, 2010

Hi,
if each of your examples is marked with the point in time when the event occurs, you might use Agglomerative Clustering with single link. If you cluster only on the time (mark every other attribute as special or remove it), you will get a dendrogram showing which events are combined into one cluster and the distance between them.
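For readers who want to see what single-link agglomerative clustering does on one time attribute, here is a rough Python sketch. It is not the RapidMiner operator, only an illustration of the idea; `single_link_clusters`, the sample data, and the threshold are hypothetical. It relies on the fact that in sorted one-dimensional data the two closest clusters are always neighbours, and their single-link distance is the gap between them.

```python
def single_link_clusters(times, threshold):
    """Naive single-link agglomerative clustering of 1-D values.

    Repeatedly merges the two closest neighbouring clusters and stops
    once the smallest remaining gap is no longer below the threshold
    (i.e. the dendrogram is cut at that height).
    Returns (clusters, merges); merges holds (distance, left, right)
    in merge order, which is the information a dendrogram displays.
    """
    clusters = [[t] for t in sorted(times)]
    merges = []
    while len(clusters) > 1:
        # Single-link distance between neighbours: last point of the
        # left cluster to first point of the right cluster.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[i] >= threshold:
            break  # every remaining merge would exceed the cut height
        merges.append((gaps[i], clusters[i], clusters[i + 1]))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters, merges


# Illustrative event times in seconds (invented data).
events = [0, 10, 20, 300, 360, 420, 480, 2000]
clusters, merges = single_link_clusters(events, threshold=90)
heights = [d for d, _, _ in merges]
print(clusters)
print(heights)
```

Cutting at the threshold leaves exactly the groups whose successive events are closer than that threshold, so the number of clusters is an output rather than an input.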

Greetings,
Sebastian

jeanluc

New Altair Community Member

Feb 22, 2010

Sebastian Land wrote:

Hi,
if each of your examples is marked with the point in time when the event occurs, you might use Agglomerative Clustering with single link. If you cluster only on the time (mark every other attribute as special or remove it), you will get a dendrogram showing which events are combined into one cluster and the distance between them.

Perfect, that's what I needed. Thanks.
