Newbie needs advice on selecting a clustering algorithm
Hello,
I discovered RapidMiner yesterday after several hours of research into data clustering (it looks very nice and friendly). I need a little bit of help in selecting an algorithm for what is most likely a simple case.
I have a series of events that happened at irregular intervals in time. I would like to determine which of those events fall into clusters, where in my case a cluster is a run of adjacent events in which each event is closer in time to the previous one than a given threshold. The time span of the cluster does not matter (so 3 events 10 seconds apart or 20 events at 1-minute intervals are both valid clusters; I only care about the distance between two successive events).
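To make the definition concrete, here is a rough Python sketch of the grouping I have in mind. It is only an illustration, not something taken from RapidMiner: the function name `gap_clusters`, the `min_size` parameter (my assumption that a lone event does not count as a cluster), and the sample timestamps are all made up.

```python
from typing import List


def gap_clusters(times: List[float], threshold: float,
                 min_size: int = 2) -> List[List[float]]:
    """Split timestamps into runs whose successive gaps are below threshold.

    min_size is an assumption: runs with fewer events are discarded.
    """
    clusters: List[List[float]] = []
    current: List[float] = []
    for t in sorted(times):
        # A gap of threshold or more closes the current run.
        if current and t - current[-1] >= threshold:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(t)
    # Keep the last run if it is large enough.
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Hypothetical event times in seconds.
events = [0, 10, 20, 500, 900, 960, 1020, 1080]
found = gap_clusters(events, threshold=90)
print(len(found), found)
```

With these sample values the isolated event at 500 is dropped and two clusters remain, so the cluster count comes out of the analysis rather than going in.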
From what I've read so far, k-means and its variants are not appropriate since they require the user to specify how many clusters are desired. I don't know how many there are and, in this case, their number is in fact an output of the analysis, not an input.
Any guidance is appreciated.
Thanks,
-jl