"Two spiral problem with SVM in Rapidminer"
speedyjb
New Altair Community Member
Hello everybody.
I am trying to get RapidMiner 5.0.006 to solve the famous two-spiral problem with SVM (clustering).
I found a sample in RapidMiner that is practically what I am looking for,
only with a different dataset, namely "three ring clusters".
This "three ring clusters" sample can be found under:
Samples - processes - 07_Clustering - 10_SVClustering.
The sample works just fine as it is, but as soon as I change the target function
in the ExampleSetGenerator from "three ring clusters" to "spiral cluster",
the results I get make no sense. Mostly I get one cluster, sometimes more, but never good ones
that separate the spiral arms as they should,
no matter what parameters I choose for the SupportVectorClustering!
(I know from experiments with LibSVM values for the parameters C and Gamma
that do an excellent job on this problem.)
Can anybody help me with this? Thank you in advance!
Greetings,
speedyjb
Answers
Hi,
speedyjb wrote: "(I know from experiments with LibSVM the values for the parameters C and Gamma that do an excellent job for this problem)."
Could be, but that's probably not the point here. C is not even a parameter of SV Clustering, and Gamma is often implemented as 1 / Gamma in other approaches, so the notion of a "correct" value would not carry over anyway. The more important parameters for SV Clustering are the minimal number of points, p, r, and the number of neighbors. All these parameters make SV Clustering much harder to tune than most other clustering schemes, including DBScan, which is not really easy to tune either...
Cheers,
Ingo
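To illustrate the point about Gamma, here is a minimal Python sketch of two common ways to parameterize an RBF kernel. The first form, exp(-gamma * ||x - y||^2), is the one LibSVM documents. The second form, which divides by gamma, stands in for the "1 / Gamma" style mentioned above. Which form RapidMiner's SV Clustering actually uses is not stated in this thread, so the function names and the second form are assumptions for illustration only.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf_multiply(x, y, gamma):
    """LibSVM-style RBF kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * squared_distance(x, y))


def rbf_divide(x, y, gamma):
    """Assumed alternative convention: exp(-||x - y||^2 / gamma)."""
    return math.exp(-squared_distance(x, y) / gamma)


x, y = (0.0, 1.0), (2.0, 0.5)
gamma = 0.25

# The same numeric gamma gives very different kernel values ...
k_multiply = rbf_multiply(x, y, gamma)
k_divide_same = rbf_divide(x, y, gamma)

# ... while inverting gamma makes the two conventions agree.
k_divide_inverted = rbf_divide(x, y, 1.0 / gamma)
```

So a gamma value that worked well in one tool may need to be inverted, or rescaled in some other way, before it means the same thing in another.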
I will do some experiments with all the tunable parameters to see if I can optimize them for this problem!
Perhaps I could search for the best parameters via genetic optimization, grid search, or something similar
(to save time and effort)?
In the end I just want to replicate in RapidMiner the results I got with LibSVM on the two-spiral problem
and take advantage of all the extras RapidMiner offers with its built-in features.
But first things first: getting RapidMiner going on this one!
Greetings,
speedyjb
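The grid search idea can be sketched outside RapidMiner as follows. This is only a toy illustration under several assumptions. The spiral generator is hand-written and is not RapidMiner's ExampleSetGenerator. The clusterer is a simple radius-based connected-components stand-in and is NOT Support Vector Clustering. The parameter names `eps` and `min_size` are hypothetical. Each parameter combination is scored with the Rand index against the known spiral arms.

```python
import itertools
import math


def two_spirals(n_per_arm=100, turns=1.5, r_min=0.5, r_max=3.5):
    """Return (points, arm_labels) for two interleaved spiral arms."""
    points, labels = [], []
    for i in range(n_per_arm):
        t = i / (n_per_arm - 1)
        angle = t * turns * 2 * math.pi
        radius = r_min + t * (r_max - r_min)
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        points.append((x, y))
        labels.append(0)
        points.append((-x, -y))  # second arm: first arm rotated by 180 degrees
        labels.append(1)
    return points, labels


def radius_clusters(points, eps, min_size):
    """Stand-in clusterer: connected components of the 'within eps' graph.

    Components with fewer than min_size points are labelled -1 (noise).
    """
    n = len(points)
    cluster = [None] * n
    next_id = 0
    for start in range(n):
        if cluster[start] is not None:
            continue
        cluster[start] = next_id
        members, stack = [start], [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if cluster[j] is None and math.dist(points[i], points[j]) <= eps:
                    cluster[j] = next_id
                    members.append(j)
                    stack.append(j)
        if len(members) < min_size:
            for m in members:
                cluster[m] = -1
        else:
            next_id += 1
    return cluster


def rand_index(clusters, labels):
    """Fraction of point pairs on which clustering and true arms agree."""
    agree = total = 0
    for i, j in itertools.combinations(range(len(labels)), 2):
        same_cluster = clusters[i] == clusters[j] and clusters[i] != -1
        same_arm = labels[i] == labels[j]
        agree += same_cluster == same_arm
        total += 1
    return agree / total


def grid_search(points, labels, grid):
    """Try every parameter combination and keep the first best-scoring one."""
    best_params, best_score = None, -1.0
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = rand_index(radius_clusters(points, **params), labels)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


points, labels = two_spirals()
grid = {"eps": [0.1, 0.5, 1.5], "min_size": [1, 5]}
best_params, best_score = grid_search(points, labels, grid)
print(best_params, best_score)
```

On this toy data the smallest radius breaks the arms into fragments and the largest merges everything into one cluster, which resembles the "mostly one cluster" symptom described at the start of the thread. Only the middle value separates the two arms. A real run would swap in the actual clustering operator and its own parameters.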
Hi,
speedyjb wrote: "In the end I just want to be able to replicate the results I got with libSVM in Rapidminer on the 2 spiral problem"
OK, but please note that the Support Vector Clustering is not built on top of LibSVM but on mySVM by Stefan Rüping. The operator that uses LibSVM can, like the original implementation as far as I know, only be used for classification, regression, and one-class learning, but not for clustering.
Cheers,
Ingo