[SOLVED] Clustering SSE measure calculation
nav
New Altair Community Member
Hi,
Does anyone know how to calculate the SSE value of a clustering in RapidMiner?
Answers
I'll answer my own question and share what I found.
To calculate the SSE measure I wrote a small script for the Execute Script operator. The script is very simple and for now it works only for centroid-based algorithms, but I plan to adapt it to the general case; that only requires calculating the centroids by hand. Here is the script:
/**
* Author: N@v
* Version: 0.0.1
* Date: 11/01/2012
*
* Description:
* This script calculates the SSE measure of a clustering produced by a centroid-based clustering algorithm.
*
* Input:
* input[0]: the example set of the clustering
* input[1]: the example set of the centroids
* input[2]: the cluster model of the cluster operator
*
* Output:
* The SSE value of the clustering is written to the log console.
**/
import com.rapidminer.operator.clustering.ClusterModel;
import com.rapidminer.operator.clustering.Cluster;

ExampleSet clusteringSet = input[0];
ExampleSet centroids = input[1];
ClusterModel clustering = input[2];

double sum = 0;

centroids.remapIds();
// Rebuild the id mapping once so that getExampleFromId() can be used below.
clusteringSet.remapIds();

for (Example centroid : centroids) {
    // The "cluster" value looks like "cluster_0"; drop the 8-character prefix to get the index.
    String key = centroid.getValueAsString(centroid.getAttributes().get("cluster"));
    key = key.substring(8);
    Cluster cluster = clustering.getCluster(Integer.parseInt(key));
    if (cluster.getNumberOfExamples() == 0) {
        continue;
    }
    Collection<Object> idsList = cluster.getExampleIds();
    for (Object id : idsList) {
        Example example = clusteringSet.getExampleFromId((Double) id);
        double distance = calculateEuclideanDistance(centroid, example);
        sum += distance * distance;
    }
}
operator.logNote("SSE: " + sum);

// Euclidean distance between two examples over the regular attributes of a.
double calculateEuclideanDistance(Example a, Example b) {
    Attribute[] atts = a.getAttributes().createRegularAttributeArray();
    double sum = 0;
    for (Attribute att : atts) {
        String attStr = att.getName();
        double aValue = a.getValue(a.getAttributes().get(attStr));
        double bValue = b.getValue(b.getAttributes().get(attStr));
        double difference = aValue - bValue;
        sum += difference * difference;
    }
    return Math.sqrt(sum);
}
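For anyone who wants to check the numbers outside RapidMiner: the script adds up, for every example, the squared Euclidean distance to the centroid of its cluster. Below is a minimal standalone Java sketch of the same calculation on plain arrays. The class name SseSketch, the method names, and the sample data are made up for illustration and are not part of RapidMiner.

```java
final class SseSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /**
     * SSE of a clustering, where points[i] belongs to the cluster
     * whose centroid is centroids[assignment[i]].
     */
    static double sse(double[][] points, int[] assignment, double[][] centroids) {
        double total = 0.0;
        for (int i = 0; i < points.length; i++) {
            total += squaredDistance(points[i], centroids[assignment[i]]);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical data: two clusters with two points each.
        double[][] points = { {0, 0}, {2, 0}, {10, 10}, {10, 12} };
        int[] assignment = { 0, 0, 1, 1 };
        double[][] centroids = { {1, 0}, {10, 11} };
        // Each point is at distance 1 from its centroid, so this prints "SSE: 4.0".
        System.out.println("SSE: " + sse(points, assignment, centroids));
    }
}
```

Working with the squared distance directly avoids taking a square root and then squaring it again, but the result is the same as in the script above.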
If you have any comments or suggestions, let me know.
Hello
I used the script but I got this error:
"Cannot cast object 'cluster 0:..items cluster 1:..items Total number of items' with class 'com.rapidminer.operator.clustering.ClusterModel' to class 'com.rapidminer.example.ExampleSet'"
Thanks