Hey guys,
I am looking for an explanation of these negative Davies-Bouldin values. Can anyone explain why RapidMiner is calculating negative values?
My Performance Vectors are looking like this:
Performance Vector
Average within centroid distance
cluster_0: -1.831
cluster_1: -1.931
cluster_2: -1.856
cluster_3: -1.897
cluster_4: -1.903
cluster_5: -1.885
cluster_6: -1.891
cluster_7: -1.878
cluster_8: -1.818
cluster_9: -1.869
Davies Bouldin: -1.974
Thanks in advance for your help and reply,
Stefan
Hi Stefan,
I do not know exactly why, but by default the values are multiplied by -1, presumably so that you can run an optimizer on them. That's why the operator has an option called maximize with this description:
maximize
Description: This parameter specifies if the results should be maximized. If set to true, the result is not multiplied by minus one.
Simply check it and use whichever sign you prefer.
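To illustrate the sign flip, here is a minimal Python sketch. It is not RapidMiner code: `reported_value` is a hypothetical helper that only mirrors the parameter description quoted above, and the candidate scores are made-up numbers. It shows why the sign does not change which clustering wins: picking the smallest raw index is the same as picking the largest negated one.

```python
def reported_value(raw_index: float, maximize: bool) -> float:
    """Hypothetical helper: mirror the quoted description of the maximize parameter.

    If maximize is True the value is left as is, otherwise it is multiplied by -1.
    """
    return raw_index if maximize else -raw_index


raw = 1.974  # magnitude taken from the performance vector in the question
print(reported_value(raw, maximize=False))  # -1.974
print(reported_value(raw, maximize=True))   # 1.974

# Made-up Davies-Bouldin scores for three hypothetical clusterings.
candidates = {"k=5": 2.31, "k=10": 1.974, "k=15": 2.05}

# Smallest raw index ...
best_by_min = min(candidates, key=candidates.get)
# ... is the same choice as the largest negated index.
best_by_max = max(
    candidates, key=lambda name: reported_value(candidates[name], maximize=False)
)
print(best_by_min, best_by_max)  # k=10 k=10
```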
Best,
Martin
Hey Martin,
Thanks for your fast reply.
I had read about the multiplication by -1. Thanks for the advice about the advanced parameter. Now my values are positive. But I am still wondering why the values are greater than 1. Usually Davies-Bouldin values are between 0 and 1 (0 = "good" clusters and 1 = "bad" clusters). Now that my values are greater than 1, do you have any suggestions for interpreting them?
Regards,
Stefan
Hi Stefan,
Why do you think this should be normalized? According to Wikipedia (https://en.wikipedia.org/wiki/Davies%E2%80%93Bouldin_index), I don't see any reason for it to be in [0,1].
Nevertheless, you can of course normalize the DB index.
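For reference, this is the usual textbook form of the index, written out as a sketch. The symbols are my own notation, Euclidean distance is an assumption, and I have not checked which exact variant RapidMiner implements:

```latex
% k clusters C_1, ..., C_k with centroids c_1, ..., c_k
S_i = \frac{1}{|C_i|} \sum_{x \in C_i} \lVert x - c_i \rVert
\qquad
M_{ij} = \lVert c_i - c_j \rVert
\qquad
R_{ij} = \frac{S_i + S_j}{M_{ij}}

\mathrm{DB} = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} R_{ij}
```

Every term is non-negative, so the lower bound is 0. There is no upper bound, though: as soon as two centroids are closer together than the sum of the two cluster scatters, R_ij is greater than 1, and it grows without limit as M_ij shrinks.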
~Martin
Hi,
As far as I know, smaller absolute values are better. From the doc:
davies_bouldin: The algorithms that produce clusters with low intra-cluster distances (high intra-cluster similarity) and high inter-cluster distances (low inter-cluster similarity) will have a low Davies–Bouldin index, the clustering algorithm that produces a collection of clusters with the smallest Davies–Bouldin index is considered the best algorithm based on this criterion.
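To make that concrete, here is a minimal sketch in plain Python. It assumes the common definition with Euclidean distance (average distance to the centroid as scatter, centroid distance as separation), which may differ in detail from what RapidMiner computes. The function names and the toy points are made up for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points)."""
    cents = [centroid(c) for c in clusters]
    # Scatter: average distance of each point to its own centroid.
    scatter = [
        sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity ratio of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Toy data: two overlapping clusters and two well-separated ones.
overlapping = [[(0, 0), (2, 0)], [(1, 0), (3, 0)]]
separated = [[(0, 0), (2, 0)], [(10, 0), (12, 0)]]

print(davies_bouldin(overlapping))  # 2.0
print(davies_bouldin(separated))    # 0.2
```

The overlapping clusters give a value above 1 and the well-separated ones a value close to 0, so lower is better, but 1 is not an upper limit.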
Best,
Martin
It's just easier to use the negative value for optimization.