Hey guys,
I am looking for an explanation for these negative Davies-Bouldin values. Can anyone please explain why RapidMiner calculates negative values?
My performance vectors look like this:
Performance Vector
Average within centroid distance
cluster_0: -1.831
cluster_1: -1.931
cluster_2: -1.856
cluster_3: -1.897
cluster_4: -1.903
cluster_5: -1.885
cluster_6: -1.891
cluster_7: -1.878
cluster_8: -1.818
cluster_9: -1.869
Davies Bouldin: -1.974
Thanks in advance for your help and reply,
Stefan
Hi Stefan,
I do not know why, but by default the values are multiplied by -1 so that you can run a minimizer on them. That's why the operator has an option called maximize with this description:
maximize
Description: This parameter specifies if the results should be maximized. If set to true, the result is not multiplied by minus one.
Simply check it and use whichever you prefer.
Best,
Martin
Hey Martin,
Thanks for your fast reply.
I read about the multiplication by -1. Thanks for the advice about the advanced parameter. Now my values are positive. BUT I am still wondering why the values are greater than 1. Usually Davies-Bouldin values are between 0 and 1 (0 = "good" clusters and 1 = "bad" clusters). Now that my values are greater than 1, do you have any suggestions for interpretation?
Regards,
Stefan
Hi Stefan,
Why do you think this should be normalized? According to Wikipedia (https://en.wikipedia.org/wiki/Davies%E2%80%93Bouldin_index), I don't see any reason for it to be in [0,1].
Nevertheless, you can of course normalize the DB index.
~Martin
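To illustrate why the index is not bounded by 1, here is a minimal sketch of the textbook Davies-Bouldin definition in plain Python. It is not RapidMiner's implementation; the function names and the toy points are made up for illustration, and it assumes Euclidean distance with the mean distance to the centroid as the scatter measure.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equally sized tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def davies_bouldin(clusters):
    """Textbook Davies-Bouldin index for a list of clusters (lists of points)."""
    centroids = [centroid(pts) for pts in clusters]
    # Scatter S_i: average distance of a cluster's points to its centroid.
    scatter = [
        sum(math.dist(p, c) for p in pts) / len(pts)
        for pts, c in zip(clusters, centroids)
    ]
    k = len(clusters)
    worst_ratios = []
    for i in range(k):
        # For cluster i, take the worst (largest) ratio (S_i + S_j) / d(c_i, c_j).
        worst_ratios.append(max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i
        ))
    return sum(worst_ratios) / k

# Hypothetical toy data: the same two clusters, once far apart and once close together.
well_separated = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
overlapping = [[(0, 0), (0, 2)], [(1, 0), (1, 2)]]

print(davies_bouldin(well_separated))  # scatter 1 + 1 over centroid distance 10
print(davies_bouldin(overlapping))     # scatter 1 + 1 over centroid distance 1
```

The ratio compares within-cluster scatter to the distance between centroids, so it exceeds 1 as soon as the summed scatter of two clusters is larger than the gap between their centroids. The only hard bound is 0 from below.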
Hi,
As far as I know, smaller absolute values are better. From the doc:
davies_bouldin: The algorithms that produce clusters with low intra-cluster distances (high intra-cluster similarity) and high inter-cluster distances (low inter-cluster similarity) will have a low Davies–Bouldin index, the clustering algorithm that produces a collection of clusters with the smallest Davies–Bouldin index is considered the best algorithm based on this criterion.
Best,
Martin


It's just easier to use the negative value for optimization.
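A small sketch of that idea, assuming the optimizer always maximizes the value it is given. The DB values per k below are hypothetical and only there to show the sign flip; they are not taken from the run above.

```python
# Hypothetical Davies-Bouldin values for three choices of k (lower is better).
db_by_k = {2: 1.40, 3: 0.85, 4: 1.10}

# Reported value with "maximize" unchecked: the index multiplied by -1.
reported = {k: -value for k, value in db_by_k.items()}

# A maximizer on the negated values picks the same k as a minimizer on the raw index.
best_k = max(reported, key=reported.get)
print(best_k, reported[best_k])
```

Either way, the clustering whose value is closest to 0 wins, which is why only the absolute value matters for interpretation.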