"Average Distance within Cluster"

User: "Pinguicula"
New Altair Community Member
Updated by Jocelyn
Sorry,

this was a premature comment that dissolved into mist after some further literature review. Unfortunately, I am unable to delete my message.

Best Norbert

    User: "IngoRM"
    New Altair Community Member
    Hi Norbert,

    I am no clustering expert myself, but as far as I can see from the source code, the calculation is done roughly as in the following pseudo code:

    count = 0;
    sum = 0.0;

    for each cluster C do {
        for each object O in C do {
            distance = getDistanceFromCentroid(C, O);
            sum = sum + v * v;
            count++;
        }
    }

    result = sum / count;

    double divisionFactor = 1.0;
    if (getParameterAsBoolean(PARAMETER_NORMALIZE))
      divisionFactor = es.getAttributes().size();

    result = result / divisionFactor;
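For anyone who wants to reproduce the number by hand, here is a minimal runnable sketch of the pseudo code in Python. It rests on assumptions that are not confirmed by the source code: that `v` stands for `distance`, that the distance is Euclidean, and that the centroid is the per-attribute mean of the cluster. The function name `avg_within_cluster_distance` is made up for illustration and is not part of RapidMiner.

```python
import math


def avg_within_cluster_distance(clusters, normalize=False):
    """Mean squared distance of all points to their own cluster centroid.

    clusters: list of clusters, each a list of equal-length numeric tuples.
    normalize: if True, divide the result by the number of attributes.
    """
    total = 0.0
    count = 0
    n_attributes = len(clusters[0][0])

    for points in clusters:
        # Assumption: the centroid is the per-attribute mean of the cluster.
        centroid = [sum(column) / len(points) for column in zip(*points)]
        for point in points:
            # Assumption: Euclidean distance, and "v" in the pseudo code
            # is this distance.
            distance = math.dist(point, centroid)
            total += distance * distance
            count += 1

    result = total / count
    if normalize:
        result /= n_attributes
    return result


# Two small 2-attribute clusters; every point is 1.0 away from its centroid.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0), (5.0, 7.0)]]
print(avg_within_cluster_distance(clusters))                  # 1.0
print(avg_within_cluster_distance(clusters, normalize=True))  # 0.5
```

With normalization switched on, the value is halved here because the toy data has two attributes.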

    Hope that helps. Maybe you did not take the normalization with the number of attributes into account?

    Cheers,
    Ingo
    User: "Pinguicula"
    New Altair Community Member
    OP
    Hi Ingo,

    Your answer more or less resolves my problem.

    If my assumption is correct and v in your pseudo code is equivalent to distance, then the feature labelled "average distance within cluster" is actually the variance of the data points within the cluster. It has little in common (exaggerating ;) ) with the average distance within a cluster as used, e.g., in the calculation of the Silhouette coefficient (Kaufman & Rousseeuw, 1990).
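To make the distinction concrete, here is a small Python sketch that computes both quantities for one toy cluster. It assumes Euclidean distance and a mean centroid. The function names are hypothetical, and `silhouette_a` covers only the within-cluster term a(i) of the Silhouette coefficient, not the full coefficient.

```python
import math


def mean_squared_centroid_distance(points):
    """Mean squared distance of the points to their centroid (variance-like)."""
    centroid = [sum(column) / len(points) for column in zip(*points)]
    return sum(math.dist(p, centroid) ** 2 for p in points) / len(points)


def silhouette_a(points, i):
    """a(i): mean distance from points[i] to the other points of its cluster."""
    others = [q for j, q in enumerate(points) if j != i]
    return sum(math.dist(points[i], q) for q in others) / len(others)


cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
print(mean_squared_centroid_distance(cluster))  # 50/9, about 5.56
print(silhouette_a(cluster, 0))                 # 3.5
```

The first value is a squared quantity measured against the centroid, while the second is an unsquared mean of pairwise distances, so the two are not directly comparable.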

    By the way, the Silhouette coefficient or the Hopkins statistic would be nice features for the next RM release.

    Best

    Norbert