Minor problem with the Extract Cluster Prototypes operator
If I cluster with K-medoids, the last value in the data set becomes a centroid, so I need to exclude it. I am doing this by finding the value, generating a macro, generating an attribute, and finally using a Filter Examples operator with an expression. It does not work. I can't seem to see the data after the Extract Cluster Prototypes operator in order to filter it. In the screenshot, it should be trivial to exclude cluster_11, but I am not making much progress. I am using RM 9.7, not the 9.8 beta.



Hi,
Why can't you just use Filter Example Range?
Best,
Martin
Hi Martin,
I will have to think about that. I could do with a couple of days off. I could be missing the simple solution to this problem.
At the moment, I can't isolate and filter the centroid of cluster_11 after the Extract Cluster Prototypes operator. On the next run, it might be the centroid of cluster_3 that is the last value in the data set. Here is a screenshot with a crude example. The last value in the data set is the second-to-last black line at the bottom, which is also the centroid of one of the clusters. I have taken the centroid values and generated attributes. It is a kind of false value when it is also an outlier. That value doesn't match anything in the data set; if it does, the match is random. A simple Filter Examples expression should have fixed it, but the composite value in the data set is invisible to me. I will upload a process that should make things clearer.

Hi Alex,
You could just aggregate count(cluster) grouped by cluster, join this back on cluster, and then filter for count(cluster) != 1?
Best,
Martin
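
For readers following along, the aggregate / join / filter idea above can be sketched outside RapidMiner in plain Python. This is only an illustration, not a RapidMiner process: the data, the column layout, and the function name drop_singleton_clusters are all made up for the example. It also assumes the unwanted centroid is the only member of its cluster, which is the case the count(cluster) != 1 filter targets.

```python
from collections import Counter

# Hypothetical clustered example set: (id, value, cluster label).
examples = [
    (1, 0.9, "cluster_0"),
    (2, 1.1, "cluster_0"),
    (3, 5.0, "cluster_1"),
    (4, 5.2, "cluster_1"),
    (5, 42.0, "cluster_11"),  # last example, assumed to be alone in its cluster
]

# Hypothetical prototype table: (cluster label, centroid value).
prototypes = [
    ("cluster_0", 1.1),
    ("cluster_1", 5.0),
    ("cluster_11", 42.0),
]


def drop_singleton_clusters(examples, prototypes):
    """Keep only prototypes whose cluster contains more than one example."""
    # Aggregate: count(cluster) grouped by cluster.
    counts = Counter(cluster for _, _, cluster in examples)
    # Join: attach the count to each prototype by cluster label (inner join).
    joined = [
        (cluster, value, counts[cluster])
        for cluster, value in prototypes
        if cluster in counts
    ]
    # Filter: count(cluster) != 1.
    return [(cluster, value) for cluster, value, n in joined if n != 1]


kept = drop_singleton_clusters(examples, prototypes)
print(kept)
```

With this made-up data, the prototype for cluster_11 is dropped whichever label the singleton cluster receives on a given run, because the filter keys on the count rather than on the cluster name.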