"How to get the standard deviation from clustered data?"
stever1k
New Altair Community Member
Hi,
after clustering my data, the data has the following format:
id A B C Cluster
a x y z 0
.. .... .... 1
.. .... .... 1
.. .... .... 2
.. .... .... 0
.. .... ....
.. .... .... N
So the clustering algorithm found several clusters and created a new column with the attribute cluster. I now want to calculate the standard deviation of the attributes A, B and C for cluster 0, and the same for clusters 1 up to N. Any ideas how this works?
cordially,
Stever
Answers
Hi Stever,
this is a typical situation for the Aggregation operator. You can group the examples by cluster and then calculate an aggregation function over each attribute. I have done this in the process below; it should be easy to adapt to your needs.
<operator name="Root" class="Process" expanded="yes">
  <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
    <parameter key="target_function" value="gaussian mixture clusters"/>
  </operator>
  <operator name="KMeans" class="KMeans">
    <parameter key="k" value="3"/>
  </operator>
  <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
    <parameter key="name" value="cluster"/>
  </operator>
  <operator name="Aggregation" class="Aggregation">
    <list key="aggregation_attributes">
      <parameter key="att1" value="standard_deviation"/>
      <parameter key="att2" value="standard_deviation"/>
      <parameter key="att3" value="standard_deviation"/>
      <parameter key="att4" value="standard_deviation"/>
      <parameter key="att5" value="standard_deviation"/>
    </list>
    <parameter key="group_by_attributes" value="cluster"/>
  </operator>
</operator>
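If you want to cross-check the numbers outside RapidMiner, the same group-by-cluster idea can be sketched in plain Python. This is only an illustration, not part of the process above: the sample rows and the function name std_by_cluster are made up, and it uses the sample standard deviation (divisor n - 1), which may or may not match what the Aggregation operator reports.

```python
from collections import defaultdict
from statistics import stdev

# Hypothetical rows in the layout from the question: (id, A, B, C, cluster)
rows = [
    ("a", 1.0, 10.0, 5.0, 0),
    ("b", 3.0, 14.0, 5.0, 0),
    ("c", 2.0, 20.0, 1.0, 1),
    ("d", 4.0, 20.0, 3.0, 1),
    ("e", 6.0, 20.0, 5.0, 1),
]
attributes = ("A", "B", "C")


def std_by_cluster(rows, attributes):
    """Return {cluster: {attribute: sample standard deviation}}."""
    # Collect the values of every attribute separately for each cluster.
    grouped = defaultdict(lambda: {name: [] for name in attributes})
    for _id, *values, cluster in rows:
        for name, value in zip(attributes, values):
            grouped[cluster][name].append(value)

    # stdev() needs at least two values, so single-member clusters get NaN.
    return {
        cluster: {
            name: stdev(vals) if len(vals) > 1 else float("nan")
            for name, vals in columns.items()
        }
        for cluster, columns in sorted(grouped.items())
    }


result = std_by_cluster(rows, attributes)
for cluster, stds in result.items():
    print(cluster, stds)
```

Each output line corresponds to one row of the aggregated example set: one cluster, with one standard deviation per attribute.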
Greetings,
Sebastian
thanks a lot Sebastian, that is EXACTLY what I'm looking for. My problem was that I was searching for a suitable operator inside the preprocessing->attributes tree instead of the OLAP one!
best regards,
Stever