"Rank order of attributes to each cluster"
dorona
Hi,
I have just started experimenting with clustering, and with RapidMiner I was able to get results. My problem now is how to characterize each cluster. Is there a way to get RapidMiner to output, for each cluster, a ranked list of the attributes that best describe it?
In addition, it would be great to get each attribute's actual contribution to the model, along with a statistic that measures its statistical significance.
Thanks
Answers
Hi,
Yes, this is possible with RapidMiner. After clustering, each example in the input data set is assigned a cluster id. You can then use the new operator "AttributeConstruction" (it will replace the operator FeatureGeneration in future releases, together with the new ValueIterator operator). For each cluster value, the process builds a temporary label that separates this cluster from all "other" examples, and then lets Relief weight the attributes with respect to that label.
Please note that you will have to use the latest CVS version of RapidMiner, or wait for the next release, to get access to both new operators. It is also possible with older versions, but the process is much more complicated.
The whole setup looks like this:
<operator name="Root" class="Process" expanded="yes">
    <!-- Generate sample data with cluster structure -->
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="number_examples" value="200"/>
        <parameter key="number_of_attributes" value="10"/>
        <parameter key="target_function" value="gaussian mixture clusters"/>
    </operator>
    <operator name="IdTagging" class="IdTagging">
    </operator>
    <operator name="KMeans" class="KMeans">
        <parameter key="k" value="5"/>
    </operator>
    <!-- Drop the cluster model; only the clustered example set is used below -->
    <operator name="IOConsumer" class="IOConsumer">
        <parameter key="io_object" value="ClusterModel"/>
    </operator>
    <!-- Loop over the values of the cluster attribute; the current value is available as %{loop_value} -->
    <operator name="ValueIterator" class="ValueIterator" expanded="yes">
        <parameter key="attribute" value="cluster"/>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <!-- Build a label: current cluster vs. "other" -->
            <operator name="AttributeConstruction" class="AttributeConstruction">
                <list key="function_descriptions">
                    <parameter key="inner_label_%{loop_value}" value="if (cluster == &quot;%{loop_value}&quot;, &quot;%{loop_value}&quot;, &quot;other&quot;)"/>
                </list>
            </operator>
            <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                <parameter key="name" value="inner_label_%{loop_value}"/>
                <parameter key="target_role" value="label"/>
            </operator>
            <!-- Weight the attributes with respect to the label built above -->
            <operator name="Relief" class="Relief">
            </operator>
            <operator name="IOConsumer (2)" class="IOConsumer">
                <parameter key="io_object" value="ExampleSet"/>
            </operator>
        </operator>
    </operator>
</operator>
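For anyone who wants to follow the logic outside RapidMiner, here is a minimal Python sketch of the same one-vs-rest idea: for each cluster, label the examples as "this cluster" or "other", then rank the attributes by a Relief-style weight. The function names and the tiny data set are made up for illustration. The weighting is a basic nearest-hit/nearest-miss Relief for numeric attributes and is not RapidMiner's implementation, so the numbers will not match the process above. The weights are relevance scores, not significance tests.

```python
import random


def relief_weights(rows, labels, n_iter=None, seed=0):
    """Basic Relief for numeric attributes and a two-valued label (illustrative only)."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])
    # Attribute ranges, used to scale every difference into [0, 1].
    lo = [min(r[j] for r in rows) for j in range(n_attrs)]
    hi = [max(r[j] for r in rows) for j in range(n_attrs)]
    span = [(h - l) or 1.0 for l, h in zip(lo, hi)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(n_attrs))

    indices = list(range(len(rows)))
    m = n_iter or len(rows)
    weights = [0.0] * n_attrs
    for _ in range(m):
        i = rng.choice(indices)
        hits = [k for k in indices if k != i and labels[k] == labels[i]]
        misses = [k for k in indices if labels[k] != labels[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: dist(rows[i], rows[k]))
        miss = min(misses, key=lambda k: dist(rows[i], rows[k]))
        # Reward attributes that differ on the nearest miss, penalise those
        # that differ on the nearest hit.
        for j in range(n_attrs):
            weights[j] += (diff(rows[i], rows[miss], j) - diff(rows[i], rows[hit], j)) / m
    return weights


def rank_attributes_per_cluster(rows, clusters, names):
    """For each cluster, build a 'cluster vs. other' label and rank the attributes."""
    ranking = {}
    for c in sorted(set(clusters)):
        # Same idea as the constructed label: if (cluster == c, c, "other")
        labels = [c if x == c else "other" for x in clusters]
        w = relief_weights(rows, labels)
        ranking[c] = sorted(zip(names, w), key=lambda t: t[1], reverse=True)
    return ranking


# Made-up example: a1 separates the two clusters, a2 does not.
rows = [[0.0, 1.0], [0.1, 5.0], [0.2, 9.0], [10.0, 1.2], [10.1, 5.1], [10.2, 8.8]]
clusters = ["cluster_0"] * 3 + ["cluster_1"] * 3
names = ["a1", "a2"]

ranking = rank_attributes_per_cluster(rows, clusters, names)
for cluster, ranked in ranking.items():
    print(cluster, ranked)
```

Each entry of `ranking` is a list of (attribute, weight) pairs sorted from most to least relevant for that cluster, which corresponds to the per-cluster weights that the loop in the process produces.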
Cheers,
Ingo