Set roles for unsupervised learning - clustering
lovefinearts198
New Altair Community Member
Hi there,
I have an example dataset as follows:
I want to perform a clustering operation to detect anomalies, but I am not sure which role I should give to each of my attributes.
I was thinking of the following roles:
- id_product : id
- Color : cluster
- reference : label
- quantity_ordered : weight
- price_paid : regular
- weight_in_grams : regular
Thanks for your help.
Answers
-
Can anyone give me a lead or a way to understand the roles in RapidMiner?
-
Hello again,
Is my question too easy or too complex?
Here's an extract from the RapidMiner help:
Description
This operator can be used to change the role of an attribute of the input ExampleSet. If you want to change the attribute name you should use the Rename operator. The target role indicates if the attribute is a regular attribute (used by learning operators) or a special attribute (e.g. a label or id attribute).
The following target attribute types are possible:
- regular: only regular attributes are used as input variables for learning tasks
- id: the id attribute for the example set
- label: target attribute for learning
- prediction: predicted attribute, i.e. the predictions of a learning scheme
- cluster: indicates the membership to a cluster
- weight: indicates the weight of the example
- batch: indicates the membership to an example batch
Please be aware that roles have to be unique! Assigning a non regular role the second time will cause the first attribute to be dropped from the example set. If you want to keep this attribute, you have to change its role first.
So perhaps this is better?
- id_product : id
- Color : regular
- reference : label
- quantity_ordered : regular
- price_paid : regular
- weight_in_grams : regular
-
Hello
Set the attributes you want to use to drive cluster membership to be "regular".
Attributes with any other role will be ignored by the clustering.
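For readers who think more easily in code, here is a minimal Python sketch of that rule, outside RapidMiner. It is only an illustration: the function names are hypothetical, the example row is made up (the original dataset is not shown in this thread), and unlike RapidMiner, which drops the first attribute when a special role is assigned twice, this sketch raises an error.

```python
# Role assignment taken from the list proposed in this thread.
roles = {
    "id_product": "id",
    "Color": "regular",
    "reference": "label",
    "quantity_ordered": "regular",
    "price_paid": "regular",
    "weight_in_grams": "regular",
}


def check_unique_special_roles(roles):
    """Raise ValueError if a special (non-regular) role is used more than once."""
    seen = {}
    for attribute, role in roles.items():
        if role == "regular":
            continue  # any number of attributes may be regular
        if role in seen:
            raise ValueError(
                f"role {role!r} is used by both {seen[role]!r} and {attribute!r}"
            )
        seen[role] = attribute


def regular_attributes(roles):
    """Return the attribute names that have the regular role, in order."""
    return [name for name, role in roles.items() if role == "regular"]


def clustering_input(row, roles):
    """Keep only the values that a clustering step would take as input."""
    return {name: row[name] for name in regular_attributes(roles)}


# Made-up example row, for illustration only.
row = {
    "id_product": 1,
    "Color": "red",
    "reference": "A",
    "quantity_ordered": 3,
    "price_paid": 9.5,
    "weight_in_grams": 120,
}

check_unique_special_roles(roles)
print(clustering_input(row, roles))
```

The id and label values are left out of the clustering input, which is why "reference" can stay in the data for later comparison without influencing the clusters.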
I don't know your data, but if the attribute called "reference" is some sort of pre-existing classification and you want to compare it with the final clustering, then it makes sense to set its role to label, as you have done. There is an operator called "Map Clustering on Labels" that can be used to determine which cluster is closest to each label. The resulting example set contains a prediction that can be used to determine a performance measure using the "Performance" operator.
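The idea of comparing clusters with an existing label can be sketched in plain Python as follows. This is not how the RapidMiner operator is implemented: the sketch assumes a simple majority vote per cluster as a stand-in, the function names are hypothetical, and the cluster ids and labels are made up.

```python
from collections import Counter, defaultdict


def map_clusters_to_labels(cluster_ids, labels):
    """Map each cluster id to the most frequent label among its members."""
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, labels):
        counts[cluster][label] += 1
    return {cluster: c.most_common(1)[0][0] for cluster, c in counts.items()}


def accuracy(cluster_ids, labels):
    """Share of examples whose mapped cluster label equals the true label."""
    mapping = map_clusters_to_labels(cluster_ids, labels)
    predictions = [mapping[cluster] for cluster in cluster_ids]
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)


# Made-up clustering result and labels, for illustration only.
cluster_ids = [0, 0, 0, 1, 1, 1]
labels = ["A", "A", "B", "B", "B", "B"]

print(map_clusters_to_labels(cluster_ids, labels))
print(accuracy(cluster_ids, labels))
```

Examples whose label disagrees with the label mapped to their cluster (the third one here) are the ones worth a closer look when the goal is anomaly detection.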
Regards,
Andrew