How to weight features before clustering

kayvanjoo
kayvanjoo New Altair Community Member
edited November 5 in Community Q&A

Dear all,

I have a set of data that I want to cluster, but before clustering I want to weight my features.

My question is: how should I label my data? Feature selection methods require a label role, and I do not have any real label yet.

Should I set my experiment IDs as the label?

What feature selection methods can I use before I cluster my data?

 

Thanks for your attention!

 

Answers

  • Telcontar120
    Telcontar120 New Altair Community Member

    If you don't have a label, how are you planning to assign weights? Assigning a label tells RapidMiner what you are interested in predicting, and most weighting schemes evaluate the other attributes with respect to that label. So don't use an ID variable as the label; that would be pointless. Could you explain a bit more about what you are trying to accomplish with the weighting prior to clustering?
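
To make the point about label-dependent weighting concrete, here is a minimal Python sketch (not RapidMiner's implementation) that scores each attribute by its absolute Pearson correlation with a label column. The toy data, the column names, and the function names `pearson` and `correlation_weights` are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot correlate with anything
    return cov / sqrt(var_x * var_y)

def correlation_weights(columns, label):
    """Weight each attribute by |correlation| with the label."""
    return {name: abs(pearson(values, label)) for name, values in columns.items()}

# Hypothetical toy data: 'signal' tracks the label, 'noise' does not.
columns = {
    "signal": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "noise": [3.0, 1.0, 2.0, 3.0, 1.0, 2.0],
}
label = [0, 0, 0, 1, 1, 1]

weights = correlation_weights(columns, label)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

The weights only mean something because `label` encodes the outcome of interest. Swapping in an arbitrary ID column would still produce numbers, but they would say nothing about which attributes matter.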

     

  • kayvanjoo
    kayvanjoo New Altair Community Member

    Yes, true!

    The reason I want to use attribute weighting is that I want to do feature selection: select the statistically important features, for example those with a weight higher than 0.5, in order to classify my data points. But I don't know how this can be done, or whether it is possible at all.

    Looking forward to your suggestions!

  • Telcontar120
    Telcontar120 New Altair Community Member

    When you say "statistically important", that implies you have a reference point: statistically important with respect to what? That is generally the case when you have a label. Machine learning problems are generally classified as either supervised learning, where you have a specific target variable (called a label in RapidMiner) in mind, or unsupervised learning, where you have no such target and the algorithms instead look for interesting structures or relationships in the data.

     

    You haven't said much about what you are actually trying to accomplish, but if clustering is the key method, then I would suggest that you go ahead and run your clustering without worrying about weighting yet.
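
As a rough illustration of clustering with no label and no weights, the following Python sketch runs a plain k-means on unweighted numeric attributes. The thread does not name a clustering algorithm, so k-means is an assumption here, as are the toy points, the function name `kmeans`, and the choice to seed the centroids with the first `k` points (done only to keep the example deterministic).

```python
def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignment[i] = distances.index(min(distances))
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical toy data with two visibly separate groups.
points = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]
assignment, centroids = kmeans(points, k=2)
print(assignment)
```

No label role is involved at any step: the algorithm only looks at distances between data points, which is why it can run before any weighting question is settled.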

     

  • Thomas_Ott
    Thomas_Ott New Altair Community Member

    Yes, you can do all that. What you want to check out is the Select by Weights operator. There you can set your threshold and automatically select the attributes you want.
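
The thresholding idea behind that suggestion can be sketched in a few lines of Python. This is a plain analogue of selecting attributes by weight, not the operator itself; the function name `select_by_weights`, the attribute names, and the weight values are all hypothetical, and the strict "greater than" comparison simply mirrors the "higher than 0.5" wording used earlier in the thread.

```python
def select_by_weights(weights, threshold=0.5):
    """Return the attribute names whose weight is strictly above the threshold."""
    return [name for name, weight in weights.items() if weight > threshold]

# Hypothetical weights, as some weighting scheme might have produced them.
weights = {"age": 0.82, "income": 0.47, "visits": 0.61, "zip_code": 0.05}

selected = select_by_weights(weights, threshold=0.5)
print(selected)
```

Note that the selection step presupposes that weights already exist, so the earlier question of what the weights are measured against still has to be answered first.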