Normalization Issue
Carlo
New Altair Community Member
Hello RapidMiner Community,
I'm currently working on a clustering model.
I cluster different countries according to certain determinants.
However, the determinants are composed of different factors. For example, the determinant "degree of economic integration" is composed of the factors Trade Freedom and Trading across borders, while the determinant "transport infrastructure" consists only of the factor LPI Index.
I use the normalization operator to compensate for the different scales of the factors.
However, each determinant (degree of economic integration and transport infrastructure) should be weighted equally. Since one determinant consists of more factors than the other, it is currently overweighted.
My question to you is how I should proceed in RapidMiner in order to weight each determinant equally without having to aggregate the individual factors of a determinant.
Thank you for your support and hints.
Best regards, Carlo
Best Answer
You can also apply attribute weights. Look at all the options you have in terms of operators under Feature Weights. You can use an algorithmic approach or you could also set the weights manually.
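For the manual route, the arithmetic can be sketched outside RapidMiner in plain Python. This is only an illustration under assumptions, not a RapidMiner process: the `determinants` grouping, the function names and the two example countries are made up, and it assumes the clustering uses a squared Euclidean distance on already normalized factors. Each factor gets the weight 1/k, where k is the number of factors in its determinant, so every determinant contributes the same total weight.

```python
import math

# Hypothetical grouping of factors by determinant (names from the question,
# the structure itself is an assumption made for this sketch).
determinants = {
    "economic_integration": ["trade_freedom", "trading_across_borders"],
    "transport_infrastructure": ["lpi_index"],
}


def factor_weights(groups):
    """Assign each factor the weight 1/k, with k = size of its determinant."""
    return {f: 1.0 / len(factors) for factors in groups.values() for f in factors}


def scale_factors(weights):
    """Factors to multiply attribute values by, if the weights are applied
    to the values rather than to the distance: sqrt(w), because the
    squared distance squares the multiplier again."""
    return {f: math.sqrt(w) for f, w in weights.items()}


def weighted_sq_distance(a, b, weights):
    """Squared Euclidean distance with one weight per factor."""
    return sum(w * (a[f] - b[f]) ** 2 for f, w in weights.items())


weights = factor_weights(determinants)

# Two made-up countries with normalized values in [0, 1].
country_a = {"trade_freedom": 0.0, "trading_across_borders": 0.0, "lpi_index": 0.0}
country_b = {"trade_freedom": 1.0, "trading_across_borders": 1.0, "lpi_index": 1.0}

print(weights)
print(weighted_sq_distance(country_a, country_b, weights))  # 2.0
```

With these weights each of the two determinants contributes 1.0 to the distance, although one has two factors and the other only one. The individual factors stay separate attributes, so nothing is aggregated.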
Answers
Hi @Carlo,
The task seems interesting, but I don't understand what you mean by determinants. Could you provide some further explanation?
It seems that there are categories and subcategories, but I still don't get what the connection with the clustering is.
Regards,
Sebastian
Hi @SGolbert,
I'm sorry, I might have expressed myself a little awkwardly.
There are 5 determinants altogether; these are essentially the main categories.
Each determinant consists of a different number of factors (subcategories).
I'll try to illustrate it with two determinants:
- The determinant or main category transport infrastructure consists of one factor, namely the LPI index.
- The determinant or main category homogeneity of demand consists of the factors or subcategories purchasing power, market size and article turnover.
In the second step (and this is my problem), I would now like to balance the main categories as well. Since one determinant consists of only one factor and the other determinant of three factors, I do not know how to proceed and would be very pleased to hear your opinions.
I hope I explained it better this time.
Hello Carlo,
Not sure if I understood this correctly, but if you have an issue with the number of dimensions (Attributes) per determinant, why not apply dimensionality reduction techniques like PCA?
Thanks
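One possible reading of the PCA suggestion is to reduce each determinant to a single component, so that every determinant enters the clustering as exactly one attribute. Note that this does aggregate the factors of a determinant, which the original question hoped to avoid. The sketch below is plain Python rather than a RapidMiner process, and it rests on assumptions: the function name, the power-iteration approach and the numbers for the determinant homogeneity of demand are invented for illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return the scores of `rows` on their first principal axis.

    `rows` is a list of equal-length lists (one list per country,
    one value per factor of a single determinant).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix of the factors (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration for the dominant eigenvector. The uneven start
    # vector is a simplification to avoid starting exactly orthogonal
    # to the dominant direction in common cases.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to follow
            break
        v = [x / norm for x in w]

    # Project the centered data onto the principal axis.
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


# Made-up values for four countries:
# [purchasing power, market size, article turnover]
homogeneity_of_demand = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 9.0],
    [4.0, 8.0, 12.0],
]

scores = first_principal_component(homogeneity_of_demand)
print(scores)
```

The resulting scores would still need to be normalized like any other attribute before clustering, since their scale depends on the variance of the underlying factors.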