Dummy Encoding in RapidMiner
Adi1215
Hi, I am new to RapidMiner and building my first predictive model. While working on feature engineering, I applied dummy encoding to one of the categorical columns, and it produced one column per category in that column. As I understand it, it should ideally produce n-1 columns, because keeping all n introduces multicollinearity. Is there a way to avoid this? Do I need to manually delete one of the generated columns after applying dummy encoding?
Guys, please share your thoughts.
Regards,
Accepted answers
Telcontar120
Most of the modern ML algorithms implemented in RapidMiner include adjustments for perfect multicollinearity where needed, so dummy coding is actually just fine. That said, the Nominal to Numerical operator supports the n-1 encoding approach as well: select the "effect coding" option in the coding type parameter instead of dummy coding, and then specify the omitted categories in the resulting "comparison groups" dialog box. This is tedious for a large number of attributes, though, so if you can use dummy coding, that is preferable.
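To make the n versus n-1 distinction concrete, here is a minimal Python sketch outside RapidMiner. It is not RapidMiner's implementation: the helper names `dummy_encode` and `effect_encode` and the sample `colors` column are made up for illustration, and the effect coding shown follows the usual textbook definition (rows in the comparison group are coded -1), which may not match the operator's output exactly.

```python
def dummy_encode(values, drop=None):
    """One 0/1 column per category; omit the column for `drop` if given."""
    categories = sorted(set(values))
    kept = [c for c in categories if c != drop]
    return {c: [1 if v == c else 0 for v in values] for c in kept}


def effect_encode(values, reference):
    """n-1 columns; rows in the reference category are coded -1 in each."""
    categories = sorted(set(values))
    kept = [c for c in categories if c != reference]
    return {
        c: [1 if v == c else (-1 if v == reference else 0) for v in values]
        for c in kept
    }


# Hypothetical categorical column with n = 3 categories.
colors = ["red", "green", "blue", "green", "red"]

full = dummy_encode(colors)                   # n columns
reduced = dummy_encode(colors, drop="blue")   # n-1 columns, "blue" omitted
effect = effect_encode(colors, reference="blue")

# With all n dummy columns, every row sums to 1, so the columns are
# perfectly collinear with a model's intercept term.
row_sums = [sum(col[i] for col in full.values()) for i in range(len(colors))]

print(len(full), len(reduced), len(effect))  # 3 2 2
print(row_sums)                              # [1, 1, 1, 1, 1]
```

The constant row sum is the source of the multicollinearity concern: omitting one category, or using a comparison group, removes that exact linear dependence.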
All comments
Adi1215
Thanks for this. I'll try this out and let you know.