Feature generation
User118888
Hello,
I'm researching feature generation.
Auto Model offers a feature generation option. Which algorithm does RapidMiner use to generate new features from existing ones?
Thank you.
Accepted answers
IngoRM
Hi,
We use a multi-objective optimization approach based on evolutionary algorithms that simultaneously minimizes the predictive error of a model and the complexity of the feature space. This introduces regularization into the feature space optimization, similar to the regularization commonly used in machine learning in general. As a result, our approach does not suffer from feature bloat (unlike, for example, most genetic programming approaches) and is much more robust against overfitting. The complexity is calculated from the number of features and, in the case of feature generation, the number and complexity of the applied functions.
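To make the idea concrete, here is a minimal sketch (not RapidMiner's actual implementation) of multi-objective evolutionary feature selection: each individual is a binary mask over the features, the two objectives are predictive error and the number of selected features, and selection keeps the Pareto-non-dominated set. Everything here (the toy data, the single-bit-flip mutation, the least-squares error proxy) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: only features 0 and 1 carry signal; the rest are noise.
n, d = 200, 8
X = rng.normal(size=(n, d))
y = X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

def mse(mask):
    """Objective 1: error of a least-squares fit on the selected subset
    (training error as a cheap stand-in for validation error)."""
    if not mask.any():
        return float(np.var(y))          # no features -> predict the mean
    Xs = X[:, mask]
    w, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return float(np.mean((Xs @ w - y) ** 2))

def dominates(a, b):
    """Pareto dominance on (error, complexity): a is no worse on both
    objectives and strictly better on at least one."""
    return all(x <= z for x, z in zip(a, b)) and a != b

def pareto_front(pop):
    scored = [(tuple(m), (mse(m), int(m.sum()))) for m in pop]
    front = {m for m, s in scored
             if not any(dominates(s2, s) for _, s2 in scored)}
    return [np.array(m, dtype=bool) for m in front]

# Evolutionary loop: mutate every survivor, keep the non-dominated set.
pop = [rng.random(d) < 0.5 for _ in range(20)]
for _ in range(40):
    children = []
    for m in pop:
        child = m.copy()
        flip = rng.integers(d)           # flip one random feature bit
        child[flip] = ~child[flip]
        children.append(child)
    pop = pareto_front(pop + children)

# The result is a trade-off curve, not a single winner: each survivor
# pairs a feature count (complexity) with the best error found for it.
for m in sorted(pop, key=lambda m: int(m.sum())):
    print(int(m.sum()), "features -> MSE", round(mse(m), 4))
```

Treating complexity as a second objective, rather than folding it into one weighted score, is what gives the regularization effect described above: solutions that add features without reducing error are dominated and die out, so the front never bloats.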
You can learn more in my PhD thesis here:
http://www-ai.cs.tu-dortmund.de/auto?self=$Publication_fz5hgy8b
In fact, many of my scientific publications deal with automatic feature engineering for both supervised and unsupervised learning:
http://www-ai.cs.tu-dortmund.de/PERSONAL/mierswa.html
Or in this webinar here:
https://www.youtube.com/watch?v=oSLASLV4cTc
Or this white paper here:
https://rapidminer.com/resource/multi-objective-optimization-ebook/
Or in a series of blog posts here:
https://community.rapidminer.com/discussion/52782/multi-objective-feature-selection-part-4-making-better-machine-learning-models
I am sure there are also more places where we talked about this but that should be enough for now ;-)
Hope this helps,
Ingo
All comments
User118888
Hi Ingo,
Thanks for your detailed and quick answer. I'll look into these resources.
Best regards,
Murat