Optimization Problem

Question

Hi,

I would like to find the best classification model w.r.t. accuracy for a given example set. To achieve the best results, my idea is to evaluate different supervised learners and optimize their parameters. In addition, different feature selection algorithms should be applied to provide the most suitable input for the parameter optimization of each learner. So, my idea is something like a nested model:

for each learner that should be evaluated
    for each example set determined by a particular feature selection
        perform parameter optimization for the given feature set and learner
return: model with maximal accuracy

What do you think about this idea? Does it make sense to mix feature selection and learner parameter optimization to find the most accurate model? Or would you proceed differently in that case? Are other approaches more common in practice?

I am of the opinion that the most accurate model can only be found when different example sets are provided for the parameter optimization, so that a high number of combinations is available for the performance evaluation. Correct me if I'm wrong. :-)

If my idea is OK, I would like to ask for your help modelling this use case in RapidMiner. It should be something like the sample 05_Features/10_ForwardSelection.xml, but instead of using just the NearestNeighbor as learner it should use a parameter optimization like 07_Meta/01_ParameterOptimization.xml.

This is the code for the feature selection:

However, I have not managed to replace the NearestNeighbor with a parameter optimization. Could you help me?

Regards,
Martin
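The nested search described in the question can be sketched outside RapidMiner to make the loop structure concrete. The following is only an illustration in plain Python, not a RapidMiner process: the toy data, the two learners (`knn_predict`, `centroid_predict`), their parameter grids and the function `nested_search` are all hypothetical names made up for this sketch. It also assumes an exhaustive enumeration of feature subsets and a single hold-out set, where the samples mentioned above use forward selection and cross-validation.

```python
from itertools import combinations, product

# Toy data (assumption): feature 0 separates the classes, feature 1 is noise.
TRAIN = [((0.1, 5.0), 0), ((0.2, 1.0), 0), ((0.3, 9.0), 0),
         ((0.9, 5.5), 1), ((1.0, 0.5), 1), ((1.1, 8.5), 1)]
TEST = [((0.15, 8.0), 0), ((0.25, 0.8), 0), ((0.95, 1.2), 1), ((1.05, 9.2), 1)]

def project(rows, cols):
    """Keep only the selected feature columns of every example."""
    return [(tuple(x[c] for c in cols), y) for x, y in rows]

def knn_predict(train, x, k):
    """Majority vote among the k nearest training examples."""
    nearest = sorted(train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x)))[:k]
    votes = [y for _, y in nearest]
    return max(set(votes), key=votes.count)

def centroid_predict(train, x, p):
    """Label of the closest class mean under a Minkowski-style distance with exponent p."""
    by_label = {}
    for feats, y in train:
        by_label.setdefault(y, []).append(feats)
    def dist(label):
        pts = by_label[label]
        mean = [sum(col) / len(pts) for col in zip(*pts)]
        return sum(abs(a - b) ** p for a, b in zip(mean, x))
    return min(by_label, key=dist)

# Hypothetical learners with their parameter grids.
LEARNERS = {"knn": (knn_predict, {"k": [1, 3]}),
            "centroid": (centroid_predict, {"p": [1, 2]})}

def accuracy(predict, train, test, **params):
    return sum(predict(train, x, **params) == y for x, y in test) / len(test)

def nested_search(train, test, n_features):
    """Loop over learners, feature subsets and parameter combinations."""
    best = (-1.0, None, None, None)  # (accuracy, learner, columns, parameters)
    for name, (predict, grid) in LEARNERS.items():
        for size in range(1, n_features + 1):
            for cols in combinations(range(n_features), size):
                tr, te = project(train, cols), project(test, cols)
                for values in product(*grid.values()):
                    params = dict(zip(grid, values))
                    acc = accuracy(predict, tr, te, **params)
                    if acc > best[0]:  # the first best combination wins ties
                        best = (acc, name, cols, params)
    return best

best = nested_search(TRAIN, TEST, n_features=2)
print(best)
```

Note that the accuracy of the winning combination is an optimistic estimate, because the same hold-out set is used both for choosing the combination and for reporting its performance.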

Legacy User · Answer

Hi Tobias,

yes, that was exactly what I was looking for. Thank you.

Regards,
Martin

TobiasMalbrecht · Answer

Hi Martin,

you simply have to put another learner into your process after the [tt]GridParameterOptimization[/tt] operator. For that learner you should set the optimized parameters via a [tt]ParameterSetter[/tt]. The following process gives you an idea of how to do that:

Hope that helps,
Tobias
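The pattern described here (the optimization returns a parameter set and a performance value, and a separate learner is then trained with those parameters) can be illustrated with a small Python sketch. This is not RapidMiner code and not the process referred to above; `fit_knn`, `grid_search` and the toy data are hypothetical names chosen for the illustration.

```python
from itertools import product

def fit_knn(train, k):
    """Return a model (a predict function) that remembers the training data and k."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        votes = [label for _, label in nearest]
        return max(set(votes), key=votes.count)
    return predict

def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)

def grid_search(train, validation, grid):
    """Return (best_params, best_accuracy); note that no model is returned."""
    best_params, best_acc = None, -1.0
    for values in product(*grid.values()):
        params = dict(zip(grid, values))
        acc = accuracy(fit_knn(train, **params), validation)
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc

# Toy one-dimensional data (assumption): (feature value, label).
TRAIN = [(0.1, 0), (0.2, 0), (0.3, 0), (0.35, 1), (0.9, 1), (1.0, 1), (1.1, 1)]
VALIDATION = [(0.33, 0), (0.15, 0), (0.95, 1), (1.05, 1)]

# Step 1: the search yields parameters and a performance value only.
best_params, best_acc = grid_search(TRAIN, VALIDATION, {"k": [1, 3, 5]})

# Step 2: a fresh learner receives the optimized parameters and is trained
# on all available examples; this is the model handed to later steps.
final_model = fit_knn(TRAIN + VALIDATION, **best_params)
print(best_params, best_acc)
```

The point of the split is that the search step and the model-building step are separate: whatever consumes a model downstream gets `final_model`, not the output of the search itself.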

Legacy User · Answer

Hi Sebastian et al.,

thank you for your answer. However, I'm still not able to replace the simple NearestNeighbor model (used for the feature selection optimization) with an optimized learner model returned by a GridParameterOptimization. This is my current non-working model:

The problem is the first operand of the XValidation operator, which expects a model but gets a ParameterSet and a PerformanceVector. I have no idea how I can return the best model found by GridParameterOptimization and pass it to the next operator, ApplierChain. Could you please help me extend my model?

Thank you.

Regards,
Martin