hi,
Hi,
I noticed in the sample processes under -> Template -> Churn Modeling that there is a parameter optimization operator inside an X-Validation.
I know there is literature saying that in X-Validation the validation data is used for parameter tuning — is that example showing how it is meant to be done?
I am just curious, because putting parameter optimization inside an X-Validation does not make much sense to me. The dataset is split, and inside the parameter optimization the model that was built on the training data is tested on that same training data, which should result in overfitting. The parameters get optimized for the training data, not for the real data, and only for the subset of the data inside that X-Validation fold; the best-parameter model is then retrieved and applied to the fold's test data. As a result you get a different model with different parameters on each run inside the X-Validation, depending on how the dataset happens to be split. But what one actually wants is one general model that fits the real data best (at the end you will only have 1 model, not 10 different ones).
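To make the structure I mean concrete: here is a rough sketch in Python/scikit-learn (an analogy, not the actual RapidMiner process — operator names and the decision-tree learner are my own assumptions). With a parameter search nested inside each cross-validation fold, each fold can come out with a different "best" parameter set:

```python
# Hypothetical sketch: parameter tuning inside each X-Validation fold.
# Each outer fold runs its own grid search, so the winning parameters
# (and thus the model) can differ from fold to fold.
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
grid = {"max_depth": [2, 4, 8, None]}

chosen = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    # The inner grid search sees only this fold's training data.
    search = GridSearchCV(DecisionTreeClassifier(random_state=0), grid, cv=3)
    search.fit(X[train_idx], y[train_idx])
    chosen.append(search.best_params_["max_depth"])
    # The best-parameter model is then applied to this fold's test data.
    search.score(X[test_idx], y[test_idx])

print(chosen)  # the winning max_depth may differ across the 5 folds
```

(Note that scikit-learn's GridSearchCV scores candidates on inner validation splits rather than literally on the training data itself, but the "one tuned model per fold" structure is the same.)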
It seems to work for that model, but I'm a bit sceptical whether that is a valid / good approach to modeling, because the parameters should be optimized for the test set — which means optimized for the real use case — and not tuned to overfit the training data...
What do you think? Is this valid? Or are both approaches valid and good modeling?
I personally would put the X-Validation inside the Grid Optimization operator...
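For comparison, the arrangement I would prefer looks roughly like this in the same hedged scikit-learn analogy (again an assumption of mine, not the template's process): the cross-validation runs inside the grid optimizer, so every parameter set is scored by a full X-Validation and one single best parameter set comes out at the end.

```python
# Sketch of the reverse arrangement: X-Validation inside the grid optimizer.
# Each candidate parameter set is evaluated with 5-fold cross-validation,
# and one best parameter set (hence one final model) is selected.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
grid = {"max_depth": [2, 4, 8, None]}

search = GridSearchCV(DecisionTreeClassifier(random_state=0), grid, cv=5)
search.fit(X, y)          # runs a 5-fold X-Validation per parameter set
print(search.best_params_)  # exactly one parameter set, for one final model
```

This way the parameter choice reflects the average test-fold performance, and you end up with a single model instead of one per fold.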