Cross-validation Features

JohnNash2000 New Altair Community Member
edited November 5 in Community Q&A
Hello, I am currently performing cross-validation (CV), and "Forward Selection" runs on the training side of that process. How can I output the chosen features once CV has completed? I've tried many approaches, including the "Weights to Data" and "Data to Weights" operators, but neither outputs the chosen features. Does anyone know how I can extract the chosen features from the "Cross Validation" process?

Thank you

Answers

  • JohnNash2000
    JohnNash2000 New Altair Community Member
    Hello @varunm1

    You are 100% correct: there is no final set of features, since each iteration of CV has its own feature set. I recently read the blog post about contamination ("Avoiding Accidental Contamination of Data [3 Examples]"), so I moved my feature selection process from outside of CV to inside. When the feature selection process was outside, I had a single set of features chosen from the entire training data. That is what I was looking for, and I became so fixated on finding out how to get it that I never stopped to think about why.

    Thank you



  • varunm1
    varunm1 New Altair Community Member
    That's true, @JohnNash2000. If we are validating a model, preprocessing steps such as sampling and feature selection should be applied on the training side only. If we apply them to the whole data set, the selection has already seen the test examples, which biases the model and sometimes overestimates the performance.
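
The point made in this thread (selection inside CV yields one feature set per fold, not one final set) can be illustrated outside RapidMiner with a minimal plain-Python sketch. Everything in it is an assumption made for illustration: the synthetic data, the nearest-centroid classifier, the inner 70/30 holdout used to score candidate features, and the helper names `make_data`, `centroid_accuracy`, `forward_selection`, and `cross_validate`. It is not how the "Cross Validation" or "Forward Selection" operators are implemented.

```python
import random
from collections import Counter


def make_data(n_rows=120, n_features=6, seed=0):
    """Synthetic two-class data; only features 0 and 1 carry signal (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_rows):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[1] -= 1.5 * label
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_tr, y_tr, X_te, y_te, feats):
    """Fit a nearest-centroid classifier on `feats`; return accuracy on the test rows."""
    centroids = {}
    for label in set(y_tr):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    correct = 0
    for x, t in zip(X_te, y_te):
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(feats, centroids[c])),
        )
        correct += pred == t
    return correct / len(y_te)


def forward_selection(X, y, n_features):
    """Greedy forward selection, scored on an inner holdout of the data passed in."""
    cut = int(0.7 * len(X))
    X_in, y_in, X_val, y_val = X[:cut], y[:cut], X[cut:], y[cut:]
    selected, best = [], -1.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scores = {
            f: centroid_accuracy(X_in, y_in, X_val, y_val, selected + [f])
            for f in candidates
        }
        top = max(scores, key=scores.get)
        if scores[top] <= best:  # stop when no candidate improves the score
            break
        selected.append(top)
        best = scores[top]
    return selected


def cross_validate(X, y, k=5, seed=0):
    """Outer k-fold CV; feature selection sees only the training part of each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    results = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        X_te, y_te = [X[i] for i in test_idx], [y[i] for i in test_idx]
        feats = forward_selection(X_tr, y_tr, len(X[0]))
        results.append((feats, centroid_accuracy(X_tr, y_tr, X_te, y_te, feats)))
    return results


X, y = make_data()
results = cross_validate(X, y)
for fold, (feats, acc) in enumerate(results, start=1):
    print(f"fold {fold}: features={feats} accuracy={acc:.2f}")

# How often each feature was chosen across folds.
counts = Counter(f for feats, _ in results for f in feats)
print("selection frequency:", dict(counts))

# A single feature set for a deployed model comes from a separate run on all
# training data; the CV accuracies above estimate how that procedure performs.
final_features = forward_selection(X, y, len(X[0]))
print("features from full training data:", final_features)
```

The per-fold lists may differ from one another, which is why there is no single "chosen" set to export from the CV itself. The selection frequency is one way to summarize them, and the last step shows where a single feature set would come from if one is needed.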