
How to use standardization/normalization correctly on train/test data sets?

User: "Fred12"
New Altair Community Member
Updated by Jocelyn

Hi,

I read that normalization/standardization should be fitted on the training set only, and that the resulting preprocessing model should then be applied to the test data set.

But what about the validation set when I am doing cross-validation? Should I also normalize separately inside the inner X-Validation, i.e. apply the normalization ranges learned from the training data of each X-Validation fold to the validation set of that fold?
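
As a minimal sketch of the "fit on train, apply to test" rule mentioned above, written in plain Python rather than as a RapidMiner process: the helper names `fit_zscore` and `apply_zscore` and the toy data are assumptions made up for illustration, and z-score standardization stands in for whichever normalization method is actually used.

```python
from statistics import mean, pstdev

def fit_zscore(rows):
    """Learn per-column mean and standard deviation from the given rows."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns so we never divide by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def apply_zscore(rows, params):
    """Transform rows with previously learned parameters (no re-fitting)."""
    means, stds = params
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Toy data, purely illustrative.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[4.0, 40.0]]

params = fit_zscore(train)                  # statistics come from the training set only
train_scaled = apply_zscore(train, params)
test_scaled = apply_zscore(test, params)    # the test set reuses the training statistics
```

The point of the split into two functions is that the test rows never contribute to the mean or standard deviation; they are only transformed with values learned elsewhere.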

 

For now, my process looks like this:

Unbenannt.PNG

 

I apply normalization once in the outer "big" process. Inside the grid optimizer I have an X-Validation with an SVM inside, but I am not applying any further normalization there. My question is: would it be better if my process looked like this:

 

Unbenannt2.PNG

 

 

Here I also apply normalization to the validation data of the inner X-Validation (or is it called the test data?). If so, what about the normalization in the outer big process: how should I use that normalization for my test data on the outside without already having applied it to the training data set used for the X-Validation?
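
The second setup could be sketched roughly as follows, again in plain Python and not as a RapidMiner process. The helpers `fit_zscore`, `apply_zscore` and `scaled_folds`, the interleaved fold assignment, and the toy data are all hypothetical choices for illustration; the SVM and the grid search themselves are left out.

```python
from statistics import mean, pstdev

def fit_zscore(rows):
    """Learn per-column mean and standard deviation from the given rows."""
    columns = list(zip(*rows))
    return [mean(c) for c in columns], [pstdev(c) or 1.0 for c in columns]

def apply_zscore(rows, params):
    """Transform rows with previously learned parameters (no re-fitting)."""
    means, stds = params
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def scaled_folds(rows, k):
    """Yield (training_scaled, held_out_scaled) for each of k folds.

    The normalization is re-fitted on the training part of every fold, so the
    held-out part of that fold never influences the statistics.
    """
    for i in range(k):
        held_out = rows[i::k]  # every k-th row, starting at index i
        training = [r for j, r in enumerate(rows) if j % k != i]
        params = fit_zscore(training)
        yield apply_zscore(training, params), apply_zscore(held_out, params)

# Toy data, purely illustrative.
data = [[float(i), float(i * i)] for i in range(10)]
outer_train, outer_test = data[:8], data[8:]

# Inner loop: what the grid optimizer would repeat for each parameter setting.
inner = list(scaled_folds(outer_train, k=4))

# After tuning: fit once on all of the outer training data, then apply the
# learned statistics to the outer test data.
final_params = fit_zscore(outer_train)
outer_test_scaled = apply_zscore(outer_test, final_params)
```

In this sketch the outer normalization is a separate fit on the full outer training set, so it does not leak into the inner folds, and the outer test data is only ever transformed, never used for fitting.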

 

Last question:

Some people (including my supervisor) say that the held-out data inside cross-validation is called test data, not validation data, and that validation data is the separate data set tested outside, entirely independent of the X-Validation data sets. Isn't it the other way around?
