Question on Applying Model/normalizing
stereotaxon
New Altair Community Member
Hi,
I fit Weka's MLP model with the option to normalize my data turned on, saved the model, and now I want to apply that model to a new dataset.
I'm wondering how RM/Weka handle the normalization. That is, I want my new dataset to be scaled the same way as my old dataset. For example, say the normalization did something like this:
variable1 in old dataset:
1, 2, 3, 4, 5, 6 --> 0, .2, .4, .6, .8, 1
variable1 in new dataset:
1, 2, 3 --> ????
Would it normalize var1 in the new dataset to have values of 0, .2, .4 (desired) or 0, .5, 1 (not good)?
Thanks!
Mike
Answers
Hi,
I must admit I don't know exactly how stored Weka models behave, but I assume (hope) the model stores the normalization parameters as well and hence normalizes new data according to the same parameters.
May I ask why you do not use the corresponding "native" RapidMiner operators, i.e. the "Normalization" operator in combination with "NeuralNet"?
Cheers,
Tobias
I use the Weka MLP because it seems to be faster and I'm working with big data. For the normalization question, I think I'll just run a small test where I fit normalized data that includes the holdout set, with missing values for the label, and then score the holdout set using the model. I'm (already) pretty sure the predictions will be the same.
Thanks,
Mike
Hi,
You could also create the normalization model with the "Normalization" operator with the parameter "create_preprocessing_model" turned on. Then train the Weka learner on the normalized data, and later apply both models, the normalization model first and then the learner, to your application data.
Cheers,
Ingo
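
To make the difference between the two outcomes in the question concrete, here is a minimal sketch in plain Python. It is not RapidMiner or Weka code and does not claim to show how either tool stores its parameters; the function names `fit_min_max` and `apply_min_max` are made up for illustration, and the sketch assumes simple min-max scaling to [0, 1] with the old dataset containing the values 1 through 6.

```python
def fit_min_max(values):
    """Learn the scaling parameters (min, max) from the training values."""
    return min(values), max(values)


def apply_min_max(values, params):
    """Scale values with previously learned (min, max) parameters."""
    lo, hi = params
    span = hi - lo
    if span == 0:
        # Constant attribute: map everything to 0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


old_values = [1, 2, 3, 4, 5, 6]  # assumed training data
new_values = [1, 2, 3]           # assumed application data

# Desired: reuse the parameters learned on the old dataset.
stored_params = fit_min_max(old_values)
reused = apply_min_max(new_values, stored_params)

# Not good: learn fresh parameters from the new dataset alone.
refit = apply_min_max(new_values, fit_min_max(new_values))

print("reused parameters:", reused)
print("refit on new data:", refit)
```

With stored parameters the new values keep their positions on the old scale; refitting stretches them across the whole [0, 1] range, so the model would see inputs that mean something different from what it was trained on. Saving a preprocessing model alongside the learner amounts to the first case.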