Normalization Issue
OK... hopefully this will make sense to you because I'm thoroughly confused...
Using Version 4.2, the first file I use, "ModelBuider_v42.xml," builds the model to predict the change in price. The "ModelBuilder" file imports the raw data, normalizes the data, creates a simple linear regression model, writes the model to a file, reloads the model, then applies the model to the previous example set using the ModelApplier. After I run the file, the "Meta Data View" shows the following statistics for the label and prediction respectively: "avg = 0.390 +/- 7.132" and "avg = 0.390 +/- 0.261." In addition, the statistics of all the regular attributes are "avg = 0 +/- 1." So everything appears to look good thus far.
However, my second file, "ModelLoader_v42.xml," is used to import new raw data, load the model, apply the model, and save the results to a comma-separated file. But when I run this file using the same raw data file as before, the "Meta Data View" shows the following statistics for the label and prediction respectively: "avg = 0.390 +/- 7.132" and "avg = 8.846 +/- 1.677." In addition, the regular attributes do not appear to be normalized, e.g. "avg = 65.074 +/- 16.351, avg = 0.337 +/- 2.242, etc." So even though I selected "return_preprocessing_model" in the "Normalization" operator in the model builder file, neither the regular attributes nor the predictions appear to remain normalized.
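For illustration only, here is a minimal sketch in plain Python of the kind of mismatch these numbers suggest. It is not the tool's own operators or XML: the data values and the helper names (fit_zscore, apply_zscore, fit_simple_regression, predict) are made up for the example. It assumes the normalization parameters learned in the builder run are not being applied to the raw data in the loader run. Under that assumption, a regression fitted on normalized inputs gives predictions whose average matches the label only when the stored normalization is applied first.

```python
import statistics

def fit_zscore(values):
    """Learn the (mean, std) of a training column; hypothetical helper."""
    return statistics.fmean(values), statistics.pstdev(values)

def apply_zscore(values, params):
    """Apply previously learned (mean, std) to a column."""
    mean, std = params
    return [(v - mean) / std for v in values]

def fit_simple_regression(xs, ys):
    """Ordinary least squares with one input; returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def predict(xs, model):
    slope, intercept = model
    return [slope * x + intercept for x in xs]

# Made-up data standing in for one regular attribute and the label.
raw_x = [50.0, 60.0, 65.0, 70.0, 80.0]
label = [-1.0, 0.0, 0.5, 1.0, 2.0]

# "Builder" run: learn the normalization, then fit on normalized inputs.
params = fit_zscore(raw_x)
model = fit_simple_regression(apply_zscore(raw_x, params), label)

# "Loader" run, done two ways on the same raw data.
pred_ok = predict(apply_zscore(raw_x, params), model)  # stored normalization applied
pred_raw = predict(raw_x, model)                       # normalization skipped

mean_label = statistics.fmean(label)
mean_ok = statistics.fmean(pred_ok)
mean_raw = statistics.fmean(pred_raw)
print(mean_label, mean_ok, mean_raw)
```

In this sketch the prediction average agrees with the label average only in the first case, which is the same pattern as the builder and loader statistics above. Whether that is actually what happens in the loader file is a guess, not something the numbers prove.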
Now this is where it really gets confusing. Using Version 4.1, when I build the model using the same operators and the same raw data as before, the statistics for the label and prediction respectively are "avg = 0.390 +/- 7.132" and "avg = -1.238 +/- 2.720." The statistics for the regular attributes also appear really off, e.g. "avg = -4.223 +/- 0.004, avg = -0.217 +/- 0.199, etc." for the same attributes as above. Moreover, when I load and run the model, the statistics for the label and prediction respectively are "avg = 0.390 +/- 7.132" and "avg = 0.390 +/- 0.261." In addition, the statistics of all the regular attributes are now normalized again, i.e. "avg = 0 +/- 1."
What is really strange is that the results I got using version 4.2 with "ModelBuider_v42.xml," which I could not duplicate using the "ModelLoader" file, are the same results I got after creating the model and loading it using version 4.1.
Could I have corrupted the results while trying to repeat the process? Or should I have uninstalled Version 4.1 before I installed version 4.2?
Please let me know how I can transfer the xml and data file to you for verification...
Thanks again,
Darrell