Colleagues:
I've used the Preventive Maintenance Machine Failure data set that comes with RapidMiner to experiment with creating various classification models. I saved a model I developed to predict machine failure using the "Write Model" operator. This model was the output of a fair amount of optimization and feature selection experimentation using "Optimize Parameters", "Cross Validation", and other feature-selection-related operators.
I'd like to use the "Read Model" operator to load the model I developed, load new data that the model hasn't seen, and apply predictions using the aforementioned model. I would then like to set various thresholds using the "Set Threshold" and "Apply Threshold" operators (which act on the confidence attributes added by applying the model) to see the effect on prediction outcomes.
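To make clear what I am trying to achieve, here is a minimal Python sketch of the idea (this is not RapidMiner code). It assumes the scored data carries a confidence column named "confidence(Fail)" and that a row is labelled "Fail" when that confidence exceeds the threshold; the function name, column names, and sample values are purely illustrative.

```python
def apply_threshold(rows, threshold, positive="Fail", negative="No Fail"):
    """Re-label each scored row by comparing its confidence to a threshold.

    Assumption: every row has a 'confidence(<positive>)' entry, as a scored
    example set would after a model has been applied to it.
    """
    key = f"confidence({positive})"
    relabelled = []
    for row in rows:
        if key not in row:
            # The situation I seem to hit: the confidence attribute is missing.
            raise KeyError(f"missing attribute: {key}")
        new_row = dict(row)  # copy so the input rows stay unchanged
        new_row["prediction"] = positive if row[key] > threshold else negative
        relabelled.append(new_row)
    return relabelled


# Illustrative scored rows (made-up values, not from the real data set).
scored = [
    {"confidence(Fail)": 0.35, "confidence(No Fail)": 0.65},
    {"confidence(Fail)": 0.80, "confidence(No Fail)": 0.20},
]

# Sweep a few thresholds to see how the predictions change.
for t in (0.3, 0.5, 0.9):
    labels = [r["prediction"] for r in apply_threshold(scored, t)]
    print(t, labels)
```

Lowering the threshold flags more machines as "Fail", raising it flags fewer; that trade-off is what I want to explore with the saved model.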
The file "Create_and_Apply_Threshold_Example_No_Error.png" shows a very simple process (based on the tutorial example) in which I can set and apply thresholds, but only by using a very generic setup with a learner (k-NN in this case) and no cross validation. All attributes are recognized: 26 in the data and 3 added by the model, for a total of 29.
The file "Create_and_Apply_Threshold_Example_Error.png" shows another process in which I load the aforementioned saved model using "Read Model" and apply it to new data. As the error message shows, the output of "Apply Model" contains only 26 attributes, and the "Apply Threshold" operator returns the error message shown in "Create_and_Apply_Threshold_Example_No_Error_Nr_2.jpg". For some reason, the attributes added by applying the model to new data (the Fail / No Fail predictions and confidences) are not recognized by the "Apply Threshold" operator.
I went back to the original process in which I created the model and tried to apply thresholds to new data there, but I still get the same error messages. Once again, the attributes added by applying the model to new data (the Fail / No Fail predictions and confidences) are not recognized by the "Apply Threshold" operator.
The only way I can get the "Apply Threshold" operator to work is within the simplest of processes, as mentioned above.
I imagine I am missing a very obvious point, as setting and applying thresholds appears to be dead simple. To ensure that all metadata would be available at run time, I stored the test data, the new data, and the predictive model in my local repository before trying to build a process that included setting and applying thresholds using these objects.
Thanks for any suggestions and best wishes, Michael Martin