Loading model parameters
Ryujakk
Hello,
I've recently started using RapidMiner and have found it rather intuitive so far. However, now I'm stuck!
I have created a text classification model using TextInput (BinaryOccurrences) and XValidation with LibSVMLearner (as in the example 01_TextClassificationXVal.xml). At the end of the process, I added a ModelWriter.
In order to classify unknown texts, I created a new process. In this process, I chain a TextInput with the same parameters as before, a ModelLoader, a ModelApplier, and a ClassificationPerformance.
Since I use the same text inputs for learning and testing, I would expect the same performance. However, in the learning phase I get an accuracy of 82.58%, but in the testing phase I get only 26.49%...
Any ideas?
Edit: I was using "prune_below 90%" for learning and "prune_below -1" for testing. This gave the warning "[Warning] Kernel Model: The number of regular attributes of the given example set does not fit the number of attributes of the training example set, training: 288, application: 2836".
When I set both to the same value, I get the expected high testing accuracy. However, I don't understand why that is the case! Any explanation is still welcome.
How should I correctly load my model and classify an unknown text?
Edit again: Solved it for good by saving and loading the word vector used. phew! 8)
Answers
Hi,
If you train on one set, you have to store the word list that was used for attribute generation. If you load this word list when applying the model, the same attributes will be created.
Greetings,
Sebastian
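For anyone who lands here with the same mismatch warning, the idea can be sketched outside RapidMiner in plain Python. This is only an illustration, not RapidMiner's API: the function names `build_word_list` and `binary_occurrences`, the toy documents, and the pruning threshold are all made up for the example. It shows why different pruning settings give a different number of attributes, and why reusing the stored word list keeps training and application in agreement.

```python
import json
from collections import Counter


def build_word_list(docs, min_doc_fraction=0.0):
    """Hypothetical helper: keep words that occur in at least
    min_doc_fraction of the documents (a stand-in for pruning)."""
    doc_freq = Counter()
    for text in docs:
        doc_freq.update(set(text.lower().split()))
    threshold = min_doc_fraction * len(docs)
    return sorted(w for w, n in doc_freq.items() if n >= threshold)


def binary_occurrences(text, word_list):
    """Hypothetical helper: one 0/1 attribute per word in word_list."""
    tokens = set(text.lower().split())
    return [1 if w in tokens else 0 for w in word_list]


train_docs = ["the cat sat", "the dog barked", "the cat purred"]

# Different pruning settings produce different attribute counts,
# which is what the "number of regular attributes" warning reports.
pruned = build_word_list(train_docs, min_doc_fraction=0.5)
unpruned = build_word_list(train_docs)
print(len(pruned), len(unpruned))

# Store the word list used for training (here as a JSON string; in
# practice it would be written to a file next to the model) ...
stored = json.dumps(pruned)

# ... and load it again at application time, so unknown texts are
# mapped onto exactly the attributes the model was trained on.
restored = json.loads(stored)
vector = binary_occurrences("the cat and the mouse", restored)
print(restored, vector)
```

Words that were never seen in training ("and", "mouse") are simply ignored, so the vector always has the length the model expects.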