performance of testing data
rafeena
hi,
I have included images showing how I did my classification. I would like to know how to view the performance of my testing data. Hopefully what I am doing here is correct.
thanks
All comments
lionelderkrikor
Hi @rafeena,
Your process is correct. Your performance vector is given by the ave output port of the Validation operator.
Do you encounter any error with this process?
Regards,
Lionel
rafeena
Hi @lionelderkrikor, it didn't give me any problem. However, I would like to see the performance of my testing file, the one named Retrieve testing data. I believe the performance I got now is for my training data.
IngoRM
Hi,
Just add another Performance operator after the Apply Model (2) which will then calculate the error rates for the provided test data.
Side note: what you have now is not really the training error but the estimate of the test error from a cross-validation. The true training error is what you would get if you applied the model to the complete training data again and calculated the performance for that.
The cross-validated error and the test error should be similar (provided you have enough data and it follows the same distributions).
Hope this helps,
Ingo
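To make the three quantities mentioned above concrete (training error, cross-validated estimate, and test error), here is a minimal Python sketch outside RapidMiner. The toy data, the 1-nearest-neighbour model, the fixed fold assignment, and all function names are assumptions made for illustration; they are not taken from the process discussed in this thread.

```python
# Illustrative sketch only: toy data and a 1-nearest-neighbour model (assumptions).

def predict_1nn(train, x):
    """Return the label of the training example whose feature is closest to x."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def accuracy(train, eval_set):
    """Fraction of eval_set examples that a model built on `train` labels correctly."""
    hits = sum(1 for x, y in eval_set if predict_1nn(train, x) == y)
    return hits / len(eval_set)

def cross_validated_accuracy(data, k):
    """Average accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for i in range(k):
        held_out = data[i::k]  # every k-th example, starting at position i
        rest = [pair for j, pair in enumerate(data) if j % k != i]
        scores.append(accuracy(rest, held_out))
    return sum(scores) / k

# Hypothetical labeled training data: (feature, label), with two noisy labels.
labeled_training = [(1.0, "no"), (2.0, "no"), (3.0, "no"), (4.0, "yes"),
                    (5.0, "no"), (6.0, "yes"), (7.0, "yes"), (8.0, "yes")]

# Hypothetical test data that DOES carry true labels.
labeled_test = [(1.5, "no"), (2.6, "no"), (4.2, "no"), (6.4, "yes"), (7.5, "yes")]

# Training accuracy: model applied back to the data it was built on.
training_accuracy = accuracy(labeled_training, labeled_training)

# Cross-validated accuracy: estimate of test accuracy using training data only.
cv_accuracy = cross_validated_accuracy(labeled_training, k=4)

# Test accuracy: model built on all training data, scored on separate labeled data.
test_accuracy = accuracy(labeled_training, labeled_test)

print(training_accuracy, cv_accuracy, test_accuracy)
```

In this toy setup the training accuracy is perfect because a 1-nearest-neighbour model memorizes its training examples, while the cross-validated value is lower. That gap is the reason the cross-validated number, not the training number, is used as an estimate of test performance.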
rafeena
@IngoRM
Hi. I did it like you said, but the result is not good. The accuracy is 0.
IngoRM
Well, I see that you have changed your process a bit. You seem to select some columns in the training path. Are you sure that you apply the same data transformations to the test data as well?
rafeena
@IngoRM
I am actually running two processes: one selects features using TF-IDF and the other uses entropy. Can you explain more about the data transformations? I probably didn't execute them all.
lionelderkrikor
Hi @rafeena,
What Ingo said means that you have to apply strictly the same preprocessing steps to both your training dataset and your test dataset.
From the screenshot in your previous post, it seems that you are selecting only some features (via the Weight by Information Gain / Select by Weights operators) during your training step.
You have to apply strictly the same selection to your test data.
To have a personalized response, please share your process(es) and all your dataset(s).
Regards,
Lionel
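The advice above (learn the feature selection on the training data, then apply exactly that selection to the test data) can be sketched in plain Python. This is a minimal illustration, not RapidMiner's implementation: the weight used here is a simple difference of class means standing in for information gain, and the word columns, values, and function names are hypothetical.

```python
# Illustrative sketch only: a stand-in weighting scheme and made-up TF-IDF-like rows.

def feature_weights(rows, labels):
    """Weight each column by the absolute difference of its per-class means.
    This is a simple stand-in, NOT information gain."""
    weights = {}
    for name in rows[0]:
        pos = [r[name] for r, y in zip(rows, labels) if y == "yes"]
        neg = [r[name] for r, y in zip(rows, labels) if y != "yes"]
        weights[name] = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return weights

def fit_selection(rows, labels, top_n):
    """Decide which columns to keep, using the TRAINING data only."""
    weights = feature_weights(rows, labels)
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:top_n]

def apply_selection(rows, selected):
    """Keep exactly the selected columns, in the same order.
    Assumption: a term missing from a row has weight 0.0."""
    return [{name: row.get(name, 0.0) for name in selected} for row in rows]

# Hypothetical training rows (term weights per document) with labels.
train_rows = [
    {"hate": 0.9, "love": 0.0, "the": 0.5},
    {"hate": 0.8, "love": 0.1, "the": 0.4},
    {"hate": 0.0, "love": 0.7, "the": 0.5},
    {"hate": 0.1, "love": 0.9, "the": 0.6},
]
train_labels = ["yes", "yes", "no", "no"]

# Hypothetical test row with a different set of columns than the training rows.
test_rows = [{"hate": 0.7, "the": 0.3, "ugly": 0.2}]

selected = fit_selection(train_rows, train_labels, top_n=2)
train_selected = apply_selection(train_rows, selected)
test_selected = apply_selection(test_rows, selected)  # same columns as training

print(selected)
print(test_selected)
```

The point of the design is that the selection is fitted once and reused: if the selection were recomputed on the test data, or skipped for it, the model would see columns it was never trained on, which is one plausible cause of an attribute mismatch or of an accuracy of 0.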
rafeena
Hi @lionelderkrikor,
I have applied the same step, but it says that the attributes do not match. However, I do believe the attributes I used are all the same. Anyway, I have included my datasets. My processes are as in the pictures above.
testing data2.1.xlsx
formspring-project training-2.xlsx
lionelderkrikor
Hi @rafeena,
The working process is in the attached file.
I'm able to obtain a test performance (accuracy) of around 70% (calculated by the Cross Validation operator).
Hope this helps,
Regards,
Lionel
PS: You cannot calculate the "test error" from your dataset "testing data2.1" because it does not contain the true label.
Classification_text_mining.rmp
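The PS above can be illustrated with a small Python sketch: accuracy compares each prediction with a true label, so it cannot be computed when the true labels are missing. The function name and the example values are hypothetical and only meant to show the idea.

```python
# Illustrative sketch only: why a performance measure needs true labels.

def accuracy(predictions, true_labels):
    """Fraction of predictions equal to the true label.
    Raises ValueError when true labels are missing, since there is nothing to compare against."""
    if true_labels is None or any(label is None for label in true_labels):
        raise ValueError("accuracy needs a true label for every example")
    if len(predictions) != len(true_labels):
        raise ValueError("predictions and true labels must have the same length")
    correct = sum(1 for p, t in zip(predictions, true_labels) if p == t)
    return correct / len(predictions)

# Labeled data: predictions can be scored (2 of 3 match here).
labeled_score = accuracy(["yes", "no", "no"], ["yes", "no", "yes"])
print(labeled_score)

# Unlabeled data: the model can still predict, but the predictions cannot be scored.
try:
    accuracy(["yes", "no"], [None, None])
    unlabeled_scored = True
except ValueError as err:
    unlabeled_scored = False
    print("cannot score:", err)
```

A model can still be applied to an unlabeled file to produce predictions; it is only the performance calculation that requires the true label column.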
rafeena
Thank you very much @lionelderkrikor. When you say test error, does this mean I cannot see the accuracy for testing data 2.1?
lionelderkrikor
No, you can't!
Regards,
Lionel
rafeena
@lionelderkrikor
Noted, thanks for your help.
rafeena
@lionelderkrikor
I would like to be clear on training and testing data in RapidMiner. If I do it like the process in the photo, the file named testing data 2.1 is not actually set as my testing data, right? Both my training and testing data are within the file formspring training 2, and RapidMiner will choose randomly which examples will be testing data and which will be training data. Is this correct?