Performance (cost) sample not behaving as expected
MaartenK
I was looking into the performance (cost) component, which comes with a tutorial. The tutorial applies Naive Bayes to the Golf dataset using split validation. The outcome should be that 1 of 4 items is misclassified. However, if I run it, all 4 items are misclassified as follows (play -> prediction):
yes -> no, no -> yes, yes -> no, yes -> no.
My colleague did not get this result. I am running this on an AMD Ryzen 5 3600 with RapidMiner 9.8.001.
I did not change any of the parameters in the tutorial.
I also rebuilt the model from scratch, which gave the same results.
Accepted answers
lionelderkrikor
Hi @MaartenK,
I can only reproduce what you observe if I check "use local random seed" and set "local random seed" = 1992 in the parameters of the Split Validation operator.
Otherwise, if "use local random seed" is unchecked, I do indeed get 25% of the sample misclassified, like your colleague.
So are you sure that you have not checked "use local random seed" in the parameters of the Split Validation operator?
Regards,
Lionel
All comments
MaartenK
Hi Lionel,
That must be it. 1992 was the default local random seed in previous versions of RM, and I set it as my default to be able to reproduce the results from my thesis. The current default is 2001. If I use that, then it works as described in the help.
It is still interesting that a different seed generates such different results (4 classification errors vs. 1). This is probably because the Golf dataset is very small, so the seed decides which few rows end up in the 4-item test set.
Thanks for the fast response!
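
For anyone reading later, the small-sample effect can be sketched outside RapidMiner. The snippet below is only an illustration under stated assumptions: `LABELS` is a hypothetical 14-row, two-class stand-in (not the actual Golf data), the model is a simple majority-class baseline rather than Naive Bayes, and Python's random generator is not the one RapidMiner uses, so the seeds 1992 and 2001 will not reproduce the numbers from this thread. It only shows that with a 4-row test set the error rate can move in steps of 25% from one seed to the next.

```python
import random
from collections import Counter

# Hypothetical stand-in for a 14-row, two-class dataset (labels only).
LABELS = ["yes"] * 9 + ["no"] * 5


def split_error_rate(labels, seed, test_size=4):
    """Shuffle with a seeded RNG, hold out `test_size` rows, and return the
    error rate of a majority-class baseline fitted on the remaining rows."""
    rng = random.Random(seed)        # same seed -> same shuffle -> same split
    shuffled = list(labels)
    rng.shuffle(shuffled)
    test, train = shuffled[:test_size], shuffled[test_size:]
    # Predict the most frequent training label (ties go to the first one seen).
    majority = Counter(train).most_common(1)[0][0]
    errors = sum(1 for label in test if label != majority)
    return errors / len(test)


# The seeds are arbitrary; each one may produce a different held-out set.
for seed in (1992, 2001, 1, 2, 3):
    print(seed, split_error_rate(LABELS, seed))
```

Running it with a fixed seed always gives the same split, which is why a stored non-default seed keeps producing the same unexpected result.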