Why does Naive Bayes return a confidence either 0 or 1 for every sample?
fstarsinic
I'm just guessing but is this telling me that there is some attribute the algorithm is keying on and discarding everything else? Is there a way to take the results and look at the predictions + the other attributes together in a correlation matrix to see if that is the case? I can't picture that with NB. Seems more of an NN kinda thing or a tree thing.
Anyway, 0 and 1 only?... that can't be a good sign. What does that indicate?
varunm1
"Is there a way to take the results and look at the predictions + the other attributes together in a correlation matrix to see if that is the case?"
If I understand this correctly, you want to find the correlation between the predicted output and the regular attributes used in the model. If so, yes: connect the "exa" port of the performance operator to a correlation matrix operator, and select the "include special attribute" option in the correlation matrix operator.
Also, what do the performance metrics indicate? Is this model predicting with high accuracy?
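Outside the tool, the same idea can be sketched in plain Python: treat the prediction as one more numeric column and compute its Pearson correlation with each attribute. This is only an illustration. The rows, the attribute names (`attr_a`, `attr_b`, `attr_c`), and the `prediction` key are made up, and this is not how the correlation matrix operator is implemented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)

def correlations_with_prediction(rows, target="prediction"):
    """Correlate every non-target column with the target column."""
    target_values = [row[target] for row in rows]
    return {
        name: pearson([row[name] for row in rows], target_values)
        for name in rows[0]
        if name != target
    }

# Hypothetical scored examples (attributes plus the model's predicted class).
rows = [
    {"attr_a": 1.0, "attr_b": 5.0, "attr_c": 4.0, "prediction": 0},
    {"attr_a": 2.0, "attr_b": 5.0, "attr_c": 3.0, "prediction": 0},
    {"attr_a": 3.0, "attr_b": 5.0, "attr_c": 2.0, "prediction": 1},
    {"attr_a": 4.0, "attr_b": 5.0, "attr_c": 1.0, "prediction": 1},
]

result = correlations_with_prediction(rows)
for name, value in result.items():
    print(name, value)
```

An attribute whose correlation with the prediction is close to +1 or -1 is a candidate for "the one the model is keying on", although correlation only captures linear relationships.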
Do let us know if you need more info.
fstarsinic
Thank you. The results are not awful. Many predictions make me happy, so that's good. The predictions make sense, as I would expect. I have a VERY unbalanced dataset, so some of the stats are not that meaningful.
fstarsinic
This is what looks odd to me. There are only a few test samples here, but the result is always the same regardless of sample size: confidence (predicting 0 or 1) is always either 0% or 100%. Seems likely something is wrong.
varunm1
Are you sure it is always 0 and 1? I see some of them are less than 1 and greater than zero based on this image. Can you check the Data View instead of the statistics view or you can open charts?
fstarsinic
Yes, all 0s or 1s for confidence, with nothing else. I checked the data. Here's a sample of it.
The vertical axis above is the number of samples. The horizontal axis shows the different confidence values (only 2).
varunm1
Oh, so the algorithm is not learning about class 0. How are your precision values for class 0? This sometimes happens with highly imbalanced datasets, where the algorithm just pushes everything to the class with the most samples.
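To make the point about per-class precision concrete, here is a minimal sketch with made-up labels: 95 examples of class 1, 5 of class 0, and a model that predicts class 1 for everything. The function name `per_class_metrics` and the data are purely illustrative.

```python
def per_class_metrics(actual, predicted, labels=(0, 1)):
    """Precision and recall per class; None when the denominator is zero."""
    metrics = {}
    for label in labels:
        pairs = list(zip(actual, predicted))
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else None
        recall = tp / (tp + fn) if (tp + fn) else None
        metrics[label] = {"precision": precision, "recall": recall}
    return metrics

# Hypothetical imbalanced test set: the model never predicts class 0.
actual = [1] * 95 + [0] * 5
predicted = [1] * 100

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
metrics = per_class_metrics(actual, predicted)

print("accuracy:", accuracy)
print("class 0:", metrics[0])
print("class 1:", metrics[1])
```

Overall accuracy looks fine here, but recall for class 0 is zero and its precision is undefined because the class is never predicted. That is the pattern to look for in the performance results.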
fstarsinic
Well there are only 2 classes so if it's learning about class 1 wouldn't it follow that it was automatically learning about class 0?
varunm1
What I mean by "not learning" is this. Naive Bayes assumes that all attributes are independent of each other given the class (this sometimes works and sometimes doesn't). If your data has complex interactions between attributes that add more information to the model, naive Bayes fails to find them, because it works on conditional independence (one attribute is treated as unconnected to another). When these algorithms fail to learn, they predict all or most samples as the majority class (1 in your case). I guess your data is a case where this principle breaks down. Imbalanced data also has a huge effect on these algorithms.
Machine learning is also subject to the No Free Lunch theorem: we never know in advance which algorithm will fit our data, which is why we try multiple models.
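The independence assumption is also one common reason naive Bayes confidences end up at, or very near, 0 or 1. The sketch below is a simplified illustration under stated assumptions, not a diagnosis of the dataset in this thread: two classes, and every attribute favors class 1 by the same likelihood ratio. Because the model multiplies per-attribute likelihoods as if each were independent evidence, the posterior odds grow exponentially with the number of attributes.

```python
def naive_bayes_posterior(prior_1, likelihood_ratio, n_attributes):
    """P(class 1 | evidence) for a two-class naive Bayes model in which each
    of n_attributes has the same ratio P(x | class 1) / P(x | class 0).

    Posterior odds = prior odds * likelihood_ratio ** n_attributes.
    """
    prior_odds = prior_1 / (1 - prior_1)
    posterior_odds = prior_odds * likelihood_ratio ** n_attributes
    return posterior_odds / (1 + posterior_odds)

# Each attribute is only mild evidence (ratio 2.0), but the product saturates.
for n in (1, 5, 20, 50):
    print(n, naive_bayes_posterior(0.5, 2.0, n))
```

With one attribute the confidence is about 0.67. With fifty such attributes it is indistinguishable from 1 at typical display precision. If many attributes are strongly correlated, the same evidence is effectively counted several times, which pushes confidences to the extremes even faster.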
Marco_Barradas
fstarsinic
You said "I have a VERY unbalanced dataset". Have you done all the preprocessing and sampling before applying your learner? If you didn't, the model is biased, since it is easier to predict the class that has more examples in your dataset, so it needs your help.
You can see how this affects the model and how to solve it in these videos:
https://academy.rapidminer.com/learn/video/sampling-weighting-intro
https://academy.rapidminer.com/learn/video/sampling-weighting-demo
And for Naive Bayes:
https://academy.rapidminer.com/learn/video/naive-bayes-intro
https://academy.rapidminer.com/courses/nave-bayes-demo
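As a rough illustration of the sampling step mentioned in this reply, here is a minimal sketch of random undersampling of the majority class in plain Python. The function name `undersample_majority`, the `label` key, and the data are hypothetical, and this is only one of several balancing strategies (weighting and oversampling are others). Balance only the training data, and keep the test data at its original distribution.

```python
import random

def undersample_majority(rows, label_key="label", seed=42):
    """Randomly downsample every class to the size of the smallest class."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    smallest = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced

# Hypothetical training set: 95 examples of class 1, 5 of class 0.
training_rows = (
    [{"x": i, "label": 1} for i in range(95)]
    + [{"x": i, "label": 0} for i in range(5)]
)

balanced_rows = undersample_majority(training_rows)
counts = {}
for row in balanced_rows:
    counts[row["label"]] = counts.get(row["label"], 0) + 1
print(counts)
```

Undersampling discards majority-class examples, so with a very small minority class it can leave too little data to train on. In that case weighting or oversampling may be a better fit.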