Naive Bayes vs Naive Bayes (Kernel)
Thiru
Hi all,
My data set contains numerical values, configured as data type "real". I'm able to use both the Naive Bayes and the Naive Bayes (Kernel) operators, with slightly different performance. However, as I read the RapidMiner documentation, only Naive Bayes (Kernel) is meant to be used for numeric attributes.
Should I consider only the Naive Bayes (Kernel) result, even though RapidMiner also accepts the normal Naive Bayes operator? Or are both acceptable for numerical attributes?
Regards,
Thiru
Accepted answers
BalazsBaranyRM
Hi Thiru,
The difference between stock Naive Bayes and Naive Bayes (Kernel) is the way numeric attributes are modeled. You can easily compare the two by looking at the model output charts.
Naive Bayes (which can be used with numeric attributes) simply assumes that the numerical inputs are normally distributed, calculates the parameters of this normal distribution, and uses it to assign likelihoods to classes. You see two (or more) Gaussian curves in the model.
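To make the Gaussian assumption concrete, here is a minimal sketch in plain Python for a single numeric attribute. It is not RapidMiner's implementation; the function names (`fit_gaussian_nb`, `predict`) and the training values are made up for illustration.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_gaussian_nb(values_by_class):
    """Estimate prior, mean and standard deviation for each class."""
    total = sum(len(v) for v in values_by_class.values())
    return {
        label: (len(v) / total, mean(v), stdev(v))
        for label, v in values_by_class.items()
    }


def predict(model, x):
    """Return the class with the largest prior * likelihood."""
    scores = {
        label: prior * gaussian_pdf(x, mu, sigma)
        for label, (prior, mu, sigma) in model.items()
    }
    return max(scores, key=scores.get)


# Made-up values of one numeric attribute, grouped by class label.
training = {
    "low": [1.0, 1.2, 0.8, 1.1],
    "high": [3.0, 3.3, 2.9, 3.1],
}
model = fit_gaussian_nb(training)
print(predict(model, 1.05))
print(predict(model, 3.2))
```

Each class is summarized by just two numbers per attribute (mean and standard deviation), which is why the model stays small and fast to compute.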
Naive Bayes (Kernel) instead tries to fit a smoothed curve to the actual values, which is why it offers some additional numeric parameters you can change. If your attribute values don't follow a normal distribution, this curve can fit them better, so the predictions can be better, at the cost of longer calculation time and more complex models (even with the danger of overfitting in some conditions).
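The following sketch illustrates the idea of a smoothed curve using a simple Gaussian kernel density estimate in plain Python. It is an assumption-laden illustration, not the operator's actual algorithm: the `bandwidth` value, the function names, and the bimodal sample values are all hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def kernel_density(x, samples, bandwidth):
    """Average of Gaussian bumps centred on each training value."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)


# Made-up attribute values for one class, clustered around 0 and around 10.
samples = [-0.2, 0.0, 0.2, 9.8, 10.0, 10.2]
bandwidth = 0.5  # hypothetical smoothing width, chosen by hand

# Parameters of the single normal distribution fitted to the same values.
mu, sigma = mean(samples), stdev(samples)

for x in (0.0, 5.0):
    single = gaussian_pdf(x, mu, sigma)
    kernel = kernel_density(x, samples, bandwidth)
    print(f"x={x}: single Gaussian={single:.4f}, kernel estimate={kernel:.4f}")
```

The single Gaussian puts its peak at 5, where no training value lies, while the kernel estimate is high near 0 and close to zero at 5. A very small bandwidth follows the training values ever more closely, which is one way the overfitting risk mentioned above can arise.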
If you find a good set of parameters for your use case and cross-validate correctly, both will give you results you can rely on. Depending on your use case, you might want to select the variant that gives better results, or the simpler model.
Regards,
Balázs
All comments
Thiru
@BalazsBarany, thanks for your reply. This clarifies it.