Naive Bayes vs Naive Bayes (Kernel)

Thiru · September 2020

Hi all,
My data set contains numerical values, which are configured as data type "real". I'm able to use both operators, Naive Bayes as well as Naive Bayes (Kernel), with slightly different performance. However, I also see in the RM documentation that only Naive Bayes (Kernel) is to be used for numeric attributes.
Should I consider only the NB (Kernel) result, even though RapidMiner accepts the normal Naive Bayes operator too? Or are both acceptable for numerical attributes?

Regards,
Thiru

BalazsBarany · September 2020

Hi Thiru,

The difference between stock NB and NB (Kernel) is the way numeric attributes are put into the model. You can easily compare this by looking at the model output charts.

Naive Bayes (which can be used with numeric attributes) just assumes that the numerical inputs are normally distributed within each class, calculates the parameters (mean and standard deviation) of this normal distribution, and uses it for assigning likelihoods to classes. You see two (or more) Gaussian curves in the model, one per class.
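
To make this concrete, here is a minimal sketch in plain Python for a single numeric attribute. It is not RapidMiner's implementation; the toy data and the function names are made up for illustration.

```python
import math
from statistics import mean, stdev

def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Made-up training values of one numeric attribute, grouped by class label.
train = {
    "yes": [4.8, 5.1, 5.0, 5.3, 4.9],
    "no": [6.9, 7.2, 7.0, 6.8, 7.4],
}

# "Fitting" is just estimating one mean and one standard deviation per class.
params = {label: (mean(vals), stdev(vals)) for label, vals in train.items()}

def classify(x):
    """Return the class with the highest prior * Gaussian likelihood."""
    total = sum(len(vals) for vals in train.values())
    scores = {
        label: (len(train[label]) / total) * gaussian_pdf(x, mu, sigma)
        for label, (mu, sigma) in params.items()
    }
    return max(scores, key=scores.get)

print(classify(5.2))
print(classify(6.7))
```

With several attributes, the per-attribute likelihoods are simply multiplied together for each class; that is the "naive" independence assumption.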

Naive Bayes (Kernel) instead tries to fit a smoothed curve to the actual values. That is why the operator has some additional numeric parameters (such as the bandwidth) that control how the curve is fitted. If your attribute values don't follow a normal distribution, this can fit them better, so the prediction can be better, at the cost of a longer calculation time and more complex models (even with the danger of overfitting in some conditions).
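
A rough sketch of the idea in plain Python, assuming a Gaussian kernel with a fixed bandwidth. The data and the names `kernel_density` and `bandwidth` are illustrative only and are not taken from RapidMiner.

```python
import math
from statistics import mean, stdev

def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def kernel_density(x, samples, bandwidth):
    """Average of Gaussian bumps, one centered on each training value."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)

# Made-up values of one attribute within one class: two clusters, nothing in between.
samples = [1.0, 1.2, 0.9, 1.1, 5.0, 5.1, 4.9, 5.2]
mu, sigma = mean(samples), stdev(samples)

# Compare a single fitted Gaussian with the kernel estimate,
# once inside a cluster (1.0) and once in the empty gap (3.0).
for x in (1.0, 3.0):
    print(x, gaussian_pdf(x, mu, sigma), kernel_density(x, samples, bandwidth=0.3))

# A very small bandwidth makes the curve spiky: high exactly on a training
# value, almost zero right next to it. This is the overfitting risk.
spiky_on = kernel_density(1.0, samples, bandwidth=0.01)
spiky_off = kernel_density(1.05, samples, bandwidth=0.01)
print(spiky_on, spiky_off)
```

The single Gaussian puts its peak near 3.0, where there is no data at all, while the kernel estimate follows the two clusters. The bandwidth decides how smooth the curve is, which is why it needs tuning.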

If you find a good set of parameters for your use case and cross-validate correctly, both will give you results you can rely on. Depending on your use case, you might want to select the variant giving better results, or the simpler model.
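
As a sketch of such a comparison outside RapidMiner, the following plain Python runs the same k-fold split for both variants on one numeric attribute. Everything here (data, function names, the fixed bandwidth of 0.3, the simple index-based folds) is an assumption made for illustration.

```python
import math
from statistics import mean, stdev

def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_likelihood(x, samples):
    """Stock variant: one normal distribution fitted to the class values."""
    return gaussian_pdf(x, mean(samples), stdev(samples))

def kernel_likelihood(x, samples, bandwidth=0.3):
    """Kernel variant: average of Gaussian bumps centered on each value."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)

def predict(x, train, likelihood):
    """Return the class with the highest prior * likelihood."""
    by_class = {}
    for value, label in train:
        by_class.setdefault(label, []).append(value)
    scores = {
        label: len(vals) / len(train) * likelihood(x, vals)
        for label, vals in by_class.items()
    }
    return max(scores, key=scores.get)

def cross_val_accuracy(data, likelihood, k=4):
    """k-fold cross-validation: row i is tested in fold i % k."""
    correct = 0
    for fold in range(k):
        test = [row for i, row in enumerate(data) if i % k == fold]
        train = [row for i, row in enumerate(data) if i % k != fold]
        correct += sum(predict(x, train, likelihood) == y for x, y in test)
    return correct / len(data)

# Made-up data: class "a" has two clusters, class "b" lies between them.
data = (
    [(v, "a") for v in (1.0, 1.1, 0.9, 1.2)]
    + [(v, "b") for v in (3.0, 3.1, 2.9, 3.2)]
    + [(v, "a") for v in (5.0, 5.1, 4.9, 5.2)]
    + [(v, "b") for v in (2.8, 3.0, 3.1, 2.9)]
)

acc_gaussian = cross_val_accuracy(data, gaussian_likelihood)
acc_kernel = cross_val_accuracy(data, kernel_likelihood)
print("Gaussian:", acc_gaussian, "Kernel:", acc_kernel)
```

The important part is that both variants are scored on the same held-out folds. If the scores end up close, the simpler Gaussian model is the easier one to explain and maintain.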

Regards,
Balázs

Thiru · September 2020

@BalazsBarany, thanks for your reply. This clarifies it.
