Gaussian Naive Bayes Formula on RapidMiner
Hello everyone!
I know that RapidMiner uses a Gaussian distribution in Naive Bayes. But when I compare the result I calculated manually with the result from RapidMiner, they are really different. So I am wondering whether RapidMiner uses a different formula or I just calculated it wrongly.
I use this formula to calculate the mean: (1/n) * (sum of xi), and this one to calculate the variance: (1/(n-1)) * (sum of (xi - mean)^2).
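For reference, the two formulas above can be sketched in Python together with the standard Gaussian density that a Gaussian Naive Bayes model evaluates. This is only a minimal sketch: the function names and the sample values are made up for illustration, and nothing here is confirmed to match RapidMiner's internal implementation.

```python
import math

def mean(xs):
    # (1/n) * (sum of xi)
    return sum(xs) / len(xs)

def sample_variance(xs):
    # (1/(n-1)) * (sum of (xi - mean)^2)
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def gaussian_pdf(x, mu, var):
    # Standard normal density with mean mu and variance var
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

# Made-up attribute values for one class (illustration only)
values = [64.0, 68.0, 70.0, 72.0]
mu = mean(values)
var = sample_variance(values)
print(mu, var, gaussian_pdf(69.0, mu, var))
```

Comparing `mu` and `var` with the values shown in the tool's distribution table, before comparing any final probabilities, makes it easier to see at which step the two calculations start to differ.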
I want to know which formula RapidMiner uses to calculate Gaussian NB. Is it the same as the formula I use above?
Thank you.
Answers
Hi @ikayunida123,
You can find an Excel file used to calculate the probabilities from the "Golf" dataset using NB formulas by following this link.
If you obtain different results, it may be because RapidMiner calculates the probabilities with Laplace correction by default
and you calculated them without Laplace correction.
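To illustrate what Laplace correction changes, here is a minimal Python sketch of textbook add-one smoothing for a nominal attribute. The function name `conditional_probs` is hypothetical, the counts are illustrative values in the style of the "Golf" data, and whether RapidMiner applies exactly this variant is an assumption, not something confirmed here.

```python
def conditional_probs(counts, alpha=0.0):
    # counts: value -> number of rows with that value within one class
    # alpha = 0 gives raw relative frequencies,
    # alpha = 1 gives add-one (Laplace) smoothing
    total = sum(counts.values())
    k = len(counts)  # number of distinct attribute values
    return {v: (c + alpha) / (total + alpha * k) for v, c in counts.items()}

# Illustrative counts of Outlook values within the class Play = no
outlook_given_no = {"sunny": 3, "overcast": 0, "rain": 2}

raw = conditional_probs(outlook_given_no)
smoothed = conditional_probs(outlook_given_no, alpha=1.0)
print(raw)
print(smoothed)
```

Without the correction, a value that never occurs in a class gets probability 0 and wipes out the whole product; with it, every value keeps a small non-zero probability. The same expression, (count + 1) / (class total + number of distinct values), can be typed directly into a spreadsheet cell.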
Regards,
Lionel
Hello @lionelderkrikor. It works nicely, thank you.
But some values still come out differently. For example, RapidMiner displays some of the standard deviations as 0,001, but in MS Excel they come out as 0. I wonder whether RapidMiner and MS Excel calculate it in different ways?
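One possible explanation for a 0,001 versus 0 difference, stated here as an assumption and not confirmed in this thread, is that a tool may clamp the standard deviation to a small positive minimum, because a standard deviation of exactly 0 makes the Gaussian density undefined (division by zero). The sketch below shows the idea; `MIN_STD` and `floored_std` are hypothetical names, and the value 0.001 is chosen only to match the number displayed above.

```python
import math

MIN_STD = 0.001  # hypothetical floor, chosen to match the displayed 0,001

def sample_std(xs):
    # Square root of the (n-1)-divisor variance
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def floored_std(xs, min_std=MIN_STD):
    # Never return less than min_std, so the density stays defined
    return max(sample_std(xs), min_std)

# An attribute that has the same value in every row of a class
constant = [5.0, 5.0, 5.0]
plain = sample_std(constant)
floored = floored_std(constant)
print(plain, floored)
```

If the attributes that show 0 in the spreadsheet are exactly those that are constant within a class, that would be consistent with this kind of floor.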
Hi @ikayunida123,
Have you set the number of digits after the decimal point to 3 or more in Excel?
Regards,
Lionel
@lionelderkrikor Yes, I have set the cell format to Number and added several decimal places, but the result is still the same. I looked up the formula on other websites, and people say that MS Excel uses Bessel's correction to calculate the standard deviation.
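Bessel's correction simply means dividing by n-1 instead of n. In Excel, STDEV.S divides by n-1 and STDEV.P divides by n. The short Python sketch below shows the two side by side using the standard library; the data values are made up for illustration.

```python
import statistics

# Made-up values, illustration only
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_std = statistics.pstdev(data)  # divides by n (no correction)
sample_std_dev = statistics.stdev(data)   # divides by n-1 (Bessel's correction)
print(population_std, sample_std_dev)
```

The two results differ by a factor of sqrt(n / (n-1)), so the gap is noticeable for small groups and shrinks as n grows. Note that this correction cannot turn a standard deviation of exactly 0 into a non-zero value.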
Hi @lionelderkrikor, do you have an example of Laplace correction in MS Excel? I'm so confused right now because my data set is quite big :catsad: