A question about naive bayes based text classification
Hi,
I am testing the naive bayes(NB) for text classification. To my understanding, the result should not be affected by the tf-idf vector of the text. Because NB considers the frequency of each term(t) in each category(c), i.e., p(t | c), and this information is stored in WordList, not the term vectors(i.e., the ExampleSet). Right?
However, after I changed the tf-idf values in ExampleSet, for example, by multiplying a weight x, 0<x<1, the accuracy is changed differently according to different weight x. WHY?
Sincerely yours,
gfyang
I am testing the naive bayes(NB) for text classification. To my understanding, the result should not be affected by the tf-idf vector of the text. Because NB considers the frequency of each term(t) in each category(c), i.e., p(t | c), and this information is stored in WordList, not the term vectors(i.e., the ExampleSet). Right?
However, after I changed the tf-idf values in ExampleSet, for example, by multiplying a weight x, 0<x<1, the accuracy is changed differently according to different weight x. WHY?
Sincerely yours,
gfyang