Classification
Nancy
New Altair Community Member
Hi,
I have a text document. Is it possible to classify the words in the document on the basis of their dependencies? I applied Naive Bayes for classification, but I only get the graph and the parameters of the distribution.
Thanks,
Nancy
Answers
Hi Nancy,
After you run the NaiveBayes operator, only the relative distribution plots and the distribution parameters are displayed. To apply the Naive Bayes model and actually classify documents, you have to use the ModelApplier operator. This operator adds a prediction column to your example set, along with one confidence (probability) column for each class.
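The split between learning a model and applying it can be sketched outside RapidMiner as well. The following is a minimal plain-Python illustration, not the RapidMiner implementation: the function names `train_naive_bayes` and `apply_model`, the toy documents, and the use of add-one smoothing are all assumptions made for this example. Training only produces distribution parameters; applying the model is what yields a prediction and a confidence per class.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels):
    """Estimate class counts and per-class word counts from tokenised documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in zip(docs, labels):
        word_counts[label].update(words)
        vocab.update(words)
    return {"class_counts": class_counts, "word_counts": word_counts, "vocab": vocab}

def apply_model(model, words):
    """Return (prediction, confidences) for one tokenised document."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # class prior
        for w in words:
            if w not in model["vocab"]:
                continue  # skip words never seen during training
            # Add-one smoothing (an assumption of this sketch) avoids zero probabilities.
            score += math.log((counts[w] + 1) / (total_words + vocab_size))
        log_scores[label] = score
    # Turn log scores into confidences that sum to 1.
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp_scores.values())
    confidences = {c: v / norm for c, v in exp_scores.items()}
    prediction = max(confidences, key=confidences.get)
    return prediction, confidences

# Toy training data, invented for illustration.
train_docs = [
    "win money now".split(),
    "cheap money offer".split(),
    "meeting agenda attached".split(),
    "project meeting tomorrow".split(),
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(train_docs, train_labels)
prediction, confidences = apply_model(model, "cheap money".split())
print(prediction, confidences)
```

The `prediction` and `confidences` values play the role of the prediction and confidence columns that ModelApplier adds to the example set.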
Regarding dependencies: Naive Bayes assumes that the attributes (words) are independent of each other and hence does not consider any dependencies. Nevertheless, it is a good text classification method and is used, for example, by most e-mail spam filters to distinguish spam from non-spam messages.
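Written out in the usual textbook form (the symbols c for a class and w_1, ..., w_n for the words of a document are notation introduced here, not taken from RapidMiner), the independence assumption means a class is scored as:

$$
P(c \mid w_1, \dots, w_n) \;\propto\; P(c) \prod_{i=1}^{n} P(w_i \mid c)
$$

Each word contributes its own factor P(w_i | c), and no factor involves more than one word, which is why dependencies between words cannot influence the result.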
Other learning techniques can take dependencies into account to some extent. Support Vector Machine (SVM) models are one example, and linear SVMs are often very accurate text classifiers. In RapidMiner, you can choose between several SVM implementations: JMySVM, LibSVM, EvoSVM, and others.
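To give a feel for what a linear SVM text classifier computes, here is a rough plain-Python sketch. It is not how JMySVM, LibSVM, or EvoSVM work internally: it simply minimises the regularised hinge loss by batch subgradient descent, and the function names, the toy documents, and the values of `lam`, `lr`, and `epochs` are assumptions chosen for illustration.

```python
from collections import Counter

def train_linear_svm(docs, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by batch subgradient descent.

    docs are token lists; labels are +1 or -1.
    """
    features = [Counter(words) for words in docs]  # bag-of-words counts
    w = {}   # one weight per word
    b = 0.0  # bias term
    n = len(docs)
    for _ in range(epochs):
        # Gradient of the regularisation term.
        grad_w = {word: lam * value for word, value in w.items()}
        grad_b = 0.0
        for x, y in zip(features, labels):
            margin = y * (sum(w.get(word, 0.0) * c for word, c in x.items()) + b)
            if margin < 1:  # inside the margin or misclassified
                for word, c in x.items():
                    grad_w[word] = grad_w.get(word, 0.0) - y * c / n
                grad_b -= y / n
        for word, g in grad_w.items():
            w[word] = w.get(word, 0.0) - lr * g
        b -= lr * grad_b
    return w, b

def predict(model, words):
    """Return +1 or -1 depending on which side of the hyperplane the document falls."""
    w, b = model
    score = sum(w.get(word, 0.0) * c for word, c in Counter(words).items()) + b
    return 1 if score >= 0 else -1

# Toy training data, invented for illustration: +1 = spam, -1 = non-spam.
train_docs = [
    "win money now".split(),
    "cheap money offer".split(),
    "meeting agenda attached".split(),
    "project meeting tomorrow".split(),
]
train_labels = [1, 1, -1, -1]

svm_model = train_linear_svm(train_docs, train_labels)
print(predict(svm_model, "cheap money".split()))
print(predict(svm_model, "meeting tomorrow".split()))
```

Unlike Naive Bayes, the word weights here are fitted jointly against the whole training set rather than estimated one word at a time.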
To evaluate the performance of a modelling technique, you can use cross-validation, i.e. the XValidation operator.
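The idea behind cross-validation can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not of the XValidation operator itself: the helper names, the interleaved fold assignment, and the majority-class baseline learner are assumptions made for the example. The data is split into k folds, and each fold is used once for testing a model trained on the remaining folds.

```python
from collections import Counter

def k_fold_splits(n_examples, k):
    """Yield (train_indices, test_indices) for k roughly equal folds."""
    indices = list(range(n_examples))
    for fold in range(k):
        test_idx = indices[fold::k]  # every k-th example, starting at `fold`
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx

def cross_validate(examples, labels, k, train_fn, predict_fn):
    """Average accuracy over k folds; each model never sees its own test fold."""
    accuracies = []
    for train_idx, test_idx in k_fold_splits(len(examples), k):
        model = train_fn([examples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, examples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k

# Hypothetical baseline learner: always predict the most frequent training label.
def train_majority(examples, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, example):
    return model

# Toy data, invented for illustration.
docs = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9"]
labels = ["spam"] * 6 + ["ham"] * 3

accuracy = cross_validate(docs, labels, 3, train_majority, predict_majority)
print(accuracy)
```

Any learner can be plugged in through `train_fn` and `predict_fn`, so the same loop can be used to compare Naive Bayes against an SVM on identical folds.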
For further information, I recommend the RapidMiner Online Tutorial (see "RapidMiner Tutorial" in the RapidMiner Help menu) and our free introductory RapidMiner webinars.
Best regards,
Ralf