"Simple Text Classification - Help"

New Altair Community Member

Feb 14, 2011

Updated Nov 5, 2024 by Jocelyn

Hello,

I am trying to classify documents (.txt), i.e. sort them into groups.

What I've done so far:

Process Documents from Files (2 categories / classes) -> Tokenize -> Filter Stopwords ==> Learner ==> Apply Model. The document to classify comes from Read Document -> Process Documents (Tokenize, Filter).

There are 6 documents for each class (Process Documents from Files) and a single document to classify.

Is this the right way to classify text / documents in RapidMiner? I am asking because the results are confusing. Just to make sure: I want RapidMiner to tell me "Your single .txt file belongs to class/category A or B".

Thanks in advance!


New Altair Community Member

Feb 16, 2011

Search BI Processes for "Example - Classify Text Language" and remove the n-gram operator. You will then have a working text classifier. I use it for several text classification applications.

land

New Altair Community Member

Feb 17, 2011

Hi,
you have to make sure that the same word list is used when the model is applied. Otherwise the attributes will not match and the TF-IDF values will differ. So forward the word list from the Process Documents operator in the training part to the word list input port of Process Documents in the application part.

We have a webinar that introduces text classification tasks in more detail.
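To illustrate why the word list has to be shared, here is a minimal sketch in plain Python (not RapidMiner code). It assumes a tiny made-up stopword list, made-up example documents, and a nearest-neighbour comparison standing in for the learner, which this thread does not name. All function names are hypothetical. The point is that the vocabulary and the IDF weights are built once from the training documents and then reused unchanged for the new document, so both sides end up with the same attributes.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not RapidMiner's list).
STOPWORDS = {"the", "a", "is", "and", "of", "to", "in"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def fit_word_list(train_docs):
    """Build the word list and IDF weights from the training documents only."""
    doc_tokens = [tokenize(d) for d in train_docs]
    vocab = sorted({t for tokens in doc_tokens for t in tokens})
    doc_freq = Counter(t for tokens in doc_tokens for t in set(tokens))
    n_docs = len(doc_tokens)
    idf = {t: math.log(n_docs / doc_freq[t]) + 1.0 for t in vocab}
    return vocab, idf

def vectorize(text, vocab, idf):
    """TF-IDF vector over the GIVEN word list; unknown words are ignored."""
    counts = Counter(tokenize(text))
    vec = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def classify(text, train_vecs, labels, vocab, idf):
    """Nearest neighbour by cosine similarity (vectors are unit length)."""
    vec = vectorize(text, vocab, idf)
    sims = [sum(a * b for a, b in zip(vec, tv)) for tv in train_vecs]
    return labels[sims.index(max(sims))]

# Made-up training documents, two per class.
train_docs = [
    "The striker scored a goal in the football match",
    "The team won the league after a late goal",
    "The bank raised interest rates on loans",
    "Stock markets fell as interest rates rose",
]
labels = ["A", "A", "B", "B"]

# Training part: the word list is created here ...
vocab, idf = fit_word_list(train_docs)
train_vecs = [vectorize(d, vocab, idf) for d in train_docs]

# ... and application part: the SAME word list is reused for the new document.
new_doc = "The goal decided the football match, not the xylophone"
new_vec = vectorize(new_doc, vocab, idf)
predicted = classify(new_doc, train_vecs, labels, vocab, idf)
print(len(new_vec) == len(vocab), predicted)  # prints: True A
```

If `fit_word_list` were called again on the single new document instead, its vector would have different attributes and different IDF weights than the training vectors, which is the mismatch described above.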

Greetings,
Sebastian
