Text filtering problem! Please help!
karhunen
Hey community,
I'm new to RapidMiner and I'm trying to filter multiple words from different PDF files.
First I tried to filter just one word, after tokenizing the files, with the "Filter Tokens (by content)" operator.
I used the condition "contains" and specified my "string". This actually works fine.
Now I want to filter multiple words, but I just don't know how to do this.
Can you please help me? I would really appreciate it!
Background:
I'm trying to classify some documents by using a wordlist with positive and negative words.
RapidMiner should analyse the given PDF files with regard to the number of positive and negative words they contain.
Any ideas?
Andrew2
Hello
You could use a word list to filter the document for those words only.
Here is an example that does more than you need.
http://rapidminernotes.blogspot.co.uk/2013/04/finding-needles-in-text-haystacks.html
You will need to make some changes for what you want.
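To illustrate the word-list idea outside of RapidMiner, here is a minimal Python sketch. It is not the process from the linked post. The word lists, the function names, and the simple "more positive than negative" rule are all assumptions made up for this example, and it expects text that has already been extracted from the PDF.

```python
import re

# Hypothetical word lists for illustration; replace with your own.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def filter_tokens(tokens, wordlist):
    """Keep only the tokens that appear in the given word list."""
    return [t for t in tokens if t in wordlist]


def sentiment_counts(text):
    """Count how many tokens fall in the positive and negative lists."""
    tokens = tokenize(text)
    return {
        "positive": len(filter_tokens(tokens, POSITIVE)),
        "negative": len(filter_tokens(tokens, NEGATIVE)),
    }


def classify(text):
    """Label a document by whichever list matched more tokens (assumed rule)."""
    counts = sentiment_counts(text)
    if counts["positive"] > counts["negative"]:
        return "positive"
    if counts["negative"] > counts["positive"]:
        return "negative"
    return "neutral"
```

Filtering against a set keeps the lookup cheap even for long word lists, and the same filter function works for both the positive and the negative list.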
regards
Andrew