Text Mining Problem - Keyword search and customized tokenization
MasseAlarm
New Altair Community Member
Dear RapidMiner Community,
For a university project I have to evaluate about 900 business reports, and I want to do this with RapidMiner. Unfortunately, I'm still a complete beginner with the software and need your help.
I have installed the Text Processing Extension for RapidMiner.
The problem:
I need to search the reports for 120 specified keywords. Whenever one of these keywords occurs, I have to extract the 20 words before and the 20 words after it in order to understand the context.
My current state:
With "Tokenize" I get sentences as output, but how do I get exactly the 20 words before and after the keyword?
With "Filter Tokens (by Content)" I can only get one of the 120 keywords displayed at a time. How do I make sure that all 120 keywords are taken into account in a single run?
I've been stuck on this for quite a while now and have searched through all kinds of forum entries without finding a suitable solution. I hope you can help me. Thanks a lot!
Best regards
Answers
Hi @MasseAlarm, I would strongly recommend getting a foundation in RapidMiner before tackling this problem:
https://academy.rapidminer.com/learning-paths/get-started-with-rapidminer-and-machine-learning
https://academy.rapidminer.com/courses/text-and-web-mining-with-rapidminer
Scott2
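For orientation, the task described in the question is usually called a keyword-in-context search: tokenize each report into words, then take a fixed window of words around every keyword hit. The following is only a rough sketch in plain Python, not a RapidMiner process. The function name `keyword_contexts`, the regular expression used for tokenization, and the choices of single-word keywords and case-insensitive matching are all assumptions made for illustration.

```python
import re


def keyword_contexts(text, keywords, window=20):
    """Return (keyword, context) pairs for every keyword hit in `text`.

    Assumptions (not from the original post): keywords are single words,
    matching is case-insensitive, and a "word" is a run of word characters
    that may contain internal hyphens or apostrophes.
    """
    # Word-level tokenization instead of sentence-level tokenization.
    tokens = re.findall(r"\w+(?:[-']\w+)*", text)
    # A set lets all keywords be checked in one pass over the tokens.
    wanted = {k.lower() for k in keywords}

    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() in wanted:
            # Clip the window at the start and end of the document.
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            hits.append((tok.lower(), " ".join(tokens[start:end])))
    return hits


if __name__ == "__main__":
    sample = ("The company reported strong revenue growth this year "
              "despite rising risk in several markets")
    # window=2 keeps the demo short; the post asks for window=20.
    for keyword, context in keyword_contexts(sample, ["revenue", "risk"], window=2):
        print(keyword, "->", context)
```

Checking every token against a set of keywords handles all 120 keywords in one pass, and the window is simply cut short when a keyword sits near the beginning or end of a report. Multi-word keywords or stemming would need additional handling that this sketch does not cover.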