Problem with text preprocessing
Hi,
How do I use a stop word dictionary to replace abbreviated words with their original (full) forms and to delete negation words such as "not"?
Where can I download the stop word dictionary?
Thanks
Answers
-
Help me... :(
0 -
Hi @gham
I suggest you look at the 'Text processing' extension for RapidMiner, which contains many useful operators for working with text. Among them is a group of operators named 'Filter stopwords', which do exactly what you need for different languages.
As for downloading the dictionary, I guess there's no 'unified' dictionary, since every text processing task basically needs a different stopword list, so this part needs to be done manually. Maybe you can just google and download a ready-made dictionary or list that someone has already built and shared online.
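To illustrate the idea outside RapidMiner, here is a minimal Python sketch of what such a filter does, assuming the question is about expanding abbreviations and then dropping stopwords and negation words. The word lists and the `preprocess` function are hypothetical examples made up for this post, not part of RapidMiner or of any shared dictionary.

```python
import re

# Hypothetical hand-built lists; replace them with your own.
STOPWORDS = {"the", "a", "an", "is", "to", "of"}
NEGATIONS = {"not", "no", "never"}
ABBREVIATIONS = {"pls": "please", "thx": "thanks", "u": "you"}

def preprocess(text, drop_negations=True):
    """Lowercase, tokenize, expand abbreviations, then drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Replace each abbreviation with its full form; keep other tokens as they are.
    expanded = [ABBREVIATIONS.get(t, t) for t in tokens]
    # Treat negation words as stopwords only when asked to.
    blocked = STOPWORDS | NEGATIONS if drop_negations else STOPWORDS
    return [t for t in expanded if t not in blocked]

print(preprocess("Thx, the product is not bad"))
print(preprocess("Thx, the product is not bad", drop_negations=False))
```

Note that deleting "not" changes the meaning of a phrase like "not bad", so it may be worth keeping negations if the text is later used for something like sentiment analysis.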
1 -
If you are using one of the supported languages, there are operators with built-in dictionaries (e.g., English, German, etc.), so you don't need to download them. If you are using another language, or if you want a custom stopword dictionary, you will need to create it yourself, or find one on the web and download it. That's not functionality that is handled inside RapidMiner, but a little bit of web searching will turn up many useful references.
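As a rough sketch of what a custom dictionary can look like, the Python snippet below assumes a plain text file with one stopword per line. The file layout and the function names `load_stopwords` and `filter_tokens` are assumptions chosen for illustration, so check what format your tool actually expects.

```python
from pathlib import Path

def load_stopwords(path):
    """Read one stopword per line, skipping blank lines."""
    words = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        word = line.strip().lower()
        if word:
            words.add(word)
    return words

def filter_tokens(tokens, stopwords):
    """Keep only the tokens that are not in the stopword set."""
    return [t for t in tokens if t.lower() not in stopwords]
```

For example, if a file `my_stopwords.txt` (a hypothetical name) holds the lines `the` and `not`, then `filter_tokens(["The", "film", "was", "not", "good"], load_stopwords("my_stopwords.txt"))` keeps `film`, `was`, and `good`.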
1 -
Hello @gham, some quick recommendations for you:
• Post your XML process here in this thread (see https://youtu.be/KkgB5QXWXJ8 and "Read Before Posting" on the right when you reply)
• Attach your dataset if possible (use a fictionalized version if there are privacy concerns)
• Make sure you have all necessary extensions installed (see https://youtu.be/pjBqG3xtXx4)
Scott
0 -
Please, my friends, help me out.
0 -
-
I'm sorry - I have no idea what "remove the negation" means.
Scott
0