Sentiment analysis on multiple PDFs
Anlis
New Altair Community Member
I am a master's student in Business Economics and I'm new to RapidMiner.
For my thesis, I have to pre-process multiple PDF files by tokenizing, stemming, transforming case, etc.
If I do this for one file, I get the desired outcome: a processed text. But when I use the loop function to process multiple PDFs, the output is never text, only tables of word counts.
How do I pre-process multiple PDF files and get all the processed texts?
Thank you for helping!
Answers
Hi @Anlis
You should start by using the Process Documents from Files operator.
It reads the PDFs in your folders and outputs an example set containing all the pre-processed files, after applying all the text mining steps.
Here is a link to the course on the Academy:
https://academy.rapidminer.com/learn/course/text-and-web-mining-with-rapidminer/text-and-web-mining/comparison-classification-and-clustering?page=3
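To make the difference between "processed text" and "tables of word counts" concrete, here is a rough sketch in Python rather than RapidMiner. It is only an illustration under assumptions: the PDF text is taken as already extracted into strings, the file names are made up, and `crude_stem` is a hypothetical stand-in for a real stemmer. The point is that each document is processed once and both the processed text and the word counts are kept per file.

```python
import re
from collections import Counter


def crude_stem(token):
    # Hypothetical, very rough suffix stripper used only for illustration;
    # it is not a real stemming algorithm such as Porter.
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    # Transform case, tokenize on letters only, then stem each token.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens]


def process_documents(raw_texts):
    # raw_texts maps a file name to its already extracted text (assumption:
    # PDF extraction happened elsewhere). One row is produced per document.
    rows = []
    for name, text in raw_texts.items():
        tokens = preprocess(text)
        rows.append(
            {
                "file": name,
                "processed_text": " ".join(tokens),  # the text itself
                "word_counts": Counter(tokens),      # the word-count view
            }
        )
    return rows


# Made-up example documents standing in for extracted PDF text.
docs = {
    "a.pdf": "Markets were rising.",
    "b.pdf": "Prices dropped and markets reacted.",
}
rows = process_documents(docs)
for row in rows:
    print(row["file"], "->", row["processed_text"])
```

The word counts are derived from the same tokens as the processed text, so keeping the text alongside the counts costs nothing extra; a table of counts alone cannot be turned back into the original token order.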