"Text mining"
Andreas_M_
New Altair Community Member
Hi,
I'm new to RapidMiner and am still finding my way around it.
What I want to do: I have about 300 PDF documents and one word list with about 100 different words. I want to find the total number of occurrences of these words in each PDF document. I would also like to know the total number of words each PDF document contains.
Can somebody help me with modelling the process?
Thanks in advance.
Answers
Hello,
Did you find a working process in the end? Would you please share it?
I have the same problem with many PDF documents.
Thank you
Hi,
The operator for reading PDF files is Read Document. You can combine it with Loop Files to read several files.
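In case it helps to see the counting step spelled out, here is a rough Python sketch of the logic, not a RapidMiner process. It assumes the text of each PDF has already been extracted (for example by the operators mentioned above). The function name `count_words`, the sample documents, and the word list are all made up for illustration, and the simple regex tokenizer is an assumption that may split words differently than RapidMiner does.

```python
import re
from collections import Counter


def count_words(text, wordlist):
    """Return (total word count, {listed word: occurrences}) for one document."""
    # Assumption: a "word" is a run of letters or digits, compared case-insensitively.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    return len(tokens), {w: counts[w.lower()] for w in wordlist}


# Hypothetical sample data standing in for text extracted from the PDFs.
documents = {
    "a.pdf": "Text mining is fun. Mining text takes time.",
    "b.pdf": "No relevant words here.",
}
wordlist = ["text", "mining", "data"]

# One result per document: total words plus a count for each listed word.
results = {name: count_words(text, wordlist) for name, text in documents.items()}

for name, (total, hits) in results.items():
    print(name, total, hits)
```

Summing the per-word counts for a document gives the combined number of word list hits, if that single figure is what is needed.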
Best,
Martin