Text mining on a specific section in PDF files
Mahmud_elabo
I want to do text mining on a specific section (for example, just the abstracts) of PDF files.
Can anyone help here, please?
Thanks so much in advance.
kayman
The Read Document operator lets you read your PDF as text, so you can use all of the text mining / NLP magic as if it were a text file.
Mahmud_elabo
@kayman
I tried that, but as I mentioned, I have 200 PDF files and I need to do text mining on just a specific section, like the abstracts or the introductions.
kayman
Then you need to combine it with loop documents. Point it to the folder with your PDFs and extract the data you need, one file at a time, until all 200 are done.
So basically, create a process that works for one PDF first, and then use it to loop through all of your PDFs one by one. Whether it's 1, 20, 200 or 2000 PDFs doesn't make a difference.
You just have to decide whether you want the results combined in a collection or finalised inside the loop process.
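For anyone who wants to see the same "make it work for one, then loop" idea outside of RapidMiner, here is a rough Python sketch. It is not the RapidMiner process itself. The function names, the heading patterns, and the use of the third-party pypdf package to read the PDFs are all assumptions for illustration, and real PDFs often need different heading patterns.

```python
import re
from pathlib import Path

# Assumed pattern for the heading that follows an abstract; adjust per corpus.
SECTION_END = r"(?:\d+\.?\s*)?(?:introduction|keywords|key words)\b"


def extract_section(text, start="abstract", end_pattern=SECTION_END):
    """Return the text between the `start` heading and the next line that
    begins with `end_pattern` (or the end of the text), or None if the
    `start` heading is not found."""
    pattern = re.compile(
        rf"\b{re.escape(start)}\b[:.\s]*(.*?)(?=^\s*{end_pattern}|\Z)",
        re.IGNORECASE | re.DOTALL | re.MULTILINE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces left over from the PDF layout.
    return " ".join(match.group(1).split())


def read_pdf_text(path):
    """Read a whole PDF as plain text (assumes pypdf is installed)."""
    from pypdf import PdfReader

    reader = PdfReader(str(path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def extract_from_folder(folder, read_text=read_pdf_text):
    """Apply the single-file step to every PDF in `folder`, one by one,
    and combine the results in a dict keyed by file name."""
    results = {}
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        results[pdf_path.name] = extract_section(read_text(pdf_path))
    return results
```

Calling `extract_from_folder("my_pdfs")` (a hypothetical folder name) would give one entry per PDF, with `None` for files where no abstract heading was found, so those can be checked by hand.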
Mahmud_elabo
@kayman
Thank you so much. I wonder if there is any video or tutorial showing this process?
kayman
Have you tried RapidMiner Academy? There is plenty of training on NLP / text mining there, and on loops. You may need to combine a few of them, but there is a ton of info there.
Searching YouTube will also turn up some good info on text mining with RapidMiner.
Mahmud_elabo
@kayman
Yes, I have tried the RapidMiner community and I looked for this process on YouTube, but I did not find anything covering what I need.