How to read a PDF file in RapidMiner

KanikaAg15
New Altair Community Member
Hi,
I have a PDF file that contains both text and tabular content. I would like to build a pipeline that reads only specified tables from the PDF. Can anyone recommend a process for this?
The first requirement is reading the PDF into RapidMiner.
The second requirement is extracting information from the PDF.
Best Answer
Hi @KanikaAg15,
You'll need to install the Text Processing extension, which lets you extract the text from the PDF.
Another extension that might be useful is PDF Table Extraction.
This course should also help: https://academy.rapidminer.com/learn/course/text-and-web-mining-with-rapidminer/text-and-web-mining/lets-get-started
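If the extracted text keeps each table row on its own line, picking out one specific table can be done with a small post-processing step. The snippet below is only a rough sketch in plain Python, not a RapidMiner operator: the function name `extract_table_rows`, the `header_keyword` parameter, and the sample text are all made up for illustration, and it assumes columns are separated by two or more spaces or a tab, which will not hold for every PDF.

```python
import re

def extract_table_rows(text, header_keyword, min_columns=2):
    """Return the rows of the first table whose header line contains header_keyword.

    Hypothetical helper: assumes one table row per line and columns
    separated by a tab or by two or more whitespace characters.
    """
    rows = []
    in_table = False
    for line in text.splitlines():
        # Split the line into cells and drop empty fragments.
        cells = [c.strip() for c in re.split(r"\s{2,}|\t", line.strip()) if c.strip()]
        if not in_table:
            # Look for the header line that marks the start of the wanted table.
            if header_keyword in line and len(cells) >= min_columns:
                in_table = True
                rows.append(cells)
            continue
        # A line with too few cells (e.g. a blank line) ends the table.
        if len(cells) < min_columns:
            break
        rows.append(cells)
    return rows

# Made-up sample standing in for text already extracted from a PDF.
sample_text = """Quarterly report

Some introductory paragraph text.

Region    Units    Revenue
North     120      3400
South     95       2810

Closing remarks follow the table.
"""

rows = extract_table_rows(sample_text, "Region")
for row in rows:
    print(row)
```

Matching on a header keyword is just one way to say which table you want. If your tables are not laid out with consistent spacing, a dedicated table extraction step is likely to be more reliable than text splitting.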