text processing pdfs

wclaster New Altair Community Member
edited November 2024 in Community Q&A
I am trying to build a word cloud from PDFs. Is there a demo for this? Do I need to convert the PDFs to text first? I saw a video (Text Processing on Rapid Miner, YouTube) in which the presenter suggested converting them to .txt files and putting them in a separate folder.
I tried this with a process (see attached XML), but the output is gibberish (see attached image). Any suggestions? Thank you!

Best Answer

  • MartinLiebig
    Altair Employee
    Answer ✓
    Hi,
    Did you use read_document to read the PDFs? It has a setting for reading PDFs.

    Best,
    Martin
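
For readers wondering why a text reader might produce gibberish here: a PDF is a binary container rather than plain text, so a process that treats it as a .txt file may end up tokenizing raw PDF syntax instead of the document's words. The following is a minimal Python sketch of that idea, written outside RapidMiner and not taken from the attached process. The function names `looks_like_pdf` and `word_frequencies` are hypothetical, and the sketch assumes the text has already been extracted properly before counting.

```python
import re
from collections import Counter

# Every PDF file carries this marker near its start.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(raw: bytes) -> bool:
    """Return True if the PDF header marker appears in the first 1024 bytes."""
    return PDF_MAGIC in raw[:1024]


def word_frequencies(text: str, min_length: int = 3) -> Counter:
    """Count lowercase alphabetic words of at least min_length characters.

    These counts are the kind of input a word cloud is typically drawn from.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if len(w) >= min_length)


if __name__ == "__main__":
    # A file whose bytes start like this needs a PDF-aware reader,
    # not a plain-text one.
    print(looks_like_pdf(b"%PDF-1.7\n..."))
    print(word_frequencies("Text mining of text. Mining PDFs!").most_common(2))
```

If `looks_like_pdf` returns True for a file, it should go through a PDF-aware reader (in this thread, the PDF setting of read_document) before any tokenizing or counting.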

Answers

  • wclaster
    wclaster New Altair Community Member
    Thank you, Martin. That did it.
