How to load an MS Word (.doc) file into RapidMiner without corruption?

kevinace New Altair Community Member
edited November 2024 in Community Q&A
I created a Word file called 'hello world.doc' that contains only the two words 'hello world'.
However, when I load it into RapidMiner, the file content comes out corrupted. I tested with both DOC and DOCX, with and without an embedded Excel table.
Please help.



Best Answers

  • MartinLiebig
    Altair Employee
    Answer ✓
    Hi,
    there is an operator called Read Office in the Operator Toolbox extension. This should do the trick.

    Best,
    Martin
  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    Answer ✓
    Hi @kevinace,

    You have to use the Read Office File operator from the Operator Toolbox extension.
    This extension is available in the marketplace for free.

    Regards,


    Lionel
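
Some background on the symptom: a .docx file is a ZIP archive of XML parts and a legacy .doc file is a binary container, so reading either one as plain text produces garbage characters. The following is a minimal Python sketch of that idea, using only the standard library and assuming a .docx input. The helper name `extract_docx_text` is hypothetical, the sketch does not handle legacy .doc files, and it is not a description of how the Operator Toolbox operator works internally.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(source):
    """Return the paragraph text of a .docx, one paragraph per line.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper for illustration; .docx only, not legacy .doc.)
    """
    # A .docx is a ZIP archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # The visible text of a paragraph is split across <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return "\n".join(paragraphs)
```

Calling `extract_docx_text("hello world.docx")` on a real file should return the document text instead of the raw archive bytes. Inside RapidMiner, the operator recommended in the answers above remains the simpler route.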
