Processing PDF documents for text mining with the Process Documents from Files operator

New Altair Community Member

Nov 30, 2017

Updated Nov 5, 2024 by Jocelyn

I tried to process large PDF documents with the Process Documents from Files operator. When I run the process, RapidMiner fails at that operator with the following error message: "Process failed. javax.crypto.IllegalBlockSizeException: Input length must be multiple of 16 when decrypting with padded cipher."

If I understood Marco Böck's post in this thread correctly, the operator should be able to process PDF documents by now.

Is there a way to process PDF documents without any workarounds? Any hints are highly appreciated. Thank you!