PDF encoding issue

New Altair Community Member

Jul 26, 2016

Updated Nov 5, 2024 by Jocelyn

Hi everyone,

I was trying to do the simplest thing possible: reading a PDF file into RM. I have done this several times before, but now I am stuck with what I suspect is an encoding issue.

After the "Read Document" operator (with "extract text only" and "use file extension as type" both checked), I inserted a breakpoint before doing any preprocessing of the text. However, I don't get any readable text out of my PDF. What I get instead looks like this:

¨ÉøC&13#s$ó/Y¢¬–¬³ÙÜìâì=ÙOsbsúrnåºçsOæ1óŠòvç=Ë�ËïÏŸ\ä»hÙ¢óÖê‚#…¤Â¼Â�…³‹ãoZ<]TÔUt}‰`IÃ’sK—V-ý¤˜Y,+>TB(É/ÙSòƒ,]6*›-•–¾W:#—È7Ë*¢ŠÊe¿ò^YDYÙ}U„j£êAyTù`ù#µD=¬þ¶"©b{Å³ÊôÊ+¬Ê¯: !kJ4Gµm¥ötµ}uCõ%�—®K7YV³©fFŸ¢ßYÕ.©=bàá?SŒîÆ•Æ©ºÈº‘ºçõyõ‡Ø
Ú†�ž�kï5%4ý¦m–7Ÿlqlio™Z³lG+ÔZÚz²Í¹³mzyâò]íÔöÊö?uøuôw|¿"Å±N»Îå�wW&®ÜÛeÖ¥ïº±*|ÕöÕèjõê‰5k¶¬yÝèþ¢Ç¯g°ç‡^yïkEk‡Öþ¸®lÝD_pß¶õÄõÚõ×7DmØÕÏîoê¿»1mãál {àûMÅ›Î
nßLÝlÜ<9”úO

Does anyone have an idea where the problem is? My guess is that it is an encoding issue.

If I open the PDF file and copy and paste the text into a Word file, there is no problem and the text is displayed correctly.
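
To make "garbage instead of text" a bit more concrete, here is a minimal sketch in plain Python (outside RM) of how one might flag output like the sample above. It assumes the expected text is mostly English, and the function names and the 0.8 threshold are hypothetical choices for illustration, not anything provided by RM or the "Read Document" operator.

```python
def ascii_text_ratio(text: str) -> float:
    """Return the fraction of characters that are ASCII letters, digits, or whitespace."""
    if not text:
        return 0.0
    ok = sum(1 for ch in text if ch.isascii() and (ch.isalnum() or ch.isspace()))
    return ok / len(text)


def looks_like_garbage(text: str, threshold: float = 0.8) -> bool:
    """Heuristic only: assumes English-like text; the threshold is an arbitrary assumption."""
    return ascii_text_ratio(text) < threshold


if __name__ == "__main__":
    readable = "I was trying to read a PDF file into RM."
    sample = "¨ÉøC&13#s$ó/Y¢¬–¬³ÙÜìâì=ÙOsbsúrnåºçsOæ1óŠòvç=Ë"
    print(looks_like_garbage(readable))  # readable English sentence
    print(looks_like_garbage(sample))    # excerpt of the output shown above
```

A check like this only detects the symptom. It does not say whether the cause is an encoding mismatch or something else in how the PDF is being read.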
