PDF encoding issue

User: "limegreenman900"
New Altair Community Member
Updated by Jocelyn

Hi everyone,

I was trying to do the simplest thing one can do: reading a PDF file into RM. I have done this several times before, but now I am stuck with what I suspect is an encoding issue.

After the "Read Document" operator (with "extract text only" and "use file extension as type" both checked) I inserted a breakpoint, before doing any preprocessing of the text. However, I don't get any text out of my PDF. What I get instead is something like:


¨ÉøC&13#s$ó/Y¢¬–¬³ÙÜìâì=ÙOsbsúrnåºç&#26;sOæ1óŠòvç=Ë�ËïÏŸ\ä»hÙ¢ó&#5;Ö&#5;ê‚#…¤Â¼Â�…³‹ã&#23;oZ<]&#20;TÔUt}‰`IÃ’sK­—V-ý¤˜Y,+>TB(É/ÙSòƒ,]6*›-•–¾W:#—È7Ë&#31;*¢&#21;&#3;Š&#7;Ê&#8;e¿ò^YDY&#127;Ù}U„j£êAyTù`ù#µD=¬þ¶"©b{ųÊôÊ&#15;+&#127;¬Ê¯: !kJ4Gµ&#28;m¥ötµ}uCõ%�—®K7Y&#19;V³©fFŸ¢ßY&#11;Õ.©=bàá?S&#23;ŒîƕƩºÈº‘ºçõyõ‡&#26;Ø
Ú†&#11;�ž�k&#26;ï5%4ý¦&#25;m–7Ÿlqlio™Z&#22;³lG+ÔZÚz²Í¹­³mzyâò]íÔöÊö?uøuôw|¿"&#127;űN»Îå�wW&®ÜÛe֥ﺱ*|ÕöÕèjõê‰5&#1;k¶¬yÝ­èþ¢Ç¯g°ç‡^yï&#23;kEk‡Öþ¸®lÝD_pß¶õÄõÚõ×7DmØÕÏîoê¿»1mãá&#1;l {àûMÅ›Î
&#6;&#14;nßLÝlÜ<9”úO

Does anyone have an idea where the problem is? My guess is that it is an encoding issue.

If I open the PDF file and copy and paste the text into a Word file, there is no problem and the text is displayed correctly.
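
To narrow this down outside RM, a rough check like the following Python sketch might help. It is only a heuristic under my own assumptions: the helper names looks_like_pdf and suspicious_ratio are made up for illustration, the 1024-byte search window is an arbitrary choice, and I am only guessing that the file's raw bytes may be getting treated as text instead of being parsed as a PDF.

```python
import unicodedata


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the PDF signature appears near the start of the bytes.

    Assumption: scanning the first 1024 bytes is enough; a well-formed
    PDF normally starts with the marker "%PDF-".
    """
    return b"%PDF-" in data[:1024]


def suspicious_ratio(text: str) -> float:
    """Share of characters that are control codes or U+FFFD.

    A high value hints that binary data was decoded as text,
    rather than real text being extracted.
    """
    if not text:
        return 0.0
    bad = 0
    for ch in text:
        is_control = unicodedata.category(ch) == "Cc" and ch not in "\n\r\t"
        if is_control or ch == "\ufffd":
            bad += 1
    return bad / len(text)


if __name__ == "__main__":
    # Hypothetical inputs standing in for the real file and the RM output.
    print(looks_like_pdf(b"%PDF-1.7\n..."))
    print(suspicious_ratio("a\x05\x17\ufffd"))
```

If the signature is present but the extracted text still scores high, the problem is more likely in how the content is being extracted than in the file itself.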
