Hi everyone,
I was trying to do the simplest thing one can do: reading a PDF file into RM. I have done this several times before, but now I am stuck with what I suspect is an encoding issue.
After the "Read Document" operator (with "extract text only" and "use file extension as type" both ticked), I inserted a breakpoint before doing any preprocessing of the text. However, I don't get any text out of my PDF. What I get instead is something like:
¨ÉøC&13#s$ó/Y¢¬–¬³ÙÜìâì=ÙOsbsúrnåºçsOæ1óŠòvç=Ë�ËïÏŸ\ä»hÙ¢óÖê‚#…¤Â¼Â�…³‹ãoZ<]TÔUt}‰`IÃ’sK—V-ý¤˜Y,+>TB(É/ÙSòƒ,]6*›-•–¾W:#—È7Ë*¢ŠÊe¿ò^YDYÙ}U„j£êAyTù`ù#µD=¬þ¶"©b{ųÊôÊ+¬Ê¯: !kJ4Gµm¥ötµ}uCõ%�—®K7YV³©fFŸ¢ßYÕ.©=bàá?SŒîƕƩºÈº‘ºçõyõ‡Ø
Ú†�ž�kï5%4ý¦m–7Ÿlqlio™Z³lG+ÔZÚz²Í¹³mzyâò]íÔöÊö?uøuôw|¿"űN»Îå�wW&®ÜÛe֥ﺱ*|ÕöÕèjõê‰5k¶¬yÝèþ¢Ç¯g°ç‡^yïkEk‡Öþ¸®lÝD_pß¶õÄõÚõ×7DmØÕÏîoê¿»1mãál {àûMÅ›Î
nßLÝlÜ<9”úO
Does anyone have an idea where the problem is? My guess is that it is an encoding issue.
If I open the PDF file and copy and paste the text into a Word file, there is no problem and the text is displayed correctly.
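
In case it helps with the diagnosis: the output above looks to me less like a wrong character set and more like compressed data being displayed as if it were text (PDF content streams are commonly zlib/Flate-compressed). This is only a guess on my side. The small Python sketch below is not what RM does internally; it just illustrates the effect. The helper `printable_ratio` and the sample sentence are made up for this example.

```python
import zlib


def printable_ratio(text: str) -> float:
    """Return the fraction of characters that are ASCII letters, digits, or whitespace."""
    if not text:
        return 0.0
    readable = sum(ch.isascii() and (ch.isalnum() or ch.isspace()) for ch in text)
    return readable / len(text)


# Hypothetical stand-in for the text that should come out of the PDF.
sample = "This is the text that should come out of the PDF. " * 20

# Compress it the way a Flate-encoded stream would be stored ...
compressed = zlib.compress(sample.encode("utf-8"))

# ... and then (wrongly) interpret the raw bytes as text without decompressing.
as_text = compressed.decode("latin-1")

print("readable share, real text:       ", round(printable_ratio(sample), 2))
print("readable share, raw stream bytes:", round(printable_ratio(as_text), 2))
```

If the readable share of the text at the breakpoint is similarly low, the file is probably being read as raw bytes instead of being parsed as a PDF, which would point at the file type settings rather than at the encoding parameter.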