Hi Al,

I assume these are image-based PDF files, so you just get a blank window when you open them in Monarch. Is that correct?

If so, you will need to run them through an OCR tool of your choice. These tools will attempt to convert the image into searchable text that should then be available within Monarch.

The success will depend largely on the quality of the image in the PDF. If they are sharp, straight and of high enough resolution, you should get excellent results with the machine-printed data. But if they are fuzzy or scanned at a slight angle, the success rate diminishes rapidly.

Handwritten amounts and dates will require ICR instead of OCR. Tools that offer this tend to be more expensive and will almost certainly have much lower accuracy. Even then, they require block-written characters. I don't believe there are any ICR tools that will reliably convert handwritten cursive text.

Please be aware that the accuracy of the extracted characters is beyond our control. Whatever you see in the report window of Monarch will be what the OCR/ICR tool has generated. There is no interpretation going on within Monarch.

If you are looking to use the final solution within Automator, then you need to choose a tool that has an API, such as ABBYY FineReader. Simple command-line tools may not work in Automator if they call a GUI, even if that GUI does not require any interaction.

Regards,
Steve.

------------------------------
Steve Caiels
Professional Services
Altair
------------------------------

-------------------------------------------
Original Message:
Sent: 04-08-2020 02:47 PM
From: Al Rice
Subject: Extracting PDF Images

Hello All:

I am working with a large volume of Bank Statements containing check images which I need to extract and categorize for subsequent retrieval.

Has anyone had success extracting images from PDF files in the manner I have described?

As this will be a recurring requirement, I certainly would like to automate the process using Excel VBA, vbscript or java/javascript.

Any suggestions would be appreciated.

Thanks...

------------------------------
Al Rice
------------------------------
If anyone still needs any help, I can put in my two pence here. It's definitely possible to extract images from PDF files using automation tools like Excel VBA or JavaScript. One option you might consider is using an OCR engine like Smart Engines to detect and extract the check images from the PDFs. From there, you could use VBA or JavaScript to sort and categorize the images based on your needs.

Another approach could be to use Python with the PyPDF2 library to extract the images from the PDFs. This would require some programming knowledge, but there are plenty of resources available online to help you get started.
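
For the Python route, here is a minimal sketch of what the extraction step might look like. It assumes the `pypdf` package (the maintained successor to PyPDF2) and Pillow are installed, and it relies on the `page.images` accessor, which only exists in recent releases. The function names, the output folder and the file naming scheme are my own illustrative choices, not anything from Monarch or the posts above. Bear in mind that if each statement page was scanned as a single image, you would get whole pages back rather than individual checks, so some cropping would still be needed afterwards.

```python
from pathlib import Path


def image_output_name(pdf_stem: str, page_number: int, index: int, original_name: str) -> str:
    """Build a sortable file name, e.g. 'stmt_p003_i02.jpg' (hypothetical naming scheme)."""
    # Reuse the extension pypdf reports for the image; fall back to .bin if there is none.
    suffix = Path(original_name).suffix or ".bin"
    return f"{pdf_stem}_p{page_number:03d}_i{index:02d}{suffix}"


def extract_images(pdf_path: str, out_dir: str) -> list[Path]:
    """Write every embedded image found in pdf_path to out_dir and return the paths."""
    # Imported here so the naming helper above can be used without pypdf installed.
    from pypdf import PdfReader  # third-party: pip install pypdf pillow

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem
    written = []

    reader = PdfReader(pdf_path)
    for page_number, page in enumerate(reader.pages, start=1):
        # page.images yields objects exposing .name and the raw bytes in .data
        for index, image in enumerate(page.images, start=1):
            target = out / image_output_name(stem, page_number, index, image.name)
            target.write_bytes(image.data)
            written.append(target)
    return written
```

You would call it as `extract_images("statement.pdf", "checks")`, where both names are placeholders. Putting the statement name and page number into each file name gives you something to sort and categorize on later, whether you do that in Python or hand the folder over to VBA.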