
Posts

Showing posts from September, 2011

I guess Python isn't so bad after all...

Not wanting to learn OpenCV while also fighting an edit-compile-execute cycle, I decided to use my OpenCV project as an excuse to play around with Python. I'm still very much a beginner, but I'm beginning to understand why it's as widely used as it is. Anyhow, it only took a couple of days to integrate Tesseract OCR, PIL, and OpenCV such that I could open multi-frame TIFF images, perform some basic feature detection, and then use the output of feature detection to focus on a specific region for OCR. I will admit to having a few false starts. The first was that I followed an older (C++) tutorial that used some deprecated features of OpenCV and ignored some others. For example, the tutorial used Hough line detection to find squares on a printed page. Getting to that point took thresholding, dilating, eroding, inversion, flood filling, and so on. Even then I wasn't getting the correct results. When I started over, I was using

Android on hold

Since the Android git repository is offline, I'm having to find something else to occupy my time. I've always wanted to learn more about OpenCV, so I'm working on a new project using that. At work we have piles and piles of paper forms that have been filled out by hand. I've already done some trial-and-error work and determined that both OpenCV and PIL can clean up the scanned copies enough that I can run Tesseract OCR on the printed portion of the forms. This doesn't capture any of the dynamic (handwritten) data, but it does allow me to identify what type of form it is. Running OCR on an entire document takes some time, so it would be better to grab smaller regions of interest and OCR only those. Some of the interesting challenges are that some forms are portrait while others are landscape. I'd also like to handle the case where a form was fed into the scanner upside down.
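To make the region-of-interest idea concrete, here is a minimal sketch of just the geometry, in plain Python. It is not the code from this project. The function names, the fractional-coordinate convention, and the page size in the usage lines are all assumptions for illustration. The idea is to describe a region as fractions of the page so it works at any scan resolution, and to mirror that region for a page that went through the scanner upside down.

```python
from typing import Tuple

# (left, upper, right, lower) in pixels, the same ordering PIL uses for crop boxes.
Box = Tuple[int, int, int, int]
# The same four edges expressed as fractions (0.0 to 1.0) of the page size.
FracBox = Tuple[float, float, float, float]


def page_orientation(width: int, height: int) -> str:
    """Classify a scanned page by its pixel dimensions."""
    return "landscape" if width > height else "portrait"


def roi_box(width: int, height: int, frac: FracBox) -> Box:
    """Convert a fractional region into a pixel box for this page size."""
    left, upper, right, lower = frac
    return (
        round(left * width),
        round(upper * height),
        round(right * width),
        round(lower * height),
    )


def rotate_box_180(box: Box, width: int, height: int) -> Box:
    """Return where `box` ends up if the page is rotated 180 degrees."""
    left, upper, right, lower = box
    return (width - right, height - lower, width - left, height - upper)


# Hypothetical example: a letter-size page scanned at 300 dpi,
# with the form title assumed to sit in the top 10% of the page.
page_w, page_h = 2550, 3300
title = roi_box(page_w, page_h, (0.0, 0.0, 1.0, 0.1))
title_if_upside_down = rotate_box_180(title, page_w, page_h)
```

Either box can be handed to PIL's `Image.crop()`, and the crop passed to Tesseract instead of the whole page. A crop taken from the mirrored box would still need `rotate(180)` before OCR. Deciding whether a page is actually upside down is a separate problem that this sketch does not solve.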