I guess Python isn't so bad after all...

Not wanting to learn OpenCV while also fighting an edit-compile-execute cycle, I decided to use my OpenCV project as an excuse to play around with Python.

I'm still very much a beginner, but I'm starting to understand why it gets the use it does.

Anyhow, it only took a couple of days to integrate Tesseract OCR, PIL, and OpenCV such that I could open multi-frame TIFF images, perform some basic feature detection, and then use the output of feature detection to focus on a specific region for OCR.
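For anyone curious what that pipeline looks like, here is a rough sketch rather than my actual code. The function names (`iter_frames`, `largest_blob_box`, `ocr_region`) are made up for illustration, and the "feature detection" step is just an assumed stand-in: an Otsu threshold followed by the bounding box of the largest contour. It assumes Pillow, NumPy, OpenCV (`cv2`), and the `pytesseract` wrapper are installed.

```python
import numpy as np
from PIL import Image, ImageSequence


def iter_frames(path):
    """Yield each page of a multi-frame TIFF as a grayscale uint8 NumPy array."""
    with Image.open(path) as tiff:
        for page in ImageSequence.Iterator(tiff):
            yield np.array(page.convert("L"))


def largest_blob_box(gray):
    """Return (x, y, w, h) of the largest dark blob, or None if there is none.

    Assumed stand-in for real feature detection, not the method from the post.
    """
    import cv2  # imported lazily so the TIFF helper works without OpenCV

    _, binary = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
    )
    # findContours returns 2 values in OpenCV 4 and 3 in OpenCV 3;
    # the contour list is second-to-last in both.
    contours = cv2.findContours(
        binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )[-2]
    if not contours:
        return None
    return cv2.boundingRect(max(contours, key=cv2.contourArea))


def ocr_region(gray, box):
    """Run Tesseract on just the (x, y, w, h) region of a grayscale frame."""
    import pytesseract  # needs the tesseract binary on the PATH

    x, y, w, h = box
    return pytesseract.image_to_string(Image.fromarray(gray[y:y + h, x:x + w]))
```

Usage would be something like looping over `iter_frames("scan.tif")` (a hypothetical file name), calling `largest_blob_box` on each frame, and passing any box it finds to `ocr_region`.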

I will admit to having a few false starts. The first was following an older C++ tutorial that relied on some deprecated OpenCV features and ignored others. For example, the tutorial used Hough line detection to find squares on a printed page. Getting to that point took thresholding, dilating, eroding, inversion, flood filling, and so on. Even then I wasn't getting the correct results.

When I started over, I was using the old SWIG-based interface instead of the newer stuff in the cv2 namespace. Having used the newer stuff, I've got to say it's much easier, especially when you need to move back and forth between OpenCV and PIL, since both libraries understand NumPy arrays.
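To make the NumPy point concrete, here is a minimal sketch of the round trip. The helper names `pil_to_cv` and `cv_to_pil` are hypothetical. The one thing to watch is channel order: PIL uses RGB while OpenCV's cv2 functions assume BGR, so the sketch reverses the channel axis with plain NumPy slicing. `cv2.cvtColor` with `cv2.COLOR_RGB2BGR` should give the same result.

```python
import numpy as np
from PIL import Image


def pil_to_cv(image):
    """Convert a PIL image to a BGR uint8 array, the layout cv2 expects."""
    rgb = np.asarray(image.convert("RGB"))
    # Reverse the last axis (RGB -> BGR) and copy into contiguous memory.
    return np.ascontiguousarray(rgb[:, :, ::-1])


def cv_to_pil(bgr):
    """Convert a BGR uint8 array (as produced by cv2) back to a PIL image."""
    return Image.fromarray(np.ascontiguousarray(bgr[:, :, ::-1]))
```

The array returned by `pil_to_cv` can be handed straight to cv2 functions, and anything cv2 returns as a three-channel uint8 array can go back through `cv_to_pil`.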
