
Python and libpuzzle

As much as I've dogged on Python in the past (significant whitespace, really?), I've got to admit that it's
got some cool features too.

For example, I'm playing with libpuzzle (a library for visually comparing images). It has a command-line utility plus C and PHP APIs. Unfortunately, the CLI utility can't dump the raw comparison vector, and it's a PITA to write C just to play with a library.

Python's native "ctypes" to the rescue!

from ctypes import (CDLL, POINTER, Structure, byref, string_at, c_byte,
                    c_double, c_int, c_size_t, c_ubyte, c_uint, c_ulong)

# ctypes mirrors of the structs declared in puzzle.h
class PuzzleCvec(Structure):
    _fields_ = [("sizeof_vec", c_size_t),
                ("vec", POINTER(c_byte))]    # signed char *

class PuzzleCompressedCvec(Structure):
    _fields_ = [("sizeof_compressed_vec", c_size_t),
                ("vec", POINTER(c_ubyte))]   # unsigned char *

class PuzzleContext(Structure):
    _fields_ = [("puzzle_max_width", c_uint),
                ("puzzle_max_height", c_uint),
                ("puzzle_lambdas", c_uint),
                ("puzzle_p_ratio", c_double),
                ("puzzle_noise_cutoff", c_double),
                ("puzzle_contrast_barrier_for_cropping", c_double),
                ("puzzle_max_cropping_ratio", c_double),
                ("puzzle_enable_autocrop", c_int),
                ("magic", c_ulong)]

libpuzzle = CDLL('libpuzzle.so')
context = PuzzleContext()
cvec = PuzzleCvec()
ccvec = PuzzleCompressedCvec()

libpuzzle.puzzle_init_context(byref(context))
libpuzzle.puzzle_init_cvec(byref(context), byref(cvec))
libpuzzle.puzzle_init_compressed_cvec(byref(context), byref(ccvec))

# ctypes passes a byte string as a char *, so keep the path as bytes
fileName = b'/home/corey/page1.png'
retval = libpuzzle.puzzle_fill_cvec_from_file(byref(context), byref(cvec), fileName)
print("fill_cvec returned %d" % (retval,))
print("Vector length: %d" % (cvec.sizeof_vec,))

retval = libpuzzle.puzzle_compress_cvec(byref(context), byref(ccvec), byref(cvec))
print("compress_cvec returned %d" % (retval,))
print("Compressed vector length: %d" % (ccvec.sizeof_compressed_vec,))

# The vector is raw binary and may contain NUL bytes, so read it by length
print("Vector: %r" % (string_at(ccvec.vec, ccvec.sizeof_compressed_vec),))

# Clean up
libpuzzle.puzzle_free_compressed_cvec(byref(context), byref(ccvec))
libpuzzle.puzzle_free_cvec(byref(context), byref(cvec))
libpuzzle.puzzle_free_context(byref(context))
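
A note on the vec fields: they point at raw binary data, and ctypes treats c_char_p as a NUL-terminated string, so reading such a field stops at the first zero byte. Here's a minimal sketch of the difference. It doesn't touch libpuzzle at all; the two structs and the five-byte buffer are made up for illustration.

```python
import ctypes

# Hypothetical structs for illustration only (not libpuzzle's own).
class WithCharP(ctypes.Structure):
    _fields_ = [("size", ctypes.c_size_t),
                ("vec", ctypes.c_char_p)]

class WithBytePtr(ctypes.Structure):
    _fields_ = [("size", ctypes.c_size_t),
                ("vec", ctypes.POINTER(ctypes.c_ubyte))]

# Made-up binary data with a zero byte in the middle.
raw = b"\x01\x02\x00\x03\x04"
buf = ctypes.create_string_buffer(raw, len(raw))

a = WithCharP(len(raw), ctypes.cast(buf, ctypes.c_char_p))
b = WithBytePtr(len(raw), ctypes.cast(buf, ctypes.POINTER(ctypes.c_ubyte)))

# c_char_p is read as a C string: everything after the first NUL is lost.
truncated = a.vec

# A plain byte pointer plus an explicit length returns every byte.
full = ctypes.string_at(b.vec, b.size)

print(repr(truncated))
print(repr(full))
```

So for binary vectors, a byte pointer read with string_at and the size field is the safer choice.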

Now I can play around with the library from Python, for example dumping the vector for each image into SQLite.
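
Here's a rough sketch of what the SQLite side could look like, using Python's built-in sqlite3 module. The table name, column names, and the store_vector helper are all hypothetical, and the vector bytes are a made-up placeholder rather than real libpuzzle output.

```python
import sqlite3

def store_vector(conn, path, vector):
    """Save one image's vector as a BLOB, keyed by file path."""
    conn.execute(
        "INSERT OR REPLACE INTO image_vectors (path, vec) VALUES (?, ?)",
        (path, sqlite3.Binary(vector)),
    )

# In-memory database for the sketch; pass a file name to keep the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS image_vectors ("
    "path TEXT PRIMARY KEY, vec BLOB NOT NULL)"
)

# Placeholder bytes standing in for a real compressed vector.
store_vector(conn, "/home/corey/page1.png", b"\x01\x02\x00\x03\x04")
conn.commit()

row = conn.execute(
    "SELECT vec FROM image_vectors WHERE path = ?",
    ("/home/corey/page1.png",),
).fetchone()
stored = bytes(row[0])
print(repr(stored))
```

Storing the vector as a BLOB keeps any zero bytes intact, which a TEXT column may not.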

Comments

Boris said…
hey man, did this work for you? When I try to do puzzle_vector_normalized_distance I get whole number values. Also, I did not find a .so file, I am using a dylib file and your demo works, but I'm not sure if the values I'm getting are correct.

Any help appreciated!
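
Whole-number results like these are usually what ctypes gives when a function's restype is left unset: it assumes every C function returns an int. Here's a small sketch of the effect. Since libpuzzle may not be installed where this runs, it uses PyFloat_AsDouble from CPython's own C API as a stand-in for any C function that returns a double.

```python
import ctypes

# Stand-in for a C function returning double. CPython declares it as:
#     double PyFloat_AsDouble(PyObject *)
# Indexing the library gives a fresh function object each time, so the
# two handles below can carry different settings.
undeclared = ctypes.pythonapi["PyFloat_AsDouble"]
declared = ctypes.pythonapi["PyFloat_AsDouble"]

# Tell ctypes the real signature on one handle only.
declared.argtypes = [ctypes.py_object]
declared.restype = ctypes.c_double

# With no restype, ctypes assumes a C int return value, so the result
# is a meaningless whole number.
wrong = undeclared(ctypes.py_object(0.25))
right = declared(0.25)

print(type(wrong).__name__, type(right).__name__)
print(right)
```

For libpuzzle that would presumably mean setting libpuzzle.puzzle_vector_normalized_distance.restype = c_double before calling it, assuming the function returns a double as the libpuzzle header declares.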
