
Python and libpuzzle

As much as I've dogged on Python in the past (significant whitespace, really?), I've got to admit that it's got some cool features too.

For example, I'm playing with libpuzzle (a library for visually comparing images). It ships with a command-line utility plus C and PHP APIs. Unfortunately, the CLI utility doesn't allow one to dump the raw comparison vector, and it's a PITA to write C just to play with a library.

Python's native "ctypes" to the rescue!

from ctypes import (CDLL, POINTER, Structure, byref, c_byte, c_double,
                    c_int, c_size_t, c_ubyte, c_uint, c_ulong)

# Struct layouts mirror the definitions in puzzle.h.
class PuzzleCvec(Structure):
    # vec is a signed char buffer that can contain zero bytes, so it is
    # declared as a pointer rather than c_char_p (which stops at the first NUL).
    _fields_ = [("sizeof_vec", c_size_t),
                ("vec", POINTER(c_byte))]

class PuzzleCompressedCvec(Structure):
    _fields_ = [("sizeof_compressed_vec", c_size_t),
                ("vec", POINTER(c_ubyte))]

class PuzzleContext(Structure):
    _fields_ = [("puzzle_max_width", c_uint),
                ("puzzle_max_height", c_uint),
                ("puzzle_lambdas", c_uint),
                ("puzzle_p_ratio", c_double),
                ("puzzle_noise_cutoff", c_double),
                ("puzzle_contrast_barrier_for_cropping", c_double),
                ("puzzle_max_cropping_ratio", c_double),
                ("puzzle_enable_autocrop", c_int),
                ("magic", c_ulong)]

libpuzzle = CDLL('libpuzzle.so')  # e.g. 'libpuzzle.dylib' on macOS
context = PuzzleContext()
cvec = PuzzleCvec()
ccvec = PuzzleCompressedCvec()

libpuzzle.puzzle_init_context(byref(context))
libpuzzle.puzzle_init_cvec(byref(context), byref(cvec))
libpuzzle.puzzle_init_compressed_cvec(byref(context), byref(ccvec))

# Under Python 3, ctypes passes bytes (not str) as a C char*
fileName = b'/home/corey/page1.png'
retval = libpuzzle.puzzle_fill_cvec_from_file(byref(context), byref(cvec), fileName)
print("fill_cvec returned %d" % retval)
print("Vector length: %d" % cvec.sizeof_vec)

retval = libpuzzle.puzzle_compress_cvec(byref(context), byref(ccvec), byref(cvec))
print("compress_cvec returned %d" % retval)
print("Compressed vector length: %d" % ccvec.sizeof_compressed_vec)

# Slice by the reported length to copy the whole buffer, zero bytes included
print("Vector: %r" % bytes(ccvec.vec[:ccvec.sizeof_compressed_vec]))

# Clean up
libpuzzle.puzzle_free_compressed_cvec(byref(context), byref(ccvec))
libpuzzle.puzzle_free_cvec(byref(context), byref(cvec))
libpuzzle.puzzle_free_context(byref(context))
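
One thing to watch when calling more of the library this way: ctypes assumes every foreign function returns a C int unless told otherwise. As far as I can tell from puzzle.h, puzzle_vector_normalized_distance returns a double, so the return type probably needs declaring before the result means anything. Here's a minimal sketch; declare_distance is a name I made up for illustration, and it assumes lib is the CDLL handle for libpuzzle.

```python
from ctypes import c_double

def declare_distance(lib):
    """Mark puzzle_vector_normalized_distance as returning a C double.

    `lib` is assumed to be the CDLL handle for libpuzzle. Without the
    restype assignment, ctypes interprets the return value as a C int.
    """
    fn = lib.puzzle_vector_normalized_distance
    fn.restype = c_double
    return fn

# Hypothetical usage, assuming `context`, `cvec_a` and `cvec_b` were
# initialised and filled the same way as `context` and `cvec` in the script:
#   distance = declare_distance(libpuzzle)
#   d = distance(byref(context), byref(cvec_a), byref(cvec_b), 1)
```

The last argument is the fix_for_texts flag from the C API; passing 1 here is just my choice for the sketch.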

Now I can play around with the library from Python, for example dumping the vector for each image into SQLite.
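
Here's a rough sketch of what the SQLite side could look like, using Python's built-in sqlite3 module. The table name, column names, and helper functions are all made up for illustration, and it assumes the vector has already been copied out of the ctypes struct into a Python bytes object.

```python
import sqlite3

def open_db(filename=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(filename)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS signatures ("
        "path TEXT PRIMARY KEY, vec BLOB NOT NULL)"
    )
    return conn

def store_vector(conn, path, vector):
    """Insert or replace the stored vector for one image path."""
    conn.execute(
        "INSERT OR REPLACE INTO signatures (path, vec) VALUES (?, ?)",
        (path, bytes(vector)),
    )
    conn.commit()

def load_vector(conn, path):
    """Return the stored vector as bytes, or None if the path is unknown."""
    row = conn.execute(
        "SELECT vec FROM signatures WHERE path = ?", (path,)
    ).fetchone()
    return None if row is None else bytes(row[0])
```

Storing the vector as a BLOB keeps zero bytes intact, which a TEXT column might not.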

Comments

Boris said…
hey man, did this work for you? When I try to do puzzle_vector_normalized_distance I get whole number values. Also, I did not find a .so file, I am using a dylib file and your demo works, but I'm not sure if the values I'm getting are correct.

Any help appreciated!
