Saturday, 4 October 2014

Image annotation with pyvips8

There was a blog post a few years ago about using Python and libvips to do image annotation. The solution there still works, but it is a bit ugly. We're just finishing up the new vips8 Python interface and it's a lot nicer and faster, so I thought it might be interesting to rewrite that example and compare the code.

pyvips8 will be part of the upcoming libvips-7.42, which is not quite done yet.

The task is to add a 150-pixel high red footer to the image, with text in the corners.

In ImageMagick you can do this with:
convert $1 \
    -background Red -density 300 \
    -font /usr/share/fonts/truetype/msttcorefonts/Arial.ttf \
    -pointsize 12 -gravity south -splice 0x150 \
    -gravity southwest -annotate +50+50 'left corner' \
    -gravity southeast -annotate +50+50 'right corner' \
    +repage \
    $2

First, here's the old vips7 version:
#!/usr/bin/python

import sys
from vipsCC import *

im = VImage.VImage(sys.argv[1])

black = VImage.VImage.black(im.Xsize(), 150, 3)
red = black.lin([1, 1, 1], [255, 0, 0]).clip2fmt(VImage.VImage.FMTUCHAR)

txt = VImage.VImage.text("left corner", "sans 12", -1, 0, 300)
txt = txt.embed(0, 50, 50, im.Xsize(), 150)

txt2 = VImage.VImage.text("right corner", "sans 12", -1, 0, 300)
txt2 = txt2.embed(0, im.Xsize() - txt2.Xsize() - 50, 50, im.Xsize(), 150)

txt = txt.orimage(txt2)
txt = txt.blend(black, red)

im = im.insert(txt, 0, im.Ysize())

im.write(sys.argv[2])
The steps are roughly: make black and red images the size of the strip we will add to the bottom. Render the text to a pair of images and expand them to the size of the bottom strip, aligning them appropriately. Combine the two text images to a single mask with 0 for black, 255 for white, and intermediate values for antialiasing. Use the mask to blend between the red and black images we made earlier. Paste that footer onto the image.
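
The blend step is the least obvious one, so here is a minimal pure-Python sketch of the per-pixel arithmetic. It assumes a plain linear mix in which a mask value of 255 selects the first pixel and 0 selects the second. The name `blend_pixel` is made up for illustration; vips does this work internally and its exact rounding may differ.

```python
def blend_pixel(mask, then_pixel, else_pixel):
    """Mix two RGB pixels using an 8-bit mask value (hypothetical helper).

    mask = 255 gives then_pixel, mask = 0 gives else_pixel, and values in
    between give a proportional mix, which is what antialiased edges need.
    """
    return tuple(
        # Adding 127 before the integer divide rounds to nearest (an assumption).
        (mask * t + (255 - mask) * e + 127) // 255
        for t, e in zip(then_pixel, else_pixel)
    )


black = (0, 0, 0)
red = (255, 0, 0)

inside_text = blend_pixel(255, black, red)  # fully inside a glyph
background = blend_pixel(0, black, red)     # no text at all
edge = blend_pixel(128, black, red)         # antialiased glyph edge
```

With these inputs the text comes out black, the background stays red, and edge pixels land on a dark red in between.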

This is much longer than the ImageMagick solution. On the plus side, since it's done in a proper programming language, it's a lot more flexible. You could easily write a couple of small helper functions to do the text layout for you, for example, rather than relying on fixed offsets.
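
As a rough idea of what such a helper might look like, here is a sketch that works out where a text block should sit inside the footer strip. The function name `corner_offset`, its parameters, and the choice to centre the text vertically are all assumptions made for illustration; none of this is part of vips.

```python
def corner_offset(text_width, text_height, strip_width, strip_height,
                  corner, margin=50):
    """Return the (x, y) position of a text block inside the footer strip.

    Hypothetical helper: corner is "left" or "right", and margin is the gap
    in pixels between the text and the side edge of the strip.
    """
    if corner == "left":
        x = margin
    elif corner == "right":
        x = strip_width - text_width - margin
    else:
        raise ValueError("corner must be 'left' or 'right'")
    # Centre vertically rather than hard-coding a y offset (a design choice).
    y = (strip_height - text_height) // 2
    return x, y


# Example: a 400 x 50 text image placed in a 10000 x 150 strip.
left_position = corner_offset(400, 50, 10000, 150, "left")
right_position = corner_offset(400, 50, 10000, 150, "right")
```

The two values returned could then be passed as the x and y arguments of `embed` in place of the fixed offsets used in the script above.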

Here's the new vips8 version:
#!/usr/bin/python

import sys
from gi.repository import Vips

im = Vips.Image.new_from_file(sys.argv[1], access = Vips.Access.SEQUENTIAL)

left_text = Vips.Image.text("left corner", dpi = 300)
left = left_text.embed(50, 50, im.width, 150)

right_text = Vips.Image.text("right corner", dpi = 300)
right = right_text.embed(im.width - right_text.width - 50, 50, im.width, 150)

footer = (left | right).ifthenelse(0, [255, 0, 0], blend = True)

im = im.insert(footer, 0, im.height, expand = True)  

im.write_to_file(sys.argv[2])
There are quite a few improvements. First, we have operator overloads and array constants, so we can now write:
left | right
instead of
txt.orimage(txt2)
And we can use [255, 0, 0] as one of the arguments to ifthenelse. This will be automatically turned into a constant image of the right size by pyvips8.
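
To show the general mechanism, here is a toy sketch of how a Python class can support `|` and promote a bare constant to an image of matching size. `ToyMask` is a made-up class holding a flat list of 8-bit values. It is not how pyvips8 is implemented, only an illustration of operator overloading.

```python
class ToyMask:
    """A one-band image stored as a flat list of values (illustration only)."""

    def __init__(self, pixels):
        self.pixels = list(pixels)

    def _expand(self, other):
        # A bare number is promoted to a constant image of matching size.
        if isinstance(other, ToyMask):
            if len(other.pixels) != len(self.pixels):
                raise ValueError("images must be the same size")
            return other.pixels
        return [other] * len(self.pixels)

    def __or__(self, other):
        # Python calls this for the expression: self | other
        return ToyMask(a | b for a, b in zip(self.pixels, self._expand(other)))


left = ToyMask([255, 0, 0, 0])
right = ToyMask([0, 0, 128, 255])

combined = left | right   # image | image
with_constant = left | 1  # image | constant
```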

Secondly, vips8 has optional named parameters. You don't need to give a value for every option; you can supply the minimum set and rely on vips to set the others to sensible values. This helps to make the API smaller as well: vips7 had separate operators for ifthenelse and blend, but with vips8 you just use ifthenelse with the blend option.

Finally, pyvips8 is a fully dynamic binding. The whole binding is written in Python, is only a few hundred lines of code, and generates itself at runtime by searching the vips library. This means that the binding is always up to date and always supports every feature of vips. In this case it means that Python finally supports vips sequential mode, which can help performance a lot.
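
The general idea behind a dynamic binding can be sketched in a few lines. This is not the pyvips8 source: the `OPERATIONS` dictionary is a toy stand-in for whatever the real library advertises at runtime, and `DynamicImage` is a made-up class. The point is only that `__getattr__` lets method calls be resolved by name when they are made, so nothing has to be written out by hand per operation.

```python
# Toy stand-in for the operations a library might advertise at runtime.
OPERATIONS = {
    "invert": lambda value: 255 - value,
    "add": lambda value, amount=0: value + amount,  # optional named parameter
}


class DynamicImage:
    """Wraps a single value and finds its methods at call time."""

    def __init__(self, value):
        self.value = value

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            operation = OPERATIONS[name]
        except KeyError:
            raise AttributeError(name)

        def call(*args, **kwargs):
            return DynamicImage(operation(self.value, *args, **kwargs))

        return call


pixel = DynamicImage(10)
result = pixel.invert().add(amount=5)

# An operation registered later is available with no change to the class.
OPERATIONS["double"] = lambda value: value * 2
doubled = result.double()
```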

I ran these different implementations on a 10,000 by 10,000 RGB JPEG on my laptop. ImageMagick is quite quick on small images, but struggles with these larger ones. I see:
real 0m3.304s
user 0m3.094s
sys 0m0.812s
peak RSS 1.4 GB
Next fastest is the vips7 version. I see:
real 0m2.475s
user 0m1.909s
sys 0m0.545s
peak RSS 70 MB
So about 0.8 seconds faster, but with much, much better memory use. The memory figure is a bit misleading, though, since vips is actually processing the image via a temporary disc file here, so that disc traffic really should be included.

Finally, pyvips8:
real 0m0.994s
user 0m1.723s
sys 0m0.081s
peak RSS 90 MB
Memory use is slightly up from vips7, but sequential mode has given a really nice speedup. And there's no temporary disc file, so the sys time has fallen right down. You can see that vips is now able to make use of the multiple cores on this laptop too, which is great.
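
For anyone who wants the ratios, this short sketch does the arithmetic on the timings quoted above. The helper names are made up. A CPU-to-wall-clock ratio above 1 is taken here as a rough sign that more than one core was busy, which is an interpretation rather than a measurement.

```python
# Timings in seconds, copied from the runs reported above.
timings = {
    "ImageMagick": {"real": 3.304, "user": 3.094, "sys": 0.812},
    "vips7": {"real": 2.475, "user": 1.909, "sys": 0.545},
    "pyvips8": {"real": 0.994, "user": 1.723, "sys": 0.081},
}


def speedup(baseline, candidate):
    """How many times faster candidate is than baseline, by wall-clock time."""
    return timings[baseline]["real"] / timings[candidate]["real"]


def cpu_ratio(name):
    """CPU time (user + sys) divided by wall-clock time."""
    run = timings[name]
    return (run["user"] + run["sys"]) / run["real"]


for name in timings:
    print("%-12s %.2fx vs pyvips8, cpu/real %.2f"
          % (name, speedup(name, "pyvips8"), cpu_ratio(name)))
```

On these numbers pyvips8 is roughly 3.3 times faster than ImageMagick and 2.5 times faster than vips7 by wall-clock time, and its CPU time is about 1.8 times its elapsed time.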
