Tuesday, 21 February 2012

Sequential mode read

The development version of libvips has just gained an interesting new feature: sequential mode read. It will be in the upcoming 7.28 release, due in another month or so.

Many image formats only really support sequential reading. For example, libpng provides png_read_row(), a function which reads the next line of pixels from a file. You call it once for each line in the image to get every pixel.

However, many operations require random access to pixels. For example, a 90-degree rotate will need to read a column of pixels for every output line it writes. If it just used png_read_row() it would need to decompress the input image many, many times in order to create the output.
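To get a feel for the cost, here is a toy Python model of a source that can only hand out the next row, plus a rotate built on top of it. The class and function names are made up for illustration and have nothing to do with the real libvips or libpng APIs; the point is just to count how many rows get decoded.

```python
class SequentialSource:
    """Toy stand-in for a decoder that can only return the next row."""

    def __init__(self, rows):
        self._rows = rows
        self._next = 0
        self.rows_decoded = 0  # total rows decoded, including repeats

    def rewind(self):
        # Restart decoding from the top of the image.
        self._next = 0

    def read_row(self):
        row = self._rows[self._next]
        self._next += 1
        self.rows_decoded += 1
        return row


def rotate90_sequential_only(source, width, height):
    """Rotate 90 degrees clockwise using only rewind() and read_row()."""
    output = []
    for x in range(width):
        # Output line x is input column x read bottom-to-top, so each
        # output line costs one full pass over the input.
        source.rewind()
        column = [source.read_row()[x] for _ in range(height)]
        output.append(column[::-1])
    return output


image = [[1, 2, 3],
         [4, 5, 6]]
src = SequentialSource(image)
rotated = rotate90_sequential_only(src, width=3, height=2)
print(rotated)           # [[4, 1], [5, 2], [6, 3]]
print(src.rows_decoded)  # 6: three full passes over a two-row image
```

In this model an image that is W pixels across needs W complete decodes, so a 10,000 pixel wide image would be decompressed 10,000 times.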

To prevent this, libvips first decompresses the entire image to a large temporary area and then processes from that. This temporary area is held in memory for small files or on disc for large files.

Not all operations need random access to their source pixels. For example, thumbnailing, the process of shrinking images for display, can work strictly top-to-bottom. To help speed up operations of this type libvips has a new hint that you can give to read operations to indicate that you only need sequential access.

$ time vips --vips-leak copy wtc.jpg[sequential] wtc.tif
real   0m4.903s
user   0m1.384s
sys    0m0.512s
memory: high-water mark 70.73 MB

(where wtc.jpg is a 10,000 by 10,000 pixel RGB image and --vips-leak makes libvips print peak memory use)

Here the [sequential] hint means that we will only need top-to-bottom access to the pixels in wtc.jpg. When libvips opens this image it will not decompress to a temporary buffer but, instead, stream pixels from the jpeg decompressor directly through the operation. This saves first writing and then reading the 300 MB temporary area.
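The 300 MB figure is just the uncompressed size of the image. A quick check in Python, assuming one byte per sample (8-bit RGB, which is an assumption about wtc.jpg rather than something stated above) and a helper name invented for this post:

```python
def uncompressed_bytes(width, height, bands, bytes_per_sample=1):
    # Size of the fully decompressed image, with no padding or headers.
    return width * height * bands * bytes_per_sample


size = uncompressed_bytes(10_000, 10_000, 3)
print(size / 1e6)  # 300.0 (MB, counting 1 MB as 10**6 bytes)
```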

The sequential mode reader keeps a few hundred lines behind the current read point in a cache so it can handle some small amount of non-sequential access. If you try something that's very non-sequential after giving the [sequential] hint, you get an error.

$ time vips flip wtc.png[sequential] wtc.tif vertical
VipsSequential: non-sequential read --- at position 0 in file, but position 9344 requested
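The look-behind cache idea can be sketched in a few lines of Python. This is a toy model, not the libvips implementation: the class name, the exception and the 256-line cache size are all assumptions made for illustration. It also only rejects reads that go back past the cache, whereas the transcript above shows libvips rejecting a large jump forward as well, which this sketch does not model.

```python
from collections import OrderedDict


class NonSequentialRead(Exception):
    pass


class SequentialReader:
    """Toy reader: decodes top-to-bottom and keeps recent rows cached."""

    def __init__(self, read_next_row, height, cache_lines=256):
        self._read_next_row = read_next_row  # callable returning the next row
        self._height = height
        self._cache_lines = cache_lines      # assumed size, "a few hundred lines"
        self._cache = OrderedDict()          # row index -> row, oldest first
        self._position = 0                   # index of the next row to decode

    def get_row(self, y):
        if not 0 <= y < self._height:
            raise IndexError(y)
        if y in self._cache:
            return self._cache[y]
        if y < self._position:
            # Already decoded and since dropped from the cache.
            raise NonSequentialRead(
                f"at position {self._position}, but row {y} requested")
        # Decode forward until row y is available, trimming the cache
        # so it never holds more than cache_lines rows.
        while self._position <= y:
            self._cache[self._position] = self._read_next_row()
            self._position += 1
            if len(self._cache) > self._cache_lines:
                self._cache.popitem(last=False)
        return self._cache[y]


rows = iter(range(1000))  # each "row" is just its own index
reader = SequentialReader(lambda: next(rows), height=1000)
reader.get_row(500)       # decodes rows 0..500, cache now holds 245..500
reader.get_row(300)       # a small step back, served from the cache
try:
    reader.get_row(0)     # too far back
except NonSequentialRead as err:
    print("error:", err)
```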

Without the [sequential] hint you get the current libvips behaviour, which is noticeably slower and of course generates much more disc traffic:

$ time vips --vips-leak copy wtc.jpg wtc.tif
real   0m6.004s
user   0m1.588s
sys    0m1.208s
memory: high-water mark 77.46 MB

Both are faster than ImageMagick:

$ time convert wtc.jpg wtc.tif
real   0m9.357s
user   0m11.393s
sys    0m0.620s
peak memuse 673 MB

The jpeg, stripped tiff and png readers now support sequential mode read. We may add sequential mode support to other readers before release, but these are the main ones.

The libvips thumbnailing utility, vipsthumbnail, now turns on sequential mode for you, so you get this nice behaviour automatically.

$ time vipsthumbnail --vips-leak wtc.tif
real   0m1.092s
user   0m0.848s
sys    0m0.216s
memory: high-water mark 44.24 MB

Again, faster than ImageMagick:

$ time convert wtc.tif -resize 128x128 tn_wtc.jpg
real    0m10.039s
user    0m9.409s
sys     0m0.820s
peak memuse 678 MB

You won't see much of a speedup with jpg images, since both systems use libjpeg's handy downsample-on-load feature:

$ time vipsthumbnail wtc.jpg
real    0m0.358s
user    0m0.332s
sys     0m0.024s

$ time convert -define jpeg:size=256x256 wtc.jpg -thumbnail 128x128 -unsharp 0x.5 tn_wtc.jpg
real    0m0.463s
user    0m0.436s
sys     0m0.040s

Update

Rob Hines pointed out that this will help tiff pyramid build as well. You see a modest speedup with a tif source:

$ time vips --vips-leak copy wtc.tif wtc-pyr.tif[compression=jpeg,tile,pyramid]
memory: high-water mark 72.50 MB
real 0m6.604s
user 0m4.820s
sys 0m1.076s

$ time vips --vips-leak copy wtc.tif[sequential] wtc-pyr.tif[compression=jpeg,tile,pyramid]
memory: high-water mark 62.30 MB
real 0m5.379s
user 0m4.648s
sys 0m0.560s

And a very nice speedup with a massive jpg source:

$ time vips --vips-leak copy world.topo.bathy.200405.3x21600x21600.A1.jpg A1-pyr.tif[compression=jpeg,tile,pyramid]
memory: high-water mark 162.58 MB
real 1m12.996s
user 0m22.845s
sys 0m4.552s

$ time vips --vips-leak copy world.topo.bathy.200405.3x21600x21600.A1.jpg[sequential] A1-pyr.tif[compression=jpeg,tile,pyramid]
memory: high-water mark 138.95 MB
real 0m24.218s
user 0m22.277s
sys 0m1.172s

In this case you save writing then reading back a 1.3 GB file.
