Think shell scripts can deal only with text? You'll be amazed at what you can do as Dave begins his exploration of ImageMagick and its many useful tricks.
I've spent a lot of time in this column talking about text processing and analysis, with the basic assumption that if you're using the command line, you're focused on text. That's not always true, and if you work with images at all—whether JPEG, PNG, GIF or another format—there's a free-to-download suite of image-related utilities available that offers rather amazing capabilities direct from the command line and, therefore, also from within shell scripts.
I'm talking about ImageMagick, a set of programs that has grown and expanded through the years and now includes powerful Perl and Ruby interfaces too. But, pshaw! We don't need no stinkin' Perl or Ruby. We'll stick with our hard-core shell commands, thank you very much.
You'll find both downloadable binaries and source code at www.imagemagick.org, and as always, I recommend you download the source and compile it on your system if you can. It's far more reliable than hoping someone else's compiled version is optimized for your own hardware configuration.
A variety of commands are included with the ImageMagick distribution, and I divide them into “analysis” and “editing” tools. For this article, let's stick with the analysis tools. Let me start by showing you how much more information ImageMagick offers on a typical image file than the standard Linux command line does.
If you've been using Linux for even a short time, you've probably learned about the file command. It can be helpful with some file types:
$ file wp-content.tar.gz
wp-content.tar.gz: gzip compressed data, from Unix
But, the command is generally useless with images:
$ file pvp.jpg
pvp.jpg: JPEG image data, EXIF standard
Um, what about image size? How about any useful info at all? Jeez.
Enter the ImageMagick identify command:
$ identify pvp.jpg
pvp.jpg JPEG 970x311 DirectClass 114kb 0.010u 0:01
Ahh...so this particular image has the dimensions (the suite refers to dimensions as the “geometry” of the image) of 970x311. That's useful.
Do you want even more information though? The -verbose option spits out a somewhat overwhelming amount of data:
$ identify -verbose pvp.jpg
Image: pvp.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Geometry: 970x311
  Class: DirectClass
  Colorspace: RGB
  Type: TrueColor
  Depth: 8 bits
  Endianess: Undefined
  Channel depth:
    Red: 8-bits
    Green: 8-bits
    Blue: 8-bits
  Channel statistics:
    Red:
      Min: 0
      Max: 255
      Mean: 180.72
      Standard deviation: 74.2122
    Green:
      Min: 0
      Max: 255
      Mean: 168.593
      Standard deviation: 76.0343
    Blue:
      Min: 0
      Max: 255
      Mean: 169.459
      Standard deviation: 77.244
  Colors: 21864
  Rendering-intent: Undefined
  Resolution: 72x72
  Units: Undefined
  Filesize: 114kb
  Interlace: None
  Background Color: white
  Border Color: #DFDFDF
  Matte Color: grey74
  Dispose: Undefined
  Iterations: 0
  Compression: JPEG
  Orientation: Undefined
  JPEG-Quality: 94
  JPEG-Colorspace: 2
  JPEG-Sampling-factors: 1x1,1x1,1x1
  signature: bc8a6a698ca35fd8feab71452423386ff98b1fb7b5ec ...
  Profile-xmp: 811 bytes
  Profile-exif: 22 bytes
    unknown
  Profile-app12: 15 bytes
  Tainted: False
  User Time: 0.020u
  Elapsed Time: 0:01
Truth be told, dimensions and resolution are the most useful pieces of information from this crazy-long output.
With a tiny bit of effort, you can extract just those items of information:
$ identify -verbose pvp.jpg | grep -E '(Resolution:|Geometry:)'
  Geometry: 970x311
  Resolution: 72x72
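If a script needs those two values more than once, the filtering can be wrapped in a small function. What follows is only a sketch: the name summarize_verbose is made up for illustration (it is not part of ImageMagick), and it assumes the verbose output uses the "Geometry:" and "Resolution:" labels shown above.

```shell
#!/bin/sh
# summarize_verbose: hypothetical helper, not part of ImageMagick.
# Reads `identify -verbose` output on stdin and prints
# "<geometry> <resolution>" on a single line.
# Assumes the "Geometry:" and "Resolution:" labels shown in the article.
summarize_verbose() {
  awk '
    $1 == "Geometry:"   { geometry = $2 }
    $1 == "Resolution:" { resolution = $2 }
    END { print geometry, resolution }
  '
}

# Typical use (requires ImageMagick to be installed):
#   identify -verbose pvp.jpg | summarize_verbose
```

Because awk splits on whitespace by default, the indentation in the verbose output doesn't get in the way.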
Now imagine you are working on a Web site and want to ensure that no images on the site are greater than 72dpi, a standard screen resolution. Higher print-ready resolutions are rather pointless, because a 300dpi image will render the same on a screen as its lower-resolution brethren—it'll just load slower.
Here's one way you can identify images in a directory with incorrect resolutions:
#!/bin/sh

identify=/usr/bin/identify

# check images to ensure that they're all 72x72 resolution.

for filename
do
  # quote the filename so names containing spaces still work
  resolution=$($identify -verbose "$filename" | \
    grep -i "Resolution:" | grep -v 72x72)
  if [ -n "$resolution" ] ; then
    echo "Warning: Image $filename has $resolution"
  fi
done

exit 0
When I run this on a directory of images on my own system, a set of JPEG format files on my www.AskDaveTaylor.com site, here's what I get:
$ checkres.sh *.jpg
Warning: Image auction-seller-img1.jpg has Resolution: 75x75
Warning: Image auction-seller-img2.jpg has Resolution: 75x75
Warning: Image browsing-the-photo-folder.jpg has Resolution: 96x96
Warning: Image brushed-metal.jpg has Resolution: 300x300
...
That's a surprise! I didn't realize that I had 300x300 and these other weird resolutions. An easy way to speed up my site, therefore, is to lower the resolution on these images to the standard 72dpi. This is something that also can be done with a call to a different ImageMagick utility, but let's tackle that in another article.
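Note that the script above flags anything that isn't exactly 72x72, which would include lower resolutions too. If the goal is only to catch images above 72dpi, a numeric comparison is one option. Here's a minimal sketch; the function name exceeds_dpi is hypothetical, and it assumes the resolution arrives as an "XxY" string like the ones in the output above.

```shell
#!/bin/sh
# exceeds_dpi: hypothetical helper for illustration.
# Usage: exceeds_dpi 300x300 [limit]
# Succeeds (exit status 0) if either axis of the resolution is above
# the limit (default 72). Assumes an "<x>x<y>" string; any fractional
# part is dropped before comparing.
exceeds_dpi() {
  limit=${2:-72}
  xres=${1%%x*}      # text before the first "x"
  yres=${1#*x}       # text after the first "x"
  xres=${xres%%.*}   # keep the integer part only
  yres=${yres%%.*}
  [ "$xres" -gt "$limit" ] || [ "$yres" -gt "$limit" ]
}

# Example, using a resolution value from the output above:
if exceeds_dpi 300x300; then
  echo "300x300 is above 72dpi"
fi
```

The parameter expansions split the string in the shell itself, so no extra cut or grep processes are needed for the comparison.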
Since I write a lot of scripts that harvest images or other content from sites and repurpose them for my own (generally private, not public-facing) use, I also find it is darn helpful in a shell script to be able to ascertain the size of an image I've just grabbed.
If you've guessed that identify is the key, you're right. In fact, given an image, this is an easy way to grab its height and width:
width=$(identify "$image" | cut -d' ' -f3 | cut -dx -f1)
height=$(identify "$image" | cut -d' ' -f3 | cut -dx -f2)
There's no need for verbose output, because the geometry of the image is included in the default output.
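Since the geometry field lists width first and then height, one call to identify is enough to get both values. The sketch below splits a single line of output using the shell's own parameter expansion. The function name parse_dims is made up for illustration, and it assumes the third whitespace-separated field is WIDTHxHEIGHT, as in the sample output earlier, and that the filename contains no spaces.

```shell
#!/bin/sh
# parse_dims: hypothetical helper for illustration.
# Takes one line of default `identify` output and sets $width and $height.
# Assumes the third field is WIDTHxHEIGHT, as in the article's sample
# output, and that the filename contains no spaces.
parse_dims() {
  read -r _name _format geometry _rest <<EOF
$1
EOF
  width=${geometry%%x*}    # text before the "x"
  height=${geometry#*x}    # text after the "x"
}

# Sample line copied from the article; in a real script this would be
#   parse_dims "$(identify "$image")"
parse_dims "pvp.jpg JPEG 970x311 DirectClass 114kb 0.010u 0:01"
echo "width=$width height=$height"
```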
Now it's easy to produce higher-quality HTML, for example, by including images with their proper dimensions:
echo "<img src=$image height=$height width=$width>"
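Quoting the attribute values keeps the tag valid when a filename contains characters, such as spaces, that unquoted HTML attributes can't hold. A small sketch of that variation follows; img_tag is a hypothetical name, not an existing command.

```shell
#!/bin/sh
# img_tag: hypothetical helper for illustration.
# Usage: img_tag FILE WIDTH HEIGHT
# Prints an <img> tag with quoted attribute values. It does not escape
# quotes or ampersands inside the filename; that is left out for brevity.
img_tag() {
  printf '<img src="%s" width="%s" height="%s">\n' "$1" "$2" "$3"
}

img_tag pvp.jpg 970 311
```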
What's better is that Web browsers can scale images for you, so if you specify a height and width that differ from the actual dimensions (oops, sorry, “geometry”) of the image, the browser resizes it to fit.
This means if I want to include the pvp.jpg image on an automatically generated page, but decide 970 pixels is just too wide, I can simply include it as:
<img src=pvp.jpg height=207 width=646>
and the browser—be it Chrome, Safari or even MS IE—will scale it appropriately.
Calculating the smaller size is straightforward with bc, another underappreciated Linux command. To scale the image to roughly two-thirds (66.6%) of its original dimensions, the entire sequence might look like this:
#!/bin/sh

identify=/usr/bin/identify
scale=0.666

image=$1

# add input validation code

# geometry is reported as WIDTHxHEIGHT, so width is the first value
width=$($identify "$image" | cut -d' ' -f3 | cut -dx -f1)
height=$($identify "$image" | cut -d' ' -f3 | cut -dx -f2)

newwidth="$(echo $width \* $scale | bc | cut -d. -f1)"
newheight="$(echo $height \* $scale | bc | cut -d. -f1)"

echo "<img src=$image height=$newheight width=$newwidth>"

exit 0
In practical use:
$ scaledown.sh pvp.jpg
<img src=pvp.jpg height=207 width=646>
That's easy enough!
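To check the arithmetic: 970 x 0.666 is 646.02 and 311 x 0.666 is 207.126, which truncate to 646 and 207. If bc isn't available, the shell's built-in integer arithmetic can do the same job. This swaps in a different technique from the script above, so treat it as a sketch: scale_dim is a hypothetical name, and the scale is given in thousandths (666 rather than 0.666) because shell arithmetic has no decimals.

```shell
#!/bin/sh
# scale_dim: hypothetical helper for illustration.
# Usage: scale_dim PIXELS PERMILLE
# Scales a pixel count using integer arithmetic only; 666 means 66.6%.
# The division truncates toward zero.
scale_dim() {
  echo $(( $1 * $2 / 1000 ))
}

newwidth=$(scale_dim 970 666)
newheight=$(scale_dim 311 666)
echo "<img src=pvp.jpg height=$newheight width=$newwidth>"
```

Multiplying before dividing matters here: dividing first would throw away the remainder too early and give a smaller result.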
With some creativity, you can see how even just the identify command that's included with ImageMagick opens up a world of image file scripting possibilities, whether you're working with Web sites directly or simply seek to analyze directories of images for unusual values or settings.
I'll dig into some of the really slick editing and modification capabilities, including an easy way to add a so-called watermark to your graphics, along with ways you can automate fixing 300dpi resolution images or even scaling images, in an upcoming article.
As a final note, although I explain how you can take a large image and have it show up smaller on a Web page by using different values for height and width, it would be remiss of me not to mention that if you're going to use only the smaller size, it's smarter to resize the original image. It makes your page faster to load, less unneeded data is transferred and everything just generally is happier (including the search engines). Now you know.