Work the Shell

Bash Notational Shortcuts: Efficiency over Clarity

Dave Taylor

Issue #221, September 2012

Are shell scripts inevitably antiquated? Is Dave writing Bourne Shell scripts for UNIX and not even writing about Linux? Read on to find out about his latest letter from a reader and his subsequent explanation of his philosophy of writing scripts for publication.

I get letters. Well, I don't get very many letters, truth be told, but I do occasionally get interesting dispatches from the field, and a recent one took me to task for writing about UNIX, not Linux, and for focusing on the Bourne Shell, not Bash. Which is odd.

When you're on the command line or writing a shell script, things are pretty darn similar across Linux and UNIX due to the POSIX standard that defines syntax, notational conventions and so on. And in terms of the wealth of commands available? My experience is that if you rely on “non-standard” Linux commands like some of the more-sophisticated GNU utilities, you might find yourself in a right pickle when only the more lobotomized versions are available on a job site or with particular hardware that, yes, might be running a flavor of UNIX. It happens.

Same with Bash versus Bourne shell, although since I do write about shell functions and various other more-advanced features and because I never test the scripts in this column against Bourne Shell (not being Jason, after all), well, just because I'm not using your favorite Bash features and shortcuts doesn't mean I'm using that “other” shell, does it?

The most valuable part of the letter was the reminder that modern Bash shells have added some slick notational conventions that can clean up some of our conditional statements and structures. It was a good reminder: old dog, new tricks, and all that. Let's have a look.

Shortening Conditional Tests

One of the first programming languages I learned was APL. You probably haven't even heard of it, but it was a remarkably powerful language characterized by lots of special notations that gave you the ability to produce sophisticated results in just a line or two. The problem was, no one could debug it, and the common belief was that it was faster to rewrite a program than to figure out what another programmer was thinking.

This recurred the first time I looked at Perl, and I even said so to Larry Wall when we bumped into each other years ago. When a script or program looks like your cat ran across the keyboard, it might be very efficient, but it's sure hard to debug later, even if it's your own code.

And onward to Linux. When working on shell scripts, you're used to seeing single brackets for conditional expressions, like this:

if [ -n "$value" ] ; then

What I haven't explained is that [ is a command in its own right, a synonym for test. In early UNIX shells it was an external program, so every conditional meant launching another process. Bash has long had it as a builtin, but its arguments are still expanded like those of any other command, which is why a variable that might be empty or contain spaces needs those quotes. Write it with double brackets, however:

if [[ -n $value ]] ; then

and you're using a keyword that's part of the shell's own syntax. Bash doesn't perform word splitting or filename expansion on what's inside, so you can often get away with sloppier quoting using the [[ ]] notation. It's also reputed to be a bit faster.
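The quoting difference is easiest to see with an empty variable. Here's a minimal sketch, assuming Bash; the result variable names are made up purely for the illustration:

```shell
#!/usr/bin/env bash
# Sketch: how [ ] and [[ ]] treat an unquoted, empty variable.
value=""

# Unquoted inside [ ]: the empty expansion vanishes, so the shell
# actually runs `[ -n ]`, a one-argument test that is true because
# the string "-n" itself is non-empty.
if [ -n $value ] ; then
  single_unquoted="true"
else
  single_unquoted="false"
fi

# Quoted inside [ ]: the empty string survives as an argument.
if [ -n "$value" ] ; then
  single_quoted="true"
else
  single_quoted="false"
fi

# Unquoted inside [[ ]]: no word splitting, so the test sees the
# empty string even without quotes.
if [[ -n $value ]] ; then
  double_unquoted="true"
else
  double_unquoted="false"
fi

echo "[ -n \$value ]     -> $single_unquoted"
echo "[ -n \"\$value\" ]   -> $single_quoted"
echo "[[ -n \$value ]]   -> $double_unquoted"
```

Only the first form gets it wrong, reporting an empty variable as non-empty. That's the practical argument for either quoting religiously or switching to [[ ]].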

The question is, how much faster is it, and is it worth making your scripts just a bit more obfuscated, particularly for us old dogs who are used to the [ ] notation? On the vast majority of systems, in the vast majority of cases, I don't think it speeds things up much at all. By their very nature, shell scripts aren't written to be maximally efficient. If you need lightning performance, there are better, albeit more complicated, languages you can use, like C++ or even Perl. Just keep your cat away from the keyboard.

The same goes for another notational convention that I eschew in the interest of writing maximally clear and readable script code. It turns out that a conditional statement like:

if [ -n "$value" ] ; then
  echo "value $value has a non-zero length"
fi

also can be written more succinctly as:

[ -n "$value" ] && echo "value $value has a non-zero length"

In this situation, && means “if the previous command had a 'true' exit status, do the next one” and || means the opposite, as in:

[ -n "$value" ] || echo "value $value has a length of zero"
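To see both operators in action, here's a small sketch. The function name check_value is hypothetical, invented just for this example:

```shell
#!/usr/bin/env bash
# check_value is a hypothetical helper, named only for this illustration.
check_value() {
  local value=$1

  # && runs the echo only when the test succeeds (exit status 0).
  [ -n "$value" ] && echo "value $value has a non-zero length"

  # || runs the echo only when the test fails (non-zero exit status).
  [ -n "$value" ] || echo "value has a length of zero"
}

check_value "abc"
check_value ""
```

One design wrinkle worth knowing: when the test fails, the && form leaves a non-zero exit status behind, whereas an if statement whose condition is false and that has no else clause exits with zero. That matters if the shortcut is the last line of a function or script.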

More efficient? Slightly, particularly if we use [[ ]] instead of the single brackets we have now, but is it worth the obfuscation? Perhaps in code that you're delivering to a client or that you are writing as a fast throw-away script for a specific task, but the code I publish here needs to be easily understood. Then we weave in efficiency.

To get a sense of how long I've been chewing on how to write legible, easily understood code, I'll just say that when I first started coding in Fortran-77 I loved that you could have spaces in variable and function names, letting me write code that was even more like an algorithm instead of a complicated program.

Variable Expansion Tricks

Speaking of tricks and cats running across keyboards, I've also avoided some of the really complicated ${} notational options in the interest of having my scripts be as widely portable as possible. For example, I tend to write:

length=$(echo "$word" | wc -c) ; length=$(( length - 1 ))

It's clunky and admittedly inefficient. A smarter way to do it:

length=${#word}

It turns out that the ${#word} notation produces the number of characters in the value of the specified variable, with no arithmetic or pipeline needed. That's easy!
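Here are the two approaches side by side, as a quick sketch with a sample word of my own choosing:

```shell
#!/usr/bin/env bash
word="keyboard"

# Old-school: echo appends a newline, which wc -c also counts,
# hence the subtraction. Some wc implementations pad the count
# with spaces, which arithmetic expansion tolerates.
length_wc=$(echo "$word" | wc -c)
length_wc=$(( length_wc - 1 ))

# Parameter expansion: no pipeline and no external command.
length_expansion=${#word}

echo "wc -c: $length_wc, \${#word}: $length_expansion"
```

The two agree for plain ASCII text. They can differ for multibyte characters, because wc -c counts bytes while ${#word} counts characters in the current locale.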

If you look at the Bash man page, you'll see that there are dozens of different syntactic shortcuts like this. Remembering which is which is probably more trouble than it's worth when, for the majority of you readers, shell script programming is a useful additional skill, not your main job.

Don't believe me? Alright, what does this do?

echo ${value^^}

I'd never seen this notation before this particular reader sent me his message, but it turns out that in Bash 4 (not earlier versions of Bash), it transliterates lowercase to uppercase. That's something I'd usually write like this:

$(echo "$value" | tr '[:lower:]' '[:upper:]')

Or, a slight variation that taps into the modern <<< notation:

$(tr '[:lower:]' '[:upper:]' <<< "$value")
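If you want the shortcut where it's available and the portable version everywhere else, one option is to check the Bash version first. This is only a sketch: to_upper is a hypothetical function name, and I'm assuming that older Bash versions complain about ${1^^} only when that line actually runs, not when the function is defined.

```shell
#!/usr/bin/env bash
# to_upper is a hypothetical helper name, used only for this sketch.
to_upper() {
  if (( BASH_VERSINFO[0] >= 4 )) ; then
    # Bash 4 and later: case-modifying parameter expansion.
    printf '%s\n' "${1^^}"
  else
    # Older Bash: fall back to tr with POSIX character classes.
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
  fi
}

to_upper "linux journal"
```

Of course, a wrapper like this is more code than either approach alone, which rather proves the point about shortcuts and portability pulling in opposite directions.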

Which is better? Indeed, across all of these shortcuts and modern tweaks to the Bash shell, which are better?

I'll let you tell me, dear reader. Drop me a note and tell me whether you'd prefer that we publish sample scripts with all of these notational tricks, even at the cost of broad portability across environments and systems, or whether you prefer more “standard” old-school scripting techniques that will even work on that clunky old server you administer.

And, needless to say, keep those letters coming, whether you agree with what I'm writing or vehemently disagree. We have asbestos inboxes here at Linux Journal and always want to hear what you're thinking! [Send your Letters to the Editor via www.linuxjournal.com/contact.]