Work the Shell

More Special Variables

Dave Taylor

Issue #181, May 2009

Use bash's more powerful variable substitution forms to simplify your scripts.

I realize this might throw a spanner into the editorial works here at Linux Journal, but after a two-month sidetrack on how to analyze letter usage in English to give you an edge in Hangman (yeah, I can't believe I write about this stuff either), it's time to get back to our tour of basic shell variable referencing capabilities.

In previous columns, we talked about ${var:-alt value}, ${var:=alt value}, ${var:?no value} and even ${var:start:length} as a way to extract specific ranges of characters from a variable.

This month, I want to look at what are perhaps some of the more arcane variable references you can do—calls that are definitely helpful if you're deep in the zone with your scripting. I imagine they won't be things you need for those quick five-line scripts, but when your little project has expanded to a dozen screens and you have seven functions and a dozen arrays, well, these will be of great value to you.

Expanding and Matching

In a previous column, I showed how to do substring expansion with shell variables in the form of ${var:start:length}, but it's also useful to know the length of a variable's value. This can be done with ${#var}, like this:

$ test="the rain in Spain"
$ echo ${#test}
17
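As a quick illustration of where the length comes in handy, here's a minimal sketch that checks whether a value fits within a given width. The function name fits_in is made up for this example; it isn't a shell built-in:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the value is at most "max" characters long.
fits_in() {
  local value=$1 max=$2
  [ "${#value}" -le "$max" ]
}

test="the rain in Spain"
echo "length: ${#test}"

if fits_in "$test" 20; then
  echo "fits in 20 characters"
fi
```

Because ${#var} is handled by the shell itself, there's no need to pipe the value through wc -c just to count characters.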

One situation I've encountered in scripts is the need to set an arbitrary number of variables in the form value1, value2, value3 and so on. Later, I need to determine the names of the ones that I've set. My lazy solution is typically another variable, valuecount, which counts the number of variables I've set, but, of course, that doesn't directly give me the names. A smarter way to do this is with the ${!prefix*} notation, which expands to the names of all variables that begin with prefix, as shown here:

$ echo ${!t*}
test
$ thimble="full"
$ tart="pop"
$ echo ${!t*}
tart test thimble

As you can see, it lets you get a list of defined variables whose names begin with the specified prefix. I'm using the prefix t in the example (the trailing * is part of the notation), but it just as easily could be value to match the situation outlined earlier.
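To go from the names to the values, one approach is to pair this with bash's indirect expansion, ${!name}. Here's a sketch using the value1, value2, value3 scenario; the variable contents and the report variable are placeholders invented for the example:

```shell
#!/usr/bin/env bash
# Placeholder data standing in for the "arbitrary number of variables" case.
value1="alpha"
value2="beta"
value3="gamma"

# ${!value*} expands to the matching names; ${!name} then fetches
# the value of the variable whose name is stored in "name".
report=""
for name in ${!value*}; do
  report+="$name=${!name} "
done

echo "$report"
```

Note that any other variable whose name happens to start with value would be picked up too, so choose a prefix that's distinctive.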

Pattern Substitution

Here's a cool thing you can do with the bash shell that I'm betting you didn't realize: pattern substitution. When I have a situation where this is required, I almost always use the clunky and CPU-expensive form of:

var=$(echo "$var" | sed 's/old/new/')

which actually can be neatly accomplished with the shell itself by using the form ${var/old/new}. I kid you not! Check out this example:

$ test="The Rain in Spain"
$ echo ${test/ain/ixn}
The Rixn in Spain

If you're like me, your fingers are itching to add a /g suffix to the substitution. It turns out that's done a bit differently within a shell variable: you double the first slash, so the pattern is introduced by //. It looks a bit weird, but it does work:

$ echo ${test//ain/ixn}
The Rixn in Spixn

The general case here is ${var//pat/global subst}. There's more you can do with this notation too. Notably, a leading # or % on the pattern acts like the ^ and $ special characters you might use in sed regular expressions, anchoring the match to the beginning or end of the variable's value. Keep in mind that the pattern itself is a shell glob, not a regular expression:

$ echo ${test/#ain/ixn}
The Rain in Spain
$ echo ${test/%ain/ixn}
The Rain in Spixn

In the first situation, the pattern didn't match the first few letters of the variable value (the pattern would need to have been “The” rather than “ain”), so nothing changed. In the second situation, however, it did match the last few characters, so the substitution took place.
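Here's a short sketch that puts the four forms side by side, this time with an anchored pattern that does match at the start. The result variable names are just labels for this example, and the last line shows a glob-style bracket expression in the pattern:

```shell
#!/usr/bin/env bash
test="The Rain in Spain"

first=${test/ain/ixn}         # first match only
all=${test//ain/ixn}          # every match
at_start=${test/#The/A}       # replaced only because the value starts with "The"
at_end=${test/%ain/ixn}       # replaced only because the value ends with "ain"
no_vowels=${test//[aeiou]/_}  # bracket expressions work as they do in filename globs

printf '%s\n' "$first" "$all" "$at_start" "$at_end" "$no_vowels"
```

This prints The Rixn in Spain, The Rixn in Spixn, A Rain in Spain, The Rain in Spixn and Th_ R__n _n Sp__n, one per line.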

To be fair, using sed does give you quite a bit more power and capability, but if you're just doing something simple like removing an extension and appending a PID to a variable to make a quick temp file, you can indeed just use shell pattern replacement:

$ test="The Rain in Spain.txt"
$ echo ${test/%.*/}.$$
The Rain in Spain.10126

Personally, I think this is very cool!
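One caveat worth a quick sketch: the substitution replaces the longest match, so a filename with more than one dot loses everything from the first dot onward. If that's not what you want, the related ${var%pattern} form, which isn't covered in this column, strips only the shortest matching suffix. The filenames below are invented for the example:

```shell
#!/usr/bin/env bash
file="The Rain in Spain.txt"
archive="backup.tar.gz"

echo "${file/%.*/}.$$"     # extension swapped for this shell's PID
echo "${archive/%.*/}"     # longest match: everything from the first dot is removed
echo "${archive%.*}"       # shortest suffix match: only the final extension is removed
```

For the archive name, the second line prints backup and the third prints backup.tar.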

Command Substitutions

We've explored just about everything you can do with variables other than delving into arrays, which we'll do next month, so I thought I'd take a bit of space to show you a few slick command substitution tricks. First off, we old-timers are used to using backticks to have a command embedded within another, as in the following:

echo the date is `date`

This is pretty commonly used, but, in fact, a better and certainly more readable notational convention is to use $() instead, as I showed earlier. This is functionally identical:

echo the date is $(date)

Using this notation gives you some interesting capabilities. For example, instead of $(cat file), you simply can use $(< file) to make the contents of the file appear.
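Here's a small sketch of both ideas: $(< file) standing in for $(cat file), and the fact that $() nests without the backslash escaping that backticks would need. The scratch file and variable names are assumptions made up for the demonstration:

```shell
#!/usr/bin/env bash
# Assumption: a scratch file created only for this demonstration.
tmpfile=$(mktemp)
printf 'line one\nline two\n' > "$tmpfile"

with_cat=$(cat "$tmpfile")
with_redirect=$(< "$tmpfile")   # same contents, without running cat

[ "$with_cat" = "$with_redirect" ] && echo "identical contents"

# Nesting: the inner $() runs first, and no escaping is required.
parent_name=$(basename "$(dirname "$tmpfile")")
echo "scratch file lives in a directory named: $parent_name"

rm -f "$tmpfile"
```

In both forms, command substitution strips trailing newlines from the result.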

As is always the case with the shell, when and where fields are parsed is important too. Check out the following:

$ echo the date is $(date)
the date is Wed Feb 4 08:08:35 MST 2009
$ echo the date is "$(date)"
the date is Wed Feb  4 08:08:43 MST 2009

With double quotes around the second invocation of $(date), the output isn't subjected to word splitting, so its whitespace survives intact: notice the two spaces between Feb and 4 in the second output compared to one space in the first, where the shell split the unquoted output into separate words and echo rejoined them with single spaces.
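To see the splitting more directly, here's a sketch that counts how many arguments a command actually receives in each case. The count_args function and the fixed stamp string (standing in for live date output) are inventions for the example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the number of arguments it was given.
count_args() {
  echo "$#"
}

# Fixed stand-in for date output with a space-padded day of the month.
stamp="Wed Feb  4 08:08:35 MST 2009"

unquoted_count=$(count_args $stamp)     # split on whitespace into separate words
quoted_count=$(count_args "$stamp")     # delivered as a single argument

echo "unquoted: $unquoted_count arguments"
echo "quoted:   $quoted_count argument"
```

The unquoted form yields six arguments and the quoted form yields one, which is why the extra space vanishes in the first case.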

I hope I don't need to tell you what happens if you use single quotes instead of double quotes—oh, what the heck:

$ echo the date is '$(date)'
the date is $(date)

No surprise there. Single quotes disable shell expansion, just as they do in this case:

$ echo The '$HOSTNAME' is $HOSTNAME
The $HOSTNAME is soyvah33

This leads to the classic question: what if you actually do want those quotes to be part of the output? It's a bit convoluted, but escaping each quote with a backslash works:

$ echo The '$HOSTNAME' is \'$HOSTNAME\'
The $HOSTNAME is 'soyvah33'
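An alternative that some find easier to read is to wrap the whole string in double quotes, where single quotes are ordinary characters and a backslash keeps a $ literal. Here's a sketch comparing the two; HOSTNAME is set by hand only so the output is predictable:

```shell
#!/usr/bin/env bash
# Assumption: HOSTNAME is assigned here purely to make the output predictable.
HOSTNAME="soyvah33"

# Backslash-escaped single quotes outside of any quoting.
escaped=$(echo The '$HOSTNAME' is \'$HOSTNAME\')

# Everything inside double quotes: \$ stays literal, '...' is plain text.
double_quoted=$(echo "The \$HOSTNAME is '$HOSTNAME'")

echo "$escaped"
echo "$double_quoted"
```

Both lines print The $HOSTNAME is 'soyvah33'.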

Let's wrap things up here, and next month, we'll dig into the oft-confusing world of shell script arrays.