
4.6. Command Substitution

From the discussion so far, we've seen two ways of getting values into variables: by assignment statements and by the user supplying them as command-line arguments (positional parameters). There is another way: command substitution, which allows you to use the standard output of a command as if it were the value of a variable. You will soon see how powerful this feature is.

The syntax of command substitution is:

$(Unix command)

The command inside the parentheses is run, and anything the command writes to standard output is returned as the value of the expression; output written to standard error is not captured. These constructs can be nested, i.e., the Unix command can contain command substitutions.

Here are some simple examples:
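The following is a minimal sketch of a few such substitutions, not a listing from the original text; the variable names (greeting, here, parent) are chosen purely for illustration.

```shell
# Capture the output of a simple command in a variable.
greeting=$(echo hello)

# pwd prints the current directory (normally the same as $PWD).
here=$(pwd)

# Substitutions nest: the inner $(pwd) runs first, and dirname
# then strips the last component from its output.
parent=$(dirname "$(pwd)")

echo "$greeting from $here (parent directory: $parent)"
```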

Command substitution, like variable expansion, is done within double quotes. (Double quotes inside the command substitution are not affected by any enclosing double quotes.) Therefore, our rule in Chapter 1 and Chapter 3 about using single quotes for strings unless they contain variables will now be extended: "When in doubt, use single quotes, unless the string contains variables, or command substitutions, in which case use double quotes."
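As a small illustration of these quoting rules (the variable names are hypothetical and the strings are arbitrary):

```shell
title="Giant  Steps"                  # note the two spaces

# Double quotes: the substitution is performed. The quotes around $title
# belong to the inner command and are unaffected by the outer pair.
dq="Album: $(echo "$title")"

# Single quotes: the text is taken literally and nothing is run.
sq='Album: $(echo "$title")'

echo "$dq"    # prints: Album: Giant  Steps
echo "$sq"    # prints: Album: $(echo "$title")
```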

(For backwards compatibility, the Korn shell supports the original Bourne shell (and C shell) command substitution notation using backquotes: `...`. However, it is considerably harder to use than $(...), since quoting and nested command substitutions require careful escaping. We don't use the backquotes in any of the programs in this book.)
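To make the difference concrete, here is a sketch that nests one substitution inside another in both notations. The path /usr/local/bin/ksh is only an example string; nothing is read from the filesystem.

```shell
# $(...) nests with no extra escaping.
new_style=$(basename $(dirname /usr/local/bin/ksh))

# With backquotes, the inner pair must be escaped with backslashes.
old_style=`basename \`dirname /usr/local/bin/ksh\``

echo "$new_style $old_style"    # prints: bin bin
```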

You will undoubtedly think of many ways to use command substitution as you gain experience with the Korn shell. One that is a bit more complex than those mentioned previously relates to a customization task that we saw in Chapter 3: personalizing your prompt string.

Recall that you can personalize your prompt string by assigning a value to the variable PS1. If you are on a network of computers, and you use different machines from time to time, you may find it handy to have the name of the machine you're on in your prompt string. Most modern versions of Unix have the command hostname(1), which prints the network name of the machine you are on to standard output. (If you do not have this command, you may have a similar one like uname.) This command enables you to get the machine name into your prompt string by putting a line like this in your .profile or environment file:

PS1="$(hostname) $ "

(Here, the second dollar sign does not need to be preceded by a backslash. If the character after the $ isn't special to the shell, the $ is included literally in the string.) For example, if your machine had the name coltrane, then this statement would set your prompt string to "coltrane $ ".
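If you move between systems where hostname may be missing, one possible approach is to test for it first and fall back to uname -n, which also prints the machine's network name. This is a sketch under that assumption, using command -v to check whether the command exists; the variable name host is illustrative.

```shell
# Prefer hostname; fall back to uname -n where hostname is not installed.
if command -v hostname >/dev/null 2>&1; then
    host=$(hostname)
else
    host=$(uname -n)
fi

# The final $ is followed by a space, so it is kept literally.
PS1="$host $ "
```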

Command substitution helps us with the solution to the next programming task, Task 4-4, which relates to the album database in Task 4-1.

Task 4-4

The file used in Task 4-1 is actually a report derived from a bigger table of data about albums. This table consists of several columns, or fields, to which a user refers by names like "artist," "title," "year," etc. The columns are separated by vertical bars (|, the same as the Unix pipe character). To deal with individual columns in the table, field names need to be converted to field numbers.

Suppose there is a shell function called getfield that takes the field name as argument and writes the corresponding field number on the standard output. Use this routine to help extract a column from the data table.

The cut(1) utility is a natural for this task. cut is a data filter: it extracts columns from tabular data.[62] If you supply the numbers of columns you want to extract from the input, cut prints only those columns on the standard output. Columns can be character positions or -- relevant in this example -- fields that are separated by TAB characters or other delimiters.

[62] Some very old BSD-derived systems don't have cut, but you can use awk instead. Whenever you see a command of the form cut -fN -dC filename, use this instead: awk -FC '{ print $N }' filename.

Assume that the data table in our task is a file called albums and that it looks like this:

Coltrane, John|Giant Steps|Atlantic|1960|Ja
Coltrane, John|Coltrane Jazz|Atlantic|1960|Ja
Coltrane, John|My Favorite Things|Atlantic|1961|Ja
Coltrane, John|Coltrane Plays the Blues|Atlantic|1961|Ja
...

Here is how we would use cut to extract the fourth (year) column:

cut -f4 -d\| albums

The -d argument is used to specify the character used as field delimiter (TAB is the default). The vertical bar must be backslash-escaped so that the shell doesn't try to interpret it as a pipe.
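The escaped delimiter can equally be written in single quotes, and the awk form from the footnote gives the same result. A brief sketch on one row of the table (the variable names are illustrative):

```shell
row='Coltrane, John|Giant Steps|Atlantic|1960|Ja'

year_escaped=$(printf '%s\n' "$row" | cut -f4 -d\|)      # backslash-escaped bar
year_quoted=$(printf '%s\n' "$row" | cut -f4 -d'|')      # single-quoted bar
year_awk=$(printf '%s\n' "$row" | awk -F'|' '{ print $4 }')

echo "$year_escaped $year_quoted $year_awk"    # prints: 1960 1960 1960
```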

From this line of code and the getfield routine, we can easily derive the solution to the task. Assume that the first argument to the script is the name of the field the user wants to extract; the script passes it on to getfield. Then the solution is:

fieldname=$1
cut -f$(getfield $fieldname) -d\| albums

If we ran this script with the argument year, the output would be:

1960
1960
1961
1961
...
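The task leaves getfield unspecified. Purely as an illustration, a hypothetical version could map names to numbers with a case statement. The field names artist, title, and year come from the task description, and their positions are read off the sample rows; sample_albums is a stand-in for the albums file so that the sketch is self-contained. The functions use the name() form so the sketch also runs in other Bourne-style shells.

```shell
# Hypothetical getfield: writes the column number for a field name.
getfield() {
    case $1 in
        artist) echo 1 ;;
        title)  echo 2 ;;
        year)   echo 4 ;;
        *)      echo "getfield: unknown field: $1" >&2
                return 1 ;;
    esac
}

# Stand-in for the albums file: the four sample rows shown earlier.
sample_albums() {
    cat <<'EOF'
Coltrane, John|Giant Steps|Atlantic|1960|Ja
Coltrane, John|Coltrane Jazz|Atlantic|1960|Ja
Coltrane, John|My Favorite Things|Atlantic|1961|Ja
Coltrane, John|Coltrane Plays the Blues|Atlantic|1961|Ja
EOF
}

fieldname=year
years=$(sample_albums | cut -f"$(getfield "$fieldname")" -d\|)
echo "$years"
```

Sending the error message for an unknown field to standard error keeps it out of the substituted value, so it cannot be mistaken for a field number.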

Task 4-5 is another small task that makes use of cut.

Task 4-5

Assume that you are logged into a large server or mainframe that supports many simultaneous users. Send a mail message to everyone who is currently logged in.

The command who(1) tells you who is logged in (as well as which terminal they're on and when they logged in). Its output looks like this:

billr      console      May 22 07:57
fred       tty02        May 22 08:31
bob        tty04        May 22 08:12

The fields are separated by spaces, not TABs. Since we need the first field, we can get away with using a space as the field separator in the cut command. (Otherwise we'd have to use the option to cut that uses character columns instead of fields.) To provide a space character as an argument on a command line, you can surround it by quotes:

who | cut -d' ' -f1

With the above who output, this command's output would look like this:

billr
fred
bob

This leads directly to a solution to the task. Just type:

mail $(who | cut -d' ' -f1)

The command mail billr fred bob will run and then you can type your message.
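One caveat: a user who is logged in on several terminals appears once per session in who's output, so the same name could be passed to mail more than once. A possible refinement, sketched here rather than taken from the original solution, is to pass the list through sort -u. The function sample_who stands in for who so the example is reproducible, and its last line is an invented extra session for fred.

```shell
# Stand-in for who: the sample output plus one invented duplicate login.
sample_who() {
    cat <<'EOF'
billr      console      May 22 07:57
fred       tty02        May 22 08:31
bob        tty04        May 22 08:12
fred       tty05        May 22 09:40
EOF
}

# sort -u removes the duplicate fred (and sorts the names).
recipients=$(sample_who | cut -d' ' -f1 | sort -u)

# Shown with echo rather than actually sending mail.
echo mail $recipients    # prints: mail billr bob fred
```

With the real command, the equivalent line would be mail $(who | cut -d' ' -f1 | sort -u).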

Task 4-6 is another task that shows how useful command pipelines can be in command substitution.

Task 4-6

The ls command gives you pattern-matching capability with wildcards, but it doesn't allow you to select files by modification date. Devise a mechanism that lets you do this.

This task was inspired by the feature of the OpenVMS operating system that lets you specify files by date with BEFORE and SINCE parameters.

Here is a function that allows you to list all files that were last modified on the date you give as argument. Once again, we choose a function for speed reasons. No pun is intended by the function's name:

function lsd {
    date=$1
    ls -l | grep -i "^.\{41\}$date" | cut -c55-
}

This function depends on the column layout of the ls -l command. In particular, it depends on dates starting in column 42 and filenames starting in column 55. If this isn't the case in your version of Unix, you will need to adjust the column numbers.[63]

[63] For example, ls -l on GNU/Linux has dates starting in column 43 and filenames starting in column 57.

We use the grep search utility to match the date given as argument (in the form Mon DD, e.g., Jan 15 or Oct  6, the latter having two spaces) to the output of ls -l. (The regular expression argument to grep is quoted with double quotes, in order to perform the variable substitution.) This gives us a long listing of only those files whose dates match the argument. The -i option to grep allows you to use all lowercase letters in the month name, while the rather fancy argument means, "Match any line that contains 41 characters followed by the function argument." For example, typing lsd 'jan 15' causes grep to search for lines that match any 41 characters followed by jan 15 (or Jan 15).

The output of grep is piped through our ubiquitous friend cut to retrieve just the filenames. The argument to cut tells it to extract characters in column 55 through the end of the line.
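Because the column numbers vary between systems, it can help to keep them in one place. The following sketch is a hypothetical variant, not the function above: lsd_from_listing reads a listing from standard input instead of calling ls -l itself, and datecol, namecol, and sample_listing are illustrative names. The sample lines are generated with printf so that the date starts in column 42 and the filename in column 55, matching the layout assumed above.

```shell
datecol=42    # column where the date starts (value from the text)
namecol=55    # column where the filename starts (value from the text)

# Hypothetical variant of lsd that filters a listing read from stdin.
lsd_from_listing() {
    date=$1
    skip=$((datecol - 1))     # number of characters that precede the date
    grep -i "^.\{$skip\}$date" | cut -c"$namecol"-
}

# Made-up listing; the printf widths place the date in column 42
# and the filename in column 55.
sample_listing() {
    printf '%-10s %3s %-8s %-8s %7s %s %s\n' \
        -rw-r--r-- 1 billr staff 1024 'Jan 15 10:30' report.txt \
        -rw-r--r-- 1 billr staff  512 'Oct  6 09:12' notes.txt \
        -rw-r--r-- 1 billr staff 2048 'Jan 15 16:45' draft.txt
}

matches=$(sample_listing | lsd_from_listing 'jan 15')
echo "$matches"
```

On a real system the listing would come from ls -l, for example ls -l | lsd_from_listing 'jan 15', after setting datecol and namecol to match the local layout.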

With command substitution, you can use this function with any command that accepts filename arguments. For example, if you want to print all files in your current directory that were last modified today, and today is January 15, you could type:

lp $(lsd 'jan 15')

The output of lsd is on multiple lines (one for each filename), but because the variable IFS (see earlier in this chapter) contains newline by default, the shell uses newline to separate words in lsd's output, just as it normally does with space and TAB.
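The effect of IFS on the result of a substitution can be seen in a small sketch; printf stands in here for lsd, and the filenames are made up.

```shell
names=$(printf 'chap1.txt\nchap2.txt\nchap3.txt\n')

# Unquoted: the result is split into words at each newline.
set -- $names
unquoted_count=$#

# Quoted: no splitting, so the whole result is a single word.
set -- "$names"
quoted_count=$#

echo "$unquoted_count $quoted_count"    # prints: 3 1
```

The same splitting applies to spaces, so with the default IFS a filename containing a space would be passed to the command as two separate words.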




Copyright © 2003 O'Reilly & Associates. All rights reserved.