The final Korn shell feature that relates to the kinds of values that variables can hold is the typeset command. If you are a programmer, you might guess that typeset is used to specify the type of a variable (integer, string, etc.); you'd be partially right.
typeset is a rather ad hoc collection of things that you can do to variables that restrict the kinds of values they can take. Operations are specified by options to typeset; the basic syntax is:
typeset option varname[=value]
Here, option is an option letter preceded by a hyphen or plus sign. Options can be combined, and multiple varnames can be used. If you leave out varname, the shell prints a list of variables for which the given option is turned on.
The available options break down into two basic categories:
String formatting operations, such as right- and left-justification, truncation, and letter case control
Type and attribute functions that are of primary interest to advanced programmers
typeset without options has an important meaning: if a typeset statement is used inside a function definition, the variables involved all become local to that function (in addition to any properties they may take on as a result of typeset options). The ability to define variables that are local to "subprogram" units (procedures, functions, subroutines, etc.) is necessary for writing large programs, because it helps keep subprograms independent of the main program and of each other.
NOTE: Local variable names are restricted to simple identifiers. When typeset is used with a compound variable name (i.e., one that contains periods), that variable is automatically global, even if the typeset statement occurs inside the body of a function.
If you just want to declare a variable local to a function, use typeset without any options. For example:
    function afunc {
        typeset diffvar
        samevar=funcvalue
        diffvar=funcvalue
        print "samevar is $samevar"
        print "diffvar is $diffvar"
    }

    samevar=globvalue
    diffvar=globvalue
    print "samevar is $samevar"
    print "diffvar is $diffvar"
    afunc
    print "samevar is $samevar"
    print "diffvar is $diffvar"
This code prints the following:
    samevar is globvalue
    diffvar is globvalue
    samevar is funcvalue
    diffvar is funcvalue
    samevar is funcvalue
    diffvar is globvalue
Figure 6-1 shows this graphically.
You will see several additional examples of local variables within functions in Chapter 9.
Now let's look at the various options to typeset. Table 6-6 lists the string formatting options; the first three take an optional numeric argument.
Option | Operation |
---|---|
-Ln | Left-justify. Remove leading spaces; if n is given, fill with spaces or truncate on right to length n. |
-Rn | Right-justify. Remove trailing spaces; if n is given, fill with spaces or truncate on left to length n. |
-Zn | If used with -R, add leading 0's instead of spaces if needed. If used with -L, strips leading 0's. By itself, acts the same as -RZ. |
-l | Convert letters to lowercase. |
-u | Convert letters to uppercase. |
Here are a few simple examples. Assume that the variable alpha is assigned the letters of the alphabet, in alternating case, surrounded by three spaces on each side:
alpha="   aBcDeFgHiJkLmNoPqRsTuVwXyZ   "
Table 6-7 shows some typeset statements and their resulting values (assuming that each statement is run independently of the others).
Statement | Value of v |
---|---|
typeset -L v=$alpha | "aBcDeFgHiJkLmNoPqRsTuVwXyZ      " |
typeset -L10 v=$alpha | "aBcDeFgHiJ" |
typeset -R v=$alpha | "      aBcDeFgHiJkLmNoPqRsTuVwXyZ" |
typeset -R16 v=$alpha | "kLmNoPqRsTuVwXyZ" |
typeset -l v=$alpha | "   abcdefghijklmnopqrstuvwxyz   " |
typeset -uR5 v=$alpha | "VWXYZ" |
typeset -Z8 v="123.50" | "00123.50" |
When you run typeset on an existing variable, its effect is cumulative with whatever typesets may have been used previously, with these obvious exceptions:
A typeset -u undoes a typeset -l, and vice versa.
A typeset -R undoes a typeset -L, and vice versa.
You may not combine typeset -l and typeset -u with some of the numeric attributes, such as typeset -E. Note, though, that typeset -ui creates unsigned integers.
typeset -A and typeset -n (associative array and nameref, respectively) are not cumulative.
You can turn off typeset options explicitly by typing typeset +o, where o is the option you turned on before. Of course, it is hard to imagine scenarios where you would want to turn multiple typeset formatting options on and off over and over again; you usually set a typeset option on a given variable only once.
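As a minimal sketch of how the case attributes replace one another and how +o switches one off, the fragment below assigns the same mixed-case string after each change. The variable names word, lower, upper, and plain are purely illustrative, and printf is used in place of print only so that the fragment also runs in shells that lack print.

```shell
# Hypothetical variable "word": every assignment is filtered through
# whichever case attribute is currently switched on.
typeset -l word
word="MiXeD"
lower=$word            # -l is on, so the value is folded to lowercase

typeset -u word        # -u undoes -l; only one of the two is in effect
word="MiXeD"
upper=$word            # now folded to uppercase

typeset +u word        # +u switches the attribute off again
word="MiXeD"
plain=$word            # stored exactly as assigned

printf '%s %s %s\n' "$lower" "$upper" "$plain"
```

Assigning a fresh value after each typeset keeps the example independent of whether the shell also converts the value the variable already holds.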
An obvious application for the -L and -R options is one in which you need fixed-width output. The most ubiquitous source of fixed-width output in the Unix system is reflected in Task 6-3.
Task 6-3
Pretend that ls doesn't do multicolumn output; write a shell script that does it.
For the sake of simplicity, we'll assume further that our version of Unix is an ancient one, in which filenames are limited to 14 characters.[87]
[87] We don't know of any modern Unix systems that still have this restriction. But applying it here considerably simplifies the programming problem.
Our solution to this task relies on many of the concepts we have seen earlier in this chapter. It also relies on the fact that set -A (for constructing arrays) can be combined with command substitution in an interesting way: each word (separated by spaces, TABs, or newlines) becomes an element of the array. For example, if the file bob contains 50 words, the array fred has 50 elements after the statement:
set -A fred $(< bob)
Our strategy is to get the names of all files in the given directory into an array variable. We use an arithmetic for loop, as we saw earlier in this chapter, to get each filename into a variable whose length has been set to 14. We print that variable in five-column format, with two spaces after each column (for a total of 80 columns), using a counter to keep track of columns. Here is the code:
    set -A filenames $(ls $1)
    typeset -L14 fname
    let numcols=5

    for ((count = 0; count < ${#filenames[*]} ; count++)); do
        fname=${filenames[count]}
        print -rn "$fname  "
        if (( (count+1) % numcols == 0 )); then
            print           # newline
        fi
    done

    if (( count % numcols != 0 )); then
        print
    fi
The first line sets up the array filenames to contain all the files in the directory given by the first argument (the current directory by default). The typeset statement sets up the variable fname to have a fixed width of 14 characters. The next line initializes numcols to the number of columns per line.
The arithmetic for loop iterates once for every element in filenames. In the body of the loop, the first line assigns the next array element to the fixed-width variable. The print statement prints the latter followed by two spaces; the -n option suppresses print's final newline.
Then there is the if statement, which determines when to start the next line. It checks the remainder of count + 1 divided by $numcols (remember that dollar signs aren't necessary within a ((...)) construct), and if the result is 0, it's time to output a newline via a print statement without arguments. Notice that even though $count increases by 1 with every iteration of the loop, the remainder goes through a cycle of 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, ...
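To see that cycle for yourself, here is a small sketch that evaluates the same remainder expression for the first ten values of count. The accumulator variable cycle is our own addition for illustration, and printf stands in for print so that the fragment is not tied to one shell.

```shell
numcols=5
cycle=""               # hypothetical accumulator, not part of the script
for ((count = 0; count < 10; count++)); do
    # Same expression as the test inside the loop's if statement.
    cycle="$cycle$(( (count + 1) % numcols )) "
done
printf '%s\n' "$cycle"
```

Each 0 in the sequence marks a point where the multicolumn script would end the current row.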
After the loop, an if construct outputs a final newline if necessary, i.e., if the if within the loop didn't just do it.
We can also use typeset options to clean up the code for our dosmv script (Task 5-3), which translates filenames in a given directory from MS-DOS to Unix format. The code for the script is:
    for filename in ${1:+$1/}* ; do
        newfilename=$(print $filename | tr '[A-Z]' '[a-z]')
        newfilename=${newfilename%.}
        # subtlety: quote value of $newfilename to do string comparison,
        # not regular expression match
        if [[ $filename != "$newfilename" ]]; then
            print "$filename -> $newfilename"
            mv $filename $newfilename
        fi
    done
We can replace the call to tr in the for loop with one to typeset -l before the loop:
    typeset -l newfilename

    for filename in ${1:+$1/}* ; do
        newfilename=${filename%.}
        # subtlety: quote value of $newfilename to do string comparison,
        # not regular expression match
        if [[ $filename != "$newfilename" ]]; then
            print "$filename -> $newfilename"
            mv $filename $newfilename
        fi
    done
This way, the translation to lowercase letters is done automatically each time a value is assigned to newfilename. Not only is this code cleaner, but it is also more efficient, because the extra processes created by tr and command substitution are eliminated.
The other options to typeset are of more use to advanced shell programmers who are "tweaking" large scripts. These options are listed in Table 6-8.
Option | Operation |
---|---|
-A | Create an associative array. |
-En | Represent the variable internally as a double-precision floating-point number; improves the efficiency of floating-point arithmetic. If n is given, it is the number of significant figures to use in output. Large numbers print in scientific notation: [-]d.ddde±dd. Smaller numbers print in regular notation: [-]ddd.ddd. |
-Fn | Represent the variable internally as a double-precision floating-point number; improves the efficiency of floating-point arithmetic. If n is given, it is the number of decimal places to use in output. All values print in regular notation: [-]ddd.ddd. |
-H | On non-Unix systems, Unix-style filenames are converted into the format appropriate for the local system. |
-in | Represent the variable internally as an integer; improves the efficiency of integer arithmetic. If n is given, it is the base used for output. The default base is 10. |
-n | Create a nameref variable (see Chapter 4). |
-p | When used by itself, prints typeset statements that describe the attributes of each of the shell's variables that have attributes set. With additional options, only prints those variables that have the corresponding attribute set. Intended for dumping the shell's state into a file that can later be sourced by a different shell to recreate the original shell's state. |
-r | Make the variable read-only: forbid assignment to it and disallow it from being unset. The built-in command readonly does the same thing, but readonly cannot be used for local variables. |
-t | "Tags" the variable. The list of tagged variables is available from typeset +t. This option is obsolete. |
-uin | Represent the variable internally as an unsigned integer. This is discussed further in Appendix B. If n is given, it is the base used for output. The default base is 10.[88] |
-x | This does the same thing as the export command, but export cannot be used for local variables. |
-f | Refer to function names only; see Section 6.5.4, later in this chapter. |
[88] This feature is only in ksh93m and newer.
The -i option is the most useful. You can put it in a script when you are done writing and debugging it to make arithmetic run a bit faster, though the speedup will be apparent only if your script does a lot of arithmetic. The more readable integer is a built-in alias for typeset -i, so that integer x=5 is the same as typeset -i x=5. Similarly, the word float is a predefined alias for typeset -E.[89]
[89] C, C++, and Java programmers may find the choice of the word "float" to be surprising, since internally the shell uses double-precision floating point. We theorize that the name "float" was chosen since its meaning is more obvious to the nonprogrammer than the word "double."
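Besides any speedup, the -i attribute has a visible side effect that is worth knowing about: a value assigned to an integer variable is evaluated as an arithmetic expression. The following sketch contrasts an integer variable with an ordinary one; the names total and plain are illustrative only, and printf is used instead of print so the fragment is not tied to one shell.

```shell
typeset -i total=5     # integer attribute turned on
total=total+10         # right-hand side is evaluated arithmetically

plain=5                # ordinary string variable
plain=plain+10         # right-hand side is stored literally

printf 'total=%s plain=%s\n' "$total" "$plain"
```

Here total ends up holding a number, while plain holds the literal text of the expression.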
The -r option is useful for setting up "constants" in shell scripts; constants are like variables except that you can't change their values once they have been initialized. Constants allow you to give names to values even if you don't want them changed; it is considered good programming practice to use named constants in large programs.
The solution to Task 6-3 contains a good candidate for typeset -r: the variable numcols, which specifies the number of columns in the output. Since numcols is an integer, we could also use the -i option, i.e., replace let numcols=5 with typeset -ri numcols=5. If we were to try assigning another value to numcols, the shell would respond with the error message ksh: numcols: is read only.
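A cautious way to try this out is sketched below. The reassignment is attempted inside a subshell, on the assumption that an error there cannot abort the surrounding script, and the shell's own error message is discarded because its exact wording varies. The variable outcome is our own addition.

```shell
typeset -ri numcols=5

# Try the forbidden assignment in a subshell so that a failure
# is reported through the exit status rather than ending the script.
if ( numcols=7 ) 2>/dev/null; then
    outcome="accepted"
else
    outcome="rejected"
fi

printf 'numcols=%s, reassignment %s\n' "$numcols" "$outcome"
```

Whatever the subshell does, the value of numcols in the main script is left untouched.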
These options are also useful without arguments, i.e., to see which variables exist that have those options turned on.
The -f option has various suboptions, all of which relate to functions. These are listed in Table 6-9.
Option | Operation |
---|---|
-f | With no arguments, prints all function definitions. |
-f fname | Prints the definition of function fname. |
+f | Prints all function names. |
-ft | Turns on trace mode for named function(s). (Chapter 9) |
+ft | Turns off trace mode for named function(s). (Chapter 9) |
-fu | Defines given name(s) as autoloaded function(s). (Chapter 4) |
Two of these have built-in aliases that are more mnemonic: functions (note the s) is an alias for typeset -f and autoload is an alias for typeset -fu.
Finally, if you type typeset with no arguments, you will see a list of all variables that have attributes set (in no discernible order), preceded by the appropriate keywords for whatever typeset options are turned on. For example, typing typeset in an uncustomized shell gives you a listing of most of the shell's built-in variables and their attributes that looks like this:
    export HISTFILE
    integer TMOUT
    export FCEDIT
    export _AST_FEATURES
    export TERM
    HISTEDIT
    PS2
    PS3
    integer PPID
    export MAIL
    export LOGNAME
    export EXINIT
    integer LINENO
    export PATH
    integer HISTCMD
    export _
    export OLDPWD
    export PWD
    float precision 3 SECONDS
    export SHELL
    integer RANDOM
    export HOME
    export VISUAL
    export MANPATH
    export EDITOR
    export ENV
    export HISTSIZE
    export USER
    export LANG
    export MORE
    integer OPTIND
    integer MAILCHECK
    export CDPATH
    readonly namespace .sh
Here is the output of typeset -p:
    typeset -x HISTFILE
    typeset -i TMOUT
    typeset -x FCEDIT
    typeset -x _AST_FEATURES
    typeset -x TERM
    typeset -x ASIS_DIR
    typeset HISTEDIT
    typeset PS2
    typeset PS3
    typeset -i PPID
    typeset -x MAIL
    typeset -x LOGNAME
    typeset -x EXINIT
    typeset -i LINENO
    typeset -x PATH
    typeset -i HISTCMD
    typeset -x _
    typeset -x OLDPWD
    typeset -x PWD
    typeset -F 3 SECONDS
    typeset -x SHELL
    typeset -i RANDOM
    typeset -x HOME
    typeset -x VISUAL
    typeset -x MANPATH
    typeset -x EDITOR
    typeset -x ENV
    typeset -x HISTSIZE
    typeset -x USER
    typeset -x LANG
    typeset -x MORE
    typeset -i OPTIND
    typeset -i MAILCHECK
    typeset -x CDPATH
    typeset -r .sh
The following command saves the values and attributes of all the shell's variables for later reuse:
{ set ; typeset -p ;} > varlist
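The same save-and-restore idea can be tried on a single variable without a file. This is only a sketch: it assumes that typeset -p accepts a variable name and prints one statement for it, and the names count and saved are ours. The statement is captured in a string and replayed with eval, which stands in for sourcing a saved file.

```shell
typeset -i count=42
saved=$(typeset -p count)   # one statement describing count's attributes and value

unset count                 # throw the variable away
eval "$saved"               # replay the saved statement

count=count+1               # evaluated arithmetically only if -i was restored
printf '%s\n' "$count"
```

If the integer attribute had been lost, the last assignment would store the text of the expression rather than a number.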
Copyright © 2003 O'Reilly & Associates. All rights reserved.