Dave Taylor, author of Wicked Cool Shell Scripts, begins a new series on Linux shell scripting in this issue.
If you're reading this publication, you already know that Linux is one of the most powerful and versatile operating systems available today. If you're an old-timer like me, you also know all about the command line and the geeky retro joy that typing commands rather than clicking icons offers the diligent user. Nowadays, though, the graphical interface layered atop Linux is so well designed that—though I find it a bit baffling—plenty of Linux users never go near the command line.
That's too bad. The command line is tremendously powerful, and the underlying metaphor of commands being strung together in pipes to create custom command sequences means that Linux actually offers millions of unique ways to work with the system. But, yes, there's a definite learning curve to overcome.
More than just the command line, though, it turns out that the shell offers a simple and surprisingly powerful programming environment through what we call shell script programming. In UNIX parlance, a shell is a command-line interface or CLI. Either way, it's the program that receives the commands you type in and actually does whatever it is you requested. String a bunch of these commands together, put them in a file and you have a shell script—simple and straightforward.
That's what I'm going to address in this new column here at Linux Journal, and fair warning for those über-geeks in the crowd, I'm going to go slow and make sure we cover all the basic concepts before we move into complex scripting tricks and techniques.
To start, let me briefly introduce myself. I first logged in to a BSD UNIX system way back in 1980 and have been involved with UNIX, and then Linux systems, ever since. I worked with the Open Software Foundation, helped manage the Usenet hierarchy, was one of the postmasters at hplabs back in the old UUCP days and am pretty well known as the author of The Elm Mail System. I've written 19 books, notably including Teach Yourself Unix in 24 Hours and the best-selling Wicked Cool Shell Scripts. I've contributed software to a variety of UNIX and Linux distros, including BSD 4.4 back when that was released, and I still have an open terminal window on my computer regardless of what I'm working on. I'm hooked on the command line, what can I say?
To get started, let's talk about one of the most important concepts of the Linux command line: standard input and output. When you run a program like ls to list files or date to see the date and time (sadly, the latter command doesn't help you gain a social life; if only it were so easy!), the program actually has one input channel and two output channels. These particular commands ignore the input channel, because they don't read anything from what's called the input stream, but they do use both the output stream and the error output stream. The three streams are called standard input (or stdin), standard output (or stdout) and standard error (or stderr). Why is this important? Because you can redirect any of them to come from a file or to go to a file, for any Linux command.
Let's say that you want to create a new file called rightnow, and you want it to contain the current date and time. Here's how that'd look on the command line:
date > rightnow
Easy enough. An important warning, however: if the output file you specify already exists, by default the shell just silently overwrites it, not infrequently leading to curses, great frustration and unhappy users. Be careful (or read up in your favorite command shell's man page about noclobber).
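As a rough sketch of what noclobber does, assuming a Bourne-style shell such as bash, the following turns the option on, tries the same redirection again and then uses the >| operator to overwrite the file deliberately. The scratch directory is only there to keep the experiment away from your real files.

```shell
# Work in a scratch directory so no real files are at risk.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

date > rightnow          # creates the file
set -o noclobber         # ask the shell to refuse to overwrite existing files

# With noclobber set, the redirection fails and rightnow is left untouched.
if ! date > rightnow; then
    echo "refused to overwrite rightnow"
fi

# The >| operator overrides noclobber for this one command.
date >| rightnow

set +o noclobber         # back to the default behavior
```

The option lasts only for the current shell session unless you set it in your shell's startup file.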
Let's say you want to save the date twice in the file. Now, instead of creating a new file, it's time to add the new content to the existing contents of the file. This is done thusly:
date >> rightnow
Check the file now and you'll see two time/date stamps, a few seconds apart.
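To see the difference between > and >> side by side, here is a small sketch you could paste into a terminal. The file name both-stamps is made up for this illustration, and the one-second pause is only there so the two stamps differ.

```shell
# Work in a scratch directory so an existing rightnow file is not touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

date > rightnow            # create the file: one time/date stamp
sleep 1                    # wait so the second stamp is different
date >> rightnow           # append: the file now holds two stamps
cp rightnow both-stamps    # keep a copy (hypothetical name) for comparison
cat both-stamps

date > rightnow            # a single > overwrites: back to one stamp
cat rightnow
```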
Let's add another useful command to our list, wc, which counts characters, words and lines in either a specified file or in stdin (the standard input stream). First, how many characters, words and lines are in the standard output of the date command?
$ date > test
$ wc test
1 6 29 test
Typical cryptic Linux output: the first value is the number of lines, the second the number of words and the third the number of characters. Let's try a variation on this too:
$ wc < test
1 6 29
Notice this time that rather than having the wc command open up a file we've specified by name, we're using a redirection to replace stdin with the contents of the specified file. That's why the wc output doesn't show the filename; it doesn't know that the input is from a file.
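The third stream, stderr, can be redirected as well. The text doesn't cover the syntax for that, so treat this as a side note: in Bourne-style shells, 2> sends stderr to a file, just as > does for stdout. In this sketch, real-file, no-such-file, listing and errors are all made-up names.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch real-file            # one file that exists; no-such-file does not

# ls writes the listing to stdout and its complaint to stderr.
# It exits with an error status because one file is missing, hence "|| true".
ls real-file no-such-file > listing 2> errors || true

cat listing                # the name of the file that was found
cat errors                 # the error message about the missing one
```

The exact wording of the error message varies from one system to another, but it ends up in errors rather than on the screen.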
Let's consider one more file redirection before we wrap up this quick tour. We've seen > and >> and <. What do you think happens if you use << as a file redirection? Ah, well, that's a tricky one, because it doesn't append anything. Instead, it lets you feed input to a command as though it came from a file, without actually having a file involved. In fact, << is known as a here document because, when used in the standard form of << EOF, it means “read until you reach 'here'” (the EOF sequence). This'll make more sense with an example:
$ wc << EOF
> this is a simple test and should
> show you how many lines, words
> and characters are in this little
> input sequence.
> EOF
4 21 114
Now you can see where the output of wc is starting to make sense: four lines, 21 words and 114 characters. Count it for yourself! Also, notice that the > symbol at the beginning of the lines is added automatically by the shell as a continuation prompt to let you know that more input is expected. Once the sequence EOF appears on a line by itself, marking the end of the here document, the input stream is fed to the specified command and wc dutifully counts lines, words and characters.
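The same here document can go in a script file. There is no interactive prompt there, so the leading > symbols disappear and the lines are typed exactly as wc should read them. This sketch also adds an ordinary > redirection to show that the two can be combined on one command. The file name counts is made up for the illustration.

```shell
# Work in a scratch directory so the output file lands somewhere harmless.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# stdin comes from the here document; stdout goes to the file "counts".
wc << EOF > counts
this is a simple test and should
show you how many lines, words
and characters are in this little
input sequence.
EOF

cat counts    # the same three numbers as in the interactive session
```

The spacing between the three numbers depends on which version of wc is installed.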
That should get us started with the basics this month. Next month, we'll explore how you can create pipelines of commands where the output of one command is the input of the next, then begin to talk about my long-term shell script programming project for this column: a rudimentary blackjack game.