This book is about a set of oddly named UNIX utilities, sed and awk. These utilities have many things in common, including the use of regular expressions for pattern matching. Since pattern matching is such an important part of their use, this book explains UNIX regular expression syntax very thoroughly. Because there is a natural progression in learning from grep to sed to awk, we will be covering all three programs, although the focus is on sed and awk.
Sed and awk are tools used by users, programmers, and system administrators: in short, anyone who works with text files. Sed, so called because it is a stream editor, is perfect for applying a series of edits to a number of files. Awk, named after its developers Aho, Weinberger, and Kernighan, is a programming language that permits easy manipulation of structured data and the generation of formatted reports. This book emphasizes the POSIX definition of awk. In addition, the book briefly describes the original version of awk before discussing three freely available versions of awk and two commercial ones, all of which implement POSIX awk.
The focus of this book is on writing scripts for sed and awk that quickly solve an assortment of problems for the user. Many of these scripts could be called "quick-fixes." In addition, we'll cover scripts that solve larger problems that require more careful design and development.
Chapter 1, "Power Tools for Editing", is an overview of the features and capabilities of sed and awk.
Chapter 2, "Understanding Basic Operations", demonstrates the basic operations of sed and awk, showing a progression in functionality from sed to awk. Both share a similar command-line syntax, accepting user instructions in the form of a script.
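As a rough illustration of that shared command-line form (a quoted script followed by the input file), here is a minimal sketch. The sample data, the scratch file, and the variable name `sample` are made up for this example and are not taken from the chapter.

```shell
# Hypothetical sample data, written to a scratch file.
sample=$(mktemp)
printf 'John Daggett, Boston, MA\nAlice Ford, Reston, VA\n' > "$sample"

# sed: the script is a single substitute command.
sed 's/MA/Massachusetts/' "$sample"

# awk: the script prints the first comma-separated field of each line.
awk -F', ' '{ print $1 }' "$sample"
```

In both cases the script is enclosed in single quotes so that the shell passes it to the program unchanged.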
Chapter 3, "Understanding Regular Expression Syntax", describes UNIX regular expression syntax in full detail. New users are often intimidated by these strange expressions, used for pattern matching. It is important to master regular expression syntax to get the most from sed and awk. The pattern-matching examples in this chapter largely rely on grep and egrep.
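For a first taste of what such expressions look like, the following sketch runs two patterns against a few made-up lines. The sample text and the variable name `lines` are assumptions for illustration. It uses `grep -E`, the POSIX equivalent of egrep, for the extended pattern.

```shell
# Hypothetical sample text in a scratch file.
lines=$(mktemp)
printf 'sed is a stream editor\nawk is a language\ngrep searches files\n' > "$lines"

# Basic regular expression: lines that begin with "s" or "a".
grep '^[sa]' "$lines"

# Extended regular expression: alternation between two whole words.
grep -E '^(sed|awk) ' "$lines"
```

Both commands select the first two lines and skip the third.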
Chapter 4, "Writing sed Scripts", begins a three-chapter section on sed. This chapter covers the basic elements of writing a sed script using only a few sed commands. It also presents a shell script that simplifies invoking sed scripts.
Chapter 5, "Basic sed Commands", and Chapter 6, "Advanced sed Commands", divide the sed command set into basic and advanced commands. The basic commands parallel manual editing actions, while the advanced commands introduce simple programming capabilities. Among the advanced commands are those that manipulate the hold space, a set-aside temporary buffer.
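To give a sense of what the hold space makes possible, here is a well-known idiom that prints its input in reverse line order. It is offered as a sketch and is not necessarily an example from these chapters. The three input lines and the variable name `reversed` are made up.

```shell
# 1!G  on every line but the first, append the hold space to the pattern space
# h    copy the pattern space (the lines so far, newest first) to the hold space
# $p   on the last line, print the accumulated result
reversed=$(printf 'one\ntwo\nthree\n' | sed -n -e '1!G' -e 'h' -e '$p')
printf '%s\n' "$reversed"
```

Each input line is placed in front of the lines saved so far, so the last line read ends up first in the output.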
Chapter 7, "Writing Scripts for awk", begins a five-chapter section on awk. This chapter presents the primary features of this scripting language. A number of scripts are explained, including one that modifies the output of the ls command.
Chapter 8, "Conditionals, Loops, and Arrays", describes how to use common programming constructs such as conditionals, loops, and arrays.
Chapter 9, "Functions", describes how to use awk's built-in functions as well as how to write user-defined functions.
Chapter 10, "The Bottom Drawer", covers a set of miscellaneous awk topics. It describes how to execute UNIX commands from an awk script and how to direct output to files and pipes. It then offers some (meager) advice on debugging awk scripts.
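A minimal sketch of the two kinds of output redirection mentioned above follows, assuming a POSIX awk that supports the `-v` option. The input data, the scratch file, and the names `big` and `sorted` are hypothetical.

```shell
# Scratch file that will receive records whose count exceeds 2.
big=$(mktemp)

sorted=$(printf 'pear 3\napple 5\nfig 2\n' |
  awk -v big="$big" '
    { print $1 | "sort" }        # send every name through the sort command
    $2 > 2 { print $0 > big }    # write matching records to a file
    END { close("sort") }        # close the pipe so sort can finish
  ')

printf '%s\n' "$sorted"   # names in sorted order
cat "$big"                # records with a count greater than 2
```

The command after `|` and the filename after `>` are ordinary string expressions, so either one can be built at run time.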
Chapter 11, "A Flock of awks", describes the original V7 version of awk, the current Bell Labs awk, GNU awk (gawk) from the Free Software Foundation, and mawk, by Michael Brennan. The last three all have freely available source code. This chapter also describes two commercial implementations, MKS awk and Thomson Automation awk (tawk), as well as VSAwk, which brings awk-like capabilities to the Visual Basic environment.
Chapter 12, "Full-Featured Applications", presents two longer, more complex awk scripts that together demonstrate nearly all the features of the language. The first script is an interactive spelling checker. The second script processes and formats the index for a book or a master index for a set of books.
Chapter 13, "A Miscellany of Scripts", presents a number of user-contributed scripts that show different styles and techniques of writing scripts for sed and awk.
Appendix A, "Quick Reference for sed", is a quick reference describing sed's commands and command-line options.
Appendix B, "Quick Reference for awk", is a quick reference to awk's command-line options and a full description of its scripting language.
Appendix C, "Supplement for Chapter 12", presents the full listings for the spellcheck.awk script and the masterindex shell script described in Chapter 12, "Full-Featured Applications".
Copyright © 2003 O'Reilly & Associates. All rights reserved.