Unix Power Tools

Chapter 16. Spell Checking, Word Counting, and Textual Analysis

Contents:

The Unix spell Command
Check Spelling Interactively with ispell
How Do I Spell That Word?
Inside spell
Adding Words to ispell's Dictionary
Counting Lines, Words, and Characters: wc
Find a a Doubled Word
Looking for Closure
Just the Words, Please

16.1. The Unix spell Command

On some Unix systems, the spell command reads one or more files and prints a list of words that may be misspelled. You can redirect the output to a file, use grep (Section 13.1) to locate each of the words, and then use vi or ex to make the edits. It's also possible to hack up a shell and sed script that interactively displays the misspellings and fixes them on command, but realistically, this is too tedious for most users. (The ispell (Section 16.2) program solves many -- though not all -- of these problems.)
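The redirect-then-grep workflow described above might look something like the following sketch. The function name locate_words and the file names in the comments are hypothetical, chosen only for illustration. The -w (whole word) option is widely supported by grep but is not guaranteed on every system, so treat it as an assumption.

```shell
#!/bin/sh
# locate_words WORDLIST FILE...
# Show each line (with its line number) that contains one of the
# words listed, one per line, in WORDLIST.
locate_words() {
    wordlist=$1
    shift
    # -F: treat each word as a fixed string, not a regular expression
    # -w: match whole words only (assumed to be supported by this grep)
    # -n: prefix each matching line with its line number
    # -f: read the search words from the file
    grep -n -w -F -f "$wordlist" "$@"
}

# Possible use on a system that has spell (file names are examples):
#   spell chapter1 > chapter1.words
#   locate_words chapter1.words chapter1
```

The line numbers in the output can then be used to jump to each spot in vi or ex.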

When you run spell on a file, the list of words it produces usually includes a number of legitimate words or terms that the program does not recognize. spell is case sensitive; it's happy with Aaron but complains about aaron. You must cull out the proper nouns and other words spell doesn't know about to arrive at a list of true misspellings. For instance, look at the results on this sample sentence:

$ cat sample
Alcuin uses TranScript to convert ditroff into
PostScript output for the LaserWriter printerr.
$ spell sample
Alcuin
ditroff
printerr
LaserWriter
PostScript
TranScript

Only one word in this list is actually misspelled.

On many Unix systems, you can supply a local dictionary file so that spell recognizes special words and terms specific to your site or application. After you have run spell and looked through the word list, you can create a file containing the words that were not actual misspellings. The spell command will check this list after it has gone through its own dictionary. On certain systems, your word-list file must be sorted (Section 22.1).

If you put the special terms in a file named dict, you could specify that file on the command line using the + option:

$ spell +dict sample
printerr

The output is reduced to the single misspelling.
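One way to build such a local dictionary is sketched below. It assumes you have saved spell's output in one file and listed the true misspellings in another; the function name make_dict and the file names flagged and typos are hypothetical. Sorting in the C locale is also an assumption, since the ordering a particular spell expects may differ from system to system.

```shell
#!/bin/sh
# make_dict FLAGGED TYPOS
# Print the words from FLAGGED that are not listed in TYPOS,
# sorted and with duplicates removed, for use as a local dictionary.
make_dict() {
    # -v: keep lines that do NOT match
    # -x: a word must match the entire line
    # -F: fixed strings; -f: read the words to drop from the TYPOS file
    grep -v -x -F -f "$2" "$1" | LC_ALL=C sort -u
}

# Possible use on a system that has spell (file names are examples):
#   spell sample > flagged        # everything spell complains about
#   echo printerr > typos         # the words that really are misspelled
#   make_dict flagged typos > dict
#   spell +dict sample
```

Keeping the true misspellings out of dict matters: any word in the local dictionary is accepted silently from then on.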

The spell command will make some errors based on incorrect derivation of spellings from the root words contained in its dictionary. If you understand how spell works (Section 16.4), you may be less surprised by some of these errors.

As noted at the beginning, spell isn't on all Unix systems; Darwin and FreeBSD, for example, don't include it. In these environments, check for an alternative spell checker, such as ispell (Section 16.2), or download and install the GNU version of spell from http://www.gnu.org/directory/spell.html.
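A quick way to see which checker, if any, is installed is to probe for each candidate in turn. The sketch below uses the standard command -v test; the function name first_available is hypothetical, and the list of programs to try is up to you.

```shell
#!/bin/sh
# first_available NAME...
# Print the first NAME that the shell can find as a command.
# Returns nonzero (and prints nothing) if none is found.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Possible use:
#   checker=$(first_available spell ispell) || echo "no spell checker found"
```

Listing spell first means the traditional command is preferred when both are present; reorder the arguments to change that.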

--DD and SP




Copyright © 2003 O'Reilly & Associates. All rights reserved.