Unix Power Tools

38.5. How to Make Backups to a Local Device

This article was written for Linux systems, but the advice applies everywhere. You may need to make some adjustments -- in the names of the tape drive devices and some filesystem directories, for instance. If you're making personal backups (of the files on your account, for instance), you can substitute your directory names for the system directories covered here, but the command names and techniques won't change.

38.5.1. What to Back Up

As Section 38.3 says, the simplest way to make a backup is to use tar to archive all the files on the system or only those files in a set of specific directories. Before you do this, however, you need to decide what files to back up. Do you need to back up every file on the system? This is rarely necessary, especially if you have your original installation disks or CD-ROM. If you have made specific, important changes to the system, but everything else could simply be reinstalled in case of a problem, you could get by archiving only those files you have made changes to. Over time, however, it is difficult to keep track of such changes.

In general, you will be making changes to the system configuration files in /etc. There are other configuration files as well, and it can't hurt to archive directories such as /usr/local (where various packages generally get installed) and /usr/X11R6/lib/X11 (which contains the X Window System configuration files). You may want to do filtering on these directories and back up only the configuration files, since binaries in /usr/local and things like fonts in the X11 distribution can be reinstalled from their original packages easily enough.
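The filtering idea can be sketched by letting find pick the files and handing the list to tar. This is only a sketch under stated assumptions: it treats files named *.conf and *.cfg as "configuration files" (a made-up rule you would adjust), it assumes a tar that accepts -T to read names from a list (GNU tar does), and it works on a throwaway sample tree rather than the real /usr/local so it is safe to run.

```sh
#!/bin/sh
# Sketch: archive only the configuration files from a directory tree.
# Assumptions (not from the article): "*.conf" and "*.cfg" identify
# configuration files, and tar accepts -T (GNU tar does).
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Build a small sample tree standing in for something like /usr/local.
mkdir -p "$workdir/local/etc" "$workdir/local/bin"
echo "port = 8080"  > "$workdir/local/etc/app.conf"
echo "color = blue" > "$workdir/local/etc/theme.cfg"
echo "fake binary"  > "$workdir/local/bin/app"

# List only the configuration files, then give the list to tar.
# A newline-separated list breaks on file names that contain newlines.
cd "$workdir"
find local -type f \( -name '*.conf' -o -name '*.cfg' \) -print > filelist
tar cf config-backup.tar -T filelist

# Show what was archived; local/bin/app should not appear.
tar tf config-backup.tar
```

Keeping the selection rule in find, separate from the tar command, makes it easy to check the list by eye before anything is written to backup media.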

You should also back up your kernel sources if you have patched them; these are found in /usr/src/linux (/usr/src/sys on *BSD). At the very least, you'll want to back up your kernel configuration file if you've built your own kernel; it's in /usr/src/linux/.config (or /usr/src/sys/platform/conf/KERNELNAME on *BSD).

It's a good idea to keep notes on what features of the system you've changed so you can make intelligent choices when making backups. If you're truly paranoid, go ahead and back up the whole system: that can't hurt, but the cost of backup media might.

Of course, you should also back up the home directories for each user on the system; these are generally found in /home. If you have your system configured to receive electronic mail, you might want to back up the incoming mail files for each user. Many people tend to keep old and "important" electronic mail in their incoming mail spool, and it's not difficult to accidentally corrupt one of these files through a mailer error or other mistake. These files are usually found in /var/spool/mail.

38.5.2. Backing Up to Tape

Assuming you know what files or directories to back up, you're ready to roll. The tar command can be used directly, as we saw in Section 39.2, to make a backup. For example, the command:

tar cvf /dev/rft0 /usr/src /etc /home

archives all of the files from /usr/src, /etc, and /home to /dev/rft0. /dev/rft0 is the first "floppy-tape" device -- that is, for the type of tape drive that hangs off of the floppy controller. Many popular tape drives for the PC use this interface. If you have a SCSI tape drive, the device names are /dev/st0, /dev/st1, and so on, based on the drive number. Those tape drives with another type of interface have their own device names; you can determine these by looking at the documentation for the device driver in the kernel.

You can then read the archive back from the tape using a command such as:

tar xvf /dev/rft0

This is exactly as if you were dealing with a tar file on disk, as in Section 39.2.
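Because tar treats a device and a disk file alike, you can rehearse the whole cycle on a scratch file before trusting a tape. The following is a minimal sketch, not a complete backup script: the ARCHIVE variable and the sample directories are invented for the illustration, and ARCHIVE defaults to a temporary file so nothing touches a real drive unless you point it at one.

```sh
#!/bin/sh
# Sketch: write, list, and restore an archive. ARCHIVE defaults to a scratch
# file so this runs without a tape drive; set ARCHIVE to a tape device such
# as /dev/st0 to use real hardware. The sample directories are hypothetical.
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
ARCHIVE=${ARCHIVE:-$workdir/backup.tar}

# Sample data standing in for /etc and /home.
mkdir -p "$workdir/src/etc" "$workdir/src/home/user" "$workdir/restore"
echo "hostname=example" > "$workdir/src/etc/sysconfig"
echo "my notes"         > "$workdir/src/home/user/notes.txt"

# c = create, v = verbose, f = archive file or device.
# -C changes directory first, so the stored names are relative.
tar cvf "$ARCHIVE" -C "$workdir/src" etc home

# t = list the contents without extracting anything.
tar tvf "$ARCHIVE"

# x = extract, here into a separate directory so originals are not overwritten.
tar xvf "$ARCHIVE" -C "$workdir/restore"

# Confirm that the restored copy matches the original.
diff -r "$workdir/src" "$workdir/restore" && echo "restore matches original"
```

Listing the archive with t and restoring into a separate directory are cheap ways to find out that a backup is unreadable while you still have the originals.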

When you use the tape drive, the tape is seen as a stream that may be read from or written to in one direction only. Once tar is done, the tape device will be closed, and the tape will rewind (if you're using the default tape device; see below on how to prevent this). You don't create a filesystem on a tape, nor do you mount it or attempt to access the data on it as files. You simply treat the tape device itself as a single "file" to create or extract archives from.

If your tape drive requires formatted tapes, be sure to format them before use. Formatting ensures that the beginning-of-tape marker and bad-block information have been written to the tape. At the time of this writing, no tools exist for formatting QIC-80 tapes (those used with floppy-tape drives) under Linux; you'll have to format tapes under MS-DOS or use preformatted tapes.

Creating one tar file per tape might be wasteful if the archive requires a fraction of the capacity of the tape. To place more than one file on a tape, you must first prevent the tape from rewinding after each use, and you must have a way to position the tape to the next "file marker," both for tar file creation and for extraction.

The way to do this is to use the nonrewinding tape devices, which are named /dev/nrft0, /dev/nrft1, and so on for floppy-tape drives, and /dev/nrst0, /dev/nrst1, and so on for SCSI tapes (many Linux systems name the latter /dev/nst0, /dev/nst1, and so on). When one of these devices is used for reading or writing, the tape will not be rewound when the device is closed (that is, once tar has completed). You can then use tar again to add another archive to the tape. The two tar files on the tape won't have anything to do with each other. Of course, if you later overwrite the first tar file, you may overwrite the second file or leave an undesirable gap between the first and second files (which may be interpreted as garbage). In general, don't attempt to replace just one file on a tape that has multiple files on it.

Using the nonrewinding tape device, you can add as many files to the tape as space permits. To rewind the tape after use, use the mt command. mt is a general-purpose command that performs a number of functions with the tape drive. For example, the command:

mt -f /dev/nrft0 rewind

rewinds the tape in the first floppy-tape device. (You can name the corresponding rewinding tape device here as well; in that case, though, the tape rewinds simply as a side effect of the device being closed.)
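Putting the pieces together, writing two archives to one tape and then rewinding might look like the sketch below. The names NRTAPE, DRY_RUN, run, and write_two_archives are invented for this illustration. By default the script only prints the commands it would run, so it is safe to try on a machine with no tape drive; the device name follows this article and may differ on your system.

```sh
#!/bin/sh
# Sketch: put two independent archives on one tape, then rewind.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
# NRTAPE is the nonrewinding device; the name below follows the article.
NRTAPE=${NRTAPE:-/dev/nrst0}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

write_two_archives() {
    run tar cvf "$NRTAPE" /etc      # first archive; the tape stays in place
    run tar cvf "$NRTAPE" /home     # second archive follows the first
    run mt -f "$NRTAPE" rewind      # rewind explicitly when finished
}

write_two_archives
```

Set DRY_RUN=0 only after checking that NRTAPE really names your nonrewinding device; with the rewinding device, the second tar would overwrite the first archive.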

Similarly, the command:

mt -f /dev/nrft0 reten

retensions the tape by winding it to the end and then rewinding it.

When reading files on a multiple-file tape, you must use the nonrewinding tape device with tar and the mt command to position the tape to the appropriate file.

For example, to skip to the next file on the tape, use the command:

mt -f /dev/nrft0 fsf 1

This skips over one file on the tape. Similarly, to skip over two files, use:

mt -f /dev/nrft0 fsf 2

Be sure to use the appropriate nonrewinding tape device with mt. Note that this command does not move to "file number two" on the tape; it skips over the next two files based on the current tape position. Just use mt to rewind the tape if you're not sure where the tape is currently positioned. You can also skip back; see the mt manual page for a complete list of options.

You need to use mt every time you read a multifile tape. Using tar twice in succession to read two archive files usually won't work; this is because tar doesn't recognize the file marker placed on the tape between files. Once the first tar finishes, the tape is positioned at the beginning of the file marker. Using tar immediately will give you an error message, because tar will attempt to read the file marker. After reading one file from a tape, just use:

mt -f device fsf 1

to move to the next file.
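Since fsf counts from the current position, one cautious habit is to rewind first and then skip a known number of files. Here is a minimal sketch of that habit; read_nth_archive, run, NRTAPE, and DRY_RUN are hypothetical names, and the script prints its commands instead of running them unless you set DRY_RUN=0.

```sh
#!/bin/sh
# Sketch: extract the Nth archive (counting from 1) on a multifile tape.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
NRTAPE=${NRTAPE:-/dev/nrst0}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

read_nth_archive() {
    n=$1
    run mt -f "$NRTAPE" rewind                # start from a known position
    if [ "$n" -gt 1 ]; then
        run mt -f "$NRTAPE" fsf $((n - 1))    # skip the archives before it
    fi
    run tar xvf "$NRTAPE"                     # extract the archive found there
}

read_nth_archive 3
```

Rewinding every time costs some winding on a long tape, but it removes any doubt about where the tape was left by the previous command.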

38.5.3. Backing Up to Floppies or Zip Disks

Just as we saw in the last section, the command:

tar cvf /dev/fd0 /usr/src /etc /home

makes a backup of /usr/src, /etc, and /home to /dev/fd0, the first floppy device. You can then read the backup using a command such as:

tar xvf /dev/fd0

If we use /dev/hdd instead of /dev/fd0 (and our Zip drive is the slave drive on the second IDE controller), we'll be writing to and reading from a Zip disk instead of a floppy. (Your device name may vary depending on your OS.) Because floppies and Zip disks have a rather limited storage capacity, GNU tar allows you to create a "multivolume" archive. (This feature applies to tapes as well, but it is far more useful in the case of smaller media.) With this feature, tar prompts you to insert a new volume after reading or writing each disk. To use this feature, simply provide the M option to tar, as in:

tar cvMf /dev/fd0 /usr/src /etc /home

Be sure to label your disks well, and don't get them out of order when attempting to restore the archive.

One caveat of this feature is that it doesn't support the automatic gzip compression provided by the z option. However, there are various reasons why you may not want to compress your backups created with tar, as discussed in the next section. At any rate, you can create your own multivolume backups using tar and gzip in conjunction with a program that reads and writes data to a sequence of disks (or tapes), prompting for each in succession. One such program is backflops, available on several Linux distributions and on the FTP archive sites. A do-it-yourself way to accomplish the same thing would be to write the backup archive to a disk file and use dd or a similar command to write the archive as individual chunks to each disk. If you're brave enough to try this, you can figure it out for yourself. [Aw, come on, guys, have a heart! (Psst, readers: look at the end of Section 21.9.) -- JP]
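For the do-it-yourself route, here is one possible sketch. It uses split rather than dd to cut the compressed archive into pieces, which is a substitution of ours and not necessarily what the authors had in mind. The CHUNK size is deliberately tiny so the demonstration produces several pieces from a small sample; for 1.44 MB floppies you would use something like 1440k. All file and directory names are invented.

```sh
#!/bin/sh
# Sketch: make a compressed backup, cut it into disk-sized chunks, and
# reassemble it. Uses split(1) in place of dd; CHUNK is tiny for the demo.
set -e
CHUNK=${CHUNK:-4k}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/data" "$workdir/chunks" "$workdir/restore"

# Sample data: random bytes compress poorly, so several chunks result.
head -c 20000 /dev/urandom > "$workdir/data/random.bin"
echo "hello" > "$workdir/data/readme.txt"

# Archive, compress, and cut into pieces named chunk.aa, chunk.ab, ...
tar cf - -C "$workdir" data | gzip | split -b "$CHUNK" - "$workdir/chunks/chunk."
ls -l "$workdir/chunks"

# In real use, each piece would now be copied to its own disk, for example:
#   dd if=chunk.aa of=/dev/fd0
# When reading a piece back from a raw device, limit dd to the piece's size,
# because the device is usually larger than the last chunk.

# To restore, concatenate the pieces in order and reverse the pipeline.
cat "$workdir"/chunks/chunk.* | gunzip | tar xf - -C "$workdir/restore"

cmp "$workdir/data/random.bin" "$workdir/restore/data/random.bin" &&
    echo "restored copy is identical"
```

The pieces are useless individually: a single lost or damaged chunk breaks the gzip stream from that point on, which is exactly the trade-off weighed in the next section.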

38.5.4. To gzip, or Not to gzip?

There are good arguments both for and against compression of tar archives when making backups. The overall problem is that neither tar nor gzip is particularly fault-tolerant, no matter how convenient they are. Although compression using gzip can greatly reduce the amount of backup media required to store an archive, compressing entire tar files as they are written to floppy or tape makes the backup prone to complete loss if one block of the archive is corrupted, say, through a media error (not uncommon in the case of floppies and tapes). Most compression algorithms, gzip included, depend on the coherency of data across many bytes to achieve compression. If any data within a compressed archive is corrupt, gunzip may not be able to uncompress the file at all, making it completely unreadable to tar. The same applies to bzip2. It may compress things better than gzip, but it has the same lack of fault-tolerance.

This is much worse than if the tar file were uncompressed on the tape. Although tar doesn't provide much protection against data corruption within an archive, if there is minimal corruption within a tar file, you can usually recover most of the archived files with little trouble, or at least those files up until the corruption occurs. Although far from perfect, it's better than losing your entire backup.
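You can try this comparison yourself on scratch files. The sketch below damages a single byte in a plain tar file and in a gzipped copy of it, then checks what survives. Everything in it (the file names, the flip_byte helper, the chosen offsets) is made up for the demonstration, and the exact error gzip reports will depend on your version.

```sh
#!/bin/sh
# Sketch: damage one byte in a plain tar file and in a gzipped copy of it,
# then see what each tool can still deliver. All names here are made up.
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"
mkdir restored

# Three sample text files of roughly 20 KB each.
for name in a b c; do
    awk -v n="$name" 'BEGIN { for (i = 1; i <= 2000; i++) print n, "line", i }' > "$name.txt"
done
tar cf plain.tar a.txt b.txt c.txt
gzip -c plain.tar > packed.tar.gz

# Replace the byte at offset $2 of file $1 with a different value,
# simulating a media error.
flip_byte() {
    old=$(dd if="$1" bs=1 skip="$2" count=1 2>/dev/null | od -An -tu1 | tr -d ' ')
    if [ "$old" = 88 ]; then new=Y; else new=X; fi    # 88 is ASCII "X"
    printf '%s' "$new" | dd of="$1" bs=1 seek="$2" conv=notrunc 2>/dev/null
}

# Offset 1000 lies inside the data of a.txt (its header occupies bytes 0-511).
flip_byte plain.tar 1000
# For the compressed copy, hit a byte in the middle of the compressed stream.
flip_byte packed.tar.gz $(( $(wc -c < packed.tar.gz) / 2 ))

# Plain tar: extraction still runs; only the damaged file differs.
tar xf plain.tar -C restored
for f in a.txt b.txt c.txt; do
    if cmp -s "$f" "restored/$f"; then echo "$f: intact"; else echo "$f: damaged"; fi
done

# Compressed tar: gzip's integrity test says whether the stream is usable.
if gzip -t packed.tar.gz 2>/dev/null; then
    echo "packed.tar.gz: passed gzip -t"
else
    echo "packed.tar.gz: failed gzip -t"
fi
```

In the plain archive only a.txt should come back damaged, while b.txt and c.txt are untouched. The compressed copy is expected to fail the integrity test, and anything stored after the damaged point is at risk.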

A better solution would be to use an archiving tool other than tar to make backups. There are several options available. cpio (Section 38.13) is an archiving utility that packs files together, much like tar. However, because of the simpler storage method used by cpio, it recovers cleanly from data corruption in an archive. (It still doesn't handle errors well on gzipped files.)

The best solution may be to use a tool such as afio. afio supports multivolume backups and is similar in some respects to cpio. However, afio includes compression and is more reliable because each individual file is compressed separately. This means that if data in an archive is corrupted, the damage can be isolated to individual files, instead of to the entire backup.

These tools should be available with your Linux distribution, as well as from all of the Internet-based Linux archives. A number of other backup utilities, with varying degrees of popularity and usability, have been developed or ported for Linux. If you're serious about backups, you should look into them.[122]

[122]Of course, this section was written after the author took the first backup of his Linux system in nearly four years of use!

--MW, MKD, and LK




Copyright © 2003 O'Reilly & Associates. All rights reserved.