[11] Not all implementations of NFS have this duplicate request cache. Current releases of Solaris, Compaq's Tru64 Unix, and other modern operating systems implement the cache to improve the performance and "correctness" of NFS. A few older implementations of NFS do not reject duplicate, nonidempotent requests, which produces strange and often incorrect results when requests are retransmitted. An NFS client that sends the same remove operation to such a server may find that the designated file was removed, but the RPC call returns the "No such file or directory" error.
When you type:

    % rm foo

on an NFS client, the client may need to send two or more remove requests to the NFS server before it receives an acknowledgment. It's up to the NFS server to weed out the duplicate remove requests, even if they are a second or so apart. However, if you execute rm foo on Monday, and then on Tuesday you execute the same command in the same directory (where the file has already been removed), you would be very surprised if rm did not return an error. Executing this "duplicate request" a day later should produce this familiar error:

    % rm foo
    rm: foo: No such file or directory

To distinguish between duplicates generated due to an RPC timeout and retry and duplicates due to you repeating a command (whether it be a day later or a second later), NFS servers record a 32-bit RPC transaction identifier (xid) with each entry in the duplicate request cache. The xid is part of every RPC request's header, and it is expected that the NFS client will generate unique xids.
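To make the xid check concrete, here is a minimal sketch, in C, of a duplicate request cache keyed on the client's address and the request's xid. It is illustrative only: the names (drc_entry, drc_check, drc_insert), the fixed-size table, and the procedure number used below are invented for this example and are not taken from any real NFS server.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* One slot per recently handled nonidempotent request, keyed by the
 * client's address and the RPC transaction id (xid). */
struct drc_entry {
    uint32_t xid;          /* xid from the RPC request header */
    char     client[64];   /* client's transport address, as a string here */
    uint32_t proc;         /* NFS procedure number */
    int      status;       /* result returned the first time */
    time_t   stamp;        /* when the reply was cached */
    int      in_use;
};

#define DRC_SLOTS 128
static struct drc_entry drc[DRC_SLOTS];

/* Return 1 and the cached status if this (client, xid, proc) was already
 * handled; otherwise return 0. A real server also ages entries out. */
static int drc_check(const char *client, uint32_t xid, uint32_t proc,
                     int *cached_status)
{
    for (int i = 0; i < DRC_SLOTS; i++) {
        if (drc[i].in_use && drc[i].xid == xid && drc[i].proc == proc &&
            strcmp(drc[i].client, client) == 0) {
            *cached_status = drc[i].status;   /* duplicate: replay old reply */
            return 1;
        }
    }
    return 0;
}

static void drc_insert(const char *client, uint32_t xid, uint32_t proc,
                       int status)
{
    static unsigned next;
    struct drc_entry *e = &drc[next++ % DRC_SLOTS];
    e->xid = xid;
    snprintf(e->client, sizeof(e->client), "%s", client);
    e->proc = proc;
    e->status = status;
    e->stamp = time(NULL);
    e->in_use = 1;
}

int main(void)
{
    int status;

    /* First remove request: not cached, so the server would execute it. */
    if (!drc_check("client-a", 0x1234, 12, &status)) {
        status = 0;                           /* pretend the unlink worked */
        drc_insert("client-a", 0x1234, 12, status);
    }
    printf("first request: status %d\n", status);

    /* A retransmission carries the same xid, so the cached reply is
     * replayed instead of failing with "No such file or directory". */
    if (drc_check("client-a", 0x1234, 12, &status))
        printf("retransmission: replayed status %d\n", status);

    /* Running rm again the next day generates a new xid, so the server
     * really does attempt the remove and reports the error. */
    if (!drc_check("client-a", 0x9abc, 12, &status))
        printf("new xid: request executed again\n");
    return 0;
}
```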
The lookup operation takes a filename and a filehandle for a directory, and returns a filehandle pointing to the named file on the server. How then does the pathname traversal get started, if every lookup requires a filehandle from a previous pathname resolution? The mount operation seeds the lookup process by providing a filehandle for the root of the mounted filesystem. Within NFS, the only procedure that accepts full pathnames is the mount RPC, which turns the pathname into a filehandle for the mounted filesystem.

Let's look at how NFS turns the pathname /usr/local/bin/emacs into an NFS filehandle, assuming that it's on a filesystem mounted on /usr/local from server wahoo. If the client's vfstab entry for this filesystem is:

    wahoo:/tools/local - /usr/local nfs - yes ro,hard

then the client will ask wahoo for a filehandle for the /tools/local directory.[12] Starting from that filehandle, the client resolves the remaining pathname one component at a time: it sends a lookup for bin using the filehandle for /usr/local, and then a lookup for emacs using the filehandle returned for bin.

The client, not the server, walks through the pathname, because any component in the path may be a mount point for another filesystem. For example, suppose Client A has mounted:

    clientA# mount server1:/usr/local /usr/local
    clientA# mount server2:/usr/local/bin.mips /usr/local/bin

When the NFS client reaches the bin component in the pathname, it realizes that there is an NFS filesystem mounted on this directory, and it sends its lookup requests to server2 instead of server1. If the NFS client passed the whole pathname to server1, it might get the wrong answer on its lookup: server1 has its own /usr/local/bin directory that may or may not be the same directory that Client A has mounted.

While this may seem to be a very expensive series of operations, the kernel keeps a directory name lookup cache (DNLC) that prevents every lookup request from going to an NFS server.
[12] Asking the mountd daemon isn't the only way to get the filehandle for a filesystem. Recall that Chapter 6, "System Administration Using the Network File System", briefly mentioned the public option to the mount command. We will discuss this in more detail in Chapter 12, "Network Security".
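To illustrate the traversal just described, here is a short sketch, in C, of the component-by-component lookup loop. The fhandle structure and the nfs_lookup() function are stand-ins invented for this example; a real client sends LOOKUP RPCs over the wire and consults its mount table and the DNLC at each step.

```c
#include <stdio.h>
#include <string.h>

/* Toy filehandle: real NFS filehandles are opaque byte strings that encode
 * the filesystem, the inode number, and the inode generation number. */
struct fhandle {
    char tag[128];
};

/* Stand-in for the LOOKUP RPC: takes a directory filehandle and a single
 * component name, and returns the filehandle for that component. Here it
 * just builds a readable tag so the traversal can be printed. */
static struct fhandle nfs_lookup(struct fhandle dir, const char *name)
{
    struct fhandle fh;
    snprintf(fh.tag, sizeof(fh.tag), "%s/%s", dir.tag, name);
    return fh;
}

int main(void)
{
    /* The mount RPC seeds the process: it maps the exported pathname to a
     * filehandle for the root of the mounted filesystem (/usr/local here). */
    struct fhandle fh = { "fh(wahoo:/tools/local)" };

    /* Remaining components of /usr/local/bin/emacs, relative to the mount
     * point /usr/local. The client resolves them one at a time. */
    char path[] = "bin/emacs";
    for (char *comp = strtok(path, "/"); comp; comp = strtok(NULL, "/")) {
        fh = nfs_lookup(fh, comp);
        printf("lookup(%s) -> %s\n", comp, fh.tag);
    }

    /* fh now identifies emacs; subsequent read requests use it directly. */
    return 0;
}
```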
Client A | Client B |
---|---|
cd /mnt/test | |
 | cd /mnt |
 | rm -rf test |
stat(.) --> Stale file handle | |
If one client removes a file and then creates a new file that re-uses the freed inode, other filehandles (on other clients) that point to the re-used inode must be marked stale. Inode generation numbers were added to the basic Unix filesystem to add a time history to an inode. In addition to the inode number, the filehandle must match the current generation number of the inode, or it is marked stale. When the inode is re-used for a new file, its generation number is incremented. Stale filehandles become a problem when one user's work tramples on an area in use by another, or when a filesystem on a server is rebuilt from a backup tape. When restoring from a dump tape onto a fresh filesystem, all of the inode generation numbers in the filesystem are set to random numbers. This causes every filehandle in use for that filesystem to become stale -- every inode pointed to by a pre-restore filehandle now probably points to a completely different file on the disk.
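The generation-number check is easy to picture in code. The following sketch, in C, is illustrative only; the toy_inode and toy_fhandle structures and the fh_to_inode() function are invented for this example, but they mirror the idea: a filehandle records the inode number and the generation in force when the handle was issued, and a mismatch produces the stale filehandle (ESTALE) error.

```c
#include <stdio.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative inode and client-held filehandle. Real filehandles are
 * opaque, but they encode roughly this information. */
struct toy_inode {
    uint32_t number;      /* inode number */
    uint32_t generation;  /* bumped each time the inode is re-used */
    int      allocated;
};

struct toy_fhandle {
    uint32_t inode_number;
    uint32_t generation;  /* generation at the time the handle was issued */
};

/* Server-side check: the handle is valid only if it names an allocated
 * inode whose current generation matches the one stored in the handle. */
static int fh_to_inode(const struct toy_fhandle *fh,
                       struct toy_inode *itable, size_t n,
                       struct toy_inode **out)
{
    for (size_t i = 0; i < n; i++) {
        if (itable[i].number == fh->inode_number) {
            if (!itable[i].allocated ||
                itable[i].generation != fh->generation)
                return -ESTALE;              /* stale filehandle */
            *out = &itable[i];
            return 0;
        }
    }
    return -ESTALE;
}

int main(void)
{
    struct toy_inode   itable[] = { { 1432, 7, 1 } };
    struct toy_fhandle fh = { 1432, 7 };     /* handle held by a client */
    struct toy_inode  *ip;

    printf("before re-use: %s\n",
           fh_to_inode(&fh, itable, 1, &ip) == 0 ? "ok" : "stale");

    /* Another client removes the file and creates a new one that re-uses
     * the freed inode; the generation number is incremented. */
    itable[0].generation++;

    printf("after re-use:  %s\n",
           fh_to_inode(&fh, itable, 1, &ip) == 0 ? "ok" : "stale");
    return 0;
}
```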
Therefore, a quick way to cripple an NFS network is to restore a fileserver from a dump tape without rebooting the NFS clients. When you rebuild the server's filesystems, all of the inode generation numbers are reset; when you load the tape, files end up with different inode numbers and different inode generation numbers than they had on the original filesystem. All NFS client filehandles are now invalid because of the new generation numbers and the (random) renumbering of each file's inode. Any attempt to use an open filehandle results in stale filehandle errors. If you are going to restore an NFS-exported filesystem from tape, unmount it from its clients or reboot the clients.