NFS stale-filehandles

Roger Heflin rheflin at atipa.com
Fri Dec 15 17:23:30 UTC 2006


T. Horsnell wrote:
> FC6, NFS V3
> -----------
> 
> I'm trying to get to the state where
>  
> 1. I can have a bunch of NFS clients which have mounted
>    filesystems exported by a fileserver,
> 
> 2. The fileserver is stopped, the (SCSI) disks containing
>    the filesystems are moved to different positions on the SCSI bus,
>    and the fileserver is rebooted.
> 
> 3. The clients can still continue to access the correct filesystems,
>    without having to umount/re-mount anything, even though their position
>    on the server's SCSI bus has changed. 
> 
> The clients are members of a compute farm. They mount the NFS 
> filesystems with the 'hard' option so that their NFS requests 
> stall if the server is off for any reason, and so far this works
> well. However, if the server disks are moved, there are problems.
> 
> 
> I started off mounting the server filesystems by label, with lines
> in /etc/fstab  like:
> 
> LABEL=/filesys1		/filesystem1	ext3	defaults	0 2
> 
> This gave me pseudo-persistence in that wherever the disks were on
> the SCSI bus, they were always mounted on the correct mountpoint.
> With this setup, the underlying device (/dev/sda /dev/sdb etc)
> as shown by 'mount -t ext3' changed when the disk position changed.
> I then discovered that if I swapped two disks over on the server
> while those filesystems were still NFS-mounted by a client, the
> client didn't notice the swap, but continued to access the disks
> in the unswapped position, and hence access the wrong filesystem.
> There were no NFS complaints, just complaints from users.
> 
> 
> I'm now using a bunch of udev rules to give me device-name persistence
> instead of relying on the partition label, and I have lines in fstab like:
> 
> /dev/dsk0_1		/filesystem1	ext3	defaults	0 2
> 
> Now, wherever I shift a disk to on the server SCSI bus, the underlying
> device-name stays constant, but the client objects with a 'stale NFS filehandle'
> error when the disk-position is changed, and I have to umount and remount at
> the client. It's a slight improvement in that user-processes on the client
> can't inadvertently use the wrong filesystem, but I would much prefer it to
> be transparent. Is this possible with NFS?
> 
> Cheers,
> Terry
> 

Yep.

A client identifies a filesystem by the server's IP address and by its
fsid (see man exports). The default generated fsid is based on the
underlying device id, so if the device id changes (the disk is moved on
the SCSI bus) then the fsid changes, and from the client's point of view
the filesystem is no longer there. You can also get odd failures: if one
filesystem's "new" fsid matches another filesystem's "old" fsid you will
get interesting bad results, i.e. /home will claim to be /home on a
client but the directories will look like /opt. With large fsids a
collision is less likely, but I have seen it happen.
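
If you want to watch the device id move around, a rough Python sketch
like this prints the major/minor device numbers behind a mount point.
I am assuming these are the numbers the default fsid is derived from;
the function name is made up, and "/" is only there so the script runs
anywhere.

```python
import os


def device_numbers(path):
    """Return (major, minor) of the device holding the filesystem at path."""
    st_dev = os.stat(path).st_dev
    return os.major(st_dev), os.minor(st_dev)


if __name__ == "__main__":
    # Replace "/" with the exported mount points, e.g. /filesystem1.
    for mountpoint in ("/",):
        print(mountpoint, device_numbers(mountpoint))
```

Run it on the server before and after moving a disk. If the numbers
for a mount point differ, clients holding old filehandles for that
export will most likely see it as stale.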

You can set the fsid explicitly in the exports file (the fsid= option).
Note that you can hit this same issue with LVM, just by changing the
order in which the VGs are activated (the first gets id 1, the second
gets 2, and changing the order changes this). The fsids need to be
unique across a given server.
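
As an illustration only, here is a small Python sketch that holds a
sample exports file with explicit fsid= values and checks that no two
export paths share one. The paths, the "*" client spec, the option list
and the helper name are all made up for the example, and the parser is
deliberately naive (it does not handle backslash line continuations).

```python
import re
from collections import defaultdict

# Hypothetical /etc/exports content. Small integers are used for fsid;
# 0 is avoided because fsid=0 marks the NFSv4 root export.
SAMPLE_EXPORTS = """\
/filesystem1  *(rw,sync,fsid=1)
/filesystem2  *(rw,sync,fsid=2)
/filesystem3  *(rw,sync,fsid=2)
"""

FSID_RE = re.compile(r"fsid=([^,)\s]+)")


def duplicate_fsids(exports_text):
    """Return {fsid: [paths]} for every fsid used by more than one path."""
    paths_by_fsid = defaultdict(set)
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        path = line.split()[0]
        for fsid in FSID_RE.findall(line):
            paths_by_fsid[fsid].add(path)
    return {f: sorted(p) for f, p in paths_by_fsid.items() if len(p) > 1}


if __name__ == "__main__":
    # The third sample line reuses fsid=2 on purpose.
    print(duplicate_fsids(SAMPLE_EXPORTS))
    # prints {'2': ['/filesystem2', '/filesystem3']}
```

With the fsid pinned like this, it should no longer depend on where the
disk sits on the bus. After editing the real exports file, re-export
(exportfs -ra) for the change to take effect.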

                                  Roger



