[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Off-topic]cvs hangs due a dangling nfs.

Rick Stevens wrote, On 10/30/2008 07:04 PM:
Marcelo M. Garcia wrote:

I use automount to share a partition via NFS so every machine can mount this share. For example, the machine "abc" has a partition /abc and the other machines mount it as /users/abc.

One of the machines was removed (retired), and now a few clients can't use the CVS server. After the command "cvs history...", it simply sits and waits.

Please tell us that your $CVSROOT contains :pserver:|:ext:|:extssh: and not a :fork:|:local:|direct pointer to the file system...

because using CVS across a Network File System (including smb|cifs) has been a recipe for corrupted repository data for a very long time.
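A quick way to check which access method a checkout is using -- just a sketch; the :pserver: root in the test below is a placeholder, not your actual root:

```shell
#!/bin/sh
# Each checkout records its root in CVS/Root; $CVSROOT is only a fallback.
root=$(cat CVS/Root 2>/dev/null || echo "$CVSROOT")

case "$root" in
    :pserver:*|:ext:*|:extssh:*)
        echo "network access method -- OK" ;;
    *)
        echo "WARNING: :local:/:fork:/direct path -- risky over NFS" ;;
esac
```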

If you do use cvs across nfs, may I suggest you investigate the use of verify_repo running on whatever machine now physically hosts the repository.

verify_repo	A perl script to check an entire repository for corruption.
	        Contributed by Donald Sharp <sharpd cisco com>.

I know that the problem is in accessing the retired machine because "strace cvs history" shows it hanging there.

Is there a way to find where this reference to the old machine is? Something like /etc/mtab?
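A sketch of where I would look, assuming the retired machine was called "abc" (substitute the real hostname):

```shell
#!/bin/sh
HOST=abc   # hypothetical name of the retired machine

# What the kernel currently thinks is mounted (/etc/mtab may be stale):
grep "$HOST" /proc/mounts /etc/mtab 2>/dev/null || true

# Automount maps are the usual place a dead host is still referenced:
grep -r "$HOST" /etc/auto.master /etc/auto.* 2>/dev/null || true

# CVS checkouts also record the server in their CVS/Root files:
grep -r --include=Root "$HOST" . 2>/dev/null || true
```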

Part II (Really Off-topic)
Is there a way to find which machines are client of a NFS share? So that before the shutdown I could umount in the clients?

"showmount -d" or "showmount -a" on the NFS server should give you a
list of the clients that have made a mount request.  There's no
guarantee that they STILL have a valid mount...if a client doesn't issue
an umount command (e.g. simply goes tits up), the server will still
think there's a valid mount.  See "man showmount" for details.
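Run on the NFS server; a sketch only -- the guard just skips the commands on machines where showmount isn't installed:

```shell
#!/bin/sh
# List clients that have sent mount requests to this server.  Entries
# stay until the client umounts, so a crashed client still shows up.
if command -v showmount >/dev/null 2>&1; then
    showmount -a || true    # "client:/export" pairs
    showmount -d || true    # exported directories with at least one client
fi
```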

You should make sure the clients mount "soft".  That way if the server
goes away, any request made by the client should eventually time out.
If they do a hard mount, the request will hang until the server comes back up.

I would like to respectfully disagree with the suggestion to use the soft mount option, or at least suggest only using it if you understand and can live with the implications.
*"soft           If an NFS file operation has a major timeout then report
                an  I/O error to the calling program."
and some programs don't deal with that error well; by that I mean I have seen**:

- cp and tar, when writing large (>1MB) files across the net, introduce multiple faults into the file being copied (i.e. the md5|sha1 sums do NOT match afterwards) without issuing any fault message at all.
- GNOME and Firefox (on account initialization) create a sort-of file&directory&link combination (it was supposed to be a directory to hold files, but it turned into a monster NODE that was interesting to remove from the server even AS root).

These kinds of errors stopped immediately after changing clients to hard,intr, and would return if we changed back to soft.

The better option most times is hard,intr, which will "continue retrying indefinitely" but "allow signals to interrupt the file operation and cause it to return EINTR to the calling program."

*see man nfs for quoted material. :)
**on a network with over 30 folks using nfs for home directories and project work, plus other network traffic at moderate to high volume (10 to 75% of 100Mb links).
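For what it's worth, on the clients that would look something like the following (server and export names are hypothetical; with automount the options go in the map file rather than fstab):

```
# /etc/fstab entry:
# server:export      mountpoint   type  options        dump pass
abc:/abc             /users/abc   nfs   rw,hard,intr   0    0

# or the equivalent autofs map entry (e.g. in /etc/auto.users):
abc   -rw,hard,intr   abc:/abc
```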

Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter
