[Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer dereference

Colin Simpson Colin.Simpson at iongeo.com
Fri Jul 8 18:36:45 UTC 2011


Very interesting. 

Certainly in our application it would be highly unlikely that Samba and
NFS would try to write to the same file simultaneously; that would very
much be an edge case (and users would know the result would be
undefined). I can personally live with that level of potential file
corruption, though I can see others may not.

But I guess you are also telling me that file locking between the two
wouldn't be helping here either? (I rule out NFSv2 as something we have
thankfully eliminated). NFSv3 could be gone for us if we are lucky by
2012 (when RHEL 4 goes EOL and if RHEL5's NFSv4 is robust enough).

Currently by default RHEL6 clusters export (with the standard RA) on
NFSv4, and RHEL6 (and Fedora etc) mount these as NFSv4. So I'd hope it
is supported. I haven't as yet tried to wrap any security round these;
from a discussion here a while back that looks like hard work. I'd
certainly love to have pNFS to allow multiple active nodes.

OT: The main NFS issue I have just now is supporting laptops with the
automounter. NFS is just so undynamic. Once a mount is in place, the
client changing IP will leave the mount hung. And laptops do this all
the time (on, off, wired, wireless, VPN etc). We have some nasty scripts
that clean up the mounts when laptops move network (lots of forced
kills of autofs and umount -fl's). Mostly works ok. Again, if a user
disconnects a laptop during an ongoing file operation they can expect
undefined file contents. It's better than the alternative of hung
mounts; lots of things hate that. We aren't talking complex file formats
or operations here, just copying source files, data and docs to the
local disk, so no nasty binary file corruption issues. Maybe not such a
great thing to do, but users like to have a consistent file system view
that matches the office based systems.
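
For illustration only, the cleanup step might be sketched roughly as
below. This is not the script mentioned above; it is a minimal Python
sketch that assumes a Linux client where NFS mounts appear in
/proc/mounts with type nfs or nfs4. The names nfs_mount_points and
cleanup_commands are hypothetical. It only builds the umount -f -l
command lines (deepest mount point first) and does not run them, and it
leaves out the autofs kill/restart entirely.

```python
import shlex

# Filesystem types treated as NFS (an assumption; adjust for your clients).
NFS_TYPES = {"nfs", "nfs4"}


def nfs_mount_points(mounts_text):
    """Return the mount points of NFS filesystems in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 3 and fields[2] in NFS_TYPES:
            # /proc/mounts encodes a space in a path as the octal escape \040.
            points.append(fields[1].replace("\\040", " "))
    return points


def cleanup_commands(mounts_text):
    """Build forced, lazy umount commands, deepest mount point first."""
    points = sorted(
        nfs_mount_points(mounts_text),
        key=lambda p: p.count("/"),
        reverse=True,  # nested mounts are detached before their parents
    )
    return ["umount -f -l " + shlex.quote(p) for p in points]
```

To try it on a client, pass the contents of /proc/mounts to
cleanup_commands() and review the printed commands before running any
of them as root.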

Sadly it looks like NFS is the least dynamic network component left in a
Linux distro. I posted a longer version of this problem to the linux-nfs
mailing list and heard from someone who basically said the NFS committee
and developers (not just Linux) are largely targeting NFS as an
enterprise storage protocol. I presume he means storage servers using
NFS to share to, say, front end web servers. So they are less interested
in my use case, certainly. Possibly the best bet (in a while) for
desktop network file sharing will be Samba; they seem to be trying to
target CIFS (with full Unix extensions) as a solution for this.

Thanks

Colin

On Fri, 2011-07-08 at 18:36 +0100, Alan Brown wrote:
> On Fri, 8 Jul 2011, Colin Simpson wrote:
> 
> > That's not ideal either when Samba isn't too happy working over NFS,
> > and that is not recommended by the Samba people as being a sensible
> > config.
> 
> I know but there's a real (and demonstrable) risk of data corruption
> for NFS vs _anything_ if NFS clients and local processes (or clients of
> other services such as a samba server) happen to grab the same file for
> writing at the same time.
> 
> Apart from that, the 1 second granularity of NFS timestamps can result
> (and has resulted) in writes made by non-NFS processes causing NFS
> clients which have that file opened read/write to see "stale
> filehandle" errors due to the inode having changed when they weren't
> expecting it.
> 
> We (should) all know NFS was a kludge. What's surprising is how much
> kludge still remains in the current v2/3 code (which is surprisingly
> opaque and incredibly crufty; much of it dates from the early 1990s or
> earlier).
> 
> As I said earlier, V4 is supposed to play a lot nicer but I haven't
> tested it, as far as I know it's not supported on GFS systems anyway
> (that was the RH official line when I tried to get it working last
> time).
> 
> I'd love to get v4 running properly in active/active/active setup from
> multiple GFS-mounted fileservers to the clients. If anyone knows how to
> reliably do it on EL5.6 systems then I'm open to trying again as I
> believe that this would solve a number of issues being seen locally
> (including various crash bugs).
> 
> On the other hand, v2/3 aren't going away anytime soon and some effort
> really needs to be put into making them work properly.
> 
> On the gripping hand, I'd also like to see viable alternatives to NFS
> when it comes to feeding 100+ desktop clients.
> 
> Making them mount the filesystems using GFS might sound like an
> alternative until you consider what happens if any of them crash/reboot
> during the day. Batch processes can wait all day, but users with frozen
> desktops get irate - quickly.
