[Linux-cluster] Fenced node never reboots properly


We are running some virtual machines on top of ESX that are giving us performance problems, but they also provide a good test case for fencing operations.

Due to some (as yet unknown) problem, the two nodes in the GFS cluster do not respond properly to heartbeats, so node 1 kicks node 2 out of the cluster. Node 2 correctly reports in its syslog that it has been asked to leave the cluster, node 1 fences node 2, and node 2 initiates a shutdown - so far so good.

However, during shutdown node 2 executes /etc/rc6.d/S31umountnfs (it's a Debian system), which also attempts to unmount the GFS disk - result: kernel oops. The system continues the shutdown until it prints 'Will now restart.', but it never actually reboots. I've tried setting /proc/sys/kernel/panic and adding 'panic=5' to the kernel boot options, but to no avail.
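
To double-check that the panic setting really is in effect on the node, something like the following could be used. It is only a sketch: the helper name read_panic_settings is made up for illustration, and kernel.panic_on_oops is included on the assumption that an oops, unlike a full panic, does not by itself trigger the panic= reboot timeout.

```python
from pathlib import Path


def read_panic_settings(proc_root="/proc"):
    """Return panic-related sysctls as {name: int}, or None where unreadable."""
    settings = {}
    for name in ("panic", "panic_on_oops"):
        # e.g. /proc/sys/kernel/panic holds the reboot timeout in seconds
        path = Path(proc_root) / "sys" / "kernel" / name
        try:
            settings[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not an integer
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_panic_settings().items():
        print(f"kernel.{name} = {value}")
```

If kernel.panic shows the expected timeout but kernel.panic_on_oops is 0, that might explain why the reboot timeout never fires after the oops, though I have not confirmed that this is what happens here.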

I'm really at a loss here - does anybody have any suggestions on how to solve this problem?


