
[Cluster-devel] [PATCH 0/5] NLM Failover - introduction



This revised patch set is submitted to address active-active lock
failover issues with NFS v2/v3 as discussed in:

o http://www.redhat.com/archives/linux-cluster/2006-June/msg00050.html
  (interface discussion).
o https://www.redhat.com/archives/cluster-devel/2006-June/msg00231.html
  (code review - RFC).
o http://www.redhat.com/archives/cluster-devel/2006-August/msg00000.html
  (code review - first submission). 

The major change in this submission is switching the driving
interface from the floating IP address to the exported filesystem id
(fsid; see "man exports" for details). With the previous patches, dropping
the old server's locks based on one particular (floating) IP address
could leave behind locks whose requests came in through other interfaces.
The failover server could then fail to umount the filesystem and
subsequently abort the overall transaction.

The issue with the RESTRICTED_STATD flag in the nfs-utils package is
addressed in patch 5-4 as an optional patch. The relevant steps of NLM
failover now look like the following:

1) Failover server exports filesystem with "fsid" option as:
   /etc/exports entry> /mnt/ext3/exports *(fsid=1234,sync,rw)

2) Failover server drops locks based on fsid by:
   shell> echo 1234 > /proc/fs/nfsd/nlm_unlock

3) Takeover server enters per fsid grace period by:
   shell> echo 1234 > /proc/fs/nfsd/nlm_set_igrace

4) Takeover server notifies clients for lock reclaim by:
   shell> rpc.statd -n floating_ip -N -P alternative_sm_dir
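
Steps 2 through 4 above could be wrapped in a small helper script along
the following lines. This is only a sketch under stated assumptions: the
function names and the NLM_PROC_DIR variable are hypothetical and not part
of the patch set, the two /proc files exist only on a kernel with patches
5-1 and 5-2 applied, and the rpc.statd options are the ones shown in step 4.

```shell
#!/bin/sh
# Hypothetical helpers for the NLM failover steps; not part of the patch set.
# NLM_PROC_DIR can be overridden, e.g. to dry-run against a scratch directory.
NLM_PROC_DIR="${NLM_PROC_DIR:-/proc/fs/nfsd}"

# Step 2 (failover server): drop the NLM locks held on the given fsid.
nlm_drop_locks() {
    echo "$1" > "$NLM_PROC_DIR/nlm_unlock"
}

# Step 3 (takeover server): enter the per-fsid grace period.
nlm_enter_grace() {
    echo "$1" > "$NLM_PROC_DIR/nlm_set_igrace"
}

# Step 4 (takeover server): notify clients so they reclaim their locks.
# $1 is the floating ip, $2 the alternative statd state directory.
nlm_notify_clients() {
    rpc.statd -n "$1" -N -P "$2"
}
```

With these definitions sourced, the failover server would run
"nlm_drop_locks 1234", and the takeover server would run
"nlm_enter_grace 1234" followed by
"nlm_notify_clients floating_ip alternative_sm_dir".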

Patch Summary:
5-1: implement /proc/fs/nfsd/nlm_unlock
5-2: implement /proc/fs/nfsd/nlm_set_igrace
5-3: record and pass incoming server ip interface to rpc.statd
5-4: user mode rpc.statd patch
5-5: (for reference only) kernel nlm deadlock workaround

Notes and Restrictions:
o It is expected that RESTRICTED_STATD is turned on in the nfs-utils package.
o IPv6 changes will follow if requested.
o There is an existing NLM deadlock bug that can be triggered with
  or without this patch set. We include the temporary workaround here
  as PATCH 5-5 for reference only. The real fix is being worked on:
  http://sourceforge.net/mailarchive/forum.php?thread_id=30052343&forum_id=4930

-- Wendy

