[Cluster-devel] Re: [NFS] [PATCH 0/4 Revised] NLM - lock failover

Wed Apr 25 14:10:31 UTC 2007

J. Bruce Fields wrote:
> On Thu, Apr 05, 2007 at 05:50:55PM -0400, Wendy Cheng wrote:
>   
>> 1) Failover server exports filesystem with "fsid" option as:
>>     /etc/exports entry> /mnt/shared/exports *(fsid=1234,sync,rw)
>> 2) Failover server dispatches rpc.statd with "-H" option.
>> 3) Failover server drops locks based on fsid by:
>>     shell> echo 1234 > /proc/fs/nfsd/nlm_unlock
>> 4) Takeover server enters per fsid grace period by:
>>     shell> echo 1234 > /proc/fs/nfsd/nlm_set_igrace
>> 5) Takeover server notifies clients for lock reclaim by:
>>     shell> /usr/sbin/sm-notify -f -v floating_ip_address -P an_sm_directory
>>     
>
> I don't understand statd and lockd as well as I should.  Where exactly
> does the takeover server stop serving requests, and the failover server
> start?  If this isn't done carefully, you can leave a window between
> steps 3 and 4 where a client could acquire a lock before its rightful
> owner reclaims it, right?
>
>   
The detailed overall steps were described in the first email we sent a 
*long* time ago (more than 6 months, I think). The first step of the 
whole process is tearing down the floating IP on the failover server. 
The IP is not accessible until the filesystem has safely failed over 
and SM_NOTIFY is ready to be sent.
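
A rough sketch of that ordering as a shell script, for illustration only. 
The nlm_unlock and nlm_set_igrace files exist only with this patch set 
applied. The variable names, the default address, interface and 
sm directory, the use of "ip addr" to move the floating IP, and the exact 
point at which the IP is brought back up are all assumptions, not part 
of the patches. Moving the filesystem itself and starting rpc.statd are 
left out.

```shell
#!/bin/sh
# Sketch of the failover ordering. DRY_RUN=1 (the default here) only
# prints each command instead of executing it.
# Hypothetical values: FSID, FLOATING_IP, IFACE and SM_DIR are placeholders.
DRY_RUN=${DRY_RUN:-1}
FSID=${FSID:-1234}
FLOATING_IP=${FLOATING_IP:-192.0.2.10}
IFACE=${IFACE:-eth0}
SM_DIR=${SM_DIR:-/var/lib/nfs/statd_floating}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        sh -c "$*"
    fi
}

# Run on the failover (old) server.
failover_side() {
    # Tear down the floating IP first so no client can reach this export.
    run "ip addr del $FLOATING_IP/32 dev $IFACE"
    # Drop the locks held on the exported fsid.
    run "echo $FSID > /proc/fs/nfsd/nlm_unlock"
}

# Run on the takeover (new) server once the filesystem has moved over.
takeover_side() {
    # Enter the per-fsid grace period before the IP is reachable again.
    run "echo $FSID > /proc/fs/nfsd/nlm_set_igrace"
    # Assumed step: bring the floating IP up only now.
    run "ip addr add $FLOATING_IP/32 dev $IFACE"
    # Notify clients so they reclaim their locks.
    run "/usr/sbin/sm-notify -f -v $FLOATING_IP -P $SM_DIR"
}
```

With the defaults, calling failover_side and then takeover_side just 
prints the five commands in order. Since the IP stays down from the 
first command until after the grace period is set, a new lock request 
should not be able to slip in between dropping the locks and the reclaim.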

The last round of discussion gave me the impression that as long as I 
rebased the code onto akpm's mm tree, these patches would get accepted. 
So I have been quite careless in this submission and just realized 
people have a very short memory :) .. I will do the write-up and put it 
somewhere so we don't need to go through this again.

-- Wendy