[NFS] [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover
Jeff Layton
jlayton at poochiereds.net
Fri Apr 27 11:15:15 UTC 2007
On Fri, Apr 27, 2007 at 04:00:13PM +1000, Neil Brown wrote:
> On Thursday April 26, wcheng at redhat.com wrote:
> > Neil Brown wrote:
> >
> > >On Thursday April 26, wcheng at redhat.com wrote:
> > >
> > >
> > >>A convincing argument... unfortunately, this happens to be a case where
> > >>we need to protect the server from client misbehavior. For a local
> > >>filesystem (ext3), if any file reference count is not zero (i.e. some
> > >>clients are still holding the locks), the filesystem can't be
> > >>un-mounted. We would have to fail the failover to avoid data corruption.
> > >>
> > >>
> > >
> > >I think this is a tangential problem.
> > >"removing locks held by troublesome clients so that I can unmount my
> > >filesystem" is quite different from "remove locks held by
> > >clients using virtual-NAS-foo so they can be migrated".
> > >
> > >
> > The reason to unmount is because we want to migrate the virtual IP.
>
> The reason to unmount is because we want to migrate the filesystem. In
> your application that happens at the same time as migrating the
> virtual IP, but they are still distinct operations.
>
> > IMO
> > they are the same issue but it is silly to keep fighting about this. In
> > any case, one interface is better than two, if you allow me to insist on
> > this.
>
> How many interfaces depends somewhat on how many jobs there are to do.
> You want to destroy state that will be rebuilt on a different server,
> and you want to force-unmount a filesystem. Two different jobs. Two
> interfaces seems OK.
> If they could both be done with one simple interface that would be
> ideal, but I'm not sure they can.
>
> And no-one gets to insist on anything.
> You are writing the code. I am accepting/rejecting it. We both need
> to agree or we won't move forward. (Well... I could just write code
> myself, but I don't plan to do that).
>
> >
> > So how about we do an RPC call to lockd to tell it to drop the locks owned
> > by the client/local-IP pair as you proposed, *but* add an "OR" with fsid
> > to foolproof the process? Say something like this:
> >
> > RPC_to_lockd_with(client_host, client_ip, fsid);
> > /* in lockd, for each held lock: */
> > if ((host == client_host && vip == client_ip) ||
> >     (get_fsid(file) == fsid))
> >         drop_the_locks();
> >
> > This logic (RPC to lockd) will be triggered by a new command added to
> > the nfs-utils package.
> >
> > If we can agree on this, the rest would be easy. Done?
>
> Sorry, but we cannot agree with this, and I think the rest is still
> easy.
>
> The more I think about it, the less I like the idea of using an fsid.
> The fsid concept was created simply because we needed something that
> would fit inside a filehandle. I think that is the only place it
> should be used.
> Outside of filehandles, we have a perfectly good and well-understood
> mechanism for identifying files and filesystems. It is a "path name".
> The functionality "drop all locks held by lockd on a particular
> filesystem" is potentially useful outside of any fail-over
> configuration, and should work on any filesystem, not just one that
> was exported with 'fsid='.
>
> So if you need that, then I think it really must be implemented by
> something a lot like
> echo -n /path/name > /proc/fs/nfs/nlm_unlock_filesystem
>
> This is something that we could possibly teach "fuser -k" about - so
> it can effectively 'kill' that part of lockd that is accessing a given
> filesystem. It is useful for failover, but definitely useful beyond
> failover.
Just a note that I posted a patch ~ a year ago that did precisely that. The
interface was a little bit different. I had userspace echoing in a dev_t
number, but it wouldn't be too hard to change it to use a pathname instead.
Subject was:
[PATCH] lockd: add procfs control to cue lockd to release all locks on a device
...if anyone is interested in having me resurrect it.
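The two forms are closely related: a pathname can be resolved in userspace to the dev_t of the filesystem that holds it. A minimal sketch of that lookup, assuming Linux with glibc. The names path_to_dev and format_dev are made up for illustration, and the "major:minor" text format is an assumption, not necessarily what the earlier patch expected:

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Look up the device number of the filesystem containing `path`.
 * Returns 0 on success, -1 on failure (errno is set by stat()).
 */
static int path_to_dev(const char *path, dev_t *dev)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    *dev = st.st_dev;
    return 0;
}

/*
 * Render a dev_t as "major:minor". This textual format is an assumption
 * made for the sketch; a real control file could expect something else.
 */
static int format_dev(dev_t dev, char *buf, size_t len)
{
    return snprintf(buf, len, "%u:%u", major(dev), minor(dev));
}
```

A pathname-based control file would do the equivalent lookup inside the kernel, so userspace would not need this step at all.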
-- Jeff
>
>
> Everything else can be done in the RPC interface between lockd and
> statd, leveraging the "my_name" field to identify state based on which
> local network address was used. All this other functionality is
> completely agnostic about the particular filesystem and just looks at
> the virtual IP that was used.
> All this other functionality is all that you need unless you have a
> misbehaving client.
> You would do all the lockd/statd/rpc stuff. Then try to unmount the
> filesystem. If that fails, try "fuser -k -m /whatever" and try the
> unmount again.
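The sequence above (try the unmount, fall back to "fuser -k -m", then retry) can be sketched roughly as follows. This is only an illustration: unmount_with_fallback and the stub callbacks are hypothetical names, and the stubs simulate a busy filesystem rather than calling umount(2) or running fuser:

```c
/* A step takes a mount point and returns 0 on success, non-zero on failure. */
typedef int (*step_fn)(const char *mountpoint);

/*
 * Try to unmount; if that fails, release whatever is pinning the
 * filesystem and try once more. Returns the result of the last unmount.
 */
static int unmount_with_fallback(const char *mountpoint,
                                 step_fn try_unmount, step_fn kill_users)
{
    if (try_unmount(mountpoint) == 0)
        return 0;
    /* The kill step's own status is ignored; the retry decides the outcome. */
    (void)kill_users(mountpoint);
    return try_unmount(mountpoint);
}

/* Stand-ins for illustration only: a filesystem that is busy until "killed". */
static int fs_busy = 1;
static int unmount_attempts = 0;

static int stub_unmount(const char *mountpoint)
{
    (void)mountpoint;
    unmount_attempts++;
    return fs_busy ? -1 : 0;
}

static int stub_kill_users(const char *mountpoint)
{
    (void)mountpoint;
    fs_busy = 0;
    return 0;
}
```

In a real setup the two callbacks would wrap umount(2) and an invocation of "fuser -k -m" on the mount point.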
>
> Another interface alternative might be to hook in to
> umount(MNT_FORCE), but that would require even broader review, and
> probably isn't worth it....
>
> NeilBrown
>