[Linux-cluster] Re: GFS, what's remaining

Daniel Phillips phillips at istop.com
Sun Sep 4 19:51:56 UTC 2005


On Sunday 04 September 2005 03:28, Andrew Morton wrote:
> If there is already a richer interface into all this code (such as a
> syscall one) and it's feasible to migrate the open() tricksies to that API
> in the future if it all comes unstuck then OK.  That's why I asked (thus
> far unsuccessfully):
>
>    Are you saying that the posix-file lookalike interface provides
>    access to part of the functionality, but there are other APIs which are
>    used to access the rest of the functionality?  If so, what is that
>    interface, and why cannot that interface offer access to 100% of the
>    functionality, thus making the posix-file tricks unnecessary?

There is no such interface at the moment, nor is one needed in the immediate 
future.  Let's look at the arguments for exporting a dlm to userspace:

  1) Since we already have a dlm in the kernel, why not just export that and
     save 100K of userspace library?  Answer: because we don't want
     userspace-only dlm features bulking up the kernel.  Answer #2: the extra
     syscalls and interface baggage serve no useful purpose.

  2) But we need to take locks in the same lockspaces as the kernel dlm(s)!
     Answer: only support tools need to do that.  A cut-down locking api is
     entirely appropriate for this.

  3) But the kernel dlm is the only one we have!  Answer: easily fixed, a
     simple matter of coding.  But please bear in mind that dlm-style
     synchronization is probably a bad idea for most cluster applications,
     particularly ones that already do their synchronization via sockets.

In other words, exporting the full dlm api is a red herring.  It has nothing
to do with getting cluster filesystems up and running.  It is really just
marketing: it sounds like a great thing for userspace to get a dlm "for
free", but it isn't free.  It contributes to kernel bloat, and it isn't even
the most efficient way to do it.

If, after considering that, we _still_ want to export a dlm api from the
kernel, then can we please take the necessary time and get it right?  The full
api requires not only syscall-style elements, but asynchronous events as well,
similar to aio.  I do not think anybody has a good answer to this today, nor
do we even need it to begin porting applications to cluster filesystems.
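
To make the "asynchronous events" point concrete, here is a minimal sketch of
the shape such an api tends to take: the request call returns without
blocking, and the grant arrives later as a completion event.  Everything in it
is an assumption for illustration only.  ToyLockManager, request, release and
the resource and node names are hypothetical, it runs in a single process, and
it handles exclusive locks only.  It is not the interface of any existing dlm.

```python
from collections import deque


class ToyLockManager:
    """Hypothetical single-process stand-in for a lock manager.

    Exclusive locks only; not a real dlm interface.
    """

    def __init__(self):
        self.holder = {}    # resource name -> current owner
        self.waiters = {}   # resource name -> deque of (owner, callback)

    def request(self, resource, owner, on_grant):
        """Grant at once if the resource is free, else queue; never blocks.

        on_grant(resource, owner) plays the role of the completion event.
        """
        if resource not in self.holder:
            self.holder[resource] = owner
            on_grant(resource, owner)
        else:
            self.waiters.setdefault(resource, deque()).append((owner, on_grant))

    def release(self, resource, owner):
        """Drop the lock and deliver the grant event to the next waiter."""
        if self.holder.get(resource) != owner:
            raise ValueError("not the lock holder")
        del self.holder[resource]
        queue = self.waiters.get(resource)
        if queue:
            next_owner, on_grant = queue.popleft()
            self.holder[resource] = next_owner
            on_grant(resource, next_owner)


events = []


def record(resource, owner):
    # Stand-in for whatever the caller does when its lock is granted.
    events.append((resource, owner))


mgr = ToyLockManager()
mgr.request("inode-17", "node-a", record)   # free, so granted immediately
mgr.request("inode-17", "node-b", record)   # queued; the call does not block
mgr.release("inode-17", "node-a")           # node-b's grant event fires here
```

The awkward part for a kernel export is the callback: a plain syscall has no
natural way to deliver "your lock was granted" some time after the request
returned, which is why the comparison with aio comes up.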

Oracle guys: what is the distributed locking API for RAC?  Is the RAC team 
waiting with bated breath to adopt your kernel-based dlm?  If not, why not?

Regards,

Daniel



