[Cluster-devel] DLM thoughts - multi threaded recvd

Wed Dec 20 15:41:46 UTC 2006

Hi,

On Wed, 2006-12-20 at 15:01 +0000, Patrick Caulfield wrote:
> One of the things that Andrew Morton commented on when taking the DLM was that there is a single dlm_recvd process (and sendd
> too) processing incoming requests and it could become a bottleneck on large systems.
> 
> So, I've been thinking how to make this scale a little better and have come up with several things.
> 
> 1. How do we decide how many threads to start?
> 
> My first thought was to start one per CPU. But how do we cope with CPU hotplug events (if we do at all)? This
> is also slightly wasteful in a two-node SMP cluster where you could have 2 machines, each with 4 cores and each running 4 dlm_recvd
> threads, with work for only 1 per machine.  We can't split up messages from one machine over threads because the packets may
> be fragmented *
> 
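
To make the constraint above concrete, here is a minimal sketch of pinning each sending node to one receive thread, so that all fragments from a node are reassembled by the same thread. The function name and the simple modulo mapping are assumptions for illustration only, not what the DLM does.

```c
/*
 * Hypothetical mapping: every message from a given node goes to the
 * same receive thread, so fragments are never split across threads.
 * A zero thread count falls back to thread 0.
 */
unsigned int thread_for_node(unsigned int nodeid, unsigned int nthreads)
{
    return nthreads ? nodeid % nthreads : 0;
}
```

With a mapping like this, a two-node cluster has only one peer per machine, so only one of the 4 threads ever receives work, which is the waste described above.
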
> Is there a reasonable API in the kernel for getting the (current) number of CPUs in a system?
> 
Not that I know of, although there are two different "numbers of CPUs" I
think, one being the current number and one being the max number. I had
the same problem when I was looking into hashing the rwlocks for the
glock hash table and settled for using the max number and hoping for the
best, though I think you need to be more accurate than I did.
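
For reference, the kernel's cpumask helpers num_online_cpus() and num_possible_cpus() correspond to those two numbers. The same distinction can be seen from userspace; the following is only a sketch, assuming a libc that provides the _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF sysconf names (glibc does). The recv_thread_count() policy and its cap argument are hypothetical, standing in for a tunable like the one proposed in point 2.

```c
#include <unistd.h>

/* CPUs currently online; this can change at runtime with hotplug. */
long cpus_online(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;   /* fall back to 1 if the query fails */
}

/* CPUs configured in the system, whether online or not. */
long cpus_configured(void)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);
    return n > 0 ? n : 1;   /* fall back to 1 if the query fails */
}

/*
 * Hypothetical policy: one receive thread per online CPU, optionally
 * capped by an administrator setting (cap <= 0 means "no cap").
 */
long recv_thread_count(long cap)
{
    long n = cpus_online();
    return (cap > 0 && cap < n) ? cap : n;
}
```

Sizing from the online count means the thread pool can go stale after a hotplug event; sizing from the configured count avoids that but may start threads that never run.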

As an alternative suggestion - is it possible to do this without any
threads at all? In that case the receive processing would run in softirq
context, and on the same CPU that did the tcp receive processing. That
would potentially save two context switches per message delivered.

I'm not so sure that it's worth having the extra threads unless you are
able to bind each thread to a CPU and ensure that it only processes
packets delivered on that CPU.
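
As a rough illustration of that binding, here is a single-threaded userspace sketch with one queue per CPU: a message is queued on the CPU that received it, and each worker drains only its own queue. All names and sizes here are made up for the example, integers stand in for packets, and there is no locking, so this shows the data layout only and not a kernel implementation.

```c
#include <stddef.h>

#define NCPUS   4   /* hypothetical fixed CPU count for the sketch */
#define QDEPTH  8   /* hypothetical per-CPU queue depth */

struct percpu_queue {
    int msgs[QDEPTH];   /* message ids stand in for real packets */
    size_t head;        /* count of messages consumed so far */
    size_t tail;        /* count of messages queued so far */
};

static struct percpu_queue queues[NCPUS];

/* Queue a message on the CPU that received it; -1 if invalid or full. */
int queue_on_cpu(int cpu, int msg)
{
    struct percpu_queue *q;

    if (cpu < 0 || cpu >= NCPUS)
        return -1;
    q = &queues[cpu];
    if (q->tail - q->head == QDEPTH)
        return -1;
    q->msgs[q->tail % QDEPTH] = msg;
    q->tail++;
    return 0;
}

/* Worker bound to `cpu`: drain only that CPU's queue, in arrival order. */
size_t drain_cpu(int cpu, int *out, size_t max)
{
    struct percpu_queue *q;
    size_t n = 0;

    if (cpu < 0 || cpu >= NCPUS)
        return 0;
    q = &queues[cpu];
    while (n < max && q->head != q->tail) {
        out[n++] = q->msgs[q->head % QDEPTH];
        q->head++;
    }
    return n;
}
```

Because a worker never touches another CPU's queue, messages stay on the CPU that received them and keep their arrival order.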

Are you just talking about reading here? I assume that the accept part of
it isn't going to be a problem here so that could potentially stay as it
is?

There is an example of something similar to what I'm suggesting in
net/sunrpc/xprtsock.c:xs_tcp_data_recv() and xs_tcp_data_ready().

> 2. Do we need an additional sysfs parameter to the DLM that tells it how many threads to start which defaults
> to the number of CPUs in the system?
>
> 
> 3. Is it worth multi-threading dlm_sendd too?
> 
> I'm not sure it is. dlm_sendd's job is very simple...to put stuff on the TCP (or SCTP) send queue. If that queue is full then the
> request is simply requeued inside the DLM. It's not like dlm_recvd which does actual locking operations.
> 
It's single threaded anyway as soon as it hits the tcp send queue. I
don't know if that's true of SCTP as well.

Steve.