[Linux-cluster] GFS cluster / DLM locking - Mostly idle but high load

Gordan Bobic gordan at bobich.net
Wed Oct 17 07:24:09 UTC 2007


On Wed, 17 Oct 2007, Nikolas Lam wrote:

>> I have a cluster (3 nodes at the moment, may grow up to 16) for handling a
>> lot of small files (Maildir). When I test the system by sending around 3-5
>> messages/second I see the load on the cluster nodes go up to about 20-30,
>> even though the CPUs on the cluster are about 90% idle at all times.
>>
>> I am guessing that this is due to the clustered machines waiting for DLM
>> locks to be established: a lot of processes queue up waiting to run, and
>> since they don't get to run soon, they back up and push the load averages
>> up.
>>
>> Assuming the DLM runs over the interface specified by IP and MAC in
>> cluster.conf, it is running over gigabit ethernet.
>>
>> Are there any configuration changes or tuning parameters I can apply to
>> DLM to alleviate this condition? The machine I'm running the test from
>> (the one sending messages) is about 1/4 of the spec of each of the cluster
>> nodes, and it's running a load average of about 0.4. It seems crazy that a
>> single low-spec node should be able to completely overwhelm a cluster 12x
>> its spec several times over.
>
> I don't know a lot about GFS, but since no one else has replied yet: my
> understanding is that it's not well suited to an application like the one
> you describe (many small files being opened frequently). I think GFS2,
> which is still a tech preview, has been redesigned to improve this
> situation.

Indeed, I am aware that GFS2 is still broken, but I seem to be getting no
worse performance out of GFS than I get out of NFS. The only penalty is
the high load average; the throughput is actually similar. The advantage
that makes GFS win is that I don't need an arbitrating server to handle the
NFS exports, which makes the clustering and redundancy a bit tidier.
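
One way to check the guess about lock waits: on Linux the load average
counts tasks in uninterruptible sleep (state D) as well as runnable ones
(state R), so a high load on mostly idle CPUs usually shows up as many
tasks in D. The following is only a minimal sketch, assuming a Linux /proc
filesystem; the function names are made up for illustration, and it only
shows that tasks are blocked, not that DLM specifically is what they are
blocked on.

```python
import glob
from collections import Counter


def state_of(stat_line):
    """Return the one-letter task state from a /proc/<pid>/stat line.

    The line looks like "pid (comm) state ...". The comm field may itself
    contain spaces or parentheses, so split after the last ')'.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(stat_lines):
    """Count how many tasks are in each state (R, S, D, ...)."""
    return Counter(state_of(line) for line in stat_lines)


def read_stat_lines():
    """Read the stat line of every process currently listed in /proc."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                lines.append(f.read())
        except OSError:
            # The process exited between listing and reading; skip it.
            continue
    return lines


if __name__ == "__main__":
    counts = count_states(read_stat_lines())
    print("runnable (R):       ", counts.get("R", 0))
    print("uninterruptible (D):", counts.get("D", 0))
```

If the D count on a node tracks the load average while the R count stays
low, that is consistent with processes blocked on I/O or locking rather
than competing for CPU.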

Gordan



