[Linux-cluster] GFS2 locking in a VM based cluster (KVM)

Mon Mar 21 11:22:29 UTC 2011

Hi,

On Thu, 2011-03-17 at 17:17 +0200, C.D. wrote:
> Hello,
> 
> sorry guys to resurrect an old thread, but I have to say I can confirm
> that, too. I have a libvirt setup with multipathed FC SAN devices and
> KVM guests running on top of it. The physical machine is HP 465c G7 (2
> x 12 Core Magny-Cours with 96GB RAM). The host OS is Fedora 14. The
> guests are Scientific Linux 6. With gfs2 10GB shared LUN I can manage
> ~600k plocks/sec while both machines mounted the LUN. I started:
> ping_pong some_file 3 on one of the VMs and got those 600k plocks.
> Then I started ping_pong the_same_file 3 on the second machine and
> got around 360 plocks/sec (that is 360, not 360 000). No matter what I
> tried I couldn't optimize it. If I stop the ping_pong on one of the
> VMs the plocks went up to around 500-550 plocks/sec (again 550 not
> 550k). Stopping the process, waiting a while and starting again on a
> single machine still got me around 600k plocks. This I could reproduce
> both with tcp and sctp, and I tried a bunch of different settings.
> 
That is expected: when only one node is using the plocks, the locks are
kept locally and are therefore very fast. I assume that you have plock
ownership turned on in the cluster.conf?

As soon as you introduce the second node, this will no longer be the
case, and the net result is that it will take a lot longer to grant the
lock.
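
To compare the single-node and two-node cases yourself, a rough sketch along
the following lines may help. It is not the ping_pong tool, only a minimal
loop loosely modelled on what it measures: how many fcntl byte-range
lock/unlock cycles per second one process can perform on a file. The function
name plock_rate and its parameters are hypothetical, chosen for illustration.

```python
import fcntl
import os
import time


def plock_rate(path, seconds=1.0, nbytes=3):
    """Return fcntl byte-range lock+unlock cycles per second on `path`.

    Hypothetical helper for illustration; it cycles over `nbytes`
    one-byte ranges, similar in spirit to `ping_pong file 3`.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        count = 0
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            offset = count % nbytes
            # Blocking exclusive lock on one byte at `offset`, then release it.
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, offset, os.SEEK_SET)
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, offset, os.SEEK_SET)
            count += 1
        return count / seconds
    finally:
        os.close(fd)
```

Running plock_rate() against the same file on the shared mount, first from
one node and then from both at once, should show the same kind of drop
described above if the locks are really being granted cluster-wide.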

> Then I decided to give ocfs2 a chance. Compiling the module on SL6,
> and I suppose on RHEL6, is not the most straightforward task, but
> half an hour later I got the module compiled from the sources of the
> EL kernel. Stripped all debug symbols. Copied the ocfs2 kernel module
> dir to both VM machines. Did depmod -a, I set up the oracle fs on top
> of the same LUN. Used ping_pong the_same_file_i_used_in_the_first_test
> 3 on just one machine, while both VMs have mounted the LUN. 1600k
> plocks/sec (as in ~1 600 000 ). Started ping_pong on the second host.
> The plocks did not move at all. Still 1600k plocks/sec. Tested with
> the real life app. It worked very well, unlike gfs2, which was
> painfully slow with just 2 users. I created the ocfs2 with -T mail, I
> didn't do any tuning on it, either.
> 
Unless you are using OCFS2 with the RHCS cluster suite, it does not
support clustered fcntl locks. As a result you are probably measuring
the speed of local fcntl locking, not clustered locking.

What tuning did you do on GFS2? What options do you have in cluster.conf
relating to fcntl locks?

> I'm not trying to bash gfs2, actually I would definitely prefer it
> over ocfs2 anytime, however it seems it doesn't work well with VM for
> some reason. I have used both mtu 1500 and 9000 also, it just didn't
> make any difference, no matter what I have tried. I haven't tested the
> same setup on top of two physical nodes, but I have the feeling it
> will work just as good as ocfs2 on the VMs. I didn't test with
> hugepages for the VMs, but I somehow doubt that would make much of a
> difference.
> 
> I think this should be investigated by someone at RH possibly because
> they are the driving force behind both KVM, libvirt, the cluster soft
> and gfs2.

The MTU is unlikely to make much of a difference. With locking, the most
important aspect is latency rather than throughput.
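
As a rough illustration of why latency dominates, the rates quoted above can
be turned into an average time per lock operation. This is only arithmetic on
the reported figures and assumes the lock operations are issued strictly one
after another; the function name is hypothetical.

```python
def per_lock_latency_ms(plocks_per_sec):
    """Average time per lock operation in milliseconds, assuming the
    operations are strictly sequential (an assumption, not a measurement)."""
    return 1000.0 / plocks_per_sec


# Rates as reported in the quoted message.
for label, rate in [("one node", 600_000), ("two nodes", 360)]:
    print(f"{label}: {rate} plocks/sec, about {per_lock_latency_ms(rate):.4f} ms each")
```

At 360 plocks/sec each operation takes a little under 3 ms on average, so the
cost per lock operation matters far more here than the bandwidth available
for larger packets.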

Steve.