
Re: [libvirt] Ongoing work on lock contention in qemu driver?

On Thu, May 16, 2013 at 12:09:39PM -0400, Peter Feiner wrote:
> Hello Daniel,
> I've been working on improving scalability in OpenStack on libvirt+kvm
> for the last couple of months. I'm particularly interested in reducing
> the time it takes to create VMs when many VMs are requested in
> parallel.
> One apparent bottleneck during virtual machine creation is libvirt. As
> more VMs are created in parallel, some libvirt calls (i.e.,
> virConnectGetLibVersion and virDomainCreateWithFlags) take longer
> without a commensurate increase in hardware utilization.
> Thanks to your patches in libvirt-1.0.3, the situation has improved.
> Some libvirt calls OpenStack makes during VM creation (i.e.,
> virConnectDefineXML) have no measurable slowdown when many VMs are
> created in parallel. In turn, parallel VM creation in OpenStack is
> significantly faster with libvirt-1.0.3. On my standard benchmark
> (create 20 VMs in parallel, wait until the VM is ACTIVE, which is
> essentially after virDomainCreateWithFlags returns), libvirt-1.0.3
> reduces the median creation time from 90s to 60s when compared to
> libvirt-0.9.8.

How many CPU cores are you testing on ?  That's a good improvement,
but I'd expect the improvement to be greater as # of core is larger.

Also, did you tune /etc/libvirt/libvirtd.conf at all? By default we
limit a single connection to only 5 concurrent RPC calls. Beyond that,
calls queue up even if libvirtd is otherwise idle. OpenStack uses a
single connection for everything, so it will hit this limit. I suspect
this is why virConnectGetLibVersion appears to be slow. That API does
absolutely nothing of any consequence, so the only reason I'd expect
it to be slow is that you're hitting the libvirtd RPC limit, causing
the API call to be queued up.
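For reference, these are the knobs in /etc/libvirt/libvirtd.conf that govern that queueing (a sketch; the values below are illustrative, not recommendations):

```
# /etc/libvirt/libvirtd.conf
max_client_requests = 20   # concurrent RPC calls allowed per connection (default 5)
max_workers = 20           # worker threads available to service RPC calls
```

libvirtd needs to be restarted for changes to take effect.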

> I'd like to know if your concurrency work in the qemu driver is
> ongoing. If it isn't, I'd like to pick the work up myself and work on
> further improvements. Any advice or insight would be appreciated.

I'm not actively doing anything in this area, mostly because I've got
no clear data on where any remaining bottlenecks are.

One theory I had was that the virDomainObjListSearchName method could
be a bottleneck, because it acquires a lock on every single VM. It is
invoked when starting a VM, when we call virDomainObjListAddLocked.
I tried removing this locking, though, and didn't see any performance
benefit, so I never pursued it further. Before trying things like
this again, I think we'd need a way to actually identify where the
true bottlenecks are, rather than relying on guesswork.
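One way to get real data rather than guesswork is to time individual API calls under increasing parallelism and watch how per-call latency scales. A minimal harness might look like the following (pure Python; `time.sleep` stands in for a real libvirt call such as `conn.getLibVersion()`, which is an assumption about how you'd wire it up):

```python
import statistics
import threading
import time

def measure_parallel(call, nthreads):
    """Run `call` once per thread, all starting together; return per-call latencies."""
    latencies = []
    lock = threading.Lock()
    barrier = threading.Barrier(nthreads)

    def worker():
        barrier.wait()                      # release all calls at the same moment
        start = time.monotonic()
        call()                              # e.g. conn.getLibVersion() in real use
        elapsed = time.monotonic() - start
        with lock:
            latencies.append(elapsed)

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies

# Stand-in for a libvirt API call; replace with a real connection's method.
fake_call = lambda: time.sleep(0.01)
lat = measure_parallel(fake_call, 8)
print(f"median {statistics.median(lat) * 1000:.1f} ms over {len(lat)} calls")
```

If median latency grows with thread count while libvirtd CPU use stays flat, that points at queueing or lock contention rather than actual work being done.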

|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
