[libvirt] [PATCH] qemu: conf: Work around race condition on libvirt start

Daniel P. Berrange berrange at redhat.com
Mon Dec 9 11:20:41 UTC 2013


On Mon, Dec 09, 2013 at 12:15:52PM +0100, Peter Krempa wrote:
> On 12/09/13 11:56, Daniel P. Berrange wrote:
> > On Sat, Dec 07, 2013 at 02:03:00AM -0500, Adam Walters wrote:
> >> This patch works around a race condition present during
> >> libvirt startup. The race condition only applies when
> >> using storage pool volumes for domain disks, and even
> >> then, only when restarting libvirt with running domains.
> >>
> >> The gist of the patch is simply to enter a (limited)
> >> retry loop during qemuTranslateDiskSourcePool. This
> >> particular implementation does have a slight drawback,
> >> though. Within that function, I can not determine if
> >> we are currently starting libvirt, or if we are just
> >> starting up a domain. In the latter case, this could
> >> cause a 800ms delay in reporting an error that the
> >> storage pool is inactive.
> >>
> >> I am happy to report, however, that with this patch,
> >> domains continue to run without restarts regardless
> >> of how often I restart libvirt. I have a fairly fast
> >> hypervisor, and have never seen it try more than one
> >> iteration of the retry loop unless I purposely set
> >> one of the storage pools to be inactive.
> >> ---
> >>  src/qemu/qemu_conf.c | 11 +++++++++++
> >>  1 file changed, 11 insertions(+)
> >>
> >> diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
> >> index c28908a..2e52fbf 100644
> >> --- a/src/qemu/qemu_conf.c
> >> +++ b/src/qemu/qemu_conf.c
> >> @@ -1359,6 +1359,8 @@ qemuTranslateDiskSourcePool(virConnectPtr conn,
> >>      virStorageVolInfo info;
> >>      int ret = -1;
> >>      virErrorPtr savedError = NULL;
> >> +    int attempt = 0;
> >> +    int poolAutostart;
> >>  
> >>      if (def->type != VIR_DOMAIN_DISK_TYPE_VOLUME)
> >>          return 0;
> >> @@ -1369,7 +1371,16 @@ qemuTranslateDiskSourcePool(virConnectPtr conn,
> >>      if (!(pool = virStoragePoolLookupByName(conn, def->srcpool->pool)))
> >>          return -1;
> >>  
> >> +retry:
> >>      if (virStoragePoolIsActive(pool) != 1) {
> >> +        if (!(virStoragePoolGetAutostart(pool, &poolAutostart) < 0))
> >> +            if (poolAutostart && attempt < 4) {
> >> +                VIR_DEBUG("Waiting for storage pool '%s' to activate",
> >> +                          def->srcpool->pool);
> >> +                usleep(200*1000); /* sleep 200ms */
> >> +                attempt++;
> >> +                goto retry;
> >> +            }
> >>          virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> >>                         _("storage pool '%s' containing volume '%s' "
> >>                           "is not active"),
> > 
> > This is rather dubious and points toa bug elsewhere. The storage driver
> > should complete its autostart before the QEMU driver even starts its
> > own initialization.
> 
> We definitely need to make sure that storage is available at that point.
> 
> This particular regression was introduced by my commit e1a4d08baf9a
> although was latently present for a few releases now as the volume code
> isn't used that much probably.
> 
> Prior to my patch that added the check whether the pool is available we
> were blindly assuming that the pool was online. Pool drivers like
> gluster don't have their internal structures initialized if the pool
> isn't started and thus the translation would fail either way.
> 
> Also the translation function is called separately from the reconnection
> code, thus we can pass different arguments to it so we don't spoil the
> normal code paths with unnecessary delays and other stuff.
> 
> The question is what to do with domains that have storage on pools that
> can't be initialized. Should we kill those? Should we skip translation
> of the source and then something later may fail?

We do we need todo translation when restarting libvirtd ?  Surely we
should only do translation when initialy starting a guest, and then
remember that data thereafter. If we ever try re-translating data later
for a running guest, we risk getting different answers which would be
bad.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




More information about the libvir-list mailing list