[libvirt PATCH] nodedev: fix race in API usage vs initial device enumeration

Daniel P. Berrangé berrange at redhat.com
Mon Mar 16 16:09:01 UTC 2020


On Mon, Mar 16, 2020 at 05:03:59PM +0100, Michal Prívozník wrote:
> On 13. 3. 2020 13:00, Daniel P. Berrangé wrote:
> > During startup the udev node device driver impl uses a background thread
> > to populate the list of devices to avoid blocking the daemon startup
> > entirely. There is no synchronization to the public APIs, so it is
> > possible for an application to start calling APIs before the device
> > initialization is complete.
> > 
> > This was not a problem in the old approach where libvirtd was started
> > on boot, as initialization would easily complete before any APIs were
> > called.
> > 
> > With the use of socket activation, however, APIs are invoked from the
> > very moment the daemon starts. This is easily seen by doing a
> > 
> >   'virsh -c nodedev:///system list'
> > 
> > the first time it runs it will only show one or two devices. The second
> > time it runs it will show all devices. The solution is to introduce a
> > flag and condition variable for APIs to synchronize against before
> > returning any data.
> > 
> > Signed-off-by: Daniel P. Berrangé <berrange at redhat.com>
> > ---
> >  src/conf/virnodedeviceobj.h          |  2 ++
> >  src/node_device/node_device_driver.c | 44 ++++++++++++++++++++++++++++
> >  src/node_device/node_device_hal.c    | 15 ++++++++++
> >  src/node_device/node_device_udev.c   | 13 ++++++++
> >  4 files changed, 74 insertions(+)
> > 
> 
> For both drivers:
> 
> > @@ -1744,6 +1745,11 @@ nodeStateInitializeEnumerate(void *opaque)
> >      if (udevEnumerateDevices(udev) != 0)
> >          goto error;
> >  
> > +    nodeDeviceLock();
> > +    driver->initialized = true;
> > +    nodeDeviceUnlock();
> > +    virCondBroadcast(&driver->initCond);
> 
> Should this broadcast be in the critical section? Just asking, for this
> simple case it may be okay.

The critical section protects the updating of the master variable,
"initialized" in this case.

The condition variable signalling just wakes up the other thread
which then acquires this same mutex to read "initialized". Spurious
wake ups are permitted with condition variables - that's why the
waiter side of a condition variable always uses a while() loop
to check the variable & optionally go back to sleep

IOW, condition variable signalling does not need to be part of
any critical section. It merely a means to kick another thread.

> > @@ -1806,6 +1812,13 @@ nodeStateInitialize(bool privileged,
> >          VIR_FREE(driver);
> >          return VIR_DRV_STATE_INIT_ERROR;
> >      }
> > +    if (virCondInit(&driver->initCond) < 0) {
> > +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> > +                       _("Unable to initialize condition variable"));
> 
> And perhaps, virReportSystemError() would be nicer.

Yeah, we really ought to just push the error reporting down a level
into the method, as every caller repeats this.
> 
> Reviewed-by: Michal Privoznik <mprivozn at redhat.com>
> 
> Michal

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list