[libvirt] [PATCH v2] nodedev: Document the udevEventHandleThread

Erik Skultety eskultet at redhat.com
Fri Nov 2 12:05:44 UTC 2018


On Fri, Nov 02, 2018 at 07:46:35AM -0400, John Ferlan wrote:
>
>
> On 11/2/18 3:48 AM, Erik Skultety wrote:
> > On Thu, Nov 01, 2018 at 01:48:39PM -0400, John Ferlan wrote:
> >> Commit cdbe1332 neglected to document the API. So let's add some
> >> details about the algorithm and why it was used to help future
> >> readers understand the issues encountered. Based largely off a
> >> description provided by Erik Skultety <eskultet at redhat.com>.
> >
> > Oh, I missed ^this last sentence. Since you're blaming the commit already and
> > the description based on a mailing list thread will not be part of the log, you
> > should IMHO drop it.
> >
> >>
> >> NB: Management of the processing udev device notification is a
> >> delicate balance between the udev process, the scheduler, and when
> >> exactly the data from/for the socket is received. The balance is
> >> particularly important for environments when multiple devices are
> >> added into the system more or less simultaneously such as is done
> >> for mdev.
> >
> > "or SRIOV"
> >
> > Note: I can't really remember what I used for reproducer, mdevs or a SRIOV
> > card.
> >
> >> In these cases scheduler blocking on the udev recv() call
> >
> > I don't think scheduler is blocking anywhere, it's rather old libudev blocking
> > on the recv call ;)
> >
> >> is more noticeable. It's expected that future devices will follow
> >> similar algorithms. Even though the algorithm does present some
> >> challenges for older OS's (such as Centos 6), trying to rewrite
> >> the algorithm to fit both models would be more complex and involve
> >> pulling the monitor object out of the private data lockable object
> >> and would need to be guarded by a separate lock. Devising such an
> >> algorithm to work around issues with older OS's at the expense
> >> of more modern OS algorithms in newer event processing code may
> >> result in unexpected issues, so the choice is to encourage use
> >> of newer OS's with newer udev event processing code.
> >>
> >> Signed-off-by: John Ferlan <jferlan at redhat.com>
> >> ---
> >>
> >>  v1: https://www.redhat.com/archives/libvir-list/2018-October/msg01053.html
> >>
> >>  Changes are from code review with some minor tweaks/editing as I
> >>  went along. Mistakes are my own!
> >>
> >>  src/node_device/node_device_udev.c | 28 ++++++++++++++++++++++++++++
> >>  1 file changed, 28 insertions(+)
> >>
> >> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
> >> index 22897591de..f2c2299d4d 100644
> >> --- a/src/node_device/node_device_udev.c
> >> +++ b/src/node_device/node_device_udev.c
> >> @@ -1591,6 +1591,26 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv,
> >>  }
> >>
> >>
> >> +/**
> >> + * udevEventHandleThread
> >> + * @opaque: unused
> >> + *
> >> + * Thread to handle the udevEventHandleCallback processing when udev
> >> + * tells us there's a device change for us (add, modify, delete, etc).
> >> + *
> >> + * Once notified there is data to be processed, the actual @device
> >> + * data retrieval by libudev may be delayed due to how threads are
> >> + * scheduled. In fact, the event loop could be scheduled earlier than
> >> + * the handler thread, thus potentially emitting the very same event
> >> + * the handler thread is currently trying to process, simply because
> >> + * the data hadn't been retrieved from the socket.
> >> + *
> >
> > ...
> >
> >> + * NB: Usage of event based socket algorithm causes some issues with
> >> + * older platforms such as CentOS 6 in which the data socket is opened
> >
> > ^Sounds a bit too generic, as an event based algorithm is not forced to open a
> > socket without O_NONBLOCK - just put something like "older platforms' libudev
> > opens sockets without the NONBLOCK flag which might cause issues with event
> > based algorithm" or something alike in there.
> >
>
>  * NB: Some older distros, such as CentOS 6, libudev opens sockets
>  * without the NONBLOCK flag which might cause issues with event
>  * based algorithm. Although the issue can be mitigated by resetting
>  * priv->dataReady for each event found; however, the scheduler issues
>  * would still come into play.

Sounds good...

Erik




More information about the libvir-list mailing list