[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH 2/5] nodedev: udev: Remove the udevEventHandleCallback on fatal error




On 07/26/2017 02:44 AM, Erik Skultety wrote:
> So we have a sanity check for the udev monitor fd. Theoretically, it
> could happen that the udev monitor fd changes (due to our own wrongdoing,
> hence the 'sanity' here) and if that happens it means we are handling an
> event from a different entity than we think, thus we should remove the
> handle if someone somewhere somehow hits this hypothetical case.
> 
> Signed-off-by: Erik Skultety <eskultet redhat com>
> ---
>  src/node_device/node_device_udev.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 

Should we perhaps "try again" either here (or in nodeStateReload if we
find priv->watch == -1)?  I know a separate patch, but nodedev is fairly
braindead without the udev event handling. In fact, without it being set
up initially, initialization fails.

> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
> index 80c346e96..ea10dc3ce 100644
> --- a/src/node_device/node_device_udev.c
> +++ b/src/node_device/node_device_udev.c
> @@ -1611,10 +1611,19 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
>  
>      udev_fd = udev_monitor_get_fd(udev_monitor);
>      if (fd != udev_fd) {
> +        udevPrivate *priv = driver->privateData;
> +
>          virReportError(VIR_ERR_INTERNAL_ERROR,
>                         _("File descriptor returned by udev %d does not "
>                           "match node device file descriptor %d"),
>                         fd, udev_fd);
> +
> +        /* this is a non-recoverable error, let's remove the handle, so that we
> +         * don't get in here again because of some spurious behaviour and report
> +         * the same error multiple times
> +         */
> +        virEventRemoveHandle(priv->watch);
> +

Set priv->watch = -1 so that nodeStateCleanup doesn't retry.

Although I think what ends up happening in patch 5 perhaps "resolves"
some weird corner condition...

>          goto cleanup;
>      }
>  
> 

Reviewed-by: John Ferlan <jferlan redhat com>

since this is "better" than previously w/r/t not getting notified
multiple times, fine, but I would hope we could do something to restart
the event mgmt and perhaps even "re" enumerate the devices, but would
need a "testable way" to make it happen.  Still I wonder if comments in
patch 5 end up making this go away.

John


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]