[libvirt] [PATCH 1/3] Domain events - primary implementation

Daniel P. Berrange berrange at redhat.com
Thu Oct 9 17:50:15 UTC 2008


On Thu, Oct 09, 2008 at 11:22:38AM -0400, Ben Guthro wrote:
> This patch is somewhat large, and touches many files.

For large patches like this, its useful to split it up
in order of:

 - The public header API & libvirt.c/driver.c glue layer
 - The daemon code
 - One patch per internal driver

I hope the review that follows doesn't come off too negative - its
basically all just hinging around one core point. Namely that the
tracking & dispatch of callbacks needs to be pushed further down
into the drivers themselves. This is to avoid the driver having to
keep so many references back to the public facing objects & state
which rather tangles things up.


> diff --git a/include/libvirt/libvirt.h b/include/libvirt/libvirt.h
> index d519452..9b6e1da 100644
> --- a/include/libvirt/libvirt.h
> +++ b/include/libvirt/libvirt.h
> @@ -987,6 +987,50 @@ char *                  virStorageVolGetXMLDesc         (virStorageVolPtr pool,
>  
>  char *                  virStorageVolGetPath            (virStorageVolPtr vol);
>  
> +/*
> + * Domain Event Notification
> + */
> +
> +typedef enum {
> +      VIR_DOMAIN_EVENT_ADDED,
> +      VIR_DOMAIN_EVENT_REMOVED,
> +      VIR_DOMAIN_EVENT_STARTED,
> +      VIR_DOMAIN_EVENT_SUSPENDED,
> +      VIR_DOMAIN_EVENT_RESUMED,
> +      VIR_DOMAIN_EVENT_STOPPED,
> +      VIR_DOMAIN_EVENT_SAVED,
> +      VIR_DOMAIN_EVENT_RESTORED,
> +} virDomainEventType;
> +
> +typedef int (*virConnectDomainEventCallback)(virConnectPtr conn,
> +                                             virDomainPtr dom,
> +                                             int event,
> +                                             void *opaque);
> +
> +int virConnectDomainEventRegister(virConnectPtr conn,
> +                                  virConnectDomainEventCallback cb,
> +                                  void *opaque);
> +
> +int virConnectDomainEventDeregister(virConnectPtr conn,
> +                                    virConnectDomainEventCallback cb);
> +
> +/**
> + * virEventHandleCallback: callback for receiving file handle events
> + *
> + * @fd: file handle on which the event occurred
> + * @events: bitset of events from POLLnnn constants

We probably want to explicit define these constants in our header
file, since now this is public we need it to be portable, regardless
of the internal impl.

> diff --git a/qemud/qemud.h b/qemud/qemud.h
> index 91cb939..63784cd 100644
> --- a/qemud/qemud.h
> +++ b/qemud/qemud.h
> @@ -132,6 +132,10 @@ struct qemud_client {
>       */
>      virConnectPtr conn;
>  
> +    /* This client supports events, and has registered for at least
> +       one event type. This is a bitmask of requested event types */
> +    int events_registered;

I'm not sure why the daemon needs to know this. I'd expect
that it just gets a RPC call for each virConnectDomainEventRegister()
and virConnectDomainEventDeregister() invocation, and pass these
straight into the 'virConnectPtr' it has, not register one global
callback and try to filter events itself. 

>  
> +static int remoteRelayDomainEvent (virConnectPtr conn ATTRIBUTE_UNUSED,
> +                                   virDomainPtr dom,
> +                                   int event,
> +                                   void *opaque)
> +{
> +    struct qemud_server *server = opaque;
> +    REMOTE_DEBUG("Relaying domain event %d", event);
> +
> +    struct qemud_client *c = server->clients;
> +    while(c) {
> +        if ( c->conn == conn &&
> +             (c->events_registered & virDomainEvent) ) {

This check shoudln't be needed if you're only registering the
callbacks when the client actually asks for them.

> +            remoteDispatchDomainEventSend (c, dom, event);
> +            qemudDispatchClientWrite(server,c);
> +        } else {
> +            REMOTE_DEBUG("Event class %d not registered for client", virDomainEvent);
> +        }
> +        c = c->next;
> +    }
> +    return 0;
> +}

> @@ -436,6 +469,11 @@ remoteDispatchOpen (struct qemud_server *server ATTRIBUTE_UNUSED,
>          ? virConnectOpenReadOnly (name)
>          : virConnectOpen (name);
>  
> +    /* Register event delivery callback */
> +    if(client->conn) {
> +        REMOTE_DEBUG("%s","Registering to relay remote events");
> +        virConnectDomainEventRegister(client->conn, remoteRelayDomainEvent, server);
> +    }
>      return client->conn ? 0 : -1;
>  }

Hence this shouldn't be needed at time of open.

> +
> +/***************************
> + * Enabe / disable event classes
> + ***************************/
> +static int remoteDispatchEventsEnable (struct qemud_server *server ATTRIBUTE_UNUSED,
> +                                       struct qemud_client *client,
> +                                       remote_message_header *req ATTRIBUTE_UNUSED,
> +                                       remote_events_enable_args *args,
> +                                       remote_events_enable_ret *ret)
> +{
> +    CHECK_CONN(client);
> +    if(args->enable) {
> +        client->events_registered |= args->event_class;
> +    } else {
> +        client->events_registered &= ~args->event_class;
> +    }
> +    ret->success = 1;
> +    return 0;
> +}

This is where we should be registering the callbacks. And equivalent RPC
handlers for unregistering. This also doesn't take into account that
a single virConnectPtr object could have multiple  callbacks registered
to it. eg, if you register 2 callbacks, and then unregister one, the
2nd will no longer get events either.

> diff --git a/qemud/remote_protocol.x b/qemud/remote_protocol.x
> index f848ae5..09acd80 100644
> --- a/qemud/remote_protocol.x
> +++ b/qemud/remote_protocol.x
> @@ -965,6 +965,21 @@ struct remote_storage_vol_get_path_ret {
>      remote_nonnull_string name;
>  };
>  
> +/* Events */
> +struct remote_events_enable_args {
> +    int event_class;
> +    int enable;
> +};

Rather than having an enable flag, the wire protocol should mirror the
public API, with register & unregister RPC calls.

> +
> +struct remote_events_enable_ret {
> +    int success;
> +};

The success flag is redundant - we always serialize an explicit
error reply if something fails.

> diff --git a/src/driver.h b/src/driver.h
> index 655cd05..005fe03 100644
> --- a/src/driver.h
> +++ b/src/driver.h
> @@ -40,6 +40,13 @@ typedef enum {
>      VIR_DRV_OPEN_ERROR = -2,
>  } virDrvOpenStatus;
>  
> +
> +/* Event Classes. (bitmasked value) */
> +typedef enum {
> +    virDomainEvent = 1,
> +    virNodeEvent = 2, /* NYI */
> +} virEventClass;
> +
>  /* Feature detection.  This is a libvirt-private interface for determining
>   * what features are supported by the driver.
>   *
> @@ -280,6 +287,15 @@ typedef unsigned long long
>      (*virDrvNodeGetFreeMemory)
>                      (virConnectPtr conn);
>  
> +typedef int
> +    (*virDrvEventsEnableEventClass)
> +                    (virConnectPtr conn,
> +                     virEventClass event_class,
> +                     int enable);
> +
> +typedef int
> +    (*virDrvDomainEventEmitted)
> +                    (virDomainEventType evt);

Likewise the internal driver API shoudl mirror the public API
pretty much 1-for-1, rather than multiplexing multiple public
APIs onto 1 private API. 

>  /**
>   * _virDriver:
>   *
> @@ -296,6 +312,8 @@ struct _virDriver {
>      int	       no;	/* the number virDrvNo */
>      const char * name;	/* the name of the driver */
>      unsigned long ver;	/* the version of the backend */
> +    virConnectPtr conns; /* the list of active connections */
> +

The static driver callback records should not need to know 
this information - in fact they should be declared 'const'

Any registered callbacks should be tracked internally to the
drivers - if applicable. In the Xen driver there's no need
to track all virConnect objects because each virConnect 
instance will explicitly be getting events from xenstored
independantly of any other virConnect object. 

For stateful drivers, like QEMU, the callbacks should eb tracked
in the internal driver state - eg 'struct qemud_driver' or 
the equivalent.

> diff --git a/src/event.c b/src/event.c
> index 49a9e61..9a39ab7 100644
> --- a/src/event.c
> +++ b/src/event.c
> @@ -70,16 +70,28 @@ int virEventRemoveTimeout(int timer) {
>      return removeTimeoutImpl(timer);
>  }
>  
> -void __virEventRegisterImpl(virEventAddHandleFunc addHandle,
> -                            virEventUpdateHandleFunc updateHandle,
> -                            virEventRemoveHandleFunc removeHandle,
> -                            virEventAddTimeoutFunc addTimeout,
> -                            virEventUpdateTimeoutFunc updateTimeout,
> -                            virEventRemoveTimeoutFunc removeTimeout) {
> +void virEventRegisterHandleImpl(virEventAddHandleFunc addHandle,
> +                                virEventUpdateHandleFunc updateHandle,
> +                                virEventRemoveHandleFunc removeHandle) {
>      addHandleImpl = addHandle;
>      updateHandleImpl = updateHandle;
>      removeHandleImpl = removeHandle;
> +}
> +
> +void __virEventRegisterTimeoutImpl(virEventAddTimeoutFunc addTimeout,
> +                                  virEventUpdateTimeoutFunc updateTimeout,
> +                                  virEventRemoveTimeoutFunc removeTimeout) {
>      addTimeoutImpl = addTimeout;
>      updateTimeoutImpl = updateTimeout;
>      removeTimeoutImpl = removeTimeout;
>  }
> +
> +void __virEventRegisterImpl(virEventAddHandleFunc addHandle,
> +                            virEventUpdateHandleFunc updateHandle,
> +                            virEventRemoveHandleFunc removeHandle,
> +                            virEventAddTimeoutFunc addTimeout,
> +                            virEventUpdateTimeoutFunc updateTimeout,
> +                            virEventRemoveTimeoutFunc removeTimeout) {
> +    virEventRegisterHandleImpl(addHandle,  updateHandle,  removeHandle);
> +    virEventRegisterTimeoutImpl(addTimeout, updateTimeout, removeTimeout);
> +}

There's no need to keep around this API now we've split it up - it was an
internal only symbol, so just update the libvirtd code to call to the
new APIs instead.

> diff --git a/src/internal.h b/src/internal.h
> index a3d48fa..67a3e5b 100644
> --- a/src/internal.h
> +++ b/src/internal.h
> @@ -191,6 +191,18 @@ extern int debugFlag;
>  #define VIR_IS_STORAGE_VOL(obj)		((obj) && (obj)->magic==VIR_STORAGE_VOL_MAGIC)
>  #define VIR_IS_CONNECTED_STORAGE_VOL(obj)	(VIR_IS_STORAGE_VOL(obj) && VIR_IS_CONNECT((obj)->conn))
>  
> +/**
> + * Domain Event Callbacks
> + */
> +struct _virDomainEventCallbackList {
> +    virConnectPtr conn;
> +    virConnectDomainEventCallback cb;
> +    void *opaque;
> +    struct _virDomainEventCallbackList *next;
> +};
> +
> +typedef struct _virDomainEventCallbackList virDomainEventCallbackList;
> +
>  /*
>   * arbitrary limitations
>   */
> @@ -237,6 +249,12 @@ struct _virConnect {
>      virHashTablePtr storagePools;/* hash table for known storage pools */
>      virHashTablePtr storageVols;/* hash table for known storage vols */
>      int refs;                 /* reference count */
> +
> +    /* Domain Callbacks */
> +    virDomainEventCallbackList *domainEventCallbacks;
> +
> +    /* link to next conn of this driver type */
> +    struct _virConnect *next;
>  };

This doesn't make any sense to me - you're making all the virConnect
objects cross reference each other. Each object only needs to know
about the callbacks registered to itself - it doesn't care about
callbacks other virConnect objects have. I'd expect something 
more like

   struct _virDomainEventCallback {
      virConnectDomainEventCallback cb;
      void *opaque;
   };
   typedef struct _virDomainEventCallback virDomainEventCallback;
   typedef virDomainEventCallback *virDomainEventCallbackPtr;

   struct _virDomainEventCallbackList {
     unsigned int ndomainCallbacks;
     struct __virDomainEventCallbackPtr *domainCallbacks;
   }
   typedef struct _virDomainEventCallbackList virDomainEventCallbackList;

I used a explicit array here, because we're trying to kill off the linked
lists, since it simplifies thread safety work, to be able to lock individual
objects, and not worry about its peers pointing to it - only its parent.

The internal drivers would then have virDomainEventCallbackList intances
as they need - eg, in the _xenUnifiedPrivate struct for Xen drivers, or
in the 'struct qemud_driver' for QEMU, etc.

> diff --git a/src/libvirt.c b/src/libvirt.c
> index e06e9f3..9472646 100644
> --- a/src/libvirt.c
> +++ b/src/libvirt.c
> @@ -785,6 +785,8 @@ do_open (const char *name,
>          if (res == VIR_DRV_OPEN_ERROR) goto failed;
>          else if (res == VIR_DRV_OPEN_SUCCESS) {
>              ret->driver = virDriverTab[i];
> +            ret->next = ret->driver->conns;
> +            ret->driver->conns = ret;

The virDriver tables should be considered const & not changed.


>  
>      if (virUnrefConnect(conn) < 0)
> @@ -1427,6 +1442,7 @@ virDomainLookupByName(virConnectPtr conn, const char *name)
>  int
>  virDomainDestroy(virDomainPtr domain)
>  {
> +    int ret;
>      virConnectPtr conn;
>  
>      DEBUG("domain=%p", domain);
> @@ -1442,8 +1458,14 @@ virDomainDestroy(virDomainPtr domain)
>          return (-1);
>      }
>  
> -    if (conn->driver->domainDestroy)
> -        return conn->driver->domainDestroy (domain);
> +    if (conn->driver->domainDestroy) {
> +        ret = conn->driver->domainDestroy (domain);
> +        if(!ret &&
> +            conn->driver->domainEventEmitted &&
> +            !conn->driver->domainEventEmitted(VIR_DOMAIN_EVENT_STOPPED))
> +            virBroadcastDomainEvent(domain, VIR_DOMAIN_EVENT_STOPPED);
> +        return ret;
> +    }

This and all the other chunks doing the same thing are not required.

When you call down into the internal driver's 'domainDestroy' method
it should take care of raising events (if required). In the case of
Xen, there'll be no need for the driver itself to ever raise events,
because these get raised indirectly by the xenstore watches. In the
case of the QEMU driver (and other stateful drivers), the intenral
driver state (eg, struct qemud_driver) should have a list of all 
active callbacks, and will invoke them when domains start/die/pause
etc. 

So explicitly raising events in the libvirt.c file is unneccessary,
and can actually result in bogus and/or missed events. eg if you
send SIGHUP to the daemon it will reload QEMU config files and 
autostart guests. None of this goes via the libvirt.c file, so
if we were to rely on libvirt.c for raising events that would 
imply it would miss many events.

Or in the domain destroy example - an admin can destroy a domain
outside of libvirt by simply sending kill -TERM to the QEMU process.
So again, the qemu driver itself must take responsbility for 
raising the events. I see further down you've already got the QEMU
driver raising events when needed, so I imagine you can just delete
all this stuff in libvirt.c

> + * virConnectDomainEventRegister:
> + * @conn: pointer to the connection
> + * @cb: callback to the function handling domain events
> + * @opaque: opaque data to pass on to the callback
> + *
> + * Adds a Domain Event Callback
> + *
> + * Returns 0 on success, -1 on failure
> + */
> +int virConnectDomainEventRegister(virConnectPtr conn,
> +                                  virConnectDomainEventCallback cb,
> +                                  void *opaque)
> +{
> +    int ret = 0;
> +    int first = conn->domainEventCallbacks == NULL;
> +    virDomainEventCallbackList *newNode;
> +
> +    if (VIR_ALLOC(newNode) < 0)
> +        return -1;
> +
> +    newNode->next   = conn->domainEventCallbacks;
> +    newNode->conn   = conn;
> +    newNode->cb     = cb;
> +    newNode->opaque = opaque;
> +    conn->domainEventCallbacks = newNode;

The callbacks should be passed right down into the drivers for
them to track as they need. 

> +
> +    /* Registering for a domain callback will enable delivery by default */
> +    if (conn->driver && conn->driver->enableEventClass && first)
> +        ret = conn->driver->enableEventClass (conn, virDomainEvent, 1);

And the internal method naming should map to the public API naming.

> diff --git a/src/qemu_conf.h b/src/qemu_conf.h
> index 88dfade..4d22119 100644
> --- a/src/qemu_conf.h
> +++ b/src/qemu_conf.h
> @@ -68,6 +68,7 @@ struct qemud_driver {
>      char *vncListen;
>  
>      virCapsPtr caps;
> +    virDriverPtr vir_driver;
>  };

The QEMU driver needs to be tracking the callbacks against its
'struct qemud_driver' struct explicitly - the virDriver should
be considered readonly.

>  
> +static int qemudDomainEventSupported(virDomainEventType evt)
> +{
> +    switch(evt) {
> +        case VIR_DOMAIN_EVENT_STARTED:
> +        case VIR_DOMAIN_EVENT_STOPPED:
> +        case VIR_DOMAIN_EVENT_SUSPENDED:
> +        case VIR_DOMAIN_EVENT_RESUMED:
> +        case VIR_DOMAIN_EVENT_SAVED:
> +        case VIR_DOMAIN_EVENT_RESTORED:
> +            DEBUG("%s: %d", __FUNCTION__, (int)evt);
> +            return true;
> +        default:
> +            return false;
> +    }
> +    return false;
> +}

Not sure I see where this is exposed / used in the public API ?

> diff --git a/src/remote_internal.c b/src/remote_internal.c
> index 06b0f4f..2501e1f 100644
> --- a/src/remote_internal.c
> +++ b/src/remote_internal.c


>  static int really_write (virConnectPtr conn, struct private_data *priv,
> @@ -4362,19 +4395,30 @@ call (virConnectPtr conn, struct private_data *priv,
>      }
>      xdr_destroy (&xdr);
>  
> +    /* Lock on the connection semaphore, so we do not pull
> +     * data off the wire if an async event fires while we
> +     * are waiting on the response */
> +    pthread_mutex_lock(&priv->lock);
> +
>      /* Send length word followed by header+args. */
>      if (really_write (conn, priv, flags & REMOTE_CALL_IN_OPEN, buffer2, sizeof buffer2) == -1 ||
> -        really_write (conn, priv, flags & REMOTE_CALL_IN_OPEN, buffer, len-4) == -1)
> +        really_write (conn, priv, flags & REMOTE_CALL_IN_OPEN, buffer, len-4) == -1) {
> +        pthread_mutex_unlock(&priv->lock);
>          return -1;
> +        }
>  
> +retry_read:
>      /* Read and deserialise length word. */
> -    if (really_read (conn, priv, flags & REMOTE_CALL_IN_OPEN, buffer2, sizeof buffer2) == -1)
> +    if (really_read (conn, priv, flags & REMOTE_CALL_IN_OPEN, buffer2, sizeof buffer2) == -1) {
> +        pthread_mutex_unlock(&priv->lock);
>          return -1;
> +    }
>  
>      xdrmem_create (&xdr, buffer2, sizeof buffer2, XDR_DECODE);
>      if (!xdr_int (&xdr, &len)) {
>          error (flags & REMOTE_CALL_IN_OPEN ? NULL : conn,
>                 VIR_ERR_RPC, _("xdr_int (length word, reply)"));
> +        pthread_mutex_unlock(&priv->lock);
>          return -1;
>      }
>      xdr_destroy (&xdr);
> @@ -4385,18 +4429,22 @@ call (virConnectPtr conn, struct private_data *priv,
>      if (len < 0 || len > REMOTE_MESSAGE_MAX) {
>          error (flags & REMOTE_CALL_IN_OPEN ? NULL : conn,
>                 VIR_ERR_RPC, _("packet received from server too large"));
> +        pthread_mutex_unlock(&priv->lock);
>          return -1;
>      }
>  
>      /* Read reply header and what follows (either a ret or an error). */
> -    if (really_read (conn, priv, flags & REMOTE_CALL_IN_OPEN, buffer, len) == -1)
> +    if (really_read (conn, priv, flags & REMOTE_CALL_IN_OPEN, buffer, len) == -1) {
> +        pthread_mutex_unlock(&priv->lock);
>          return -1;
> +    }
>  
>      /* Deserialise reply header. */
>      xdrmem_create (&xdr, buffer, len, XDR_DECODE);
>      if (!xdr_remote_message_header (&xdr, &hdr)) {
>          error (flags & REMOTE_CALL_IN_OPEN ? NULL : conn,
>                 VIR_ERR_RPC, _("invalid header in reply"));
> +        pthread_mutex_unlock(&priv->lock);
>          return -1;
>      }
>  
> @@ -4407,6 +4455,7 @@ call (virConnectPtr conn, struct private_data *priv,
>                           VIR_ERR_RPC, VIR_ERR_ERROR, NULL, NULL, NULL, 0, 0,
>                           _("unknown program (received %x, expected %x)"),
>                           hdr.prog, REMOTE_PROGRAM);
> +        pthread_mutex_unlock(&priv->lock);
>          return -1;
>      }
>      if (hdr.vers != REMOTE_PROTOCOL_VERSION) {
> @@ -4415,19 +4464,30 @@ call (virConnectPtr conn, struct private_data *priv,
>                           VIR_ERR_RPC, VIR_ERR_ERROR, NULL, NULL, NULL, 0, 0,
>                           _("unknown protocol version (received %x, expected %x)"),
>                           hdr.vers, REMOTE_PROTOCOL_VERSION);
> +        pthread_mutex_unlock(&priv->lock);
>          return -1;
>      }
>  
> -    /* If we extend the server to actually send asynchronous messages, then
> -     * we'll need to change this so that it can recognise an asynch
> -     * message being received at this point.
> -     */
> +    if (hdr.proc == REMOTE_PROC_EVENTS_DOMAIN_EVENT &&
> +        hdr.direction == REMOTE_MESSAGE) {
> +        /* An async message has come in while we were waiting for the
> +         * response. Process it to pull it off the wire, and try again
> +         */
> +        DEBUG0("Encountered an event while waiting for a response");
> +
> +        remoteDomainProcessEvent(conn, &xdr);
> +
> +        DEBUG0("Retrying read");
> +        xdr_destroy (&xdr);
> +        goto retry_read;
> +    }

There's a small safety problem here - it is not re-entrant safe. eg the
app could be in a virDomainCreate() call, get an event saying that a
domain has paused, invoke its callback, and the callback then calls
into another libvirt API, ad-infinitum. Aside from the mutex locks 
here in the call() method, this is also cause problems with locks in
the application itself which we can't rely on being recursive mutexes.

So for sake of safety, we need to queue up all events we receive while
reading a RPC reply. Then at the end of the call() method, register a
timeout event with 0 second timeout. So when the current libvirt API
call completes, and the application goes back into its event loop,
the timeout will immediately fire, and we can dispatch the queued
libvirt events from the known safe context.

In fat I think this scheme ought to let you do away with the mutex
locking completely. The contract for virConnectPtr dictates that
you are forbidden to use a single virConnectPtr object from more than
one thread at once, so if we're queueing & dispatching events from
and timeout handler, we shouldn't ever get a reentrancy/locking 
problem.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|




More information about the libvir-list mailing list