[Libvirt-cim] [PATCH] [RFC] Implement libvirt event callback management

Hollis Blanchard hollisb at us.ibm.com
Tue Aug 18 23:45:49 UTC 2009


On Tue, 2009-08-18 at 16:24 -0700, Kaitlin Rupert wrote:
> Hi Hollis,
> 
> I applied your patch and the patches Richard submitted to do some 
> testing. I noticed a few issues commented below...
> 
> 
> > +/* Delete all watches marked for deletion. */
> > +static void event_watch_free_deleted(void)
> > +{
> > +        struct watch *cur;
> > +        struct watch **link;
> > +
> > +        CU_DEBUG("%s", __func__);
> > +
> > +        pthread_mutex_lock(&watch_list_mutex);
> > +
> > +        cur = watch_list; 
> > +        link = &watch_list;
> > +        while (cur != NULL) {
> > +                struct watch *next = cur->next;
> > +
> > +                if (cur->deleted) {
> > +                        *link = next;
> > +
> > +                        cur->ff(cur->opaque);
> 
> I'm seeing a seg fault here because once we reach this point, cur is NULL.
> 
> > +                        free(cur);
> > +                        watch_count--;
> > +                } else
> > +                        link = &cur->next;
> > +
> > +                cur = next;
> > +        }
> > +
> > +        pthread_mutex_unlock(&watch_list_mutex);
> > +}
> > +

Am I missing something obvious? The loop only runs while cur is not
NULL.

Moreover, the list is locked, so concurrent list modifications wouldn't
even explain it... unless the other user didn't take the lock (which we
do on purpose sometimes). From the CU_DEBUG log, can you tell if there's
any concurrency going on?

> > +/* One thread to watch all fds for all events for all libvirt threads. */
> > +static void *event_thread(void *ptr)
> > +{
> > +        while (1) {
> > +                struct watch *cur;
> > +                struct pollfd *pollfds;
> > +                struct pollfd *pollfd;
> > +                int timeout;
> > +                int i;
> > +
> > +                pollfds = malloc(sizeof(struct pollfd) * watch_count);
> > +
> > +                /* fill in pollfds array from our watch list */
> > +                for (pollfd = &pollfds[0], cur = watch_list;
> > +                     cur != NULL;
> > +                     pollfd++, cur = cur->next) {
> > +                        pollfd->fd = cur->fd;
> > +                        pollfd->events = libvirt_to_poll_events(cur->events);
> > +                }
> > +
> > +                timeout = event_next_timeout();
> > +
> > +                poll(pollfds, watch_count, timeout);
> > +
> > +                /* invoke callbacks */
> > +                for (i = 0; i < watch_count; i++)
> > +                        for (cur = watch_list; cur != NULL; cur = cur->next)
> > +                                if (cur->fd == pollfds[i].fd
> 
> When I generate a event, poll never seems to catch it.  I tried forcing 
> this by changing the timeout value to a minute.  And then generating the 
> event. I can see from the libvirt debug that the event has been generated:
> 
> 15:00:34.232: debug : virEventRunOnce:567 : Poll got 1 event
> 15:00:34.239: debug : virEventDispatchHandles:450 : Dispatch n=2 f=8 w=3 
> e=1 0x7f2a57d6e6b0

These messages are from qemud? So the event is generated on the other
side of the "remote" pipe?

> In the eventAddHandle() call, we have:
> 
> event.c(75): eventAddHandle
> event.c(82): ++++++++++++++watch->id is 0
> event.c(84): ++++++++++++++watch->fd is 11
> event.c(86): ++++++++++++++watch->events is 1
> event.c(88): ++++++++++++++watch->cb is 0x6b8f4c0, cb is 0x6b8f4c0
> event.c(90): ++++++++++++++watch->opaque is 0x7fffdc014900

These seem reasonable, so we should have a list with one entry.

> 
> I haven't tracked down what is happening here.
> 
> 
> > +                                    && !cur->deleted) {
> > +                                        invoke_callback(cur, &pollfds[i]);
> > +                                        break;
> > +                                }
> > +
> > +                free(pollfds);
> > +
> > +                event_watch_free_deleted();
> > +                event_timer_free_deleted();
> > +        }
> > +
> > +        return NULL;
> > +}

Hmm, actually I'd expect the opposite: this code seems to *always*
invoke the callback, even when no event is pending (oops!).

Since that's not happening, it may be that the pollfds structure isn't
being created properly, in particular the fd member.

Another possibility is that watch_count is somehow 0, which would also
hose pollfds allocation...

I guess strace output (for poll()), CU_DEBUG logs, and some gdb work
would help.

-- 
Hollis Blanchard
IBM Linux Technology Center




More information about the Libvirt-cim mailing list