[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] help debugging event timing

On 01/01/2014 03:44 PM, Eric Blake wrote:
> In my recent work on events, I'm running into something where I'm
> stumped on timing behavior.  I'm using
> virEventRegisterDefaultImpl()/virEventRunDefaultImpl() as my event loop,
> and trying to test that an event is firing correctly.

Ping.  I'm not seeing the delays in my 'virsh event' patches, where the
event loop is being handled in a separate thread; so the problem may
only be in situations where the event loop is being drained by a single
thread (pointing to a problem in libvirt.so and not in the server).  But
I'm still stumped on how to pinpoint what the problem is.

> With the test:///default (or any other local URI), if I register an
> event callback function, the callback gets invoked almost right after
> the event happens.  This is the same behavior under test in our
> testsuite (in objecteventtest.c). But with qemu:///system (or any other
> remote URI), when I register an event callback function, my callback
> does not get invoked until I make some other RPC call on the connection.
>  At first, I thought the events were not being wired up at all, and that
> I was just blocking forever because I missed something, but while
> stepping through gdb on the client side, I noticed that calling
> virConnectDomainEventDeregisterAny() in my shutdown code then indeed
> proceeded to go through the entire backlog of events that had been
> generated, but at that point I was no longer registered so my callback
> was not called.  So I ended up using virEventAddTimeout() to set up a
> periodic interrupt (at 10 seconds, slow enough to see the delays), where
> in the handler I called virConnectGetLibVersion() as a way to force RPC
> traffic and at least get events as a result of my timer.  If it matters,
> I was initially using keepalive_interval=-1 in
> /etc/libvirt/libvirtd.conf in order to avoid killing my debug sessions;
> but even when I switched to keepalive_interval=3, I'm still seeing the
> same delayed behavior (if the keepalive were to make a difference, I
> would have expected seeing the event at 3 seconds instead of my timer at
> 10 seconds).
> I thought that the RPC protocol was supposed to allow asynchronous
> events, and with minimal downtime.  gdb in libvirtd says that
> remoteDispatchObjectEventSend() is called in a timely manner, but that
> merely queues things up, and I got lost trying to follow where
> virNetSocketUpdateIOCallback() turned into something that wakes up the
> actual write() to the socket.  On the other side, I don't see a call to
> virDomainEventDispatchDefaultFunc() which then calls my callback until
> the next RPC call/response, but again, I got lost on trying to trace
> whether a read() of the socket was getting data in a timely manner but
> just not firing anything in my event loop, or whether things were
> getting stuck client side after the read but with nothing waking up my
> event loop.  Any advice on how to better debug where things are getting
> stuck, to determine whether the holdup is in the server not sending or
> in the client not being responsive to the send?  Is it something where
> systemtap might give me a better picture?  Or better yet, a patch to fix
> the problem would be nice.

Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]