[libvirt] high outage times for qemu virtio network links during live migration, trying to debug

Chris Friesen chris.friesen at windriver.com
Tue Jan 26 17:49:34 UTC 2016


On 01/26/2016 11:31 AM, Paolo Bonzini wrote:
>
>
> On 26/01/2016 18:21, Chris Friesen wrote:
>>>>
>>>> My question is, why doesn't qemu continue processing virtio packets
>>>> while the dirty page scanning and memory transfer over the network is
>>>> proceeding?
>>>
>>> QEMU (or vhost) _are_ processing virtio traffic, because otherwise you'd
>>> have no delay---only dropped packets.  Or am I missing something?
>>
>> I have separate timestamps embedded in the packet for when it was sent
>> and when it was echoed back by the target (which is the one being
>> migrated).  What I'm seeing is that packets to the guest are being sent
>> every msec, but they get delayed somewhere for over a second on the way
>> to the destination VM while the migration is in progress.  Once the
>> migration is over, a bunch of packets get delivered to the app in the
>> guest and are then processed all at once and echoed back to the sender
>> in a big burst (and a bunch of packets are dropped, presumably due to a
>> buffer overflowing somewhere).
>
> That doesn't exclude a bug somewhere in net/ code.  It doesn't pinpoint
> it to QEMU or vhost-net.
>
> In any case, what I would do is to use tracing at all levels (guest
> kernel, QEMU, host kernel) for packet rx and tx, and find out at which
> layer the hiccup appears.

Is there a straightforward way to trace packet processing in QEMU, preferably
with millisecond-accurate timestamps on each rx/tx event?
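
For reference, the measurement described in the quoted text above works
roughly like the sketch below. It is only an illustration, not the actual
test tool: the packet layout and the names PROBE, make_probe, echo_probe and
delays are all assumptions made for this example. The forward delay is only
meaningful if the sender and guest clocks are synchronized.

```python
import struct
import time

# Hypothetical wire format (an assumption, not the real tool's layout):
# sequence number, send time, echo time, all in network byte order.
PROBE = struct.Struct("!Idd")


def make_probe(seq, now=None):
    """Build a probe stamped with the sender's clock; echo time stays 0."""
    sent = time.time() if now is None else now
    return PROBE.pack(seq, sent, 0.0)


def echo_probe(data, now=None):
    """What the echo app in the guest would do: add its own timestamp."""
    seq, sent, _ = PROBE.unpack(data)
    echoed = time.time() if now is None else now
    return PROBE.pack(seq, sent, echoed)


def delays(data, now=None):
    """Return (seq, forward_delay, return_delay) in seconds."""
    seq, sent, echoed = PROBE.unpack(data)
    received = time.time() if now is None else now
    return seq, echoed - sent, received - echoed
```

With one probe sent per millisecond, a forward delay that jumps to over a
second while the return delay stays small would point at the path into the
guest rather than the path back out.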

Chris
