[libvirt] [PATCH] Fix race condition reconnecting to vms & loading configs

Cole Robinson crobinso at redhat.com
Mon Oct 28 17:22:39 UTC 2013


On 10/28/2013 01:14 PM, Daniel P. Berrange wrote:
> On Mon, Oct 28, 2013 at 01:08:45PM -0400, Cole Robinson wrote:
>> On 10/28/2013 01:06 PM, Daniel P. Berrange wrote:
>>> On Mon, Oct 28, 2013 at 01:03:49PM -0400, Cole Robinson wrote:
>>>> On 10/28/2013 07:52 AM, Daniel P. Berrange wrote:
>>>>> From: "Daniel P. Berrange" <berrange at redhat.com>
>>>>>
>>>>> The following sequence
>>>>>
>>>>>  1. Define a persistent QEMU guest
>>>>>  2. Start the QEMU guest
>>>>>  3. Stop libvirtd
>>>>>  4. Kill the QEMU process
>>>>>  5. Start libvirtd
>>>>>  6. List persistent guests
>>>>>
>>>>> At the last step, the previously running persistent guest
>>>>> will be missing. This is because of a race condition in the
>>>>> QEMU driver startup code. It does
>>>>>
>>>>>  1. Load all VM state files
>>>>>  2. Spawn thread to reconnect to each VM
>>>>>  3. Load all VM config files
>>>>>
>>>>> Only at the end of step 3, does the 'virDomainObjPtr' get
>>>>> marked as "persistent". There is therefore a window where
>>>>> the thread reconnecting to the VM will remove the persistent
>>>>> VM from the list.
>>>>>
>>>>> The easy fix is to simply switch the order of steps 2 & 3.
>>>>>
>>>>> Signed-off-by: Daniel P. Berrange <berrange at redhat.com>
>>>>> ---
>>>>>  src/qemu/qemu_driver.c | 3 +--
>>>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>>>> index c613967..9c3daad 100644
>>>>> --- a/src/qemu/qemu_driver.c
>>>>> +++ b/src/qemu/qemu_driver.c
>>>>> @@ -816,8 +816,6 @@ qemuStateInitialize(bool privileged,
>>>>>  
>>>>>      conn = virConnectOpen(cfg->uri);
>>>>>  
>>>>> -    qemuProcessReconnectAll(conn, qemu_driver);
>>>>> -
>>>>>      /* Then inactive persistent configs */
>>>>>      if (virDomainObjListLoadAllConfigs(qemu_driver->domains,
>>>>>                                         cfg->configDir,
>>>>> @@ -828,6 +826,7 @@ qemuStateInitialize(bool privileged,
>>>>>                                         NULL, NULL) < 0)
>>>>>          goto error;
>>>>>  
>>>>> +    qemuProcessReconnectAll(conn, qemu_driver);
>>>>>  
>>>>>      virDomainObjListForEach(qemu_driver->domains,
>>>>>                              qemuDomainSnapshotLoad,
>>>>>
>>>>
>>>> I tried testing this patch to see if it would fix:
>>>>
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1015246
>>>>
>>>> from current master I did:
>>>>
>>>> git revert a924d9d083c215df6044387057c501d9aa338b96
>>>> reproduce the bug
>>>> git am <your-patch>
>>>>
>>>> But the daemon won't even start up after your patch is built:
>>>>
>>>> (gdb) bt
>>>> #0  qemuMonitorOpen (vm=vm@entry=0x7fffd4211090, config=0x0, json=false,
>>>>     cb=cb@entry=0x7fffddcae720 <monitorCallbacks>,
>>>>     opaque=opaque@entry=0x7fffd419b840) at qemu/qemu_monitor.c:852
> 
>> Sorry for not being clear: The daemon crashes, that's the backtrace.
> 
> Hmm, config is NULL - do the state XML files not include the
> monitor info, perhaps?
> 

I see:

pidfile for busted VM in /var/run/libvirt/qemu
nothing in /var/cache/libvirt/qemu
no state that I can see in /var/lib/libvirt/qemu

But I'm not sure where it's supposed to be stored.

FWIW reproducing this state was pretty simple: revert
a924d9d083c215df6044387057c501d9aa338b96, edit an existing x86 guest to remove
all <video> and <graphics> devices, start the guest, libvirtd crashes.

Thanks,
Cole



