[libvirt] [PATCH v2 1/3] daemon: keep daemon until all hypervisors drivers are cleaned up
John Ferlan
jferlan at redhat.com
Thu Oct 27 14:53:58 UTC 2016
On 10/27/2016 10:04 AM, Nikolay Shirokovskiy wrote:
>
>
> On 27.10.2016 15:59, John Ferlan wrote:
>>
>>
>> On 10/04/2016 10:27 AM, Nikolay Shirokovskiy wrote:
>>> This fix SEGV with the backtrace [1]
>>>
>>> This happens due to race on simultaneous finishing of libvirtd and
>>> qemu process. We need to keep daemon object until all hypervisor
>>> drivers are cleaned up. The other option is to take reference to
>>> daemon in every hypervisor driver but this will take more work
>>> and we really don't need to. Drivers cleanup procedure is synchronous
>>> thus let's just remove last reference to daemon after drivers cleanup.
>>>
>>> [1] backtrace (shortened a bit)
>>>
>>> 1 0x00007fd3a791b2e6 in virCondWait (c=<optimized out>, m=<optimized out>) at util/virthread.c:154
>>> 2 0x00007fd3a791bcb0 in virThreadPoolFree (pool=0x7fd38024ee00) at util/virthreadpool.c:266
>>> 3 0x00007fd38edaa00e in qemuStateCleanup () at qemu/qemu_driver.c:1116
>>> 4 0x00007fd3a79abfeb in virStateCleanup () at libvirt.c:808
>>> 5 0x00007fd3a85f2c9e in main (argc=<optimized out>, argv=<optimized out>) at libvirtd.c:1660
>>>
>>> Thread 1 (Thread 0x7fd38722d700 (LWP 32256)):
>>> 0 0x00007fd3a7900910 in virClassIsDerivedFrom (klass=0xdfd36058d4853, parent=0x7fd3a8f394d0) at util/virobject.c:169
>>> 1 0x00007fd3a7900c4e in virObjectIsClass (anyobj=anyobj at entry=0x7fd3a8f2f850, klass=<optimized out>) at util/virobject.c:365
>>> 2 0x00007fd3a7900c74 in virObjectLock (anyobj=0x7fd3a8f2f850) at util/virobject.c:317
>>> 3 0x00007fd3a7a24d5d in virNetDaemonRemoveShutdownInhibition (dmn=0x7fd3a8f2f850) at rpc/virnetdaemon.c:547
>>> 4 0x00007fd38ed722cf in qemuProcessStop (driver=driver at entry=0x7fd380103810, vm=vm at entry=0x7fd38025b6d0, reason=reason at entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=asyncJob at entry=QEMU_ASYNC_JOB_NONE,
>>> flags=flags at entry=0) at qemu/qemu_process.c:5786
>>> 5 0x00007fd38edd9428 in processMonitorEOFEvent (vm=0x7fd38025b6d0, driver=0x7fd380103810) at qemu/qemu_driver.c:4588
>>> 6 qemuProcessEventHandler (data=<optimized out>, opaque=0x7fd380103810) at qemu/qemu_driver.c:4632
>>> 7 0x00007fd3a791bb55 in virThreadPoolWorker (opaque=opaque at entry=0x7fd3a8f1e4c0) at util/virthreadpool.c:145
>>> ---
>>> daemon/libvirtd.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>
>> I'd like to adjust the commit message just a bit... In particular
>> removing the "The other option..." sentence.
>
> ok
>
>>
>> So the "problem" discovered is that virStateInitialize requires/uses the
>> "dmn" reference as the argument to the daemonInhibitCallback and by
>> dereferencing the object too early and then calling the virStateCleanup
>> which calls the daemonInhibitCallback using the dmn we end up with the SEGV.
>
> yep
>
>>
>>> diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
>>> index 95c1b1c..eefdefc 100644
>>> --- a/daemon/libvirtd.c
>>> +++ b/daemon/libvirtd.c
>>> @@ -1628,7 +1628,6 @@ int main(int argc, char **argv) {
>>> virObjectUnref(qemuProgram);
>>> virObjectUnref(adminProgram);
>>> virNetDaemonClose(dmn);
>>> - virObjectUnref(dmn);
>>> virObjectUnref(srv);
>>> virObjectUnref(srvAdm);
>>> virNetlinkShutdown();
>>> @@ -1658,6 +1657,9 @@ int main(int argc, char **argv) {
>>> driversInitialized = false;
>>> virStateCleanup();
>>> }
>>
>> Another option would have been to move the above hunk up... Since
>> setting the daemonStateInit is one of the last things done before
>> running the event loop, it stands to reason shutting down should be done
>> in the opposite order thus causing those InhibitCallbacks to be run
>> perhaps after virNetlinkEventServiceStopAll and before virNetDaemonClose.
>
> By the above hunk you mean virStateCleanup() et all? If yes - then I don't
> think so. The thing is that start is tricky.
>
> 1. We start the network part, but keep it in disabled state.
> In this state it creates sockets and listens on them but not accept
> connections.
>
> 2. Initialize all the hyper drivers and then enable the network part
> to let it accept connections.
>
> Thus effectively the order is reverse to what is seems -
> first hyper drivers and then network servers.
>
> Actually there is a direct reasoning why hyper drivers shutted down after
> network part - otherwise requests can be delivered to hyper drivers in
> invalid state.
>
Yeah - it's an intricate and delicate balance, especially that inhibit
callback stuff. Since you only have 1 domain and 1 libvirtd
I'll keep the order as is...
Tks -
John
> Nikolay
>
>>
>> Any thoughts to that?
>>
>> This I assume works, but do we run into "other" issues because we're
>> running those callbacks after virNetDaemonClose and virNetlinkShutdown?
>>
>> Of course the "fear" of moving is that it causes "other" timing issues
>> with previous adjustments here, in particular:
>>
>> commit id '5c756e580' and 'a602e90bc1' which introduced and adjusted
>> driversInitialized
>>
>> I'm OK with leaving things as is, but perhaps someone else will see this
>> an comment as well.
>>
>> John
>>
>>
>>> + /* unref daemon only here as hypervisor drivers can
>>> + call shutdown inhibition functions on cleanup */
>>> + virObjectUnref(dmn);
>>>
>>> return ret;
>>> }
>>>
More information about the libvir-list
mailing list