[libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination

Nikolay Shirokovskiy nshirokovskiy at virtuozzo.com
Thu Nov 2 07:49:35 UTC 2017



On 01.11.2017 21:51, John Ferlan wrote:
> 
> 
> On 10/31/2017 02:54 AM, Nikolay Shirokovskiy wrote:
>>
>>
>> On 30.10.2017 19:21, Martin Kletzander wrote:
>>> On Mon, Oct 30, 2017 at 07:14:39AM -0400, John Ferlan wrote:
>>>> From: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
>>>>
>>>> The problem is incorrect order of qemu driver shutdown and shutdown
>>>> of netserver threads that serve client requests (thru qemu driver
>>>> particularly).
>>>>
>>>> Net server threads are shutdown upon dispose which is triggered
>>>> by last daemon object unref at the end of main function. At the same
>>>> time qemu driver is shutdown earlier in virStateCleanup. As a result
>>>> netserver threads see invalid driver object in the middle of request
>>>> processing.
>>>>
>>>> Let's move shutting down netserver threads earlier to virNetDaemonClose.
>>>>
>>>> Note: order of last daemon unref and virStateCleanup
>>>> is introduced in 85c3a182 for a valid reason.
>>>>
>>>
>>> I must say I don't believe that's true.  Reading it back, that patch is
>>> wrong IMHO.  I haven't gone through all the details of it and I don't
>>> remember them from when I rewrote part of it, but the way this should
>>> be handled is different.
>>>
>>> The way you can clearly identify such issues is when you see that one
>>> thread determines the validity of data (pointer in this case).  This
>>> must never happen.  That means that the pointer is used from more places
>>> than how many references that object has.  However each of the pointers
>>> for such object should have their own reference.
>>>
>>> So the problem is probably somewhere else.
>>
>> If I understand you correctly we can fix issue in 85c3a182 in 
>> a differenct way. Like adding reference to daemon for every
>> driver that uses it for shutdown inhibition. It will require to
>> pass unref function to driver init function or a bit of wild casting
>> to virObjectPtr for unref. Not sure it is worth it in this case.
> 
> caveat: I'm still in a post KVM Forum travel brain fog...
> 
> Perhaps a bit more simple than that... Since the 'inhibitCallback' and
> the 'inhibitOpaque' are essentially the mechanism that would pass/use
> @dmn, then it'd be up to the inhibitCallback to add/remove the reference
> with the *given* that the inhibitOpaque is a lockable object anyway...
> 
> Thus, virNetDaemonAddShutdownInhibition could use virObjectRef and
> virNetDaemonRemoveShutdownInhibition could use virObjectUnref... The
> "catch" is that for Shutdown the Unref would need to go after Unlock.
> 
> I believe that then would "replicate" the virObjectRef done in
> daemonStateInit and the virObjectUnref done at the end of
> daemonRunStateInit with respect to "some outside thread" being able to
> use @dmn and not have it be Unref'd for the last time at any point in
> time until the "consumer" was done with it.
> 
> Moving the Unref to after all the StateCleanup calls were done works
> because it accomplishesd the same/similar task, but just held onto @dmn
> longer because the original implementation didn't properly reference dmn
> when it was being used for some other consumer/thread. Of course that
> led to other problems which this series is addressing and I'm not clear
> yet as to the impact vis-a-vis your StateShutdown series.

Ok, 85c3a182 can be rewritten this way. It is more staightforward to
ref/unref by consumer thread instead of relying on consumer structure
(in this case 85c3a182 relies upon the fact that after virStateCleanup
no hypervisor driver would use @dmn object).

But we still have the issues address by this patch series and state
shutdown series because the order in which @dmn and other objects
will be disposed would be same for 85c3a182 and proposed 85c3a182
replacement.

Nikolay

> 
>>
>> Anyway the initical conditions for current patch stay the same - 
>> daemon object will be unrefed after state drivers cleanup and 
>> RPC requests could deal with invalid state driver objects.
>>
>>>
>>>> Signed-off-by: John Ferlan <jferlan at redhat.com>
>>>> ---
>>>> src/rpc/virnetdaemon.c | 1 +
>>>> 1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/src/rpc/virnetdaemon.c b/src/rpc/virnetdaemon.c
>>>> index 8c21414897..33bd8e3b06 100644
>>>> --- a/src/rpc/virnetdaemon.c
>>>> +++ b/src/rpc/virnetdaemon.c
>>>> @@ -881,6 +881,7 @@ virNetDaemonClose(virNetDaemonPtr dmn)
>>>>     virObjectLock(dmn);
>>>>
>>>>     virHashForEach(dmn->servers, daemonServerClose, NULL);
>>>> +    virHashRemoveAll(dmn->servers);
>>>>
>>>
>>> If I get this correctly, you are removing the servers so that their workers get
>>> cleaned up, but that should be done in a separate step.  Most probably what
>>> daemonServerClose should be doing.  This is a workaround, but not a proper fix.
> 
> Are you sure?  The daemonServerClose is the callback for virHashForEach
> which means the table->iterating would be set thus preventing
> daemonServerClose from performing a virHashRemoveEntry of the server
> element from the list.
> 
> So this is to me the right fix.
> 
> John
> 
>>
>> Well, original patch has a different approach for stopping RPC calls (see [1]) close
>> to what you suggest. I think in this case it is matter of taste because there are
>> no usecases which help us to prefer one way over another so I'm ok with both of them.
>>
>> [1] https://www.redhat.com/archives/libvir-list/2017-September/msg00999.html
>> [2] https://www.redhat.com/archives/libvir-list/2017-October/msg00643.html   (thread dicussing [1])
>>
>>> If that's not true than the previous commit mentioned here should also be fixed
>>> differently.
>>
>> Sorry I don't understand it. Can you elaborate on this point?
>>
>>
>> Nikolay
>>>
>>> So this is a clear NACK from my POV.
>>>
>>>>     virObjectUnlock(dmn);
>>>> }
>>>> -- 
>>>> 2.13.6
>>>>
>>>> -- 
>>>> libvir-list mailing list
>>>> libvir-list at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/libvir-list
>>>
>>>
>>> --
>>> libvir-list mailing list
>>> libvir-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/libvir-list
>>>




More information about the libvir-list mailing list