[libvirt] [PATCH 0/9] Resolve libvirtd hang on termination with connected long running client

John Ferlan jferlan at redhat.com
Fri Jul 6 22:29:15 UTC 2018



On 07/06/2018 05:15 AM, Marc Hartmayer wrote:
> On Tue, Jul 03, 2018 at 09:21 PM +0200, John Ferlan <jferlan at redhat.com> wrote:
>>>
>>> Is there any update so far? I’m asking because I’m still getting
>>> segmentation faults and hang-ups on termination of libvirtd (using the
>>> newest version of libvirt).
>>>
>>> Example for a hang-up:
>>> ➤  bt
>>> #0  0x000003fffca8df84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
>>> #1  0x000003fffdac29ca in virCondWait (c=<optimized out>, m=<optimized out>) at ../../src/util/virthread.c:154
>>> #2  0x000003fffdac381c in virThreadPoolFree (pool=<optimized out>) at ../../src/util/virthreadpool.c:290
>>> #3  0x000003fffdbb21ae in virNetServerDispose (obj=0x1000cc640) at ../../src/rpc/virnetserver.c:803
>>> #4  0x000003fffda97286 in virObjectUnref (anyobj=<optimized out>) at ../../src/util/virobject.c:350
>>> #5  0x000003fffda97a5a in virObjectFreeHashData (opaque=<optimized out>, name=<optimized out>) at ../../src/util/virobject.c:591
>>> #6  0x000003fffda66576 in virHashFree (table=<optimized out>) at ../../src/util/virhash.c:305
>>> #7  0x000003fffdbaff82 in virNetDaemonDispose (obj=0x1000cc3c0) at ../../src/rpc/virnetdaemon.c:105
>>> #8  0x000003fffda97286 in virObjectUnref (anyobj=<optimized out>) at ../../src/util/virobject.c:350
>>> #9  0x0000000100026cd6 in main (argc=<optimized out>, argv=<optimized out>) at ../../src/remote/remote_daemon.c:1487
>>>
>>> And segmentation faults happen for RPC jobs that are still running.
>>>
>>
>> There has been zero of my cycles spent thinking about this. Partially
>> because I'm busy in other areas, partially because I know Daniel is
>> planning changes in libvirtd
>> (https://www.redhat.com/archives/libvir-list/2018-May/msg01307.html),
>> and partially because I'm not sure I have a {reliable|simple} reproducer
>> (at least I don't recall).
> 
> I do a simple start/destroy loop and send a SIGTERM to libvirtd. This
> leads in almost every case to a segmentation fault.
> 
>>
>> I do still have various branches in various states of disarray that are
>> way behind current head (easy to happen it seems lately).
>>

I found and updated my branches and started recalling where things were
left off. I had R-b's for patches 1 and 2 until you noted the issue with
the global variables, which has since been fixed in my branch.

Since there's been a lot of code motion since then, and what I have now
is different, I figure I should repost the series and go from there. I
haven't addressed your comment on patch 7 about using a pipe instead of
polling. I haven't put much thought into it, but given how things work,
how would you expect pipe signaling to work in this model? Essentially
we have a client that either is not responding or is in the middle of
something large.
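
To make the pipe question concrete, here is a rough, generic sketch of
what pipe signaling could look like. This is not code from the series
and none of it is libvirt API: drain_pipe, drain_pipe_job_done() and
drain_pipe_wait() are hypothetical names, and the assumption is that
each worker can write one byte when its job finishes. Note that a
timeout is still needed, since a worker stuck on an unresponsive client
never writes anything; the pipe only replaces the periodic wakeups.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not libvirt API: workers report finished jobs
 * by writing one byte each; the shutdown path waits on the read end. */
typedef struct {
    int fds[2];   /* fds[0] = read end (waiter), fds[1] = write end (workers) */
} drain_pipe;

int drain_pipe_init(drain_pipe *dp)
{
    return pipe(dp->fds);
}

void drain_pipe_close(drain_pipe *dp)
{
    close(dp->fds[0]);
    close(dp->fds[1]);
}

/* Called by a worker when its job completes. A one-byte write to a
 * pipe is atomic, so multiple workers may call this concurrently. */
void drain_pipe_job_done(drain_pipe *dp)
{
    char c = 1;
    ssize_t r;

    do {
        r = write(dp->fds[1], &c, 1);
    } while (r < 0 && errno == EINTR);
}

static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Block until 'pending' completions have been reported or timeout_ms
 * has elapsed. Returns the number of jobs still outstanding, so 0
 * means fully drained and > 0 means the caller gave up waiting. */
int drain_pipe_wait(drain_pipe *dp, int pending, int timeout_ms)
{
    long long deadline = now_ms() + timeout_ms;

    assert(pending >= 0);
    while (pending > 0) {
        struct pollfd pfd = { .fd = dp->fds[0], .events = POLLIN };
        char buf[64];
        size_t want;
        ssize_t n;
        int rc;
        long long left = deadline - now_ms();

        if (left <= 0)
            break;                      /* deadline passed */
        rc = poll(&pfd, 1, (int)left);
        if (rc < 0 && errno == EINTR)
            continue;
        if (rc <= 0)
            break;                      /* timeout or poll error */

        /* Never read more bytes than jobs we are still waiting for. */
        want = pending < (int)sizeof(buf) ? (size_t)pending : sizeof(buf);
        n = read(dp->fds[0], buf, want);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            break;                      /* read error or write end closed */
        pending -= (int)n;
    }
    return pending;
}
```

The shutdown path would call drain_pipe_wait() once with the number of
in-flight jobs and an overall deadline, and would have to decide what to
do when it returns non-zero, which is the same unresponsive-client case
described above.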

NB: I won't be around for the first half of next week to see any
responses or answer questions...  Perhaps it's vacation mode that isn't
allowing me to think about the pipe signaling logic right now ;-)

John



