[libvirt] [PATCH v2]daemon: Fix a crash during virNetlinkEventServiceStopAll
Haitaoliu
Haitao.Liu at windriver.com
Tue Aug 13 06:54:34 UTC 2019
HI peter,
Could you help me to review it ? I sent it about two months ago.
thanks,
haitao
On 6/12/19 3:18 PM, Liu Haitao wrote:
> When reboot the host, a core dump file would be generated.
>
> The call traces are:
>
> Note.In this case, the main thread is thread 5.
>
> (gdb) thread 5
> [Switching to thread 5 (LWP 4142)]
> (gdb) bt
> 0 0x00007f00a6838273 in futex_wait_cancelable (private=<optimized out>,
> expected=0, futex_word=0x7f004c0125c0)
> at /usr/src/debug/glibc/2.24-r0/git/sysdeps/unix/sysv/linux/futex-internal.h:88
> 1 __pthread_cond_wait_common (abstime=0x0, mutex=0x7f004c012540,
> cond=0x7f004c012598)
> at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_cond_wait.c:502
> 2 __pthread_cond_wait (cond=0x7f004c012598, mutex=0x7f004c012540)
> at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_cond_wait.c:655
> 3 0x00007f00aa467246 in virCondWait (c=<optimized out>, m=<optimized out>)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:154
> 4 0x00007f00aa467eb0 in virThreadPoolFree (pool=<optimized out>)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthreadpool.c:286
> 5 0x00007f0074349f9d in qemuStateCleanup ()
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:1036
> 6 0x00007f00aa5e9486 in virStateCleanup ()
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/libvirt.c:682
> 7 0x000055a687ab86a4 in main (argc=<optimized out>, argv=<optimized out>)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/remote/remote_daemon.c:1473
>
> (gdb) thread 1
> [Switching to thread 1 (LWP 4403)]
> (gdb) bt
> 0 __GI___pthread_mutex_lock (mutex=mutex at entry=0x0)
> at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_mutex_lock.c:67
> 1 0x00007f00aa467165 in virMutexLock (m=m at entry=0x0)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:89
> 2 0x00007f00aa43c0f9 in virNetlinkEventServerLock (driver=<optimized out>)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetlink.c:799
> 3 virNetlinkEventRemoveClient (watch=watch at entry=0,
> macaddr=macaddr at entry=0x7f0088014944, protocol=protocol at entry=0)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetlink.c:1197
> 4 0x00007f00aa4341df in virNetDevMacVLanDeleteWithVPortProfile (
> ifname=<optimized out>, macaddr=macaddr at entry=0x7f0088014944,
> linkdev=0x7f0088014920 "eth1", mode=mode at entry=1,
> virtPortProfile=virtPortProfile at entry=0x0,
> stateDir=stateDir at entry=0x7f004c12fa90 "/var/run/libvirt/qemu")
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetdevmacvlan.c:1112
> 5 0x00007f0074312251 in qemuProcessStop (driver=driver at entry=0x7f004c0ecef0,
> vm=vm at entry=0x7f0088000b00,
> reason=reason at entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN,
> asyncJob=asyncJob at entry=QEMU_ASYNC_JOB_NONE, flags=<optimized out>)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_process.c:7291
> 6 0x00007f007437a5ea in processMonitorEOFEvent (vm=0x7f0088000b00, driver=0x7f004c0ecef0)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:4756
> 7 qemuProcessEventHandler (data=0x55a687d6df10, opaque=0x7f004c0ecef0)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:4859
> 8 0x00007f00aa467c5b in virThreadPoolWorker (
> opaque=opaque at entry=0x55a687d6c110)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthreadpool.c:163
> 9 0x00007f00aa466fe8 in virThreadHelper (data=<optimized out>)
> at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:206
> 10 0x00007f00a68323f4 in start_thread (arg=0x7f00699df700)
> at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_create.c:456
> 11 0x00007f00a616e10f in clone ()
> at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
>
>
> 1. The execution flow of main thread (Thread 5 LWP 4142):
> main()
> -->virNetDaemonRun()
> -->virNetDaemonClose(dmn) //cleanup
> -->virNetlinkEventServiceStopAll()
> -->virStateCleanup()
> -->qemuStateCleanup()
> -->virThreadPoolFree()
> -->__pthread_cond_wait()
>
> virNetDaemonRun()
> -->virEventRunDefaultImpl
> -->virEventPollRunOnce
> -->virEventPollDispatchHandles
> -->qemuMonitorIO
> -->qemuProcessHandleMonitorEOF
> -->processEvent->eventType = QEMU_PROCESS_EVENT_MONITOR_EOF
> -->virThreadPoolSendJob()
>
> After typing reboot command on the host, the main thread would send an event message to another thread.
> Here it would let thread 1 to handle the shutdown of qemu process. But it could
> not be executed immediately.
>
> virNetlinkEventServiceStopAll()
> --> virNetlinkEventServiceStop()
> --> server[protocol] = NULL; // set server to null
>
> IN virNetlinkEventServiceStopAll(), some variables related to network are freed,
> like (static virNetlinkEventSrvPrivatePtr server).
>
> virStateCleanup()
> -->qemuStateCleanup()
> -->virThreadPoolFree()
> -->__pthread_cond_wait()
>
> In virThreadPoolFree() it will wait other thread to end up.
>
> 2. The execution flow of thread 5 (LWP 4403):
> qemuProcessStop()
> -->virNetDevMacVLanDeleteWithVPortProfile()
> -->virNetlinkEventRemoveClient()
> --> srv = server[protocol]
>
>
> Although the main thread had sent the message to thread 1(4403), it could not be
> run instantly. It means that the virNetlinkEventServiceStopAll() might be
> executed earlier than virNetlinkEventRemoveClient(). We could get it from the log file.
>
> ""
> 2019-06-12 00:10:09.230+0000: 4142: info : virNetlinkEventServiceStopAll:941 : stopping all netlink event services
> 2019-06-12 00:10:09.230+0000: 4142: info : virNetlinkEventServiceStop:904 : stopping netlink event service
> 2019-06-12 00:10:21.165+0000: 4403: debug : virNetlinkEventRemoveClient:1190 : removing client watch=0, mac=0x7f0088014944.
> "
>
> In virNetlinkEventRemoveClient() the variable server is used again, but now it
> is null that is freed by virNetlinkEventServiceStopAll().So it would case a crash .
>
> The virNetlinkEventServiceStopAll() should be executed behind virStateCleanup(),
>
> Signed-off-by: Liu Haitao <haitao.liu at windriver.com>
> ---
> src/remote/remote_daemon.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/remote/remote_daemon.c b/src/remote/remote_daemon.c
> index c3782971f1..7da20a6644 100644
> --- a/src/remote/remote_daemon.c
> +++ b/src/remote/remote_daemon.c
> @@ -1464,8 +1464,6 @@ int main(int argc, char **argv) {
> /* Keep cleanup order in inverse order of startup */
> virNetDaemonClose(dmn);
>
> - virNetlinkEventServiceStopAll();
> -
> if (driversInitialized) {
> /* NB: Possible issue with timing window between driversInitialized
> * setting if virNetlinkEventServerStart fails */
> @@ -1473,6 +1471,8 @@ int main(int argc, char **argv) {
> virStateCleanup();
> }
>
> + virNetlinkEventServiceStopAll();
> +
> virObjectUnref(adminProgram);
> virObjectUnref(srvAdm);
> virObjectUnref(qemuProgram);
More information about the libvir-list
mailing list