[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH v2]daemon: Fix a crash during virNetlinkEventServiceStopAll



HI peter,

Could you help me to review it ?  I  sent it about two months ago.

thanks,

haitao

On 6/12/19 3:18 PM, Liu Haitao wrote:
When reboot the host, a core dump file would be generated.

The call traces are:

Note.In this case, the  main thread is thread 5.
(gdb) thread 5
[Switching to thread 5 (LWP 4142)]
(gdb) bt
0  0x00007f00a6838273 in futex_wait_cancelable (private=<optimized out>,
     expected=0, futex_word=0x7f004c0125c0)
     at /usr/src/debug/glibc/2.24-r0/git/sysdeps/unix/sysv/linux/futex-internal.h:88
1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f004c012540,
     cond=0x7f004c012598)
     at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_cond_wait.c:502
2  __pthread_cond_wait (cond=0x7f004c012598, mutex=0x7f004c012540)
     at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_cond_wait.c:655
3  0x00007f00aa467246 in virCondWait (c=<optimized out>, m=<optimized out>)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:154
4  0x00007f00aa467eb0 in virThreadPoolFree (pool=<optimized out>)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthreadpool.c:286
5  0x00007f0074349f9d in qemuStateCleanup ()
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:1036
6  0x00007f00aa5e9486 in virStateCleanup ()
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/libvirt.c:682
7  0x000055a687ab86a4 in main (argc=<optimized out>, argv=<optimized out>)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/remote/remote_daemon.c:1473

(gdb) thread 1
[Switching to thread 1 (LWP 4403)]
(gdb) bt
0  __GI___pthread_mutex_lock (mutex=mutex entry=0x0)
     at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_mutex_lock.c:67
1  0x00007f00aa467165 in virMutexLock (m=m entry=0x0)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:89
2  0x00007f00aa43c0f9 in virNetlinkEventServerLock (driver=<optimized out>)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetlink.c:799
3  virNetlinkEventRemoveClient (watch=watch entry=0,
     macaddr=macaddr entry=0x7f0088014944, protocol=protocol entry=0)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetlink.c:1197
4  0x00007f00aa4341df in virNetDevMacVLanDeleteWithVPortProfile (
     ifname=<optimized out>, macaddr=macaddr entry=0x7f0088014944,
     linkdev=0x7f0088014920 "eth1", mode=mode entry=1,
     virtPortProfile=virtPortProfile entry=0x0,
     stateDir=stateDir entry=0x7f004c12fa90 "/var/run/libvirt/qemu")
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virnetdevmacvlan.c:1112
5  0x00007f0074312251 in qemuProcessStop (driver=driver entry=0x7f004c0ecef0,
     vm=vm entry=0x7f0088000b00,
     reason=reason entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN,
     asyncJob=asyncJob entry=QEMU_ASYNC_JOB_NONE, flags=<optimized out>)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_process.c:7291
6  0x00007f007437a5ea in processMonitorEOFEvent (vm=0x7f0088000b00, driver=0x7f004c0ecef0)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:4756
7  qemuProcessEventHandler (data=0x55a687d6df10, opaque=0x7f004c0ecef0)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/qemu/qemu_driver.c:4859
8  0x00007f00aa467c5b in virThreadPoolWorker (
     opaque=opaque entry=0x55a687d6c110)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthreadpool.c:163
9  0x00007f00aa466fe8 in virThreadHelper (data=<optimized out>)
     at /usr/src/debug/libvirt/5.3.0-r0/libvirt-5.3.0/src/util/virthread.c:206
10 0x00007f00a68323f4 in start_thread (arg=0x7f00699df700)
     at /usr/src/debug/glibc/2.24-r0/git/nptl/pthread_create.c:456
11 0x00007f00a616e10f in clone ()
     at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105


1. The execution flow of main thread (Thread 5 LWP 4142):
main()
   -->virNetDaemonRun()
   -->virNetDaemonClose(dmn)  //cleanup
   -->virNetlinkEventServiceStopAll()
   -->virStateCleanup()
	 -->qemuStateCleanup()
	   -->virThreadPoolFree()
	     -->__pthread_cond_wait()

virNetDaemonRun()
     -->virEventRunDefaultImpl
       -->virEventPollRunOnce
        -->virEventPollDispatchHandles
         -->qemuMonitorIO
           -->qemuProcessHandleMonitorEOF
             -->processEvent->eventType = QEMU_PROCESS_EVENT_MONITOR_EOF
              -->virThreadPoolSendJob()

After typing reboot command on the host, the main thread would send an event message to another thread.
Here it would let thread 1 to handle the shutdown of qemu process. But it could
not be executed immediately.

virNetlinkEventServiceStopAll()
	--> virNetlinkEventServiceStop()
	  --> server[protocol] = NULL;   // set server to null

IN virNetlinkEventServiceStopAll(), some variables related to network are freed,
like (static virNetlinkEventSrvPrivatePtr server).

virStateCleanup()
	-->qemuStateCleanup()
	   -->virThreadPoolFree()
	     -->__pthread_cond_wait()

In virThreadPoolFree() it will wait other thread to end up.

2. The execution flow of thread 5 (LWP 4403):
qemuProcessStop()
    -->virNetDevMacVLanDeleteWithVPortProfile()
	  -->virNetlinkEventRemoveClient()
	     --> srv = server[protocol]


Although the main thread had sent the message to thread 1(4403), it could not be
run instantly. It means that the  virNetlinkEventServiceStopAll() might be
executed earlier than virNetlinkEventRemoveClient(). We could get it from the log file.

""
2019-06-12 00:10:09.230+0000: 4142: info : virNetlinkEventServiceStopAll:941 : stopping all netlink event services
2019-06-12 00:10:09.230+0000: 4142: info : virNetlinkEventServiceStop:904 : stopping netlink event service
2019-06-12 00:10:21.165+0000: 4403: debug : virNetlinkEventRemoveClient:1190 : removing client watch=0, mac=0x7f0088014944.
"

In virNetlinkEventRemoveClient() the variable server is used again, but now it
is null that is freed by virNetlinkEventServiceStopAll().So it would case a crash .

The virNetlinkEventServiceStopAll() should be executed behind virStateCleanup(),

Signed-off-by: Liu Haitao <haitao liu windriver com>
---
  src/remote/remote_daemon.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/remote/remote_daemon.c b/src/remote/remote_daemon.c
index c3782971f1..7da20a6644 100644
--- a/src/remote/remote_daemon.c
+++ b/src/remote/remote_daemon.c
@@ -1464,8 +1464,6 @@ int main(int argc, char **argv) {
      /* Keep cleanup order in inverse order of startup */
      virNetDaemonClose(dmn);
- virNetlinkEventServiceStopAll();
-
      if (driversInitialized) {
          /* NB: Possible issue with timing window between driversInitialized
           * setting if virNetlinkEventServerStart fails */
@@ -1473,6 +1471,8 @@ int main(int argc, char **argv) {
          virStateCleanup();
      }
+ virNetlinkEventServiceStopAll();
+
      virObjectUnref(adminProgram);
      virObjectUnref(srvAdm);
      virObjectUnref(qemuProgram);


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]