[libvirt] [BUG] event's bug?

Wen Congyang wency at cn.fujitsu.com
Mon Jul 11 09:00:25 UTC 2011


Steps to reproduce this bug:
1. # virsh migrate vm1 --p2p qemu+tls://<remote host>/system
   error: End of file while reading data: Input/output error

After this, libvirtd crashes.

I have only hit this bug twice.

I used gdb to analyze the core file:
(gdb) info threads 
  6 Thread 3952  0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5 Thread 3951  0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 3950  0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3 Thread 3949  0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 3948  0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1 Thread 3947  0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000003f2a834150 in _gnutls_string_resize (dest=0x7f92740079d8, new_size=<value optimized out>) at gnutls_str.c:192
#2  0x0000003f2a81a614 in _gnutls_io_read_buffered (session=0x7f9274006d30, iptr=0x7fffb3476148, sizeOfPtr=5, recv_type=<value optimized out>) at gnutls_buffers.c:515
#3  0x0000003f2a816031 in _gnutls_recv_int (session=0x7f9274006d30, type=GNUTLS_APPLICATION_DATA, htype=4294967295, data=0x7f92740155e8 "", sizeofdata=4) at gnutls_record.c:904
#4  0x00007f9285bd2ec7 in virNetTLSSessionRead (sess=0x7f927400a5d0, buf=0x7f92740155e8 "", len=4) at rpc/virnettlscontext.c:812
#5  0x00007f9285bd50a4 in virNetSocketReadWire (sock=0x7f9274006ba0, buf=0x7f92740155e8 "", len=4) at rpc/virnetsocket.c:801
#6  0x00007f9285bd5815 in virNetSocketRead (sock=0x7f9274006ba0, buf=0x7f92740155e8 "", len=4) at rpc/virnetsocket.c:981
#7  0x00007f9285bce40f in virNetClientIOReadMessage (client=0x7f9274015590) at rpc/virnetclient.c:711
#8  0x00007f9285bce461 in virNetClientIOHandleInput (client=0x7f9274015590) at rpc/virnetclient.c:730
#9  0x00007f9285bcf0f4 in virNetClientIncomingEvent (sock=0x7f9274006ba0, events=1, opaque=0x7f9274015590) at rpc/virnetclient.c:1119
#10 0x00007f9285bd5b87 in virNetSocketEventHandle (fd=13, watch=20, events=1, opaque=0x7f9274006ba0) at rpc/virnetsocket.c:1052
#11 0x00007f9285b09325 in virEventPollDispatchHandles (nfds=10, fds=0xfe51d0) at util/event_poll.c:469
#12 0x00007f9285b09a7e in virEventPollRunOnce () at util/event_poll.c:610
#13 0x00007f9285b07ec5 in virEventRunDefaultImpl () at util/event.c:247
#14 0x0000000000449cc2 in virNetServerRun (srv=0xfc3490) at rpc/virnetserver.c:662
#15 0x000000000041e3c5 in main (argc=2, argv=0x7fffb3476b68) at libvirtd.c:1561

The debug log in /var/log/libvirt/libvirtd.log:
...
11:18:27.838: 1848: debug : virEventPollRemoveHandle:171 : Remove handle w=13
11:18:27.838: 1847: debug : virEventPollDispatchHandles:454 : i=3 w=4
11:18:27.838: 1847: debug : virEventPollDispatchHandles:454 : i=4 w=5
11:18:27.838: 1847: debug : virEventPollDispatchHandles:454 : i=5 w=6
11:18:27.838: 1847: debug : virEventPollDispatchHandles:454 : i=6 w=12
11:18:27.838: 1847: debug : virEventPollDispatchHandles:454 : i=7 w=13
11:18:27.838: 1847: debug : virEventPollDispatchHandles:467 : Dispatch n=7 f=20 w=13 e=1 0x7f00780aaa80
11:18:27.838: 1848: debug : virEventPollRemoveHandle:184 : mark delete 7 20
11:18:27.838: 1848: debug : virEventPollInterruptLocked:677 : Interrupting
11:18:27.838: 1848: debug : virNetSocketFree:627 : sock=0x7f00780aaa80 fd=20
11:18:27.838: 1848: debug : virEventPollRemoveTimeout:276 : Remove timer 10
11:18:27.838: 1848: debug : virEventPollInterruptLocked:677 : Interrupting
11:18:27.838: 1848: debug : virDomainObjUnref:1142 : obj=0x7f007800ce10 refs=2
11:18:27.838: 1848: debug : virDomainObjUnref:1142 : obj=0x7f007800ce10 refs=1


     ====== end of log =====

The cause is a race: we dispatch the handle (fd=20, watch=13) and remove the same
handle (watch=13) at almost the same time, in the order dispatch, then remove.
The handle is removed when the remote connection is closed, and virNetSocketFree() is
then called to free sock. But the dispatching thread (1847 in the log above) is still
using sock, so it ends up accessing freed memory. This is very dangerous!

I think that removing a handle should wait until any in-progress dispatch of that same
handle has finished.
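
To make the proposal concrete, here is a minimal sketch of "remove waits for an
in-flight dispatch", written with plain pthreads. It is not libvirt code: struct
handle, handle_init(), handle_dispatch(), handle_remove(), struct fake_sock and
demo_remove_waits() are all hypothetical names invented for illustration, and the
real event loop keeps its handles in an array under a single lock rather than one
lock per handle.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for an event-loop handle (not libvirt's real struct). */
struct handle {
    pthread_mutex_t lock;
    pthread_cond_t idle;        /* signalled when a dispatch finishes */
    bool dispatching;           /* true while the callback is running */
    bool deleted;               /* set by remove, checked by dispatch */
    void (*cb)(void *opaque);
    void *opaque;
};

void handle_init(struct handle *h, void (*cb)(void *), void *opaque)
{
    pthread_mutex_init(&h->lock, NULL);
    pthread_cond_init(&h->idle, NULL);
    h->dispatching = false;
    h->deleted = false;
    h->cb = cb;
    h->opaque = opaque;
}

/* Event-loop side: run the callback unless the handle was already removed. */
bool handle_dispatch(struct handle *h)
{
    pthread_mutex_lock(&h->lock);
    if (h->deleted) {
        pthread_mutex_unlock(&h->lock);
        return false;
    }
    h->dispatching = true;
    pthread_mutex_unlock(&h->lock);

    h->cb(h->opaque);           /* callback runs without the lock held */

    pthread_mutex_lock(&h->lock);
    h->dispatching = false;
    pthread_cond_broadcast(&h->idle);
    pthread_mutex_unlock(&h->lock);
    return true;
}

/* Remover side: once this returns, no callback is using h->opaque any more. */
void handle_remove(struct handle *h)
{
    pthread_mutex_lock(&h->lock);
    h->deleted = true;
    while (h->dispatching)
        pthread_cond_wait(&h->idle, &h->lock);
    pthread_mutex_unlock(&h->lock);
}

/* Demo: the callback touches a heap object that the remover frees. */
struct fake_sock { int reads; };
static sem_t cb_started;

static void on_readable(void *opaque)
{
    struct fake_sock *sock = opaque;
    struct timespec pause = { 0, 50 * 1000 * 1000 };   /* 50 ms */

    sem_post(&cb_started);      /* tell the remover we are in flight */
    nanosleep(&pause, NULL);    /* widen the race window */
    sock->reads++;              /* would hit freed memory without the wait */
}

static void *loop_thread(void *arg)
{
    handle_dispatch(arg);
    return NULL;
}

int demo_remove_waits(void)
{
    struct handle h;
    struct fake_sock *sock = calloc(1, sizeof(*sock));
    pthread_t loop;
    int reads;

    assert(sock != NULL);
    sem_init(&cb_started, 0, 0);
    handle_init(&h, on_readable, sock);
    pthread_create(&loop, NULL, loop_thread, &h);

    sem_wait(&cb_started);      /* the callback is now running */
    handle_remove(&h);          /* blocks until the callback returns */
    reads = sock->reads;        /* the callback has finished by now */
    free(sock);                 /* safe: nobody else holds the pointer */

    pthread_join(loop, NULL);
    sem_destroy(&cb_started);
    return reads;
}
```

One caveat with this approach: if handle_remove() is called from inside the
callback itself, or while holding a lock that the callback needs, it will deadlock.
So a real fix would have to treat those cases specially, or hold a reference on
sock for the duration of the dispatch instead of waiting.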



