
[libvirt] [libvirt-1.0.5] Deadlock in child process after calling backtrace(); any suggestions are appreciated!



Hi ALL,

To capture the call trace of a deadlock in libvirtd, I modified struct virMutex and the function virMutexLock as follows:

struct virMutex {
    pthread_mutex_t lock;
    void *trace[TRACE_SIZE];    // added for test: frames recorded by the last locker
    int ntrace;                 // added for test: number of recorded frames
};

void virMutexLock(virMutexPtr m)
{
    struct timespec ts;

    if (0 == clock_gettime(CLOCK_REALTIME, &ts)) {
        ts.tv_sec += LOCK_TIMEOUT;
        if (pthread_mutex_timedlock(&m->lock, &ts) == ETIMEDOUT) {
            if (m->ntrace > 0)
                virLogBacktrace(m->ntrace, m->trace);
            pthread_mutex_lock(&m->lock);
        }

        m->ntrace = backtrace(m->trace, TRACE_SIZE);    // record the call trace of the new holder
    } else {
        pthread_mutex_lock(&m->lock);
    }
}

The original is:
void virMutexLock(virMutexPtr m){
    pthread_mutex_lock(&m->lock);
}

Unfortunately, a deadlock sometimes occurs in the child process after virFork. The parent libvirtd process has PID 2987; the child libvirtd process, PID 29509, was forked in order to run a shell script.

root cvk143:~# ps -ef | grep libvirtd
root      2987     1 52 08:36 ?        00:40:38 /usr/sbin/libvirtd -d
root     29509  2987  0 09:38 ?        00:00:00 /usr/sbin/libvirtd -d
root cvk143:~#

The child process's call trace is as follows:
(gdb) bt
#0  0x00007f4d5e7fd89c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f4d5e7f9080 in _L_lock_903 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f4d5e7f8f19 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007f4d5e5607db in dl_iterate_phdr () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007f4d5c3c88b6 in _Unwind_Find_FDE () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#5  0x00007f4d5c3c5d70 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#6  0x00007f4d5c3c6490 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#7  0x00007f4d5c3c6d3e in _Unwind_Backtrace () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#8  0x00007f4d5e53b1c8 in backtrace () from /lib/x86_64-linux-gnu/libc.so.6
#9  0x00007f4d5f6f8c92 in virMutexLock (m=0x7f4d5fadedc0) at util/virthreadpthread.c:128
#10 0x00007f4d5f6d0a3d in virLogLock () at util/virlog.c:152
#11 0x00007f4d5f6d0f66 in virLogReset () at util/virlog.c:311
#12 0x00007f4d5f6b3139 in virFork (pid=0x7f4d5a1a4310) at util/vircommand.c:281
#13 0x00007f4d5f6b3b06 in virExec (cmd=0x7f4d2000a6f0) at util/vircommand.c:493
#14 0x00007f4d5f6b8a34 in virCommandRunAsync (cmd=0x7f4d2000a6f0, pid=0x0) at util/vircommand.c:2340
#15 0x00007f4d5f6b815c in virCommandRun (cmd=0x7f4d2000a6f0, exitstatus=0x7f4d5a1a4728) at util/vircommand.c:2191
#16 0x00007f4d5f6b4bdd in virRun (argv=0x7f4d5a1a4730, status=0x7f4d5a1a4728) at util/vircommand.c:776
#17 0x00007f4d60231006 in virStorageNFSPoolCheckSub (hostName=0x7f4d50011630 "192.168.0.6", hostDir=0x7f4d50014560 "/vms/isos")
    at storage/storage_backend.c:165
#18 0x00007f4d602312fa in virStoragePoolCheckFirst (pool=0x7f4d5000dcc0) at storage/storage_backend.c:255
#19 0x00007f4d602383af in virStorageBackendFileSystemRefresh (conn=0x7f4d3c0010a0, pool=0x7f4d5000dcc0) at storage/storage_backend_fs.c:887
#20 0x00007f4d6022c458 in storagePoolRefresh (obj=0x7f4d200012e0, flags=0) at storage/storage_driver.c:1705
#21 0x00007f4d5f7ac445 in virStoragePoolRefresh (pool=0x7f4d200012e0, flags=0) at libvirt.c:12936
#22 0x00007f4d6015c62f in remoteDispatchStoragePoolRefresh (server=0x7f4d60bb4ec0, client=0x7f4d60bbc380, msg=0x7f4d60bbcae0, rerr=0x7f4d5a1a4af0,
    args=0x7f4d2000f0b0) at remote_dispatch.h:12867
#23 0x00007f4d6015c527 in remoteDispatchStoragePoolRefreshHelper (server=0x7f4d60bb4ec0, client=0x7f4d60bbc380, msg=0x7f4d60bbcae0, rerr=0x7f4d5a1a4af0,
    args=0x7f4d2000f0b0, ret=0x7f4d200144f0) at remote_dispatch.h:12845
#24 0x00007f4d5f7fe7d5 in virNetServerProgramDispatchCall (prog=0x7f4d60bbfc90, server=0x7f4d60bb4ec0, client=0x7f4d60bbc380, msg=0x7f4d60bbcae0)
    at rpc/virnetserverprogram.c:439
#25 0x00007f4d5f7fe34e in virNetServerProgramDispatch (prog=0x7f4d60bbfc90, server=0x7f4d60bb4ec0, client=0x7f4d60bbc380, msg=0x7f4d60bbcae0)
    at rpc/virnetserverprogram.c:305
#26 0x00007f4d5f7f72ea in virNetServerProcessMsg (srv=0x7f4d60bb4ec0, client=0x7f4d60bbc380, prog=0x7f4d60bbfc90, msg=0x7f4d60bbcae0)
    at rpc/virnetserver.c:162
#27 0x00007f4d5f7f73cd in virNetServerHandleJob (jobOpaque=0x7f4d60bbdbf0, opaque=0x7f4d60bb4ec0) at rpc/virnetserver.c:183
#28 0x00007f4d5f6f9602 in virThreadPoolWorker (opaque=0x7f4d60b9aa40) at util/virthreadpool.c:144
#29 0x00007f4d5f6f8fd6 in virThreadHelper (data=0x7f4d60b9a9b0) at util/virthreadpthread.c:212
#30 0x00007f4d5e7f6e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#31 0x00007f4d5e5244bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#32 0x0000000000000000 in ?? ()
(gdb)
..............

I searched and found a similar deadlock call trace at code.google.com/p/gperftools/issues/detail?id=66, but that report does not concern libvirtd:
......
                Labels: -Priority-Medium Priority-Low
                Jan 22, 2013
                #32 DarkAge    gmail com

                Please note the glibc unwinder also uses dl_iterate_phdr, and takes several locks during backtrace generation in _Unwind_Find_FDE.

                #0  0x00007ffff7bc6e80 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
                #1  0x00007ffff7212fdb in dl_iterate_phdr () from /lib/x86_64-linux-gnu/libc.so.6
                #2  0x00007ffff6be28b6 in _Unwind_Find_FDE () from /lib/libgcc_s.so.1
                #3  0x00007ffff6bdfd70 in ?? () from /lib/libgcc_s.so.1
                #4  0x00007ffff6be0d7d in _Unwind_Backtrace () from /lib/libgcc_s.so.1
                #5  0x00007ffff71ed9c8 in backtrace () from /lib/x86_64-linux-gnu/libc.so.6
                #6  0x000000000040fdb5 in glibc_backtrace (error=<synthetic pointer>, buffer=0x7fffffffdd10,
                    size=<optimized out>, ucontext=<optimized out>)
......
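
To illustrate the hazard that note points at, here is a minimal sketch (not libvirt code) of how a mutex held by another thread at fork() time stays locked forever in the child, because only the forking thread is copied. It is an assumption, not something the traces above prove, that the lock taken inside dl_iterate_phdr was in this state in the libvirtd child. The names demo_lock, holder, and child_sees_locked_mutex are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for a lock that lives inside a library. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int ready_pipe[2];     /* holder -> main: "the lock is held" */
static int release_pipe[2];   /* main -> holder: "you may unlock"  */

/* Second thread: takes the lock and keeps it until fork() has happened. */
static void *holder(void *arg)
{
    char c;
    ssize_t n;
    int rc;

    (void)arg;
    pthread_mutex_lock(&demo_lock);
    n = write(ready_pipe[1], "x", 1);
    if (n == 1)
        n = read(release_pipe[0], &c, 1);
    (void)n;
    rc = pthread_mutex_unlock(&demo_lock);
    assert(rc == 0);
    (void)rc;
    return NULL;
}

/*
 * Forks while the holder thread owns demo_lock.
 * Returns 1 if the child timed out waiting for the lock,
 * 0 if the child acquired it, and -1 on a setup error.
 */
int child_sees_locked_mutex(void)
{
    pthread_t tid;
    struct timespec ts;
    char c;
    int status;
    pid_t pid;

    if (pipe(ready_pipe) != 0 || pipe(release_pipe) != 0)
        return -1;
    if (pthread_create(&tid, NULL, holder, NULL) != 0)
        return -1;
    if (read(ready_pipe[0], &c, 1) != 1)    /* wait until the lock is held */
        return -1;

    pid = fork();
    if (pid == 0) {
        /* Child: the holder thread does not exist here, so nobody can unlock. */
        if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
            _exit(2);
        ts.tv_sec += 1;
        _exit(pthread_mutex_timedlock(&demo_lock, &ts) == ETIMEDOUT ? 1 : 0);
    }

    /* Parent: let the holder unlock its copy; the child's copy is unaffected. */
    if (write(release_pipe[1], "x", 1) != 1)
        return -1;
    pthread_join(tid, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The sketch uses a timed lock in the child only so that it terminates; with a plain pthread_mutex_lock the child would block indefinitely, which would look like frames #0 to #2 of the child trace above.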


Has anyone encountered a similar problem? It would be great if someone could help me with this.
Thank you very much.




