[libvirt] [PATCH 1/2] avoid vm to be deleted if qemuConnectMonitor failed

Daniel P. Berrange berrange at redhat.com
Wed Jan 26 15:17:06 UTC 2011


On Tue, Jan 25, 2011 at 02:43:43PM +0800, Wen Congyang wrote:
> The reason of libvirtd cores dump is that:
> We add vm->refs when we alloc the memory, and decrease it 
> in the function qemuHandleMonitorEOF() in other thread.
> 
> We add vm->refs in the function qemuConnectMonitor() and
> decrease it when the vm is inactive.
> 
> The libvirtd will block in the function qemuMonitorSetCapabilities()
> because the vm is stopped by signal SIGSTOP. Now the vm->refs is 2.
> 
> Then we kill the vm by signal SIGKILL. The function
> qemuMonitorSetCapabilities() failed, and then we will decrease vm->refs
> in the function qemuMonitorClose().
> In another thread, mon->fd is broken and the function
> qemuHandleMonitorEOF() is called. 
> 
> If qemuHandleMonitorEOF() decreases vm->refs before qemuConnectMonitor()
> returns, vm->refs will be decrease to 0 and the memory is freed.
> 
> We will call qemudShutdownVMDaemon() as qemuConnectMonitor() failed.
> The memory has been freed, so qemudShutdownVMDaemon() is too dangerous.
> 
> We will reference NULL pointer in the function virDomainConfVMNWFilterTeardown():
> =============
> void
> virDomainConfVMNWFilterTeardown(virDomainObjPtr vm) {
>     int i;
> 
>     if (nwfilterDriver != NULL) {
>         for (i = 0; i < vm->def->nnets; i++)
>             virDomainConfNWFilterTeardown(vm->def->nets[i]);
>     }
> }
> ============
> vm->def->nnets is not 0 but vm->def->nets is NULL(We don't set vm->def->nnets
> to 0 when we free vm).
> 
> We should add an extra reference of vm to avoid vm to be deleted if
> qemuConnectMonitor() failed.
> 
> Signed-off-by: Wen Congyang <wency at cn.fujitsu.com>
> 
> ---
>  src/qemu/qemu_driver.c |   31 ++++++++++++++++++++++---------
>  1 files changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 34cc29f..613c2d4 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -930,6 +930,10 @@ qemuReconnectDomain(void *payload, const char *name ATTRIBUTE_UNUSED, void *opaq
>  
>      priv = obj->privateData;
>  
> +    /* Hold an extra reference because we can't allow 'vm' to be
> +     * deleted if qemuConnectMonitor() failed */
> +    virDomainObjRef(obj);
> +
>      /* XXX check PID liveliness & EXE path */
>      if (qemuConnectMonitor(driver, obj) < 0)
>          goto error;
> @@ -961,18 +965,27 @@ qemuReconnectDomain(void *payload, const char *name ATTRIBUTE_UNUSED, void *opaq
>      if (obj->def->id >= driver->nextvmid)
>          driver->nextvmid = obj->def->id + 1;
>  
> -    virDomainObjUnlock(obj);
> +    if (virDomainObjUnref(obj) > 0)
> +        virDomainObjUnlock(obj);
>      return;
>  
>  error:
> -    /* We can't get the monitor back, so must kill the VM
> -     * to remove danger of it ending up running twice if
> -     * user tries to start it again later */
> -    qemudShutdownVMDaemon(driver, obj, 0);
> -    if (!obj->persistent)
> -        virDomainRemoveInactive(&driver->domains, obj);
> -    else
> -        virDomainObjUnlock(obj);
> +    if (!virDomainObjIsActive(obj)) {
> +        if (virDomainObjUnref(obj) > 0)
> +            virDomainObjUnlock(obj);
> +        return;
> +    }
> +
> +    if (virDomainObjUnref(obj) > 0) {
> +        /* We can't get the monitor back, so must kill the VM
> +         * to remove danger of it ending up running twice if
> +         * user tries to start it again later */
> +        qemudShutdownVMDaemon(driver, obj, 0);
> +        if (!obj->persistent)
> +            virDomainRemoveInactive(&driver->domains, obj);
> +        else
> +            virDomainObjUnlock(obj);
> +    }
>  }

On closer examination I see why this change is required.
Normally we would be doing qemuDomainObjBeginJob before
doing anything with the monitor and that grabs an extra
reference.

ACK

Daniel




More information about the libvir-list mailing list