[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Libvir] [PATCH] avoid virsh hang due to missing virDomainFree(dom) call



"Daniel P. Berrange" <berrange redhat com> wrote:

> On Wed, Jan 30, 2008 at 06:36:37PM +0100, Jim Meyering wrote:
>> "Daniel P. Berrange" <berrange redhat com> wrote:
>>
>> > On Wed, Jan 30, 2008 at 06:25:19PM +0100, Jim Meyering wrote:
>> >> Without the following patch, this command would hang
>> >>
>> >>     printf 'domuuid fc4\ndomstate fc4\n' \
>> >>       | ./virsh --connect test://$PWD/../docs/testnode.xml
>> >>
>> >> with this stack trace:
>> >>
>> >>     __lll_lock_wait ...
>> >>     _L_lock_105 ...
>> >>     __pthread_mutex_lock ...
>> >>     virUnrefDomain (domain=0x6a8b30) at hash.c:884
>> >>     virDomainFree (domain=0x6a8b30) at libvirt.c:1211
>> >>     cmdDomstate (ctl=0x7fffea723390, cmd=0x6a8b10) at virsh.c:677
>> >>     vshCommandRun (ctl=0x7fffea723390, cmd=0x6a8b10) at virsh.c:4033
>> >>     main (argc=3, argv=0x7fffea7234e8) at virsh.c:501
>> >>
>> >> The problem is that cmdDomuuid didn't call virDomainFree(dom), so
>> >> cmdDomstate hangs trying to do the Unref.
>> >
>> > I don't understand why it would hang simply because it forgot to free
>> > an object. Surely it ought to just be a memory leak not hang ? Each
>> > individual libvirt API must always have a matched lock & unlock pair
>> > in all codepaths, so a hang should only occur if an unlock is missing
>> > in some codepath.
>>
>> Well maybe virDomainFree is misnamed then.
>> It does more:
>>
>> The missing virDomainFree calls cause that because
>> they are what is supposed to do the unlock.
>
> Not, its a bug in virUnrefDomain/Network - it calls mutex_lock() twice
> in one codepath, instead of calling  unlock().
>
> Of course your patch to avoid the memory leak is still needed - so ACK
> to that, but the locking flaw needs this patch:
>
> Index: src/hash.c
> ===================================================================
> RCS file: /data/cvs/libvirt/src/hash.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 hash.c
> --- src/hash.c	29 Jan 2008 18:15:54 -0000	1.29
> +++ src/hash.c	30 Jan 2008 17:42:32 -0000
> @@ -881,7 +881,7 @@ virUnrefDomain(virDomainPtr domain) {
>          return (0);
>      }
>
> -    pthread_mutex_lock(&domain->conn->lock);
> +    pthread_mutex_unlock(&domain->conn->lock);
>      return (refs);
>  }
>
> @@ -1013,7 +1013,7 @@ virUnrefNetwork(virNetworkPtr network) {
>          return (0);
>      }
>
> -    pthread_mutex_lock(&network->conn->lock);
> +    pthread_mutex_unlock(&network->conn->lock);
>      return (refs);
>  }

Oh!  Two locks in a row.
Amazing that such a problem wasn't exposed sooner.
Ack.


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]