[libvirt] libvirtd deadlock on shutdown

Jim Fehlig jfehlig at suse.com
Wed Jun 20 23:37:37 UTC 2012


Eric Blake wrote:
> On 06/20/2012 03:07 PM, Jim Fehlig wrote:
>   
>> I'm looking into a libvirtd deadlock on daemon shutdown.  The deadlock
>> occurs when shutting down virNetServer.  If the handling of a job is in
>> flight in virNetServerHandleJob(), the virNetServer lock is acquired
>> when freeing job->prog (src/rpc/virnetserver.c:167).  But the lock is
>> already held in virNetServerFree(), which is blocked in
>> virThreadPoolFree() waiting for all the workers to finish.  No progress
>> can be made.
>>
>> The attached hack fixes the problem, but I'm not convinced this is an
>> appropriate fix.  Is it necessary to hold the virNetServer lock when
>> calling virNetServerProgramFree(job->prog)?  I notice the lock is not
>> held in the error path of virNetServerHandleJob().
>>
>>     
>
>   
>> +++ b/src/rpc/virnetserver.c
>> @@ -774,7 +774,9 @@ void virNetServerFree(virNetServerPtr srv)
>>      for (i = 0 ; i < srv->nservices ; i++)
>>          virNetServerServiceToggle(srv->services[i], false);
>>  
>> +    virNetServerUnlock(srv);
>>      virThreadPoolFree(srv->workers);
>> +    virNetServerLock(srv);
>>     
>
> As written, this can't be right, because it reads a field of srv outside
> the locks.  But maybe you meant:
>
> <type> tmp = srv->workers;
> srv->workers = NULL;
> virNetServerUnlock(srv);
> virThreadPoolFree(tmp);
> virNetServerLock(srv);
>
> as a possible fix that doesn't violate the safety rules of reading
> fields from srv outside a lock.
>   

Perhaps, but it _currently_ appears srv->workers is only set in
virNetServerNew(), which is called when libvirtd starts.  I suppose that
could change in the future, causing a bug as I wrote it.

> I hope danpb chimes in on this one, as he is more of an expert when it
> comes to the locking rules in virnet*.
>   

Agreed.  I think there is a better fix here.

Thanks,
Jim




More information about the libvir-list mailing list