[libvirt] [PATCH] events: Fix domain event race on client disconnect

Michal Privoznik mprivozn at redhat.com
Fri Sep 7 12:55:51 UTC 2012


On 07.09.2012 14:47, Daniel P. Berrange wrote:
> On Fri, Sep 07, 2012 at 02:44:03PM +0200, Christophe Fergeau wrote:
>> On Fri, Sep 07, 2012 at 01:24:35PM +0100, Daniel P. Berrange wrote:
>>> A nice long detailed explanation. I agree that this scenario you
>>> outline is plausible as an explanation for why Boxes sometimes
>>> stops getting events from libvirtd.
>>
>> I've run more tests in the meantime without this patch applied, but
>> with the one below to add some debugging:
>>
>> diff --git a/src/conf/domain_event.c b/src/conf/domain_event.c
>> index 43ecdcf..33d90fb 100644
>> --- a/src/conf/domain_event.c
>> +++ b/src/conf/domain_event.c
>> @@ -1501,7 +1501,13 @@ virDomainEventStateRegisterID(virConnectPtr conn,
>>      int ret = -1;
>>
>>      virDomainEventStateLock(state);
>> +    VIR_WARN("RegisterID");

[1]

>>
>> +    if ((state->callbacks->count == 0) && (state->timer == -1)) {
>> +        if (state->queue->count != 0) {
>> +            VIR_WARN("REG: queue's not empty: %d", state->queue->count);

I understand the WARN level here, but not at [1]. Wouldn't DEBUG be
enough there?

>> +        }
>> +    }
>>      if ((state->callbacks->count == 0) &&
>>          (state->timer == -1) &&
>>          (state->timer = virEventAddTimeout(-1,
>> @@ -1584,6 +1590,7 @@ virDomainEventStateDeregisterID(virConnectPtr conn,
>>  {
>>      int ret;
>>
>> +    VIR_WARN("DeregisterID");

Ditto: DEBUG should be enough here as well.

>>      virDomainEventStateLock(state);
>>      if (state->isDispatching)
>>          ret = virDomainEventCallbackListMarkDeleteID(conn,
>> @@ -1596,6 +1603,9 @@ virDomainEventStateDeregisterID(virConnectPtr conn,
>>          state->timer != -1) {
>>          virEventRemoveTimeout(state->timer);
>>          state->timer = -1;
>> +        if (state->queue->count != 0) {
>> +            VIR_WARN("DEREG: queue's not empty: %d", state->queue->count);
>> +        }
>>      }
>>
>>      virDomainEventStateUnlock(state);
>>
>>
>> I've hit the lost-events issue once, and right when it started happening,
>> the log was:
>> 2012-09-06 11:37:06.094+0000: 30498: warning : virDomainEventStateDeregisterID:1593 : DeregisterID
>> 2012-09-06 11:37:06.094+0000: 30498: warning : virDomainEventStateDeregisterID:1607 : DEREG: queue's not empty: 1
>> 2012-09-06 11:45:42.363+0000: 30502: warning : virDomainEventStateRegisterID:1504 : RegisterID
>> 2012-09-06 11:45:42.363+0000: 30502: warning : virDomainEventStateRegisterID:1508 : REG: queue's not empty: 1
>>
>> After that, no more events arrived, and these warnings kept appearing with
>> an increasing number of queued events, which is consistent with the
>> hypothesis I made in this patch.
> 
> Great, that's pretty encouraging then.
> 
> Daniel
> 
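For anyone following along, the sequence in the log above can be modelled
with a small standalone sketch. This is not libvirt code: the struct and
all function names are hypothetical stand-ins. It assumes the timer is only
armed when the queue goes from empty to non-empty, which is my reading of
the hypothesis; the disabled-on-creation timer and its removal on the last
deregistration mirror what the quoted diff context shows.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the event state (not libvirt's struct). */
struct event_state {
    int callbacks;     /* number of registered callbacks */
    int queued;        /* events waiting to be dispatched */
    bool have_timer;   /* a timer is registered with the event loop */
    bool timer_armed;  /* the timer is set to fire */
};

/* First registration creates the timer disabled, like virEventAddTimeout(-1, ...). */
static void state_register(struct event_state *s)
{
    if (s->callbacks == 0 && !s->have_timer) {
        s->have_timer = true;
        s->timer_armed = false;
    }
    s->callbacks++;
}

/* Last deregistration removes the timer but leaves the queue untouched. */
static void state_deregister(struct event_state *s)
{
    if (s->callbacks > 0)
        s->callbacks--;
    if (s->callbacks == 0 && s->have_timer) {
        s->have_timer = false;
        s->timer_armed = false;
    }
}

/* Assumption: the timer is armed only on the empty -> non-empty transition. */
static void state_queue(struct event_state *s)
{
    if (s->queued == 0 && s->have_timer)
        s->timer_armed = true;
    s->queued++;
}

/* Timer callback: flushes the queue only if the timer was armed. */
static int state_run_timer(struct event_state *s)
{
    int dispatched;

    if (!s->timer_armed)
        return 0;
    dispatched = s->queued;
    s->queued = 0;
    s->timer_armed = false;
    return dispatched;
}

/* Replays the logged sequence and returns how many events end up stuck. */
static int simulate_reconnect(void)
{
    struct event_state s = {0};

    state_register(&s);
    state_queue(&s);       /* queue 0 -> 1, timer armed */
    state_deregister(&s);  /* client goes away before the timer fires */
    state_register(&s);    /* new timer is disabled, queue still holds 1 */
    state_queue(&s);       /* queue 1 -> 2, timer never re-armed */
    state_queue(&s);       /* queue 2 -> 3 */
    state_run_timer(&s);   /* dispatches nothing */
    return s.queued;
}
```

In this model simulate_reconnect() returns 3: the event left over from the
first registration keeps the queue non-empty, so the new timer is never
armed and every later event just piles up, which matches the growing count
in the warnings.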



