[libvirt] [PATCH v2 1/2] rpc: Report proper close reason for keepalive disconnections

Martin Kletzander mkletzan at redhat.com
Mon Dec 8 08:59:50 UTC 2014


On Fri, Dec 05, 2014 at 12:30:17PM +0100, Jiri Denemark wrote:
>On Mon, Dec 01, 2014 at 12:00:52 +0100, Martin Kletzander wrote:
>> Whenever client socket was disconnected due to keepalive timeout, the
>> I/O event loop did not exit and continued until the point where the
>> hangup was found.  Ending with an error right away when the
>> keepalive times out takes care of this problem.
>>
>> Signed-off-by: Martin Kletzander <mkletzan at redhat.com>
>> ---
>>  src/rpc/virkeepalive.c | 11 ++++++-----
>>  src/rpc/virnetclient.c |  1 +
>>  2 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/rpc/virkeepalive.c b/src/rpc/virkeepalive.c
>> index c882313..f382f93 100644
>> --- a/src/rpc/virkeepalive.c
>> +++ b/src/rpc/virkeepalive.c
>> @@ -136,11 +136,12 @@ virKeepAliveTimerInternal(virKeepAlivePtr ka,
>>            ka, ka->client, ka->countToDeath, timeval);
>>
>>      if (ka->countToDeath == 0) {
>> -        VIR_WARN("No response from client %p after %d keepalive messages in"
>> -                 " %d seconds",
>> -                 ka->client,
>> -                 ka->count,
>> -                 timeval);
>> +        virReportError(VIR_ERR_INTERNAL_ERROR,
>> +                       _("No response from client %p after %d keepalive messages in"
>> +                         " %d seconds"),
>> +                       ka->client,
>> +                       ka->count,
>> +                       timeval);
>>          return true;
>>      } else {
>>          ka->countToDeath--;
>> diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
>> index 8657b0e..ac4850c 100644
>> --- a/src/rpc/virnetclient.c
>> +++ b/src/rpc/virnetclient.c
>> @@ -1525,6 +1525,7 @@ static int virNetClientIOEventLoop(virNetClientPtr client,
>>
>>          if (virKeepAliveTrigger(client->keepalive, &msg)) {
>>              virNetClientMarkClose(client, VIR_CONNECT_CLOSE_REASON_KEEPALIVE);
>> +            goto error;
>>          } else if (msg && virNetClientQueueNonBlocking(client, msg) < 0) {
>>              VIR_WARN("Could not queue keepalive request");
>>              virNetMessageFree(msg);
>
>I don't think this patch is quite right. The first hunk is OK, but the
>second one is problematic. We don't have an event loop here (sigh) and
>thus we use the passing the buck algorithm to process requests from
>several threads. So in general, in case of any error (not just keepalive
>timeout), we should make sure we process all possibly pending data on
>the socket first and let completed requests finish. Only unfinished
>requests should report the error. We should first finish the rest of the
>loop and only jump to the error path once we processed possibly
>completed requests. Moreover we should make sure the right error is
>reported by all threads, i.e., we either need to carry the original
>error in client and copy it from there to the thread local error object
>or each thread needs to report the error again (which is the current
>approach with "received hangup / error event on socket" message at the
>end of the loop).
>
>Welcome to the client's IO loop code :-)
>

So if I understand it properly, I cannot use virReportError(), I
cannot finish the execution with any data left in the socket to
read...  What *can* I do? :D

As you ACKed the second patch, which can go in without this one, would
you be OK with it going it with the previous series as well (all
together), i.e.:

https://www.redhat.com/archives/libvir-list/2014-November/msg01081.html
ttps://www.redhat.com/archives/libvir-list/2014-November/msg01082.html

*and* 2/2 of this v2?

Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20141208/93d49a79/attachment-0001.sig>


More information about the libvir-list mailing list