[libvirt] [PATCH v2 1/2] rpc: Report proper close reason for keepalive disconnections
Martin Kletzander
mkletzan at redhat.com
Mon Dec 8 08:59:50 UTC 2014
On Fri, Dec 05, 2014 at 12:30:17PM +0100, Jiri Denemark wrote:
>On Mon, Dec 01, 2014 at 12:00:52 +0100, Martin Kletzander wrote:
>> Whenever client socket was disconnected due to keepalive timeout, the
>> I/O event loop did not exit and continued until the point where the
>> hangup was found. Ending with an error right away when the
>> keepalive times out takes care of this problem.
>>
>> Signed-off-by: Martin Kletzander <mkletzan at redhat.com>
>> ---
>> src/rpc/virkeepalive.c | 11 ++++++-----
>> src/rpc/virnetclient.c | 1 +
>> 2 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/rpc/virkeepalive.c b/src/rpc/virkeepalive.c
>> index c882313..f382f93 100644
>> --- a/src/rpc/virkeepalive.c
>> +++ b/src/rpc/virkeepalive.c
>> @@ -136,11 +136,12 @@ virKeepAliveTimerInternal(virKeepAlivePtr ka,
>> ka, ka->client, ka->countToDeath, timeval);
>>
>> if (ka->countToDeath == 0) {
>> - VIR_WARN("No response from client %p after %d keepalive messages in"
>> - " %d seconds",
>> - ka->client,
>> - ka->count,
>> - timeval);
>> + virReportError(VIR_ERR_INTERNAL_ERROR,
>> + _("No response from client %p after %d keepalive messages in"
>> + " %d seconds"),
>> + ka->client,
>> + ka->count,
>> + timeval);
>> return true;
>> } else {
>> ka->countToDeath--;
>> diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
>> index 8657b0e..ac4850c 100644
>> --- a/src/rpc/virnetclient.c
>> +++ b/src/rpc/virnetclient.c
>> @@ -1525,6 +1525,7 @@ static int virNetClientIOEventLoop(virNetClientPtr client,
>>
>> if (virKeepAliveTrigger(client->keepalive, &msg)) {
>> virNetClientMarkClose(client, VIR_CONNECT_CLOSE_REASON_KEEPALIVE);
>> + goto error;
>> } else if (msg && virNetClientQueueNonBlocking(client, msg) < 0) {
>> VIR_WARN("Could not queue keepalive request");
>> virNetMessageFree(msg);
>
>I don't think this patch is quite right. The first hunk is OK, but the
>second one is problematic. We don't have an event loop here (sigh) and
>thus we use the passing the buck algorithm to process requests from
>several threads. So in general, in case of any error (not just keepalive
>timeout), we should make sure we process all possibly pending data on
>the socket first and let completed requests finish. Only unfinished
>requests should report the error. We should first finish the rest of the
>loop and only jump to the error path once we processed possibly
>completed requests. Moreover we should make sure the right error is
>reported by all threads, i.e., we either need to carry the original
>error in client and copy it from there to the thread local error object
>or each thread needs to report the error again (which is the current
>approach with "received hangup / error event on socket" message at the
>end of the loop).
>
>Welcome to the client's IO loop code :-)
>
So if I understand it properly, I cannot use virReportError(), I
cannot finish the execution with any data left in the socket to
read... What *can* I do? :D
As you ACKed the second patch, which can go in without this one, would
you be OK with it going it with the previous series as well (all
together), i.e.:
https://www.redhat.com/archives/libvir-list/2014-November/msg01081.html
ttps://www.redhat.com/archives/libvir-list/2014-November/msg01082.html
*and* 2/2 of this v2?
Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20141208/93d49a79/attachment-0001.sig>
More information about the libvir-list
mailing list