[libvirt] PATCH: Fix remote driver RPC recv handling

Wed Jan 28 19:06:44 UTC 2009

"Daniel P. Berrange" <berrange at redhat.com> wrote:
> I noticed a odd error message randomly appearing from virsh after all
> commands had been run
>
>   # virsh dominfo VirtTest
>   Id:             -
>   Name:           VirtTest
>   UUID:           82038f2a-1344-aaf7-1a85-2a7250be2076
>   OS Type:        hvm
>   State:          shut off
>   CPU(s):         3
>   Max memory:     256000 kB
>   Used memory:    256000 kB
>   Autostart:      disable
>
>   libvir: Remote error : server closed connection
>
> It turns out that inthe for(;;) loop where I'm procesing incoming data,
> it was forgetting to break out of the loop when a completed RPC call
> had been received. Most of the time this wasn't  problem, because there'd
> rarely be more data following, so it'd get EAGAIN next iteration & quit
> the loop. When shutting down though, you'd occassionally see the SIGHUP
> from read() before you get an EAGAIN, causing this error to be raised
> even though the RPC call had been successfully received.
>
> In addition, if there was a 2nd RPC reply already pending, we'd read and
> loose some of its data. Though I never saw this happen, it is a definite
> theoretical possibility, particularly over a TCP link with bad latency
> or fragementation or bursty traffic.
>
> Daniel
>
> diff -r e582072116a3 src/remote_internal.c
> --- a/src/remote_internal.c	Mon Jan 26 16:21:42 2009 +0000
> +++ b/src/remote_internal.c	Mon Jan 26 17:02:15 2009 +0000
> @@ -6135,12 +6135,27 @@ processCallRecv(virConnectPtr conn, stru
>          if (priv->bufferOffset == priv->bufferLength) {
>              if (priv->bufferOffset == 4) {
>                  ret = processCallRecvLen(conn, priv, in_open);
> +                if (ret < 0)
> +                    return -1;

This part is no net change.  Just hoisting the test+return
that used to follow both calls.

> +                /*
> +                 * We'll carry on around the loop to immediately
> +                 * process the message body, because it has probably
> +                 * already arrived. Worst case, we'll get EAGAIN on
> +                 * next iteration.
> +                 */
>              } else {
>                  ret = processCallRecvMsg(conn, priv, in_open);
>                  priv->bufferOffset = priv->bufferLength = 0;
> -            }
> -            if (ret < 0)
> -                return -1;
> +                /*
> +                 * We've completed one call, so return even
> +                 * though there might still be more data on
> +                 * the wire. We need to actually let the caller
> +                 * deal with this arrived message to keep good
> +                 * response, and also to correctly handle EOF.
> +                 */
> +                return ret;

This seems like a good idea for all the reasons you give.

Suggestions, while you're in the area:
- s/EGAIN/EAGAIN/ in a comment just above, in the same function.
- less important: I'd move the declaration of "ret" down into the
    while loop, since it's used only therein.  Save 2 lines (yeah, yeah,
    out of 6800+ -- gotta start somewhere)

ACK