[libvirt] [PATCH] Fix reporting of i/o errors by iohelper process

Jason Herne jjherne at us.ibm.com
Wed Jul 30 14:23:47 UTC 2014


Hi Eric,

Thanks for continuing work on this.

To repeat my test:
Make a small disk image (50MB) and mount it
at /usr/local/var/lib/libvirt/qemu/save/.
Then attempt a managed save of a guest whose memory is larger, say 256MB.
The image will quickly fill and you should see the "Unexpected Error"
message but nothing more.

I was unable to get any meaningful error info from iohelper. I'll try your
"goodbye world" patch and see what I get. Things might get interesting if
my result is different :).

   - Jason J. Herne
   z/VM CP Development
   IBM Corporation, Endicott, NY
   Email - jjherne at us.ibm.com
   Phone 607-429-5136 or tie-line 620-5136


From:    Eric Blake <eblake at redhat.com>
To:      Jason Herne/Endicott/IBM at IBMUS, libvir-list at redhat.com
Date:    07/16/2014 07:57 PM
Subject: Re: [libvirt] [PATCH] Fix reporting of i/o errors by iohelper process

On 07/16/2014 05:20 PM, Eric Blake wrote:

> But if I then rework the iohelper patch:
>
> diff --git i/src/util/iohelper.c w/src/util/iohelper.c
> index 8a3c377..efb1366 100644
> --- i/src/util/iohelper.c
> +++ w/src/util/iohelper.c
> @@ -301,6 +301,7 @@ main(int argc, char **argv)
>          exit(EXIT_FAILURE);
>      }
>
> +    fprintf(stderr, _("goodbye world\n")); goto error;
>      if (fd < 0 || runIO(path, fd, oflags, length) < 0)
>          goto error;
>
>
> the error is now:
>
> error: Failed to save domain testvm1 to /tmp/save
> error: operation failed: domain save job: unexpectedly failed

and with your patch, I see:

error: Failed to save domain testvm1 to /tmp/save
error: internal error: Child process (LIBVIRT_LOG_OUTPUTS=1:stderr
/home/eblake/libvirt/src/libvirt_iohelper /tmp/save 0 1) unexpected exit
status 1: goodbye world
/home/eblake/libvirt/src/libvirt_iohelper: unknown failure with /tmp/save

on the console, but this longer spew in libvirt's log:

2014-07-16 23:34:23.855+0000: 25406: error :
qemuMigrationUpdateJobStatus:1788 : operation failed: domain save job:
unexpectedly failed
2014-07-16 23:34:23.857+0000: 25406: error : virCommandWait:2423 :
internal error: Child process (LIBVIRT_LOG_OUTPUTS=1:stderr
/home/eblake/libvirt/src/libvirt_iohelper /tmp/save 0 1) unexpected exit
status 1: goodbye world
/home/eblake/libvirt/src/libvirt_iohelper: unknown failure with /tmp/save

2014-07-16 23:34:23.857+0000: 25406: warning : virFileWrapperFdClose:326
: iohelper reports: goodbye world
/home/eblake/libvirt/src/libvirt_iohelper: unknown failure with /tmp/save


so the act of closing the wrapperfd is keeping the earlier error message
from being reported to the user (seems okay in this case, but might not
always be), AND is logging the stderr contents twice (once via the error
reported to the user, and again due to a VIR_WARN).

>
> So the problem is that we have _two_ possible sources of errors (qemu
> reporting failure because it can't write to iohelper, and iohelper
> reporting an error for whatever other reason, such as disk full).  If
> qemu finishes, we have only the iohelper message and properly report it;
> but if we have both failures, then the qemu error takes priority, and in
> this case it is the less informative of the two.  There are also cases
> where qemu will error out but iohelper succeeds (such as if qemu refuses
> to migrate because the guest has hostdev passthrough).
>
> So I _think_ what we want to do is always check BOTH places for error;
> if only one of the two fails, use that message.  If both fail, then I
> don't know whether it is possible to have a heuristic for which failure
> message is more meaningful, or whether it is better to report both
> errors (even though it will often be the case that one error was a
> knock-on effect from the other).  But I'm a bit stuck on the best way to
> implement that.

I'm still thinking about the best solution.
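
For illustration only, the "check both places" idea quoted above could be
sketched roughly as follows. This is not libvirt code: chooseSaveError and
its parameters are hypothetical names, and joining the two messages when
both sides fail is just one of the options mentioned above, not a settled
design. A NULL or empty string stands for "this side reported no error".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an existing libvirt function): choose which
 * message to report when a save has two possible failure sources.
 * qemuErr and helperErr are NULL or "" when that side did not fail.
 * buf is scratch space, used only when both sides failed. */
static const char *
chooseSaveError(const char *qemuErr, const char *helperErr,
                char *buf, size_t buflen)
{
    int haveQemu = qemuErr && *qemuErr;
    int haveHelper = helperErr && *helperErr;

    assert(buf != NULL && buflen > 0);

    if (!haveQemu && !haveHelper)
        return NULL;                /* neither side failed */
    if (haveQemu && !haveHelper)
        return qemuErr;             /* only qemu failed */
    if (!haveQemu && haveHelper)
        return helperErr;           /* only iohelper failed */

    /* Both failed: no heuristic for which matters more, so report both. */
    snprintf(buf, buflen, "%s; iohelper reports: %s", qemuErr, helperErr);
    return buf;
}
```

With something like this, the message would be reported to the user once,
and the caller could skip the separate warning when the iohelper text is
already part of the reported error.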

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
