[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] potential bug in qemu snapshots

On 12/22/2010 11:35 AM, Eric Blake wrote:
> Right now, we create qemu snapshots to a file by using an exec: monitor
> command that passes 'compressor | { dd && dd; }' with stdout connected
> to the target file.  However, since dd is using a larger bs= than
> PIPE_MAX, it is conceivable that under heavy machine load, that dd will
> get a short read from the pipe, and unless we use the GNU extension of
> iflag=fullblock, that short read will be padded out to the output block
> size and result in data corruption in the destination file.
> The possibility of corruption due to short reads when dd is fed input
> through a pipe but produces output through a large block size has
> occurred several times on the coreutils mailing list:
> http://lists.gnu.org/archive/html/bug-coreutils/2010-11/msg00092.html
> Coreutils currently obeys the (non-intuitive) POSIX requirements for
> this short read behavior, and while it is being considered to make dd
> when POSIXLY_CORRECT is unset be more sensible, you can't rely on that
> being a default.
> Should we refuse to make snapshots if we don't detect the GNU extension
> of 'dd iflag=fullblock'?  Probably safe to do on GNU/Linux machines, but
> I'm wondering if it will have negative effects on BSD machines where GNU
> coreutils is not installed.

I've gone back and read POSIX again.  It states:

If the bs=expr operand is specified and no conversions other than sync,
noerror, or notrunc are requested, the data returned from each input
block shall be written as a separate output block; if the read returns
less than a full block and the sync conversion is not specified, the
resulting output block shall be the same size as the input block. If the
bs=expr operand is not specified, or a conversion other than sync,
noerror, or notrunc is requested, the input shall be processed and
collected into full-sized output blocks until the end of the input is

So I stand corrected - because we aren't using conv=sync, there is no
zero padding or data corruption; however, behavior can still be
improved.  With bs=, a short read results in a short write, so we aren't
using the full 1M block size that we asked of dd, and end up with more
syscalls than necessary.  Swapping to ibs= obs= instead of bs= allows dd
to do fewer syscalls on output.  Patch coming up soon, but I'm grateful
that this is not a data corruption bug after all.  And the GNU extension
of iflag=fullblock is only useful when mixing bs= and conv=sync, so
there's no need to use that extension.

Eric Blake   eblake redhat com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]