
Re: [libvirt] PATCH: Disable QEMU drive caching



Steve Ofsthun wrote:
Anthony Liguori wrote:
Daniel P. Berrange wrote:
On Wed, Oct 08, 2008 at 11:06:27AM -0500, Anthony Liguori wrote:
  Sorry, it was mistakenly private - fixed now.
Xen does use O_DIRECT for the paravirt driver case: blktap uses the
combination of AIO+O_DIRECT.
You have to use O_DIRECT with linux-aio.  And blktap is well known to
have terrible performance.  Most serious users use blkback/blkfront and
blkback does not avoid the host page cache.  It maintains data integrity
by passing through barriers from the guest to the host.  You can
approximate this in userspace by using fdatasync.

This is not accurate (at least for HVM guests using PV drivers on Xen 3.2).  blkback does indeed bypass the host page cache completely.  Its I/O behavior is akin to O_DIRECT.

I reread the code more closely and convinced myself that you are correct. While it was obvious that the bios were being constructed from granted pages, my initial impression was that the requests were still going through the scheduler and could still be satisfied from the host page cache. But that is not the case.

I/O is DMA'd directly to/from guest pages without involving any dom0 buffering.  blkback barrier support only enforces write ordering of the blkback I/O stream(s).  It does nothing to synchronize data in the host page cache.  Data written through blkback will modify the storage "underneath" any data in the host page cache (without flushing the page cache).  Subsequent access to the page cache by qemu-dm will access stale data.  In our own Xen product we must explicitly flush the host page cache backing store data at qemu-dm startup, to guarantee proper data access.  It is not safe to access the same backing object with both qemu-dm and blkback simultaneously.

The issue the bug addresses (iozone performing better in the guest than
natively) can be addressed in the following way:

1) For IDE, you have to disable write-caching in the guest.  This should
force an fdatasync in the host.
2) For virtio-blk, we need to implement barrier support.  This is what
blkfront/blkback do.

I don't think this is enough.  Barrier semantics are local to a particular I/O stream.  There would be no reason for the barrier to affect the host page cache (unless the I/Os are buffered by the cache).

If we implement barriers in terms of fdatasync, it should be sufficient.

Regards,

Anthony Liguori

