
Re: [libvirt] [Qemu-devel] QEMU interfaces for image streaming and post-copy block migration



 On 09/07/2010 04:41 PM, Anthony Liguori wrote:
Hi,

We've got copy-on-read and image streaming working in QED and before going much further, I wanted to bounce some interfaces off of the libvirt folks to make sure our final interface makes sense.

Here's the basic idea:

Today, you can create images based on base images that are copy-on-write. With QED, we also support copy-on-read, which forces a copy from the backing image on read requests and write requests.

Is copy on read QED specific? It looks very similar to the commit command, except with I/O directions reversed.

IIRC, commit looks like

  for each sector:
    if image.mapped(sector):
        backing_image.write(sector, image.read(sector))

whereas copy-on-read looks like:

  def copy_on_read():
    set_ioprio(idle)
    for each sector:
      if not image.mapped(sector):
          image.write(sector, backing_image.read(sector))
  run_in_thread(copy_on_read)

With appropriate locking.
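
To make the symmetry concrete, here is a rough, runnable Python sketch of the two loops against a toy in-memory image. ToyImage, commit() and populate() are hypothetical names made up for illustration, not QEMU or QED code. The idle I/O priority step is left out, and a single per-image lock stands in for "appropriate locking".

```python
import threading


class ToyImage:
    """Hypothetical in-memory image: maps sector number -> data."""

    def __init__(self, sectors=None, backing=None):
        self.sectors = dict(sectors or {})
        self.backing = backing
        self.lock = threading.Lock()

    def mapped(self, sector):
        return sector in self.sectors

    def read(self, sector):
        # Unallocated sectors fall through to the backing image.
        if sector in self.sectors:
            return self.sectors[sector]
        if self.backing is not None:
            return self.backing.read(sector)
        return b"\0"

    def write(self, sector, data):
        self.sectors[sector] = data


def commit(image, nb_sectors):
    """Push every sector allocated in the overlay down into the backing image."""
    for sector in range(nb_sectors):
        with image.lock:
            if image.mapped(sector):
                image.backing.write(sector, image.read(sector))


def populate(image, nb_sectors):
    """Pull every sector missing from the overlay up from the backing image."""
    for sector in range(nb_sectors):
        with image.lock:  # assumed: guest writes would take the same lock
            if not image.mapped(sector):
                image.write(sector, image.backing.read(sector))


base = ToyImage({0: b"a", 1: b"b", 2: b"c"})
overlay = ToyImage({1: b"B"}, backing=base)

# Run the copy in a background thread, as in the pseudocode above.
worker = threading.Thread(target=populate, args=(overlay, 3))
worker.start()
worker.join()
```

After populate() the overlay holds all three sectors and the base is untouched; calling commit(overlay, 3) afterwards writes the overlay's sectors down into the base.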


In addition to copy on read, we introduce a notion of streaming a block device which means that we search for an unallocated region of the leaf image and force a copy-on-read operation.

The combination of copy-on-read and streaming means that you can start a guest based on slow storage (like over the network) and bring in blocks on demand while also having a deterministic mechanism to complete the transfer.

The interface for copy-on-read is just an option within qemu-img create. Streaming, on the other hand, requires a bit more thought. Today, I have a monitor command that does the following:

stream <device> <sector offset>

Which will try to stream the minimal amount of data for a single I/O operation and then return how many sectors were successfully streamed.

The idea about how to drive this interface is a loop like:

offset = 0;
while offset < image_size:
   wait_for_idle_time()
   count = stream(device, offset)
   offset += count


This is way too low level for the management stack.

Have you considered using the idle class I/O priority to implement this? That would allow host-wide prioritization. Not sure how to do it cluster-wide; I don't think NFS has the concept of I/O priority.


--
error compiling committee.c: too many arguments to function

