Re: [libvirt] [PATCHv9 5/9] blockjob: make drive-reopen safer

On 10/26/2012 07:08 AM, Peter Krempa wrote:
> On 10/23/12 04:10, Eric Blake wrote:
>> Since libvirt drops locks between issuing a monitor command and
>> getting a response, it is possible for libvirtd to be restarted
>> before getting a response on a drive-reopen command; worse, it is
>> also possible for the guest to shut itself down during the window
>> while libvirtd is down, ending the qemu process.  A management app
>> needs to know if the pivot happened (and the destination file
>> contains guest contents not in the source) or failed (and the source
>> file contains guest contents not in the destination), but since
>> the job is finished, 'query-block-jobs' no longer tracks the
>> status of the job, and if the qemu process itself has disappeared,
>> even 'query-block' cannot be checked to ask qemu its current state.
>> This is mainly a problem for the RHEL 6.3 drive-reopen command, which
>> partly explains why upstream qemu 1.3 abandoned that command and
>> went with block-job-complete plus persistent bitmap instead.  At
>> the time of this patch, the design for persistent bitmap has not
>> been clarified, so a followup patch will be needed once we actually
>> figure out how to use the qemu 1.3 interface.
>> If we surround 'drive-reopen' with a pause/resume pair, then we can
>> guarantee that the guest cannot modify either source or destination
>> files in the window of libvirtd uncertainty.  The management app is
>> then guaranteed one of two things: either libvirt knows the outcome
>> and reported it correctly, or on libvirtd restart the guest is still
>> paused and the qemu process cannot have disappeared due to guest
>> shutdown.  In the latter case, the still-paused guest is a clue that
>> the management app must implement a recovery protocol, with both
>> source and destination files still in sync and with 'query-block'
>> still available as part of that recovery.  My testing of the RHEL 6.3
>> implementation of 'drive-reopen' shows that the pause window will
>> typically be only a fraction of a second.
>> * src/qemu/qemu_driver.c (qemuDomainBlockPivot): Pause around
>> drive-reopen.
>> (qemuDomainBlockJobImpl): Update caller.
>> ---
>>   src/qemu/qemu_driver.c | 37 +++++++++++++++++++++++++++++++++++--
>>   1 file changed, 35 insertions(+), 2 deletions(-)
> ACK with the RHEL-specific code in, but it should/could be dropped if
> we decide to support only the upstream functionality.

I will keep the commit (mostly) as-is, but touch up the commit message.
That is, with the current state of qemu.git, we STILL have to pause the
guest, since Paolo has not completed the persistent bitmap design.  I'm
not sure if he will get that done by qemu 1.3.  If he does, we can
revisit this code as part of using his persistent bitmap; if he doesn't,
I'd rather have this code be safe out-of-the-box.
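
For anyone following along, the pause/resume pairing amounts to roughly
the following.  This is only a minimal sketch against a mocked-up guest,
not the actual code in qemuDomainBlockPivot: mock_guest, mock_stop,
mock_cont, mock_drive_reopen and pivot_with_pause are all hypothetical
names invented for illustration, and the choice to leave an
already-paused guest paused is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a qemu guest; not a libvirt structure. */
typedef enum { GUEST_RUNNING, GUEST_PAUSED } guest_state;

typedef struct {
    guest_state state;
    bool pivoted;       /* true once the guest writes to the destination */
    bool reopen_fails;  /* knob to simulate a failing drive-reopen */
} mock_guest;

/* Mock monitor commands (hypothetical; they only flip local state). */
static int mock_stop(mock_guest *g) { g->state = GUEST_PAUSED; return 0; }
static int mock_cont(mock_guest *g) { g->state = GUEST_RUNNING; return 0; }

static int mock_drive_reopen(mock_guest *g)
{
    /* The guest must not be able to write while the outcome is unknown. */
    assert(g->state == GUEST_PAUSED);
    if (g->reopen_fails)
        return -1;
    g->pivoted = true;
    return 0;
}

/* Pause, pivot, then resume.  Returns 0 on success, -1 on failure. */
static int pivot_with_pause(mock_guest *g)
{
    /* Assumption: only resume a guest that this function paused. */
    bool resume = (g->state == GUEST_RUNNING);
    int ret;

    if (resume && mock_stop(g) < 0)
        return -1;

    /* If the manager dies here, the guest stays paused, so neither
     * file changes and the qemu process cannot exit via guest shutdown. */
    ret = mock_drive_reopen(g);

    /* The outcome is now known either way, so it is safe to resume. */
    if (resume && mock_cont(g) < 0)
        return -1;

    return ret;
}
```

The point of the ordering is that a still-paused guest seen after a
libvirtd restart means the outcome was never reported, which is the cue
for the management app to run its recovery protocol.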

