[libvirt] [PATCH] qemu: Fix domain resume after failed migration

Dr. David Alan Gilbert dgilbert at redhat.com
Tue Jun 19 18:55:33 UTC 2018


* Peter Krempa (pkrempa at redhat.com) wrote:
> On Mon, Jun 04, 2018 at 16:51:18 +0200, Jiri Denemark wrote:
> > Libvirt relies on being able to kill the destination domain and resume
> > the source one during migration until we call "cont" on the
> > destination. Unfortunately, QEMU automatically activates block devices
> > at the end of migration even when it's called with -S. This wasn't a big
> > issue in the past since the guest is not running and thus no data are
> > written to the block devices. However, when QEMU introduced its internal
> > block device locks, we can no longer resume the source domain once the
> > destination domain already activated the block devices (and thus
> > acquired all locks) unless the destination domain is killed first.
> > 
> > Since it's impossible to synchronize the destination and the source
> > libvirt daemons after a failed migration, QEMU introduced a new
> > migration capability called "late-block-activate" which ensures QEMU
> > won't activate block devices until it gets "cont". The only thing we
> > need to do is to enable this capability whenever QEMU supports it.
> 
> I'm wondering when this new feature should _not_ be used. I did not get
> the information from the qemu commit message so I've cc'd David to shed
> some light.
> 
> If it's desired to always pass it then I'm failing to see why they've
> added it in the first place.


There was some worry that doing it by default would be a subtle API
change; personally I didn't really see it as a problem, but since people
were worried I made it switchable.

See:
https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg01300.html
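
For anyone who wants to poke at this by hand: the capability is toggled
over QMP like any other migration capability. Below is a rough sketch,
not what libvirt actually does (libvirt drives this from its C monitor
code). It only builds the JSON for "migrate-set-capabilities" and checks
a "query-migrate-capabilities" reply; the helper names and the sample
reply are made up for illustration, and sending the command over a real
QMP socket is left out.

```python
import json

# Capability name as discussed in this thread.
CAPABILITY = "late-block-activate"


def supports_capability(query_reply, name=CAPABILITY):
    """Return True if a query-migrate-capabilities reply lists `name`.

    `query_reply` is the decoded JSON reply, i.e. a dict whose "return"
    key holds a list of {"capability": ..., "state": ...} entries.
    """
    return any(
        cap.get("capability") == name
        for cap in query_reply.get("return", [])
    )


def build_enable_command(name=CAPABILITY):
    """Build the QMP migrate-set-capabilities command that enables `name`."""
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [{"capability": name, "state": True}],
        },
    }


# Hypothetical reply, hand-written for illustration only.
sample_reply = {
    "return": [
        {"capability": "xbzrle", "state": False},
        {"capability": "late-block-activate", "state": False},
    ]
}

# Enable the capability only when the QEMU binary advertises it.
if supports_capability(sample_reply):
    print(json.dumps(build_enable_command()))
```

The command would have to be sent to the destination QEMU before the
incoming migration starts, so that block devices stay inactive until
"cont" is issued.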

Dave
--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK



