Re: [libvirt] [RFC] Add a new field to qemud_save_header to tell if the saved image is corrupt



On 08/18/2011 11:42 PM, Osier Yang wrote:
Remember, that 'migrate' is a long-running async job command, and can
be interrupted. That is, 'service libvirtd restart' is a legal action
to take during step 3, and it is not as severe as a libvirtd crash,
and we have already recently added patches to remember async job
status across libvirtd restarts with the intention of making it legal
to restart libvirtd in the middle of an async job (whether the async
job should still succeed, or should remove the save file, is a
slightly different question; but removing the save file would require
that we save in the XML the name of the file to remove if libvirtd is
restarted).

Hmm, what about restarting libvirtd in the middle of a managed save?

The domain will be restored from the corrupt save image automatically. Do
we simply report an error like "image is corrupt" and quit starting the
domain?
That might not be good, as one would see a running domain fail to start
after libvirtd restarts.

Or do we want the managed save to still succeed? If so, we might need to:

1) continue the managed save job (since we already support remembering
the async job status across libvirtd restarts)
2) restore from the saved image finished in 1).

I think the easiest approach is:

if we restart libvirtd, and see that an async job for save-to-file was in progress, then we abort the job (leaving the file marked unfinished, whether it was managed save or user save), and log the error.

On managed restore (virDomainCreate or autostart), if the save file exists but is incomplete, then log the fact that the file is unusable, then unlink() the file and proceed to do a normal boot (nothing we can do to recover the lost autosave, but we can at least clean up on the user's behalf).

On user restore (virDomainRestore), if the save file exists but is incomplete, report the error to the user. No unlink(), and no rebooting the guest; it's up to the user to decide how to handle the failed save.
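
To make the two restore policies above concrete, here is a minimal sketch in Python. It is only an illustration, not libvirt code: the header layout (an 8-byte magic plus a one-byte "complete" flag), the magic value, and every function name are assumptions invented for the example, and the real qemud_save_header is different.

```python
import os
import struct
import tempfile

# Hypothetical header: 8-byte magic followed by a 1-byte "complete" flag.
# This is NOT the real qemud_save_header layout.
HEADER = struct.Struct("<8sB")
MAGIC = b"DEMOSAVE"


class IncompleteSaveError(Exception):
    """Raised when a user-requested restore hits an unfinished save file."""


def write_save(path, payload, complete):
    """Write a demo save file; complete=False mimics an interrupted save."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, 1 if complete else 0))
        f.write(payload)


def is_complete(path):
    """Return True only if the header is intact and flagged complete."""
    with open(path, "rb") as f:
        raw = f.read(HEADER.size)
    if len(raw) < HEADER.size:
        return False
    magic, flag = HEADER.unpack(raw)
    return magic == MAGIC and flag == 1


def managed_restore(path, log):
    """Managed restore: clean up an unusable file and fall back to a boot."""
    if not os.path.exists(path):
        return "boot"
    if is_complete(path):
        return "restore"
    log.append(f"save file {path} is unusable; removing it")
    os.unlink(path)
    return "boot"


def user_restore(path):
    """User restore: report the error and leave the file untouched."""
    if not is_complete(path):
        raise IncompleteSaveError(f"save file {path} is incomplete")
    return "restore"
```

The asymmetry is deliberate in this sketch: the managed path owns the file, so it may delete it and boot normally, while the user path only reports the problem and leaves the decision to the caller.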

But if we can figure out how to do better, by making a libvirtd restart able to complete the save process rather than ditch it, then that would be nicer. It's just that I don't know how easy that would be, and we have to start this patch somewhere.

--
Eric Blake   eblake redhat com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
