[Linux-cluster] KVM Live migration when node's FS is read-only

Hi all,

  So I hit a weird issue last week... (EL6 + cman + rgmanager + drbd)

For reasons unknown, a client thought they could start yanking and replacing hard drives on a running node. Obviously, that did not end well. The VMs that had been running on the node continued to operate fine, though; they simply started using the peer's storage.

The problem came when I tried to live-migrate the VMs over to the still-good node. The old host couldn't write to its logs, so the live migration failed, and once it failed, rgmanager stopped working as well. In the end, I had to fence the node manually (corosync never failed, so the node was never fenced automatically).

Fencing, of course, rebooted the VMs running on the node, causing a roughly 40-second outage. It strikes me that the system *should* have been able to migrate, had it not tried to write to the logs.

Is there a way, or can there be made a way, to migrate VMs off of a node whose underlying FS is read-only/corrupt/destroyed, so long as the programs in memory are still working?
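For illustration, the kind of manual escape hatch I have in mind would be something like the following. This is only a sketch: "vm01" and "peer-node" are placeholder names, and it assumes libvirtd on the sick node can still respond despite the read-only FS.

```shell
# Sketch only: assumes libvirtd is still responsive on the degraded node.
# "vm01" and "peer-node" are hypothetical names.

# Freeze the rgmanager service so a failed status check doesn't trigger recovery:
clusvcadm -Z vm:vm01

# Live-migrate the guest directly via libvirt, bypassing rgmanager:
virsh migrate --live vm01 qemu+ssh://peer-node/system

# Once the guest is confirmed running on the peer, unfreeze the service:
clusvcadm -U vm:vm01
```

Of course, in my case virsh itself may have choked trying to write its own logs, which is exactly what I'm asking about.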

  I am sure this is part rgmanager, part KVM/qemu question.

Thanks for any feedback!

Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?