xen proposal

seth vidal skvidal at fedoraproject.org
Fri Apr 18 20:42:45 UTC 2008


On Fri, 2008-04-18 at 15:33 -0500, Mike McGrath wrote:
> The only reason we haven't done this already is the inability to detect if
> the box is already up somewhere (which is something we need already)
> Consider this scenario:
> 
> app1 running on xen1 (which is having high load from koji1 also on xen1)
> 
> People complain about the wiki.
> 
> We move app1 to a more free box, xen7.
> 
> high load causes CRASH
> 
> xen1 reboots.  Attempts to bring app1 up (already up on xen7)
> 
> Two machines try to write to the same disk - DOOM.
> 
> 
> There is a bit of hope in this.  1) it's happened before and it seems
> that the second guest sees the disk is already mounted and gets stuck at
> an fsck shell.  As long as we realize that that condition potentially
> means the box is already up and needs to be checked... we're fine.  If
> someone tries to type the root password and fsck the disk... DOOM.
> 
> This is all a sign of a larger problem with the lack of open source
> management tools for virtualization on more than one host at a time.  I'm
> a huge fan of automation so in general I'd like to
> see the plan above implemented but I think we need to alter the xm
> creation scripts (I'm not sure what this involves) to make sure hosts
> don't come up on the wrong xen host.
> 

Okay, so maybe we need a really-xen-startup init script which:
1. happens AFTER network, etc. are up so iscsi items work
2. provides a locking capability so it can talk to 'something else' to
find out which domains are already locked and allocated, and so determine
whether it should start them (with a good db it is easy to get around
stale locks left behind by a crash)
3. notifies on restart.
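
To make item 2 a bit more concrete, here is a rough sketch of what the
lock check could look like. Everything in it is an assumption made for
illustration: the guest_locks table, the function names, the heartbeat
idea and the 120 second timeout are all made up, and sqlite is only a
stand-in for whatever shared db the 'something else' would really be.

```python
import sqlite3
import time

LOCK_TTL = 120  # seconds before a lock counts as stale (hypothetical value)


def open_lock_db(path=":memory:"):
    """Open the lock store. sqlite is a stand-in for a shared db server."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS guest_locks ("
        "guest TEXT PRIMARY KEY, host TEXT NOT NULL, heartbeat REAL NOT NULL)"
    )
    db.commit()
    return db


def try_lock(db, guest, host, now=None, ttl=LOCK_TTL):
    """Return True if `host` may start `guest`, recording the lock.

    The lock is granted when nobody holds it, when this host already
    holds it (e.g. it crashed and rebooted), or when the holder's
    heartbeat is older than `ttl`. The check-then-write here is not
    race-safe across hosts as written; a real db would need to do it
    atomically.
    """
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT host, heartbeat FROM guest_locks WHERE guest = ?", (guest,)
    ).fetchone()
    if row is not None:
        owner, beat = row
        if owner != host and now - beat < ttl:
            return False  # guest is live on another xen host, leave it alone
    with db:  # commits on success, rolls back on an exception
        db.execute(
            "INSERT OR REPLACE INTO guest_locks (guest, host, heartbeat) "
            "VALUES (?, ?, ?)",
            (guest, host, now),
        )
    return True


def guests_to_start(db, host, configured, now=None):
    """Filter this host's configured guests down to the ones it may start."""
    return [g for g in configured if try_lock(db, g, host, now=now)]
```

With this, a rebooting xen1 that still has app1 configured would skip it
as long as xen7 keeps its heartbeat fresh, and would start koji1 as
usual. The stale-lock takeover only makes sense if the running host
refreshes the heartbeat regularly; otherwise a timeout could hand out a
guest that is still up somewhere.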

just a few thoughts...

thanks
-sv




