[Libvir] Unified Xen patch (second version)

Mon Apr 2 14:54:21 UTC 2007

On Mon, Apr 02, 2007 at 03:05:19PM +0100, Daniel P. Berrange wrote:
> On Fri, Mar 30, 2007 at 02:27:46PM +0100, Richard W.M. Jones wrote:
> > Ignore the just posted "first complete version".  It had some stupid 
> > stuff going on with connection opens.  This version fixes all that, and 
> > passes 'make check'.
> 
> > /* These dispatch functions follow the model used historically
> >  * by libvirt.c -- trying each low-level Xen driver in turn
> >  * until one succeeds.  However since we know what low-level
> >  * drivers can perform which functions, it is probably better
> >  * in future to optimise these dispatch functions to just call
> >  * the single function (or small number of appropriate functions)
> >  * in the low level drivers directly.
> >  */
> 
> For now, keeping the same internal driver model will help debugging any
> issues in this conversion. I agree that we should later re-factor this
> further to explicitly call the appropriate methods directly because I
> don't think merely iterating over driver is particularly useful. For
> any given libvirt method is is easy to define exactly how we should fetch
> the info.
> 
> The current situation is a little odd - if we doing 'pause' then we should
> always just go straight to the hypervisor. If that fails there is no point
> trying any others because both XenD / XenStored ultimately call into the
> hypervisor too.

  The difference in the xend case is that then Xend becomes aware of the
status change.

> I don't see the point if the XenStoreD driver at all. It is lower priority
> than the XenD driver is, and since XenD driver has 100% coverage of all
> APIs there is no circumstance in which the XenStoreD driver will ever be
> called. There's a couple of helper methods in the xenstored.c file which
> are used directly be xend.c / xml.c, but aside from that the rest is 100%
> untested by any of us.

  In general agreed, except we don't know about weird cases, for example
going though xend to shutdown a domain in emergency may not work, while
the xenstored mechanism may still work because it's direct.

> > static int
> > xenUnifiedDomainSuspend (virDomainPtr dom)
> > {
> >  int i;
> > 
> >   /* Try non-hypervisor methods first, then hypervisor direct method
> >    * as a last resort.
> >    */
> >   for (i = 0; i < nb_drivers; ++i)
> >     if (i != hypervisor_offset &&
> > 	 drivers[i]->domainSuspend &&
> > 	 drivers[i]->domainSuspend (dom) == 0)
> >       return 0;
> >
> >   if (drivers[hypervisor_offset]->domainSuspend &&
> >       drivers[hypervisor_offset]->domainSuspend (dom) == 0)
> >     return 0;
> >
> >   return -1;
> > }
> 
> Why do we try the non-hypervisor method first ? What does XenD give us that
> we don't get just by going to the HV immediately, aside from higher overheads.

  Xend may become confused because the state changed without it being aware of
that . 

> Likewise for Resume/Destroy/SetVCPUs.  If there is a compelling reason why
> we must call XenD, then we shouldn't try to fallback to the HV at all. If
> there isn't a compelling reason, then we should try HV first.

  fallback is useful because Xend may start failing say under load or memory
pressure in domain 0, in general the current approach is IMHO more reliable.

> Was this how the existing code already worked, if so i guess we should leave
> as is until we can cleanup like I described earlier ? If any knows why, we 
> should at least comment this voodoo...

  /*
   * Xen is woodoo and xend is not the most reliable piece of code
   * a bit of redundancy may help.
   */

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard at redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/