[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

[Libvir] Fix handling of HVM boot parameters



Jeremy mentioned that it looked like libvirt wasn't able to create an
HVM domain configured to boot off cdrom, so I took a closer look at the
code and indeed the code dealing with <boot> section was both incomplete,
and just plain broken. Incomplete in so much as it never included details
of the ISO file backing the CDROM, and broken in so much as it was doing
string comparisons against the wrong variables. Digging further found
lots more work relating to creation of HVM domains so I've had a go at
writing a patch to resolve matters.

 * Parsing of UUID from SEXPR assumed that the UUID fetched would always
   have four '-' in the usual places. Well, when you run 'dumpxml' from
   virsh the <uuid> element has the UUID encoded without any '-' chars.
   If you feed this back into 'virsh create' and then once again run
   'dumpxml' parsing will fail & libvirt throws errors. So this patch
   allows for parsing of UUID's without '-'.
 * The XML document size was limited to 1k - we just 'malloc(1000)'. While
   this was enough for common cases, if someone creates lots of disk or
   network devices this would overflow and libvirt return an error. So I
   increased it to 4k which ought to be enough for forseeable future - in
   any case my previous patch to fetch XML via the proxy is limited to 4k
   in size too.

Now onto the fine details...

When converting SEXPR into XML current code is doing the following:

     (boot a)   ->  <boot dev='/dev/fd0'/>
     (boot c)   ->  <boot dev='hda'/>
     (boot d)   ->  <boot dev='/dev/cdrom'/>

This is rather inconsistent - the 'hda' is intended to map to an entry
in <devices> block. The '/dev/fd0' and '/dev/cdrom' entries did not
map to anything. 

Meanwhile, when converting from XML to SXEXPR it is doing the following

    <boot dev='hda'/>    -> (boot a)
    <boot dev='hdd'/>    -> (boot d)
    <boot dev='*'/>      -> (boot c)

Obviously this sucks because these processes should be matching each
other. 


Secondly, the (image (hvm....)) SEXPR has three entries for defining
the ISO / disk image file backing the CDROM / Floppy devices. These
were just being ignored, rather than turned into <disk> entries within
the <devices> block. Similarly there was no way to express a <disk>
entry for CDROM/Floppy in the XML when creating a domain.

The upshot of all this is that although the last release of libvirt
included HVM support it was basically unusable for domain creation
unless you were using HD to boot. The XML returned was also incorrect.

Now the good news. Since it was sooo broken, we can fix without worrying
about XML compatability since there is no way any application could be
relying on it in its current state.

Now before I discuss the solution, one final point. The Xen IOMMU model
allows specifiying the boot device in terms of 'a', 'c', 'd' which has
the meaning:

   a - first connected floppy device
   c - first connected harddrive
   d - first connected cdrom

In particular it does *not* allow for expressing boot device in terms
of an explicit disk device - you can't say boot off 'hdb' or 'hdc'. So
the current libvirt code which tries to fake such semantics is doomed
to failure. This isn't really too bad IMHO - VMware has these same
semantics, so does QEMU and so do normal bare metal BIOS

This in the patch I have attached I have implemented the following mapping:

     (boot a)   <->  <boot dev='fd'/>
     (boot c)   <->  <boot dev='hd'/>
     (boot d)   <->  <boot dev='cdrom'/>


The other part of the patch is to deal with definition of the floppy and
cdrom device backing files. For this I have done the following:

  (image (hvm (fda /root/diskboot.img))) 

    <devices>
      <disk type='file'>
        <source file='/root/diskboot.img'/>
        <target dev='fda'/>
      </disk>
    </devices>

And similar for 'fdb'. Then for cdroms:

  (image (hvm (cdrom /root/image.iso)))

    <devices>
      <disk type='file'>
        <source file='/root/image.iso'/>
        <target dev='cdrom'/>
      </disk>
    </devices>

The patch has a little bit of logic such that when converting the
<devices> block backinto an SEXPR it filters out the disk entries
with a dev of 'fda', 'fdb' and 'cdrom' since they need to end up in
a different part of the SEXPR.

Finally, although PV guests automatically get a serial console created
for them (and i don't think you can turn it off), HVM guests need to have
a serial console explicitly requested. Since I recently added support in
the DumpXML method for including details of serial console like this:

   <devices>
     <console tty="/dev/pts/5"/>
   </devices>
 
I decided to leverage this same structure when creating HVM domains. So
if you have:

   <devices>
     <console/>
   </devices>

Then the SEXPR sent to XenD will include

  (image (hvm (serial pty)))

Which enables allocation of PseudoTTY for the HVM's serial console.

I have tested that with this patch I can successfully create a HVM domain
which boots off a floppy, harddrive or cdrom. Furthermore if you then dump
the XML of this domain,the XML you get back will match the XML you fed in
(with the obvious exception of domain ID, and the Pseudo TTY path).

If you have been monitoring xen-devel mailing lists you'll be aware that
in 3.0.3 the way CDROM devices are configured is changing:

 http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00369.html

Although the patch attached does not support the config outlined in that
mail, I'm pretty confident that a small incremental patch will be able to
support it without breaking compatability with the changes I've outlined
in this mail. The only tricky bit will be that we need to detect whether
libvirt is running against a 3.0.2 or 3.0.3 version of XenD to decide how
to convert XML -> SEXPR & vica verca.

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 
Index: src/xend_internal.c
===================================================================
RCS file: /data/cvs/libvirt/src/xend_internal.c,v
retrieving revision 1.52
diff -c -b -r1.52 xend_internal.c
*** src/xend_internal.c	9 Aug 2006 15:21:16 -0000	1.52
--- src/xend_internal.c	9 Aug 2006 22:58:48 -0000
***************
*** 791,796 ****
--- 791,808 ----
          goto error;
  
      ret = sscanf(r,
+                  "%02x%02x%02x%02x"
+                  "%02x%02x%02x%02x"
+                  "%02x%02x%02x%02x"
+                  "%02x%02x%02x%02x",
+                  uuid + 0, uuid + 1, uuid + 2, uuid + 3,
+                  uuid + 4, uuid + 5, uuid + 6, uuid + 7,
+                  uuid + 8, uuid + 9, uuid + 10, uuid + 11,
+                  uuid + 12, uuid + 13, uuid + 14, uuid + 15);
+     if (ret == 16)
+         goto done;
+ 
+     ret = sscanf(r,
                   "%02x%02x%02x%02x-"
                   "%02x%02x-"
                   "%02x%02x-"
***************
*** 1416,1436 ****
          virBufferVSprintf(buf, "    <loader>%s</loader>\n", tmp);
          tmp = sexpr_node(node, "domain/image/hvm/boot");
          if ((tmp != NULL) && (tmp[0] != 0)) {
- 	   /*
-             * FIXME:
-             * Figure out how to map the 'a', 'b', 'c' nonsense to a
-             * device.
- 	    */
             if (tmp[0] == 'a')
!                virBufferAdd(buf, "    <boot dev='/dev/fd0'/>\n", 25 );
             else if (tmp[0] == 'c')
  	   /*
              * Don't know what to put here.  Say the vm has been given 3
              * disks - hda, hdb, hdc.  How does one identify the boot disk?
  	    */
!                virBufferAdd(buf, "    <boot dev='hda'/>\n", 22 );
             else if (strcmp(tmp, "d") == 0)
!                virBufferAdd(buf, "    <boot dev='/dev/cdrom'/>\n", 29 );
          }
      } else {
          virBufferVSprintf(buf, "    <type>linux</type>\n");
--- 1428,1446 ----
          virBufferVSprintf(buf, "    <loader>%s</loader>\n", tmp);
          tmp = sexpr_node(node, "domain/image/hvm/boot");
          if ((tmp != NULL) && (tmp[0] != 0)) {
             if (tmp[0] == 'a')
!                /* XXX no way to deal with boot from 2nd floppy */
!                virBufferAdd(buf, "    <boot dev='fd'/>\n", 21 );
             else if (tmp[0] == 'c')
                 /*
                  * Don't know what to put here.  Say the vm has been given 3
                  * disks - hda, hdb, hdc.  How does one identify the boot disk?
+                 * We're going to assume that first disk is the boot disk since
+                 * this is most common practice
                  */
!                virBufferAdd(buf, "    <boot dev='hd'/>\n", 21 );
             else if (strcmp(tmp, "d") == 0)
!                virBufferAdd(buf, "    <boot dev='cdrom'/>\n", 24 );
          }
      } else {
          virBufferVSprintf(buf, "    <type>linux</type>\n");
***************
*** 1482,1492 ****
          /* ERROR */
          return (NULL);
      }
!     ret = malloc(1000);
      if (ret == NULL)
          return (NULL);
      buf.content = ret;
!     buf.size = 1000;
      buf.use = 0;
  
      domid = sexpr_int(root, "domain/domid");
--- 1492,1502 ----
          /* ERROR */
          return (NULL);
      }
!     ret = malloc(4000);
      if (ret == NULL)
          return (NULL);
      buf.content = ret;
!     buf.size = 4000;
      buf.use = 0;
  
      domid = sexpr_int(root, "domain/domid");
***************
*** 1625,1630 ****
--- 1635,1662 ----
      }
  
      if (hvm) {
+         tmp = sexpr_node(root, "domain/image/hvm/fda");
+         if ((tmp != NULL) && (tmp[0] != 0)) {
+             virBufferAdd(&buf, "    <disk type='file'>\n", 23);
+             virBufferVSprintf(&buf, "      <source file='%s'/>\n", tmp);
+             virBufferAdd(&buf, "      <target dev='fda'/>\n", 26);
+             virBufferAdd(&buf, "    </disk>\n", 12);
+         }
+         tmp = sexpr_node(root, "domain/image/hvm/fdb");
+         if ((tmp != NULL) && (tmp[0] != 0)) {
+             virBufferAdd(&buf, "    <disk type='file'>\n", 23);
+             virBufferVSprintf(&buf, "      <source file='%s'/>\n", tmp);
+             virBufferAdd(&buf, "      <target dev='fdb'/>\n", 26);
+             virBufferAdd(&buf, "    </disk>\n", 12);
+         }
+         tmp = sexpr_node(root, "domain/image/hvm/cdrom");
+         if ((tmp != NULL) && (tmp[0] != 0)) {
+             virBufferAdd(&buf, "    <disk type='file'>\n", 23);
+             virBufferVSprintf(&buf, "      <source file='%s'/>\n", tmp);
+             virBufferAdd(&buf, "      <target dev='cdrom'/>\n", 28);
+             virBufferAdd(&buf, "    </disk>\n", 12);
+         }
+         
          /* Graphics device */
          tmp = sexpr_node(root, "domain/image/hvm/vnc");
          if (tmp != NULL) {
***************
*** 1641,1651 ****
             if (tmp[0] == '1')
                 virBufferAdd(&buf, "    <graphics type='sdl'/>\n", 27 );
          }
- 
-         /*
-          * TODO:
-          * Device for cdrom
-          */
      }
      
      tty = xenStoreDomainGetConsolePath(conn, domid);
--- 1673,1678 ----
Index: src/xml.c
===================================================================
RCS file: /data/cvs/libvirt/src/xml.c,v
retrieving revision 1.30
diff -c -b -r1.30 xml.c
*** src/xml.c	9 Aug 2006 15:21:16 -0000	1.30
--- src/xml.c	9 Aug 2006 22:58:49 -0000
***************
*** 634,655 ****
      obj = NULL;
  
      if (boot_dev) {
!        /* TODO:
!         * Have to figure out the naming used here.
!         */
!        if (xmlStrEqual(type, BAD_CAST "hda")) {
            virBufferVSprintf(buf, "(boot a)", (const char *) boot_dev);
!        } else if (xmlStrEqual(type, BAD_CAST "hdd")) {
            virBufferVSprintf(buf, "(boot d)", (const char *) boot_dev);
!        } else {
!           /* Force hd[b|c] if boot_dev specified but not floppy or cdrom? */
            virBufferVSprintf(buf, "(boot c)", (const char *) boot_dev);
         }
      }
!     /* TODO:
!      * Is a cdrom disk device specified?
!      * Kind of ugly since it is buried in the devices/diskk node.
!      */
      
      /* Is a graphics device specified? */
      obj = xmlXPathEval(BAD_CAST "/domain/devices/graphics[1]", ctxt);
--- 634,704 ----
      obj = NULL;
  
      if (boot_dev) {
!        if (xmlStrEqual(boot_dev, BAD_CAST "fd")) {
            virBufferVSprintf(buf, "(boot a)", (const char *) boot_dev);
!        } else if (xmlStrEqual(boot_dev, BAD_CAST "cdrom")) {
            virBufferVSprintf(buf, "(boot d)", (const char *) boot_dev);
!        } else if (xmlStrEqual(boot_dev, BAD_CAST "hd")) {
            virBufferVSprintf(buf, "(boot c)", (const char *) boot_dev);
+        } else {
+          /* Any other type of boot dev is unsupported right now */
+          virXMLError(VIR_ERR_XML_ERROR, NULL, 0);
         }
+ 
+        /* get the 1st floppy device file */
+        obj = xmlXPathEval(BAD_CAST "/domain/devices/disk[ type='file' and target/@dev='fda']/source", ctxt);
+        if ((obj != NULL) && (obj->type == XPATH_NODESET) &&
+            (obj->nodesetval != NULL) && (obj->nodesetval->nodeNr == 1)) {
+          cur = obj->nodesetval->nodeTab[0];
+          virBufferVSprintf(buf, "(fda '%s')",
+                            (const char *) xmlGetProp(cur, BAD_CAST "file"));
+          cur = NULL;
+        }
+        if (obj) {
+          xmlXPathFreeObject(obj);
+          obj = NULL;
         }
! 
!        /* get the 2nd floppy device file */
!        obj = xmlXPathEval(BAD_CAST "/domain/devices/disk[ type='file' and target/@dev='fdb']/source", ctxt);
!        if ((obj != NULL) && (obj->type == XPATH_NODESET) &&
!            (obj->nodesetval != NULL) && (obj->nodesetval->nodeNr == 1)) {
!          cur = obj->nodesetval->nodeTab[0];
!          virBufferVSprintf(buf, "(fdb '%s')",
!                            (const char *) xmlGetProp(cur, BAD_CAST "file"));
!          cur = NULL;
!        }
!        if (obj) {
!          xmlXPathFreeObject(obj);
!          obj = NULL;
!        }
! 
! 
!        /* get the cdrom device file */
!        obj = xmlXPathEval(BAD_CAST "/domain/devices/disk[ type='file' and target/@dev='cdrom']/source", ctxt);
!        if ((obj != NULL) && (obj->type == XPATH_NODESET) &&
!            (obj->nodesetval != NULL) && (obj->nodesetval->nodeNr == 1)) {
!          cur = obj->nodesetval->nodeTab[0];
!          virBufferVSprintf(buf, "(cdrom '%s')",
!                            (const char *) xmlGetProp(cur, BAD_CAST "file"));
!          cur = NULL;
!        }
!        if (obj) {
!          xmlXPathFreeObject(obj);
!          obj = NULL;
!        }
!     }
! 
!     obj = xmlXPathEval(BAD_CAST "count(domain/devices/console) > 0", ctxt);
!     if ((obj == NULL) || (obj->type != XPATH_BOOLEAN)) {
!       virXMLError(VIR_ERR_XML_ERROR, NULL, 0);
!       goto error;
!     }
!     if (obj->boolval) {
!       virBufferAdd(buf, "(serial pty)", 12);
!     }
!     xmlXPathFreeObject(obj);
!     obj = NULL;
      
      /* Is a graphics device specified? */
      obj = xmlXPathEval(BAD_CAST "/domain/devices/graphics[1]", ctxt);
***************
*** 795,800 ****
--- 844,850 ----
              typ = 1;
          xmlFree(type);
      }
+     
      cur = node->children;
      while (cur != NULL) {
          if (cur->type == XML_ELEMENT_NODE) {
***************
*** 828,833 ****
--- 878,896 ----
              xmlFree(source);
          return (-1);
      }
+ 
+     /* Skip floppy/cdrom disk used as the boot device
+        since that's incorporated into the HVM kernel
+        (image (hvm..)) part of the sexpr, rather than
+        the (devices...) bit. Odd Xend HVM config :-( */
+     if (!strcmp((const char *)target, "fda") ||
+         !strcmp((const char *)target, "fdb") ||
+         !strcmp((const char *)target, "cdrom")) {
+       return 0;
+     }
+ 
+ 
+     virBufferAdd(buf, "(device ", 8);
      virBufferAdd(buf, "(vbd ", 5);
      virBufferVSprintf(buf, "(dev '%s')", (const char *) target);
      if (typ == 0)
***************
*** 844,849 ****
--- 907,913 ----
          virBufferVSprintf(buf, "(mode 'r')");
  
      virBufferAdd(buf, ")", 1);
+     virBufferAdd(buf, ")", 1);
      xmlFree(target);
      xmlFree(source);
      return (0);
***************
*** 911,916 ****
--- 975,981 ----
      }
      if (script != NULL)
          virBufferVSprintf(buf, "(script '%s')", script);
+     virBufferAdd(buf, "(type ioemu)", 12);
  
      virBufferAdd(buf, ")", 1);
      if (mac != NULL)
***************
*** 1089,1100 ****
      if ((obj != NULL) && (obj->type == XPATH_NODESET) &&
          (obj->nodesetval != NULL) && (obj->nodesetval->nodeNr >= 0)) {
  	for (i = 0; i < obj->nodesetval->nodeNr; i++) {
- 	    virBufferAdd(&buf, "(device ", 8);
  	    res = virDomainParseXMLDiskDesc(obj->nodesetval->nodeTab[i], &buf);
  	    if (res != 0) {
  		goto error;
  	    }
- 	    virBufferAdd(&buf, ")", 1);
  	}
      }
      xmlXPathFreeObject(obj);
--- 1154,1163 ----

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]