[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH]: Export the 'label' on block devices



Chris Lalancette <clalance redhat com> wrote:
> To support LVM partitioning in oVirt, one of the things we need is the ability
> to tell what kind of label is currently on a block device.  Here, a 'label' is
> used in the same sense that it is used in parted; namely, it defines which kind
> of partition table is on the disk, whether it be DOS, LVM2, SUN, BSD, etc.  Note
> that this is different than the partition type; those are things like Linux,
> FAT16, FAT32, etc.
>
> This actually turns out to be fairly easy to implement; there are really only a
> few labels that are in common use, and they all have an easy signature to
> recognize (but see comments in the code about pc98).  This patch implements
> label detection on block devices in virStorageBackendUpdateVolInfoFD, and hooks
> it up to the iSCSI backend so it works there.
>
> To keep code duplication down, I moved some of the enum's from
> storage_backend_disk.c into a common place.  Note, however, that there is a
> slight semantic change because of this.  Previously, if no label was found on a
> disk in storage_backend_disk.c, it would always return "dos" as the label type.
>  That's not actually true, though; if it's a completely zeroed disk, for
> instance, it really just has label type of 'unknown'.  This patch changes to the
> new semantic of 'unknown' for label types we don't understand.  I don't think
> this will be a huge issue for compatibility, but there could be something I'm
> missing.

Hi Chris,

I like the patch.
Some minor suggestions below.

> Otherwise, this patch has been tested by me to work, and now when you do:
>
> # virsh vol-dumpxml --pool iscsitest lun-1
>
> you'll get:
>
> <volume>
> ...
>   <target>
>       ...
>       <format type='dos' />
>
> Which should be sufficient for oVirt to do it's detection.
>
> Signed-off-by: Chris Lalancette <clalance redhat com>
> Index: src/storage_backend.c
> ===================================================================
> RCS file: /data/cvs/libvirt/src/storage_backend.c,v
> retrieving revision 1.21
> diff -u -r1.21 storage_backend.c
> --- src/storage_backend.c	5 Sep 2008 12:03:45 -0000	1.21
> +++ src/storage_backend.c	16 Oct 2008 07:29:46 -0000
> @@ -192,6 +192,30 @@
>      return ret;
>  }
>
> +struct diskType disk_types[] = {
> +    { "lvm2", VIR_STORAGE_POOL_DISK_LVM2, 0x218, 8, 0x31303020324D564CUL },
> +    { "gpt",  VIR_STORAGE_POOL_DISK_GPT,  0x200, 8, 0x5452415020494645UL },
> +    { "dvh",  VIR_STORAGE_POOL_DISK_DVH,  0x0,   4, 0x41A9E50BUL },
> +    { "mac",  VIR_STORAGE_POOL_DISK_MAC,  0x0,   2, 0x5245UL },
> +    { "bsd",  VIR_STORAGE_POOL_DISK_BSD,  0x40,  4, 0x82564557UL },
> +    { "sun",  VIR_STORAGE_POOL_DISK_SUN,  0x1fc, 2, 0xBEDAUL },

Some partition tables are deliberately created to be dual GPT/DOS-based.
Here, it gets the first match (as you've deliberately ordered them that
way, I see).  We might need a mechanism to let the caller prefer one or
the other.  Also, realize that this can be wrong if there was once a GPT
table, and it has since been replaced with a DOS one that still happens
to have the GPT signature bytes.  Or maybe it was dual GPT/DOS long ago
but has since been maintained only as DOS (in which case all GPT-related
headers could be gone).  This suggests that pronouncing a partition
table to have type GPT based solely on the first 1KB is a little risky.
But it's probably just fine for now.

> +    /*
> +     * NOTE: pc98 is funky; the actual signature is 0x55AA (just like dos), so
> +     * we can't use that.  At the moment I'm relying on the "dummy" IPL
> +     * bootloader data that comes from parted.  Luckily, the chances of running
> +     * into a pc98 machine running libvirt are approximately nil.

I agree.
Why not just exclude it?

> +     */
> +    /*{ "pc98", 0x1fe, 2, 0xAA55UL },*/
> +    { "pc98", VIR_STORAGE_POOL_DISK_PC98, 0x0,   8, 0x314C5049000000CBUL },
> +    /*
> +     * NOTE: the order is important here; some other disk types (like GPT and
> +     * and PC98) also have 0x55AA at this offset.  For that reason, the DOS
> +     * one must be the last one.
> +     */
> +    { "dos",  VIR_STORAGE_POOL_DISK_DOS,  0x1fe, 2, 0xAA55UL },
> +    { NULL,   -1,                         0x0,   0, 0x0UL },
> +};
> +
>  int
>  virStorageBackendUpdateVolInfoFD(virConnectPtr conn,
>                                   virStorageVolDefPtr vol,
> @@ -245,6 +269,40 @@
>          if (withCapacity) vol->capacity = end;
>      }
>
> +    /* make sure to set the target format "unknown" to begin with */
> +    vol->target.format = VIR_STORAGE_POOL_DISK_UNKNOWN;
> +
> +    if (S_ISBLK(sb.st_mode)) {
> +        off_t start;
> +        int i;
> +        unsigned char buffer[1024];
> +        ssize_t bytes;
> +
> +        start = lseek(fd, 0, SEEK_SET);
> +        if (start < 0) {
> +            virStorageReportError(conn, VIR_ERR_INTERNAL_ERROR,
> +                                  _("cannot seek to beginning of file '%s':%s"),
> +                                  vol->target.path, strerror(errno));
> +            return -1;
> +        }
> +        memset(buffer, 0, 1024);

You can remove the above memset, given one of the suggestions below.

> +        bytes = saferead(fd, buffer, 1024);

Better not to repeat constants like that.  use sizeof:

           bytes = saferead(fd, buffer, sizeof buffer);

> +        if (bytes < 0) {
> +            virStorageReportError(conn, VIR_ERR_INTERNAL_ERROR,
> +                                  _("cannot read beginning of file '%s':%s"),
> +                                  vol->target.path, strerror(errno));
> +            return -1;
> +        }

At worst, memset just the probably-empty portion of the buffer
that wasn't set by the read:

           memset(buffer + bytes, 0, sizeof buffer - bytes);

However, I prefer to drop the memset altogether and check
offset+len against "bytes" below:

> +        for (i = 0; disk_types[i].name != NULL; i++) {

It'd be good to check that offset+len is within range, regardless
of whether the buffer is zeroed.  In case someone decides to shrink
buffer or to add a type with magic bytes past the 1KB mark.

               if (disk_types[i].offset + disk_types[i].length > bytes)
                   continue;

> +            if (memcmp(buffer+disk_types[i].offset, &disk_types[i].magic,
> +                disk_types[i].length) == 0) {
> +                vol->target.format = disk_types[i].label;
> +                break;
> +            }
> +        }
> +    }
> +
...
> Index: src/storage_backend.h
> ===================================================================
> RCS file: /data/cvs/libvirt/src/storage_backend.h,v
> retrieving revision 1.7
> diff -u -r1.7 storage_backend.h
> --- src/storage_backend.h	2 Sep 2008 14:15:42 -0000	1.7
> +++ src/storage_backend.h	16 Oct 2008 07:29:46 -0000
> @@ -56,6 +56,30 @@
>      VIR_STORAGE_BACKEND_POOL_SOURCE_NAME    = (1<<4),
>  };
>
> +enum diskLabel {
> +    VIR_STORAGE_POOL_DISK_DOS = 0,

I like to start with 1 or larger, so that using an uninitialized
variable (often 0) is more likely to trigger an error.

> +    VIR_STORAGE_POOL_DISK_DVH,
> +    VIR_STORAGE_POOL_DISK_GPT,
> +    VIR_STORAGE_POOL_DISK_MAC,
> +    VIR_STORAGE_POOL_DISK_BSD,
> +    VIR_STORAGE_POOL_DISK_PC98,
> +    VIR_STORAGE_POOL_DISK_SUN,
> +    VIR_STORAGE_POOL_DISK_LVM2,
> +    /* the "unknown" disk is 1 billion (and not, for instance, -1), to make
> +       sure it doesn't run afoul of error checking */
> +    VIR_STORAGE_POOL_DISK_UNKNOWN = 1 * 1024 * 1024 * 1024,
> +};
> +
> +struct diskType {
> +    const char *name;
> +    /* the 'label' terminology comes from parted */

i find "label" (meaning "partition table") terminology confusing, if only
because each partition can have a volume-label string.  If it weren't so
ensconced in Parted's code, directory structure (libparted/labels/*.c)
and documentation, I would have excised it long ago.  I much prefer
"partition table".

In a way, I don't mean to make you change names, but I'd hate
to see Parted's poorly chosen name propagate any more than it has.

> +    enum diskLabel label;
> +    unsigned short offset;
> +    unsigned short length;
> +    unsigned long long magic;
> +};
> +extern struct diskType disk_types[];
> +
>  typedef struct _virStorageBackendPoolOptions virStorageBackendPoolOptions;
>  typedef virStorageBackendPoolOptions *virStorageBackendPoolOptionsPtr;
>  struct _virStorageBackendPoolOptions {


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]