[Ovirt-devel] LVM Fun

Tue Jan 6 13:46:29 UTC 2009

Ian Main wrote:
>>> Code from libvirt-ruby:
>>>
>>> /* * Call
>>> +virStorageVolLookupByPath+[http://www.libvirt.org/html/libvirt-libvirt.html#virStorageVolLookupByPath]
>>>  */ VALUE libvirt_pool_lookup_vol_by_path(VALUE p, VALUE path) { 
>>> virStorageVolPtr vol;
>>>
>>> // FIXME: Why does this take a connection, not a pool ? vol =
>>> virStorageVolLookupByPath(conn(p), StringValueCStr(path)); _E(vol == NULL,
>>> create_error(e_RetrieveError, "virStorageVolLookupByPath", "", conn(p)));
>>>
>>> return vol_new(vol, conn_attr(p)); }
>>>
>>> As you can see in the FIXME comment and the usage of
>>> virStorageVolLookupByPath(), it's not associated with a specific pool.
> 
> This in turn opens up the possibility that another storage pool may be active
> on the host we select to do the scanning, which would mean we'd pick up LVM volumes
> from that other pool as well.  From what I can tell, right now the only way to fill
> the LVM volume/pool information in the DB correctly is by keeping track of it 
> during creation which defeats the purpose of the scan/refresh.  Hence the removal.
> 
> If I missed something (which is possible), let me know, but from what I can tell
> this is the case.  While I've done some playing with the API and have shown myself
> that this is the case, it might be worth setting up a test case with two pools with
> LVM volumes in each and see if it sorts them out right.

I see what you mean about pool.lookup_vol_by_path() not doing what I expect.
However, I'm not sure that is actually an issue.  But let me go back and start
at the beginning here.
     Your assertion that we should "rely on the database" for information just
can't work.  The database doesn't *have* the information at this point; it is
depending on us (i.e. taskomatic) to get the information about whether there are
existing LV's on the device or not.  The database can't know this a-priori, so
we have to scan it to find out.  That's why this piece of functionality has to
work; if you did something like:

a)  add an ISCSI pool
b)  create a few LV's on the LUN's from that ISCSI pool
c)  delete the ISCSI pool from ovirt
d)  add the ISCSI pool back to ovirt

our only choice without LVM scanning would be to blow away the data that's
already on the LUNs, which isn't usually desirable.
     So.  That's why we want to do this.  Next to define what exactly we are
trying to do with the code, and then see if the code accomplishes it.  At
pool_refresh time, what we are essentially trying to do is get a list of all of
the volumes available on the pool we are given.  For NFS, this is easy;
basically when we are adding a new NFS storage pool, we want to add all of the
existing files on the share as NFS storage volumes.  For iSCSI it's a lot more
complicated.  We want to scan all existing iSCSI LUNs, add them to the database
as iSCSI volumes, and then additionally scan each LUN to see if it has LVM
logical volumes already present.  If so, we also want to add that volume group
to the database as a LVM storage pool, and then each of the logical volumes in
that volume group gets added as an LVM storage volume.  Make sense?
    Now that we have that down, we can see what the current taskomatic code does
for ISCSI.  Basically, it starts by scanning the pool and adding all volumes
(i.e. raw LUNS) to the database:

      # OK, the pool is all set.  Add in all of the volumes
      add_volumes_to_db(phys_db_pool, phys_libvirt_pool)

Fine.  Next, it discovers all logical pool sources:

      # OK, now we've scanned the underlying hardware pool and added the
      # volumes.  Next we scan for pre-existing LVM volumes
      logical_xml = conn.discover_storage_pool_sources("logical")

Then it iterates through each logical pool source.  This is where things get
tricky.  Each logical pool source equates to a volume group available.
Basically, we iterate through each logical pool source, and see if that pool
source is already in the database.  If it is, we basically just rescan all of
the LV's attached.  If it isn't, we create a new logical pool in the database,
then scan all of the LV's attached.  I guess the key point here is that yes, we
may add stuff to the database that wasn't *this* particular pool, but it
*doesn't matter*.  The only way that lookup_vol_by_path() will return a volume
is if it is a libvirt defined volume, and the only way a libvirt defined volume
can get started on an ovirt node is via ovirt.  That just means that yes, we may
unnecessarily rescan volumes that are already there, but we are always
guaranteed to add the logical volumes (if they exist) on the iSCSI LUN that we
just added.  Does this make sense to you, or am I still talking crazy?
    The above still leaves this weird piece of code:

        begin
          source.elements.each("device") do |device|
            byid_device =
phys_libvirt_pool.lookup_vol_by_path(device.attributes["path"]).path
          end
        rescue
          # If matching any of the <device> sections in the LVM XML fails
          # against the storage pool, then it is likely that this is a storage
          # pool not associated with the one we connected above.  Go on
          # FIXME: it would be nicer to catch the right exception here, and
          # fail on other exceptions
          puts "One of the logical volumes in #{vgname} is not part of the pool
of type #{phys_db_pool[:type]} that we are scanning; ignore the previous error!"
          next
        end

Which seems completely unnecessary given my above explanation, but is actually
there to handle a corner case.  Namely, it handles the case where you have an
LVM partition defined already on the host, but it is *not* managed by libvirt.
Because of the way storage pool discovery is done, it will return entries for
*all* logical volumes on the host, even those that aren't managed by libvirt.
However, lookup_vol_by_path() will only return a valid entry for a volume
managed by libvirt.  So the above basically ensures that any logical pools that
are on the host, but not managed by libvirt, are skipped.
     Please go through the above logic, and see if it makes sense to you.  We
can talk about it more on IRC if there are still confusing things, or if I am
completely off-base here.

-- 
Chris Lalancette