[dm-devel] [PATCH 4/4] libmultipath: trigger uevents for partitions, too

Roger Heflin rogerheflin at gmail.com
Wed Jul 10 17:42:00 UTC 2019


On Wed, Jul 10, 2019 at 5:49 AM Martin Wilck <Martin.Wilck at suse.com> wrote:
>
> On Tue, 2019-07-09 at 11:40 -0500, Roger Heflin wrote:
> > We have an observed behavior as follows:
> >
> > When the host boots up, a by-uuid symbolic link is created pointing
> > at /dev/sda1 (the device for /boot)
> >
> > Multipath starts up and creates a multipath device to manage
> > /dev/sda, and a udev rule deletes /dev/sda1, invalidating the
> > symbolic link.
>
> I suppose you are talking about 68-del-part-nodes.rules. Note that the
> rules in this file can be deactivated by setting
> ENV{DONT_DEL_PART_NODES}="1" in an early udev rule.

The OS we have uses 62-multipath.rules and does not have a global override like that.
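On systems that do ship that hook, I assume the early override would
look roughly like this (file name and match are illustrative):

    # /etc/udev/rules.d/00-keep-part-nodes.rules (illustrative name;
    # must sort before 68-del-part-nodes.rules)
    # keep partition device nodes even on multipath path devices
    SUBSYSTEM=="block", ENV{DONT_DEL_PART_NODES}="1"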

I am looking at my notes on the issue, and it was this: rootvg gets
started directly on /dev/sda2, and then multipath starts up, attempts
to manage the disk, and deletes the partition node /dev/sda1, which
invalidates the by-uuid link. Multipath then fails to create the map
with "map in use", because the LVs of rootvg are live on /dev/sda2
directly. So it does sound like your fix would correct this issue,
since on the failure to set up the map it would recreate the
/dev/sda1 device. There appears to be a race condition in the
initramfs/systemd where rootvg sometimes gets activated before
multipath has claimed the device, causing the partition node to be
deleted (we do have multipath in the initramfs; that was confirmed).
None of our other VGs have this issue, because we use rd.lvm.vg= so
that only rootvg is activated early.
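Concretely, that is just dracut's rd.lvm.vg= parameter on the kernel
command line (shown here with only the relevant argument):

    # dracut: activate only rootvg in the initramfs; all other VGs
    # are activated later, after multipath has claimed the disks
    rd.lvm.vg=rootvg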

>
> Also, the rule only removes partitions for devices that have been
> detected as being eligible for multipathing.
>
> > The symbolic link does not appear to get re-created to point to the
> > new multipath device, which would lead one to suspect that no event
> > happens when the multipath device is created.
>
> That's very unlikely. You should verify that the multipath device (for
> sda) is created. My patch here relates only to the case where creating
> the multipath device *fails*.
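A quick way to run that check is presumably:

    # list all multipath maps together with their path devices;
    # a map set up over sda would show sda as one of its paths
    multipath -ll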
>
?
>
> Maybe. I don't know enough details about your configuration to tell.
> But if this is a device that should not be multipathed, from my point
> of view, proper blacklisting via multipath.conf is the recommended way
> to handle this problem.
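For reference, a minimal blacklist stanza of that kind in
multipath.conf might look roughly like this (the WWID is just a
placeholder):

    blacklist {
            wwid "3600508b400105e210000900000490000"
    }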
>
> You can also use "find_multipaths"; please check the documentation.
> Note also that since 0.7.8, blacklisting "by protocol" is possible,
> which makes it possible e.g. to blacklist local SATA disks with a
> simple statement.
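If I read the protocol blacklisting right, excluding local SATA disks
would then be roughly:

    blacklist {
            # my understanding is that "scsi:ata" is the protocol
            # string multipath reports for libata-attached (SATA) disks
            protocol "scsi:ata"
    }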
>
We intentionally have find_multipaths set so that even single-path
devices get set up. The issue occurs on a number of VMs, and using
multipath for everything lets us avoid separate work
instructions/scripts for VMs vs. physical hosts. It also lets us use
multipath's I/O queueing/retries to ride out short-term vmhost and
storage issues without having to identify which nodes were affected
and reboot them all (they just pause and continue once the issue is
fixed). It is a very large environment (>5000 Linux VMs and >5000
physical Linux hosts), problems do happen in different sections of
it, and we have been tweaking configuration settings to get more
stability when they do.
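For reference, the relevant part of the configuration is roughly
along these lines (illustrative of the setup described above, not a
verbatim copy of our multipath.conf):

    defaults {
            # set up maps even for devices with only a single path
            find_multipaths "no"
            # queue I/O while all paths are down; it resumes once a
            # path returns
            no_path_retry queue
    }

With queueing, I/O on an affected map just stalls during a short
vmhost or storage outage and continues once a path comes back, which
is the pause-and-continue behavior described above.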

> Martin
>



