
Re: [dm-devel] [PATCH] libmultipath: Use existing user friendly name if possible



On Thu, May 15, 2014 at 11:45:40PM +0200, Christophe Varoqui wrote:
>    Ben,
>    I'd need your ack on this one.
>    Best regards,
>    Christophe Varoqui

Sorry I dropped the ball on this one.

I'm o.k. with this patch.  The biggest issue I have with it has nothing
to do with its correctness, but with rlookup_wwid()'s use of scan_device.
Previously, the only scan_device call always failed.  Now we scan every
device name, but we never get anything out of it. First off, if we find a
match, we never use the id. Second, if we don't find a match we return the
id of the alias we were looking for, but if we do find a match we return
the next id after the one we were looking for (which is completely
pointless).

It seems like we could just make rlookup_wwid() return success or failure,
and then call scan_device() from use_existing_alias() if we need to, and
take out a bunch of pointless work that rlookup_wwid() is doing.
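
As a rough sketch of that split (not the actual libmultipath code): the
in-memory bindings table, the simplified signatures, and the sample wwids
below are all assumptions made up for illustration. rlookup_wwid() only
reports whether the alias is bound, and use_existing_alias() calls
scan_device() only when it actually needs the id.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the bindings file. */
struct binding {
	const char *alias;
	const char *wwid;
};

static const struct binding bindings[] = {
	{ "mpatha", "360a98000aaaa" },
	{ "mpathb", "360a98000bbbb" },
};
#define NBINDINGS (sizeof(bindings) / sizeof(bindings[0]))

/*
 * Simplified scan_device(): map "<prefix>a" -> 1, "<prefix>b" -> 2, ...,
 * "<prefix>aa" -> 27.  Returns -1 if the alias is malformed.
 */
static int scan_device(const char *alias, const char *prefix)
{
	size_t plen = strlen(prefix);
	const char *c;
	int n = 0;

	if (strncmp(alias, prefix, plen) != 0 || alias[plen] == '\0')
		return -1;
	for (c = alias + plen; *c; c++) {
		if (*c < 'a' || *c > 'z')
			return -1;
		n = n * 26 + (*c - 'a' + 1);
	}
	return n;
}

/*
 * rlookup_wwid() reduced to success or failure: copy the wwid bound to
 * alias into buf and return 0, or return -1 if the alias is unbound.
 * No id is computed here.
 */
static int rlookup_wwid(const char *alias, char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < NBINDINGS; i++) {
		if (strcmp(bindings[i].alias, alias) == 0) {
			snprintf(buf, len, "%s", bindings[i].wwid);
			return 0;
		}
	}
	return -1;
}

/*
 * Returns 1 if alias_old can be kept for wwid, 0 if a new alias must be
 * allocated.  *id is set only when alias_old is unbound, which is the one
 * case where the caller needs it to record a new binding.
 */
static int use_existing_alias(const char *wwid, const char *alias_old,
			      const char *prefix, int *id)
{
	char bound[128];

	*id = -1;
	if (rlookup_wwid(alias_old, bound, sizeof(bound)) == 0)
		/* Alias already bound: keep it only if it is ours. */
		return strcmp(bound, wwid) == 0;

	/* Alias is free: this is the only place the id is needed. */
	*id = scan_device(alias_old, prefix);
	return *id > 0;
}
```

With that table, a map that booted as "mpatha" with the first wwid keeps
its alias, a map that booted as "mpatha" with an unknown wwid has to get a
new one, and a map that booted as the unbound "mpathc" keeps it, with
id 3 handed back for the new binding.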

-Ben

> 
>    On Thu, May 15, 2014 at 9:21 PM, Stewart, Sean
>    <[1]Sean Stewart netapp com> wrote:
> 
>      Ping...  Any additional comments or suggestions for this patch?
>      Bumping in case it got lost in the backlog. :)
>      On Fri, 2014-04-11 at 17:01 +0000, Stewart, Sean wrote:
>      > On Fri, 2014-04-11 at 17:03 +0100, Bryn M. Reeves wrote:
>      > > On Fri, Mar 28, 2014 at 09:01:14PM +0000, Stewart, Sean wrote:
>      > > > When a system is booted to the SAN, a condition can occur where one
>      > > > user friendly name is given to a disk during boot, but multipathd tries
>      > > > to allocate a different one after boot. If the second alias is already
>      > > > used by another device, multipathd can't rename it. Multipathd then has
>      > > > incorrect information about the alias/wwid relationships, which can
>      > > > result in paths being added to the wrong map.
>      > >
>      > > This should only happen if the initramfs and root file system have
>      > > inconsistent multipath configurations (either multipath.conf or bindings
>      > > / wwids file mismatched). That's not really a valid configuration for
>      > > the system to be in and leads to the type of problems you describe.
>      >
>      > It is true that it only happens if they are out of sync.  We tried
>      > remaking the initramfs to fix the problem, but it didn't help.
>      > >
>      > > > This patch works around this problem by first trying to use the alias
>      > > > already bound to a device during boot.  If the bindings file has that
>      > > > alias bound to a different device, it'll auto generate a new alias to
>      > > > rename it to.
>      > >
>      > > To be honest I'd prefer to see this cause an error. These types of
>      > > configurations currently run the risk of silent data corruption - I'd
>      > > much rather deal with a system that refuses to boot due to an out of
>      > > date initramfs image than one that quietly remaps paths in unexpected
>      > > ways.
>      >
>      > The issue, though, is that the system does not refuse to boot.  In the
>      > case we saw, it booted anyway, our QA engineer ran a test, and it ended
>      > with a data corruption.  A user could perform a fresh installation, map
>      > new luns, reboot, and without any way of realizing it have essentially a
>      > ticking time bomb on their hands, ready to go off as soon as there's a
>      > blip in the SAN.
> 
> References
> 
>    Visible links
>    1. mailto:Sean Stewart netapp com

