On Thu, May 15, 2014 at 11:45:40PM +0200, Christophe Varoqui wrote:
> I'd need your ack on this one.
> Best regards,
> Christophe Varoqui

Sorry I dropped the ball on this one.
I'm o.k. with this patch. The biggest issue I have with it has nothing
to do with its correctness, but with rlookup_wwid()'s use of scan_device.
Previously, the only scan_device call always failed. Now we scan every
device name, but we don't ever get anything out of it. First off, if we
find a match, we will never use the id. Second, if we don't find a match we
return the id of the alias we were looking for, but if we do find a
match we return the next id after the one we were looking for (which is
It seems like we could just make rlookup_wwid() return success or failure,
and then call scan_device() from use_existing_alias() if we need to, and
take out a bunch of pointless work that rlookup_wwid() is doing.
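
For illustration, the split suggested above could look roughly like the
standalone C sketch below. It is not the multipath-tools code: the
in-memory binding table, lookup_wwid_by_alias(), alias_to_id(),
check_existing_alias() and the ALIAS_* values are all hypothetical names,
and alias_to_id() only stands in for the id decoding that scan_device()
is assumed to do (suffix "a" is 1, "z" is 26, "aa" is 27).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one line of the bindings file. */
struct binding {
    const char *alias;
    const char *wwid;
};

enum alias_status {
    ALIAS_KEEP,   /* alias is already bound to this wwid */
    ALIAS_CLAIM,  /* alias is unbound; *id holds its decoded id */
    ALIAS_REJECT  /* alias belongs to another wwid, or is malformed */
};

/* Lookup only: returns 0 and sets *wwid if alias is bound, -1 otherwise.
 * No id is decoded here, so no work is wasted when a match is found. */
int lookup_wwid_by_alias(const struct binding *tbl, size_t n,
                         const char *alias, const char **wwid)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].alias, alias) == 0) {
            *wwid = tbl[i].wwid;
            return 0;
        }
    }
    return -1;
}

/* Assumed stand-in for scan_device(): decode the letters after the
 * prefix as a bijective base-26 number. Returns -1 on any mismatch. */
int alias_to_id(const char *alias, const char *prefix)
{
    size_t plen = strlen(prefix);
    int id = 0;

    if (strncmp(alias, prefix, plen) != 0)
        return -1;
    for (const char *p = alias + plen; *p != '\0'; p++) {
        if (*p < 'a' || *p > 'z')
            return -1;
        if (id > (INT_MAX - 26) / 26)
            return -1;  /* next step would overflow int */
        id = id * 26 + (*p - 'a' + 1);
    }
    return id > 0 ? id : -1;
}

/* Caller side: decode the id only when the alias turned out to be free. */
enum alias_status check_existing_alias(const struct binding *tbl, size_t n,
                                       const char *wwid,
                                       const char *alias_old,
                                       const char *prefix, int *id)
{
    const char *bound;

    assert(wwid != NULL && alias_old != NULL && id != NULL);
    if (lookup_wwid_by_alias(tbl, n, alias_old, &bound) == 0)
        return strcmp(bound, wwid) == 0 ? ALIAS_KEEP : ALIAS_REJECT;

    *id = alias_to_id(alias_old, prefix);
    return *id > 0 ? ALIAS_CLAIM : ALIAS_REJECT;
}
```

With this shape the lookup has a single meaning (bound or not bound), and
the id is computed in exactly one place, only on the path that uses it.
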
> On Thu, May 15, 2014 at 9:21 PM, Stewart, Sean
> <Sean Stewart netapp com> wrote:
> Ping... Any additional comments or suggestions for this patch?
> Bumping in case it got lost in the backlog. :)
> On Fri, 2014-04-11 at 17:01 +0000, Stewart, Sean wrote:
> > On Fri, 2014-04-11 at 17:03 +0100, Bryn M. Reeves wrote:
> > > On Fri, Mar 28, 2014 at 09:01:14PM +0000, Stewart, Sean wrote:
> > > > When a system is booted to the SAN, a condition can occur where a
> > > > user friendly name is given to a disk during boot, but multipathd
> > > > tries to allocate a different one after boot. If the second alias is
> > > > used by another device, multipathd can't rename it. Multipathd then has
> > > > incorrect information about the alias/wwid relationships, which can
> > > > result in paths being added to the wrong map.
> > >
> > > This should only happen if the initramfs and root file system have
> > > inconsistent multipath configurations (either multipath.conf or
> > > / wwids file mismatched). That's not really a valid configuration for
> > > the system to be in and leads to the type of problems you describe.
> > It is true that it only happens if they are out of sync. We tried
> > remaking the initramfs to fix the problem, but it didn't help.
> > >
> > > > This patch works around this problem by first trying to use the alias
> > > > already bound to a device during boot. If the bindings file has that
> > > > alias bound to a different device, it'll auto generate a new alias to
> > > > rename it to.
> > >
> > > To be honest I'd prefer to see this cause an error. These types of
> > > configurations currently run the risk of silent data corruption - I'd
> > > much rather deal with a system that refuses to boot due to an out of
> > > date initramfs image than one that quietly remaps paths in
> > > ways.
> > The issue, though, is that the system does not refuse to boot. In the
> > case we saw, it booted anyway, our QA engineer ran a test, and it
> > with a data corruption. A user could perform a fresh installation, map
> > new luns, reboot, and without any way of realizing it have essentially a
> > ticking time bomb on their hands, ready to go off as soon as there's a
> > blip in the SAN.