
Re: [dm-devel] Fwd: multipath errors



On Fri, 2013-09-27 at 15:11 -0500, Benjamin Marzinski wrote:
> On Mon, Sep 23, 2013 at 11:09:18AM +0300, Amitai Alkalay wrote:
> >    Hi,
> >    Sorry for bumping, but I would appreciate any help on this matter.
> >    Thanks,
> >    Amitai
> > 
> >    ---------- Forwarded message ----------
> >    From: Amitai Alkalay <[1]amitai alkalay work gmail com>
> >    Date: Tue, Sep 10, 2013 at 2:29 PM
> >    Subject: multipath errors
> >    To: [2]dm-devel redhat com
> > 
> >    hi,
> >    I sometimes see cases where I/O fails after a path failover, even
> >    though there are other valid paths.
> >    I seem to get a lot of the following errors during a path removal
> >    (failover):
> > 
> >  Aug  1 08:59:37 lg641 multipathd: 3514f0c532d80000a: failed in domap for removal of path sdcy
> >  Aug  1 08:59:37 lg641 multipathd: uevent trigger error
> >  Aug  1 08:59:37 lg641 multipathd: sdt: remove path (uevent)
> >  Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
> >  Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table
> >  Aug  1 08:59:37 lg641 multipathd: 3514f0c532d800001: load table [0 21474836480 multipath 0 0 1 1 queue-length 0 12 1 66:240 1 66:80 1 70:192 1 71:96 1 67:176 1 67:112 1 71:224 1 128:128 1 8:16 1 69:16 1 69:32 1 8:32 1]
> >  Aug  1 08:59:37 lg641 multipathd: sdt: path removed from map 3514f0c532d800001f
> > 
> >    All the other paths are there, and still multipath decided to fail the
> >    I/O for no apparent reason.
> > 
> >    I would appreciate any comment about:
> > 
> >    1. How can this happen?
> 
> It shouldn't.  The kernel should check through all of the paths before
> failing.  Sometimes some actions on storage arrays temporarily bring all
> paths down, but even in that case, you should see messages in the logs
> of multipath trying all the paths before it fails.
> 
> >    2. How can I increase the log level to understand multipath decisions?
> 
> in /etc/multipath.conf
> 
> defaults {
> 	...
> 	verbosity 3
> }
> 
> This will add a lot of extra logging.  Make sure that your logging isn't
> rate limited, or you will miss messages exactly when it's most important
> to see them.
> 
> 
> >    3. Why do I always see the errors regarding adding target to table?
> >    The only thing I can think of is that multipath temporarily
> >    bypassed the other paths (maybe it got busy several times and gave up).
> >    I'm using device-mapper-multipath-0.4.9-64.el6.x86_64.
> 
> These messages:
> 
> Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
> Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table
> 
> usually mean that the device is already in use.  They shouldn't relate
> to the device you are removing.  Do you get them when you create the
> device as well?

Other than this, I've seen these messages occur because another path
in the map is offline, has already been deleted (but the message hasn't
yet reached userspace), or is otherwise unavailable.  In these cases,
I usually don't see I/O errors unless all the paths are gone.  One
offlined device can prevent paths from being added to or removed from
the map.

The messages file should give a clue if one of these is the case.
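
To narrow down which path is the problem, one option is to pull the
major:minor numbers out of the "load table [...]" line that multipathd
logs and look at each device's state in sysfs.  The following is only a
rough sketch: the function names are made up for illustration, the
parser assumes the usual multipath table layout (features, hardware
handler, then path groups), and reading
/sys/dev/block/<major:minor>/device/state assumes the paths are SCSI
devices.

```python
import os


def parse_multipath_paths(table):
    """Return the major:minor path devices named in a multipath table line."""
    tok = table.strip().strip("[]").split()
    if len(tok) < 3 or tok[2] != "multipath":
        raise ValueError("not a multipath table line")
    i = 3
    i += 1 + int(tok[i])      # feature count, then that many feature args
    i += 1 + int(tok[i])      # hw handler count, then that many args
    num_groups = int(tok[i])  # number of path groups
    i += 2                    # also skip the initial path group number
    paths = []
    for _ in range(num_groups):
        i += 1                # path selector name, e.g. queue-length
        i += 1 + int(tok[i])  # selector arg count, then that many args
        num_paths = int(tok[i])
        args_per_path = int(tok[i + 1])
        i += 2
        for _ in range(num_paths):
            paths.append(tok[i])
            i += 1 + args_per_path
    return paths


def path_state(devno, sysfs_root="/sys/dev/block"):
    """Read the SCSI state ("running", "offline", ...) for a major:minor.

    Returns "missing" if the sysfs entry is not there, which is what one
    would expect for a path that has already been deleted.
    """
    state_file = os.path.join(sysfs_root, devno, "device", "state")
    try:
        with open(state_file) as f:
            return f.read().strip()
    except OSError:
        return "missing"


def report(table):
    """Map every path in the table to its current state."""
    return {devno: path_state(devno) for devno in parse_multipath_paths(table)}
```

Feeding report() the bracketed table string from the failing
"load table" message should show whether any of the listed paths is
offline or missing at the time of the reload.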

- Sean

> 
> Another possibility is that the device has the wrong permissions. For
> instance, this can happen when multipath tries to open a read-only
> device.  Again, this doesn't seem like it could be referring to the
> device that is being removed.  Unfortunately, the kernel doesn't give
> any indication which path device is failing, or why.  That should
> probably get fixed.
> 
> Are you seeing IO errors for the multipath device in the messages?
> Can you post those?
> 
> Could you post all of the log messages around the failure (I assume
> there is a kernel message saying that an IO failed), along with
> the multipath -l listing of the device both when no paths are failed,
> and immediately after the error happens?
> 
> Also, it would be interesting to know if setting something like
> "no_path_retry 5" would avoid the issue.  There's still a bug if multipath
> isn't trying all the paths, but this would narrow down where to look.
> 
> -Ben
> 
> >    Thanks a lot,
> >    Amitai
> > 
> > References
> > 
> >    Visible links
> >    1. mailto:amitai alkalay work gmail com
> >    2. mailto:dm-devel redhat com
> 
> > --
> > dm-devel mailing list
> > dm-devel redhat com
> > https://www.redhat.com/mailman/listinfo/dm-devel



