[dm-devel] Fwd: multipath errors

Benjamin Marzinski bmarzins at redhat.com
Fri Sep 27 20:11:04 UTC 2013


On Mon, Sep 23, 2013 at 11:09:18AM +0300, Amitai Alkalay wrote:
>    Hi,
>    Sorry for bumping, but I would appreciate any help on this matter.
>    Thanks,
>    Amitai
> 
>    ---------- Forwarded message ----------
>    From: Amitai Alkalay <amitai.alkalay.work at gmail.com>
>    Date: Tue, Sep 10, 2013 at 2:29 PM
>    Subject: multipath errors
>    To: dm-devel at redhat.com
> 
>    hi,
>    I sometimes see cases where I/O fails after a path failover, although
>    there are other valid paths.
>    I get a lot of the following errors during a path removal
>    (failover):
> 
>  Aug  1 08:59:37 lg641 multipathd: 3514f0c532d80000a: failed in domap for removal of path sdcy
>  Aug  1 08:59:37 lg641 multipathd: uevent trigger error
>  Aug  1 08:59:37 lg641 multipathd: sdt: remove path (uevent)
>  Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
>  Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table
>  Aug  1 08:59:37 lg641 multipathd: 3514f0c532d800001: load table [0 21474836480 multipath 0 0 1 1 queue-length 0 12 1 66:240 1 66:80 1 70:192 1 71:96 1 67:176 1 67:112 1 71:224 1 128:128 1 8:16 1 69:16 1 69:32 1 8:32 1]
>  Aug  1 08:59:37 lg641 multipathd: sdt: path removed from map 3514f0c532d800001f
> 
>    All the other paths are there, and still multipath decided to fail the I/O
>    for no apparent reason.
> 
>    I would appreciate any comments on:
> 
>    1. How can this happen?

It shouldn't.  The kernel should check through all of the paths before
failing.  Sometimes some actions on storage arrays temporarily bring all
paths down, but even in that case, you should see messages in the logs
of multipath trying all the paths before it fails.

>    2. How can I increase the log level to understand multipath's decisions?

In /etc/multipath.conf, set:

defaults {
	...
	verbosity 3
}

This will add a lot of extra logging.  Make sure that your logging isn't
rate limited, or you will miss messages exactly when it's most important
to see them.


>    3. Why do I always see the errors about adding a target to the table?
>    The only thing I can think of is that multipath temporarily
>    bypassed the other paths (maybe it got busy several times and gave up).
>    I'm using device-mapper-multipath-0.4.9-64.el6.x86_64.

These messages:

Aug  1 08:59:37 lg641 kernel: device-mapper: table: 253:5: multipath: error getting device
Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table

usually mean that the device is already in use.  They shouldn't be
related to the device you are removing.  Do you get them when you
create the device as well?

Another possibility is that the device has the wrong permissions. For
instance, this happens whenever multipath tries to get a read-only
device.  Again, this doesn't seem like it could be referring to the
device that is being removed.  Unfortunately, the kernel doesn't give
any indication which path device is failing, or why.  That should
probably get fixed.
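
Until the kernel names the failing device itself, one way to narrow it
down is to take the major:minor pairs from the "load table" line that
multipathd logs and check which of them still exist.  What follows is only
a rough sketch in Python.  It assumes the table is logged between square
brackets as in the report above, and that the kernel exposes every block
device it currently knows about as /sys/dev/block/MAJ:MIN.  The function
names and the shortened sample line are made up for illustration.

```python
import os
import re


def table_devices(log_line):
    """Return the major:minor tokens found inside the [...] of a 'load table' line."""
    start = log_line.find("[")
    end = log_line.rfind("]")
    if start == -1 or end <= start:
        return []
    tokens = log_line[start + 1:end].split()
    # Path devices are the only tokens of the form NUMBER:NUMBER in the table.
    return [t for t in tokens if re.fullmatch(r"\d+:\d+", t)]


def missing_devices(devices, sysfs_root="/sys/dev/block"):
    """Return the devices that have no entry under sysfs_root."""
    return [d for d in devices
            if not os.path.exists(os.path.join(sysfs_root, d))]


if __name__ == "__main__":
    # Shortened sample line, not a real log entry.
    line = ("multipathd: 3514f0c532d800001: load table "
            "[0 21474836480 multipath 0 0 1 1 queue-length 0 2 "
            "1 66:240 1 8:16 1]")
    devs = table_devices(line)
    print("paths in table:", devs)
    print("not present in sysfs:", missing_devices(devs))
```

This only catches a path device that has already disappeared.  A device
that exists but is in use or has the wrong permissions will not show up
as missing here.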

Are you seeing IO errors for the multipath device in the messages?
Can you post those?

Could you post all of the log messages around the failure (I assume
there is a kernel message saying that an IO failed), along with
the multipath -l listing of the device both when no paths are failed,
and immediately after the error happens?
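
If the log is large, something like the following Python sketch could
pull out just the messages around the failure.  It assumes the classic
syslog timestamp format seen in the report ("Aug  1 08:59:37"), which
carries no year, so the year is taken from the time you pass in.  The
helper names, the 30 second window, and the keyword list are arbitrary
choices for illustration.

```python
from datetime import datetime, timedelta

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_stamp(line, year):
    """Parse a leading 'Aug  1 08:59:37' timestamp; return None if there is none."""
    parts = line.split(None, 3)
    if len(parts) < 3 or parts[0] not in MONTHS:
        return None
    try:
        hour, minute, second = (int(x) for x in parts[2].split(":"))
        return datetime(year, MONTHS[parts[0]], int(parts[1]),
                        hour, minute, second)
    except ValueError:
        return None


def around(lines, when, seconds=30,
           keywords=("multipathd", "device-mapper")):
    """Yield lines within +/- seconds of `when` that mention one of the keywords."""
    window = timedelta(seconds=seconds)
    for line in lines:
        stamp = parse_stamp(line, when.year)
        if stamp is None or abs(stamp - when) > window:
            continue
        if any(k in line for k in keywords):
            yield line


if __name__ == "__main__":
    sample = [
        "Aug  1 08:59:37 lg641 multipathd: uevent trigger error",
        "Aug  1 08:59:37 lg641 kernel: device-mapper: ioctl: error adding target to table",
        "Aug  1 09:30:00 lg641 multipathd: sdt: remove path (uevent)",
    ]
    for line in around(sample, datetime(2013, 8, 1, 8, 59, 37)):
        print(line)
```

To run it against a real log, pass an open file object (for example
/var/log/messages, if that is where your syslog writes) instead of the
sample list.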

Also, it would be interesting to know if setting something like
"no_path_retry 5" would avoid the issue.  There's still a bug if multipath
isn't trying all the paths, but this would narrow down where to look.

-Ben

>    Thanks a lot,
>    Amitai
