[dm-devel] [RFC PATCH] multipathd: Removal of device with invalid uevent DEVPATH fails

Neerav Parikh neerav.parikh at intel.com
Mon Apr 9 20:53:17 UTC 2012


From: Neerav Parikh <Neerav.Parikh at intel.com>

When a VLAN device is removed from the command line with "vconfig rem <vlan-dev>",
the network layer sends a NETDEV_UNREGISTER notification to all devices that are
stacked on top of the VLAN device and listening for that notification. The network
layer then keeps its reference to the device until all holds are released, but it
removes the sysfs entries of the VLAN device right after sending the notification.
When an FCoE interface is configured on top of the VLAN device, the VLAN's
NETDEV_UNREGISTER notification causes destruction of the FCoE interface to be
queued on a delayed workqueue.

Now, when SCSI disks discovered via the FCoE interface are part of a multipath
environment, removing the VLAN device leaves multipathd unable to remove the
individual paths from its internal table, resulting in dangling sysfs links and
references.

multipathd listens to uevents generated by the kernel, so its uevent listener
receives the removal of the VLAN device and of the tree below it. In the 'remove'
uevent that the kernel sends for a SCSI disk, however, the DEVPATH is rooted at the
VLAN interface name rather than being a full sysfs path. As a result multipathd
cannot find the device in sysfs and does not remove it from its internal table.
Here's an excerpt from the syslog, with multipath debug enabled, showing what is
received for one such 'remove' uevent:

[snip]
Apr  2 13:22:23 linaut71 multipathd: uevent 'remove' from '/eth2.228-fcoe/ctlr_4/host6/rport-6:0-2/target6:0:0/6:0:0:0/block/sdb'
Apr  2 13:22:23 linaut71 multipathd: UDEV_LOG=3
Apr  2 13:22:23 linaut71 multipathd: ACTION=remove
Apr  2 13:22:23 linaut71 multipathd: DEVPATH=/eth2.228-fcoe/ctlr_4/host6/rport-6:0-2/target6:0:0/6:0:0:0/block/sdb
Apr  2 13:22:23 linaut71 multipathd: SUBSYSTEM=block
Apr  2 13:22:23 linaut71 multipathd: DEVNAME=/dev/sdb
Apr  2 13:22:23 linaut71 multipathd: DEVTYPE=disk
Apr  2 13:22:23 linaut71 multipathd: SEQNUM=2040
Apr  2 13:22:23 linaut71 multipathd: MAJOR=259
Apr  2 13:22:23 linaut71 multipathd: MINOR=589824
Apr  2 13:22:23 linaut71 multipathd: ID_SCSI=1
Apr  2 13:22:23 linaut71 multipathd: ID_VENDOR=EMC
Apr  2 13:22:23 linaut71 multipathd: ID_VENDOR_ENC=EMC\x20\x20\x20\x20\x20
Apr  2 13:22:23 linaut71 multipathd: ID_MODEL=SYMMETRIX
Apr  2 13:22:23 linaut71 multipathd: ID_MODEL_ENC=SYMMETRIX\x20\x20\x20\x20\x20\x20\x20
Apr  2 13:22:23 linaut71 multipathd: ID_REVISION=5874
Apr  2 13:22:23 linaut71 multipathd: ID_TYPE=disk
Apr  2 13:22:23 linaut71 multipathd: ID_SERIAL=360000970000194900586533030303243
Apr  2 13:22:24 linaut71 multipathd: ID_SERIAL_SHORT=60000970000194900586533030303243
Apr  2 13:22:24 linaut71 multipathd: ID_WWN=0x6000097000019490
Apr  2 13:22:24 linaut71 multipathd: ID_WWN_VENDOR_EXTENSION=0x0586533030303243
Apr  2 13:22:24 linaut71 multipathd: ID_WWN_WITH_EXTENSION=0x60000970000194900586533030303243
Apr  2 13:22:24 linaut71 multipathd: ID_SCSI_SERIAL=90058602C000
Apr  2 13:22:24 linaut71 multipathd: ID_BUS=scsi
Apr  2 13:22:24 linaut71 multipathd: ID_PATH=fc-0x50000972c0092919-lun-0
Apr  2 13:22:24 linaut71 multipathd: DEVLINKS=/dev/block/259:589824 /dev/disk/by-id/scsi-360000970000194900586533030303243 /dev/disk/by-path/fc-0x50000972c0092919-l
Apr  2 13:22:24 linaut71 multipathd: /eth2.228-fcoe/ctlr_4/host6/rport-6:0-2/target6:0:0/6:0:0:0/block/sdb: not found in sysfs
Apr  2 13:22:24 linaut71 multipathd: uevent trigger error
[snip]
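
To make the "not found in sysfs" failure above concrete, here is a minimal sketch
of what such a lookup boils down to: join the sysfs mount point with the DEVPATH
from the uevent and check whether the result exists. devpath_in_sysfs() is a
hypothetical helper written for this mail, not a multipath-tools function, and I am
assuming the lookup amounts to a stat() on the joined path.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of multipath-tools).
 * Returns 1 if <sysfs_root><devpath> exists, 0 otherwise.
 */
static int devpath_in_sysfs(const char *sysfs_root, const char *devpath)
{
	char full[4096];
	struct stat st;
	int n;

	assert(sysfs_root != NULL && devpath != NULL);

	/* Build the absolute path, e.g. "/sys" + "/eth2.228-fcoe/.../sdb". */
	n = snprintf(full, sizeof(full), "%s%s", sysfs_root, devpath);
	if (n < 0 || (size_t)n >= sizeof(full))
		return 0;	/* truncated or failed: treat as not found */

	return stat(full, &st) == 0;
}
```

Called with "/sys" and the DEVPATH from the excerpt, this returns 0 because no
such entry exists directly under the sysfs root.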

The patch below fixes this issue. When sysfs_device_get() fails to fetch the device
from sysfs in uev_remove_path(), the code no longer bails out; instead it searches
multipathd's internal table for the device itself (the leaf node of the devpath).
If the device is found there, removal of the path continues.
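
The fallback can be sketched in isolation as follows. This is only an illustration
under assumptions: struct sketch_path, devpath_leaf(), sketch_find_by_dev() and
sketch_should_remove() are hypothetical names standing in for the pathvec, for
uev->kernel and for find_path_by_dev(); the real types and lookups in multipathd
differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an entry in multipathd's path table. */
struct sketch_path {
	const char *dev;	/* kernel device name, e.g. "sdb" */
};

/* Leaf node of a DEVPATH: the text after the last '/'. */
static const char *devpath_leaf(const char *devpath)
{
	const char *slash;

	assert(devpath != NULL);
	slash = strrchr(devpath, '/');
	return slash ? slash + 1 : devpath;
}

/* Linear search of the table by kernel name; NULL if not present. */
static const struct sketch_path *
sketch_find_by_dev(const struct sketch_path *tbl, size_t n, const char *dev)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tbl[i].dev, dev) == 0)
			return &tbl[i];
	return NULL;
}

/*
 * Decide whether a 'remove' uevent should proceed.
 * Returns 1 to go ahead with the removal, 0 to bail out.
 */
static int sketch_should_remove(int found_in_sysfs,
				const struct sketch_path *tbl, size_t n,
				const char *devpath)
{
	if (found_in_sysfs)
		return 1;	/* normal case: sysfs lookup succeeded */

	/* Fallback: the devpath is stale, so look for the leaf name. */
	return sketch_find_by_dev(tbl, n, devpath_leaf(devpath)) != NULL;
}
```

With a table holding "sda" and "sdb", a stale DEVPATH ending in ".../block/sdb"
still leads to removal, while one ending in a name that is not in the table is
rejected as before.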



Signed-off-by: Neerav Parikh <Neerav.Parikh at intel.com>
---
 multipathd/main.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 5b7195d..74fa8fc 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -553,12 +553,22 @@ uev_remove_path (struct uevent *uev, struct vectors * vecs)
 	dev = sysfs_device_get(uev->devpath);
 	if (!dev) {
 		condlog(2, "%s: not found in sysfs", uev->devpath);
-		return 1;
+		/*
+		 * Seems like we got uevent for a device that does not have
+		 * a valid devpath anymore.
+		 * Check if the device itself is actually present or not.
+		 */
+		condlog(2, "%s: searching path in pathvec", uev->kernel);
+		if (!find_path_by_dev(vecs->pathvec, uev->kernel)) {
+			condlog(2, "%s: path not found in pathvec",
+				uev->kernel);
+			return 1;
+		}
 	}
 	condlog(2, "%s: remove path (uevent)", uev->kernel);
 	retval = ev_remove_path(uev->kernel, vecs);
 
-	if (!retval)
+	if (!retval && dev)
 		sysfs_device_put(dev);
 
 	return retval;



