[dm-devel] [PATCH 0/6] scsi_dh : Couple of fixes for scsi device handlers

Moger, Babu Babu.Moger at lsi.com
Thu Jul 29 22:35:59 UTC 2010


Chandra, Shyam,
 Thanks for your comments. Please see my responses below.

> -----Original Message-----
> From: Chandra Seetharaman [mailto:sekharan at us.ibm.com]
> Sent: Thursday, July 29, 2010 4:54 PM
> To: Moger, Babu
> Cc: device-mapper development; linux-scsi at vger.kernel.org; Qi, Yanling;
> Chauhan, Vijay; Dachepalli, Sudhir; Stankey, Robert
> Subject: Re: [PATCH 0/6] scsi_dh : Couple of fixes for scsi device
> handlers
> 
> Babu,
> 
> Your main object is to protect scsi_dh_data across scsi_dh_activate() by
> way of getting kref around scsi_dh_activate(), right ?
> 

  Yes, that is correct.

> Wouldn't doing what Shyam suggested (doing kref_put() and put_device())
> in scsi_activate() make it simpler and code still be readable ? (it
> would make all the patches except 2/6 not needed).
> 
> Did you hit with any problems doing it that way ?
>

  Yes, we can do that. The problem is that I am hitting the BUG_ON in get_rdac_data, which is called at the beginning of rdac_activate.
  If I go this way, I need to remove the call to get_rdac_data and just validate the pointers, reporting an error (SCSI_DH_IO) if a pointer is not valid.
  I would then take the reference counts and continue if everything is all right. I will send the new patches as soon as I can.
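
  To illustrate the ordering described above (validate the pointers, return SCSI_DH_IO if they are gone, otherwise take a reference, do the work, and drop the reference), here is a rough standalone sketch. It is a userspace mock, not the actual patch: the mock_* types and functions, the plain integer refcount standing in for a kref, and the numeric values of the return codes are all assumptions made for illustration.

```c
#include <stddef.h>

/* Names mirror the scsi_dh return codes; the numeric values are placeholders. */
enum { SCSI_DH_OK = 0, SCSI_DH_IO = 1 };

/* Hypothetical stand-in for the per-device handler data. */
struct mock_dh_data {
	int refcount;        /* stand-in for a struct kref */
	void *handler_priv;  /* stand-in for the handler's private data */
};

/* Hypothetical stand-in for the scsi device. */
struct mock_sdev {
	struct mock_dh_data *dh_data;  /* NULL once the handler is detached */
};

static void mock_dh_get(struct mock_dh_data *d) { d->refcount++; }
static void mock_dh_put(struct mock_dh_data *d) { d->refcount--; }

/*
 * Validate the pointers instead of hitting a BUG_ON, then hold a
 * reference for the duration of the activation. In real kernel code the
 * check and the get would have to happen under a suitable lock; this
 * mock is single-threaded and omits that.
 */
static int mock_activate(struct mock_sdev *sdev)
{
	struct mock_dh_data *d;

	if (!sdev || !sdev->dh_data || !sdev->dh_data->handler_priv)
		return SCSI_DH_IO;  /* handler data already gone: report an error */

	d = sdev->dh_data;
	mock_dh_get(d);   /* keep the data alive across the activation */
	/* ... the actual failover work would run here ... */
	mock_dh_put(d);   /* release once activation has finished */

	return SCSI_DH_OK;
}
```

  With this shape, a device whose handler data has been detached gets an error back instead of a panic, and the reference count is unchanged after a successful call.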

> Also the snippet (in 2/6)
> ---------------
> @@ -228,7 +228,8 @@ store_dh_state(struct device *dev, struc
>                          * Activate a device handler
>                          */
>                         if (scsi_dh->activate)
> -                               err = scsi_dh->activate(sdev, NULL, NULL);
> +                               err = scsi_dh_activate(sdev->request_queue,
> +                                                       NULL, NULL);
>                         else
>                                 err = 0;
>                 }
> ------------------
> can be made as a patch in itself.
> 
> Thanks for fixing the problem.

 OK, I will make this a separate patch.

>
> regards,
> 
> chandra
> On Wed, 2010-07-28 at 16:58 -0600, Moger, Babu wrote:
> > These patches fix the following two cases.
> > 1. Devices going away while the scsi device handler's activate is still in progress.
> >
> > 2. Removal of scsi_dh_data (calling the detach handler) while the scsi device handler's activate is still in progress.
> >
> > We have been seeing these problems while running multipath failover tests on LSI storage. These patches fix the problem; we have verified it.
> >
> > Here is the panic we have been seeing while running failover/failback tests.
> >
> > > 00:40:42:869  COM1 >------------[ cut here ]------------
> > > 00:40:42:869  COM1 >kernel BUG at /usr/src/packages/BUILD/lsi-scsi_dh_rdac-01.00/obj/default/scsi_dh_rdac.c:232!
> > > 00:40:42:869  COM1 >invalid opcode: 0000 [1] SMP
> > > 00:40:42:885  COM1 >last sysfs file: /sys/kernel/uevent_seqnum
> > > 00:40:42:885  COM1 >CPU 3
> > > 00:40:42:885  COM1 >Modules linked in: dm_round_robin dm_multipath
> > > nls_utf8 cifs(X) microcode af_packet ipv6 fuse loop dm_mod iTCO_wdt
> > > iTCO_vendor_support dcdbas(X) pcspkr rtc_cmos rtc_core serio_raw
> > > rtc_lib i5000_edac edac_core bnx2 shpchp sg pci_hotplug button
> > > mptctl usbhid hid ff_memless uhci_hcd ehci_hcd usbcore sd_mod
> > > crc_t10dif mpt2sas(N) raid_class edd ext3 mbcache jbd fan
> > > ide_pci_generic piix ide_core ata_generic ata_piix libata dock
> > > mptsas mptscsih mptbase scsi_transport_sas thermal processor
> > > thermal_sys hwmon scsi_dh_rdac(X) scsi_dh scsi_mod
> > > 00:40:42:932  COM1 >Supported: No
> > > 00:40:42:932  COM1 >Pid: 14044, comm: kmpath_handlerd Tainted: G
> > > 2.6.27.39-0.3-default #1
> > > 00:40:42:932  COM1 >RIP: 0010:
> > > rdac_activate+0x257/0x387 [scsi_dh_rdac]
> > > 00:40:42:947  COM1 >RSP: 0018:ffff880127109dc0  EFLAGS: 00010246
> > > 00:40:42:947  COM1 >RAX: ffff8800ae02f000 RBX: 0000000000000001 RCX: 0000000000000018
> > > 00:40:42:963  COM1 >RDX: 0000000000001bbc RSI: 0000000000000282 RDI: ffff8800c2ccd918
> > > 00:40:42:963  COM1 >RBP: 00000000fffffffb R08: ffffffff806eaf78 R09: ffff880028087720
> > > 00:40:42:963  COM1 >R10: 0000000000000000 R11: ffffffff80284ebe R12: ffffffffa0030fbe
> > > 00:40:42:978  COM1 >R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000000
> > > 00:40:42:978  COM1 >FS:  0000000000000000(0000) GS:ffff88012fb81ec0(0000) knlGS:0000000000000000
> > > 00:40:42:978  COM1 >CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > 00:40:42:994  COM1 >CR2: 00000000f7701630 CR3: 0000000101d5e000 CR4: 00000000000006e0
> > > 00:40:42:994  COM1 >DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > 00:40:43:010  COM1 >DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > 00:40:43:010  COM1 >Process kmpath_handlerd (pid: 14044, threadinfo
> > > ffff880127108000, task ffff88012710e680)
> > > 00:40:43:010  COM1 >Stack:  ffff880127109ec0 ffffffff8049c431
> > > 0000000000000000 ffff880127109e50
> > > 00:40:43:025  COM1 > ffff8800ae02f000 ffff8800b5032208
> > > ffff8800b5032200 ffff8800ae02f250
> > > 00:40:43:025  COM1 > ffff8800b5032216 0000000580a33680
> > > ffff8800c2ccd6b0 ffff8800ae02f120
> > > 00:40:43:041  COM1 >Call Trace:
> > > 00:40:43:041  COM1 > scsi_dh_activate+0x81/0x9b[scsi_dh]
> > > 00:40:43:041  COM1 > activate_path+0x22/0x46
> > > [dm_multipath]
> > > 00:40:43:041  COM1 > run_workqueue+0x7a/0x100
> > > 00:40:43:057  COM1 > worker_thread+0xd8/0xe7
> > > 00:40:43:057  COM1 > kthread+0x47/0x73
> > > 00:40:43:057  COM1 > child_rip+0xa/0x11
> > > 00:40:43:057  COM1 >
> > > 00:40:43:057  COM1 >
> > > 00:40:43:057  COM1 >Code: 4c 89 ea e8 78 dd 30 e0 4c 89 ef 89 c5 e8
> > > db a8 30 e0 85 ed 0f 84 da 00 00 00 48 8b 44 24 20 4c 8b a8 d0 05
> 00
> > > 00 4d 85 ed 75 04 <0f> 0b eb fe 48 8b 7c 24 40 48 8d 54 24 60 be 60
> > > 00 00 00 e8 ae
> > > 00:40:43:072  COM1 >RIP   rdac_activate+0x257/
> > > 0x387 [scsi_dh_rdac]
> > > 00:40:43:088  COM1 > RSP <ffff880127109dc0>
> > > 00:40:43:088  COM1 >---[ end trace 00e89c598c82483b ]---




