
Re: [dm-devel] multipath - reference count on pgpath



This problem may be fixed by this commit:

http://git.kernel.org/linus/2bded7bd7e8b12a913b0b58167a48220560e1514

 

Please check.

 


From: dm-devel-bounces redhat com [mailto:dm-devel-bounces redhat com] On Behalf Of Menny_Hamburger dell com
Sent: Monday, January 24, 2011 9:58 AM
To: dm-devel redhat com
Subject: Re: [dm-devel] multipath - reference count on pgpath

 

Hi,

 

The patch was premature, so I am re-posting my initial question:

This panic occurs quite frequently when multipathd is restarted (after devices are either added or removed) while some other script runs udevtrigger and udevsettle at the same time.

I have fixed our configuration/monitoring scripts to avoid this panic; however, this only hides the problem and does not solve it.

I would appreciate any input on this issue.

 

Thanks,

Menny

 

From: dm-devel-bounces redhat com [mailto:dm-devel-bounces redhat com] On Behalf Of Hamburger, Menny
Sent: 23 January, 2011 11:23
To: dm-devel redhat com
Subject: [dm-devel] multipath - reference count on pgpath

 

Hi,

 

We are using multipath over iSCSI storage that uses the RDAC hardware handler (RHEL 5.5).

When we have two scripts running at the same time, doing the following:

1)   Configure new LUNs mapping file
     udevsettle
     multipathd restart
     udevtrigger
     udevsettle

2)   udevtrigger
     udevsettle

 

 

We get the kernel panic below.

Although I know that the usage here is problematic (I already modified the scripts), the kernel should protect itself from situations like this.

 

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: Unable to handle kernel paging request

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: at 0000000100000010 RIP:

2011 Jan 20 02:53:08 node0 INFO: kernel: sd 350:0:0:0: rdac: array MD3200i-c7-d7, ctlr 1, MODE_SELECT completed

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff881b713f>] :dm_multipath:pg_init_done+0x2f/0x1c0

2011 Jan 20 02:53:08 node0 ALERT: kernel: Unable to handle kernel paging request at 0000000100000010 RIP:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: PGD 0

2011 Jan 20 02:53:08 node0 ALERT: kernel: [<ffffffff881b713f>] :dm_multipath:pg_init_done+0x2f/0x1c0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: Oops: 0000 [1]

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: SMP

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: last sysfs file: /devices/platform/host353/session76/target353:0:0/353:0:0:0/rev

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: CPU 3

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: Modules linked in: iscsi_tcp(U) libiscsi_tcp(U) libiscsi2(U) scsi_transport_iscsi2(U) scsi_transport_iscsi(U) ipmi_si(U) dell_rbu(U) scsi_dh_rdac(U) dm_rdac(U) dm_queue_depth(U) dm_round_robin(U) xt_tcpudp(U) ipt_SYSRQ(U) netconsole(U) iptable_filter(U) ip_tables(U) vfat(U) fat(U) usb_storage(U) mptctl(U) exa_ioctls(U) nfs(U) lockd(U) nfs_acl(U) x_tables(U) sunrpc(U) ipmi_devintf(U) ipmi_msghandler(U) bonding1(U) bonding(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) dm_mirror(U) dm_log(U) dm_multipath(U) scsi_dh(U) dm_mod(U) video(U) hwmon(U) backlight(U) sbs(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) ac(U) sr_mod(U) cdrom(U) joydev(U) sg(U) bnx2(U) pcspkr(U) ata_piix(U) libata(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: Pid: 23040, comm: kmpath_rdacd Tainted: G      2.6.18-164sys #1

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: RIP: 0010:[<ffffffff881b713f>]

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff881b713f>] :dm_multipath:pg_init_done+0x2f/0x1c0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: RSP: 0018:ffff810081e71d40  EFLAGS: 00010293

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: RAX: ffffffff881b7110 RBX: ffff81004c488780 RCX: ffff810080721c00

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: RDX: ffff81004c488780 RSI: 0000000000000000 RDI: ffff810071521ea0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: R10: 0000000000000000 R11: ffffffff80087ee0 R12: ffff810080721c00

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: R13: ffff810087a247d8 R14: 0000000100000000 R15: ffff810071521e80

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: FS:  0000000000000000(0000) GS:ffff810107f5da40(0000) knlGS:0000000000000000

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: CR2: 0000000100000010 CR3: 00000000592df000 CR4: 00000000000006e0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: Process kmpath_rdacd (pid: 23040, threadinfo ffff810081e70000, task ffff81008a0f5080)

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: Stack: 0000000000000282 ffff81004c488780 ffff81006c14fc00 ffff810080721c00

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: ffff810087a247d8 0000000000000000 ffff81006c14fc62 ffffffff883bcdf7

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: 0000000010008040 ffff81008c7f65c0 ffff810081e71e70 ffff810081e71dd0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: Call Trace:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff883bcdf7>] :scsi_dh_rdac:send_mode_select+0x477/0x4b0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff80032183>] __wake_up+0x43/0x70

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff883bc980>] :scsi_dh_rdac:send_mode_select+0x0/0x4b0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff800561a3>] run_workqueue+0xb3/0x110

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff80052180>] worker_thread+0x0/0x150

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff800b33b0>] keventd_create_kthread+0x0/0xa0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff80052291>] worker_thread+0x111/0x150

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff8009ca00>] default_wake_function+0x0/0x10

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff80052180>] worker_thread+0x0/0x150

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff800373c9>] kthread+0xd9/0x120

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff80068fb1>] child_rip+0xa/0x11

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff800b33b0>] keventd_create_kthread+0x0/0xa0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff800372f0>] kthread+0x0/0x120

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff80068fa7>] child_rip+0x0/0x11

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: Code: 49 8b 5e 10 77 7b 89 f0 ff 24 c5 a8 8c 1b 88 c7 44 24 04 00

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: RIP

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: [<ffffffff881b713f>] :dm_multipath:pg_init_done+0x2f/0x1c0

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: RSP <ffff810081e71d40>

2011 Jan 20 02:53:08 172.19.58.130 NOTICE: CR2: 0000000100000010

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 EMERG: Kernel panic - not syncing: Fatal exception

2011 Jan 20 02:53:08 172.19.58.130 NOTICE:

2011 Jan 20 02:53:08 172.19.58.130 EMERG: Rebooting in 1 seconds..
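
The register dump above looks consistent with a stale or corrupted pointer being dereferenced. Assuming the first Code bytes (49 8b 5e 10) are the faulting instruction, they decode as mov 0x10(%r14),%rbx, and R14 (0000000100000000) plus the 0x10 displacement gives exactly the CR2 fault address. The helper below is only an illustration of that arithmetic; the name effective_address is made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of an x86-64 base+disp8 memory operand.
 * The 8-bit displacement is sign-extended before being added.
 */
static uint64_t effective_address(uint64_t base, int8_t disp8)
{
    return base + (uint64_t)(int64_t)disp8;
}
```

Calling effective_address(0x0000000100000000ULL, 0x10) with the R14 value from the dump yields 0x0000000100000010, which matches CR2. A value like 0x100000000 is not a valid kernel pointer, which fits the idea that the structure holding it had already been freed or reused.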

 

It seems that the pgpath is deleted in dm-mpath while send_mode_select is still in progress, so when it then calls pg_init_done we get an oops.

I wonder if a reference count on pgpath would do the trick here.
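
A minimal userspace sketch of the reference-count idea, not actual dm-mpath code: struct pgpath_sketch and its helpers are hypothetical names, and C11 atomics stand in for whatever primitive the kernel would use. The assumption is that the path table holds one reference and each in-flight pg_init takes another before the work is queued, dropping it only after pg_init_done has finished with the object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-path structure in dm-mpath. */
struct pgpath_sketch {
    atomic_int refcount;  /* one for the table, one per in-flight pg_init */
    bool       is_active;
};

/* Allocate a path that starts with a single reference (the table's). */
static struct pgpath_sketch *pgpath_sketch_alloc(void)
{
    struct pgpath_sketch *p = calloc(1, sizeof(*p));

    if (p) {
        atomic_init(&p->refcount, 1);
        p->is_active = true;
    }
    return p;
}

/* Take a reference, e.g. just before queueing the activation work. */
static void pgpath_sketch_get(struct pgpath_sketch *p)
{
    int old = atomic_fetch_add(&p->refcount, 1);

    assert(old > 0);  /* taking a reference on a dead object is a bug */
    (void)old;
}

/*
 * Drop a reference; the last holder frees the object.
 * Returns true if this call released the memory.
 */
static bool pgpath_sketch_put(struct pgpath_sketch *p)
{
    int old = atomic_fetch_sub(&p->refcount, 1);

    assert(old > 0);
    if (old == 1) {
        free(p);
        return true;
    }
    return false;
}
```

With this scheme, tearing down the table while a mode select is outstanding only drops the table's reference; the completion callback still sees valid memory and frees it when it drops the last one. Whether this or waiting for pg_init to complete before teardown is the better fix is a separate question.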

 

Menny Hamburger

Engineer

Dell | IDC

office +972 97698789,  fax +972 97698889

Dell IDC. 4 Hacharoshet St, Raanana 43657, Israel

 

