[linux-lvm] Segfault & BUG/OOPS during lvremove snapshot
James G. Sack (jim)
jsack at inostor.com
Wed Oct 19 21:27:22 UTC 2005
I'm assuming that lvm2 snapshots really work and am trying to find the
proper usage recipes, but I get a repeatable segmentation fault on the
command line and a BUG/OOPS in syslog.
The dmesg output is below; the environment and procedural background are
as follows:
tools
LVM version: 2.01.08 (2005-03-22)
Library version: 1.01.02 (2005-05-17)
Driver version: 4.4.0
dmsetup
Library version: 1.01.02 (2005-05-17)
Driver version: 4.4.0
Running Fedora Core 4 (standard yum-updated kernel 2.6.13-1.1526_FC4)
P3 1200MHz
1GB RAM
aic7xxx, 9 disks, but I'm only testing with one partition of one disk
IBM Model: IC35L146UCDY10-0 Rev: S21D
After a reboot, or prior to a test run:
  pvs shows:  /dev/sdh11  VGh11  lvm2  a-  134.71G  74.71G
  vgs shows:  VGh11  1  2  1  wz--n  134.71G  74.71G
  lvs shows:  L1  VGh11  owi-ao  50.00G
              S1  VGh11  swi-a-  10.00G  L1  26.96
  dmsetup info shows 4 ACTIVE devices
During my test I repeatedly create a second snapshot "SS", sleep 10
seconds, and remove the SS snapshot. While this loops forever, I
repeatedly examine status via dmsetup info commands, and in a third
loop I repeatedly read the directory tree from the ext3 fs on the
origin volume (L1).
I am not issuing any explicit dmsetup suspend or resume commands in my
test script.
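For the record, the three loops above amount to something like the
following sketch. The names VGh11, L1 and SS are from this report; the
snapshot size, sleep intervals, and the mount point /mnt/l1 are my
assumptions, not exact copies of my script.

```shell
#!/bin/sh
# Hypothetical reconstruction of the test loops; run as root.
VG=VGh11
ORIGIN=L1
SNAP=SS
MNT=/mnt/l1   # assumed mount point of L1's ext3 filesystem

# Loop 1: create and remove the second snapshot forever.
snap_loop() {
    while :; do
        lvcreate -s -L 10G -n "$SNAP" "/dev/$VG/$ORIGIN"
        sleep 10
        lvremove -f "/dev/$VG/$SNAP"
    done
}

# Loop 2: repeatedly poll device-mapper status.
status_loop() {
    while :; do
        dmsetup info
        sleep 5
    done
}

# Loop 3: keep the origin filesystem busy with reads.
read_loop() {
    while :; do
        find "$MNT" -type f -exec cat {} + > /dev/null
    done
}

# To reproduce, run the three loops concurrently:
# snap_loop & status_loop & read_loop & wait
```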
It may take several hundred snapshot create/remove cycles to crash when
only doing filesystem read operations.
NOTE, HOWEVER: If I substitute a read/write operation for the read
operation, it seems to crash on the first create/remove loop. I believe
it's always during the lvremove call.
NEW NOTE: I had a nagging thought that my test might have been run on
an old test volume with possibly corrupt metadata from previous testing,
so I repeated the experiment on a fresh PV/VG/LV. With only a single
snapshot I couldn't trigger a crash even after considerable read/write
activity --UNTIL I manually created a second snapshot, whereupon the
test loop triggered the same BUG after only a little more i/o. More
oopses can be furnished on request.
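The fresh setup was along these lines; the partition name, volume-group
name and sizes here are illustrative assumptions, not the exact values
I used.

```shell
#!/bin/sh
# Hypothetical sketch of the fresh PV/VG/LV setup; run as root against
# a scratch partition you can afford to wipe.
setup_fresh() {
    PART=/dev/sdh11   # assumed scratch partition

    pvcreate "$PART"                      # initialize the physical volume
    vgcreate VGfresh "$PART"              # new volume group
    lvcreate -L 50G -n L1 VGfresh         # origin logical volume
    mkfs.ext3 /dev/VGfresh/L1             # ext3 fs on the origin
    lvcreate -s -L 10G -n S1 /dev/VGfresh/L1   # first snapshot
}

# setup_fresh   # uncomment to run for real
```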
Typical dmesg output follows:
------------[ cut here ]------------
kernel BUG at drivers/md/kcopyd.c:145!
invalid operand: 0000 [#1]
Modules linked in: xfs exportfs dm_snapshot ipv6 parport_pc lp parport
autofs4 rfcomm l2cap bluetooth sunrpc ohci_hcd i2c_piix4 i2c_core tulip
e100 mii floppy ext3 jbd raid1 dm_mod aic7xxx scsi_transport_spi sd_mod
scsi_mod
CPU: 0
EIP: 0060:[<f886da1a>] Not tainted VLI
EFLAGS: 00010287 (2.6.13-1.1526_FC4)
EIP is at client_free_pages+0x2a/0x40 [dm_mod]
eax: 00000100 ebx: f3074a20 ecx: f7fff060 edx: 00000000
esi: f9167080 edi: 00000000 ebp: 00000000 esp: f6230f1c
ds: 007b es: 007b ss: 0068
Process lvremove (pid: 10432, threadinfo=f6230000 task=f66b7aa0)
Stack: f3074a20 f886efc2 c1ac65c0 f89c296f f9167080 f59e6280 f8868d3b f6384b80
       f89e8000 00000004 f886b460 f886acba f8875860 f886b4af f6230000 00000000
       f886c96d f89e8000 f886c8a0 f666dec0 08642188 f6230000 c01affee 08642188
Call Trace:
[<f886efc2>] kcopyd_client_destroy+0x12/0x26 [dm_mod]
[<f89c296f>] snapshot_dtr+0x4f/0x60 [dm_snapshot]
[<f8868d3b>] table_destroy+0x3b/0x90 [dm_mod]
[<f886b460>] dev_remove+0x0/0xd0 [dm_mod]
[<f886acba>] __hash_remove+0x5a/0xa0 [dm_mod]
[<f886b4af>] dev_remove+0x4f/0xd0 [dm_mod]
[<f886c96d>] ctl_ioctl+0xcd/0x110 [dm_mod]
[<f886c8a0>] ctl_ioctl+0x0/0x110 [dm_mod]
[<c01affee>] do_ioctl+0x4e/0x60
[<c01b00ff>] vfs_ioctl+0x4f/0x1c0
[<c01b02c4>] sys_ioctl+0x54/0x70
[<c01041e9>] syscall_call+0x7/0xb
Code: 00 53 89 c3 8b 40 24 39 43 28 75 1f 8b 43 20 e8 6d ff ff ff c7 43
20 00 00 00 00 c7 43 24 00 00 00 00 c7 43 28 00 00 00 00 5b c3 <0f> 0b
91 00 cb f3 86 f8 eb d7 8d b6 00 00 00 00 8d bf 00 00 00
<1>Unable to handle kernel NULL pointer dereference at virtual address 00000034
printing eip:
c019b50c
*pde = 00000000
Oops: 0000 [#2]
Modules linked in: xfs exportfs dm_snapshot ipv6 parport_pc lp parport
autofs4 rfcomm l2cap bluetooth sunrpc ohci_hcd i2c_piix4 i2c_core tulip
e100 mii floppy ext3 jbd raid1 dm_mod aic7xxx scsi_transport_spi sd_mod
scsi_mod
CPU: 0
EIP: 0060:[<c019b50c>] Not tainted VLI
EFLAGS: 00010287 (2.6.13-1.1526_FC4)
EIP is at bio_add_page+0xc/0x30
eax: 00000000 ebx: f6558740 ecx: 00001000 edx: c1663080
esi: 00000000 edi: f6558740 ebp: f6b1ef30 esp: f6b1ee90
ds: 007b es: 007b ss: 0068
Process kcopyd (pid: 3975, threadinfo=f6b1e000 task=f6548000)
Stack: 00000010 f886d02e 00000000 f6592608 00000000 00000001 00000000 00001000
       c1663080 f6b1ef30 00000000 00000001 00000010 f886d10b f6b1ef30 f63844c0
       f886ce40 f63844c0 f6592608 00000001 00000001 f886ce60 00000000 f3805560
Call Trace:
[<f886d02e>] do_region+0xde/0x110 [dm_mod]
[<f886d10b>] dispatch_io+0xab/0xd0 [dm_mod]
[<f886ce40>] list_get_page+0x0/0x20 [dm_mod]
[<f886ce60>] list_next_page+0x0/0x10 [dm_mod]
[<f886db60>] complete_io+0x0/0x360 [dm_mod]
[<f886d28e>] async_io+0x5e/0xb0 [dm_mod]
[<f886d3d4>] dm_io_async+0x34/0x40 [dm_mod]
[<f886db60>] complete_io+0x0/0x360 [dm_mod]
[<f886ce40>] list_get_page+0x0/0x20 [dm_mod]
[<f886ce60>] list_next_page+0x0/0x10 [dm_mod]
[<f886dec0>] run_io_job+0x0/0x60 [dm_mod]
[<f886df12>] run_io_job+0x52/0x60 [dm_mod]
[<f886db60>] complete_io+0x0/0x360 [dm_mod]
[<f886e1a6>] process_jobs+0x16/0x590 [dm_mod]
[<f886e720>] do_work+0x0/0x30 [dm_mod]
[<c0142c81>] worker_thread+0x271/0x520
[<c0120170>] default_wake_function+0x0/0x10
[<c0142a10>] worker_thread+0x0/0x520
[<c014a935>] kthread+0x85/0x90
[<c014a8b0>] kthread+0x0/0x90
[<c01012f1>] kernel_thread_helper+0x5/0x14
Code: 07 00 00 00 00 c7 47 04 00 00 00 00 c7 47 08 00 00 00 00 31 c0 5b
5e 5f 5d c3 90 8d 74 26 00 53 89 c3 8b 40 0c 8b 80 80 00 00 00 <8b> 40
34 ff 74 24 08 51 89 d1 89 da e8 b3 fe ff ff 5a 59 5b c3
------------------------------------------------------------------------
Regards,
..jim