
[Cluster-devel] Problems mounting GFS2 devices



Hi guys,

This is using the latest gfs2 code from git and the latest CVS HEAD userland.

# gfs2_mkfs -t edgy:mygfs2 -p lock_dlm -j 4 /dev/mapper/mofo 
This will destroy any data on /dev/mapper/mofo.

Are you sure you want to proceed? [y/n] y

Device:                    /dev/mapper/mofo
Blocksize:                 4096
Device Size                237.36 GB (62223680 blocks)
Filesystem Size:           237.36 GB (62223679 blocks)
Journals:                  4
Resource Groups:           950
Locking Protocol:          "lock_dlm"
Lock Table:                "edgy:mygfs2"

/dev/mapper/mofo is a SAN-exported device as seen by multipath,
but accessing the device directly makes no difference.

# mount /dev/mapper/mofo /mnt
Segmentation fault

# dmesg
[42950437.160000] GFS2: fsid=: Trying to join cluster "lock_dlm", "edgy:mygfs2"
[42950437.170000] dlm: mygfs2: recover 1
[42950437.170000] dlm: mygfs2: add member 1
[42950437.170000] dlm: mygfs2: total members 1
[42950437.170000] dlm: mygfs2: dlm_recover_directory
[42950437.170000] dlm: mygfs2: dlm_recover_directory 0 entries
[42950437.170000] dlm: mygfs2: recover 1 done: 0 ms
[42950437.170000] GFS2: fsid=edgy:mygfs2.4294967295: Joined cluster. Now mounting FS...
[42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: can't mount journal #4294967295
[42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: there are only 4 journals (0 - 3)
[42950437.180000] GFS2: fsid=edgy:mygfs2.4294967295: fatal assertion failed

^^^ Note: I get the same kind of error no matter how many journals I create.
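
A side note on the journal number in the log above: 4294967295 is 0xffffffff, which is what -1 looks like when printed as an unsigned 32-bit value. The sketch below only checks that arithmetic. The helper name `as_signed32` is made up for illustration, and reading the value as "no journal ID was assigned at mount time" is a guess, not something the log confirms.

```python
import struct

def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    # Pack as unsigned 32-bit, then unpack the same bytes as signed 32-bit.
    return struct.unpack("<i", struct.pack("<I", value))[0]

# Journal number exactly as printed in the dmesg output above.
jid_from_log = 4294967295

print(hex(jid_from_log))
print(as_signed32(jid_from_log))
```

If that guess holds, it would also fit the observation that the error does not depend on how many journals are created, since no valid journal range would contain this value.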

[42950437.180000] ------------[ cut here ]------------
[42950437.180000] kernel BUG at fs/gfs2/ops_super.c:290!
[42950437.180000] invalid opcode: 0000 [#1]
[42950437.180000] SMP 
[42950437.180000] Modules linked in: video tc1100_wmi sony_acpi pcc_acpi hotkey dev_acpi container button acpi_sbs battery ac i2c_acpi_ec i2c_core sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod lp sg snd_intel8x0 snd_ac97_codec snd_ac97_bus hw_random snd_pcm_oss snd_mixer_oss tsdev shpchp snd_pcm snd_timer evdev intel_agp agpgart snd soundcore snd_page_alloc pci_hotplug e100 mii parport_pc psmouse pcspkr floppy serio_raw parport dm_round_robin dm_multipath dm_mod ext3 jbd sd_mod uhci_hcd usbcore lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan vesafb capability commoncap vga16fb vgastate fbcon tileblit font bitblit softcursor
[42950437.180000] CPU:    0
[42950437.180000] EIP:    0060:[<e0c20493>]    Not tainted VLI
[42950437.180000] EFLAGS: 00010296   (2.6.17-5-server #2) 
[42950437.180000] EIP is at gfs2_clear_inode+0x73/0x90 [gfs2]
[42950437.180000] eax: 0000004f   ebx: d0118048   ecx: 00000000   edx: 00000292
[42950437.180000] esi: 00000000   edi: e0bd3000   ebp: e0beb4ac   esp: d3c4fcb8
[42950437.180000] ds: 007b   es: 007b   ss: 0068
[42950437.180000] Process mount (pid: 4736, threadinfo=d3c4e000 task=dafb2580)
[42950437.180000] Stack: d0118048 c01850bd df20c400 d0118048 df20c400 c01852ce d0118048 e0beb788 
[42950437.180000]        c0184bac ffffffea e0c1cd2b e0c2cecc e0beb788 00000004 00000003 d3c4fcf4 
[42950437.180000]        d3c4fcf4 00000000 dafb2580 00000003 00000020 00000000 000000c2 00000000 
[42950437.180000] Call Trace:
[42950437.180000]  <c01850bd> clear_inode+0x9d/0x120  <c01852ce> generic_drop_inode+0x6e/0x150
[42950437.180000]  <c0184bac> iput+0x5c/0x70  <e0c1cd2b> init_journal+0x8b/0x4a0 [gfs2]
[42950437.180000]  <e0c1d17f> init_inodes+0x3f/0x200 [gfs2]  <e0c1dd8f> fill_super+0x58f/0x6e0 [gfs2]
[42950437.180000]  <e0c107e8> gfs2_glock_nq_num+0x48/0x80 [gfs2]  <c017278c> get_sb_bdev+0xec/0x130
[42950437.180000]  <c0187598> alloc_vfsmnt+0xa8/0xe0  <e0c1c859> gfs2_get_sb+0x19/0x20 [gfs2]
[42950437.180000]  <e0c1d800> fill_super+0x0/0x6e0 [gfs2]  <c017210c> do_kern_mount+0xcc/0x170
[42950437.180000]  <c01889a5> do_mount+0x435/0x730  <c014e339> filemap_nopage+0x2e9/0x390
[42950437.180000]  <c0158b88> __handle_mm_fault+0x368/0xc10  <c01190a6> do_page_fault+0x3b6/0x744
[42950437.180000]  <c0103be7> error_code+0x4f/0x54  <c0150c32> __alloc_pages+0x52/0x310
[42950437.180000]  <c0187873> copy_mount_options+0x43/0x150  <c0188d17> sys_mount+0x77/0xc0
[42950437.180000]  <c0103007> sysenter_past_esp+0x54/0x75 
[42950437.180000] Code: 60 02 00 00 85 c0 74 10 8d 83 64 02 00 00 5b e9 a4 f4 fe ff 8d 74 26 00 5b c3 8b 83 9c 00 00 00 8b 80 60 01 00 00 e8 9d 98 00 00 <0f> 0b 22 01 dc ba c2 e0 8b 83 60 02 00 00 eb 9e 8d b6 00 00 00 
[42950437.180000] EIP: [<e0c20493>] gfs2_clear_inode+0x73/0x90 [gfs2] SS:ESP 0068:d3c4fcb8
[42950437.180000]  <1>BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
[42950437.520000]  printing eip:
[42950437.530000] e0c1005e
[42950437.530000] *pde = 0170d001
[42950437.540000] Oops: 0002 [#2]
[42950437.540000] SMP 
[42950437.540000] Modules linked in: video tc1100_wmi sony_acpi pcc_acpi hotkey dev_acpi container button acpi_sbs battery ac i2c_acpi_ec i2c_core sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod lp sg snd_intel8x0 snd_ac97_codec snd_ac97_bus hw_random snd_pcm_oss snd_mixer_oss tsdev shpchp snd_pcm snd_timer evdev intel_agp agpgart snd soundcore snd_page_alloc pci_hotplug e100 mii parport_pc psmouse pcspkr floppy serio_raw parport dm_round_robin dm_multipath dm_mod ext3 jbd sd_mod uhci_hcd usbcore lpfc scsi_transport_fc scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan vesafb capability commoncap vga16fb vgastate fbcon tileblit font bitblit softcursor
[42950437.540000] CPU:    0
[42950437.540000] EIP:    0060:[<e0c1005e>]    Not tainted VLI
[42950437.540000] EFLAGS: 00010246   (2.6.17-5-server #2) 
[42950437.540000] EIP is at drop_bh+0x8e/0x1b0 [gfs2]
[42950437.540000] eax: 00000004   ebx: d484f43c   ecx: 00000000   edx: d0118048
[42950437.540000] esi: d3c4fc74   edi: d484f458   ebp: 00000000   esp: c8505f2c
[42950437.540000] ds: 007b   es: 007b   ss: 0068
[42950437.540000] Process lock_dlm2 (pid: 4739, threadinfo=c8504000 task=dfc81a90)
[42950437.540000] Stack: e0be4358 c8505fac e0bd3000 e0c3e220 e0bd3000 c8505fac d484f43c df20ce00 
[42950437.540000]        e0c0f746 00000292 c0135c7a df20ce00 df348f40 fffefffe e0b32b9d 00000000 
[42950437.540000]        00000009 dfc81b98 dfc81a90 dffa7a90 c1404d20 c8505fac df20cf74 00010000 
[42950437.540000] Call Trace:
[42950437.540000]  <e0c0f746> gfs2_glock_cb+0x96/0x170 [gfs2]  <c0135c7a> remove_wait_queue+0x1a/0x50
[42950437.540000]  <e0b32b9d> gdlm_thread+0x4fd/0x740 [lock_dlm]  <c011b9f0> default_wake_function+0x0/0x10
[42950437.540000]  <e0b326a0> gdlm_thread+0x0/0x740 [lock_dlm]  <c013586c> kthread+0xac/0xe0
[42950437.540000]  <c01357c0> kthread+0x0/0xe0  <c0101005> kernel_thread_helper+0x5/0x10
[42950437.540000] Code: 89 d8 e8 d6 f0 ff ff 8b 44 24 0c 8b 48 14 85 c9 74 09 ba 60 00 00 00 89 d8 ff d1 85 f6 74 22 89 f8 e8 e7 6a 6c df 8b 06 8b 56 04 <89> 50 04 89 02 b0 01 89 36 89 76 04 c7 46 18 00 00 00 00 86 43 
[42950437.540000] EIP: [<e0c1005e>] drop_bh+0x8e/0x1b0 [gfs2] SS:ESP 0068:c8505f2c
[42950437.540000]  <3>BUG: soft lockup detected on CPU#0!

The system is still usable for a few seconds. Then another oops appears on the terminal and
the machine dies hard.

(hand copied)

[42950461.990000] <c014899x> softlockup_tick+0x9c/0xf0		<c012b9c1> update_process_times+0x21/0x80
[42950461.990000] <c0113cb1> smp_apic_timer_interrupt+0x51/0x60 <c0103b40> apic_timer_interrupt+0x1c/0x24
[42950461.990000] <c02d6b45> _spin_lock+0x5/0x10		<e0c0e85b> gfs2_glmutex_trylock+0xb/0x40 [gfs2]
[42950461.990000] <e0c10f88> scan_glock+0x8/0x70 [gfs2]		<e0c0e9fb> examine_bucket+0x8b/0xd0 [gfs2]
[42950461.990000] <e0c10f80> scan_glock+0x0/0x70 [gfs2]		<e0c07790> gfs2_scand+0x0/0x50 [gfs2]
[42950461.990000] <e0c0ebaf> gfs2_scand_internal+0x1f/0x40 [gfs2] <e0c0779c> gfs2_scand+0xc/0x50 [gfs2]
[42950461.990000] <c013586c> kthread+0xac/0xe0			<c01357c0> kthread+0x0/0xe0
[42950461.990000] <c0101005> kernel_thread_helper+0x5/0x10

Here is a test with lock_nolock:

# gfs2_mkfs -t edgy:mygfs2 -p lock_nolock -j 4 /dev/mapper/mofo 
This will destroy any data on /dev/mapper/mofo.

Are you sure you want to proceed? [y/n] y

Device:                    /dev/mapper/mofo
Blocksize:                 4096
Device Size                237.36 GB (62223680 blocks)
Filesystem Size:           237.36 GB (62223679 blocks)
Journals:                  4
Resource Groups:           950
Locking Protocol:          "lock_nolock"
Lock Table:                "edgy:mygfs2"

[42949467.940000] Lock_Nolock (built Jul 18 2006 14:27:44) installed
[42949521.080000] GFS2: fsid=: Trying to join cluster "lock_nolock", "edgy:mygfs2"
[42949521.080000] GFS2: fsid=edgy:mygfs2.0: Joined cluster. Now mounting FS...
[42949521.220000] GFS2: fsid=edgy:mygfs2.0: jid=0, already locked for use
[42949521.220000] GFS2: fsid=edgy:mygfs2.0: jid=0: Looking at journal...
[42949521.330000] GFS2: fsid=edgy:mygfs2.0: jid=0: Done
[42949521.330000] GFS2: fsid=edgy:mygfs2.0: jid=1: Trying to acquire journal lock...
[42949521.330000] GFS2: fsid=edgy:mygfs2.0: jid=1: Looking at journal...
[42949521.470000] GFS2: fsid=edgy:mygfs2.0: jid=1: Done
[42949521.470000] GFS2: fsid=edgy:mygfs2.0: jid=2: Trying to acquire journal lock...
[42949521.470000] GFS2: fsid=edgy:mygfs2.0: jid=2: Looking at journal...
[42949521.620000] GFS2: fsid=edgy:mygfs2.0: jid=2: Done
[42949521.620000] GFS2: fsid=edgy:mygfs2.0: jid=3: Trying to acquire journal lock...
[42949521.620000] GFS2: fsid=edgy:mygfs2.0: jid=3: Looking at journal...
[42949521.770000] GFS2: fsid=edgy:mygfs2.0: jid=3: Done

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              19G  789M   17G   5% /
varrun                252M   80K  252M   1% /var/run
varlock               252M  4,0K  252M   1% /var/lock
udev                   10M  112K  9,9M   2% /dev
devshm                252M     0  252M   0% /dev/shm
Segmentation fault

# dmesg
[42949571.960000] BUG: unable to handle kernel paging request at virtual address 0000109c
[42949571.960000]  printing eip:
[42949571.960000] e0c374c8
[42949571.960000] *pde = 1bbf2001
[42949571.960000] Oops: 0000 [#1]
[42949571.960000] SMP 
[42949571.960000] Modules linked in: lock_nolock video tc1100_wmi sony_acpi pcc_acpi hotkey dev_acpi container button acpi_sbs battery ac i2c_acpi_ec i2c_core sctp lock_dlm gfs2 dlm configfs ipv6 af_packet md_mod lp hw_random snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm_oss snd_mixer_oss sg snd_pcm snd_timer snd soundcore e100 tsdev evdev mii shpchp intel_agp agpgart pci_hotplug snd_page_alloc parport_pc psmouse serio_raw pcspkr parport floppy dm_round_robin dm_multipath dm_mod ext3 jbd sd_mod lpfc scsi_transport_fc uhci_hcd usbcore scsi_mod ide_generic ide_cd cdrom ide_disk piix generic thermal processor fan vesafb capability commoncap vga16fb vgastate fbcon tileblit font bitblit softcursor
[42949571.960000] CPU:    0
[42949571.960000] EIP:    0060:[<e0c374c8>]    Not tainted VLI
[42949571.960000] EFLAGS: 00010286   (2.6.17-5-server #2) 
[42949571.960000] EIP is at gfs2_statfs+0x18/0xd0 [gfs2]
[42949571.960000] eax: 00001000   ebx: def2d800   ecx: e0c556c0   edx: cc1abeb0
[42949571.960000] esi: cc1abeb0   edi: cc1abf04   ebp: cc1abeb0   esp: cc1abe74
[42949571.960000] ds: 007b   es: 007b   ss: 0068
[42949571.960000] Process df (pid: 4689, threadinfo=cc1aa000 task=dfc7da90)
[42949571.960000] Stack: dffc5ea0 dfbfe5f8 c017b8c1 dc7ff000 dfbfe5f8 dffc5ea0 def2d800 cc1abeb0 
[42949571.960000]        cc1abf04 cc1aa000 c0168fe5 00000000 cc1abeb0 cc1abf14 c0169116 00000000 
[42949571.960000]        00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
[42949571.960000] Call Trace:
[42949571.960000]  <c017b8c1> link_path_walk+0x71/0xf0  <c0168fe5> vfs_statfs+0x65/0x80
[42949571.960000]  <c0169116> vfs_statfs64+0x16/0x30  <c016a5c3> sys_statfs64+0x83/0xc0
[42949571.960000]  <c0226220> tty_write+0x0/0x1f0  <c016be11> sys_write+0x41/0x70
[42949571.960000]  <c0103007> sysenter_past_esp+0x54/0x75 
[42949571.960000] Code: 60 02 00 00 eb 9e 8d b6 00 00 00 00 8d bc 27 00 00 00 00 83 ec 28 89 74 24 1c 89 7c 24 20 89 6c 24 24 89 d5 89 5c 24 18 8b 40 0c <8b> 80 9c 00 00 00 8b 98 60 01 00 00 8d 83 e4 02 00 00 e8 61 f6 
[42949571.960000] EIP: [<e0c374c8>] gfs2_statfs+0x18/0xd0 [gfs2] SS:ESP 0068:cc1abe74
[42949571.960000]  
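
For anyone triaging the traces above, the faulting sites can be pulled out of a saved dmesg dump with a few lines of script. This is only a convenience sketch: the function name `faulting_sites` and the `sample` text are illustrative, and the pattern assumes the 2.6.17-style "EIP is at func+0xoff/0xsize [module]" line format seen in this mail.

```python
import re

# Matches lines such as: "EIP is at gfs2_statfs+0x18/0xd0 [gfs2]"
EIP_RE = re.compile(
    r"EIP is at (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?"
)

def faulting_sites(dmesg_text):
    """Return (function, offset, size, module) for each 'EIP is at' line."""
    sites = []
    for match in EIP_RE.finditer(dmesg_text):
        func, offset, size, module = match.groups()
        sites.append((func, int(offset, 16), int(size, 16), module))
    return sites

# The three "EIP is at" lines quoted from the logs in this mail.
sample = """\
[42950437.180000] EIP is at gfs2_clear_inode+0x73/0x90 [gfs2]
[42950437.540000] EIP is at drop_bh+0x8e/0x1b0 [gfs2]
[42949571.960000] EIP is at gfs2_statfs+0x18/0xd0 [gfs2]
"""

for func, offset, size, module in faulting_sites(sample):
    print(f"{module}: {func}+{offset:#x}/{size:#x}")
```

On the lines quoted here, all three faulting sites are inside the gfs2 module.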


Thanks for your time
Fabio

PS: Of course I am ready to test possible patches or provide any extra info
required. The SAN is not in production, so we can play as much as we want.

