
Re: [Linux-cluster] NFS/GFS problems



On Thu, 2005-06-09 at 19:04 -0800, Jay Cable wrote:

Hi, can you put this in bugzilla?  Oopses are always bugs.

-- Lon

> I am using the "RHEL4 cluster" branch from cvs, and the 2.6.9-5.0.5.ELsmp
> kernel.  I am using "lock_dlm" locking and the file system was created
> via:
> gfs_mkfs -r 1536 -j 3 -p lock_dlm -t ftp:dds_space /dev/mapper/ftp_space-erc1
> My cluster configuration is pretty simple - sanbox2 fencing with two nodes
> and the two_node option set (<cman two_node="1" expected_votes="1">).
> 
> I would greatly appreciate any advice folks have as to what I can do to
> fix this problem.  From the list archives it appears that other folks are
> serving out gfs filesystems via nfs, so this should be possible, right?
> 
> I have attached the relevant part of /var/log/messages
> for a crash.  If any additional information would be helpful, please let
> me know, and I will get it (the crashes/hangs are very repeatable!).
> 
> Thanks,
>   -Jay Cable
> 
> Here is the output from one of the crashes:
> Jun  9 19:23:46 jin kernel: send_arp uses obsolete (PF_INET,SOCK_PACKET)
> Jun  9 19:28:06 jin kernel: Bad page state at prep_new_page (in process
> 'nfsd', page c159f4e0)
> Jun  9 19:28:06 jin kernel: flags:0x20001020 mapping:f6a300e0 mapcount:0
> count:2
> Jun  9 19:28:06 jin kernel: Backtrace:
> Jun  9 19:28:06 jin kernel:  [<c013e669>] bad_page+0x58/0x89
> Jun  9 19:28:06 jin kernel:  [<c013e9ec>] prep_new_page+0x24/0x3a
> Jun  9 19:28:06 jin kernel:  [<c013eef8>] buffered_rmqueue+0x17d/0x1a5
> Jun  9 19:28:06 jin kernel:  [<c013efd4>] __alloc_pages+0xb4/0x298
> Jun  9 19:28:06 jin kernel:  [<c013baa2>] find_lock_page+0x96/0x9d
> Jun  9 19:28:06 jin kernel:  [<c013d16d>]
> generic_file_buffered_write+0x10d/0x47c
> Jun  9 19:28:06 jin kernel:  [<c013bac1>] find_or_create_page+0x18/0x72
> Jun  9 19:28:06 jin kernel:  [<c013b775>] wake_up_page+0x9/0x29
> Jun  9 19:28:06 jin kernel:  [<c013d85e>]
> generic_file_aio_write_nolock+0x382/0x3b0
> Jun  9 19:28:06 jin kernel:  [<c013d910>]
> generic_file_write_nolock+0x84/0x99
> Jun  9 19:28:06 jin kernel:  [<f8f96e5f>] gfs_glock_nq+0xe3/0x116 [gfs]
> Jun  9 19:28:06 jin kernel:  [<c011e8d2>]
> autoremove_wake_function+0x0/0x2d
> Jun  9 19:28:06 jin kernel:  [<f8fb7658>] gfs_trans_begin_i+0xfd/0x15a
> [gfs]
> Jun  9 19:28:06 jin kernel:  [<f8faadd2>] do_do_write_buf+0x268/0x3b4
> [gfs]
> Jun  9 19:28:06 jin kernel:  [<f8fab02e>] do_write_buf+0x110/0x152 [gfs]
> Jun  9 19:28:06 jin kernel:  [<f8faa238>] walk_vm+0xd3/0xf7 [gfs]
> Jun  9 19:28:06 jin kernel:  [<f8f9709a>] gfs_glock_dq+0x111/0x11f [gfs]
> Jun  9 19:28:06 jin kernel:  [<f8fab10d>] gfs_write+0x9d/0xb6 [gfs]
> Jun  9 19:28:06 jin kernel:  [<f8faaf1e>] do_write_buf+0x0/0x152 [gfs]
> Jun  9 19:28:06 jin kernel:  [<f8fab070>] gfs_write+0x0/0xb6 [gfs]
> Jun  9 19:28:06 jin kernel:  [<c0155ba8>] do_readv_writev+0x1c5/0x21d
> Jun  9 19:28:06 jin kernel:  [<c0154c92>] dentry_open+0xf0/0x1a5
> Jun  9 19:28:06 jin kernel:  [<c0155c7e>] vfs_writev+0x3e/0x43
> Jun  9 19:28:06 jin kernel:  [<f8c11b6b>] nfsd_write+0xeb/0x289 [nfsd]
> Jun  9 19:28:06 jin kernel:  [<f8b2d5db>] svcauth_unix_accept+0x2d3/0x34a
> [sunrpc]
> Jun  9 19:28:06 jin kernel:  [<f8c18356>] nfsd3_proc_write+0xbf/0xd5
> [nfsd]
> Jun  9 19:28:06 jin kernel:  [<f8c1a3a8>]
> nfs3svc_decode_writeargs+0x0/0x243 [nfsd]
> Jun  9 19:28:06 jin kernel:  [<f8c0e5d7>] nfsd_dispatch+0xba/0x16f [nfsd]
> Jun  9 19:28:06 jin kernel:  [<f8b2a446>] svc_process+0x420/0x6d6 [sunrpc]
> Jun  9 19:28:06 jin kernel:  [<f8c0e3b7>] nfsd+0x1cc/0x332 [nfsd]
> Jun  9 19:28:06 jin kernel:  [<f8c0e1eb>] nfsd+0x0/0x332 [nfsd]
> Jun  9 19:28:06 jin kernel:  [<c01041f1>] kernel_thread_helper+0x5/0xb
> Jun  9 19:28:06 jin kernel: Trying to fix it up, but a reboot is needed
> Jun  9 19:30:34 jin kernel: ------------[ cut here ]------------
> Jun  9 19:30:34 jin kernel: kernel BUG at mm/vmscan.c:377!
> Jun  9 19:30:34 jin kernel: invalid operand: 0000 [#1]
> Jun  9 19:30:34 jin kernel: SMP
> Jun  9 19:30:34 jin kernel: Modules linked in: lock_dlm(U) dlm(U) cman(U)
> gfs(U) lock_harness(U) dm_mod qla2300 qla2xxx scsi_transport_fc nfsd
> exportfs lockd autofs4 i2c_dev i2c_core md5 ipv6 sunrpc ipt_REJECT
> ipt_state ip_conntrack iptable_filter ip_tables button battery ac uhci_hcd
> ehci_hcd e1000 floppy ext3 jbd raid1 ata_piix libata sd_mod scsi_mod
> Jun  9 19:30:34 jin kernel: CPU:    1
> Jun  9 19:30:34 jin kernel: EIP:    0060:[<c01447bd>]    Tainted: GF   B
> VLI
> Jun  9 19:30:34 jin kernel: EFLAGS: 00010202   (2.6.9-5.0.5.ELsmp)
> Jun  9 19:30:34 jin kernel: EIP is at shrink_list+0xa9/0x3ee
> Jun  9 19:30:34 jin kernel: eax: 20001049   ebx: f7cedecc   ecx: c159f4f8
> edx: c10f24d8
> Jun  9 19:30:34 jin kernel: esi: c159f4e0   edi: 00000021   ebp: f7cedf58
> esp: f7cede54
> Jun  9 19:30:34 jin kernel: ds: 007b   es: 007b   ss: 0068
> Jun  9 19:30:34 jin kernel: Process kswapd0 (pid: 44, threadinfo=f7ced000
> task=f7d1b7b0)
> Jun  9 19:30:34 jin kernel: Stack: 00000001 00000000 00000000 00000000
> f7cedecc f7cede68 f7cede68 00000000
> Jun  9 19:30:34 jin kernel:        00000001 c12f4be0 c1204a00 00000246
> f7ceded4 c0319e00 00000000 f7ceded4
> Jun  9 19:30:34 jin kernel:        c0143bc0 c10639f8 00000296 c1f479c0
> c10639e0 00000000 00000020 f7ced000
> Jun  9 19:30:34 jin kernel: Call Trace:
> Jun  9 19:30:34 jin kernel:  [<c0143bc0>] __pagevec_release+0x15/0x1d
> Jun  9 19:30:34 jin kernel:  [<c0144cdf>] shrink_cache+0x1dd/0x34d
> Jun  9 19:30:34 jin kernel:  [<c014539d>] shrink_zone+0xa7/0xb6
> Jun  9 19:30:34 jin kernel:  [<c0145740>] balance_pgdat+0x1b6/0x2f8
> Jun  9 19:30:34 jin kernel:  [<c014594c>] kswapd+0xca/0xcc
> Jun  9 19:30:34 jin kernel:  [<c011e8d2>]
> autoremove_wake_function+0x0/0x2d
> Jun  9 19:30:34 jin kernel:  [<c02c6206>] ret_from_fork+0x6/0x14
> Jun  9 19:30:34 jin kernel:  [<c011e8d2>]
> autoremove_wake_function+0x0/0x2d
> Jun  9 19:30:34 jin kernel:  [<c0145882>] kswapd+0x0/0xcc
> Jun  9 19:30:34 jin kernel:  [<c01041f1>] kernel_thread_helper+0x5/0xb
> Jun  9 19:30:34 jin kernel: Code: 71 e8 89 50 04 89 02 c7 41 04 00 02 20
> 00 c7 01 00 01 10 00 f0 0f ba 69 e8 00 19 c0 85 c0 0f 85 b8 02 00 00 8b 41
> e8 a8 40 74 08 <0f> 0b 79 01 41 9a 2d c0 8b 41 e8 f6 c4 20 0f 85 96 02 00
> 00 8b



