
Re: [Linux-cluster] DLM messages



Bas van der Vlies wrote:
David Teigland wrote:
On Mon, Mar 27, 2006 at 08:23:40AM +0200, Bas van der Vlies wrote:
GFS: cvs STABLE

I just noticed that when we get these dlm messages, the load increases to about 60 and stays there. What do these messages mean?

process_lockqueue_reply id 237030a state 0

These are common and haven't been associated with any problems
before.  We should probably remove the printk.  It's caused
by a grant message arriving before the reply to the lock request.

Thanks, I also found that answer in the archive.

Mar 26 15:35:25 ifs2 kernel: dlm: lisa_vg3_lv1: cancel reply ret 0
Mar 26 15:35:25 ifs2 kernel: lock_dlm: unlock sb_status 0 2,15a940b8 flags 0

I've never seen these before.  They're related to cancels which
GFS only does during recovery.  What applications are using GFS?

Our setup is a 4-node GFS server that acts as the home NFS server for our cluster. This cluster is used by different universities and research institutes. All applications use NFS and therefore also GFS.


The load keeps getting higher and the node no longer responds to NFS
requests. We tried to fence the node by disabling its heartbeat network
interface (it was the master). When the node then wants to rejoin, we get all kinds of cman/dlm kernel crashes.

Here is some kern.log output:

==== FS4 ====
Mar 27 12:29:26 ifs4 kernel: ------------[ cut here ]------------
Mar 27 12:29:26 ifs4 kernel: kernel BUG at /usr/src/gfs/stable_1.0.2/stable/cluster/gfs-kernel/src/dlm/lock.c:357!
Mar 27 12:29:26 ifs4 kernel: invalid opcode: 0000 [#1]
Mar 27 12:29:26 ifs4 kernel: SMP
Mar 27 12:29:26 ifs4 kernel: Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg ide_floppy ide_cd cdrom qla2xxx siimage piix e1000 gfs lock_harness dm_mod
Mar 27 12:29:26 ifs4 kernel: CPU:    0
Mar 27 12:29:26 ifs4 kernel: EIP: 0060:[<f8aa5586>] Tainted: GF VLI
Mar 27 12:29:26 ifs4 kernel: EFLAGS: 00010246   (2.6.16-rc5-sara3 #1)
Mar 27 12:29:26 ifs4 kernel: EIP is at do_dlm_unlock+0x91/0xaa [lock_dlm]
Mar 27 12:29:26 ifs4 kernel: eax: 00000004 ebx: f08b35c0 ecx: 0000b8f7 edx: 00000246
Mar 27 12:29:26 ifs4 kernel: esi: ffffffea edi: f8c29000 ebp: eff65ee8 esp: eff65edc
Mar 27 12:29:26 ifs4 kernel: ds: 007b   es: 007b   ss: 0068
Mar 27 12:29:26 ifs4 kernel: Process gfs_glockd (pid: 6605, threadinfo=eff64000 task=eff46030)
Mar 27 12:29:26 ifs4 kernel: Stack: <0>f8aa9d89 f8c29000 f167e3c0 eff65ef4 f8aa5824 f08b35c0 eff65f08 f899a7bc
Mar 27 12:29:26 ifs4 kernel: f08b35c0 00000003 f167e3e4 eff65f2c f8990ca4 f8c29000 f08b35c0 00000003
Mar 27 12:29:26 ifs4 kernel: f89c4ec0 f167e3c0 00000001 f167e3c0 eff65f40 f8993680 f167e3c0 f167e3c0
Mar 27 12:29:26 ifs4 kernel: Call Trace:
Mar 27 12:29:26 ifs4 kernel: [<c0103599>] show_stack_log_lvl+0xad/0xb5
Mar 27 12:29:26 ifs4 kernel: [<c01036db>] show_registers+0x10d/0x176
Mar 27 12:29:26 ifs4 kernel: [<c01038ad>] die+0xf2/0x16d
Mar 27 12:29:26 ifs4 kernel: [<c0103996>] do_trap+0x6e/0x8a
Mar 27 12:29:26 ifs4 kernel: [<c0103bed>] do_invalid_op+0x90/0x97
Mar 27 12:29:26 ifs4 kernel: [<c010322f>] error_code+0x4f/0x54
Mar 27 12:29:26 ifs4 kernel: [<f8aa5824>] lm_dlm_unlock+0x1d/0x24 [lock_dlm]
Mar 27 12:29:26 ifs4 kernel: [<f899a7bc>] gfs_lm_unlock+0x2c/0x46 [gfs]
Mar 27 12:29:26 ifs4 kernel: [<f8990ca4>] gfs_glock_drop_th+0xf0/0x12d [gfs]
Mar 27 12:29:26 ifs4 kernel: [<f8993680>] inode_go_drop_th+0x13/0x18 [gfs]
Mar 27 12:29:26 ifs4 kernel: [<f89901f9>] rq_demote+0x79/0x95 [gfs]
Mar 27 12:29:26 ifs4 kernel: [<f89902b4>] run_queue+0x56/0xbb [gfs]
Mar 27 12:29:26 ifs4 kernel: [<f89903d6>] unlock_on_glock+0x1f/0x29 [gfs]
Mar 27 12:29:26 ifs4 kernel: [<f899232a>] gfs_reclaim_glock+0xbf/0x138 [gfs]
Mar 27 12:29:26 ifs4 kernel: [<f8986682>] gfs_glockd+0x3b/0xe3 [gfs]
Mar 27 12:29:26 ifs4 kernel: [<c0100ed9>] kernel_thread_helper+0x5/0xb
Mar 27 12:29:26 ifs4 kernel: Code: 73 34 ff 73 2c ff 73 08 ff 73 04 ff 73 0c 56 8b 03 ff 70 18 68 a0 a6 aa f8 e8 80 19 67 c7 83 c4 34 68 89 9d aa f8 e8 73 19 67 c7 <0f> 0b 65 01 c0 a4 aa f8 68 a0 a5 aa f8 e8 27 12 67 c7 8d 65 f8
Mar 27 12:29:36 ifs4 kernel: ------------[ cut here ]------------
Mar 27 12:29:36 ifs4 kernel: kernel BUG at /usr/src/gfs/stable_1.0.2/stable/cluster/gfs-kernel/src/dlm/lock.c:357!
Mar 27 12:29:36 ifs4 kernel: invalid opcode: 0000 [#2]
Mar 27 12:29:36 ifs4 kernel: SMP
Mar 27 12:29:36 ifs4 kernel: Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg ide_floppy ide_cd cdrom qla2xxx siimage piix e1000 gfs lock_harness dm_mod
Mar 27 12:29:36 ifs4 kernel: CPU:    1
Mar 27 12:29:36 ifs4 kernel: EIP: 0060:[<f8aa5586>] Tainted: GF VLI
Mar 27 12:29:36 ifs4 kernel: EFLAGS: 00010206   (2.6.16-rc5-sara3 #1)
Mar 27 12:29:36 ifs4 kernel: EIP is at do_dlm_unlock+0x91/0xaa [lock_dlm]
Mar 27 12:29:36 ifs4 kernel: eax: 00000004 ebx: e948f6c0 ecx: c034ab40 edx: 00000206
Mar 27 12:29:36 ifs4 kernel: esi: ffffffea edi: f8b57000 ebp: f4fafee8 esp: f4fafedc
Mar 27 12:29:36 ifs4 kernel: ds: 007b   es: 007b   ss: 0068
Mar 27 12:29:36 ifs4 kernel: Process gfs_glockd (pid: 6472, threadinfo=f4fae000 task=f5d44550)
Mar 27 12:29:36 ifs4 kernel: Stack: <0>f8aa9d89 f8b57000 f33799a8 f4fafef4 f8aa5824 e948f6c0 f4faff08 f899a7bc
Mar 27 12:29:36 ifs4 kernel: e948f6c0 00000003 f33799cc f4faff2c f8990ca4 f8b57000 e948f6c0 00000003
Mar 27 12:29:36 ifs4 kernel: f89c4ec0 f33799a8 00000001 f33799a8 f4faff40 f8993680 f33799a8 f33799a8
Mar 27 12:29:36 ifs4 kernel: Call Trace:
Mar 27 12:29:36 ifs4 kernel: [<c0103599>] show_stack_log_lvl+0xad/0xb5
Mar 27 12:29:36 ifs4 kernel: [<c01036db>] show_registers+0x10d/0x176
Mar 27 12:29:36 ifs4 kernel: [<c01038ad>] die+0xf2/0x16d
Mar 27 12:29:36 ifs4 kernel: [<c0103996>] do_trap+0x6e/0x8a
Mar 27 12:29:36 ifs4 kernel: [<c0103bed>] do_invalid_op+0x90/0x97
Mar 27 12:29:36 ifs4 kernel: [<c010322f>] error_code+0x4f/0x54
Mar 27 12:29:36 ifs4 kernel: [<f8aa5824>] lm_dlm_unlock+0x1d/0x24 [lock_dlm]
Mar 27 12:29:36 ifs4 kernel: [<f899a7bc>] gfs_lm_unlock+0x2c/0x46 [gfs]
Mar 27 12:29:36 ifs4 kernel: [<f8990ca4>] gfs_glock_drop_th+0xf0/0x12d [gfs]
Mar 27 12:29:36 ifs4 kernel: [<f8993680>] inode_go_drop_th+0x13/0x18 [gfs]
Mar 27 12:29:36 ifs4 kernel: [<f89901f9>] rq_demote+0x79/0x95 [gfs]
Mar 27 12:29:36 ifs4 kernel: [<f89902b4>] run_queue+0x56/0xbb [gfs]
Mar 27 12:29:36 ifs4 kernel: [<f89903d6>] unlock_on_glock+0x1f/0x29 [gfs]
Mar 27 12:29:36 ifs4 kernel: [<f899232a>] gfs_reclaim_glock+0xbf/0x138 [gfs]
Mar 27 12:29:36 ifs4 kernel: [<f8986682>] gfs_glockd+0x3b/0xe3 [gfs]
Mar 27 12:29:36 ifs4 kernel: [<c0100ed9>] kernel_thread_helper+0x5/0xb
Mar 27 12:29:36 ifs4 kernel: Code: 73 34 ff 73 2c ff 73 08 ff 73 04 ff 73 0c 56 8b 03 ff 70 18 68 a0 a6 aa f8 e8 80 19 67 c7 83 c4 34 68 89 9d aa f8 e8 73 19 67 c7 <0f> 0b 65 01 c0 a4 aa f8 68 a0 a5 aa f8 e8 27 12 67 c7 8d 65 f8


==== FS2 ====
Mar 27 12:28:25 ifs2 kernel: ------------[ cut here ]------------
Mar 27 12:28:25 ifs2 kernel: kernel BUG at /usr/src/gfs/stable_1.0.2/stable/cluster/cman-kernel/src/membership.c:3151!
Mar 27 12:28:25 ifs2 kernel: invalid opcode: 0000 [#1]
Mar 27 12:28:25 ifs2 kernel: SMP
Mar 27 12:28:25 ifs2 kernel: Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg ide_floppy ide_cd cdrom qla2xxx siimage piix e1000 gfs lock_harness dm_mod
Mar 27 12:28:25 ifs2 kernel: CPU:    0
Mar 27 12:28:25 ifs2 kernel: EIP: 0060:[<f8ac7825>] Tainted: GF VLI
Mar 27 12:28:25 ifs2 kernel: EFLAGS: 00010246   (2.6.16-rc5-sara3 #1)
Mar 27 12:28:25 ifs2 kernel: EIP is at elect_master+0x34/0x41 [cman]
Mar 27 12:28:25 ifs2 kernel: eax: f8a3e000 ebx: 00000080 ecx: 00000080 edx: 00000000
Mar 27 12:28:25 ifs2 kernel: esi: f8adb584 edi: f677bfcc ebp: f677bf7c esp: f677bf78
Mar 27 12:28:25 ifs2 kernel: ds: 007b   es: 007b   ss: 0068
Mar 27 12:28:25 ifs2 kernel: Process cman_memb (pid: 5892, threadinfo=f677a000 task=f6984030)
Mar 27 12:28:25 ifs2 kernel: Stack: <0>f63374a0 f677bf90 f8ac5430 f677bf98 00000000 f63374a0 f677bfa4 f8ac3acc
Mar 27 12:28:25 ifs2 kernel: f68dba40 00000000 f6984030 f677bfe4 f8ac3cb3 0000001f 00000000 c0102612
Mar 27 12:28:25 ifs2 kernel: 00000000 f6984030 c01124e1 00100100 00200200 00000000 00000000 00000000
Mar 27 12:28:25 ifs2 kernel: Call Trace:
Mar 27 12:28:25 ifs2 kernel: [<c0103599>] show_stack_log_lvl+0xad/0xb5
Mar 27 12:28:25 ifs2 kernel: [<c01036db>] show_registers+0x10d/0x176
Mar 27 12:28:25 ifs2 kernel: [<c01038ad>] die+0xf2/0x16d
Mar 27 12:28:25 ifs2 kernel: [<c0103996>] do_trap+0x6e/0x8a
Mar 27 12:28:25 ifs2 kernel: [<c0103bed>] do_invalid_op+0x90/0x97
Mar 27 12:28:25 ifs2 kernel: [<c010322f>] error_code+0x4f/0x54
Mar 27 12:28:25 ifs2 kernel: [<f8ac5430>] a_node_just_died+0x118/0x178 [cman]
Mar 27 12:28:25 ifs2 kernel: [<f8ac3acc>] process_dead_nodes+0x4e/0x7a [cman]
Mar 27 12:28:25 ifs2 kernel: [<f8ac3cb3>] membership_kthread+0x1bb/0x38d [cman]
Mar 27 12:28:25 ifs2 kernel: [<c0100ed9>] kernel_thread_helper+0x5/0xb
Mar 27 12:28:25 ifs2 kernel: Code: 8b 1d 44 c3 ad f8 39 d9 7d 21 a1 48 c3 ad f8 8b 14 88 85 d2 74 10 83 7a 1c 02 75 0a 8b 45 08 89 10 8b 42 14 eb 0f 41 39 d9 7c df <0f> 0b 4f 0c 60 f2 ac f8 31 c0 5b 5d c3 55 89 e5 ff 35 48 c3 ad

==== FS3 ====
Mar 27 12:22:30 ifs3 kernel: kernel BUG at /usr/src/gfs/stable_1.0.2/stable/cluster/gfs-kernel/src/dlm/lock.c:428!
Mar 27 12:22:30 ifs3 kernel: invalid opcode: 0000 [#1]
Mar 27 12:22:30 ifs3 kernel: SMP
Mar 27 12:22:30 ifs3 kernel: Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg ide_floppy ide_cd cdrom qla2xxx siimage piix e1000 gfs lock_harness dm_mod
Mar 27 12:22:30 ifs3 kernel: CPU:    0
Mar 27 12:22:30 ifs3 kernel: EIP: 0060:[<f8aa5714>] Tainted: GF VLI
Mar 27 12:22:30 ifs3 kernel: EFLAGS: 00010246   (2.6.16-rc5-sara3 #1)
Mar 27 12:22:30 ifs3 kernel: EIP is at do_dlm_lock+0x138/0x152 [lock_dlm]
Mar 27 12:22:30 ifs3 kernel: eax: 00000004 ebx: ffffffea ecx: 0000b152 edx: 00000246
Mar 27 12:22:30 ifs3 kernel: esi: eb12b4c0 edi: f5f8f180 ebp: f62edb00 esp: f62edacc
Mar 27 12:22:30 ifs3 kernel: ds: 007b   es: 007b   ss: 0068
Mar 27 12:22:30 ifs3 kernel: Process nfsd (pid: 6278, threadinfo=f62ec000 task=f62c1030)
Mar 27 12:22:30 ifs3 kernel: Stack: <0>f8aa9d89 00000001 20202020 32202020 20202020 20202020 32363820 33646565
Mar 27 12:22:30 ifs3 kernel: f62e0018 70baf030 eb12b4c0 00000008 f8b57000 f62edb34 f8aa57b9 eb12b4c0
Mar 27 12:22:30 ifs3 kernel: 00000000 eb12b4c0 00000008 ffffffff 00000003 00000003 eb12b4c0 00000000
Mar 27 12:22:30 ifs3 kernel: Call Trace:
Mar 27 12:22:30 ifs3 kernel: [<c0103599>] show_stack_log_lvl+0xad/0xb5
Mar 27 12:22:30 ifs3 kernel: [<c01036db>] show_registers+0x10d/0x176
Mar 27 12:22:30 ifs3 kernel: [<c01038ad>] die+0xf2/0x16d
Mar 27 12:22:30 ifs3 kernel: [<c0103996>] do_trap+0x6e/0x8a
Mar 27 12:22:30 ifs3 kernel: [<c0103bed>] do_invalid_op+0x90/0x97
Mar 27 12:22:30 ifs3 kernel: [<c010322f>] error_code+0x4f/0x54
Mar 27 12:22:30 ifs3 kernel: [<f8aa57b9>] lm_dlm_lock+0x4f/0x5b [lock_dlm]
Mar 27 12:22:30 ifs3 kernel: [<f899a776>] gfs_lm_lock+0x32/0x4c [gfs]
Mar 27 12:22:30 ifs3 kernel: [<f89909f8>] gfs_glock_xmote_th+0x125/0x161 [gfs]
Mar 27 12:22:30 ifs3 kernel: [<f8993625>] inode_go_xmote_th+0x20/0x25 [gfs]
Mar 27 12:22:30 ifs3 kernel: [<f89900fd>] rq_promote+0xb3/0x136 [gfs]
Mar 27 12:22:30 ifs3 kernel: [<f89902e3>] run_queue+0x85/0xbb [gfs]
Mar 27 12:22:30 ifs3 kernel: [<f8991281>] gfs_glock_nq+0xce/0x119 [gfs]
Mar 27 12:22:30 ifs3 kernel: [<f89917f0>] gfs_glock_nq_init+0x1d/0x36 [gfs]
Mar 27 12:22:30 ifs3 kernel: [<f8991858>] gfs_glock_nq_num+0x37/0x7f [gfs]
Mar 27 12:22:30 ifs3 kernel: [<f89a17d1>] gfs_get_dentry+0xa9/0x29c [gfs]
Mar 27 12:22:30 ifs3 kernel: [<c01915bc>] find_exported_dentry+0x2f/0x4e1
Mar 27 12:22:30 ifs3 kernel: [<f89a1461>] gfs_decode_fh+0xc1/0xc9 [gfs]
Mar 27 12:22:30 ifs3 kernel: [<c0193c37>] fh_verify+0x35f/0x4db
Mar 27 12:22:30 ifs3 kernel: [<c0194d5c>] nfsd_access+0x27/0xe5
Mar 27 12:22:30 ifs3 kernel: [<c019ab4f>] nfsd3_proc_access+0x95/0xa2
Mar 27 12:22:30 ifs3 kernel: [<c019229f>] nfsd_dispatch+0xbe/0x17f
Mar 27 12:22:30 ifs3 kernel: [<c02e0a52>] svc_process+0x381/0x5c7
Mar 27 12:22:30 ifs3 kernel: [<c019208c>] nfsd+0x18d/0x2e2
Mar 27 12:22:30 ifs3 kernel: [<c0100ed9>] kernel_thread_helper+0x5/0xb
Mar 27 12:22:30 ifs3 kernel: Code: 26 50 0f bf 46 24 50 53 ff 76 08 ff 76 04 ff 76 0c ff 77 18 68 e0 a6 aa f8 e8 f2 17 67 c7 83 c4 38 68 89 9d aa f8 e8 e5 17 67 c7 <0f> 0b ac 01 c0 a4 aa f8 68 a0 a5 aa f8 e8 99 10 67 c7 8d 65 f4
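
For anyone comparing traces like the ones above, here is a minimal sketch that pulls the "kernel BUG at" source location and the "EIP is at" function out of syslog-style kernel lines, one entry per BUG report. The names summarize_oopses and as_signed32 and the regular expressions are illustrative assumptions written for the log format in this thread; they are not part of any GFS, cman, or dlm tooling. as_signed32 only reinterprets a register dump as a signed 32-bit integer: ffffffea reads as -22, which is the numeric value of EINVAL on Linux, but whether that register actually holds the return value of the DLM call is an assumption.

```python
import re

# Syslog-style kernel line: "<month> <day> <time> <host> kernel: <message>".
# The month field is matched loosely so a truncated month still parses.
LINE_RE = re.compile(r"^\w+\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+kernel:\s+(?P<msg>.*)$")
BUG_RE = re.compile(r"kernel BUG at (?P<path>\S+):(?P<line>\d+)!")
EIP_RE = re.compile(
    r"EIP is at (?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def summarize_oopses(lines):
    """Return one dict per 'kernel BUG at' line, in the order seen."""
    reports = []
    latest = {}  # host -> most recent report still waiting for its EIP line
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue
        host, msg = m.group("host"), m.group("msg")
        bug = BUG_RE.search(msg)
        if bug:
            report = {
                "host": host,
                "file": bug.group("path").rsplit("/", 1)[-1],
                "line": int(bug.group("line")),
                "func": None,
                "module": None,
            }
            reports.append(report)
            latest[host] = report
            continue
        eip = EIP_RE.search(msg)
        if eip and host in latest and latest[host]["func"] is None:
            latest[host]["func"] = eip.group("func")
            latest[host]["module"] = eip.group("module")
    return reports


def as_signed32(hex_value):
    """Interpret a 32-bit register dump such as 'ffffffea' as a signed int."""
    value = int(hex_value, 16)
    return value - (1 << 32) if value & 0x80000000 else value
```

Feeding it the lines of a saved kern.log (for example the result of open(path).read().splitlines(), where path is wherever the log was saved) gives a short list that makes it easier to see which assertion fired on which node.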


--
********************************************************************
*                                                                  *
*  Bas van der Vlies                     e-mail: basv sara nl      *
*  SARA - Academic Computing Services    phone:  +31 20 592 8012   *
*  Kruislaan 415                         fax:    +31 20 6683167    *
*  1098 SJ Amsterdam                                               *
*                                                                  *
********************************************************************

