[Linux-cluster] SMP and GFS

Manuel Bujan bujan at isqsolutions.com
Thu Jul 14 20:57:51 UTC 2005


Hi there,

Are there any issues I should be aware of when SMP is enabled in
my kernel? What if I compile my kernel to be preemptible? Are there any known problems with that and GFS?

I am running GFS on a dual-Xeon server from Dell.
My current kernel config has:

CONFIG_SMP=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y

After running my GFS setup for a long time, I got the following error on one of our cluster servers and had to reboot it to re-establish the service:

#################################################################################
Jul 14 14:19:35 atmail-2 kernel:  2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (18044) req reply einval ae2c0092 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (31381) req reply einval bf9901e7 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (2023) req reply einval d6c30333 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 last message repeated 2 times
Jul 14 14:19:35 atmail-2 kernel: gfs001 (22381) req reply einval e03903ee fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (9779) req reply einval e0b20396 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (21318) req reply einval e3f00178 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 (12439) req reply einval e3390095 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 last message repeated 2 times
Jul 14 14:19:35 atmail-2 kernel: gfs001 (12439) req reply einval e57e033f fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 (17946) req reply einval ef3400a6 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 (6679) req reply einval f0dc0169 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 (10103) req reply einval f6a700d9 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel:  3,3 id 24a019c sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,103519 223022b 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,103519 3,3 id 223022b sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,b06ad 8d02f9 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,103519 3,3 id 223022b sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,b06ad 8d02f9 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,b06ad 3,3 id 8d02f9 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,33df5b ff9d01f5 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,33df5b 3,3 id ff9d01f5 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,1369770 fe1f02fd 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,1369770 3,3 id fe1f02fd sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,17e353 e7034e 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,17e353 3,3 id e7034e sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,17e33e ffc001a7 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,17e33e 3,3 id ffc001a7 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,b06b0 8901e1 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,b06b0 3,3 id 8901e1 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,3bdc60 1ca0351 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,3bdc60 3,3 id 1ca0351 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,ef8da ffdd0006 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,ef8da 3,3 id ffdd0006 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,40b4a 1fa012f 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,40b4a 3,3 id 1fa012f sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,136976e ff8c0371 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,136976e 3,3 id ff8c0371 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,1369832 de0060 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,1369832 3,3 id de0060 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,af9d6 690279 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,af9d6 3,3 id 690279 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,8baaf4 fffb0229 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,8baaf4 3,3 id fffb0229 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,136976f ff730126 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,136976f 3,3 id ff730126 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,17e34e 620175 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,17e34e 3,3 id 620175 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,3ce7e7 2a00002 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,3ce7e7 3,3 id 2a00002 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,1369833 ba00ae 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,1369833 3,3 id ba00ae sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,df936 2530027 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,df936 3,3 id 2530027 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,17e356 feb502d9 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,17e356 3,3 id feb502d9 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,1369874 ff93010d 3 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,13892c 1dc038b 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,1369874 3,3 id ff93010d sts -65538 0

...........

Jul 14 14:19:35 atmail-2 kernel: lock_dlm:  Assertion failed on line 411 of file /usr/src/cluster/gfs-kernel/src/dlm/lock.c
Jul 14 14:19:35 atmail-2 kernel: lock_dlm:  assertion:  "!error"
Jul 14 14:19:35 atmail-2 kernel: lock_dlm:  time = 1698417809
Jul 14 14:19:35 atmail-2 kernel: gfs001: num=2,cb81a8 err=-22 cur=3 req=5 lkf=44
Jul 14 14:19:35 atmail-2 kernel:
Jul 14 14:19:35 atmail-2 kernel: ------------[ cut here ]------------
Jul 14 14:19:35 atmail-2 kernel: kernel BUG at /usr/src/cluster/gfs-kernel/src/dlm/lock.c:411!
Jul 14 14:19:35 atmail-2 kernel: invalid operand: 0000 [#1]
Jul 14 14:19:35 atmail-2 kernel: PREEMPT SMP
Jul 14 14:19:35 atmail-2 kernel: Modules linked in: ipmi_si ipmi_devintf ipmi_msghandler autofs e1000 eepro100 mii microcode lock_dlm dlm cman gfs lock_harness dm_mod ide_disk ide_core aic7xxx aacraid megaraid_mbox megaraid_mm
Jul 14 14:19:35 atmail-2 kernel: CPU:    0
Jul 14 14:19:35 atmail-2 kernel: EIP:    0060:[<f887cfe7>]    Not tainted VLI
Jul 14 14:19:35 atmail-2 kernel: EFLAGS: 00010296   (2.6.11.6y)
Jul 14 14:19:35 atmail-2 kernel: EIP is at do_dlm_lock+0x1d7/0x1f0 [lock_dlm]
Jul 14 14:19:35 atmail-2 kernel: eax: 00000001   ebx: ffffffea   ecx: 00008000   edx: 00000202
Jul 14 14:19:35 atmail-2 kernel: esi: e360d500   edi: f7521e00   ebp: 00000001   esp: cdbfbcc4
Jul 14 14:19:35 atmail-2 kernel: ds: 007b   es: 007b   ss: 0068
Jul 14 14:19:35 atmail-2 kernel: Process virtual (pid: 7819, threadinfo=cdbfb000 task=cdb56a40)
Jul 14 14:19:35 atmail-2 kernel: Stack: f8882cc9 f73299a0 00000002 00cb81a8 00000000 ffffffea 00000003 00000005
Jul 14 14:19:35 atmail-2 kernel:        00000044 f887d730 00000000 20202020 32202020 20202020 20202020 62632020
Jul 14 14:19:35 atmail-2 kernel:        38613138 c2000018 c0319d70 e360d500 00000000 de646b08 f8c21000 f887d0d9
Jul 14 14:19:35 atmail-2 kernel: Call Trace:
Jul 14 14:19:35 atmail-2 kernel:  [<f887d730>] lock_bast+0x0/0x10 [lock_dlm]
Jul 14 14:19:35 atmail-2 kernel:  [<f887d0d9>] lm_dlm_lock+0x79/0x90 [lock_dlm]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b5aa2a>] gfs_lm_lock+0x4a/0x70 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b4f2cf>] gfs_glock_xmote_th+0xbf/0x220 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b4e6d7>] rq_promote+0xd7/0x1b0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b4e9be>] run_queue+0xce/0xe0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b4ff95>] gfs_glock_nq+0x85/0x190 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b50ab9>] nq_m_sync+0x69/0xa0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b50990>] glock_compare+0x0/0xc0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b50c62>] gfs_glock_nq_m+0x172/0x1e0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b6828e>] gfs_link+0xae/0x410 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<c0166912>] permission+0x92/0xa0
Jul 14 14:19:35 atmail-2 kernel:  [<f8b681e0>] gfs_link+0x0/0x410 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<c0169c8c>] vfs_link+0xec/0x170
Jul 14 14:19:35 atmail-2 kernel:  [<c0169e11>] sys_link+0x101/0x130
Jul 14 14:19:35 atmail-2 kernel:  [<c0162d67>] sys_stat64+0x37/0x40
Jul 14 14:19:35 atmail-2 kernel:  [<c0102919>] sysenter_past_esp+0x52/0x75
Jul 14 14:19:35 atmail-2 kernel: Code: 0c 89 54 24 10 8b 46 0c 89 44 24 08 8b 47 18 c7 04 24 a0 35 88 f8 89 44 24 04 e8 f5 d4 89 c7 c7 04 24 c9 2c 88 f8 e8 e9 d4 89 c7 <0f> 0b 9b 01 a0 33 88 f8 c7 04 24 60 34 88 f8 e8 b5 cb 89 c7 90


##########################################
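
For reference, the err=-22 in the assertion output above is a negated errno value, i.e. EINVAL, which matches the "einval" replies logged just before the crash and the ffffffea value in ebx. Below is a minimal Python sketch for decoding these fields; the helper names decode_err and as_signed32 are made up for illustration and are not part of GFS or any cluster tool.

```python
import errno
import re


def decode_err(line):
    """Return (code, symbolic name) for the err=<n> field of a log line, or None."""
    match = re.search(r"\berr=(-?\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    # Kernel code reports failures as negated errno values, so -22 means errno 22.
    return code, errno.errorcode.get(abs(code), "unknown")


def as_signed32(hex_word):
    """Interpret a 32-bit register dump such as 'ffffffea' as a signed integer."""
    value = int(hex_word, 16)
    return value - (1 << 32) if value & 0x80000000 else value


line = "gfs001: num=2,cb81a8 err=-22 cur=3 req=5 lkf=44"
print(decode_err(line))         # (-22, 'EINVAL')
print(as_signed32("ffffffea"))  # -22
```

This only confirms that the DLM request was rejected as invalid; it does not say why the request was invalid.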

Until now we had been running the following version of the GFS suite from CVS without problems:

#gfs_tool version
gfs_tool DEVEL.1112190134 (built Mar 30 2005 08:43:42)
Copyright (C) Red Hat, Inc.  2004-2005  All rights reserved.

#cman_tool version
5.0.1 config 15

#ccs_tool -V
ccs_tool DEVEL.1112190133 (built Mar 30 2005 08:43:29)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.


Any hints or recommendations would be appreciated.

Regards
Bujan