
[Linux-cluster] dlm problem



Hi guys,

I ran into a problem while using GFS with DLM on Fedora Core 4.
Everything works fine (the boot process, mounting the GFS volume, reading the journals)
until I try to read files from that volume. From that point on, GFS hangs.
I looked through the system logs but could not find anything relevant.
The only difference I can find from my other cluster, which is based on Ubuntu, is the content of
/proc/cluster/dlm_debug

On FC4:
[root@cluster01 ~]# cat /proc/cluster/dlm_debug
6 finished
clvmd move flags 1,0,0 ids 15,15,15
clvmd move flags 0,1,0 ids 15,20,15
clvmd move use event 20
clvmd recover event 20
clvmd add node 5
clvmd total nodes 5
clvmd rebuild resource directory
clvmd rebuilt 1 resources
clvmd purge requests
clvmd purged 0 requests
clvmd mark waiting requests
clvmd marked 0 requests
clvmd recover event 20 done
clvmd move flags 0,0,1 ids 15,20,20
clvmd process held requests
clvmd processed 0 requests
clvmd resend marked requests
clvmd resent 0 requests
clvmd recover event 20 finished
data move flags 1,0,0 ids 16,16,16
data move flags 0,1,0 ids 16,21,16
data move use event 21
data recover event 21
data add node 5
data total nodes 4
data rebuild resource directory
data rebuilt 8 resources
data purge requests
data purged 0 requests
data mark waiting requests
data marked 0 requests
data recover event 21 done
data move flags 0,0,1 ids 16,21,21
data process held requests
data processed 0 requests
data resend marked requests
data resent 0 requests
data recover event 21 finished
[root@cluster01 ~]#


And on Ubuntu:
root@web1:~# cat /proc/cluster/dlm_debug
       5
data (8066) req reply einval 660c01e6 fr 5 r 5        5
data (8066) req reply einval 660c01e6 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data (8066) req reply einval 63b4013a fr 5 r 5        5
data (8066) req reply einval 63b4013a fr 5 r 5        5
data send einval to 7
data send einval to 7
data send einval to 4
data send einval to 4
data send einval to 4
root@web1:~#
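
To compare the two dumps more easily, a small script along these lines could tally how many lines each lockspace logged and how many of them mention einval. This is only a rough sketch: summarize_dlm_debug is a made-up helper name, and it assumes that the first word of each line is the lockspace name (clvmd, data) and that the first line of the buffer is a truncated fragment that can be skipped, as it appears to be in both dumps above.

```python
from collections import Counter


def summarize_dlm_debug(text, skip_first=True):
    """Return {lockspace: (total_lines, einval_lines)} for a dlm_debug dump.

    Assumption: the first whitespace-separated word of a line is the
    lockspace name. The first line is skipped by default because the
    buffer seems to start mid-line in the dumps shown in this post.
    """
    lines = text.splitlines()
    if skip_first:
        lines = lines[1:]

    total = Counter()
    einval = Counter()
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # ignore blank lines
        lockspace = tokens[0]
        total[lockspace] += 1
        if "einval" in tokens:
            einval[lockspace] += 1

    return {name: (total[name], einval[name]) for name in total}


# A few lines copied from the two dumps above, used only as sample input.
sample = """\
       5
clvmd recover event 20 finished
data (8066) req reply einval 660c01e6 fr 5 r 5        5
data (8066) req reply einval 65fb0158 fr 5 r 5        5
data send einval to 7
data send einval to 4
"""

for name, (lines_seen, einval_seen) in sorted(summarize_dlm_debug(sample).items()):
    print(name, "lines:", lines_seen, "einval:", einval_seen)
```

On a real node the input would come from reading /proc/cluster/dlm_debug instead of the sample string.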


So I suspected a DLM locking problem and started looking for clues:
[root@cluster01 ~]# lsmod | grep dlm
lock_dlm               42084  1
lock_harness            4392  2 lock_dlm,gfs
dlm                   118220  5 lock_dlm
cman                  130208  21 lock_dlm,dlm

[root@cluster01 ~]# dmesg | grep dlm
dlm: no version for "struct_module" found: kernel tainted.
GFS: Trying to join cluster "lock_dlm", "cluster:data"

The packages that are installed are:
[root@cluster01 ~]# rpm -qa | grep dlm
dlm-kernel-2.6.11.5-20050601.152643.FC4.17
dlm-kernheaders-2.6.11.5-20050601.152643.FC4.17
dlm-devel-1.0.0-3
dlm-1.0.0-3


And the machine is:
[root@cluster01 ~]# uname -a
Linux cluster01 2.6.14-1.1653_FC4 #1 Tue Dec 13 21:32:09 EST 2005 i686 i686 i386 GNU/Linux

Right now I am not sure what the problem is. I hope someone can explain what I did wrong.

Thanks in advance for your help.

Costi.





