
[Linux-cluster] (no subject)



Our setup is:
* We are running GFS from the CVS stable branch on our 2.6.14.7 cluster. We updated to the
   newest CVS version today; only the mutex() calls had to be changed.
* The 4 nodes run Debian sarge.
* The 4 nodes act as NFS servers for roughly 640 client nodes.
* Brocade switch with an SGI TP9300, 4 controllers (15 TB).

We did a lot of testing with bonnie, iozone, and other tools and jobs, and we could not crash the cluster. Now that the cluster is in production, we get a lot of nfsd crashes with "EIP is at gfs_create". We saw this with our previous kernel, 2.6.14.4, and we see it with this one and the "latest" CVS stable version. The server keeps running, but the load is high and it no longer responds. If we are lucky, only one NFS
thread is gone and the rest are still up. The other nodes keep working.

Have other users experienced this kind of problem, and does anyone have a solution?


Regards,


Here is an oops message:
Unable to handle kernel NULL pointer dereference at virtual address 00000038
printing eip:
f89bf999
*pde = 37bff001
*pte = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: lock_dlm dlm cman dm_round_robin dm_multipath sg ide_floppy ide_cd cdrom qla2300 qla2xxx_conf qla2xxx firmware_class siimage piix e1000 gfs lock_harness dm_mod
CPU:    0
EIP:    0060:[<f89bf999>]    Tainted: GF     VLI
EFLAGS: 00010246   (2.6.14.7-sara1)
EIP is at gfs_create+0xa9/0x1e0 [gfs]
eax: ffffffef   ebx: ffffffef   ecx: 00000001   edx: 00000000
esi: f296e24c   edi: ebf01e18   ebp: ebf01e84   esp: ebf01df8
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 16924, threadinfo=ebf00000 task=ebe84540)
Stack: ebf01e48 f296e24c 00000001 00008180 ebf01e18 00000001 f8cb9000 dd042254 ebf01e18 ebf01e18 00000000 ebe84540 00000001 00000120 00000000 000000c2 00000000 00000001 ebf01e40 ebf01e40 ebf01e48 ebf01e48 df0bd858 ebe84540
Call Trace:
[<c0103e5f>] show_stack+0x7f/0xa0
[<c0104012>] show_registers+0x162/0x1d0
[<c0104224>] die+0xf4/0x180
[<c035f697>] do_page_fault+0x2e7/0x6b2
[<c0103b03>] error_code+0x4f/0x54
[<c016b663>] vfs_create+0x83/0xf0
[<c01b81ce>] nfsd_create_v3+0x40e/0x550
[<c01bed2d>] nfsd3_proc_create+0x11d/0x180
[<c01b2f87>] nfsd_dispatch+0xd7/0x200
[<c0353a96>] svc_process+0x536/0x670
[<c01b2d1d>] nfsd+0x1bd/0x350
[<c010127d>] kernel_thread_helper+0x5/0x18
Code: 24 08 8d 45 c4 89 54 24 0c 89 74 24 04 89 04 24 e8 1d c3 fe ff 85 c0 89 c3 0f 84 2e 01 00 00 83 f8 ef 0f 85 13 01 00 00 8b 55 14 <80> 7a 38 00 0f 88 06 01 00 00 89 7c 24 0c 31 c0 8d 55 c4 89 44
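
A rough reading of the oops above, offered as a sketch and not as a confirmed diagnosis. The short Python script below only re-derives numbers already present in the dump: it treats eax as a signed 32-bit value and decodes the ModRM byte of the faulting instruction (the bytes starting at the <80> marker in the Code: line). The helper names and the EEXIST = 17 constant (the usual Linux value) are assumptions made for illustration. If the decoding is right, the faulting instruction is a byte compare at offset 0x38 from edx, and edx is 0, which would match the reported fault address 00000038.

```python
import re

# Register line and fault address copied from the oops message.
REGS = "eax: ffffffef   ebx: ffffffef   ecx: 00000001   edx: 00000000"
FAULT_ADDR = 0x00000038
# First bytes at the faulting EIP (the <80> marker in the Code: line).
FAULT_INSN = bytes.fromhex("807a3800")
EEXIST = 17  # assumed: the usual Linux errno value for "File exists"

# i386 register numbering used in the ModRM r/m field.
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def parse_regs(line):
    """Return {name: value} for every 'name: 8-hex-digits' pair on the line."""
    pairs = re.findall(r"(\w+): ([0-9a-f]{8})", line)
    return {name: int(value, 16) for name, value in pairs}


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as two's complement."""
    return value - (1 << 32) if value & 0x80000000 else value


regs = parse_regs(REGS)
eax_signed = to_signed32(regs["eax"])

# Opcode 0x80 with ModRM reg field 7 is "cmp r/m8, imm8".
opcode, modrm, disp8 = FAULT_INSN[0], FAULT_INSN[1], FAULT_INSN[2]
mod = modrm >> 6          # 1 means an 8-bit displacement follows
reg = (modrm >> 3) & 7    # opcode extension
rm = modrm & 7            # base register index

base = REG_NAMES[rm]
effective = regs[base] + disp8

print("eax as signed int:", eax_signed, "(-EEXIST)" if eax_signed == -EEXIST else "")
print("faulting access: %#x(%%%s) = %#x" % (disp8, base, effective))
print("matches reported fault address:", effective == FAULT_ADDR)
```

Read this way, gfs_create appears to have received -EEXIST from an earlier call and then dereferenced a NULL pointer held in edx. The three bytes just before the marker (8b 55 14) load edx from 0x14(%ebp), so the pointer seems to come from the stack frame; which variable or argument that is cannot be told from the dump alone.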






--
Bas van der Vlies
basv sara nl



