
Re: [Linux-cluster] Timeout causing GFS filesystem inaccessibility



On Fri, Jun 03, 2005 at 09:48:12PM -0400, Rich Paredes wrote:
> Assumptions: 3 node cluster. 
> All 3 nodes are lock managers
> Nodes 1 and 2 mount GFS filesystems
> Node 1 during failure is master, node 2 and node 3 are slaves
> 
> Error on node 2 is:
> lock_gulmd_LT000[3608]: Timeout (15000000) on idx: 2 fd:7 (node1:192.168.101.11)
> 
> This error keeps repeating in the logs and GFS filesystem are totally
> inaccessible.  To fix, the master lock manager needs to be manually
> expired and then rebooted because applications were accessing GFS
> filesystems.
> 
> It looks like error message is generated from lock_io.c.
> 
> Does anyone know exactly what causes this error?

New sockets have a specific time slot in which they must send a valid
login packet before they are kicked out.  The message you're seeing comes
from this.  There should be a matching set of messages from node1 saying
it is trying to log into node2.  (Those messages might be suppressed, though;
you will probably need to add LoginLoops to the verbosity setting.)
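
As a rough illustration of the login time slot described above, here is a
minimal Python sketch. It is not taken from lock_io.c; the class and method
names are hypothetical, and it assumes the 15000000 in the log line is a
timeout in microseconds.

```python
import time

# Assumption: the logged value 15000000 is in microseconds (15 seconds).
LOGIN_TIMEOUT_US = 15_000_000


class PendingLogins:
    """Tracks sockets that have connected but not yet sent a valid login.

    Hypothetical illustration only; not the lock_gulmd implementation.
    """

    def __init__(self, timeout_us=LOGIN_TIMEOUT_US, clock=time.monotonic):
        self.timeout_s = timeout_us / 1_000_000
        self.clock = clock
        self.deadlines = {}  # fd -> time by which a login must arrive

    def accepted(self, fd):
        # A new socket gets a fixed window in which to log in.
        self.deadlines[fd] = self.clock() + self.timeout_s

    def logged_in(self, fd):
        # A valid login packet arrived in time; stop tracking the socket.
        self.deadlines.pop(fd, None)

    def expire(self):
        # Return (and forget) every fd whose login window has passed.
        now = self.clock()
        expired = [fd for fd, d in self.deadlines.items() if now >= d]
        for fd in expired:
            del self.deadlines[fd]
        return sorted(expired)
```

A server loop would call expire() periodically, then log a timeout and
close each returned fd.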

Those messages from node1 should provide some clues as to why the timeouts
are happening.
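
To line up the two sides by time, the timeout lines on node2 can be pulled
out of the log with something like the sketch below. The pattern is inferred
only from the single line quoted above, and the field names are my own
labels, not lock_gulmd terminology.

```python
import re

# Pattern inferred from the one quoted log line; other variants may exist.
TIMEOUT_RE = re.compile(
    r"Timeout \((?P<timeout>\d+)\) on idx: (?P<idx>\d+) fd:(?P<fd>\d+) "
    r"\((?P<node>[^:)]+):(?P<addr>[^)]+)\)"
)


def parse_timeout(line):
    """Return the fields of a login-timeout log line, or None if no match."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    return {
        "timeout": int(m["timeout"]),
        "idx": int(m["idx"]),
        "fd": int(m["fd"]),
        "node": m["node"],
        "addr": m["addr"],
    }
```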

-- 
Michael Conrad Tadpol Tilstra
For some inexplicable reason, you just wish it would rain.
