[Linux-cluster] GFS over AOE without fencing?

Kadlecsik Jozsi kadlec at sunserv.kfki.hu
Fri Apr 20 08:49:12 UTC 2007


On Fri, 20 Apr 2007, Jayson Vantuyl wrote:

> However, cec is not so reliable of a connection.  It is NOT TCP.  

Yes, it's true.

> I have little information about how resilient the protocol is, however, 
> in a unit we have with a bad disk, I've had the cec connection 
> spontaneously drop mid-command.  I'm sure they're working to fix this, 
> but it doesn't bode well for something as critical as fencing.  I'm also 
> unclear on whether a dropped connection generates a non-zero exit code 
> (i.e. is even detectable).

The fence_coraid script I wrote uses Expect in Perl, so if the cec 
connection fails at any point, the script detects and reports it.
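
To illustrate the idea (this is not the fence_coraid code itself, which is 
Perl with Expect), here is a minimal Python sketch of the same principle: 
treat a timeout, a non-zero exit status, or a missing expected reply as a 
failed session. The names run_checked and SessionFailed are made up for 
this example, and it assumes the console tool can be driven over plain 
pipes, which may not hold for cec.

```python
import subprocess


class SessionFailed(Exception):
    """Raised when the console session did not complete as expected."""


def run_checked(argv, script, expected, timeout=10.0):
    """Feed `script` to the child's stdin and require `expected` in its output.

    argv, script and expected are placeholders chosen by the caller; nothing
    here is specific to cec or to the Coraid command set.
    """
    try:
        proc = subprocess.run(
            argv, input=script, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        # A hung session is reported as a failure instead of blocking forever.
        raise SessionFailed("session timed out") from exc
    if proc.returncode != 0:
        raise SessionFailed(f"session exited with status {proc.returncode}")
    if expected not in proc.stdout:
        # The child exited cleanly but never printed the reply we wait for,
        # e.g. because the connection dropped mid-command.
        raise SessionFailed("expected reply not seen in session output")
    return proc.stdout
```

A fence agent built this way would report success only when run_checked 
returns, and failure on any SessionFailed.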

> Also, on APCs, the fence_apc script has the benefit that the APC 
> switches do not allow more than one concurrent telnet connection, which 
> effectively serializes fence requests.  With the cec, not so much.

This is problematic: the requests are not serialized at all, and two 
concurrent cec sessions get completely mixed: a command issued in one 
session appears in the other, letter by letter. Yes, this is a real issue.
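
One possible partial workaround, sketched below in Python, is to take an 
exclusive lock before opening a session. This is only an assumption about 
how it could be done, not something fence_coraid does: exclusive_session 
and the lock file path are hypothetical, and flock() only serializes 
sessions started from the same host. Nodes racing from different hosts 
would still need some other form of coordination.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def exclusive_session(lock_path):
    """Hold an exclusive flock() on lock_path for the duration of the block.

    lock_path is a hypothetical per-shelf lock file on the local host.
    """
    with open(lock_path, "w") as lock_file:
        # Blocks until no other local process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

The console session would then be run inside "with exclusive_session(path):" 
so that two local fence requests cannot type into the same shelf at once.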
 
> Also, this fences the entire Coraid device in a way that must be manually
> cleared if it gets left masked.  This is a real possibility where multiple
> nodes are racing to fence each other--especially on multiple Coraid shelfs (as
> it must be done per shelf).
> 
> Since we use our Coraids for non-GFS boot volumes as well, this is also
> problematic for us, since a stale mask entry keeps us from booting.

The masking blocks access to the logical blades only. The host is 
still able to connect to the Coraid box over cec and re-enable its 
access rights to the lblades.

Best regards,
Jozsef
--
E-mail : kadlec at sunserv.kfki.hu, kadlec at blackhole.kfki.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: KFKI Research Institute for Particle and Nuclear Physics
         H-1525 Budapest 114, POB. 49, Hungary

