[Linux-cluster] GFS/SCSI Lost

isplist at logicore.net isplist at logicore.net
Mon Nov 6 15:33:45 UTC 2006


>> Do I need a keep alive script or is there a configuration somewhere I've
>> missed? Here is a snippet from where SCSI errors started overnight.

> This one can't be blamed on GFS or the cluster infrastructure.
> The messages indicate that GFS withdrew because of underlying SCSI errors,
> which could mean a number of things underneath GFS, like flaky hardware,
> cables, etc.

OK, so this has nothing to do with GFS or the cluster, other than GFS pulling
out because of the failed storage. Thanks very much; it was not clear to me
which problem came first.

Mike


> Maybe even the storage adapter or possibly even its device driver.
> The problem is not that your mount is temporary, and you shouldn't need any
> kind of keepalive script, that I'm aware of.
 



>> Nov  5 21:16:02 qm250 kernel: SCSI error : <0 0 2 1> return code = 0x10000
>> Nov  5 21:16:02 qm250 kernel: end_request: I/O error, dev sdf, sector 655
>> Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0: fatal: I/O error
>> Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0:   block = 26
>> Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0:   function = gfs_dreread
>> Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0:   file = /home/xos/gen/updates-2006-08/xlrpm21122/rpm/BUILD/gfs-kernel-2.6.9-58/up/src/gfs/dio.c, line = 576
>> Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0:   time = 1162782962
>> Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0: about to withdraw from the cluster
>> Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0: waiting for outstanding I/O
>> Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0: telling LM to withdraw
>> Nov  5 21:16:05 qm250 kernel: lock_dlm: withdraw abandoned memory
>> Nov  5 21:16:05 qm250 kernel: GFS: fsid=vgcomp:qm.0: withdrawn
> Regards,
> 
> Bob Peterson
> Red Hat Cluster Suite
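
For anyone else unsure which problem came first in a log like the one quoted
in this thread, the ordering can be checked mechanically. The following is
only a rough sketch, not a Red Hat tool: the function name first_events and
the regular expressions are made up for illustration and are based solely on
the message shapes in the excerpt, so they may need adjusting for other
kernels or drivers.

```python
import re

# Hypothetical patterns, derived only from the messages quoted in this thread.
PATTERNS = [
    ("scsi_error", re.compile(r"kernel: SCSI error")),
    ("io_error", re.compile(r"kernel: end_request: I/O error")),
    ("gfs_fatal", re.compile(r"kernel: GFS: .*fatal:")),
    ("gfs_withdraw", re.compile(r"kernel: GFS: .*(about to withdraw|withdrawn)")),
]


def first_events(lines):
    """Return (category, line_number, text) for the first match of each
    category, sorted by the position at which it appears in the log."""
    seen = {}
    for lineno, line in enumerate(lines, start=1):
        for name, pattern in PATTERNS:
            if name not in seen and pattern.search(line):
                seen[name] = (lineno, line.strip())
    events = [(name, lineno, text) for name, (lineno, text) in seen.items()]
    return sorted(events, key=lambda event: event[1])


# Four lines taken from the log excerpt quoted in this thread.
sample = [
    "Nov  5 21:16:02 qm250 kernel: SCSI error : <0 0 2 1> return code = 0x10000",
    "Nov  5 21:16:02 qm250 kernel: end_request: I/O error, dev sdf, sector 655",
    "Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0: fatal: I/O error",
    "Nov  5 21:16:02 qm250 kernel: GFS: fsid=vgcomp:qm.0: about to withdraw from the cluster",
]

if __name__ == "__main__":
    for name, lineno, text in first_events(sample):
        print(f"{lineno:>4}  {name:<13} {text}")
```

If the SCSI or end_request entries are listed before any GFS entry, the
storage layer failed first and the GFS withdraw was a consequence of it,
which is the situation described above. To run it against a real log, pass
the lines of the file (for example, open("/var/log/messages")) instead of
the sample list.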






