Thanks for the suggestion. It seems reasonable; unfortunately, on an operational system it means a lot of downtime, but we end up there anyway.

Thanks
-Sev

Martial Herbaut wrote:
>>> But we didn't actually lose power on the RAID or hosts, just the
>>> connecting switches, so we lost all communication. Presumably, in
>>> this situation the controller cache should have been emptied.
>>> Is my reasoning correct here?
>>
>> Correct. If your RAID has write-back (w/b) cache enabled, but is
>> battery backed, you should be OK. Beyond this, I'm not sure what
>> else you can look at.
>
> I don't mean to barge in; however, I have seen similar corruption
> happen in the past where the fabric went away momentarily (for
> example, unplugging and replugging a fibre cable on a
> non-dualpath/failover setup) but the host was not killed or rebooted.
> From memory, the corruption was not immediately apparent and only
> became so later. I think the best thing to do in that scenario is to
> force a reboot of the host and then force an fsck, as opposed to
> continuing on and hoping for the best.
>
> Martial Herbaut
> ---------------
> Server101.com
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users redhat com
> https://www.redhat.com/mailman/listinfo/ext3-users

--
Sev Binello
Brookhaven National Laboratory
Upton, New York
631-344-5647
sev bnl gov