Andreas Dilger wrote:
> On Apr 13, 2006 10:40 -0400, Sev Binello wrote:
> [ still HTML-only email, extracting text from HTML is getting dull ]
>
> > Since it seemed to mount okay only 3mins earlier,
> > can we assume that it was initially uncorrupted ?
> > Or, is that not valid assumption ?
>
> No, at mount time there is only very cursory checking done of the
> group descriptors and superblock. The corruption reported appears
> to be from bad indirect blocks.
>
> > Is there anything that we can check, test etc...
> > any advice, action at this point is better than waiting for the
> > next filesystem disaster to occur.
>
> Do you run with write cache enabled on your device? That can
> potentially cause filesystem corruption even in the face of ext3
> journaling, because the journal atomicity guarantees are lost when
> the device reports a write is complete on disk when it really isn't.

The raid system does run with write back cache enabled.
I don't believe the actual drives have this enabled, but I'd have to check.
But we didn't actually lose power on the raid or hosts,
just on the connecting switches, so we lost all communication.
Presumably, in this situation the controller cache should have been emptied.
Is my reasoning correct here?
Either way, you are saying it is best to avoid write caching in the future.
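
To make the write-cache concern concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the `Disk` class, the block names, and the order in which the cache destages blocks are all hypothetical, and it does not describe any particular RAID controller or the actual ext3 code. It only shows why a commit record that reaches stable storage before the data it covers breaks the journal's guarantee.

```python
class Disk:
    """Toy disk with an optional volatile write-back cache (hypothetical model)."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.platter = {}  # blocks that are durably stored
        self.cache = {}    # blocks acknowledged to the host but still volatile

    def write(self, block, data):
        # With write-back caching the write is acknowledged immediately,
        # even though the data only sits in volatile cache.
        if self.write_back:
            self.cache[block] = data
        else:
            self.platter[block] = data

    def destage(self, block):
        # The cache may push blocks to the platter in any order it likes.
        if block in self.cache:
            self.platter[block] = self.cache.pop(block)

    def lose_cache(self):
        # Power loss or controller reset: volatile contents vanish.
        self.cache.clear()


def commit_transaction(disk):
    # The journal writes the data first, then the commit record,
    # trusting each acknowledgement to mean "on stable storage".
    disk.write("journal_data", "new indirect block")
    disk.write("journal_commit", "commit")


def journal_is_consistent(disk):
    # A commit record without its data is what journal replay cannot cope with.
    return "journal_commit" not in disk.platter or "journal_data" in disk.platter


def run(write_back):
    disk = Disk(write_back)
    commit_transaction(disk)
    disk.destage("journal_commit")  # assumed: the cache flushes the commit first
    disk.lose_cache()
    return journal_is_consistent(disk)


print("write-through consistent:", run(False))
print("write-back consistent:   ", run(True))
```

In this toy model the write-through run stays consistent, while the write-back run ends with a commit record on the platter whose data never arrived.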
Also, in looking at and comparing error msgs in the log files,
I noticed that on the host where the corruption occurred,
the call to abort the journal didn't seem to actually happen for an hour.
Does that have any significance?
Mar 25 14:38:52 acnlin83 kernel: Error (-5) on journal on device 08:21
        1hr gap
Mar 25 15:39:19 acnlin83 kernel: ext3_abort called.
Mar 25 15:39:19 acnlin83 kernel: EXT3-fs abort (device sd(8,33)): ext3_journal_start: Detected aborted journal
Mar 25 15:39:19 acnlin83 kernel: Remounting filesystem read-only
Mar 25 15:39:19 acnlin83 kernel: EXT3-fs error (device sd(8,33)) in start_transaction: Journal has aborted
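
The size of the gap and the identity of the device can be checked with a short Python sketch. It assumes the year 2006 (syslog timestamps carry none; the year is taken from the date of this thread) and assumes the device field reads `08:21` as hexadecimal major:minor, which would be consistent with the `sd(8,33)` in the later messages. The helper names `syslog_time` and `decode_dev` are made up for this illustration.

```python
from datetime import datetime

# The first and second log lines from the excerpt above.
LOG_LINES = [
    "Mar 25 14:38:52 acnlin83 kernel: Error (-5) on journal on device 08:21",
    "Mar 25 15:39:19 acnlin83 kernel: ext3_abort called.",
]


def syslog_time(line, year=2006):
    # Syslog timestamps omit the year; 2006 is an assumption from the thread date.
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def decode_dev(dev):
    # Assumed format: "major:minor" with both fields in hexadecimal.
    major, minor = dev.split(":")
    return int(major, 16), int(minor, 16)


gap = syslog_time(LOG_LINES[1]) - syslog_time(LOG_LINES[0])
print("gap between journal error and abort:", gap)   # 1:00:27
print("device 08:21 decodes to:", decode_dev("08:21"))  # (8, 33)
```

Under those assumptions the journal error and the abort are 1 hour and 27 seconds apart, and both refer to the same device.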
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.

--
Sev Binello
Brookhaven National Laboratory
Upton, New York
631-344-5647
sev bnl gov