
Re: is this hardware failure ??



Gregory Machin wrote:
Hi my server hung, and when I checked the logs there's lots of nasty
looking entries ... Are these hardware failure and if so what hardware
?

Oct 31 15:44:47 server kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Oct 31 15:44:47 server kernel: ata1.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
Oct 31 15:44:47 server kernel:          res 51/04:00:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
Oct 31 15:44:47 server kernel: ata1.00: configured for UDMA/133
Oct 31 15:44:47 server kernel: ata1: EH complete
Oct 31 15:44:47 server kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Oct 31 15:44:47 server kernel: ata1.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
Oct 31 15:44:47 server kernel:          res 51/04:00:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)

As a layman, I think it doesn't look good. I do not like those words, "device error."

On the basis that I had what looked like terminated errors on my laptop yesterday (could not read _any_ files) but it seems okay after cycling power, I suggest you shut down and turn the power off at the wall.

After a minute - no more - restart the thing and run smartctl against all the ATA/SATA drives.
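
To make the smartctl step concrete, here is a rough Python sketch that builds one smartctl call per drive and, if asked, runs them. The function names and the /dev/sd[a-z] device pattern are my own assumptions for illustration (older PATA setups may show up as /dev/hd? instead); -H and -A are the smartctl options for the overall health verdict and the attribute table.

```python
import glob
import subprocess


def find_drives(pattern="/dev/sd[a-z]"):
    """Return whole-disk device nodes matching the (assumed) naming pattern."""
    return sorted(glob.glob(pattern))


def build_commands(devices):
    """One smartctl invocation per drive: -H = health verdict, -A = attributes."""
    return [["smartctl", "-H", "-A", dev] for dev in devices]


def run_checks(devices):
    """Run each command and return {device: (exit_status, output)}.

    smartctl's exit status is a bitmask, so a non-zero value is not
    treated as a failure to run; it is passed back for inspection.
    """
    results = {}
    for cmd in build_commands(devices):
        done = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[cmd[-1]] = (done.returncode, done.stdout)
    return results
```

Run as root, something like run_checks(find_drives()) would collect the output for every matching drive; nothing is executed until run_checks is called.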


And make sure of your backups, you may need a really good one RSN.






Oct 31 15:44:47 server kernel: ata1.00: configured for UDMA/133
Oct 31 15:44:47 server kernel: ata1: EH complete
Oct 31 15:44:48 server kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Oct 31 15:44:48 server kernel: ata1.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0
Oct 31 15:44:48 server kernel:          res 51/04:00:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error)
Oct 31 15:44:48 server kernel: ata1.00: configured for UDMA/133
Oct 31 15:44:48 server kernel: ata1: EH complete
Oct 31 15:44:48 server kernel: ata1.00: exception Emask 0x0 SAct 0x0

I don't know what that word "exception" means in this context. I'm familiar with it on IBM mainframes, where "unit exception" means "end of file": it's what tapes report when they read a tape mark, what disk drives say when they read a zero-length block (IBM drives historically are not sectored at all), and what card readers say when they reach the end of the deck and the operator's pressed the appropriate button. In that context, a device error might be reported as "unit check."
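
For what it's worth, the "cmd" and "res" lines can be picked apart a little. As far as I understand the kernel's format, the first two bytes of "cmd" are the ATA command and feature registers, and the first two bytes of "res" are the status and error registers the drive returned. The sketch below decodes just those bytes; the lookup tables are my own partial transcription of the ATA command set (not something from the logs), and the function names are made up for the example.

```python
import re

# Partial tables, assumed from the ATA command set; not exhaustive.
COMMANDS = {0xB0: "SMART"}
SMART_FEATURES = {
    0xD0: "READ DATA",
    0xD8: "ENABLE OPERATIONS",
    0xD9: "DISABLE OPERATIONS",
    0xDA: "RETURN STATUS",
}
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def first_two_bytes(field):
    """Parse the leading 'xx/yy' hex pair of a cmd or res field."""
    m = re.match(r"([0-9a-fA-F]{2})/([0-9a-fA-F]{2})", field)
    if m is None:
        raise ValueError("not a taskfile field: %r" % field)
    return int(m.group(1), 16), int(m.group(2), 16)


def bit_names(value, table):
    """List the names of the known bits set in value."""
    return [name for bit, name in table.items() if value & bit]


def decode(cmd_field, res_field):
    """Name the command/feature sent and the status/error bits returned."""
    command, feature = first_two_bytes(cmd_field)
    status, error = first_two_bytes(res_field)
    if command == 0xB0:
        feature_name = SMART_FEATURES.get(feature, hex(feature))
    else:
        feature_name = hex(feature)
    return {
        "command": COMMANDS.get(command, hex(command)),
        "feature": feature_name,
        "status": bit_names(status, STATUS_BITS),
        "error": bit_names(error, ERROR_BITS),
    }
```

Fed the two fields from the log, decode("b0/da:00:00:4f:c2/00:00:00:00:00/00", "51/04:00:00:4f:c2/00:00:00:00:00/00") names the command as SMART RETURN STATUS and the result as DRDY and ERR with ABRT set. Read that way, the drive appears to have aborted a SMART status query rather than a read or write, which by itself does not settle whether the hardware is failing.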



--

Cheers
John

-- Advice
http://webfoot.com/advice/email.top.php
http://www.catb.org/~esr/faqs/smart-questions.html
http://support.microsoft.com/kb/555375

Please do not reply off-list

