[rhelv6-beta-list] Errors - OS or Hardware
Paul Krizak
paul.krizak at amd.com
Wed May 5 17:08:46 UTC 2010
That's a hardware error -- the memory controller has reported a (likely
corrected) ECC error. You can check the counters under
/sys/devices/system/edac/mc to see which DIMM is throwing the errors.
If it's just a few then it's probably OK, but if you're getting hundreds
or thousands of corrected ECC errors, then you probably have a
stuck/bad bit in one of your DIMMs.
Also check for uncorrectable errors -- if you see any of those, data is
actually getting corrupted -- bad!!!
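
If you'd rather not poke through the sysfs tree by hand, here is a rough
Python sketch that walks /sys/devices/system/edac/mc and prints the
corrected (ce_count) and uncorrected (ue_count) counters per memory
controller and per csrow. It assumes the EDAC driver for your chipset is
loaded and exposes the csrow-style layout; the function names are just
for illustration, and mapping a csrow back to a physical DIMM slot
depends on your board.

```python
from pathlib import Path

# Directory named in the message above; exists only if an EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_count(path):
    """Return the integer stored in a sysfs counter file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def edac_counts(root=EDAC_ROOT):
    """Collect corrected (ce) and uncorrected (ue) counts per controller and csrow."""
    results = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {
            "ce_count": read_count(mc / "ce_count"),
            "ue_count": read_count(mc / "ue_count"),
            "csrows": {},
        }
        # Assumption: the driver exposes csrowN directories under each controller.
        for row in sorted(mc.glob("csrow[0-9]*")):
            entry["csrows"][row.name] = {
                "ce_count": read_count(row / "ce_count"),
                "ue_count": read_count(row / "ue_count"),
            }
        results[mc.name] = entry
    return results


if __name__ == "__main__":
    for mc, info in edac_counts().items():
        print(f"{mc}: corrected={info['ce_count']} uncorrected={info['ue_count']}")
        for row, counts in info["csrows"].items():
            print(f"  {row}: corrected={counts['ce_count']} "
                  f"uncorrected={counts['ue_count']}")
```

Run it a few times over a day or so: a corrected count that keeps
climbing on one csrow points at the module behind it, and any nonzero
uncorrected count is the case to worry about.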
Paul Krizak 7171 Southwest Pkwy MS B200.3A
MTS Systems Engineer Austin, TX 78735
Advanced Micro Devices Desk: (512) 602-8775
Linux/Unix Systems Engineering Cell: (512) 791-0686
Global IT Infrastructure Fax: (512) 602-0468
On 05/05/10 12:03, William T. Trotter wrote:
> If I open a gnome terminal, then
> every few seconds, the following errors
> are reported:
>
> Message from syslogd at trotteroffice5 at May 5 12:16:07 ...
> kernel: Northbridge Error, node 0, core: -1
>
> Message from syslogd at trotteroffice5 at May 5 12:16:07 ...
> kernel:K8 ECC error.
>
> Is this an RHEL 6.0 issue or is the os telling me
> that I have a hardware problem?
>
> Thanks,
>
> Tom Trotter