Re: [linux-lvm] Server Crashes Sometimes

On Sun, 3 Apr 2011, Jonathan Tripathy wrote:

> still respond to ping), and dmesg is flooded with
>
> Uhhuh. NMI received. Dazed and confused, but trying to continue
> You probably have a hardware problem with your RAM chips
>
> This issue is very rare, and has only happened to me maybe 3 times over the
> past 7 months, each time being when I issued an LVM command
>
> Has anybody experienced this before?

Yes.  The message means just what it says.  NMI (non-maskable interrupt) is
a hardware interrupt usually reserved for machine errors such as an
uncorrectable memory error.  In my case, it was a defective PCI card (with
USB ports) raising the NMI, which I determined by process of elimination.
Note that many system buses, including PCI, have error checking and will
raise NMI on failure.
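
If you want to narrow it down the same way, one thing that may help is
watching the NMI counters the kernel keeps in /proc/interrupts (on x86 there
is a row labelled "NMI:" with one count per CPU).  Below is a minimal Python
sketch, not anything from the LVM tools; the function name nmi_counts is my
own.  It only parses the text, so you would read the file before and after
running the LVM command and compare.  Bear in mind the counts can also go up
in normal operation if the NMI watchdog or perf profiling is enabled, so an
increase is a hint, not proof.

```python
from typing import List, Optional


def nmi_counts(interrupts_text: str) -> Optional[List[int]]:
    """Return the per-CPU NMI counts found in /proc/interrupts-style text.

    Returns None if the text has no "NMI:" row (hypothetical helper,
    written for illustration only).
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description text
                counts.append(int(field))
            return counts
    return None
```

To use it, pass in the file contents, e.g.
nmi_counts(open("/proc/interrupts").read()), once before and once after the
command, and see whether any CPU's count changed.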

I have heard of hardware that raises NMI in normal operation as a kind
of highest-priority interrupt.  However, such hardware is, in practice,
as good as broken.

