
RE: [Linux-cluster] kernel: CMANsendmsg failed: -101



Thanks for the reply, Lon.  I'm not sure why cman would spin forever
considering that ILO is one of the configurable power fences, and it is
supposed to turn the server off in case of a network disconnect.  The
SAN stays up, by the way.

Regarding shutdowns: I'm not actually performing a shutdown - ILO is.

Thanks again,
Jeff

-----Original Message-----
From: Lon Hohberger [mailto:lhh redhat com] 
Sent: Tuesday, July 19, 2005 4:23 PM
To: linux clustering
Cc: Jeff Harr
Subject: RE: [Linux-cluster] kernel: CMANsendmsg failed: -101

On Mon, 2005-07-18 at 14:26 -0400, Eric Kerin wrote:
> On Mon, 2005-07-18 at 14:06 -0400, Jeff Harr wrote:
> > Thanks for the help, Eric.  It's interesting that you mentioned taking
> > down cman BEFORE taking the interfaces down, because when I was testing
> > my failover I did: ifdown bond0.  My thinking was that the heartbeat
> > would die and everything would work.  It didn't occur to me that it
> > would mess up cman - do you think that's what's doing it?  Should I
> > instead just pull the cables? (I'm asking because it's a long drive to
> > the site just for the test, but will if you think that's the problem).
> > 
> > Thanks again,
> > Jeff
> > 
> 
> Well, normally when a machine is fenced, it is not shut down, just
> powered off.  So that would definitely be the problem if the network
> interface is still down when you are shutting down the system.  How are
> you issuing commands to the servers after taking down the network
> interface if you're not on site?
> 
> I take it you aren't using power controllers to fence the machines, just
> using fence_manual, or an I/O fencing mechanism?

Jeff - 

Here's one way to avoid the problem.  Try this next time:

    reboot -fn

I wouldn't expect much of anything to work after the network and SAN
paths just got turned off, so I'm not surprised it hangs during
shutdown.
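
For reference, a minimal sketch of how the test from this thread could be
scripted with a dry-run guard. The names simulate_link_loss, run and DRY_RUN
are hypothetical helpers made up for illustration, not part of cman or any
cluster package. With sysvinit's reboot, -f skips the normal shutdown
sequence and -n skips the sync, so nothing waits on storage or network
paths that are already gone (and unsynced local writes may be lost).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; nothing here ships with cman.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

simulate_link_loss() {
    iface="${1:-bond0}"      # bond0 is the interface named in this thread
    run ifdown "$iface"      # drop the heartbeat path, as in the original test
    # -f: do not call shutdown; -n: do not sync before rebooting
    run reboot -fn
}
```

Calling simulate_link_loss bond0 with the default DRY_RUN=1 only prints the
two commands, which makes it safe to check the sequence before trying it on
a real node.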

(That said, CMAN probably shouldn't spin forever trying to call
sendmsg.)

-- Lon


