Re: [Linux-cluster] really reliable?

On Tue, 2009-04-14 at 22:52 +0100, Gordan Bobic wrote:
> David Teigland wrote:
> >> Even when I try to reboot the nodes, I can't because the whole system 
> >> hangs on various processes that don't ever shut down.  I have to 
> >> physically reboot these boxes.
> > 
> > If something has gone wrong, it's often impossible to shut down without a hard
> > reboot.  Even when things are working, rebooting can be a delicate task
> > because the system may easily be configured to stop things in the wrong order,
> > and one thing out of place can cause a wreck.
> If things don't come down in the correct order using the standard init 
> scripts, you should file a bug report about this. I've never seen it 
> happen on any of my clusters.

I believe he means that the administrator makes a configuration change to
the init scripts which results in shutdown not working.  RHCS amplifies
this problem somewhat, so more care must be taken in these situations.


