[Linux-cluster] debugging

Tue Oct 14 13:48:02 UTC 2008

OK! That was a mistake. Sorry.

Paras.

On Tue, Oct 14, 2008 at 5:00 AM, Andrew Beekhof <beekhof at gmail.com> wrote:

> You're better off asking about the (old) heartbeat resource manager on
> the heartbeat mailing list.
>
> 2008/10/14 Paras pradhan <pradhanparas at gmail.com>:
> > My ha.cf entry looks like:
> > node1:
> >
> > logfacility local0
> > keepalive 2
> > udpport 694
> > deadtime 15
> > warntime 5
> > initdead 60
> > ucast eth0 10.42.40.198
> > ucast eth0 10.42.40.26
> > auto_failback off
> > stonith_host * suicide ha1.domain.local
> > watchdog /dev/watchdog
> > debugfile /var/log/ha-debug
> > node ha1.domain.local
> > node ha2.domain.local
> >
> > node2:
> > logfacility local0
> > keepalive 2
> > udpport 694
> > deadtime 15
> > warntime 5
> > initdead 60
> > ucast eth0 10.42.40.198
> > ucast eth0 10.42.40.26
> > auto_failback off
> > stonith_host * suicide ha2.domain.local
> > watchdog /dev/watchdog
> > debugfile /var/log/ha-debug
> > node ha1.domain.local
> > node ha2.domain.local
> >
> > What does the log below, from node2, mean when I turn off eth0 on
> > node1?
> > Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: node ha1.domain.local: is dead
> > Oct 13 17:09:25 ha2 heartbeat: [6841]: info: Link ha1.domain.local:eth0 dead.
> > Oct 13 17:09:25 ha2 heartbeat: [6980]: info: Resetting node ha1.domain.local with [Suicide STONITH device]
> > Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: glib: ha2.domain.local doesn't control host [ha1.domain.local]
> > Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: Host ha1.domain.local not reset!
> > Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: Managed STONITH ha1.domain.local process 6980 exited with return code 1.
> > Oct 13 17:09:25 ha2 heartbeat: [6841]: ERROR: STONITH of ha1.domain.local failed.  Retrying...
> > Oct 13 17:09:30 ha2 heartbeat: [6981]: info: Resetting node ha1.domain.local with [Suicide STONITH device]
> > Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: glib: ha2.domain.local doesn't control host [ha1.domain.local]
> > Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: Host ha1.domain.local not reset!
> >
> >
> > I need node1 to be shut down when eth0 on node1 goes down.
> >
> >
> > Any help will be greatly appreciated.
> >
> > Paras.
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
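For anyone reading the quoted node2 log later, here is a rough Python sketch that groups those lines by severity and pulls out the failed reset attempts. The regular expression is only my guess at the line layout, based on the ten lines quoted in this thread, and the names parse and failed_resets are made up for illustration; none of this is part of heartbeat.

```python
import re
from collections import Counter

# Log lines copied from the quoted node2 log (mail-client wrapping undone).
LOG = """\
Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: node ha1.domain.local: is dead
Oct 13 17:09:25 ha2 heartbeat: [6841]: info: Link ha1.domain.local:eth0 dead.
Oct 13 17:09:25 ha2 heartbeat: [6980]: info: Resetting node ha1.domain.local with [Suicide STONITH device]
Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: glib: ha2.domain.local doesn't control host [ha1.domain.local]
Oct 13 17:09:25 ha2 heartbeat: [6980]: ERROR: Host ha1.domain.local not reset!
Oct 13 17:09:25 ha2 heartbeat: [6841]: WARN: Managed STONITH ha1.domain.local process 6980 exited with return code 1.
Oct 13 17:09:25 ha2 heartbeat: [6841]: ERROR: STONITH of ha1.domain.local failed.  Retrying...
Oct 13 17:09:30 ha2 heartbeat: [6981]: info: Resetting node ha1.domain.local with [Suicide STONITH device]
Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: glib: ha2.domain.local doesn't control host [ha1.domain.local]
Oct 13 17:09:30 ha2 heartbeat: [6981]: ERROR: Host ha1.domain.local not reset!
"""

# Assumed layout: "<Mon DD HH:MM:SS> <host> heartbeat: [<pid>]: <level>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) heartbeat: "
    r"\[(?P<pid>\d+)\]: (?P<level>\w+): (?P<msg>.*)$"
)


def parse(log_text):
    """Return one dict per line that matches the assumed layout."""
    entries = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def failed_resets(entries):
    """Return (timestamp, pid) for each 'not reset!' error."""
    return [
        (e["ts"], e["pid"])
        for e in entries
        if e["level"] == "ERROR" and e["msg"].endswith("not reset!")
    ]


entries = parse(LOG)
levels = Counter(e["level"] for e in entries)
resets = failed_resets(entries)

print(dict(levels))
for ts, pid in resets:
    print(f"{ts}: STONITH child {pid} could not reset the peer")
```

Each failed reset is preceded by the "doesn't control host" error from the same PID, so the retries five seconds apart look like the same failure repeating rather than a new problem.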
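A second sketch, again in Python and again only an illustration: it reads the node2 ha.cf from the quoted message and lists which cluster nodes are not named by the stonith_host line. I am assuming the directive layout is "stonith_host <hostfrom> <device type> <parameters...>", and that the suicide device's parameter names the only host it can reset; the second assumption is my reading of the "doesn't control host" error in the log, not something confirmed in this thread. The helper name parse_ha_cf is hypothetical.

```python
# ha.cf for node2, copied from the quoted message.
HA_CF_NODE2 = """\
logfacility local0
keepalive 2
udpport 694
deadtime 15
warntime 5
initdead 60
ucast eth0 10.42.40.198
ucast eth0 10.42.40.26
auto_failback off
stonith_host * suicide ha2.domain.local
watchdog /dev/watchdog
debugfile /var/log/ha-debug
node ha1.domain.local
node ha2.domain.local
"""


def parse_ha_cf(text):
    """Collect directives into a dict: name -> list of argument lists."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, *args = line.split()
        directives.setdefault(name, []).append(args)
    return directives


cfg = parse_ha_cf(HA_CF_NODE2)
nodes = [args[0] for args in cfg["node"]]

# Assumed layout: stonith_host <hostfrom> <device type> <device parameters...>
hostfrom, device, *params = cfg["stonith_host"][0]

# Assumption: a node not named in the device parameters cannot be reset by it.
uncovered = [n for n in nodes if n not in params]

print(f"device '{device}' (usable from '{hostfrom}') names {params}")
print(f"nodes not named by the device: {uncovered}")
```

If that reading is right, node2's suicide device only names ha2.domain.local, which would match the log: ha2 tries to reset ha1 with a device that can only act on ha2 itself.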