[Linux-cluster] virtual address went down? (with panic link)
jason at monsterjam.org
Fri Oct 20 00:40:30 UTC 2006
I think the mailing list doesn't like attachments, so here's a link to the panic screenshot that was supposed
to go along with this post:
http://monsterjam.org/crash/panic.jpg
I tried stopping the services on the first box of my two-node cluster:
service rgmanager stop
service gfs stop
service clvmd stop
service fenced stop
service cman stop
service ccsd stop
Everything came down fine. Then I started them back up:
service ccsd start
This seemed to hang for about two minutes, and then I got a panic,
as shown in the graphic linked above.
This is on kernel 2.6.9-34.ELsmp, Red Hat Enterprise Linux AS release 4 (Nahant
Update 4), running:
ccs-1.0.3-0
cman-kernel-hugemem-2.6.9-43.8
cman-kernel-2.6.9-43.8
cman-1.0.4-0
cman-kernel-smp-2.6.9-43.8
cman-kernheaders-2.6.9-43.8
all built from source.
Here's my cluster.conf:
<?xml version="1.0"?>
<cluster config_version="22" name="progressive">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="tf1" votes="1">
            <fence>
                <method name="1">
                    <device name="apc_power_switch" option="off" port="1" switch="1"/>
                    <device name="apc_power_switch" option="off" port="2" switch="1"/>
                    <device name="apc_power_switch" option="on" port="1" switch="1"/>
                    <device name="apc_power_switch" option="on" port="2" switch="1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="tf2" votes="1">
            <fence>
                <method name="1">
                    <device name="apc_power_switch" option="off" port="3" switch="1"/>
                    <device name="apc_power_switch" option="off" port="4" switch="1"/>
                    <device name="apc_power_switch" option="on" port="3" switch="1"/>
                    <device name="apc_power_switch" option="on" port="4" switch="1"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_apc" ipaddr="192.168.1.8" login="xxx"
                     name="apc_power_switch" passwd="xxx"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="httpd" ordered="1" restricted="1">
                <failoverdomainnode name="tf1" priority="1"/>
                <failoverdomainnode name="tf2" priority="2"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <script file="/etc/init.d/httpd" name="cluster_apache"/>
            <fs device="/dev/mapper/diskarray-lv1" fstype="ext3"
                mountpoint="/mnt/gfs/htdocs" name="apache_content"/>
            <ip address="192.168.1.7" monitor_link="1"/>
        </resources>
        <service autostart="1" domain="httpd" name="Apache Service">
            <ip ref="192.168.1.7"/>
            <script ref="cluster_apache"/>
            <fs ref="apache_content"/>
        </service>
    </rm>
</cluster>
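(As a quick sanity check of a config like this, the stdlib XML parser will catch malformed markup and dangling fence-device references. This is just a sketch; the embedded snippet is a trimmed stand-in for the real file, and in practice you'd point it at /etc/cluster/cluster.conf.)

```python
# Minimal sanity check of a cluster.conf-style file using only the
# Python stdlib. The CONF string below is a trimmed stand-in for the
# real config; parse /etc/cluster/cluster.conf on an actual node.
import xml.etree.ElementTree as ET

CONF = """<?xml version="1.0"?>
<cluster config_version="22" name="progressive">
  <clusternodes>
    <clusternode name="tf1" votes="1">
      <fence><method name="1">
        <device name="apc_power_switch" option="off" port="1" switch="1"/>
      </method></fence>
    </clusternode>
    <clusternode name="tf2" votes="1">
      <fence><method name="1">
        <device name="apc_power_switch" option="off" port="3" switch="1"/>
      </method></fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_apc" ipaddr="192.168.1.8" login="xxx"
                 name="apc_power_switch" passwd="xxx"/>
  </fencedevices>
</cluster>"""

root = ET.fromstring(CONF)  # raises ParseError if the XML is malformed

# Every <device name=...> used under a node's <fence> block should match
# a <fencedevice name=...> defined in <fencedevices>.
defined = {fd.get("name") for fd in root.iter("fencedevice")}
referenced = {d.get("name") for d in root.iter("device")}
missing = referenced - defined
print("undefined fence devices:", sorted(missing))  # prints: undefined fence devices: []
```

Won't catch everything ccsd would complain about, but it flags the obvious typos (like a stray space in an attribute value) before you push the config out.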
Oh, and shortly after the first box came back up, the second one got
rebooted automatically (power-fenced by the first one, I'm guessing), for
good measure.
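(That fencing would fit the config: with two_node="1" and expected_votes="1", my understanding is that cman fixes quorum at 1, so either node is quorate on its own and will happily fence a peer it can't see. A rough sketch of the vote math, not authoritative:)

```python
# Sketch of cman-style quorum arithmetic (my reading of two_node mode,
# not an official formula): normally a cluster needs a strict majority
# of expected votes, but two_node mode special-cases quorum to 1 so a
# lone surviving node stays quorate -- and can fence the other node.
def quorum(expected_votes: int, two_node: bool) -> int:
    if two_node:
        return 1  # two-node special case: either node alone is quorate
    return expected_votes // 2 + 1  # strict majority otherwise

print(quorum(1, True))   # this cluster.conf: prints 1
print(quorum(3, False))  # ordinary 3-node cluster: prints 2
```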
Any help appreciated.
Jason
On Tue, Oct 17, 2006 at 09:37:15PM -0400, jason at monsterjam.org wrote:
> So I've had a test cluster running for quite a while now. Both nodes of the 2-node cluster are up,
> but the virtual address seems to have disappeared: it's not pingable, and neither server has it
> configured anymore. The only application I had using the virtual address was apache (just for
> testing it). What logs/information should I be looking at to see what happened and why?
>
> regards,
> Jason
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster