
Re: [Linux-cluster] Cluster node hangs



Hi Dominic,

 

Below is my cluster.conf:

===================================

<?xml version="1.0"?>

<cluster alias="rhel5_cluster" config_version="21" name="rhel5_cluster">

        <fence_daemon post_fail_delay="0" post_join_delay="3"/>

        <clusternodes>

                <clusternode name="rhel5cln1.home.com" nodeid="1" votes="1">

                        <fence>

                                <method name="1">

                                        <device name="manual_fence" nodename="rhel5cln1.home.com"/>

                                </method>

                        </fence>

                </clusternode>

                <clusternode name="rhel5cln2.home.com" nodeid="2" votes="1">

                        <fence>

                                <method name="1">

                                        <device name="manual_fence" nodename="rhel5cln2.home.com"/>

                                </method>

                        </fence>

                </clusternode>

        </clusternodes>

        <cman expected_votes="1" two_node="1"/>

        <fencedevices>

                <fencedevice agent="fence_manual" name="manual_fence"/>

        </fencedevices>

        <rm log_level="7" log_facility="local3">

                <failoverdomains/>

                <resources>

                        <script file="/usr/local/httpd2.2.16/bin/apachectl" name="Apache_Script"/>

                        <ip address="192.168.30.137" monitor_link="1"/>

                        <clusterfs device="/dev/sdc" force_unmount="0" fsid="22440" fstype="gfs2" mountpoint="/usr/local/httpd2.2.16/htdocs/" name="gfs2share" options=""/>

                </resources>

                <service autostart="1" name="Apache_Service" recovery="restart">

                        <ip ref="192.168.30.137"/>

                        <script ref="Apache_Script"/>

                </service>

                <service autostart="1" name="gfs2share" recovery="relocate">

                        <clusterfs ref="gfs2share"/>

                </service>

        </rm>

        <logging to_syslog="yes" to_logfile="yes" syslog_facility="local3">

                <logging_daemon name="corosync" logfile="/var/log/cluster.log"/>

        </logging>

</cluster>

=================================

 

One thing I noticed is that when I move the service to the other node, it generates the following logs:

 

Feb 20 21:50:48 rhel5cln1 clurgmgrd[13764]: <notice> Stopping service service:gfs2share

Feb 20 21:50:48 rhel5cln1 clurgmgrd: [13764]: <debug> Not umounting /dev/sdc (clustered file system)

Feb 20 21:50:48 rhel5cln1 clurgmgrd[13764]: <notice> Service service:gfs2share is stopped

 

The cluster is configured so that only one node should mount the GFS2 file system. When I start the cluster, only one node mounts GFS2. However, when the service is moved, GFS2 ends up mounted on both nodes, although it is still accessible. It hangs when the owner node goes down and the services move to the other node automatically.
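
The "Not umounting /dev/sdc (clustered file system)" debug line is consistent with force_unmount="0" on the clusterfs resource: as far as I understand rgmanager, a clusterfs resource stays mounted when its service stops unless force_unmount is enabled, which would explain why both nodes end up with the file system mounted after a relocation. Below is a minimal Python sketch that lists each clusterfs resource and its force_unmount setting. The function name and the trimmed inline sample are illustrative only, and treating "1" or "yes" as enabled is an assumption.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <rm> section from the cluster.conf above (illustrative sample).
SAMPLE_CONF = """
<cluster alias="rhel5_cluster" config_version="21" name="rhel5_cluster">
  <rm>
    <resources>
      <clusterfs device="/dev/sdc" force_unmount="0" fsid="22440" fstype="gfs2"
                 mountpoint="/usr/local/httpd2.2.16/htdocs/" name="gfs2share" options=""/>
    </resources>
  </rm>
</cluster>
"""


def clusterfs_unmount_settings(conf_xml):
    """Return (name, device, force_unmount_enabled) for each clusterfs resource."""
    root = ET.fromstring(conf_xml)
    settings = []
    for fs in root.findall("./rm/resources/clusterfs"):
        # Assumption: "1" or "yes" means enabled; anything else means disabled.
        flag = fs.get("force_unmount", "0").strip().lower()
        settings.append((fs.get("name"), fs.get("device"), flag in ("1", "yes")))
    return settings


if __name__ == "__main__":
    for name, device, forced in clusterfs_unmount_settings(SAMPLE_CONF):
        state = "unmounted on service stop" if forced else "left mounted on service stop"
        print(f"{name} ({device}): {state}")
```

To check a live node, the same function could be given the contents of /etc/cluster/cluster.conf instead of the inline sample.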

 

 

From: linux-cluster-bounces redhat com [mailto:linux-cluster-bounces redhat com] On Behalf Of dOminic
Sent: Sunday, February 13, 2011 8:03 PM
To: linux clustering
Subject: Re: [Linux-cluster] Cluster node hangs

 

Hi,

 

What message are you getting in the logs? It would be great if you could attach the log messages along with cluster.conf.

 

-dominic 

 

On Sun, Feb 13, 2011 at 3:49 PM, Sachin Bhugra <sachinbhugra hotmail com> wrote:

Thanks for the reply and the link. However, GFS2 is not listed in fstab; it is handled only by the cluster config.


Date: Sun, 13 Feb 2011 10:52:51 +0100
From: ekuric redhat com
To: linux-cluster redhat com
Subject: Re: [Linux-cluster] Cluster node hangs



On 02/13/2011 10:41 AM, Elvir Kuric wrote:

On 02/13/2011 10:14 AM, Sachin Bhugra wrote:

Hi ,

I have set up a two-node cluster in a lab with VMware Server, and hence used manual fencing. It includes an iSCSI GFS2 partition and serves Apache in Active/Passive mode.

The cluster works and I am able to relocate the service between nodes with no issues. However, the problem comes when, for testing, I shut down the node that is currently holding the service. When the node becomes unavailable, the service gets relocated and the GFS partition gets mounted on the other node, but it is not accessible. If I try to run "ls" or "du" on the GFS partition, the command hangs. Meanwhile, the node that was shut down gets stuck at "unmounting file system".

I tried using fence_manual -n nodename and then fence_ack_manual -n nodename; however, the behavior remains the same.
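
As I understand it, GFS2 recovery for a failed node does not proceed until that node has been fenced, so with manual fencing the surviving node can block until the fence is acknowledged. Below is a small Python sketch, under that assumption, that maps each node in a cluster.conf to the fence agent it uses, to confirm which nodes depend on fence_manual. The function name is hypothetical and the inline sample is a trimmed copy of the configuration posted in this thread.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fencing parts of the posted cluster.conf (illustrative sample).
SAMPLE_CONF = """
<cluster alias="rhel5_cluster" config_version="21" name="rhel5_cluster">
  <clusternodes>
    <clusternode name="rhel5cln1.home.com" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="manual_fence" nodename="rhel5cln1.home.com"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rhel5cln2.home.com" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="manual_fence" nodename="rhel5cln2.home.com"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_manual" name="manual_fence"/>
  </fencedevices>
</cluster>
"""


def fence_agents_by_node(conf_xml):
    """Map each cluster node name to the list of fence agents it references."""
    root = ET.fromstring(conf_xml)
    # Resolve fence device names (e.g. "manual_fence") to agents (e.g. "fence_manual").
    agents = {
        fd.get("name"): fd.get("agent")
        for fd in root.findall("./fencedevices/fencedevice")
    }
    result = {}
    for node in root.findall("./clusternodes/clusternode"):
        devices = node.findall("./fence/method/device")
        result[node.get("name")] = [agents.get(d.get("name"), "unknown") for d in devices]
    return result


if __name__ == "__main__":
    for node, node_agents in fence_agents_by_node(SAMPLE_CONF).items():
        print(node, "->", ", ".join(node_agents))
```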

Can someone please help me understand what I am doing wrong?

Thanks,


--
Linux-cluster mailing list
Linux-cluster redhat com
https://www.redhat.com/mailman/listinfo/linux-cluster

It would be good to see the /etc/fstab configuration used on the cluster nodes. If the /gfs partition is mounted manually, it will not be unmounted correctly when you restart the node without running umount first, and the node will hang during the shutdown/reboot process.
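
A quick way to answer the fstab question is to scan the file for gfs or gfs2 entries. The Python sketch below does that on fstab-style text. The function name and the sample fstab contents are hypothetical and are not taken from the poster's nodes.

```python
def gfs_fstab_entries(fstab_text):
    """Return (device, mountpoint, fstype) for every gfs/gfs2 line in fstab text."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fstab fields: device, mountpoint, fstype, options, dump, pass.
        if len(fields) >= 3 and fields[2] in ("gfs", "gfs2"):
            entries.append((fields[0], fields[1], fields[2]))
    return entries


# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
# /etc/fstab
/dev/VolGroup00/LogVol00 /         ext3 defaults         1 1
/dev/sdc                 /mnt/gfs2 gfs2 defaults,noatime 0 0
"""

if __name__ == "__main__":
    for device, mountpoint, fstype in gfs_fstab_entries(SAMPLE_FSTAB):
        print(f"{device} mounted at {mountpoint} as {fstype}")
```

On a real node, the text would come from reading /etc/fstab; an empty result means no GFS/GFS2 file system is mounted through fstab.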

More at:  http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Global_File_System_2/index.html


Edit: above link, section 3.4 Special Considerations when Mounting GFS2 File Systems



Regards,

Elvir

 

 

