
Re: [Linux-cluster] newbie questions



> First thing to test is that you can configure the IP address manually, 
> mount the filesystem, and start apache "the old-fashioned way", using 
> the /etc/init.d/httpd script on either machine.

[root@tf1 log]# /etc/init.d/httpd start
Starting httpd: (99)Cannot assign requested address: make_sock: could not bind to address 192.168.1.7:80
no listening sockets available, shutting down
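(That failure is expected if the service IP isn't up yet -- error 99 is EADDRNOTAVAIL, the kernel refusing to bind to an address that no local interface carries. To run the manual test the way the advice above describes, the address has to be added first. Roughly, using the address and paths from the cluster.conf quoted below -- the /24 prefix and eth0 are guesses about this network:)

```
[root@tf1 ~]# ip addr add 192.168.1.7/24 dev eth0
[root@tf1 ~]# mount /dev/mapper/diskarray-lv1 /mnt/gfs/htdocs
[root@tf1 ~]# /etc/init.d/httpd start
```

(and `ip addr del 192.168.1.7/24 dev eth0`, after stopping apache and unmounting, undoes the manual setup.)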

> 
> If that works, then I'd guess your problem with the cluster service is 
> that the <ip > resource needs to be listed before the <script > 
> resource, inside the <service/> block, since apache will bomb if the IP 
> address you told it to bind to isn't present (and I assume apache is 
> configured to bind to that address).  If that's the case, then you 
> should see an error concerning it in the apache error.log.
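(For reference, the <service> block from the cluster.conf quoted below, reordered that way -- same refs, just listing ip, then fs, then script, on the assumption that rgmanager brings resources up in document order:)

```xml
<service autostart="1" domain="httpd" name="Apache Service">
        <ip ref="192.168.1.7"/>
        <fs ref="apache_content"/>
        <script ref="cluster_apache"/>
</service>
```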
> 
> As far as nothing being logged about the cluster service trying to 
> start, it SHOULD be logging in /var/log/messages, but I've seen some 
> weirdness with this in the past.  A healthy cluster node should show 
> something like this when the service starts:
> 
> Jun 22 09:36:51 knob clurgmgrd[3652]: <notice> Starting stopped service 
> maps_ip
> Jun 22 09:36:51 knob clurgmgrd: [3652]: <info> Adding IPv4 address 
> x.y.8.60 to eth0
> Jun 22 09:36:52 knob clurgmgrd[3652]: <notice> Service maps_ip started
> Jun 22 09:36:52 knob clurgmgrd[3652]: <notice> Starting stopped service 
> httpd
> Jun 22 09:36:52 knob clurgmgrd: [3652]: <info> Executing 
> /etc/init.d/httpd start
> Jun 22 09:36:54 knob httpd: httpd startup succeeded
> Jun 22 09:36:54 knob clurgmgrd[3652]: <notice> Service httpd started

Well, I see messages, but never any from clurgmgrd:
Jul  1 08:27:10 tf1 network: Setting network parameters:  succeeded 
Jul  1 08:27:10 tf1 network: Bringing up loopback interface:  succeeded 
Jul  1 08:27:14 tf1 network: Bringing up interface eth0:  succeeded 
Jul  1 08:27:19 tf1 network: Bringing up interface eth2:  succeeded 
Jul  1 08:27:19 tf1 procfgd: Starting procfgd:  succeeded 
Jul  1 08:27:24 tf1 kernel: CMAN: Waiting to join or form a Linux-cluster
Jul  1 08:27:24 tf1 ccsd[3928]: Connected to cluster infrastruture via: CMAN/SM Plugin v1.1.5 
Jul  1 08:27:24 tf1 ccsd[3928]: Initial status:: Inquorate 
Jul  1 08:27:56 tf1 kernel: CMAN: forming a new cluster
Jul  1 08:27:56 tf1 kernel: CMAN: quorum regained, resuming activity
Jul  1 08:27:56 tf1 ccsd[3928]: Cluster is quorate.  Allowing connections. 
Jul  1 08:27:56 tf1 kernel: DLM 2.6.9-41.7 (built May 22 2006 17:34:37) installed
Jul  1 08:27:56 tf1 cman: startup succeeded
Jul  1 08:27:56 tf1 lock_gulmd: no <gulm> section detected in /etc/cluster/cluster.conf succeeded
Jul  1 08:27:57 tf1 fenced: startup succeeded
Jul  1 08:27:59 tf1 clvmd: Cluster LVM daemon started - connected to CMAN
Jul  1 08:27:59 tf1 clvmd: clvmd startup succeeded
Jul  1 08:27:59 tf1 kernel: cdrom: open failed.
Jul  1 08:28:00 tf1 kernel: cdrom: open failed.
Jul  1 08:28:00 tf1 vgchange:   1 logical volume(s) in volume group "diskarray" now active
Jul  1 08:28:00 tf1 clvmd: Activating VGs: succeeded
Jul  1 08:28:00 tf1 netfs: Mounting other filesystems:  succeeded
Jul  1 08:28:00 tf1 kernel: Lock_Harness 2.6.9-49.1 (built May 22 2006 17:38:48) installed
Jul  1 08:28:00 tf1 kernel: GFS 2.6.9-49.1 (built May 22 2006 17:39:06) installed
Jul  1 08:28:00 tf1 kernel: GFS: Trying to join cluster "lock_dlm", "progressive:lv1"
Jul  1 08:28:00 tf1 kernel: Lock_DLM (built May 22 2006 17:38:50) installed
Jul  1 08:28:02 tf1 kernel: GFS: fsid=progressive:lv1.0: Joined cluster. Now mounting FS...
Jul  1 08:28:02 tf1 kernel: GFS: fsid=progressive:lv1.0: jid=0: Trying to acquire journal lock...
Jul  1 08:28:02 tf1 kernel: GFS: fsid=progressive:lv1.0: jid=0: Looking at journal...
Jul  1 08:28:03 tf1 kernel: GFS: fsid=progressive:lv1.0: jid=0: Done

I compiled/installed all this from source... I'm guessing I missed the 
clurgmgrd part. I'll go back and look.

> (I always find the concept of "starting" an IP address faintly 
> hilarious), and then you should see something like:
> 
> Jun 22 09:37:33 knob clurgmgrd: [3652]: <info> Executing 
> /etc/init.d/httpd status
> 
> every 30 seconds or so.
yeah, I never see this.

> 
> That brings me to an important point - the apache init script doesn't 
> follow whatever standard Red Hat init scripts are supposed to follow 
> (there's a thread about this that I was involved in 6-9 months back), 
> with respect to the status command.  At least, it didn't at the time, 
> maybe they've fixed it (I hope, by now).  The stop action return(s/ed) 
> non-zero (failure) if apache wasn't running.  If the cluster manager 
> thinks that service was failed, it will first try to stop it before 
> starting it.  If the apache script returns failure on the attempt to 
> stop it because it was stopped already, then the cluster manager will 
> think something's wrong and never try to start it.  The upshot of which 
> is, you have to hack the init script to make it return 0 in this 
> situation.  I took the cop-out approach of just forcing it to always 
> return 0:
> 
>  stop() {
>          echo -n $"Stopping $prog: "
>          killproc $httpd
> -        RETVAL=$?
> +        RETVAL=0 # makes cluster admin less crazy
>          echo
>          [ $RETVAL = 0 ] && rm -f ${lockfile} ${pidfile}
>  }
> 
> which should be safe enough (if killproc fails to kill it you've 
> probably got bigger problems on your hands), but could be better. 
> Someone else may have pasted a better patch on this list, check the 
> archives.
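A slightly less blunt variant -- a sketch that skips the Red Hat /etc/init.d/functions helpers entirely (the pidfile/lockfile paths are illustrative, not necessarily the stock script's): only propagate the kill status when there was actually a process to stop, and return 0 when the daemon is already down.

```shell
#!/bin/sh
# Sketch of a stop() that treats "already stopped" as success, so the
# cluster manager's stop-before-start cycle doesn't mark the service
# as failed.  Paths are illustrative.
prog=httpd
pidfile=/var/run/httpd.pid
lockfile=/var/lock/subsys/httpd

stop() {
    printf "Stopping %s: " "$prog"
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        # A live process: really stop it, and report the real status.
        kill "$(cat "$pidfile")"
        RETVAL=$?
    else
        # Nothing running: report success, not failure.
        RETVAL=0
    fi
    echo
    [ "$RETVAL" -eq 0 ] && rm -f "$lockfile" "$pidfile"
    return $RETVAL
}
```

This keeps the real exit code for genuine kill failures, so "the daemon wouldn't die" still shows up as an error instead of being papered over.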
> 
> I just checked a fresh install of httpd on an AS 4 latest box, and the 
> script is still the same.  Convenient, since httpd is the specific 
> example service used for setting up a cluster service in the Cluster 
> Suite docs.  ;-)
> 
> I hope this helps - I'll stop rambling now.
> 
> Oh, one other thing - if the filesystem is GFS, why bother 
> mounting/unmounting at all?  Just have it mounted in fstab, or make it a 
> separate cluster service if you want the extra assurance that it'll stay 
> mounted.
ooh I do have it in the fstab... that's just me not fully understanding how all this is supposed 
to work.
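(For the record, a GFS fstab entry along those lines would look roughly like this -- device and mountpoint taken from the cluster.conf below, the plain "defaults" options being an assumption:)

```
/dev/mapper/diskarray-lv1  /mnt/gfs/htdocs  gfs  defaults  0 0
```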

Jason


> >
> >
> ><?xml version="1.0"?>
> ><cluster config_version="22" name="progressive">
> >        <fence_daemon clean_start="0" post_fail_delay="0" 
> >        post_join_delay="3"/>
> >        <clusternodes>
> >                <clusternode name="tf1" votes="1">
> >                        <fence>
> >                                <method name="1">
> >                                        <device name="apc_power_switch" 
> >                                        option="off" port="1" switch="1"/>
> >                                        <device name="apc_power_switch" 
> >                                        option="off" port="2" switch="1"/>
> >                                        <device name="apc_power_switch" 
> >                                        option="on" port="1" switch="1"/>
> >                                        <device name="apc_power_switch" 
> >                                        option="on" port="2" switch="1"/>
> >                                </method>
> >                        </fence>
> >                </clusternode>
> >                <clusternode name="tf2" votes="1">
> >                        <fence>
> >                                <method name="1">
> >                                        <device name="apc_power_switch" 
> >                                        option="off" port="3" switch="1"/>
> >                                        <device name="apc_power_switch" 
> >                                        option="off" port="4" switch="1"/>
> >                                        <device name="apc_power_switch" 
> >                                        option="on" port="3" switch="1"/>
> >                                        <device name="apc_power_switch" 
> >                                        option="on" port="4" switch="1"/>
> >                                </method>
> >                        </fence>
> >                </clusternode>
> >        </clusternodes>
> >        <cman expected_votes="1" two_node="1"/>
> >        <fencedevices>
> >                <fencedevice agent="fence_apc" ipaddr="192.168.1.8" 
> >                login="apc" name="apc_power_switch" passwd="apc"/>
> >        </fencedevices>
> >        <rm>
> >                <failoverdomains>
> >                        <failoverdomain name="httpd" ordered="1" 
> >                        restricted="1">
> >                                <failoverdomainnode name="tf1" 
> >                                priority="1"/>
> >                                <failoverdomainnode name="tf2" 
> >                                priority="2"/>
> >                        </failoverdomain>
> >                </failoverdomains>
> >                <resources>
> >                        <script file="/etc/init.d/httpd" 
> >                        name="cluster_apache"/>
> >                        <fs device="/dev/mapper/diskarray-lv1" 
> >                        fstype="ext3" mountpoint="/mnt/gfs/htdocs" name="apache_content"/>
> >                        <ip address="192.168.1.7" monitor_link="1"/>
> >                </resources>
> >                <service autostart="1" domain="httpd" name="Apache 
> >                Service">
> >                        <script ref="cluster_apache"/>
> >                        <fs ref="apache_content"/>
> >                        <ip ref="192.168.1.7"/>
> >                </service>
> >        </rm>
> ></cluster>
> >
> >
> >ooh the other thing is that I had to lie about the filesystem in which it 
> >lives, it only gave me the ext2/ext3 options (I chose ext3), but it's on a 
> >gfs partition.
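(On that point: rgmanager also ships a clusterfs resource type for cluster filesystems like GFS, so rather than declaring it ext3 the resource can be hand-edited in cluster.conf to something like the following -- a sketch, reusing the device and mountpoint from the config above; the GUI tool may simply not expose it:)

```xml
<clusterfs device="/dev/mapper/diskarray-lv1" fstype="gfs"
           mountpoint="/mnt/gfs/htdocs" name="apache_content"/>
```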
> >
> >Jason
> >
> >--
> >Linux-cluster mailing list
> >Linux-cluster@redhat.com
> >https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> 

-- 
================================================
|    Jason Welsh   jason@monsterjam.org        |
| http://monsterjam.org    DSS PGP: 0x5E30CC98 |
|    gpg key: http://monsterjam.org/gpg/       |
================================================

