
[Linux-cluster] cluster issues - configuration OK?



I have a two-node cluster on RHEL 6.3.  It serves three NFS mounts and a Postgres 9.0 database.  The database uses a GFS2 disk and the NFS mount points are ext4.  I can't seem to fail the services over between nodes without a disable/enable.  Apart from that issue, please look over my config and let me know where it can be improved in general.  Here's a log of my attempt to relocate postgres from one node to the other:

Aug 26 10:50:35 omadvnfs01c rgmanager[9149]: Stopping service service:postgresql90
Aug 26 10:50:35 omadvnfs01c rgmanager[19756]: [ip] Removing IPv4 address 10.198.1.112/24 from bond0
Aug 26 10:50:35 omadvnfs01c avahi-daemon[6596]: Withdrawing address record for 10.198.1.112 on bond0.
Aug 26 10:50:35 omadvnfs01c rsyslogd-2177: imuxsock begins to drop messages from pid 5431 due to rate-limiting
Aug 26 10:50:45 omadvnfs01c rsyslogd-2177: imuxsock lost 270 messages from pid 5431 due to rate-limiting
Aug 26 10:50:45 omadvnfs01c rgmanager[20118]: [script] Executing /etc/init.d/postgresql-9.0 stop
Aug 26 10:50:45 omadvnfs01c postgres[18312]: [2-1] LOG:  received fast shutdown request
Aug 26 10:50:45 omadvnfs01c postgres[18312]: [3-1] LOG:  aborting any active transactions
Aug 26 10:50:45 omadvnfs01c postgres[19284]: [10-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19207]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19102]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19100]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19099]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19141]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19142]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19072]: [2-1] LOG:  autovacuum launcher shutting down
Aug 26 10:50:45 omadvnfs01c postgres[19138]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19137]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19139]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19134]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19110]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19136]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19098]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19101]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19140]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19135]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:45 omadvnfs01c postgres[19133]: [2-1] FATAL:  terminating connection due to administrator command
Aug 26 10:50:46 omadvnfs01c rsyslogd-2177: imuxsock begins to drop messages from pid 5431 due to rate-limiting
Aug 26 10:50:55 omadvnfs01c nrpe[20652]: Error: Could not complete SSL handshake. 5
Aug 26 10:50:55 omadvnfs01c rsyslogd-2177: imuxsock lost 352 messages from pid 5431 due to rate-limiting
Aug 26 10:50:57 omadvnfs01c rsyslogd-2177: imuxsock begins to drop messages from pid 5431 due to rate-limiting
Aug 26 10:51:05 omadvnfs01c rsyslogd-2177: imuxsock lost 32 messages from pid 5431 due to rate-limiting
Aug 26 10:51:15 omadvnfs01c rsyslogd-2177: imuxsock begins to drop messages from pid 5431 due to rate-limiting
Aug 26 10:51:24 omadvnfs01c rsyslogd-2177: imuxsock lost 212 messages from pid 5431 due to rate-limiting
Aug 26 10:51:27 omadvnfs01c rsyslogd-2177: imuxsock begins to drop messages from pid 5431 due to rate-limiting
Aug 26 10:51:45 omadvnfs01c rsyslogd-2177: imuxsock lost 38 messages from pid 5431 due to rate-limiting
Aug 26 10:51:46 omadvnfs01c rsyslogd-2177: imuxsock begins to drop messages from pid 5431 due to rate-limiting
Aug 26 10:51:46 omadvnfs01c rgmanager[22393]: [script] script:postgresql90-init: stop of /etc/init.d/postgresql-9.0 failed (returned 1)
Aug 26 10:51:46 omadvnfs01c rgmanager[9149]: stop on script "postgresql90-init" returned 1 (generic error)
Aug 26 10:51:46 omadvnfs01c rgmanager[22492]: [fs] unmounting /data03
Aug 26 10:51:46 omadvnfs01c rgmanager[22533]: [fs] Sending SIGTERM to processes on /data03
Aug 26 10:51:52 omadvnfs01c rsyslogd-2177: imuxsock lost 248 messages from pid 5431 due to rate-limiting
Aug 26 10:51:52 omadvnfs01c rgmanager[22636]: [fs] unmounting /data03
Aug 26 10:51:52 omadvnfs01c rgmanager[22677]: [fs] Sending SIGKILL to processes on /data03
Aug 26 10:51:55 omadvnfs01c rsyslogd-2177: imuxsock begins to drop messages from pid 5431 due to rate-limiting
Aug 26 10:51:57 omadvnfs01c rgmanager[23435]: [fs] unmounting /data03
Aug 26 10:51:58 omadvnfs01c rsyslogd-2177: imuxsock lost 344 messages from pid 5431 due to rate-limiting
Aug 26 10:51:58 omadvnfs01c rgmanager[9149]: #12: RG service:postgresql90 failed to stop; intervention required
Aug 26 10:51:58 omadvnfs01c rgmanager[9149]: Service service:postgresql90 is failed
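
To make the excerpt above easier to read past the rsyslogd rate-limiting noise, here is a minimal sketch that keeps only the rgmanager lines and then only those mentioning a failure. The regular expression assumes the syslog layout shown in the excerpt; the function names rgmanager_events and failures are made up for illustration and are not part of any cluster tooling.

```python
import re

# Assumed layout, taken from the excerpt above:
# "Aug 26 10:51:46 omadvnfs01c rgmanager[22393]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<prog>[\w.-]+)(?:\[(?P<pid>\d+)\])?:\s(?P<msg>.*)$"
)


def rgmanager_events(lines):
    """Yield (timestamp, message) for every line logged by rgmanager."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("prog") == "rgmanager":
            yield m.group("ts"), m.group("msg")


def failures(events):
    """Keep only the events whose message mentions a failure."""
    return [(ts, msg) for ts, msg in events if "fail" in msg.lower()]


# A few lines copied from the log above.
sample = [
    "Aug 26 10:50:35 omadvnfs01c rgmanager[9149]: Stopping service service:postgresql90",
    "Aug 26 10:50:45 omadvnfs01c rsyslogd-2177: imuxsock lost 270 messages from pid 5431 due to rate-limiting",
    "Aug 26 10:51:46 omadvnfs01c rgmanager[22393]: [script] script:postgresql90-init: stop of /etc/init.d/postgresql-9.0 failed (returned 1)",
    "Aug 26 10:51:58 omadvnfs01c rgmanager[9149]: Service service:postgresql90 is failed",
]

if __name__ == "__main__":
    for ts, msg in failures(rgmanager_events(sample)):
        print(ts, msg)
```

Run against the full log, this should reduce the output to the init script stop returning 1 and the service entering the failed state.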

Here is my cluster.conf:


<?xml version="1.0"?>
<cluster config_version="166" name="omadvnfs01">
        <cman expected_votes="1" two_node="1"/>
        <clusternodes>
                <clusternode name="omadvnfs01c.sec.jel.lc" nodeid="1">
                        <fence>
                                <method name="drac">
                                        <device name="omadvnfs01c-drac"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="omadvnfs01b.sec.jel.lc" nodeid="2">
                        <fence>
                                <method name="drac">
                                        <device name="omadvnfs01b-drac"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <fencedevices>
                <fencedevice agent="fence_drac5" ipaddr="10.98.1.213" login="root" module_name="omadvnfs01c" name="omadvnfs01c-drac" passwd="narf" secure="on"/>
                <fencedevice agent="fence_drac5" ipaddr="10.98.1.212" login="root" module_name="omadvnfs01b" name="omadvnfs01b-drac" passwd="narf" secure="on"/>
        </fencedevices>
        <rm>
                <resources>
                        <nfsexport name="data01a"/>
                        <nfsexport name="data01b"/>
                        <nfsexport name="data01c"/>
                        <nfsclient allow_recover="on" name="omadvdss01a" options="rw,no_root_squash,async" target="omadvdss01a"/>
                        <nfsclient allow_recover="on" name="omadvdss01b" options="rw,no_root_squash,async" target="omadvdss01b"/>
                        <nfsclient allow_recover="on" name="omadvdss01c" options="rw,no_root_squash,async" target="omadvdss01c"/>
                        <script file="/etc/init.d/postgresql-9.0" name="postgresql90-init"/>
                        <script file="/etc/init.d/postgresql-9.1" name="postgresql91-init"/>
                        <ip address="10.198.1.112" monitor_link="on" sleeptime="10"/>
                        <ip address="10.198.1.113" monitor_link="on" sleeptime="10"/>
                        <ip address="10.198.1.114" monitor_link="on" sleeptime="10"/>
                        <ip address="10.198.1.115" monitor_link="on" sleeptime="10"/>
                        <script file="/etc/init.d/postgresql-8.4" name="postgresql84-init"/>
                        <fs device="/dev/vg_data01a/lv_data01a" force_unmount="1" fsid="18521" self_fence="1" fstype="ext4" mountpoint="/data01a" name="omadvnfs01-data01a" nfslock="1" options="noatime,nodiratime,data="/>
                        <fs device="/dev/vg_data01b/lv_data01b" force_unmount="1" fsid="6623" self_fence="1" fstype="ext4" mountpoint="/data01b" name="omadvnfs01-data01b" nfslock="1" options="noatime,nodiratime,data="/>
                        <fs device="/dev/vg_data01c/lv_data01c" force_unmount="1" fsid="91523" self_fence="1" fstype="ext4" mountpoint="/data01c" name="omadvnfs01-data01c" nfslock="1" options="noatime,nodiratime,data="/>
                        <fs device="/dev/vg_data03/lv_data03" force_unmount="1" force_fsck="1" self_fence="1" fsid="15631" fstype="gfs2" mountpoint="/data03" name="omadvnfs01-data03" options=""/>
                </resources>
                <failoverdomains>
                        <failoverdomain name="fd_omadvnfs01c" nofailback="1" ordered="1" restricted="0">
                                <failoverdomainnode name="omadvnfs01c.sec.jel.lc" priority="1"/>
                                <failoverdomainnode name="omadvnfs01b.sec.jel.lc" priority="2"/>
                        </failoverdomain>
                        <failoverdomain name="fd_omadvnfs01b" nofailback="1" ordered="1" restricted="0">
                                <failoverdomainnode name="omadvnfs01b.sec.jel.lc" priority="1"/>
                                <failoverdomainnode name="omadvnfs01c.sec.jel.lc" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <service domain="fd_omadvnfs01b" name="omadvnfs01-nfs-data01b" nfslock="1" recovery="relocate">
                        <fs ref="omadvnfs01-data01b">
                                <nfsexport ref="data01b">
                                        <ip ref="10.198.1.114"/>
                                        <nfsclient ref="omadvdss01a"/>
                                        <nfsclient ref="omadvdss01b"/>
                                        <nfsclient ref="omadvdss01c"/>
                                </nfsexport>
                        </fs>
                </service>
                <service domain="fd_omadvnfs01c" name="omadvnfs01-nfs-data01a" nfslock="1" recovery="relocate">
                        <fs ref="omadvnfs01-data01a">
                                <nfsexport ref="data01a">
                                        <ip ref="10.198.1.113"/>
                                        <nfsclient ref="omadvdss01a"/>
                                        <nfsclient ref="omadvdss01b"/>
                                        <nfsclient ref="omadvdss01c"/>
                                </nfsexport>
                        </fs>
                </service>
                <service domain="fd_omadvnfs01c" name="omadvnfs01-nfs-data01c" nfslock="1" recovery="relocate">
                        <fs ref="omadvnfs01-data01c">
                                <nfsexport ref="data01c">
                                        <ip ref="10.198.1.115"/>
                                        <nfsclient ref="omadvdss01a"/>
                                        <nfsclient ref="omadvdss01b"/>
                                        <nfsclient ref="omadvdss01c"/>
                                </nfsexport>
                        </fs>
                </service>
                <service domain="fd_omadvnfs01b" name="postgresql90" recovery="relocate">
                        <ip ref="10.198.1.112"/>
                        <fs ref="omadvnfs01-data03">
                                <script ref="postgresql90-init"/>
                        </fs>
                </service>
        </rm>
        <logging debug="on" logfile="/var/log/cluster.log" logfile_priority="debug"/>
</cluster>
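
As a quick sanity check on a config like the one above, the following sketch lists every ref inside a service that has no matching entry under resources. It assumes an ip resource is referenced by its address attribute and every other resource type by its name attribute, which is how the refs in this cluster.conf appear to work; the helper names primary_key and unresolved_refs are hypothetical. The sample is a trimmed stand-in, not the real file.

```python
import xml.etree.ElementTree as ET


def primary_key(elem):
    # Assumption: <ip> is referenced by address, everything else by name.
    return elem.get("address") if elem.tag == "ip" else elem.get("name")


def unresolved_refs(conf_xml):
    """Return (service, tag, ref) for each ref with no matching resource."""
    root = ET.fromstring(conf_xml.strip())
    defined = {
        (res.tag, primary_key(res))
        for res in root.findall("./rm/resources/*")
    }
    missing = []
    for svc in root.findall("./rm/service"):
        for elem in svc.iter():
            ref = elem.get("ref")
            if ref is not None and (elem.tag, ref) not in defined:
                missing.append((svc.get("name"), elem.tag, ref))
    return missing


# Trimmed stand-in for cluster.conf; the fs resource is deliberately left out.
sample = """
<cluster config_version="1" name="demo">
  <rm>
    <resources>
      <ip address="10.198.1.112" monitor_link="on"/>
      <script file="/etc/init.d/postgresql-9.0" name="postgresql90-init"/>
    </resources>
    <service name="postgresql90">
      <ip ref="10.198.1.112"/>
      <fs ref="omadvnfs01-data03">
        <script ref="postgresql90-init"/>
      </fs>
    </service>
  </rm>
</cluster>
"""

if __name__ == "__main__":
    for service, tag, ref in unresolved_refs(sample):
        print(f"{service}: <{tag} ref={ref!r}> has no matching resource")
```

The parser needs well-formed XML, so each fs line under resources has to be a properly closed element before the real file can be fed to it.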

There's nothing of interest in my cluster.log file during the time when I attempted to relocate.