[Linux-cluster] What does "rgmanager status 139" mean?

Tue Feb 28 12:00:13 UTC 2012

On Tue, Feb 28, 2012 at 12:49 PM, emmanuel segura <emi2fast at gmail.com> wrote:
> sorry martinez
>
> but i told you to try with rg_test start NOT noop
>
> because that way we can see every action rgmanager takes when it tries to
> start the service
>

OK, start doesn't report any problem, but status does:

[root@clunode01 ~]# rg_test test /etc/cluster/cluster.conf status service splunksrv-svc
Running in test mode.
Loading resource rule from /usr/share/cluster/SAPInstance
Loading resource rule from /usr/share/cluster/nfsserver.sh
Loading resource rule from /usr/share/cluster/ocf-shellfuncs
Loading resource rule from /usr/share/cluster/lvm_by_vg.sh
Loading resource rule from /usr/share/cluster/ip.sh
Loading resource rule from /usr/share/cluster/ASEHAagent.sh
Loading resource rule from /usr/share/cluster/lvm_by_lv.sh
Loading resource rule from /usr/share/cluster/samba.sh
Loading resource rule from /usr/share/cluster/checkquorum
Loading resource rule from /usr/share/cluster/service.sh
Loading resource rule from /usr/share/cluster/apache.sh
Loading resource rule from /usr/share/cluster/svclib_nfslock
Loading resource rule from /usr/share/cluster/script.sh
Loading resource rule from /usr/share/cluster/mysql.sh
Loading resource rule from /usr/share/cluster/tomcat-6.sh
Loading resource rule from /usr/share/cluster/SAPDatabase
Loading resource rule from /usr/share/cluster/oralistener.sh
Loading resource rule from /usr/share/cluster/vm.sh
Loading resource rule from /usr/share/cluster/oracledb.sh
Loading resource rule from /usr/share/cluster/lvm.sh
Loading resource rule from /usr/share/cluster/openldap.sh
Loading resource rule from /usr/share/cluster/fence_scsi_check.pl
Loading resource rule from /usr/share/cluster/nfsclient.sh
Loading resource rule from /usr/share/cluster/postgres-8.sh
Loading resource rule from /usr/share/cluster/fs.sh
Loading resource rule from /usr/share/cluster/netfs.sh
Loading resource rule from /usr/share/cluster/named.sh
Loading resource rule from /usr/share/cluster/nfsexport.sh
Loading resource rule from /usr/share/cluster/orainstance.sh
Loading resource rule from /usr/share/cluster/clusterfs.sh
Checking status of splunksrv-svc...
<debug>  Checking 192.168.44.4, Level 10
<debug>  192.168.44.4 present on eth0
<debug>  Link for eth0: Detected
<debug>  Link detected on eth0
<debug>  Local ping to 192.168.44.4 succeeded
<info>   Executing /data/config/etc/init.d/splunksrv-cluster status
+ . /etc/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
++ export PATH
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -f /etc/sysconfig/i18n -a -z '' -a -z '' ']'
++ . /etc/profile.d/lang.sh
++ unset LANGSH_SOURCED
++ '[' -z '' ']'
++ '[' -f /etc/sysconfig/init ']'
++ . /etc/sysconfig/init
+++ BOOTUP=color
+++ RES_COL=60
+++ MOVE_TO_COL='echo -en \033[60G'
+++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
+++ SETCOLOR_FAILURE='echo -en \033[0;31m'
+++ SETCOLOR_WARNING='echo -en \033[0;33m'
+++ SETCOLOR_NORMAL='echo -en \033[0;39m'
+++ PROMPT=yes
+++ AUTOSWAP=no
+++ ACTIVE_CONSOLES='/dev/tty[1-2]'
+++ SINGLE=/sbin/sushell
++ '[' pty = serial ']'
++ __sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'
+ '[' '!' -d /data/splunk/instance/historydb ']'
+ prog=/data/soft/splunk/bin/splunk
+ pid_files='/data/soft/splunk/var/run/splunk/splunkd.pid
/data/soft/splunk/var/run/splunk/splunkweb.pid'
+ options_up=start
+ options_down=stop
+ case "$1" in
+ status
+ for i in '$pid_files'
+ status -p /data/soft/splunk/var/run/splunk/splunkd.pid
/usr/share/cluster/script.sh: line 115:  5532 Segmentation fault      ${OCF_RESKEY_file} $1
<err>    script:splunksrv-cluster: status of /data/config/etc/init.d/splunksrv-cluster failed (returned 139)
[script] script:splunksrv-cluster: status of /data/config/etc/init.d/splunksrv-cluster failed (returned 139)
Status check of splunksrv-svc failed

The problem appears at line 115 of /usr/share/cluster/script.sh:

113 # Don't need to catch return codes; this one will work.
114 ocf_log info "Executing ${OCF_RESKEY_file} $1"
115 ${OCF_RESKEY_file} $1

I don't understand why, though. The script works from the command line,
and it works at system startup and when a cron job runs the check, but
it fails when rgmanager runs the status check. Why?
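
On the number itself: the shell reports a command killed by a signal
as 128 plus the signal number, and SIGSEGV is signal 11 on Linux, so
139 = 128 + 11. That matches the "Segmentation fault" line in the trace
above. Below is a minimal sketch for decoding such a status, assuming
bash; decode_status is a hypothetical helper, not part of rgmanager or
script.sh. A process can also exit with a value above 128 on its own, so
treat the result as a likely reading, not a certain one.

```shell
#!/bin/bash
# decode_status: hypothetical helper, not part of rgmanager.
# Interprets a shell exit status using the 128 + signal-number convention.
decode_status() {
    local rc=$1
    if [ "$rc" -gt 128 ]; then
        # Statuses above 128 usually mean "terminated by signal (rc - 128)".
        local sig=$((rc - 128))
        # kill -l N prints the name of signal N without the SIG prefix.
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with status $rc"
    fi
}

decode_status 139
```

So rgmanager is only passing the status along: whatever the script runs
at that point is crashing, not returning a normal "not running" code.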