
[Cluster-devel] postgres/drbd start-up issue with clusvcadm



Dear cluster-devel list,


I have a very strange problem with my postgres service. I implemented the
service in cluster.conf and tested it with


rg_test test cluster.conf start service pgsql


<service autostart="1" name="pgsql" recovery="relocate">
  <ip address="10.0.1.15" monitor_link="1">
    <drbd name="pgsql_bd" resource="pgsql">
      <fs __independent_subtree="1" ref="pgsql_fs">
        <postgres-8 config_file="/var/lib/pgsql/data/postgresql.conf"
                    name="pgsqld"
                    postmaster_options="-D/var/lib/pgsql/data"
                    postmaster_user="postgres"/>
      </fs>
    </drbd>
  </ip>
</service>


The result was a successfully started postgres service.


However, if I start the service via

clusvcadm -e pgsql


Oct 07 14:26:16 rgmanager Starting disabled service service:pgsql
Oct 07 14:26:16 rgmanager [ip] Link for bridge0: Detected
Oct 07 14:26:16 rgmanager [ip] Adding IPv4 address 10.0.1.15/21 to bridge0
Oct 07 14:26:16 rgmanager [ip] Pinging addr 10.0.1.15 from dev bridge0
Oct 07 14:26:18 rgmanager [ip] Sending gratuitous ARP: 10.0.1.15 00:25:90:a2:c7:b6 brd ff:ff:ff:ff:ff:ff
Oct 07 14:26:19 rgmanager [drbd] Setting resource pgsql to state : primary
Oct 07 14:26:20 rgmanager [fs] mounting /dev/drbd6 on /var/lib/pgsql
Oct 07 14:26:20 rgmanager [fs] mount -t ext4 -o noatime /dev/drbd6 /var/lib/pgsql
Oct 07 14:26:20 rgmanager [postgres-8] Verifying Configuration Of postgres-8:pgsqld
Oct 07 14:26:20 rgmanager [postgres-8] Verifying Configuration Of postgres-8:pgsqld > Succeed
Oct 07 14:26:20 rgmanager [postgres-8] Starting Service postgres-8:pgsqld
Oct 07 14:26:20 rgmanager [postgres-8] PID File "/var/run/cluster/postgres-8/postgres-8:pgsqld.pid" Was Removed - Zero length
Oct 07 14:26:20 rgmanager [postgres-8] Looking For IP Addresses
Oct 07 14:26:20 rgmanager [postgres-8] IP 10.0.1.15 found @ /cluster/rm/service[ name="pgsql"]/ip[1]
Oct 07 14:26:21 rgmanager [postgres-8] 1 IP addresses found for pgsql/pgsqld
Oct 07 14:26:21 rgmanager [postgres-8] Looking For IP Addresses > Succeed - IP Addresses Found
Oct 07 14:26:21 rgmanager [postgres-8] Checking: SHA1 checksum of config file /etc/cluster/postgres-8/postgres-8:pgsqld/postgresql.conf
Oct 07 14:26:21 rgmanager [ip] Checking 10.0.1.12, Level 0
Oct 07 14:26:21 rgmanager [ip] Checking 10.0.1.13, Level 0
Oct 07 14:26:21 rgmanager [postgres-8] Checking: SHA1 checksum > succeed
Oct 07 14:26:21 rgmanager [ip] Checking 10.0.1.14, Level 0
Oct 07 14:26:21 rgmanager [ip] 10.0.1.12 present on bridge0
Oct 07 14:26:21 rgmanager [ip] 10.0.1.13 present on bridge0
Oct 07 14:26:21 rgmanager [postgres-8] Generating New Config File /etc/cluster/postgres-8/postgres-8:pgsqld/postgresql.conf From /var/lib/pgsql/data/posOct 07 14:26:21 rgmanager [ip] 10.0.1.14 present on bridge0
Oct 07 14:26:21 rgmanager [postgres-8] #x#x#x# forcing a cr here
Oct 07 14:26:22 rgmanager [postgres-8] Generating New Config File /etc/cluster/postgres-8/postgres-8:pgsqld/postgresql.conf From /var/lib/pgsql/data/posOct 07 14:26:22 rgmanager [ip] Link detected on bridge0
Oct 07 14:26:22 rgmanager [fs] Checking fs "install_fs", Level 10
Oct 07 14:26:22 rgmanager [postgres-8] #x#x#x# forcing a cr here
Oct 07 14:26:22 rgmanager [fs] Checking fs "www_fs", Level 10
Oct 07 14:26:22 rgmanager [postgres-8] Waiting for 2 seconds before calling pg_ctl status..
Oct 07 14:26:24 rgmanager [postgres-8] trying to get status : su - "postgres" -c "/usr/bin/pg_ctl status -D/var/lib/pgsql/data" &> /dev/null
Oct 07 14:26:24 rgmanager [postgres-8] pg_ctl status: failed
Oct 07 14:26:24 rgmanager [postgres-8] Starting Service postgres-8:pgsqld > Failed
Oct 07 14:26:24 rgmanager start on postgres-8 "pgsqld" returned 1 (generic error)


I get a failed service. I tried to debug the problem by adding additional
ocf_log lines to the postgres-8.sh script. However, the results are rather
confusing: the line that starts the postmaster process does not seem to
generate any output. I redirected its output to a file instead of /dev/null,
but that produced nothing either. I also enabled syslog logging for the
postgres process in /var/lib/pgsql/data/postgresql.conf.

I also checked the latest git version of the postgres-8.sh script and found a
small change related to stopping the service, but the starting part is the
same.

I am at a loss, and any help to further debug this issue is greatly
appreciated.
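
One way to make a silent start step visible is to capture both output streams and the exit status in one place. The following is only a minimal sketch of that idea: run_logged and the log file are hypothetical names that are not part of postgres-8.sh, and the demo command merely stands in for the real start call.

```shell
#!/bin/sh
# Hypothetical helper (not part of postgres-8.sh): run a command, keep
# stdout and stderr in a log file, and record the exit status there too.
run_logged() {
    log_file=$1
    shift
    "$@" >"$log_file" 2>&1
    status=$?
    echo "exit status: $status" >>"$log_file"
    return "$status"
}

# Demo with a stand-in command that writes to both streams and fails.
debug_log=$(mktemp)
run_logged "$debug_log" sh -c 'echo "to stdout"; echo "to stderr" >&2; exit 3'
demo_status=$?
cat "$debug_log"
```

The status call shown in the log above could be wrapped the same way, for example run_logged /tmp/pg_status.log su - "postgres" -c "/usr/bin/pg_ctl status -D/var/lib/pgsql/data" (the log path is an assumption), so that an empty log plus a non-zero exit status can be told apart from a command that never ran.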


In addition, I found the following small issues:

a) The log lines generated by the generate_config_file() call somehow lack a
trailing CR, so the following messages are printed overlapping them.

b) During start of the service, the variable pguser_group is set with


pguser_group=`groups $OCF_RESKEY_postmaster_user | cut -f1 -d' '`


I believe this is incorrect, as the first field of the groups output is the
user name and not the group. In this case it does not matter, as the group
name and user name for postgres are the same, but I believe it should read:


pguser_group=`groups $OCF_RESKEY_postmaster_user | cut -f3 -d' '`
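
The difference between the two cut fields can be checked against a sample line. This is a minimal sketch that assumes the GNU coreutils output format of groups when called with a user name (user : group1 group2 ...); the user and group names below are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample of what `groups pguser` prints on a GNU coreutils
# system (assumed format: "user : group1 group2 ...").
sample_output='pguser : pggroup dba'

# Field 1 is the user name, field 2 is the ":" separator,
# field 3 is the first group in the list.
field1=$(printf '%s\n' "$sample_output" | cut -f1 -d' ')
field3=$(printf '%s\n' "$sample_output" | cut -f3 -d' ')

echo "cut -f1 gives: $field1"
echo "cut -f3 gives: $field3"
```

A possible alternative that does not depend on field positions is id -gn "$OCF_RESKEY_postmaster_user", which prints the primary group name of the user directly.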


Thanks

Andi


