[Linux-cluster] Info on restart of non critical resources
Gianluca Cecchi
gianluca.cecchi at gmail.com
Tue Nov 19 10:06:56 UTC 2013
Hello,
I have a cluster running RHEL 6.3 with:
cman-3.0.12.1-32.el6_3.2.x86_64
rgmanager-3.0.12.1-12.el6.x86_64
I set up sshd as a cluster resource by modifying the default init script,
then configured it as a non-critical resource in a service section:
<resources>
  ...
  <script file="/etc/init.d/sshd" name="clusterssh"/>
</resources>
...
<service autostart="0" domain="PABX" name="PABX">
  <resource 1 ...>
  . . .
  <script __independent_subtree="2" ref="clusterssh"/>
  <resource
</service>
Suppose the PID of the sshd process bound to the VIP is 2689, and I kill it:
[root@myserver cluster]# kill -9 2689
[root@myserver cluster]# tail -f /var/log/messages
Nov 15 16:30:22 myserver rgmanager[4694]: [script] Executing /etc/init.d/sshd status
Nov 15 16:30:22 myserver rgmanager[4722]: [script] script:clusterssh: status of /etc/init.d/sshd failed (returned 1)
Nov 15 16:30:22 myserver rgmanager[11542]: status on script "clusterssh" returned 1 (generic error)
Nov 15 16:30:22 myserver rgmanager[11542]: Some independent resources in service:PABX failed; Attempting inline recovery
Nov 15 16:30:22 myserver rgmanager[4753]: [script] Executing /etc/init.d/sshd stop
Nov 15 16:30:22 myserver rgmanager[11542]: Inline recovery of service:PABX complete
Nov 15 16:30:22 myserver rgmanager[11542]: Note: Some noncritical resources were stopped during recovery.
Nov 15 16:30:22 myserver rgmanager[11542]: Run 'clusvcadm -c service:PABX' to restore them to operation.
The ssh resource remains stopped and the service gets a [P] flag in
clustat output.
# clustat
Cluster Status for mycluster @ Fri Nov 15 16:30:54 2013
Member Status: Quorate

 Member Name               ID   Status
 node1                      1   Online, rgmanager
 node2                      2   Online, Local, rgmanager
 /dev/block/253:5           0   Online, Quorum Disk

 Service Name              Owner (Last)              State
 service:PABX              node2                     started [P]
The suggested command
clusvcadm -c service:PABX
takes it online again:
Nov 15 16:31:22 myserver rgmanager[11542]: Repairing service:PABX
Nov 15 16:31:22 myserver rgmanager[6787]: [script] Executing /etc/init.d/sshd start
Nov 15 16:31:22 myserver rgmanager[11542]: Repair of service:PABX was successful
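As a stopgap, the manual repair step could presumably be scripted. The following is only a rough sketch, untested on a real cluster. It assumes the [P] flag always appears on the service line of the clustat output, as in the listing above. The function names partial_services and repair_partial_services are made up for illustration and are not part of rgmanager.

```shell
#!/bin/sh
# Sketch only: find services flagged [P] by clustat and repair them.
# Assumption: service lines start with "service:" and carry a literal "[P]".

# Read clustat-style output on stdin and print the name of every
# service whose line contains the [P] (partial) flag.
partial_services() {
    awk '$1 ~ /^service:/ && /\[P\]/ { print $1 }'
}

# Hypothetical wrapper: run the repair command that rgmanager itself
# suggests in the log for each partially running service.
repair_partial_services() {
    clustat | partial_services | while read -r svc; do
        clusvcadm -c "$svc"
    done
}
```

Calling repair_partial_services periodically (from cron, for example) would still be a workaround outside rgmanager, not an in-place restart handled by the cluster itself.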
Is this expected behaviour? Is there any way to configure rgmanager so
that it tries to restart a non-critical resource in place, without
manual intervention?
Thanks in advance,
Gianluca