[Linux-cluster] suggestion on freeze-on-node1 and unfreeze-on-node2 approach?

Gianluca Cecchi gianluca.cecchi at gmail.com
Fri Jan 8 15:12:21 UTC 2010


On Fri, 08 Jan 2010 09:06:57 -0500 Lon Hohberger wrote:
> You could set 'recovery="relocate"', freeze the service, stop the
> database cleanly, then unfreeze the service.

Ah, thanks, it should work.
The only "limit" would be that any recovery action will then imply
relocation, correct?
(In theory this is a problem with Oracle licensing, because they let
you pay for only one license in a two-node cluster only if the total
time the DB runs on one of the two nodes stays below a small
threshold...)
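
If I understand the suggestion correctly, the sequence on the node
currently running the service would look roughly like the sketch below.
This is only a sketch under assumptions: the service name is taken from
my cluster.conf further down, DB_STOP_CMD is a hypothetical placeholder
for whatever stops the database cleanly, and the RUN variable and the
function name are my own additions so that by default the commands are
only printed, not executed.

```shell
#!/bin/sh
# Sketch only: freeze the service, stop the DB cleanly, then unfreeze.
# SERVICE matches the cluster.conf in this mail; DB_STOP_CMD is a
# hypothetical placeholder, replace it with your own clean-shutdown command.
SERVICE="${SERVICE:-ACSSRV}"
DB_STOP_CMD="${DB_STOP_CMD:-/path/to/your/db-stop-script}"
# Dry run by default: RUN=echo only prints each command.
# Invoke with RUN= (empty) to really execute them.
RUN="${RUN-echo}"

planned_db_stop() {
    # Freeze: rgmanager suspends status checks for the service.
    $RUN clusvcadm -Z "$SERVICE" || return 1
    # Stop the database cleanly while no status check can interfere.
    $RUN $DB_STOP_CMD
    # Unfreeze even if the stop failed, so status checks resume and the
    # configured recovery policy can act on the stopped database.
    $RUN clusvcadm -U "$SERVICE"
}

planned_db_stop
```

With recovery="relocate" I would expect the first status check after
the unfreeze to find the database down and move the service to the
other node, but that is my reading of the suggestion, not something I
have tested yet.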

Re-reading the RHEL 5.4 Cluster Administration manual raises another
doubt in my mind...

Section D.4 Failure Recovery and Independent Subtrees:
"... if any of the scripts defined in this service fail, the normal course
of action is to restart (or relocate or disable, according to the
service recovery policy) the service..."

Take the service definition below:

                <service autostart="1" name="ACSSRV" recovery="relocate">
                        <ip ref="10.4.5.123"/>
                        <fs ref="oradata"/>
                        <fs ref="orasave"/>
                        <fs ref="rdoffline"/>
                        <fs ref="appl"/>
                        <script ref="ACS"/>
                </service>

If I get a network problem and my VIP goes down for more than 30
seconds (which should be the default interval between checks), this
will cause a relocation of the whole service rather than a restart of
only the VIP, correct?

Without the recovery="relocate" option, would the whole service instead
be restarted on the node where it is running in this network-problem
scenario (so shutdown abort of the DB, umount of the filesystems, and
restart of everything)?

If this is true, could the modification below prevent it in the
general case? (I am not concerned about the fs resources: if one of
them goes down, I probably have problems affecting the DB itself, so
it is safer to stop it.)

                <service autostart="1" name="ACSSRV">
                        <ip ref="10.4.5.123" __independent_subtree="1" />
                        <fs ref="oradata"/>
                        <fs ref="orasave"/>
                        <fs ref="rdoffline"/>
                        <fs ref="appl"/>
                        <script ref="ACS"/>
                </service>


And what about this one: does it make sense at all to also add the
recovery="relocate" policy?

                <service autostart="1" name="ACSSRV" recovery="relocate">
                        <ip ref="10.4.5.123" __independent_subtree="1" />
                        <fs ref="oradata"/>
                        <fs ref="orasave"/>
                        <fs ref="rdoffline"/>
                        <fs ref="appl"/>
                        <script ref="ACS"/>
                </service>

Thanks again,
Gianluca



