[Linux-cluster] HA LVM won't strip tags

urgrue urgrue at bulbous.org
Wed Sep 19 08:09:24 UTC 2012


On Wed, Sep 19, 2012, at 00:03, Volker Dormeyer wrote:
> You can use the self_fence option of the LVM resource agent.
> If the node which loses disk-access is not able to clean-up the tags,
> it tries to reboot itself by issuing "reboot -fn".

I have that option on and was wondering why it didn't seem to work as I
expected.
Looking back through my logs, I can see the reason: sometimes the
unmount succeeds, so self_fence doesn't take effect at that point. I
would expect the agent to then try to strip the LVM tags and self-fence
if THAT fails, but it doesn't seem to do this part at all.
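
To spell out the order of operations I'd expect on stop, here is a rough
Python sketch. It is only a model of my expectation, not the actual
resource agent code; stop_storage and its callback parameters are names
I made up for illustration.

```python
def stop_storage(unmount, strip_tags, self_fence, reboot):
    """Model of the stop sequence I would expect (NOT the real agent code).

    unmount and strip_tags are callables returning True on success.
    reboot is called when a step fails and self_fence is on.
    Returns a short string describing the outcome.
    """
    for step_name, step in (("unmount", unmount), ("strip_tags", strip_tags)):
        if step():
            continue  # this step worked, move on to the next one
        if self_fence:
            reboot()  # stands in for "reboot -fn"
            return "fenced after failed " + step_name
        return "failed " + step_name
    return "stopped cleanly"


if __name__ == "__main__":
    # My case: the unmount succeeds, but the LUN is 'not ready',
    # so stripping the tags cannot succeed.
    outcome = stop_storage(
        unmount=lambda: True,
        strip_tags=lambda: False,
        self_fence=True,
        reboot=lambda: print("would run: reboot -fn"),
    )
    print(outcome)  # fenced after failed strip_tags
```

In other words, a successful unmount should not be the end of the
self_fence checks; the tag removal is a second point where it could
fire.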

To test this, I put the LUN in a 'not ready' state, so it becomes
unreadable (and unwritable). Here's an example of where it failed:

On node 2:
Sep 18 14:56:43 rgmanager [fs] fs:fs_sanlv: is_alive: failed write test on [/var/lib/mysql]. Return code: 1
Sep 18 14:56:43 rgmanager [fs] fs:fs_sanlv: Mount point is not accessible!
Sep 18 14:56:43 rgmanager status on fs "fs_sanlv" returned 1 (generic error)
Sep 18 14:56:43 rgmanager Stopping service service:srv_mysql
Sep 18 14:56:43 rgmanager [mysql] Verifying Configuration Of mysql:res_mysql
Sep 18 14:56:43 rgmanager [mysql] Verifying Configuration Of mysql:res_mysql > Succeed
Sep 18 14:56:44 rgmanager [mysql] Stopping Service mysql:res_mysql
Sep 18 14:56:48 rgmanager [mysql] Stopping Service mysql:res_mysql > Succeed
Sep 18 14:56:48 rgmanager [ip] Removing IPv4 address 10.1.0.7/22 from bond1
Sep 18 14:56:58 rgmanager [fs] unmounting /var/lib/mysql
Sep 18 14:57:00 rgmanager Service service:srv_mysql is recovering
Sep 18 14:57:00 rgmanager Sent remote-start request to 1

Now from node 1:
Sep 18 14:57:00 rgmanager Recovering failed service service:srv_mysql
Sep 18 14:57:02 rgmanager [lvm] Starting volume group, sanvg
Sep 18 14:57:02 rgmanager [lvm] Someone else owns this volume group
Sep 18 14:57:02 rgmanager start on lvm "res_sanvg" returned 1 (generic error)
Sep 18 14:57:02 rgmanager #68: Failed to start service:srv_mysql; return value: 1

Here's the relevant part of cluster.conf:
                <resources>
                        <lvm name="res_sanvg" self_fence="on"
                        vg_name="sanvg"/>
                        <ip address="10.1.0.7/22" sleeptime="10"/>
                        <fs device="/dev/sanvg/sanlv" fsid="29088"
                        mountpoint="/var/lib/mysql" name="fs_sanlv"
                        options="noatime" self_fence="on"/>
                        <mysql config_file="/etc/my.cnf"
                        listen_address="10.1.0.7" name="res_mysql"
                        shutdown_wait="10" startup_wait="5"/>
                </resources>
                <service domain="DC0" name="srv_mysql"
                recovery="relocate">
                        <lvm ref="res_sanvg">
                                <fs ref="fs_sanlv">
                                        <ip ref="10.1.0.7/22">
                                                <mysql ref="res_mysql"/>
                                        </ip>
                                </fs>
                        </lvm>
                </service>


Shouldn't there be an "[lvm] stripping tags from xxxx" after the umount,
which should fail and result in self_fence?
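
To make the gap concrete, here is a small Python check over the tail of
the node 2 log above (wrapped lines re-joined). It lists which resource
agents log anything after the unmount. The helper name
agents_after_unmount is made up, and the regex assumes the
"rgmanager [agent]" prefix seen in my logs.

```python
import re


def agents_after_unmount(log_lines):
    """Return the [agent] names that log anything after '[fs] unmounting'."""
    seen_unmount = False
    agents = []
    for line in log_lines:
        if "[fs] unmounting" in line:
            seen_unmount = True
            continue
        if seen_unmount:
            # Agent messages look like "... rgmanager [lvm] ..."
            match = re.search(r"rgmanager \[(\w+)\]", line)
            if match:
                agents.append(match.group(1))
    return agents


# Tail of the node 2 log from this mail.
NODE2_TAIL = [
    "Sep 18 14:56:48 rgmanager [ip] Removing IPv4 address 10.1.0.7/22 from bond1",
    "Sep 18 14:56:58 rgmanager [fs] unmounting /var/lib/mysql",
    "Sep 18 14:57:00 rgmanager Service service:srv_mysql is recovering",
    "Sep 18 14:57:00 rgmanager Sent remote-start request to 1",
]

if __name__ == "__main__":
    print(agents_after_unmount(NODE2_TAIL))  # []
```

The empty list is the problem: no [lvm] message shows up on node 2
between the unmount and the remote-start request.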

Thanks.



