[linux-lvm] vgscan fails when other nodes quit cleanly.

Xinwei Hu hxinwei at gmail.com
Tue Mar 9 07:52:55 UTC 2010


Hi all,

   Here's an interesting issue. When we shut down the cluster stack
cleanly on the other nodes, every LVM command fails to grab the
global lock. Like this:
--->8----
sys3:~ # vgscan
  cluster request failed: Host is down
  Unable to obtain global lock.
---8<----
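
The first error line reads like the C library's message text for the errno value EHOSTDOWN. That clvmd passes exactly this errno back to the LVM command is an assumption here, not something the output proves. A minimal sketch of comparing a message against that errno text, where `is_host_down_text` is a hypothetical helper and not part of LVM:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper, not part of LVM: reports whether an error string
 * equals the C library's message text for EHOSTDOWN.
 * Returns 1 on a match, 0 otherwise.
 */
static int is_host_down_text(const char *msg)
{
    assert(msg != NULL);
    return strcmp(msg, strerror(EHOSTDOWN)) == 0;
}
```

The exact wording returned by strerror() depends on the C library and locale, so the helper compares against strerror() rather than a hard-coded string.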
   I went through the code history a bit. The failure seems to be caused by
commit e65ffb8e, which I think was meant for gulm only.
--->8----
commit e65ffb8e687bbce4e7edff70ebff2b3f1c0b6157
Author: Christine Caulfield <ccaulfie at redhat.com>
Date:   Fri Jun 20 10:58:28 2008 +0000

    Make clvmd return immediately if other nodes are down in a gulm cluster.
    bz#447799

diff --git a/WHATS_NEW b/WHATS_NEW
index ec7ff54..023659e 100644
--- a/WHATS_NEW
+++ b/WHATS_NEW
@@ -1,5 +1,6 @@
 Version 2.02.39 -
 ================================
+  Make clvmd return immediately if other nodes are down in a gulm cluster.
   Improve/Fix read ahead 'auto' calculation for stripe_size
   Fix lvchange output for -r auto setting if auto is already set
   Add testcase for read ahead
diff --git a/daemons/clvmd/clvmd-gulm.c b/daemons/clvmd/clvmd-gulm.c
index 3a230b5..a2f2148 100644
--- a/daemons/clvmd/clvmd-gulm.c
+++ b/daemons/clvmd/clvmd-gulm.c
@@ -665,6 +665,7 @@ static int _cluster_do_node_callback(struct local_client *master_client,
 {
     struct dm_hash_node *hn;
     struct node_info *ninfo;
+    int somedown = 0;

     dm_hash_iterate(hn, node_hash)
     {
@@ -686,12 +687,14 @@ static int _cluster_do_node_callback(struct local_client *master_client,
            client = dm_hash_lookup_binary(sock_hash, csid, GULM_MAX_CSID_LEN);

        }
+ DEBUGLOG("down_callback2. node %s, state = %d\n", ninfo->name, ninfo->state);
        if (ninfo->state != NODE_DOWN)
                callback(master_client, csid, ninfo->state == NODE_CLVMD);

-
+ if (ninfo->state != NODE_CLVMD)
+         somedown = -1;
     }
-    return 0;
+    return somedown;
 }

 /* Convert gulm error codes to unix errno numbers */
---8<----
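
To make the effect of the diff easier to see, here is a simplified, self-contained sketch of the return value before and after the change. The `sim_` names are hypothetical stand-ins, `SIM_NODE_UP` is an assumed extra state, and the real hash iteration and callback invocation are left out. It also assumes that a node which quit cleanly is recorded in a state other than NODE_CLVMD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the node states; SIM_NODE_UP is assumed. */
enum sim_state { SIM_NODE_DOWN, SIM_NODE_UP, SIM_NODE_CLVMD };

struct sim_node {
    const char *name;
    enum sim_state state;
};

/* Before the change: the loop result never depended on node state. */
static int sim_result_before(const struct sim_node *nodes, size_t count)
{
    assert(nodes != NULL || count == 0);
    (void)nodes;
    (void)count;
    return 0;
}

/* After the change: any node that is not running clvmd yields -1. */
static int sim_result_after(const struct sim_node *nodes, size_t count)
{
    int somedown = 0;
    size_t i;

    assert(nodes != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (nodes[i].state != SIM_NODE_CLVMD)
            somedown = -1;
    }
    return somedown;
}
```

With two nodes of which one is in SIM_NODE_DOWN, sim_result_before() returns 0 and sim_result_after() returns -1. If the caller treats a non-zero result as a failure, that would match the behaviour reported above.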

  clvmd-corosync.c was copied from clvmd-openais.c, which was in turn
copied from clvmd-gulm.c, so the same logic presumably ended up in the
corosync code as well.
I'd suggest removing this change from both clvmd-corosync and clvmd-gulm.

  Any comments?
  Thanks.



