
Re: [Linux-cluster] clvmd hangs

On 08/07/2012 01:07 AM, Chip Burke wrote:
I had a node crash (actually, it lost power) and now when the cluster comes
back up none of the PVs/VGs/LVs that contain the GFS2 volumes can be
found. pvscan, lvscan, vgscan, etc. all hang.

# pvscan -vvvv
#lvmcmdline.c:1070         Processing: pvscan -vvvv
#lvmcmdline.c:1073         O_DIRECT will be used
#libdm-config.c:789       Setting global/locking_type to 3
#libdm-config.c:789       Setting global/wait_for_locks to 1
#locking/locking.c:271       Cluster locking selected.

The output is more or less the same from lvscan and vgscan.

The cluster is pretty basic and I was in the midst of configuring
fencing when this went down, thus the config has no fence in it.

<?xml version="1.0"?>
<cluster config_version="5" name="Xanadu">
<clusternodes>
<clusternode name="xanadunode1" nodeid="1"/>
<clusternode name="xanadunode2" nodeid="2"/>
</clusternodes>
<cman expected_votes="3"/>
<quorumd label="quorum"/>
</cluster>

Additionally the cluster logs all show similar unending messages such as:

Aug 07 01:03:12 dlm_controld daemon cpg_join error retrying
Aug 07 01:03:46 corosync [TOTEM ] Retransmit List: 13
Aug 07 01:04:04 gfs_controld cpg_mcast_joined retry 31200 protocol
Aug 07 01:04:12 fenced daemon cpg_join error retrying


# cman_tool status
Version: 6.2.0
Config Version: 5
Cluster Name: Xanadu
Cluster Id: 10121
Cluster Member: Yes
Cluster Generation: 2084
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
Quorum device votes: 1
Total votes: 3
Node votes: 1
Quorum: 2
Active subsystems: 11
Ports Bound: 0 11 178
Node name: xanadunode2
Node ID: 2
Multicast addresses:
Node addresses:

So cman is up and working. It seems that clvmd and the tools it depends
on are simply not wanting to play nice. What do I have to do to get
those volumes to mount?

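Without a way to put the lost node into a known state, the only safe option remaining is to hang. This is by design: DLM, which both clvmd and GFS2 depend on, blocks until the lost node has been fenced, and with no fence devices configured that fence can never succeed. You have to add fencing to your cluster.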
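
As a rough sketch only, and assuming both nodes have IPMI-capable management boards, a cluster.conf with fencing added might look something like the following. The fence device names (ipmi_n1, ipmi_n2), the IP addresses, and the login/passwd values are placeholders I made up for illustration, not anything from your setup; the right fence agent depends on your hardware.

```xml
<?xml version="1.0"?>
<!-- Sketch only: device names, IP addresses and credentials are placeholders. -->
<!-- config_version is bumped from 5 to 6 so the change is picked up. -->
<cluster config_version="6" name="Xanadu">
  <cman expected_votes="3"/>
  <quorumd label="quorum"/>
  <clusternodes>
    <clusternode name="xanadunode1" nodeid="1">
      <fence>
        <method name="ipmi">
          <!-- refers to the fencedevice named ipmi_n1 below -->
          <device name="ipmi_n1" action="reboot"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="xanadunode2" nodeid="2">
      <fence>
        <method name="ipmi">
          <device name="ipmi_n2" action="reboot"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- assumed: one IPMI BMC per node, reachable at these example addresses -->
    <fencedevice name="ipmi_n1" agent="fence_ipmilan" ipaddr="192.0.2.11" login="admin" passwd="secret"/>
    <fencedevice name="ipmi_n2" agent="fence_ipmilan" ipaddr="192.0.2.12" login="admin" passwd="secret"/>
  </fencedevices>
</cluster>
```

Once a config along these lines is in place on both nodes, running fence_node against the peer (for example, fence_node xanadunode1 from xanadunode2) is one way to confirm that fencing actually works before you rely on it.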

This explains it in detail;


Papers and Projects: https://alteeve.com
