
[linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace



Hi David,

I've had quite a bit of trouble with clvmd on corosync 2.3.0/dlm. Apparently a single nonfunctional clvmd in the cluster can block all the others (kern.log reports clvmd stuck for >120s in some dlm call). I tried to clean things up with kill -9 on clvmd, but the process remains in state D or Z. Unfortunately, it seems that those zombies still keep some dlm resources locked. When I restart corosync on a node and then run dlm_controld -D on it, I see "found uncontrolled lockspace, tell corosync to remove nodeid from cluster".
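
For reference, this is roughly how the stuck processes can be spotted. It is only a minimal sketch, assuming a Linux /proc filesystem and that the daemon's process name is exactly "clvmd"; the helper names parse_stat and find_stuck are made up for illustration. It only reports processes in state D or Z and does not clean anything up.

```python
import os

def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state

def find_stuck(name="clvmd", states=("D", "Z"), proc_root="/proc"):
    """List (pid, state) for processes called `name` in one of `states`."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or unreadable entry
        if comm == name and state in states:
            stuck.append((pid, state))
    return sorted(stuck)

if __name__ == "__main__":
    for pid, state in find_stuck():
        print(f"clvmd pid {pid} is in state {state}")
```

A process shown in state D is in uninterruptible sleep inside the kernel, which is why kill -9 has no visible effect on it.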

Well, that's fine as a first step, but what about cleaning up the dlm lockspace? dlm_tool leave <lockspace> hangs as well (sometimes it just fails with error 49). The comment in dlm_controld/action.c isn't very satisfying: it says a reboot is needed, which is no fun if a whole cluster is affected. I'd really appreciate a way to clean up old lockspaces manually. I'd presume that an uncontrolled lockspace on an isolated node should be easy to remove...
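
To see which lockspaces the kernel still holds on a node, something like the following sketch can help. It assumes the kernel dlm module exposes one directory per lockspace under /sys/kernel/dlm; that path is my assumption, as is the helper name list_kernel_lockspaces. It is read-only and is not a way to remove a lockspace.

```python
import os

def list_kernel_lockspaces(sysfs_dir="/sys/kernel/dlm"):
    """Return the names of lockspaces the kernel currently knows about.

    Assumes one subdirectory per lockspace under sysfs_dir; returns an
    empty list if the directory does not exist (e.g. dlm not loaded).
    """
    try:
        entries = os.listdir(sysfs_dir)
    except FileNotFoundError:
        return []
    return sorted(
        e for e in entries
        if os.path.isdir(os.path.join(sysfs_dir, e))
    )

if __name__ == "__main__":
    names = list_kernel_lockspaces()
    if names:
        print("kernel lockspaces still present:", ", ".join(names))
    else:
        print("no kernel lockspaces found")
```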


Regards
Andreas

