[Linux-cluster] Deadlock when using clvmd + OpenAIS + Corosync

Christine Caulfield ccaulfie at redhat.com
Mon Jan 11 09:38:04 UTC 2010


On 11/01/10 09:32, Evan Broder wrote:
> On Mon, Jan 11, 2010 at 4:03 AM, Christine Caulfield
> <ccaulfie at redhat.com>  wrote:
>> On 08/01/10 22:58, Evan Broder wrote:
>>>
>>> [please preserve the CC when replying, thanks]
>>>
>>> Hi -
>>>     We're attempting to set up a clvm (2.02.56) cluster using OpenAIS
>>> (1.1.1) and Corosync (1.1.2). We've gotten bitten hard in the past by
>>> crashes leaving DLM state around and forcing us to reboot our nodes,
>>> so we're specifically looking for a solution that doesn't involve
>>> in-kernel locking.
>>>
>>> We're also running the Pacemaker OpenAIS service, as we're hoping to
>>> use it for management of some other resources going forward.
>>>
>>> We've managed to form the OpenAIS cluster, and get clvmd running on
>>> both of our nodes. Operations using LVM succeed, so long as only one
>>> operation runs at a time. However, if we attempt to run two operations
>>> (say, one lvcreate on each host) at a time, they both hang, and both
>>> clvmd processes appear to deadlock.
>>>
>>> When they deadlock, it doesn't appear to affect the other clustering
>>> processes - both corosync and pacemaker still report a fully formed
>>> cluster, so it seems the issue is localized to clvmd.
>>>
>>> I've looked at logs from corosync and pacemaker, and I've straced
>>> various processes, but I don't want to blast a bunch of useless
>>> information at the list. What information can I provide to make it
>>> easier to debug and fix this deadlock?
>>>
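
(For reference, the hang described above can be reproduced with roughly the
following - a minimal sketch assuming the shared VG "xenvg" that appears in
the output later in this thread; the LV names are placeholders:)

    node1# lvcreate -L 1G -n locktest-1 xenvg
    node2# lvcreate -L 1G -n locktest-2 xenvg    # started at the same time
    # both commands hang; corosync and pacemaker still report a formed cluster
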
>>
>> To start with, the best logging to produce is the clvmd logs, which can be
>> obtained with clvmd -d (see the man page for details). Ideally these should
>> come from all nodes in the cluster so they can be correlated. If you're still
>> using DLM then a dlm lock dump from all nodes is often helpful in
>> conjunction with the clvmd logs.
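
(A rough sketch of gathering that information - assuming the cluster3
dlm_tool and a mounted debugfs; the exact -d syntax is described in the
clvmd man page:)

    # run clvmd in the foreground with debug output captured to a file
    clvmd -d 2>/tmp/clvmd-debug.log

    # if clvmd is using the DLM, dump its lockspace on every node as well
    dlm_tool lockdump clvmd >/tmp/dlm-clvmd-locks.txt
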
>
> Sure, no problem. I've posted the logs from both clvmd processes in
> <http://web.mit.edu/broder/Public/clvmd/>. I've annotated them at a
> few points with what I was doing - the annotations all start with
> ">> ", so they should be easy to spot.
>
> One interesting thing was the output from lvcreate when I SIGKILLed
> the clvmd process:
>
> root at black-mesa:~# lvcreate -L 1G -n broder-test-1 xenvg
>    Error reading data from clvmd: Connection reset by peer
>    Aborting. Failed to activate new LV to wipe the start of it.
>    Error writing data to clvmd: Broken pipe
>    Unable to deactivate failed new LV. Manual intervention required.
>    Error writing data to clvmd: Broken pipe
>    Internal error: Volume Group xenvg was not unlocked
>    Device '/dev/sdb' has been left open.
>    Device '/dev/sdc' has been left open.
>    Device '/dev/sdc' has been left open.
>    Device '/dev/sdb' has been left open.


That's perfectly normal for LVM if you killed clvmd with SIGKILL.

> I'm not entirely sure what to make of that, or how worried I should
> be. After I killed clvmd and restarted corosync, broder-test-1 (the
> first lvcreate I started) did seem to exist, although I didn't look
> too hard at it.
>
>> Also, did you know it's possible to use clvmd without the DLM? The -I
>> openais option will tell it to use the Lck service in userspace - though if
>> there are DLM bugs I think we'd like to fix them if possible ;-)
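
(That would mean starting clvmd along these lines - a sketch, not verified
against this particular build:)

    clvmd -I openais    # use the userspace openais Lck service, no DLM
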
>
> I suspect that the DLM bugs we ran into were all from using old
> software - it just left a bit of a bitter taste in our mouths when we
> were left with no way to recover from problems. We built the clvmd
> we're using now without support for any clustering infrastructure but
> openais, so we should always be using the Lck service.
>

If you were using the DLM in RHEL4 (or equivalent) then that doesn't
entirely surprise me; it was written and tested largely to support GFS,
and other applications were a bit of an afterthought, sadly. The new
upstream DLM is a huge improvement.

I'll try to find some time to look at your logs, thanks, but it might
not be today.

Chrissie



