
[Linux-cluster] Unable to lock any resource

I am debugging a program that uses DLM (lock_resource()) to lock a resource. If I kill the process from within GDB and leave it running for a long time (for example, overnight), I am no longer able to lock any resources. I killed gdb as well and verified that nothing was left over.

To verify that it is not just my own resource that I cannot lock, I used dlmtest from the ...dlm/tests/usertests/ directory to lock an arbitrary resource:

[root bof227 usertest]# ./dlmtest -m NL TEST
locking TEST NL ...
lock: Invalid argument

The error code returned by the lock_resource() call is EINVAL (22).
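
As a quick cross-check that errno 22 and the "Invalid argument" message printed by dlmtest refer to the same error, here is a minimal sketch using only the Python standard library. It does not call DLM; it just looks up the numeric code on a Linux system.

```python
import errno
import os

code = 22  # numeric value reported for the failing lock call

# Symbolic name for the code, e.g. the E* constant defined in errno.h
name = errno.errorcode[code]

# Human-readable message, the same text perror() would print
message = os.strerror(code)

print(f"{code} = {name}: {message}")
```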

Rebooting the system fixes this, but that is a pain. I tried restarting the cman and clvmd services instead, without success, and I cannot reload the dlm kernel module because it is in use.

The contents of dlm_stats show the same number of lock and unlock operations:

[root bof227 usertest]# cat /proc/cluster/dlm_stats
DLM stats (HZ=1000)

Lock operations:         21
Unlock operations:       21
Convert operations:       0
Completion ASTs:         42
Blocking ASTs:            0

Lockqueue        num  waittime   ave
WAIT_RSB          19         8     0
Total             19         8     0
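
The lock/unlock comparison above can also be checked mechanically. The following is a minimal sketch that assumes the counter lines always have the "Label:   N" form shown in the output above; parse_counters is a hypothetical helper name, and the sample text is copied from this post rather than read from the live file.

```python
import re

# Counter section copied from the dlm_stats output in this post.
SAMPLE = """\
DLM stats (HZ=1000)

Lock operations:         21
Unlock operations:       21
Convert operations:       0
Completion ASTs:         42
Blocking ASTs:            0
"""


def parse_counters(text):
    """Return {label: count} for every line of the form 'Label:   N'."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([^:]+):\s+(\d+)\s*$", line)
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters


counters = parse_counters(SAMPLE)

# A non-zero difference would suggest locks that were never released.
outstanding = counters["Lock operations"] - counters["Unlock operations"]
print("outstanding locks:", outstanding)
```

To run this against a live node, the sample string could be replaced with the contents of /proc/cluster/dlm_stats, provided the file has the same layout.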

Could anybody provide some insight into this? Is there a better way to recover from this state than rebooting the system?

Thanks, Mike
