[libvirt] [PATCH v4 00/23] Introduce metadata locking

Thu Sep 27 10:07:51 UTC 2018

On 09/27/2018 11:11 AM, Bjoern Walk wrote:
> Michal Privoznik <mprivozn at redhat.com> [2018-09-27, 10:15AM +0200]:
>> On 09/27/2018 09:01 AM, Bjoern Walk wrote:
>>> Michal Privoznik <mprivozn at redhat.com> [2018-09-19, 11:45AM +0200]:
>>>> On 09/19/2018 11:17 AM, Bjoern Walk wrote:
>>>>> Bjoern Walk <bwalk at linux.ibm.com> [2018-09-12, 01:17PM +0200]:
>>>>>> Michal Privoznik <mprivozn at redhat.com> [2018-09-12, 11:32AM
>>>>>> +0200]:
>>>>
>>>>>>
>>>>>
>>>>> Still seeing the same timeout. Is this expected behaviour?
>>>>>
>>>>
>>>> Nope. I wonder if something has locked the path and forgot to
>>>> unlock it (however, virtlockd should have unlocked all the paths
>>>> owned by PID on connection close), or something is still holding
>>>> the lock and connection opened.
>>>>
>>>> Can you see the timeout even when you turn off the selinux driver 
>>>> (security_driver="none' in qemu.conf)? I tried to reproduce the
>>>> issue yesterday and was unsuccessful. Do you have any steps to
>>>> reproduce?
>>>
>>> So, I haven't been able to actually dig into the debugging but we
>>> have been able to reproduce this behaviour on x86 (both with SELinux
>>> and DAC). Looks like a general problem, if a problem at all, because
>>> from what I can see in the code, the 60 seconds timeout is actually
>>> intended, or not? The security manager does try for 60 seconds to
>>> acquire the lock and only then bails out. Why is this?
>>
>> The ideal solution would be to just tell virlockd "these are the paths I
>> want you to lock on my behalf" and virtlockd would use F_SETLKW so that
>> the moment all paths are unlocked virtlockd will lock them and libvirtd
>> can continue its execution (i.e. chown() and setfcon()). However, we
>> can't do this because virtlockd runs single threaded [1] and therefore
>> if one thread is waiting for lock to be acquired there is no other
>> thread to unlock the path.
>>
>> Therefore I had to move the logic into libvirtd which tries repeatedly
>> to lock all the paths needed. And this is where the timeout steps in -
>> the lock acquiring attempts are capped at 60 seconds. This is an
>> arbitrary chosen timeout. We can make it smaller, but that will not
>> solve your problem - only mask it.
> 
> I still don't understand why we need a timeout at all. If virtlockd is
> unable to get the lock, just bail and continue with what you did after
> the timeout runs out. Is this some kind of safety-measure? Against what?

When there are two threads fighting over the lock. Imagine two threads
trying to set security label over the same file. Or imagine two daemons
on two distant hosts trying to set label on the same file on NFS.
virtlockd implements try-or-fail approach. So we need to call lock()
repeatedly until we succeed (or timeout).

> 
>>
>>>
>>> However, an actual bug is that while the security manager waits for
>>> the lock acquire the whole libvirtd hangs, but from what I understood
>>> and Marc explained to me, this is more of a pathological error in
>>> libvirt behaviour with various domain locks.
>>>
>>
>> Whole libvirtd shouldn't hang. Only those threads which try to acquire
>> domain object lock. IOW it should be possible to run APIs over different
>> domains (or other objects for that matter). For instance dumpxml over
>> different domain works just fine.
> 
> Yes, but from a user perspective, this is still pretty bad and
> unexpected. libvirt should continue to operate as usual while virtlockd
> is figuring out the locking.

Agreed. I will look into this.

> 
>> Anyway, we need to get to the bottom of this. Looks like something keeps
>> the file locked and then when libvirt wants to lock if for metadata the
>> timeout is hit and whole operation fails. Do you mind running 'lslocks
>> -u' when starting a domain and before the timeout is hit?
> 
> There IS a lock held on the image, from the FIRST domain that we
> started. The second domain, which is using the SAME image, unshared,
> runs into the locking timeout. Sorry if I failed to describe this setup
> appropriately. I am starting to believe that this is expected behaviour,
> although it is not intuitive.
> 
> 	# lslocks -u
> 	COMMAND            PID   TYPE SIZE MODE  M START        END PATH
> 	...
> 	virtlockd       199062  POSIX 1.5G WRITE 0     0          0 /var/lib/libvirt/images/u1604.qcow2

But this should not clash, because metadata lock use different bytes:
the regular locking uses offset=0, metadata lock uses offset=1. Both
locks lock just one byte in length.

However, from the logs it looks like virtlockd doesn't try to actually
acquire the lock because it found in internal records that the path is
locked (even though it is locked with different offset). Anyway, now I
am finally able to reproduce the issue and I'll look into this too.

Quick and dirty patch might look something like this:

diff --git i/src/util/virlockspace.c w/src/util/virlockspace.c
index 79fa48d365..3c51d7926b 100644
--- i/src/util/virlockspace.c
+++ w/src/util/virlockspace.c
@@ -630,8 +630,9 @@ int virLockSpaceAcquireResource(virLockSpacePtr
lockspace,
     virMutexLock(&lockspace->lock);

     if ((res = virHashLookup(lockspace->resources, resname))) {
-        if ((res->flags & VIR_LOCK_SPACE_ACQUIRE_SHARED) &&
-            (flags & VIR_LOCK_SPACE_ACQUIRE_SHARED)) {
+        if ((res->flags & VIR_LOCK_SPACE_ACQUIRE_SHARED &&
+            flags & VIR_LOCK_SPACE_ACQUIRE_SHARED) ||
+            start == 1) {

             if (VIR_EXPAND_N(res->owners, res->nOwners, 1) < 0)
                 goto cleanup;


But as I say, this is very dirty hack. I need to find a better solution.
At least you might have something to test.

Michal