[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH] Release VM lock before acquiring virDomainObjListPtr lock

On 04/09/13 13:22, Daniel P. Berrange wrote:
On Tue, Apr 09, 2013 at 01:19:05PM +0200, Peter Krempa wrote:
On 04/09/13 12:26, Daniel P. Berrange wrote:
On Tue, Apr 09, 2013 at 10:08:31AM +0100, Daniel P. Berrange wrote:
On Mon, Apr 08, 2013 at 09:15:28PM -0600, Eric Blake wrote:
On 02/11/2013 09:46 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange redhat com>

When removing a VM from the virDomainObjListPtr, we must not
be holding the VM lock while acquiring the list lock. Re-order
code to ensure that we can release the VM lock early.
  src/conf/domain_conf.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 5e16ddf..d92e54a 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -2115,11 +2115,10 @@ void virDomainObjListRemove(virDomainObjListPtr doms,
      char uuidstr[VIR_UUID_STRING_BUFLEN];

-    virObjectLock(doms);
      virUUIDFormat(dom->def->uuid, uuidstr);

+    virObjectLock(doms);

This patch seems to be implicated in Peter's latest proof of a
use-after-free data race:

The caller of this method should own a reference on virDomainObjPtr.
This method will result in that reference being released, as the
dom is removed from the list.

If any other thread experiances a use-after-free, then that thread
must have been missing a reference.

Actually, it is this code which is at fault. I forgot the rule
that if you unlock 'dom', then you are forbidden to access it
again, unless you had done a virDomainObjRef on it prior to
unlocking. Once we hold the lock on 'doms', we must also relock
'dom' to ensure we block any other thread from using it. So
this method should be changed to look like

void virDomainObjListRemove(virDomainObjListPtr doms,
                             virDomainObjPtr dom)
     char uuidstr[VIR_UUID_STRING_BUFLEN];

     virUUIDFormat(dom->def->uuid, uuidstr);

     virHashRemoveEntry(doms->objs, uuidstr);

Hmm, yeah, this seems to be the correct minimal approach. Are you
going to post the patch separately?

Since you can reproduce the problem can you post & test a patch
todo this.

Okay, I will prepare it then.

Anyways I still think that we should re-work this to the reference
counting approach as it would allow us to avoid having to take the
lock of the domain object and potentially having to wait with the
domain objects hash lock until an API finishes and thus blocking
other domain object lookups. It would also avoid the need to check
if the exit from a qemu job freed the domain object.

We would still have to hold the lock, since the lock is there to ensure
two things don't read/modify fields in virDomainObjPtr at the same time.

Yeah, in the individual API's the lock will still be needed. I want to avoid taking the lock in the helper that looks up the domain object in the hash that has to have the domain object list locked too. At that point adding a reference would allow to move locking of the domain object after the list is unlocked.

It would, however, mean that the {Enter/Leave}Monitor fnuctions do not
need to play games with the reference count, and we could simplify the
code which follows since we can assume our 'dom' is always valid and
not have to set it to NULL after exiting jobs.

Yep, that would be one of the benefits.



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]