[Freeipa-devel] [PATCH] 294 sleep before doing a task

Simo Sorce ssorce at redhat.com
Thu Oct 15 16:08:38 UTC 2009


On Thu, 2009-10-15 at 08:15 -0700, Nathan Kinder wrote:
> On 10/15/2009 06:40 AM, Simo Sorce wrote:
> > On Thu, 2009-10-15 at 15:28 +0200, Pavel Zuna wrote:
> >    
> >> Rob Crittenden wrote:
> >>      
> >>> One of the last steps of an install is to run through any updates. This
> >>> change adds a sleep() prior to calling tasks to ensure postop writes are
> >>> done.
> >>>
> >>> We were seeing a rare deadlock of DS when creating the memberOf task
> >>> because one thread was adding memberOf in a postop while another was
> >>> trying to create an index and this was causing a PRLock deadlock.
> >>>
> >>> rob
> >>>
> >>>        
> >> sleep might not be the best synchronization mechanism out there, but I think
> >> that in this case it is pretty much the only one available and it gets the job
> >> done, so ack.
> >>      
> > So are we covering up a DS bug here? Or are we doing an asynchronous LDAP
> > request when we should do a synchronous one and wait for it to finish
> > (I've fixed another place where we were doing that and racing against
> > our own requests)?
> >    
> It has nothing to do with the way you are performing your LDAP operations.
> 
> The issue really stems from what I consider to be a bug in NSPR's 
> implementation of reader-writer locks.  It is documented that a single 
> thread can hold multiple reader locks safely, but I've found that this is
> not exactly the case.  The NSPR implementation favors writers, so a
> thread waiting for the writer lock will block attempts by other threads 
> to get a reader lock.  The problem is that we use reader locks in a 
> re-entrant fashion, so a thread that already has a reader lock can be 
> blocked when attempting to get a second reader lock due to a waiting 
> writer.  This thread in turn blocks the writer thread since it already 
> holds a reader lock.
> 
> I have proposed a solution to the NSPR developers that would allow an 
> attempt to get a reader lock to go through even if a writer is waiting 
> if the requesting thread already has another reader lock.  I'm hoping 
> that this can be resolved in NSPR, otherwise we may have to change DS to 
> use the pthread_rwlock_* interfaces instead.
> 
> The sleep is a temporary workaround.  This issue should not arise in 
> normal operation since the lock in question is around the backend 
> struct, which is only modified when there is some sort of database 
> maintenance operation (such as the reindexing task that Rob triggered it 
> with).
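
[Illustration of the deadlock described above. This is a minimal Python
sketch of a toy writer-preferring reader-writer lock, not NSPR's or DS's
actual code. The class name, the allow_reentrant_readers flag, and the
helper nested_read_succeeds() are all hypothetical and exist only for this
example. Timeouts stand in for what would be a permanent hang in the real
server. With the flag off, a thread that already holds a reader lock cannot
take a second one while a writer waits; with the flag on (the behavior
proposed to the NSPR developers), the nested request goes through.]

```python
import threading
import time


class WriterPreferringRWLock:
    """Toy reader-writer lock that favors writers (assumed model, not NSPR)."""

    def __init__(self, allow_reentrant_readers=False):
        self._cond = threading.Condition()
        self._readers = {}            # thread id -> count of read locks held
        self._writer_active = False
        self._writers_waiting = 0
        self._allow_reentrant = allow_reentrant_readers

    def writers_waiting(self):
        with self._cond:
            return self._writers_waiting

    def acquire_read(self, timeout=None):
        me = threading.get_ident()

        def can_read():
            if self._writer_active:
                return False
            if self._writers_waiting == 0:
                return True
            # A writer is waiting. Let the caller through only if the
            # proposed fix is enabled and it already holds a read lock.
            return self._allow_reentrant and self._readers.get(me, 0) > 0

        with self._cond:
            if not self._cond.wait_for(can_read, timeout):
                return False
            self._readers[me] = self._readers.get(me, 0) + 1
            return True

    def release_read(self):
        me = threading.get_ident()
        with self._cond:
            self._readers[me] -= 1
            if self._readers[me] == 0:
                del self._readers[me]
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        with self._cond:
            self._writers_waiting += 1
            try:
                # The writer needs every reader gone, including nested ones.
                ok = self._cond.wait_for(
                    lambda: not self._writer_active and not self._readers,
                    timeout)
                if ok:
                    self._writer_active = True
                return ok
            finally:
                self._writers_waiting -= 1
                self._cond.notify_all()

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()


def nested_read_succeeds(allow_reentrant_readers):
    """Take a read lock, queue a writer, then try a nested read lock."""
    lock = WriterPreferringRWLock(allow_reentrant_readers)
    lock.acquire_read()                           # outer reader lock
    writer = threading.Thread(target=lock.acquire_write, args=(2.0,))
    writer.start()
    while lock.writers_waiting() == 0:            # wait until writer queues
        time.sleep(0.01)
    got_second = lock.acquire_read(timeout=0.2)   # nested reader lock
    if got_second:
        lock.release_read()
    lock.release_read()                           # lets the writer proceed
    writer.join()
    return got_second
```

[Calling nested_read_succeeds(False) models the current behavior: the
nested request times out because the waiting writer blocks it, while the
writer itself is blocked by the outer reader lock. Calling
nested_read_succeeds(True) models the proposed change.]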

Nathan,
thanks for the explanation, very much appreciated.

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York



