[Freeipa-devel] [PATCH] 294 sleep before doing a task
Simo Sorce
ssorce at redhat.com
Thu Oct 15 16:08:38 UTC 2009
On Thu, 2009-10-15 at 08:15 -0700, Nathan Kinder wrote:
> On 10/15/2009 06:40 AM, Simo Sorce wrote:
> > On Thu, 2009-10-15 at 15:28 +0200, Pavel Zuna wrote:
> >
> >> Rob Crittenden wrote:
> >>
> >>> One of the last steps of an install is to run through any updates. This
> >>> change adds a sleep() prior to calling tasks to ensure postop writes are
> >>> done
> >>>
> >>> We were seeing a rare deadlock of DS when creating the memberOf task
> >>> because one thread was adding memberOf in a postop while another was
> >>> trying to create an index and this was causing a PRLock deadlock.
> >>>
> >>> rob
> >>>
> >>>
> >> sleep might not be the best synchronization mechanism out there, but I think
> >> that in this case it is pretty much the only one available and it gets the job
> >> done, so ack.
> >>
> > So are we covering a DS bug here ? Or are we doing an asynchronous ldap
> > request when we should do a synchronous one and wait for it to finish
> > (I've fixed another place where we were doing that and racing against
> > our own requests) ?
> >
> It has nothing to do with the way you are performing your LDAP operations.
>
> The issue really stems from what I consider to be a bug in NSPR's
> implementation of reader-writer locks. It is documented that a single
> thread can hold multiple reader locks safely, but I've found that to not
> exactly be the case. The NSPR implementation favors writers, so a
> thread waiting for the writer lock will block attempts by other threads
> to get a reader lock. The problem is that we use reader locks in a
> re-entrant fashion, so a thread that already has a reader lock can be
> blocked when attempting to get a second reader lock due to a waiting
> writer. This thread in turn blocks the writer thread since it already
> holds a reader lock.
>
> I have proposed a solution to the NSPR developers that would allow an
> attempt to get a reader lock to go through even if a writer is waiting
> if the requesting thread already has another reader lock. I'm hoping
> that this can be resolved in NSPR, otherwise we may have to change DS to
> use the pthread_rwlock_* interfaces instead.
>
> The sleep is a temporary workaround. This issue should not arise in
> normal operation since the lock in question is around the backend
> struct, which is only modified when there is some sort of database
> maintenance operation (such as the reindexing task that Rob triggered it
> with).
Nathan,
thanks for the explanation, very much appreciated.
Simo.
--
Simo Sorce * Red Hat, Inc * New York
More information about the Freeipa-devel
mailing list