
Re: broken database locking (at least in rpm 4.0.4)



On Fri, 2002-07-26 at 14:33, Jeff Johnson wrote:
> > To test, you'll need two terminals.  In one, as root, run:
> > 
> > # ./testcase
> > 
> > In the other, run
> > 
> > $ rpm -qa
> > 
> > You should get a locking error.
> > 
> > Now, ^C the testcase, and run it again, as:
> > 
> 
> Here's a reasonable guess (not having looked at testcase) at what's happening:
> 
> The ^C leaves a stale lock in /var/lib/rpm/__db* ...

Not in my testing.  Nor is it really necessary; it was merely for
comparison purposes.  Clean out any stale lock files from /var/lib/rpm,
and then proceed directly to step 2:

> > # ./testcase 1
> 
> ... but (as currently implemented) the __db* files are removed at next root
> open ...
> 
> > 
> > In the other, run
> > 
> > $ rpm -qa
> > 
> 
> ... so the query runs to completion.

More to the point, after running "./testcase 1" (which has opened the
database O_RDWR), you can run rpm commands in another terminal that
actually modify the database, even though, in a more fleshed-out
example, the first program could have been in the middle of its own
transaction.  This is what happened to me.

> > It will work, which is bad, I think.  Note that instead of that being a
> > -qa, it could be a -i.  Even if the other process were in the middle of
> > a transaction.  Which will corrupt the rpm database.
> 
> 
> Can you characterize corruption please? Duplicate headers? Damaged headers?
> Database doesn't db_verify? If duplicates, then I need a write-write lock
> (as hinted at below). If damaged headers, then I need to certify unloaded
> headers to pinpoint that, indeed, the header is passed to Berkeley DB
> correctly, but then I probably need to prevent the damage (by taking a lock?) at
> the rpm level.  If db_verify failure, then something else (CDB?) is boogered.

Damaged headers.  There was one transaction running, which was upgrading
~100 packages, and I used the rpm CLI to remove a single package.  The
removal appeared to delete the package's files successfully, but it
printed errors (sorry, I didn't think to copy them), and the database
needed an rpm --rebuilddb to recover (which ended up removing the
package's entire entry from the database).
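
For illustration only, here is roughly what I mean by a write-write lock
at the application level.  This is a minimal sketch, not how rpm or
Berkeley DB actually does it: the lock file name (txn.lock) and the
helpers try_write_lock() and release() are made up for the example, and
it uses flock() advisory locking, which only protects against other
processes that take the same lock.

```python
import fcntl
import os
import tempfile


def try_write_lock(path):
    """Try to take an exclusive, non-blocking advisory lock on `path`.

    Returns the open file descriptor on success, or None if some other
    holder already has the lock.  (Hypothetical helper for illustration.)
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_EX: exclusive (writer) lock; LOCK_NB: fail instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


# Hypothetical lock file in a scratch directory, NOT a real rpm path.
lock_path = os.path.join(tempfile.mkdtemp(), "txn.lock")

first = try_write_lock(lock_path)    # stands in for the long transaction
second = try_write_lock(lock_path)   # stands in for the second writer
release(first)
third = try_write_lock(lock_path)    # retry after the first writer is done
```

The idea is that the second writer is refused while the first still
holds the lock, and gets in only after it has been released.  With
something like this held for the whole transaction, my "rpm -e" would
have been turned away instead of damaging headers.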

Ian

-- 
Ian Peters <itp@ximian.com>
Ximian, Inc.




