Re: broken database locking (at least in rpm 4.0.4)
- From: Jeff Johnson <jbj redhat com>
- To: rpm-list redhat com
- Subject: Re: broken database locking (at least in rpm 4.0.4)
- Date: Fri, 26 Jul 2002 14:33:33 -0400
On Fri, Jul 26, 2002 at 01:14:33PM -0400, Ian Peters wrote:
> The attached test program demonstrates -- once you've opened the rpm
> database once, and then closed it, successive openings (with O_RDWR)
> will not lock the database.
>
> To compile the attached program, run
>
> gcc -o testcase -I/usr/include/rpm -lrpm -lrpmdb -lrpmio -lpopt
> testcase.c
>
Oooh, a reproducible bug, and with a reasonable analysis. Thank you *very* much.
> To test, you'll need two terminals. In one, as root, run:
>
> # ./testcase
>
> In the other, run
>
> $ rpm -qa
>
> You should get a locking error.
>
> Now, ^C the testcase, and run it again, as:
>
Here's a reasonable guess (not having looked at testcase) at what's happening:
The ^C leaves a stale lock in /var/lib/rpm/__db* ...
> # ./testcase 1
... but (as currently implemented) the __db* files are removed at next root
open ...
>
> In the other, run
>
> $ rpm -qa
>
... so the query runs to completion.
The answer will be signal handling in rpmlib to close all open cursors
(i.e. no stale locks in __db files), privilege separation through a
setgid helper to permit non-root write access to __db files, and (finally)
removing the current (and deficient) unlink of __db files.
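
A minimal sketch of the signal-handling half of that, under stated assumptions:
this is not rpmlib code, and install_cleanup_handlers() and pending_signal()
are hypothetical names. The handler only records the signal; the caller is
expected to notice it at a safe point and close its cursors and database
before exiting, so no lock is left behind.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Last signal seen, or 0 if none. Written only from the handler. */
static volatile sig_atomic_t caught_signal = 0;

/* Async-signal-safe handler: record the signal number and return. */
static void note_signal(int signo)
{
    caught_signal = signo;
}

/* Install the recording handler for the usual termination signals.
 * Returns 0 on success, -1 if sigaction() fails (errno set). */
int install_cleanup_handlers(void)
{
    static const int signals[] = { SIGINT, SIGTERM, SIGHUP };
    struct sigaction sa;
    size_t i;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);

    for (i = 0; i < sizeof(signals) / sizeof(signals[0]); i++) {
        assert(signals[i] > 0);
        if (sigaction(signals[i], &sa, NULL) < 0)
            return -1;
    }
    return 0;
}

/* Non-zero once a termination signal has arrived. */
int pending_signal(void)
{
    return caught_signal;
}
```

A caller would poll pending_signal() between database operations and, when it
is non-zero, close every open cursor and the database before exiting.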
> It will work, which is bad, I think. Note that instead of that being a
> -qa, it could be a -i. Even if the other process were in the middle of
> a transaction. Which will corrupt the rpm database.
Can you characterize the corruption, please? Duplicate headers? Damaged headers?
Database doesn't db_verify? If duplicates, then I need a write-write lock
(as hinted at below). If damaged headers, then I need to certify unloaded
headers to pinpoint that, indeed, the header is passed to Berkeley DB
correctly, but then I probably need to prevent the damage (by taking a lock?) at
the rpm level. If db_verify failure, then something else (CDB?) is boogered.
I'll verify by running the steps above in a bit, but here are some quick notes
regarding what I think are the open issues that need to be solved:
>
> Is this known and/or intentional? I've worked around it by using fcntl
> to manually lock /var/lib/rpm/Packages after rpmdbOpen, but... should I
> file a bug?
Now that I've removed the fcntl lock on Packages, I almost certainly
need to restore exclusive locking for installing. The open question
is at what granularity.
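
For reference, a minimal sketch of the kind of fcntl workaround described in
the quoted text, assuming a whole-file exclusive advisory lock. This is not
rpm's code: lock_file_exclusive() and unlock_file() are hypothetical names,
and the path to lock is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Try to take an exclusive advisory lock on the whole file at `path`
 * without blocking. Returns an open descriptor that holds the lock,
 * or -1 on failure with errno set (EACCES or EAGAIN if another
 * process already holds a conflicting lock). */
int lock_file_exclusive(const char *path)
{
    struct flock fl;
    int fd, saved;

    assert(path != NULL);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;    /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "through end of file" */

    if (fcntl(fd, F_SETLK, &fl) < 0) {
        saved = errno;      /* keep fcntl's errno across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Closing the descriptor releases the lock. */
void unlock_file(int fd)
{
    if (fd >= 0)
        close(fd);
}
```

Note that these locks are advisory and per-process: they only stop other
processes that also ask for a lock, and closing any descriptor for the same
file within the locking process drops the lock.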
I'm hoping to do this explicitly, through other means than fcntl locking on
Packages, in order to permit concurrent access. Taking out an fcntl lock
as before will "work", but I'd like to permit (at least) reading of
the database in %post. If I can get concurrent writes, then package bundles
(i.e. a meta-package with rpm -Uvh run in %post) become possible.
FYI, one approach might be a write cursor on instance #0 in Packages.
That should (will?) stop rpmdbAdd().
FWIW What I *really* needed to know was whether CDB was reliable.
For my purposes the lack of reports of segfaults is very, very promising.
Deadlock issues remain to be figgered. I'm betting that a write-write lock
at the rpmtsRun() level, released and reacquired when running scriptlets,
is a viable scheme.
You might as well open a new (and publicly accessible) bug for tracking.
I should have a fix (even if reestablishing the fcntl lock on Packages)
sometime next week.
HTH
73 de Jeff
--
Jeff Johnson ARS N3NPQ
jbj@redhat.com (jbj@jbj.org)
Chapel Hill, NC