
Re: --retry <num> and --sleeptime <num>



On Mon, Jul 23, 2001 at 01:46:50PM +0200, Peter A Jonsson wrote:
> > There's a better way IMHO. The Berkeley DB permits a CDB model
> > (Concurrent DataBase) that has finer grained locking than the exclusive
> > lock traditionally used by rpm. I know that CDB works fine for the rpmdb, but
> > other mechanisms are gonna be needed to synchronize rpm install operations that
> > currently implicitly assume that an exclusive lock on the database provides
> > a mutex. Anyways a CDB model will permit concurrent access from, say, an
> > rpm query run from a %post scriptlet, and should remove the need to
> > sleep while waiting for an exclusive lock.
> 
> I have been reading some parts of the docs on sleepycat.com for db. This 
> concurrent db sounds interesting to me. Changing to the cdb-model seems to 
> involve a few things:
> 
> * Creating/Opening the db with the DB_INIT_CDB and DB_INIT_MPOOL. 
> * Making sure no cursors are read/write when already holding a cursor with 
> read/write access.
> * Make sure the error return codes are checked for.
> * Make sure no cursors are open when doing put/del.
> 
> Am I missing something?

No, the above is basically what is needed to use a CDB model. Most of it is
already in rpm-4.0.3, largely untested.

rpm-specific issues include:

- changing ownership of rpmdb files to permit group write access in order
to create locks. rpm-4.0.3 rpmdb files now have g+w rpm.rpm ownership.

- creating a two tiered locking mechanism, so that legacy rpm apps are locked
out when a CDB model with fine grained locks is used. Again, in rpm-4.0.3.

- adding locking to other non-rpmdb operations which have traditionally
implicitly relied on the rpmdb exclusive lock, rather than explicitly locking
the operation.

- assessing the risk of deadlocks. This is gonna be the rate limiting step
in switching to a CDB model.

> 
> Some creative use of grep makes me believe that lib/db3.c lib/depends.c and 
> lib/rpmdb.c could need modification. Any other file? Is this something wanted 
> in rpm?
> 

Most of the db work is done, adding locking to the rest of rpm is what's
gonna be needed. I hope to support a fullblown --apply/--commit/--undo
package management methodology, where --apply installs all files with
a temporary name, --commit renames into place, and --undo erases the
temporary files. This work is just starting ...
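
A minimal sketch of the --apply/--commit/--undo idea described above, not rpm
code. The function names and the ".rpmtmp" suffix are hypothetical, chosen
only for illustration; the sketch assumes each temporary file lives next to
its final path so the rename stays on one filesystem.

```python
import os

TMP_SUFFIX = ".rpmtmp"  # hypothetical suffix, not taken from rpm


def apply_files(files):
    """Write each payload under a temporary name next to its final path."""
    staged = []
    for path, data in files.items():
        tmp = path + TMP_SUFFIX
        with open(tmp, "wb") as fh:
            fh.write(data)
        staged.append((tmp, path))
    return staged


def commit_files(staged):
    """Rename every staged file into place."""
    for tmp, path in staged:
        os.replace(tmp, path)  # atomic on POSIX within one filesystem


def undo_files(staged):
    """Erase staged temporary files that have not been committed."""
    for tmp, _path in staged:
        if os.path.exists(tmp):
            os.remove(tmp)
```

Until commit_files() runs, the existing files are untouched, so undo_files()
only has to delete the temporary names.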

> However, this does not solve the "problem" with two rpm's trying to write to 
> the same db as far as I can understand. It will still die unconditionally. 
> Several processes can read from the db at the same time but only one can 
> write. See below for more discussion.
> 

Hmmm, what's wrong with your assumed implementation is that an rpmdb (at the
moment) is per-machine. There's no way (or need) to have several machines
writing entries to the same database, as the entries apply only to a single
machine anyways.

Meanwhile, having run a CDB model with multiple instances of loops over
a single package install/query/erase cycle, I disagree with "die unconditionally".
The CDB model handles this fine. What needs work is to close the other
racy windows within rpm, as I had thousands of entries in the database for
the single package when I was done. This is a problem with how rpm uses the
database, not with the Berkeley CDB model.

> > There's nothing other than rpm that calls rpmInstall() that I'm aware of.
> > Meanwhile, that interface is changing anyways in rpm-4.0.3, as rpm install
> > and erase modes need to be merged into a common routine. The new args will look
> > much more like the rpmQuery()/rpmVerify() interface when I'm done.
> 
> -- Rearrangement --
> 
> > There are other problems with several machines accessing an rpmdb
> > simultaneously, not the least of which is that Berkeley DB does not
> > support locking across NFS. I believe the fcntl scheme might "work"
> > across NFS, haven't looked seriously at all. There's also the Berkeley DB RPC
> > that could be used for several machines accessing an rpmdb. I believe that
> > most of the pieces are there, but, again, I haven't looked seriously at
> > using, only at configuring.
> 
> Without looking at the code, can there be an option put in for retrying to 
> lock the db (if the first lock fails) after a delay in that API? If the 
> parameters have sane values (maybe through a macro) it should have very little 
> impact on the other code. An alternative could be to wait until the lock is 
> gotten, I don't know if that is what one wants though. Something to the 
> commandline so it doesn't have a default behaviour that is dangerous. I am 
> open for suggestions.
> 
> The locking over NFS is flakey (to say the least) in several operating systems. 
> Another possible usage for the feature of retrying the db-opening is that 
> several build-/install-scripts might be running at the same time, either 
> building from different pools or building several rpms due to the fact that 
> the load on the machine is low (ie it is nighttime). Of course this mainly 
> applies to SMP-machines. Veritas has a cluster filesystem, etc..
> 

What's wrong with a timeout/retry mechanism to acquire a coarse grained lock
is that there's no obvious and/or easy way to guess an appropriate value
for the timeout. Better to use fine grained locks, which should have more
predictable estimates for the duration of the locking operation. And far
better still is to let a db3 CDB model manage its own locks with its own
retry mechanism.
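
For reference, the kind of retry/sleep loop under discussion looks roughly
like the sketch below. It is an illustration only, not rpm's locking code:
the function name is hypothetical, the two parameters simply mirror the
--retry and --sleeptime options from the subject line, and flock() stands in
for whatever lock the database actually takes.

```python
import fcntl
import time


def lock_with_retry(fh, retries, sleeptime):
    """Try a non-blocking exclusive flock, sleeping between attempts.

    Returns True once the lock is held, False after the first attempt
    plus `retries` further attempts have all failed.
    """
    for attempt in range(retries + 1):
        try:
            fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except BlockingIOError:
            # Someone else holds the lock; wait before the next attempt.
            if attempt < retries:
                time.sleep(sleeptime)
    return False
```

The caller has to pick retries and sleeptime without knowing how long the
holder of a coarse grained lock will keep it, which is exactly the guessing
problem described above.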

Again, an rpm database is per-machine, and only install/erase modes need write
access. The CDB model permits multiple readers while an install/upgrade is
writing, and writers will be single threaded for other reasons. SMP is not
an issue until rpmlib is threaded, but that's a whole different can of worms.
What's needed is an alternative locking mechanism to the traditional
dependence on the exclusive database lock to protect other rpm operations.
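
One possible shape for such an alternative, sketched under assumptions: a
dedicated lock file, separate from the database, that only install/erase
take, so queries never wait on it. The name install_mutex and the lock file
path are hypothetical, and flock() is used purely for illustration.

```python
import contextlib
import fcntl


@contextlib.contextmanager
def install_mutex(lock_path):
    """Hold an exclusive lock on a dedicated lock file for the block's duration."""
    with open(lock_path, "a") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until no other writer holds it
        try:
            yield
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

An install or erase would wrap its non-database work in
"with install_mutex(path):", while a query would skip the mutex and rely on
the database's own locking.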

> I believe that there could be uses for several rpm's trying to write to the 
> same db at the same time. The user might be one of those who happens to have 
> some odd hardware and wants to do something odd.
> 
> > Bug reports and enhancement requests in bugzilla
> > 	http://bugzilla.redhat.com
> > please, as I sometimes abuse the delete key when reading mail.
> 
> Ok, will do.
> 
> > You might also look at patching against rpm-4.0.3 from the rpm-4_0
> > branch in CVS
> 
> My firewall blocks CVS. Are there any snapshots available via ftp/http? Will 
> the ones in pub/rpm/test-4.0.3 do or are the changes in CVS too big? Will 
> patches against 4.0.3 be much more helpful to you than 4.0.2?
> 

rpm is available through Raw Hide, daily at the moment. Grab the src rpm.
Recent versions of rpm-4.0.3 include
	/etc/rpm/macros.cdb
Uncomment the one line config, do a --rebuilddb, and you will be running
an EXPERIMENTAL BUGGY version of a CDB model. Apologies for shouting,
but I want to make it crystal clear that this code is not cooked yet.

Lots of the database code has been changed in rpm-4.0.3, which has an
internal copy of db-3.3.4.

73 de Jeff

-- 
Jeff Johnson	ARS N3NPQ
jbj@jbj.org	(jbj@redhat.com)
Chapel Hill, NC




