Re: SUG: Automatic RPM database verification and repair

At 4:41 PM -0500 12/3/06, Jeff Johnson wrote:
>On Dec 3, 2006, at 3:49 PM, Tony Nelson wrote:
>>Umm, in order to help me when I try to read yum to understand what it is
>>doing, are you saying that yum is copying some data from RPM and then
>>assuming that it will not change across unlocking / relocking the
>>database? That is, it uses possibly stale data, but not invalid
>>iterators? And please pardon my ignorance of Berkeley DB; I will work to
>>learn about it.
>This code from yum/rpmUtils/transaction.py is what all the fuss is about

Thank you for pointing me to the offending part of yum.  It will help me
focus on the problem, and possibly come up with a solution acceptable to
yum's developers.

>1.24         (mjs      03-Sep-06):         self.open = True
>1.24         (mjs      03-Sep-06):
>1.24         (mjs      03-Sep-06):     def __del__(self):
>1.24         (mjs      03-Sep-06):         # Automatically close the rpm transaction when the reference is lost
>1.24         (mjs      03-Sep-06):         self.close()
>1.24         (mjs      03-Sep-06):
>1.24         (mjs      03-Sep-06):     def close(self):
>1.24         (mjs      03-Sep-06):         if self.open:
>1.24         (mjs      03-Sep-06):             self.ts.closeDB()
>1.24         (mjs      03-Sep-06):             self.ts = None
>1.24         (mjs      03-Sep-06):             self.open = False
>That overloads a transaction object with a lazy open/close
>of an rpmdb in order to handle ^C.

Destructors are really hard to get right in Python, because if there is a
reference loop the garbage collector zaps the objects in it in arbitrary
order.  This one looks OK on the surface, as it only releases a resource,
but I'll look harder.  I'd actually have expected the TransactionSet would
already do this on its own when its last reference disappeared; I'll look
at its code later.
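
To make the destructor behaviour concrete, here is a minimal sketch of the
quoted close()/__del__ pattern. FakeTransactionSet and TransactionWrapper are
hypothetical names standing in for rpm's TransactionSet and yum's wrapper
class; the stand-in only counts closeDB() calls and does not touch a real
rpmdb. The getattr() guard is my addition, not something in the quoted code.

```python
class FakeTransactionSet:
    """Hypothetical stand-in for rpm's TransactionSet; counts closeDB() calls."""

    def __init__(self):
        self.close_calls = 0

    def closeDB(self):
        self.close_calls += 1


class TransactionWrapper:
    """Same shape as the quoted yum code: close lazily, at most once."""

    def __init__(self, ts):
        self.ts = ts
        self.open = True

    def __del__(self):
        # Runs when the object is reclaimed, as ordinary Python code.
        self.close()

    def close(self):
        # getattr() guards against __del__ running on an object whose
        # __init__ never finished (an assumption added for this sketch).
        if getattr(self, "open", False):
            self.ts.closeDB()
            self.ts = None
            self.open = False


# Explicit close, then a second close and the destructor: only one closeDB().
fake = FakeTransactionSet()
wrapper = TransactionWrapper(fake)
wrapper.close()
wrapper.close()
del wrapper

# No explicit close: dropping the last reference triggers __del__ (in CPython,
# where reference counting reclaims the object right away).
other = FakeTransactionSet()
tmp = TransactionWrapper(other)
del tmp
```

The open flag is what makes close() safe to call from both the caller and the
destructor. The "right away" part depends on CPython's reference counting;
other interpreters may run __del__ later.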

>While I think the code is insanely clever, and I'm quite pleased
>that rpm is surviving as well as it is in spite of unanticipated
>and bizarre uses of the implementation, the code is solving
>entirely the wrong problem, and triggering a lot of instability.

I don't quite see that this code is a problem.  It only runs at task time
(not at signal time), when the object is being reclaimed by the Python
interpreter.  It should be like any other call to close the database.  It
can't happen inside a call to rpmlib (unless there is more than one
thread).  Am I misunderstanding the issue?
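
As a rough illustration of the "task time" point, the sketch below logs the
order of events when a ^C arrives. Everything here is a hypothetical
stand-in: FakeTS.run() plays the part of a call into rpmlib, and the
interrupt is simulated by raising KeyboardInterrupt directly, which is what
Python's default SIGINT handler does once control is back in the
interpreter.

```python
events = []


class FakeTS:
    """Hypothetical stand-in for the rpm transaction set."""

    def run(self):
        # Plays the part of a call into rpmlib that runs to completion.
        events.append("rpmlib call starts")
        events.append("rpmlib call returns")

    def closeDB(self):
        events.append("closeDB")


class Wrapper:
    def __init__(self):
        self.ts = FakeTS()
        self.open = True

    def __del__(self):
        self.close()

    def close(self):
        if self.open:
            self.ts.closeDB()
            self.ts = None
            self.open = False


def work():
    w = Wrapper()
    w.ts.run()
    # Simulated ^C: the interpreter raises this between Python operations,
    # not in the middle of the extension call above.
    raise KeyboardInterrupt


try:
    work()
except KeyboardInterrupt:
    events.append("interrupt caught")
```

In this sketch closeDB is logged only after the simulated rpmlib call has
returned, and only once, because the destructor runs when the interpreter
reclaims the wrapper rather than from inside a signal handler.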

>Here was the first indication I received:
>     https://lists.dulug.duke.edu/pipermail/rpm-devel/2006-November/

I've read this thread; I will try to understand it better with time.

>Here is part of the history (which goes back even further)
>     http://www.archivesat.com/


>And there is another recent post by Panu (which I can't find, from
>11/2006) identifying the overloading as a cause of worse performance in
>yum (even though the claim is faster!).
>The slowdown is rather easy to understand: reopening a Berkeley DB is
>not exactly a cheap operation.

AIUI, yum does that less often now than did the version in FC6 test
releases, and Yum's developers realize that it is a trigger for the
problem.  Whether or not it is a bug in RPM or Berkeley DB or yum that is
being triggered, it is not necessary to do it, or even particularly helpful
in order to get SIGINT to work usefully.  I hope to develop a solution that
is acceptable to RPM and to Yum.  We'll see what I manage.
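
One possible shape for handling SIGINT without repeated close/reopen is
sketched below: open once, do the work inside an explicit scope, and close
exactly once on the way out, interrupted or not. This is only an
illustration under my own assumptions, not what yum does today and not
necessarily what I will end up proposing. FakeTS and open_rpmdb are
hypothetical names, and the ^C is simulated by raising KeyboardInterrupt.

```python
from contextlib import contextmanager


class FakeTS:
    """Hypothetical stand-in for the rpm transaction set."""

    def __init__(self):
        self.closes = 0

    def closeDB(self):
        self.closes += 1


@contextmanager
def open_rpmdb(ts):
    """Hypothetical helper: keep the database open for the whole block."""
    try:
        yield ts
    finally:
        # Runs on normal exit and on KeyboardInterrupt alike.
        ts.closeDB()


ts = FakeTS()
steps_done = 0
interrupted = False
try:
    with open_rpmdb(ts) as db:
        for step in range(3):
            if step == 1:
                raise KeyboardInterrupt  # simulated ^C during the second step
            steps_done += 1
except KeyboardInterrupt:
    interrupted = True
```

The database is closed once, at a well-defined point, and nothing relies on
when the garbage collector gets around to the wrapper.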


I think it does, thanks.
TonyN.:'    The Great Writ     <mailto:tonynelson georgeanelson com>
      '      is no more.             <http://www.georgeanelson.com/>
