
Re: Script to detect conflicting files in PATH within a yum repo (was Re: conflict between libotf and openmpi)



On Thu, 2009-09-17 at 14:14 -0400, Seth Vidal wrote:
> 
> On Thu, 17 Sep 2009, David Malcolm wrote:
> 
> > On Thu, 2009-09-17 at 09:57 -0400, Seth Vidal wrote:
> >>
> >> On Wed, 16 Sep 2009, David Malcolm wrote:
> >>
> >>> On Wed, 2009-09-16 at 18:45 -0400, Neal Becker wrote:
> >>>> Which makes me wonder, how could this conflict have been avoided?  Is there
> >>>> a tool that would check any new package to see if any object* in it would
> >>>> conflict with any existing package?  If not, sounds like a good thing to
> >>>> have.
> >>>>
> >>>> * Here, object means filesystem object.  I'm not sure if there are any other
> >>>> types of objects to worry about.
> >>> Brainstorming: a script that walks the yum repo's filelists.xml.gz, and
> >>> figures out a list of filename collisions, filtering by directories in
> >>> the default PATH
> >>>
> >>>
> >>> Attached is a first pass at a python script that does this.
> >>>
> >>> Output from the script when run upon [1] is below.  Caveat: the script
> >>> probably has bugs.
> >>>
> >>> Does this look useful?
> >>
> >> David,
> >>   Yes it does look useful.
> >>
> >> I wrote something similar:
> >>
> >> http://skvidal.fedorapeople.org/misc/potential_conflict.py
> >>
> >> which is what I believe autoqa is starting from for their file conflict
> >> checker.
> > Aha!  Your approach looks superior, as you're leveraging all that extra
> > info from the RPM headers about file hashes etc.  Thanks.
> >
> 
> maybe a bit more thorough but I wouldn't call it superior - it takes 
> forever to complete b/c you have to look at all those headers :(
> 
> Magic alternatives welcome.

Well, define "forever"... my script takes about 30 seconds (and ~1GB
RAM) on my workstation; if that's a big improvement over your runtimes,
perhaps you could try a hybrid approach: walk the filelists.xml.gz to
quickly find possible conflicts, then open the rpm headers only as
needed to reject the false conflicts?  Dunno
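
For illustration, here is a rough sketch of what the fast first stage
of such a hybrid could look like. This is not the attached script or
potential_conflict.py; the function name find_path_collisions and the
PATH_DIRS set are made up for the example, and it assumes the usual
filelists.xml layout (<package name=...> elements containing <file>
children in the http://linux.duke.edu/metadata/filelists namespace).
It only compares paths, so everything it reports is a candidate that
still needs checking against the rpm headers.

```python
import gzip
import posixpath
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by yum's filelists metadata.
NS = "{http://linux.duke.edu/metadata/filelists}"

# Assumed set of default PATH directories; adjust to taste.
PATH_DIRS = frozenset(
    ["/bin", "/sbin", "/usr/bin", "/usr/sbin",
     "/usr/local/bin", "/usr/local/sbin"]
)


def find_path_collisions(fileobj, path_dirs=PATH_DIRS):
    """Map each file path in path_dirs to the package names that ship it.

    Only paths claimed by more than one distinct package name are
    returned. Directory entries are skipped. Builds of the same package
    name for different arches count as one owner.
    """
    owners = defaultdict(set)
    # iterparse streams the document, so the whole tree is never held.
    for _event, elem in ET.iterparse(fileobj, events=("end",)):
        if elem.tag != NS + "package":
            continue
        name = elem.get("name")
        for entry in elem.iter(NS + "file"):
            if entry.get("type") == "dir":
                continue
            path = entry.text or ""
            if posixpath.dirname(path) in path_dirs:
                owners[path].add(name)
        elem.clear()  # drop this package's children once processed
    return {p: sorted(n) for p, n in owners.items() if len(n) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py /path/to/repodata/filelists.xml.gz
    with gzip.open(sys.argv[1], "rb") as fh:
        for path, pkgs in sorted(find_path_collisions(fh).items()):
            print("%s: %s" % (path, ", ".join(pkgs)))
```

The second stage would then open rpm headers only for the packages
named in that output, and drop the pairs whose file hashes match or
which already declare a Conflicts against each other.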

[snip content-addressed storage/hashing ideas]

Dave

