
Re: rpmreaper

On Wed, 4 Jun 2008, Steve Grubb wrote:

On Wednesday 04 June 2008 02:55:19 Panu Matilainen wrote:
On Tue, 3 Jun 2008, Steve Grubb wrote:
Not really. Bash has been patched to spit out the programs it calls
(/bin/bash --rpm-requires). So, it's a matter of overriding
%__find_requires to run a program that gathers the information for shell
scripts and falls back to the old way for others.

No one should have to specify this, it can be automated easily. Without
taking shell scripts into account, you run the risk of breaking
unspecified requirements.

I wish it were that simple.

"bash --rpm-requires" does a fair job for the impossible task, but it
produces way too much bogus information and false positives to be
generally usable as is. A quick check at various scripts found on a stock
F9 system shows at least these problems:

1) It mistakes functions declared in sourced scripts as executables
2) It mistakes functions used before declared as executables

In my opinion, these ^^ should be fixed.

Yup, that'd be the first step in making --rpm-requires actually usable beyond just curiosity.

3) It thinks of sourced scripts as executables

In a sense, they are. My init scripts source /etc/init.d/functions, so that is
a real dependency.

It's a real dependency, yes, but sourced files need not be *executable*; they just need to be there. Whether the difference matters depends on later implementation details: if PATH or the executable bits of files are involved, sourced files need to be separated from executables, with a file(/some/path) notation or such.
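To make the distinction concrete, here is a rough sketch of how a dependency generator might label the two cases differently. It is only an illustration: the function name classify_line is made up, it looks at one simple command per line, and it knows nothing about builtins, keywords, pipes or functions. The file(...) label is just the "or such" notation suggested above, not anything rpm defines.

```python
import shlex


def classify_line(line):
    """Return a dependency string for one simple shell command line, or None.

    Hypothetical helper: handles only a single plain command per line.
    """
    words = shlex.split(line, comments=True)
    if not words:
        # Blank line or comment only.
        return None
    if words[0] in (".", "source") and len(words) > 1:
        # Sourced file: it has to exist, but need not be executable.
        return "file(%s)" % words[1]
    # Everything else is treated as a command that must be executable.
    return "executable(%s)" % words[0]


print(classify_line(". /etc/init.d/functions"))
print(classify_line("grep -q foo /etc/passwd"))
```

With labels like these, a later step could check executable bits and PATH only for the executable(...) items and mere existence for the file(...) ones.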

4) It produces hard dependencies for conditional items

I agree this is a problem. I think it gets worse the further nested a program
would be in if statements. But as a first pass, one could fix it to only check
files not within an if statement and add logic later to go deeper. Something
is better than nothing as right now we do not capture shell script
dependencies and they *are* real.

Ignoring dependencies from all conditional execution paths (except constant conditions like "while [ 1 -eq 1 ]") is the only 100% correct and safe thing you can do. Beyond that, bash simply cannot know whether something is a hard dependency or not at package build time. So either the conditional paths are ignored, or you live with the fact that you WILL need to filter out dependencies manually.

If bash could classify its findings into conditional and unconditional, that'd at least make life easier for the human filtering the deps.
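As a toy illustration of that kind of classification (not how bash does or would do it), the sketch below tracks the nesting depth of compound commands and files each command name under "conditional" or "unconditional". Everything here is an assumption made for the example: the function name, the one-command-per-line restriction, and the requirement that keywords start their own line. It does not handle && and ||, functions, sourced files, or constant conditions like "while [ 1 -eq 1 ]".

```python
# Keywords that open or close a block whose body may never run.
OPENERS = {"if", "case", "while", "until", "for"}
CLOSERS = {"fi", "esac", "done"}
# Keywords that neither change depth nor name a command.
SKIP = {"then", "else", "elif", "do", ";;"}


def classify(script):
    """Split command names into conditional and unconditional lists."""
    depth = 0
    found = {"unconditional": [], "conditional": []}
    for line in script.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue
        first = words[0]
        if first in OPENERS:
            depth += 1
        elif first in CLOSERS:
            depth = max(0, depth - 1)
        elif first in SKIP or first.endswith(")"):
            # Plain keyword, or a case pattern such as "start)".
            continue
        else:
            key = "conditional" if depth else "unconditional"
            found[key].append(first)
    return found


example = """\
modprobe foo
if [ -x /usr/bin/rhgb-client ]; then
    rhgb-client --ping
fi
mount -a
"""
print(classify(example))
```

Only the "unconditional" list would be safe to turn into hard requirements automatically; the "conditional" list is what a human would still have to review.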

Initscripts (and mkinitrd) might well be about the worst case you can get for this, as they do a whole lot of things like "if x happens to be installed then enable/do something with it, otherwise it doesn't matter" which should not be turned into hard dependencies. A good example is rhgb-client - you can bet that a lot of folks would be upset if that was made into a hard dependency of initscripts :)

5) For most executables, path is unknown

There is a standard PATH that the distribution expects. So there is some
defined search order. I solved this in the build system I wrote by keeping a
list of all files installed by rpm as packages were built. Then the
find-requires script would resolve the name to full path based on the
standard PATH. This is solvable.

It's solvable by various means, yes. Anything requiring rpm to be aware of distribution contents at build time is not really a generic solution though.
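For reference, the lookup-table approach described above could look roughly like this. It is a sketch under assumptions, not the actual build system: the search order in STANDARD_PATH and the function name resolve are invented for the example, and known_files stands in for the list of files installed by already-built packages.

```python
import posixpath

# Assumed search order; a real distribution would define its own.
STANDARD_PATH = ["/sbin", "/bin", "/usr/sbin", "/usr/bin"]


def resolve(name, known_files, search_path=STANDARD_PATH):
    """Resolve a command name to a full path using a known file list.

    Returns None when nothing matches, in which case the caller would
    quietly drop the dependency.
    """
    if name.startswith("/"):
        # Already absolute: keep it only if the distribution ships it.
        return name if name in known_files else None
    for directory in search_path:
        candidate = posixpath.join(directory, name)
        if candidate in known_files:
            return candidate
    return None


known = {"/bin/grep", "/usr/bin/grep", "/sbin/modprobe"}
print(resolve("grep", known))
print(resolve("/sbin/solaris-specific", known))
```

The known_files argument is exactly the part that is not generic: it has to come from somewhere outside rpm itself.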

Assuming 1-3) are fixed and ignoring 4), 5) could be dealt with, at least
to some extent, but it's a big can of worms too. For the dependencies to
be discoverable by yum & friends, there would have to be matching provides
for all executable(foo) items bash --rpm-requires produces.

Rpm could automatically add Provides: executable(foo) for any file with
executable bits on, but it would cause *enormous* bloat of metadata.

Bloat, to me, means something that would never be used. If the dependencies
are real, they should be captured. Do you need to have the dependency at the
file level or package level? Maybe that reduces some of the metadata?

Note that I wasn't speaking of requires, but provides to satisfy the requires IF automatically added as

    Provides: executable(<basename of executable>)

for all executable files (in system PATH or otherwise). Most of those provides would never be used by anything so they would be nothing but bloat.
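To give an idea of what such automatic generation would involve, here is a minimal sketch that walks a build root and emits one executable(...) entry per regular file with any executable bit set. The function name and the idea of running it over a build root are assumptions for illustration; rpm has no such generator as far as this thread is concerned.

```python
import os
import stat


def executable_provides(root):
    """Return sorted executable(<basename>) strings for files under root."""
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    provides = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            mode = os.lstat(os.path.join(dirpath, fname)).st_mode
            # Regular files only; symlinks and special files are skipped.
            if stat.S_ISREG(mode) and mode & any_exec:
                provides.add("executable(%s)" % fname)
    return sorted(provides)
```

Run over a whole distribution, this would produce one provide for every helper script and binary, whether or not any shell script ever calls it.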

Resolving file dependencies into packages at build time would require rpm to be aware of the "outside world", i.e. what's available in repositories, and would require unnecessary rebuilds on package splits and renames (the good old "file dependencies considered harmful or not" issue).

So solving 5) should be possible if 1-3) were fixed, but it'd still be
pretty moot because 4) can't generally be solved (apart from manually
filtering bogus dependencies, at which point it's hardly "easily
automated" :)

I don't think #4 is impossible. It's not easy either. But I think we could get
a first pass that is pretty good and make it better over time. Right now, we
capture nothing. So, a first pass solution that captures 25% accurately is
better than where we are.

Except for unconditional execution, #4 is impossible to solve programmatically. The moment you start down the conditional paths, it's just blind poking around in the dark - heuristics based on no knowledge at all. Even if you assume access to the distribution's file list, there's no way to tell if something is intentionally optional (in which case making it a hard requirement would be an error) or not, or if a given condition is supposed to ever occur on the target platform and version.

In the build system I wrote, I lumped #4 and #5 together and solved them with
a lookup table. It was good enough for my needs. If I resolved the path, the
dependency was recorded. If not, I didn't record it. So,
if /sbin/solaris-specific was not in my distribution's file list, it was
quietly removed from the possible dependencies.

See above, this still catches all sorts of things that should not be hard dependencies.

Mind you, I don't disagree with the goal at all: automatically recording unconditional script dependencies would be a very good thing if it can be made reliable - fixing 1-2) would be a start.

	- Panu -
