Need review: ghc

Ralf Corsepius rc040203 at freenet.de
Fri May 13 12:20:02 UTC 2005


On Fri, 2005-05-13 at 12:58 +0200, Nicolas Mailhot wrote:
> On Fri 13 May 2005 2:44, Jens Petersen wrote:
> > seth vidal wrote:
> 
> >> a package cannot [build]require itself.
> >> that makes no sense at all.
> >
> > Erm, well actually it makes perfect sense for a compiler. :)
> > How do you think gcc builds itself?
> 
> Actually we have a ton of those bootstrap loops in the Java world.
> Sometimes they are obvious like this, other times they are more
> indirect. For example, a Java lib is built using ant, but ant needs a
> version of this same lib to build (because ant uses it for XML
> parsing, error logging, etc.).
> 
> At the end of the day, for any given language you need to either
> declare the basic facilities part of the core platform (so not
> Requires) or add a seed package of these very same utilities to the
> repository to break dependency loops.

The classical example of such a dependency chain on Linux is binutils,
the "c-compiler", libc and the kernel: they all circularly depend on
each other.
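
As a rough illustration of such a loop, here is a minimal Python sketch
that finds one cycle in a build-dependency graph. The edges in
TOOLCHAIN_DEPS are a simplified assumption made up for this example, not
real package metadata, and find_cycle is a hypothetical helper, not part
of any packaging tool.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, done
    colour = {name: WHITE for name in deps}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for dep in deps.get(node, ()):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: we closed a loop
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for name in deps:
        if colour[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


# Assumed, simplified edges: "X needs Y to be built".
TOOLCHAIN_DEPS = {
    "binutils": ["c-compiler", "libc"],
    "c-compiler": ["binutils", "libc"],
    "libc": ["c-compiler", "kernel"],
    "kernel": ["c-compiler", "binutils"],
}

cycle = find_cycle(TOOLCHAIN_DEPS)
```

With these assumed edges, the search stops at the first loop it meets;
any of the four packages can be reached from any other.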

> A seed package can be a pre-built binary, a version of the utility
> with all build options disabled, a basic version only good enough to
> build the real one, etc.
> 
> I imagine it's the same for all non-interpreted languages.
Well, not quite. It is a matter of perspective and of compiler design.

In GCC, the "seed" for any compiler/language is "a native c-toolchain",
i.e. all compilers and all libraries are rooted in one, and to build
any language you need at minimum a "c-toolchain". On Linux systems,
this "native c-toolchain" typically will be an older GCC.

This "native c-toolchain" is then used to build a "new native c-GCC",
which is later used to build the other languages.

At this point, a compiler's design comes into play. There are languages
in GCC which can be built with a c-only compiler (e.g. fortran), and
there are languages which require a language-specific compiler (e.g.
Ada).

To work around all these problems, GCC is built multiple times
(official term: "multi-staged bootstrap").

What Jens describes looks like an incomplete multi-staged bootstrap,
with him wanting to take the "dirty" shortcut.

In theory, one would have
1. To bootstrap/build/provide the toolchain infrastructure needed to
build a compiler, using system resources only.
2. To bootstrap the toolchain using the resources built in step 1.

Then it might be necessary to iterate through steps 1 and 2 several
times, until a clean system has been built.
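
The iteration can be sketched as a toy model in Python. This is only an
illustration under stated assumptions: a compiler is reduced to two
labels, "clean" is taken to mean that one more rebuild no longer changes
the result, and Compiler, build and bootstrap are hypothetical names, not
anything GCC itself provides.

```python
from typing import List, NamedTuple


class Compiler(NamedTuple):
    behaviour: str  # which source version defines how it compiles
    built_by: str   # behaviour of the compiler that produced this binary


def build(builder: Compiler, source: str) -> Compiler:
    """Toy build step: the result behaves like `source`, but its binary
    still carries the imprint of whatever compiler built it."""
    return Compiler(behaviour=source, built_by=builder.behaviour)


def bootstrap(seed: Compiler, source: str, max_stages: int = 5) -> List[Compiler]:
    """Rebuild `source` with its own previous stage until two consecutive
    stages are identical (assumed here to mean 'clean')."""
    stages = [build(seed, source)]           # stage 1: built by the seed
    while len(stages) < max_stages:
        nxt = build(stages[-1], source)      # rebuild with the last stage
        stages.append(nxt)
        if nxt == stages[-2]:
            break                            # nothing changed: fixed point
    return stages


# Assumed labels for the example: an older system compiler as the seed.
seed = Compiler(behaviour="old-gcc", built_by="vendor")
stages = bootstrap(seed, "new-gcc")
```

In this model stage 1 still shows traces of the seed, while stages 2 and
3 come out identical, so the loop stops after three builds.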

In practice, this can become tedious, and in some (exceptional) cases
it is impossible to implement (IMO this is a strong indication of a
badly designed toolchain). One then has to, or might prefer to, resort
to a shortcut such as Jens'.

Try building a cross-GCC-4.0-GNAT toolchain on FC3 and you will probably
understand what I am talking about. I do this regularly and could write
novels on this topic :-)


To cut a long story short: I agree, there are situations where building
a package requires a binary of an older version of itself. In most cases
these are strong symptoms of bad design which should be fixed upstream.
In practice, they are inevitable and easy to work around: insert a
binary package built outside of the buildsystem into the buildsystem
once, and then use this binary package to rebuild the package.
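
A minimal Python sketch of that workaround, assuming a buildsystem that
only builds a package once all of its build requirements are available.
The package names, the BUILD_REQUIRES table and the build_order helper
are hypothetical and only serve the example.

```python
def build_order(build_requires, seeds=()):
    """Return an order in which packages can be built.

    `seeds` names packages for which a binary built outside of the
    buildsystem has been imported beforehand.
    """
    available = set(seeds)
    pending = set(build_requires)
    order = []
    while pending:
        ready = sorted(
            pkg for pkg in pending
            if set(build_requires[pkg]) <= available
        )
        if not ready:
            raise RuntimeError(
                "dependency loop, seed needed among: %s" % sorted(pending)
            )
        for pkg in ready:
            order.append(pkg)
            available.add(pkg)
            pending.discard(pkg)
    return order


# Assumed build requirements: ghc needs an existing ghc to build itself.
BUILD_REQUIRES = {
    "gcc": [],
    "ghc": ["gcc", "ghc"],
}

try:
    build_order(BUILD_REQUIRES)
    loop_detected = False
except RuntimeError:
    loop_detected = True     # no seed: ghc can never become ready

seeded_order = build_order(BUILD_REQUIRES, seeds={"ghc"})
```

Without a seed the self-requirement can never be satisfied; with the
imported ghc binary marked as available, both packages can be built.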

Ralf

More information about the fedora-extras-list mailing list