Issue #12 October 2005

Maintaining an autotools-enabled package

Introduction

All of the software available from the Fedora™ Project is distributed using RPM files. Fedora Core 4 is made up of over two thousand RPMs.

Software packages do not start out as RPM files. Before a package is distributed in RPM format, its developer creates a tarball of the software source code with a name like bash-3.0.tar.gz. This tarball is an archive containing the source code files and the scripts for compiling them: a Makefile, and perhaps a configure script.

This article explains where the configure scripts come from, what they are for, the advantages they provide, and (to put it nicely) some of the challenges they present.

In the name of portability

A software developer may use one platform (say, Fedora Core 4) for development, but in most cases would not expect the software to be useful only to users of Fedora Core 4. People who use other platforms such as Debian® or FreeBSD® might find the software useful as well. Each of these platforms has its own peculiarities. For example, the locations of files needed for compilation or linking might differ, as might the choice of compiler and the way to run it. The software developer cannot make assumptions about any of these factors.

The point of the configure script is to adapt the software to these platform variations. When the software developer knows that a particular configuration differs from platform to platform, they can write a configure test for it and program the software to act accordingly.
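As a rough illustration of what a single configure test does, the sketch below searches the PATH for a program and records where it was found. The function name find_prog is hypothetical; real configure scripts are generated by autoconf and are considerably more thorough.

```shell
# Hypothetical helper: print the full path of the first executable
# named $1 found in $PATH, or return non-zero if there is none.
find_prog() {
    prog=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        IFS=$old_ifs
        # An empty PATH element means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$prog" ] && [ -x "$dir/$prog" ]; then
            printf '%s\n' "$dir/$prog"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Record the result in a variable, falling back to the bare name.
MKTEMP=$(find_prog mktemp) || MKTEMP=mktemp
echo "checking for mktemp... $MKTEMP"
```

The software can then use the recorded value instead of assuming a fixed location.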

The configure script itself is built from a template file called configure.in (or sometimes configure.ac). The program that creates the configure script is called autoconf.

From scratch (in brief)

The GNU programs autoconf, automake, and libtool are collectively known as autotools.

GNU autoconf takes instructions about what platform variations to check for from a template file usually called configure.in. It generates a configure script that performs the appropriate checks and creates a Makefile from a Makefile.in template file.

The Makefile.in file contains complete rules for compiling the software but with placeholders for paths or filenames that differ between platforms. The Makefile is created by filling in those placeholders with the actual values. As an example, see Example 1, “Snippet from xmlto.in” and Example 2, “Snippet from xmlto”. Here, xmlto is created from xmlto.in in the same way that a Makefile is created from its Makefile.in.

[...]
# Utilities that we need that aren't everywhere
FIND=@FIND@     # This must be GNU find (need -maxdepth)
MKTEMP=@MKTEMP@ # See http://www.mktemp.org if missing on your system
BASH=@BASH@     # GNU bash, for running the format scripts
[...]
Example 1. Snippet from xmlto.in

[...]
# Utilities that we need that aren't everywhere
FIND=/usr/local/bin/find     # This must be GNU find (need -maxdepth)
MKTEMP=mktemp # See http://www.mktemp.org if missing on your system
BASH=bash     # GNU bash, for running the format scripts
[...]
Example 2. Snippet from xmlto
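
The substitution step can be imitated with sed. This is only a sketch of the idea: a real configure script performs the substitution through a generated config.status script, and the file name demo.in and the values chosen here are assumptions for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d) || exit 1

# A tiny template with @NAME@ placeholders, in the style of xmlto.in.
cat > "$workdir/demo.in" <<'EOF'
FIND=@FIND@
MKTEMP=@MKTEMP@
BASH=@BASH@
EOF

# Values that a configure run might have discovered (assumed here).
# BASH_PROG is used because some shells treat BASH as special.
FIND=/usr/local/bin/find
MKTEMP=mktemp
BASH_PROG=bash

# Fill in each placeholder to produce the output file.
sed -e "s|@FIND@|$FIND|g" \
    -e "s|@MKTEMP@|$MKTEMP|g" \
    -e "s|@BASH@|$BASH_PROG|g" \
    "$workdir/demo.in" > "$workdir/demo"

cat "$workdir/demo"
```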

GNU automake is intended to simplify the process of writing Makefile rules. It generates a Makefile.in from a Makefile.am (see Figure 1, “Simplification of autotools”), which, roughly speaking, is a shorthand description of which files need to be compiled and in which order. The auto-made Makefile rules also provide facilities for running test suites and for creating a tarball of the software ready for distribution.

Figure 1. Simplification of autotools

For software development libraries, GNU libtool is used, usually in conjunction with automake. This provides rules for creating and versioning software libraries.

The autotools take the Makefile.am and configure.in templates, as well as some other bits and pieces, and create the configure scripts (see “Further reading” for more information).

The software developer then runs configure on the local machine. Next, the developer runs make distcheck to create a tarball containing the configure script, the templates it needs, and, of course, the source code. The tarball is then ready for download.

When the RPM packager comes along and wants to make the software available as an RPM, everything just works. All the build section of the RPM spec file needs to say is %configure and make, and all the install section needs to say is %makeinstall. These RPM macros know how to use autotools-enabled packages.

Great idea in theory

It sounds like a wonderful system, and indeed it is when it works. The trouble starts when it does not. The basic problem originates from the fact that not everyone who downloads a source tarball just runs ./configure; make.

When an RPM packager fixes a bug, they make a patch. The RPM build process unpacks the tarball and applies the patches. When a patch alters an autotools input file, the corresponding output file needs alteration as well. For example, if the configure.in file is modified, the configure script needs recreating to take account of the change. GNU automake adds Makefile dependencies to track these changes and will attempt to generate new files as necessary. Alternatively, the autoreconf command can take care of this.
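
The dependency check amounts to comparing timestamps. A minimal sketch, assuming the hypothetical helper name needs_regen; the Makefile rules that automake generates are more elaborate, and autoreconf decides for itself what to rebuild.

```shell
# Hypothetical helper: succeed if output file $2 is missing or is
# older than input file $1, meaning it should be regenerated.
needs_regen() {
    input=$1
    output=$2
    [ -e "$output" ] || return 0
    # find prints the input's name only if it is newer than the output.
    [ -n "$(find "$input" -newer "$output")" ]
}

# Typical use after patching, run from the top of the source tree:
#   if needs_regen configure.in configure; then autoreconf; fi
```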

Unfortunately, because the original software developer probably used older versions of the autotools than the RPM packager has, regenerating the files does not always do the right thing.

It is difficult for a new version of GNU autoconf to guarantee that all existing configure.in files will continue to work. This situation is improving, but it has certainly been problematic in the past.

Newer versions of GNU autoconf are stricter in what they accept. As a result, it is not uncommon for an RPM packager to run autoreconf, only to see an error about outdated syntax or some other seemingly unrelated problem during the build. The seemingly unrelated problems are what give RPM packagers less than positive opinions about autotools. Such failures are often due to a badly written configure.in file, but it can sometimes be very difficult indeed to decide whether the bug is in the configure.in file or in autoconf.

One way to avoid all this is for the patch to make equivalent changes to the configure script and configure.in file simultaneously. If it is a simple change, this is quite easy to do: just edit the configure script “by hand”. This future-proofs the change against newer versions of the autotools because they then no longer need to be involved.

For more complicated changes, the only way to make it work is for the RPM packager to use the same versions of the autotools as the original software developer. On the face of it, this is quite a problem because the RPM packager will have several packages to maintain, each with its own “upstream” software developer, and each requiring different versions of autoconf or automake. Fortunately it is possible to have several versions installed at once, a practice known as installing in parallel.

This is the reason that Fedora Core 4 comes with a large number of autotools-related packages:

  • autoconf213
  • autoconf (version 2.59)
  • automake14
  • automake15
  • automake16
  • automake17
  • automake (version 1.9.5)
  • libtool (version 1.5.20)

Installing in parallel like this allows the RPM packager to select, for example, version 1.5 of GNU automake by running the automake-1.5 command. This can be performed during the RPM build process to regenerate configure scripts and Makefiles. Alternatively, as with simple changes, it can be run once in order to recreate the generated files, and the resulting changes can be incorporated into the patch itself.
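
One way to express that choice in a build script is sketched below. The helper name pick_tool is hypothetical; the automake-1.5 naming follows the convention described above.

```shell
# Hypothetical helper: print the versioned command name (for example
# automake-1.5) if it is on the PATH, otherwise fail so the caller
# can stop instead of silently using a different version.
pick_tool() {
    tool=$1
    version=$2
    if command -v "$tool-$version" >/dev/null 2>&1; then
        printf '%s\n' "$tool-$version"
    else
        echo "error: $tool-$version is not installed" >&2
        return 1
    fi
}

# Example: AUTOMAKE=$(pick_tool automake 1.5) || exit 1
```

Failing loudly when the requested version is missing avoids the version mismatches that cause trouble in the first place.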

Sundries

Underquoted macro definitions

One example of the stricter requirements for autotools input files is shown in this warning from aclocal, part of GNU automake:

acinclude.m4:2: warning: underquoted definition of AC_PATH_DIR
  run info '(automake)Extending aclocal'
  or see http://sources.redhat.com/automake/automake.html#Extending-aclocal

The offending line from acinclude.m4 reads:

AC_DEFUN(AC_PATH_DIR,

The fix is to change “AC_PATH_DIR” to “[AC_PATH_DIR]”. Enclosing the name in square brackets prevents the warning.
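
For a file with many such definitions, the same edit can be applied mechanically. The sed expression below is a sketch that only handles the simple case of an unquoted name immediately following AC_DEFUN( at the start of a line; the function name quote_defun is hypothetical, and the result should be checked by hand.

```shell
# Quote the macro name in simple underquoted AC_DEFUN lines:
#   AC_DEFUN(NAME,   becomes   AC_DEFUN([NAME],
# Lines whose name is already quoted do not match and are left alone.
quote_defun() {
    sed 's/^AC_DEFUN(\([A-Za-z_][A-Za-z0-9_]*\),/AC_DEFUN([\1],/'
}

printf '%s\n' 'AC_DEFUN(AC_PATH_DIR,' | quote_defun
```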

DESTDIR

The autotools were designed to handle this sequence of events:

  1. The software developer creates a tarball.
  2. The end user downloads the tarball.
  3. The end user builds and installs the software using ./configure; make; make install.

When the RPM packager steps in, the process is different:

  1. The software developer creates a tarball.
  2. The RPM packager downloads the tarball.
  3. The rpmbuild command builds the package and installs it into a temporary directory (a build root).
  4. The rpmbuild command creates an RPM file from the files in the build root.

Rather than installing the files into their final locations, the files are installed into, say, /var/tmp/bash-root/ first.

This can be handled in two ways. First, a great many software packages honor the DESTDIR variable, using it for exactly the purpose needed here. An RPM spec file can therefore include make DESTDIR=$RPM_BUILD_ROOT install, passing the variable on the make command line, and the files will be installed in the correct place. GNU automake implements support for DESTDIR, although software developers must remember to take account of it when writing additional Makefile rules.
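
The effect of DESTDIR can be seen in a few lines of shell. This sketch mimics what an install rule does; the function name install_prog and the default bindir of /usr/bin are assumptions, and a real package would have the equivalent logic in its Makefile.

```shell
# Hypothetical install step: copy program $1 into $DESTDIR$bindir.
# With DESTDIR unset or empty the file goes to its final location;
# with DESTDIR set, the same path is recreated beneath the staging
# directory.
install_prog() {
    prog=$1
    bindir=${bindir:-/usr/bin}
    mkdir -p "${DESTDIR}${bindir}" &&
    cp "$prog" "${DESTDIR}${bindir}/"
}

# Example, staging into a build root rather than the live system:
#   DESTDIR=/var/tmp/bash-root install_prog ./bash
```

A rule that writes to $bindir without the DESTDIR prefix is the kind of oversight that breaks staged installs.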

For software packages which do not respect DESTDIR, variables such as prefix, libdir, and datadir can often influence the Makefile to install files into a staging area as required. The RPM spec file macros %configure and %makeinstall take advantage of this. The %configure macro sets up the directory variables with the paths for the final installation, with prefix=/usr, datadir=/usr/share, and so on. For %makeinstall the staging directories are used, such as prefix=$RPM_BUILD_ROOT/usr and datadir=$RPM_BUILD_ROOT/usr/share. It is done this way because the package might hardcode the directory locations specified during configuration into the compiled application.

For autotools-aware packages, either approach is fine. Sometimes there might be bugs in the software package's Makefile rules which force the use of one or other method.

libtool .la files

There is a division of opinion about whether the .la files created by GNU libtool should be shipped in RPM files.

GNU libtool is a tool for creating and using libraries in a portable manner. The nuts and bolts of how to create libraries vary significantly between platforms. The point of libtool is that it can be used in the same way on all platforms it has been ported to.

Files ending in .la are abstract libraries. You can use libtool to link an application against somelibrary.la and it will do the right thing. A .la file is actually a text file containing information about where the real library resides and how to use it.

The argument in favor is that these files are rather small and quite useful to developers who use libtool, so perhaps they should be included in our RPM packages.

Those who disagree point out that, on Linux at least, those files do not add enough value to make it worth the effort involved in making them accurate. The fact that the RPM build procedure uses a temporary staging area when installing can lead to incorrect directory names being stored in the .la files.
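
A packager who does ship .la files can at least check them for leftover staging paths. A minimal sketch, assuming the hypothetical helper name check_la_files and a build root passed as its only argument:

```shell
# Hypothetical check: list any .la files under build root $1 whose
# contents mention the build root itself. Succeeds if none do.
check_la_files() {
    root=$1
    # grep -l prints the names of matching files; -F treats the
    # build root path as a fixed string rather than a pattern.
    bad=$(find "$root" -name '*.la' -type f \
              -exec grep -l -F -e "$root" {} +)
    if [ -n "$bad" ]; then
        echo "staging path found in:" >&2
        printf '%s\n' "$bad" >&2
        return 1
    fi
}
```

It could be run at the end of the install section, for example as check_la_files "$RPM_BUILD_ROOT".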

Conclusion

Regardless of the merits of software packages opting to use the autotools, RPM packagers need to understand the issues they may face. In my experience, the majority of the packaging problems associated with autotools are due to mistakes in the configure.in or Makefile.am files. The remaining problems seem to be due to incompatibilities between different versions of the tools. I hope that situation will improve.

Further reading

About the author

Tim Waugh is a Systems Engineer at Red Hat, primarily responsible for printing, DocBook, VNC, and some shell utilities. He has been using Linux since 1995. He lives with his wife in Surrey (England).