Issue #9 July 2005

64-bit computing: Co-existing in a 32-bit world

An introduction to 64-bit computing

Way back in the eighties, my first computer was a Commodore 64, so named because it had 64 kilobytes of memory. Few of today's applications could fit into such a small space, and applications keep growing larger as time goes on. A web browser can use tens of megabytes of memory without even loading a web page. Computers are fitted with ever larger amounts of RAM to make the applications run faster, and the applications make use of this by providing an ever increasing range of features.

Hand in hand with the need for greater amounts of RAM is the need to access it quickly. Most home computers are fitted with 32-bit microprocessors, and these are beginning to show the strain.

A 64-bit microprocessor is one with integer registers that hold 64 bits and which can move 64 bits of data between memory storage and a register in the CPU in a single operation. The HP® Alpha™ processor, the Apple® PowerPC G5™, and AMD® Opteron™/Intel® EM64T are examples. Those last few are unlike the Alpha in that although they were designed for 64-bit operation, they can also run programs intended for earlier 32-bit processors.

This makes migrating from earlier systems much easier because generally the older software will still work. Indeed, at the moment you may well find that a brand new 64-bit computer will be sold with an operating system that was compiled for a 32-bit processor.

One of the interesting things about this type of processor is that a 64-bit operating system is able to run both 32-bit and 64-bit programs concurrently. Linux (meaning the GNU toolchain as well as the Linux kernel) has been ported to run on a variety of such processors.
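
As a small illustration of 32-bit and 64-bit programs living side by side, the following Python sketch reports the pointer width of the interpreter that runs it. The function name pointer_bits is my own choice for this example. On a 64-bit system with both kinds of interpreter installed, the same script would be expected to report 64 under one and 32 under the other.

```python
import platform
import struct


def pointer_bits():
    """Return the width, in bits, of a native pointer in this interpreter."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # The kernel's view of the hardware and the program's own word size
    # are separate questions; a 32-bit program can run on a 64-bit kernel.
    print("machine reported by the kernel:", platform.machine())
    print("this interpreter is a %d-bit program" % pointer_bits())
```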

If you are not lucky enough to have access to a 64-bit computer, you might be wondering about some of the implications of having a 32-bit environment alongside the 64-bit one. Read on.

Implications

File system layout

A 64-bit operating system provides libraries compiled for the newer instruction set. To run a dynamically-linked 32-bit binary application, any libraries it needs must also be available in the 32-bit instruction set. Also, any libraries those libraries are dynamically linked against need to be available in that form, right down to the C library itself (such as glibc on GNU systems).

So, to run 32-bit programs on a 64-bit system, two flavors of the C library (and more libraries besides) need to be provided by the operating system, and these extra libraries need to reside somewhere in the file system. Providing multiple instances of a particular library, each for a different instruction set supported by the processor, is often known as multilib.

Where to put the extra libraries is a problem without an obvious solution. Different approaches are equally arbitrary. The one used in Red Hat® Enterprise Linux® and in the Filesystem Hierarchy Standard (FHS) is that the /lib/ directory (and /usr/lib/) is for 32-bit libraries, and 64-bit libraries go in /lib64/ (and /usr/lib64/). Debian uses /lib/ for 64-bit libraries, and puts 32-bit ones in /lib32/. Sorting out which directory is which is done by the dynamic loader, and thus it is transparent to the 32-bit program running on a 64-bit processor.

Distribution   32-bit (compat) libraries   64-bit (native) libraries
Fedora Core    /lib/, /usr/lib/            /lib64/, /usr/lib64/
Debian         /lib32/, /usr/lib32/        /lib/, /usr/lib/
Table 1. Multilib directory structure
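
The layout in Table 1 can be written down as a small lookup, which is roughly the knowledge a tool needs before it can go looking for a library of the right flavor. This is only a sketch: the dictionary keys and the function name library_dirs are invented for this example, and the real decision is made by the dynamic loader, not by a script like this.

```python
# Directory layout taken from Table 1. The keys "fedora" and "debian"
# are labels chosen for this example only.
MULTILIB_DIRS = {
    "fedora": {32: ["/lib/", "/usr/lib/"], 64: ["/lib64/", "/usr/lib64/"]},
    "debian": {32: ["/lib32/", "/usr/lib32/"], 64: ["/lib/", "/usr/lib/"]},
}


def library_dirs(distro, bits):
    """Return the library directories for a distribution and word size."""
    return MULTILIB_DIRS[distro][bits]


if __name__ == "__main__":
    for distro in sorted(MULTILIB_DIRS):
        for bits in (32, 64):
            print(distro, bits, ", ".join(library_dirs(distro, bits)))
```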

You might wonder where that 32-bit program should be placed in the file system, and whether there should be a similar scheme for the /bin/ directory as there is for /lib/. However, there is generally no need for that. As a rule, binary programs provided by the operating system need to be compiled only for the 64-bit instruction set. Binary programs installed by third party packages will have different names to those provided by the operating system. Aside from this, third party packages often install their files into /usr/local/ or /opt/.

Plug-ins

The ability to run software compiled for older processors on newer ones is certainly useful but things are not always so simple. Not all applications are provided in the form of standalone programs—some are provided as plug-ins to other applications. One such example is the Macromedia® Flash™ plug-in for web browsers. Currently this plug-in is available for Linux but only for 32-bit Intel-compatible processors.

A web browser plug-in is a library that gets dynamically loaded by the web browser program. It is not currently possible in Linux for a 64-bit program to dynamically load a 32-bit library. This means that the 64-bit Mozilla Firefox in Fedora Core (for example) cannot use the Macromedia Flash plug-in at all.
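
To make the mismatch concrete, here is a minimal Python sketch that reads the word size recorded at the start of an ELF file, the format used for programs and libraries on Linux. The fifth byte of the file (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The function names are my own, and can_load is a deliberate simplification: a real dynamic loader checks more than this one byte, the machine type for instance.

```python
ELF_MAGIC = b"\x7fELF"
# EI_CLASS values from the ELF format: ELFCLASS32 = 1, ELFCLASS64 = 2.
ELF_CLASSES = {1: 32, 2: 64}


def elf_bits(header):
    """Return 32 or 64 for the leading bytes of an ELF file, else None."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None
    return ELF_CLASSES.get(header[4])


def elf_bits_of_file(path):
    """Read just enough of a file to classify it with elf_bits()."""
    with open(path, "rb") as f:
        return elf_bits(f.read(5))


def can_load(program_header, library_header):
    """Simplified check: program and library must share a word size."""
    program = elf_bits(program_header)
    library = elf_bits(library_header)
    return program is not None and program == library
```

Calling elf_bits_of_file on a browser binary and on a plug-in file would show whether the two were built for the same word size.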

One work-around is to uninstall the 64-bit Mozilla Firefox package and install the equivalent package from the 32-bit Fedora Core distribution. The result is that the web browser is a 32-bit application, and this makes it possible to load 32-bit plug-ins such as Flash. It does, however, make it impossible to use any 64-bit plug-ins.

Scripts and interpreters

Programs that are written in interpreted languages rather than being compiled like C have a somewhat easier time of it in this multilib environment. There are still complications even for them.

Even shell scripts may be affected by the architecture they are running on. A shell script may take different courses of action depending on the output of the uname command. This command uses the uname system call to ask the kernel about various aspects of the system hardware.

The behavior of the uname command can be adjusted using the setarch command. On an AMD64 machine, for instance, uname -m displays x86_64, but setarch i686 uname -m displays i686. The setarch command comes from the RPM package of the same name.
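
The same kind of branching can happen in any interpreted language. The Python sketch below picks a library directory from the machine name, in the way a script might after reading uname -m. The function name and the set of machine names are assumptions made for this example (only x86_64 is listed, and the directories follow the Fedora-style layout described earlier). Because the machine name comes from the kernel, a script like this run under setarch i686 is expected to take the 32-bit branch.

```python
import platform

# Illustrative only: machine names treated here as 64-bit multilib systems.
LIB64_MACHINES = {"x86_64"}


def native_lib_dir(machine):
    """Pick a Fedora-style library directory for a uname -m style name."""
    if machine in LIB64_MACHINES:
        return "/usr/lib64"
    return "/usr/lib"


if __name__ == "__main__":
    # platform.machine() reports the same field that uname -m prints.
    machine = platform.machine()
    print(machine, "->", native_lib_dir(machine))
```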

Additional complications

There are packaging issues which have been overcome to make Red Hat Enterprise Linux capable of installing a package in both 32-bit and 64-bit forms at the same time. Packages can be split into two groups: those that will, and those that will not, be installed in both 32-bit and 64-bit variants on the same system. Those that will be installed twice are known as multilib packages, and they have to meet certain requirements for things to work properly. Multilib packages tend to be system libraries such as glibc, zlib, and gtk2.

To install two instances of the same RPM package, certain rules are followed to avoid conflicts. For compiled programs—but not for libraries—the 64-bit version is installed and the 32-bit version is discarded. All other types of files must be exactly the same in both the 32-bit and the 64-bit package.
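
The rules just described can be modeled in a few lines of Python. This is a sketch of the rules as stated in this article, not RPM's own code, and the function name and return values are invented for the example. It considers one path that appears in both the 32-bit and the 64-bit package.

```python
def resolve_shared_path(is_compiled_program, content_32, content_64):
    """Decide what happens to a path present in both package variants.

    Returns "use-64-bit", "identical", or "conflict".
    """
    if is_compiled_program:
        # Compiled programs: the 64-bit version wins, the 32-bit is dropped.
        return "use-64-bit"
    if content_32 == content_64:
        # Any other file is fine as long as both copies match exactly.
        return "identical"
    return "conflict"


if __name__ == "__main__":
    print(resolve_shared_path(True, b"32-bit code", b"64-bit code"))
    print(resolve_shared_path(False, b"same text", b"same text"))
    print(resolve_shared_path(False, b"flags: -m32", b"flags: -m64"))
```

Libraries do not reach this check at all, since the two variants are installed into different directories.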

Build-time issues

The first group of problems has to do with writing RPM spec files. These are files that direct the RPM package build process and say where the resulting files should be installed in the file system. One RPM spec file can describe several sub-packages: for instance, cups.spec describes the cups, cups-libs, and cups-devel RPM packages.

The main issue is with paths that start with /lib or /usr/lib. On 64-bit architectures, these should nearly always be /lib64 and /usr/lib64, and this applies even to packages that are not multilib RPMs.

Where libraries are packaged, the RPM spec file should use the macro %{_libdir}: this is replaced by /usr/lib or /usr/lib64 appropriately. A similar macro is %{_lib}, replaced by lib or lib64 as appropriate.
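
The effect of these two macros can be imitated with a simple text substitution. The Python sketch below is only an emulation for illustration: the function name expand_lib_macros is made up, the example paths are hypothetical, and real RPM macro expansion is considerably more general.

```python
def expand_lib_macros(text, bits):
    """Expand %{_lib} and %{_libdir} as described for 32-bit or 64-bit builds."""
    lib = "lib64" if bits == 64 else "lib"
    macros = {
        "%{_lib}": lib,
        "%{_libdir}": "/usr/" + lib,
    }
    for name, value in macros.items():
        text = text.replace(name, value)
    return text


if __name__ == "__main__":
    # Hypothetical spec file entries, expanded for each word size.
    for line in ("%{_libdir}/libexample.so.1", "/%{_lib}/example"):
        print(expand_lib_macros(line, 32), "|", expand_lib_macros(line, 64))
```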

In general RPM spec files should put libraries in their own sub-package. This avoids a lot of the potential problems.

Install-time issues

The bash package provides a small shell script for reporting bugs, named bashbug. This script sends a message to a mailing list detailing the hardware architecture, the compiler flags used, the version number, and several other pieces of information. As the compiler flags are different for the 32-bit and 64-bit packages, the two shell scripts have different content.

As bash is a multilib package, there is a problem: this bug-reporting program cannot be called /usr/bin/bashbug in both packages or else there is a conflict. The RPM rule for resolving conflicts between compiled programs—to give priority to the 64-bit version—does not apply to shell scripts. The solution has been to name the program bashbug-32 in the 32-bit package and bashbug-64 in the 64-bit package.

Another problem that can cause conflicts when both instances of a multilib package are installed has to do with compression. Some packages compress certain files (documentation, for instance) using the gzip command. Care must be taken when doing this because this command embeds a timestamp into the compressed file. All files except compiled programs have to be identical between the two versions of a multilib package to avoid conflicts. If the 32-bit package is built just one second later than the 64-bit package, the timestamp in the compressed file will be different in each package.

The way to avoid this is to use the gzip -n option: this prevents the timestamp from being embedded in the compressed file.
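
The timestamp problem is easy to reproduce with Python's gzip module, which writes the same file format. In this sketch the helper name gzip_bytes is my own, and passing mtime=0 stands in for what gzip -n does on the command line. Two builds one second apart produce different bytes, and the difference is confined to the four-byte timestamp field in the header.

```python
import gzip
import io


def gzip_bytes(data, mtime):
    """Compress data in memory, recording mtime in the gzip header."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


if __name__ == "__main__":
    doc = b"Example documentation file\n"
    first = gzip_bytes(doc, 1120000000)
    second = gzip_bytes(doc, 1120000001)  # built one second later
    print("with timestamps, identical:", first == second)
    print("header bytes 4-7 differ:", first[4:8] != second[4:8])
    print("without timestamps, identical:",
          gzip_bytes(doc, 0) == gzip_bytes(doc, 0))
```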

Conclusion

There are many more relevant details than those described here, and there are still multilib RPM package problems to fix. For the most part, everything works very well. A large problem is the lack of experience and knowledge of multilib environments in the open source community. Assumptions are still made, for instance, that all libraries live in /usr/lib. As 64-bit computers become more widespread, my hope is that the situation will continue to improve.

In another thirty years or so, time will run out for 32-bit computing. The "seconds since 1970" way of measuring time will have to represent a quantity too large to be expressed in a signed 32-bit integer. Perhaps by then we will all have 128-bit computers.
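
The exact moment can be worked out from the largest value a signed 32-bit integer can hold. The short Python sketch below does the arithmetic; the names INT32_MAX and last_32bit_moment are my own.

```python
from datetime import datetime, timezone

# Largest value of a signed 32-bit integer.
INT32_MAX = 2**31 - 1


def last_32bit_moment():
    """Return the last UTC time a signed 32-bit seconds counter can hold."""
    return datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


if __name__ == "__main__":
    print(INT32_MAX, "seconds after 1970 is", last_32bit_moment().isoformat())
```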

About the author

Tim Waugh is a Systems Engineer at Red Hat, primarily responsible for scanning/printing, DocBook, VNC and some shell utilities. He has been using Linux since 1995. He lives with his wife in Surrey (England).