Issue #9 July 2005
Way back in the eighties, my first computer was a Commodore 64, so named because it had 64 kilobytes of memory. Today few applications could fit into such a small space, and applications keep growing larger as time goes on. A web browser can use tens of megabytes of memory without even loading a web page. Computers are fitted with ever larger amounts of RAM to make the applications run faster, and the applications make use of this by providing an ever increasing range of features.
Hand in hand with the need for greater amounts of RAM is the need to access it quickly. Most home computers are fitted with 32-bit microprocessors, and these are beginning to show the strain.
A 64-bit microprocessor is one with integer registers that hold 64 bits and which can move 64 bits of data between memory storage and a register in the CPU in a single operation. The HP® Alpha™ processor, the Apple® PowerPC G5™, and AMD® Opteron™/Intel® EM64T are examples. Those last few are unlike the Alpha in that although they were designed for 64-bit operation, they can also run programs intended for earlier 32-bit processors.
This makes migrating from earlier systems much easier because generally the older software will still work. Indeed, at the moment you may well find that a brand new 64-bit computer will be sold with an operating system that was compiled for a 32-bit processor.
One of the interesting things about this type of processor is that a 64-bit operating system is able to run both 32-bit and 64-bit programs concurrently. Linux (meaning the GNU toolchain as well as the Linux kernel) has been ported to run on a variety of such processors.
If you are not lucky enough to have access to a 64-bit computer, you might be wondering about some of the implications of having a 32-bit environment alongside the 64-bit one. Read on.
A 64-bit operating system provides libraries compiled for the newer instruction set. To run a dynamically-linked 32-bit binary application, any libraries it needs must also be available in the 32-bit instruction set. Also, any libraries those libraries are dynamically linked against need to be available in that form, right down to the C library itself (such as glibc on GNU systems).
So, to run 32-bit programs on a 64-bit system, two flavors of the C library (and more libraries besides) need to be provided by the operating system, and these extra libraries need to reside somewhere in the file system. Providing multiple instances of a particular library, each for a different instruction set supported by the processor, is often known as multilib.
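On Linux, programs and shared libraries are stored in the ELF format, and the fifth byte of an ELF file records whether it was built as a 32-bit or a 64-bit object. The following is a minimal sketch of how the two flavors of a library could be told apart. The function names `elf_class` and `file_class` are illustrative and not part of any existing tool.

```python
ELF_MAGIC = b"\x7fELF"   # first four bytes of every ELF file
EI_CLASS = 4             # index of the class byte in the ELF identification


def elf_class(header: bytes) -> str:
    """Classify the leading bytes of a file as '32-bit', '64-bit', or 'not ELF'."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not ELF"
    # Class byte: 1 means ELFCLASS32, 2 means ELFCLASS64.
    return {1: "32-bit", 2: "64-bit"}.get(header[EI_CLASS], "unknown class")


def file_class(path: str) -> str:
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))
```

On a multilib system one would expect `file_class` to report different results for the two copies of the C library, although the exact paths depend on the distribution.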
Where to put the extra libraries is a problem without an
obvious solution. Different approaches are equally arbitrary.
The one used in Red Hat® Enterprise Linux® and in the Filesystem
Hierarchy Standard (FHS) is that the /lib/
directory (and /usr/lib/) is for 32-bit
libraries, and 64-bit libraries go in
/lib64/ (and
/usr/lib64/). Debian uses
/lib/ for 64-bit libraries, and puts 32-bit
ones in /lib32/. Sorting out which
directory is which is done by the dynamic loader, and thus it is
transparent to the 32-bit program running on a 64-bit processor.
| | 32-bit (compat) libraries | 64-bit (native) libraries |
|---|---|---|
| Fedora Core | /lib/, /usr/lib/ | /lib64/, /usr/lib64/ |
| Debian | /lib32/, /usr/lib32/ | /lib/, /usr/lib/ |
You might wonder where that 32-bit program should be
placed in the file system, and whether there should be a similar
scheme for the /bin/ directory as there is
for /lib/. However, there is generally no
need for that. Binary programs provided by the operating system
only need to be compiled for the 64-bit instruction set as a
rule. Binary programs installed by third party packages will
have different names to those provided by the operating system.
Aside from this, third party packages often install their files
into /usr/local/ or
/opt/.
The ability to run software compiled for older processors on newer ones is certainly useful but things are not always so simple. Not all applications are provided in the form of standalone programs—some are provided as plug-ins to other applications. One such example is the Macromedia® Flash™ plug-in for web browsers. Currently this plug-in is available for Linux but only for 32-bit Intel-compatible processors.
A web browser plug-in is a library that gets dynamically loaded by the web browser program. It is not currently possible in Linux for a 64-bit program to dynamically load a 32-bit library. This means that the 64-bit Mozilla Firefox in Fedora Core (for example) cannot use the Macromedia Flash plug-in at all.
One work-around is to uninstall the 64-bit Mozilla Firefox package and install the equivalent package from the 32-bit Fedora Core distribution. The result is that the web browser is a 32-bit application, and this makes it possible to load 32-bit plug-ins such as Flash. It does, however, make it impossible to use any 64-bit plug-ins.
Programs written in interpreted languages, rather than compiled languages like C, have a somewhat easier time of it in this multilib environment, but there are still complications even for them.
Even shell scripts may be affected by the architecture
they are running on. A shell script may take
different courses of action depending on the output of the
uname command. This command uses the
uname system call to ask the kernel about
various aspects of the system hardware.
The behavior of the uname command
can be adjusted using the setarch command.
On an AMD64 machine, for instance, uname -m
displays x86_64, but setarch i686 uname -m
displays i686. The setarch command comes from
the setarch RPM package.
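The same dependence on the reported architecture applies to scripts in other interpreted languages. As a rough sketch, a Python script might choose a library directory from the machine name, following the Red Hat layout described earlier. The set `SIXTY_FOUR_BIT` and the function `lib_dir_name` are assumptions made for illustration, and the list of architectures is not exhaustive.

```python
import platform

# Assumption: machine names treated as 64-bit multilib architectures.
# Illustrative only; a real script would need a more complete list.
SIXTY_FOUR_BIT = {"x86_64", "ppc64", "s390x"}


def lib_dir_name(machine: str) -> str:
    """Return 'lib64' or 'lib' for a machine string as printed by uname -m."""
    return "lib64" if machine in SIXTY_FOUR_BIT else "lib"


# platform.machine() reports the same machine name as uname -m on Linux.
machine = platform.machine()
print(machine, "->", "/usr/" + lib_dir_name(machine))
```

Because the machine name comes from the kernel, running such a script under setarch i686 should make it see i686 and choose /usr/lib, just as a shell script calling uname -m would.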
There are packaging issues which have been overcome to make Red Hat Enterprise Linux capable of installing a package in both 32-bit and 64-bit forms at the same time. Packages can be split into two groups: those that will, and those that will not, be installed in both 32-bit and 64-bit variants on the same system. Those that will be installed twice are known as multilib packages, and they have to meet certain requirements for things to work properly. Multilib packages tend to be system libraries such as glibc, zlib, and gtk2.
To install two instances of the same RPM package, certain rules are followed to avoid conflicts. For compiled programs—but not for libraries—the 64-bit version is installed and the 32-bit version is discarded. All other types of files must be exactly the same in both the 32-bit and the 64-bit package.
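These rules can be modeled in a few lines. The sketch below is not how RPM is implemented. It only restates the rules above, using a hypothetical `PackagedFile` record in which `digest` stands in for the file contents and `compiled_program` marks an executable binary (libraries and all other files are `False`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackagedFile:
    path: str
    digest: str             # placeholder for a checksum of the contents
    compiled_program: bool  # True only for executable binaries


def merge(files32, files64):
    """Return (installed, conflicts) when both variants of a package are installed."""
    installed = {f.path: f for f in files64}
    conflicts = []
    for f in files32:
        other = installed.get(f.path)
        if other is None:
            installed[f.path] = f       # path exists only in the 32-bit package
        elif f.compiled_program and other.compiled_program:
            continue                    # 64-bit program wins; 32-bit copy discarded
        elif f.digest != other.digest:
            conflicts.append(f.path)    # every other shared file must be identical
    return installed, conflicts


# Made-up digests for a package that ships a binary, a script, and a document.
pkg64 = [PackagedFile("/bin/bash", "b64", True),
         PackagedFile("/usr/bin/bashbug", "s64", False),
         PackagedFile("/usr/share/doc/bash/README", "d", False)]
pkg32 = [PackagedFile("/bin/bash", "b32", True),
         PackagedFile("/usr/bin/bashbug", "s32", False),
         PackagedFile("/usr/share/doc/bash/README", "d", False)]

installed, conflicts = merge(pkg32, pkg64)
print(conflicts)  # ['/usr/bin/bashbug']
```

In this model the binary is resolved in favor of the 64-bit copy, the identical document is accepted, and the script whose contents differ is reported as a conflict.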
The first group of problems has to do with writing RPM spec
files. These are files that direct the RPM package build
process and say where the resulting files should be installed in
the file system. One RPM spec file can describe several
sub-packages: for instance, cups.spec
describes the cups, cups-libs, and cups-devel RPM
packages.
The main issue is with paths that start with
/lib or /usr/lib. On
64-bit architectures, these should nearly always be
/lib64 and /usr/lib64,
and this applies even to packages that are not multilib RPMs.
Where libraries are packaged, the RPM spec file should use
the macro %{_libdir}: this is replaced by
/usr/lib or /usr/lib64
appropriately. A similar macro is %{_lib},
replaced by lib or
lib64 as appropriate.
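To illustrate what this substitution achieves, here is a small sketch that expands the two macros for each architecture. It is a toy model and not RPM's macro processor. The `expand` function, the two macro tables, and the file name `libexample.so.1` are made up for the example.

```python
import re


def expand(text: str, macros: dict) -> str:
    """Replace each %{name} in text with macros[name]; unknown names are left alone."""
    return re.sub(r"%\{(\w+)\}",
                  lambda m: macros.get(m.group(1), m.group(0)),
                  text)


# Values the two macros take on a 32-bit and on a 64-bit build.
MACROS_32 = {"_lib": "lib", "_libdir": "/usr/lib"}
MACROS_64 = {"_lib": "lib64", "_libdir": "/usr/lib64"}

line = "%{_libdir}/libexample.so.1"  # hypothetical entry in a file list
print(expand(line, MACROS_32))  # /usr/lib/libexample.so.1
print(expand(line, MACROS_64))  # /usr/lib64/libexample.so.1
```

The point is that a single line in the spec file yields the right path on either architecture, so nothing needs to be hard-coded.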
In general RPM spec files should put libraries in their own sub-package. This avoids a lot of the potential problems.
The bash package provides a small shell script for
reporting bugs, named bashbug. This script
sends a message to a mailing list detailing the hardware
architecture, the compiler flags used, the version number, and
several other pieces of information. As the compiler flags are
different for the 32-bit and 64-bit packages, the two shell
scripts have different content.
As bash is a multilib package, there is a problem: this
bug-reporting program cannot be called
/usr/bin/bashbug in both packages or else
there is a conflict. The RPM rule for resolving conflicts
between compiled programs—to give priority to the 64-bit
version—does not apply to shell scripts. The solution has
been to name the program bashbug-32 in the
32-bit package and bashbug-64 in the 64-bit
package.
Another problem that can cause conflicts when both
instances of a multilib package are installed has to do with
compression. Some packages compress certain files
(documentation, for instance) using the gzip
command. Care must be taken when doing this because this
command embeds a timestamp into the compressed file. All files
except compiled programs have to be identical between the two
versions of a multilib package to avoid conflicts. If
the 32-bit package is built just one second later than the
64-bit package, the timestamp in the compressed file will be
different in each package.
The way to avoid this is to use the
gzip -n option: this
prevents the timestamp from being embedded in the compressed
file.
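The effect is easy to reproduce. The sketch below uses Python's gzip module rather than the gzip command, and assumes Python 3.8 or later, where `gzip.compress` accepts an `mtime` argument. Passing `mtime=0` plays the role of the -n option here. The helper `mtime_field` and the sample text are illustrative.

```python
import gzip

data = b"Example documentation text\n"


def mtime_field(blob: bytes) -> int:
    """Return the timestamp stored in bytes 4..7 of a gzip header."""
    return int.from_bytes(blob[4:8], "little")


# Two builds one second apart: same input, different compressed files.
early = gzip.compress(data, mtime=1120000000)
late = gzip.compress(data, mtime=1120000001)
print(early == late)          # False

# With the timestamp zeroed, repeated builds give identical output.
first = gzip.compress(data, mtime=0)
second = gzip.compress(data, mtime=0)
print(first == second)        # True
print(mtime_field(first))     # 0
```

The uncompressed contents are the same in every case. Only the header differs, which is enough to make the two packaged files conflict.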
There are many more relevant details than those described
here, and there are still multilib RPM package problems to fix.
For the most part, everything works very well. A large problem is
the lack of experience and knowledge of multilib environments in
the open source community. Assumptions are still made, for
instance, that all libraries live in /usr/lib.
As 64-bit computers become more widespread, my own hope is that the
situation will continue to improve.
In another thirty years or so, time will run out for 32-bit computing. The way of measuring time in "seconds since 1970" will have to measure a quantity that cannot be expressed in a 32-bit integer. Perhaps by then we will all have 128-bit computers.
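The exact moment can be worked out directly, assuming the counter is a signed 32-bit integer, as it typically is on 32-bit Linux systems:

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit integer can hold.
MAX_INT32 = 2**31 - 1

# Interpret that value as seconds since 1970-01-01 00:00:00 UTC.
last_moment = datetime.fromtimestamp(MAX_INT32, tz=timezone.utc)

print(MAX_INT32)     # 2147483647
print(last_moment)   # 2038-01-19 03:14:07+00:00
```

One second after that, the count no longer fits in 32 bits.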