Issue #12 October 2005

The state of Java on Linux

Introduction

The Java™ programming language is very popular, and there are a large number of useful free libraries and programs written using it. However, until recently, there was no way to run these programs using only free software—instead, a proprietary JVM was required.

Beginning with Fedora™ Core 4, we've been able to deploy some of these applications using gcj, the GNU compiler for Java.

Deploying With gcj

The core piece of the solution is gcj. This is an ahead-of-time compiler for the Java programming language. It can compile Java source code to bytecode or object code and can also compile bytecode to object code. gcj comes with its own runtime, called libgcj, which includes a class library based on GNU Classpath, a garbage collector, and a bytecode interpreter. The interpreter is used as a fallback in those cases where, for some reason, code could not be compiled to object code ahead of time.

We've generally followed the JPackage guidelines for packaging Java programs, and in particular we're using their system of JRE alternatives. This enables switching between different runtimes; a package called java-gcj-compat presents gcj and libgcj as the default alternative.
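
To see which runtime the java command currently resolves to, one can follow the chain of alternatives symlinks. The following is a minimal sketch for illustration only: resolve_cmd is a hypothetical helper name, it assumes GNU readlink is available, and the path it prints depends on which JRE packages are installed.

```shell
# resolve_cmd NAME: print the final target of the NAME command on PATH,
# following any chain of symlinks (hypothetical helper, for illustration).
resolve_cmd() {
    path=$(command -v "$1") || return 1
    readlink -f "$path"
}

# Example: show what "java" really runs, if a java command is installed.
if command -v java >/dev/null 2>&1; then
    resolve_cmd java
fi
```

On a system that uses the alternatives mechanism, alternatives --display java lists the registered runtimes and alternatives --config java switches between them.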

Starting with GCC 4.0, gcj has included a new "binary compatibility ABI" which implements the binary compatibility rules as laid out by the Java Language Specification. This ABI makes it much simpler to compile existing applications.

In addition, the gcj runtime includes a class map which maps class file contents onto shared libraries. This makes it possible for gcj-compiled shared libraries to be used without any changes to the compiled applications, regardless of how they happen to find the classes that they load. There is a built-in, system-wide class mapping database; the gcj-dbtool program can be used to manipulate it.

The simplest way to deploy an application using gcj is to use these two features. We provide a helper program, aot-compile-rpm, which we use in all the applications we ship. This program will compile all the jar files in a package and register the resulting shared libraries in the global database.

Compiling ahead of time is optional (it is entirely possible to run applications using the bytecode interpreter that comes with gcj), but it results in more efficient programs. Because the use of shared libraries is invisible to the application, switching JRE alternatives will work fine.

How to compile an existing Java application

Here is how to compile and deploy an existing jar using the binary compatibility ABI. This example shows the simplest possible approach; when building RPMs a somewhat more complicated approach is taken, one that allows for multiple applications to easily share a single database.

First, compile the jar to native code:

    gcj -shared -findirect-dispatch -fjni -fPIC -Wl,-Bsymbolic \
      mypackage.jar \
      -o mypackage.jar.so

What this means:

  • -shared and -fPIC are standard options when creating shared libraries.
  • -Wl,-Bsymbolic tells the linker to change how global symbol references are handled; see the 'ld' manual for more information. In our context this is a useful optimization.
  • -findirect-dispatch is the flag to enable the binary compatibility ABI.
  • -fjni is used to tell gcj that 'native' methods should use JNI, rather than the default CNI.
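
For a package with several jars, the same command can be wrapped in a loop. This is a minimal sketch, not the aot-compile-rpm helper itself: so_name_for and compile_jars are hypothetical names, and the script assumes gcj is on the PATH.

```shell
# Sketch: compile every jar in a directory with the flags shown above.
# so_name_for and compile_jars are hypothetical helper names.

# Map a jar path to the shared library name used in this article,
# e.g. lib/mypackage.jar -> lib/mypackage.jar.so
so_name_for() {
    printf '%s.so\n' "$1"
}

# Compile each *.jar in directory $1 (default: current directory).
compile_jars() {
    dir=${1:-.}
    for jar in "$dir"/*.jar; do
        # Skip the unexpanded pattern when the directory has no jars.
        [ -e "$jar" ] || continue
        gcj -shared -findirect-dispatch -fjni -fPIC -Wl,-Bsymbolic \
            "$jar" -o "$(so_name_for "$jar")" || return 1
    done
    return 0
}
```

Calling compile_jars /path/to/jars would leave one .so file next to each jar; the function stops at the first jar that fails to compile.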

Now add the new jar and shared library to the database. You can see the name of the global database like so:

     gcj-dbtool -p

Adding a jar is simple; note that in order to add a jar to the global database, you must be root:

     gcj-dbtool -a `gcj-dbtool -p` mypackage.jar mypackage.jar.so

You can see if it worked by listing the contents of the database:

     gcj-dbtool -l `gcj-dbtool -p` | grep mypackage
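
The registration and check can be combined into one step. The sketch below uses only the -p, -a, and -l options shown above; register_jar and the GCJ_DBTOOL variable are hypothetical names, and it assumes the shared library sits next to the jar with a .so suffix.

```shell
# register_jar JAR: register JAR and JAR.so in the global database and
# confirm that the entry is listed. register_jar and GCJ_DBTOOL are
# hypothetical names introduced for this sketch.
GCJ_DBTOOL=${GCJ_DBTOOL:-gcj-dbtool}

register_jar() {
    jar=$1
    so=$jar.so
    # Both files must exist before they can be registered.
    [ -f "$jar" ] && [ -f "$so" ] || { echo "missing $jar or $so" >&2; return 1; }
    # Ask the tool for the name of the global database.
    db=$("$GCJ_DBTOOL" -p) || return 1
    # Add the jar -> shared library mapping (requires root for the global db).
    "$GCJ_DBTOOL" -a "$db" "$jar" "$so" || return 1
    # List the database and look for the jar's entry.
    "$GCJ_DBTOOL" -l "$db" | grep -F -- "$(basename "$jar" .jar)"
}
```

Run as root, register_jar mypackage.jar prints the matching database entries and returns a non-zero status if none are found.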

Finally, run your application the usual way, e.g.:

     java -cp mypackage.jar com.mypackage.Main

This will automatically use the shared library; you can verify that it is working by looking for mypackage.jar.so in the process's maps file, /proc/<pid>/maps.
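
That check can be scripted. The following is a minimal sketch; maps_has_lib is a hypothetical helper name, and the process ID is assumed to be supplied by the caller.

```shell
# maps_has_lib MAPS_FILE LIBNAME: succeed if LIBNAME appears in a
# /proc/<pid>/maps style file (hypothetical helper name).
maps_has_lib() {
    grep -q -F -- "$2" "$1"
}

# Example, assuming $pid holds the process ID of the running application
# (for instance from $! after starting java in the background):
# maps_has_lib "/proc/$pid/maps" mypackage.jar.so && echo "native code in use"
```

If the function fails for a running application, the runtime has most likely fallen back to the bytecode interpreter for that jar.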

Programs and libraries

Using this setup, Red Hat is able to build and ship a number of open source programs written in Java:

  • Eclipse, a Java-based integrated development environment. We also include Eclipse plugins for C/C++ development, Python development, and Bugzilla.
  • Tomcat, a servlet container.
  • Ant, a popular Java-based build tool.
  • OpenOffice.org, which includes some code written in Java.

The above programs have many library dependencies, which Red Hat also distributes.

One other library worth noting is java-gnome, a set of Java bindings for the GNOME and GTK+ libraries. This package makes it possible to write GNOME applications in Java.

Debugging

One advantage of gcj's approach is that we're able to reuse many of the existing tools, rather than write our own. So, for instance, OProfile and QProf can be used for profiling, and Valgrind can be used to track down memory problems in Java native code.

For debugging, we've modified gdb so that it understands gcj-compiled code. The gdb command line understands Java expression syntax and package naming. One of the major benefits of this approach is that it makes JNI debugging much, much easier—you can debug from your Java code through a small gcj-generated stub, into your JNI method, and then back out again.

Current limitations

This approach to running existing Java applications works well, but it is not without its difficulties.

The gcj class library is not yet complete; while a number of applications run using it, some will not. At the moment, the only way to tell whether a given application will work is to try it.

Debugging is also somewhat difficult. While gdb works reasonably well, it has some gcj-related bugs. More importantly, there is no way to debug a gcj-compiled application using existing Java tools, like Eclipse. That is, the debugging tools that do work are not the ones experienced Java developers expect.

Currently gcj does not support the 1.5 language features or class library additions.

Future plans

We are actively developing gcj to address most of the deficiencies previously mentioned. In particular:

  • We are looking into improving the performance of gcj-compiled code. Historically we have not written many gcj-specific optimizations in GCC; we are looking to add optimization passes that specifically improve performance of typical Java code.
  • We are working on integrating JDWP-based debugging into libgcj. This will enable debugging applications using Eclipse's Java debugger.
  • The class library is rapidly nearing completion. The development version is 95% complete, as compared to the 1.4 JDK (and 85% complete as compared to 1.5). However, development is ordinarily focused on getting real applications working, not on reaching an arbitrary amount of coverage.
  • Work is proceeding on a rewrite of the gcj frontend, called gcjx, which will support all of the 1.5 language features.
  • We are looking at including more applications and libraries in Fedora Core. JOnAS is nearly ready and should be working in the development tree soon. We've also looked into Azureus, a BitTorrent client, and RSSOwl, an RSS reader.
  • We're also looking into finishing the security infrastructure in gcj, so that we could ship a web browser plugin.

About the author

Tom Tromey is an engineer working on gcj at Red Hat. He has worked on many free software projects over the years, and co-wrote GNU Autoconf, Automake, and Libtool.