4. Run-time
Running a compiled Java program will need a suitable Java run-time
environment. This is in principle no different from C++ (which
requires an extensive library, as well as support for memory
allocation, exceptions, and so on). However, Java requires more
run-time support than "traditional" languages: It needs
support for threads, garbage collection, type reflection (meta-data),
and all the primitive Java methods. Full Java support also means being
able to dynamically load new bytecoded classes, though this may not be
needed in some embedded environments. Basically, the appropriate Java
run-time environment is a Java Virtual Machine.
It is possible to have jc1 produce code
compatible with the Java Native Interface ABI. Such code could run
under any Java VM that implements the JNI. However, the JNI has
relatively high overhead, so if you are not concerned about binary
portability it is better to use a more low-level ABI, similar to the
VM's internal calling convention. (If you are concerned about
portability, use .class files.) While we
plan to support the portable JNI, we will also support such a
lower-level ABI. Certainly the standard Java functionality (such as
that in java.lang) will be compiled to the
lower-level ABI.
A low-level ABI is inherently dependent on a specific VM. We are
using Kaffe, a free Java VM written by Tim Wilkinson
(see Bibliography for Kaffe), with help from
volunteers around the "Net." On many common platforms Kaffe uses a JIT
compiler; elsewhere it uses a conventional bytecode
interpreter, which is quite portable (except for thread
support). Using a JIT compiler makes it easy to call between
pre-compiled and dynamically loaded methods (since both use the same
calling convention).
We are making many enhancements to make Kaffe a more suitable
target for pre-compiled code. One required change is to add a hook so
that pre-compiled and pre-allocated classes can be added to the global
table of loaded classes. This means implementing the RegisterClass function.
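As an illustration only, the registration hook can be modelled in Java as a table keyed by class name. This is a sketch of the idea, not Kaffe's actual data structure or the signature of RegisterClass: the names LoadedClassTable, ClassInfo, registerClass, and lookup are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a VM's global table of loaded classes.
// None of these names come from Kaffe; they only illustrate the hook.
final class LoadedClassTable {
    // Hypothetical stand-in for a pre-allocated class descriptor.
    static final class ClassInfo {
        final String name;
        final boolean precompiled;

        ClassInfo(String name, boolean precompiled) {
            this.name = name;
            this.precompiled = precompiled;
        }
    }

    private final Map<String, ClassInfo> table = new LinkedHashMap<>();

    // Adds a descriptor to the table. Returns false, leaving the table
    // unchanged, if a class of the same name is already registered.
    synchronized boolean registerClass(ClassInfo info) {
        return table.putIfAbsent(info.name, info) == null;
    }

    // Returns the registered descriptor, or null if the class is unknown.
    synchronized ClassInfo lookup(String name) {
        return table.get(name);
    }

    public static void main(String[] args) {
        LoadedClassTable vm = new LoadedClassTable();
        vm.registerClass(new ClassInfo("java.lang.Object", true));
        System.out.println(vm.lookup("java.lang.Object").precompiled);
    }
}
```

In this model, pre-compiled classes would register themselves at start-up, before any lookup by name, so that the rest of the VM cannot tell them apart from classes loaded from bytecode.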
Other changes are not strictly needed, but are highly
desirable. The original Kaffe meta-data had some redundant
data. Sometimes redundancy can increase run-time efficiency
(e.g., caching, or a method dispatch table for virtual method
calls). However, the gain has to be balanced against the extra
complication and space. Space is especially critical for embedded
applications, which are an important target for us. Therefore, we have
put some effort into streamlining the Kaffe data structures, such as
replacing linked lists by arrays.
4.1 Debugging
Our debugging strategy for Java is to enhance gdb (the GNU debugger) so it can understand
Java-compiled code. This follows from our philosophy of treating Java
like other programming languages. This also makes it easier to debug
multi-language applications (C and Java).
Adding support for Java requires adding a Java expression parser,
and routines to print values and types in Java syntax. It should be
easy to modify the existing C and C++ language support routines, since
Java expressions and primitive types are very similar to those of C++.
Adding support for Java objects is somewhat more work. Getting,
setting, and printing of Object fields is
basically the same as for C++. Printing an Object reference can be
done using a format similar to that used by the default toString method: the class name followed by the
address, such as java.io.DataInput@cf3408. Sometimes you instead
want to print the contents of the object, rather than its
address (or identity). Strings should, by default, be printed using
their contents, rather than their address. For other objects, gdb can invoke the toString method to get a printable
representation, and print that. However, there should be different
options to get different styles of output.
gdb can evaluate a general
user-supplied expression, including a function call. For Java, this
means we must add support for invoking a method in the program we are
debugging. Thus, gdb has to be able to
know the structure of the Java meta-data so it can find the right
method. Alternatively, gdb could invoke
functions in the VM to do the job on its behalf.
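The second alternative, asking the running program to find and invoke the method, resembles what Java code can do through the standard reflection API. The sketch below assumes that API (java.lang.reflect.Method); the class MethodInvoker and its helper are hypothetical, and nothing here describes how gdb itself would be implemented.

```java
import java.lang.reflect.Method;

// Hypothetical helper: look up a method by name in the class meta-data
// and invoke it, as a debugger might ask the VM to do on its behalf.
final class MethodInvoker {
    // Finds a public no-argument method by name and calls it on target.
    // Any reflection failure is reported as an IllegalArgumentException.
    static Object invokeNoArg(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot invoke " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeNoArg("abc", "length"));
        System.out.println(invokeNoArg(new StringBuilder("gdb"), "toString"));
    }
}
```

The lookup by name is the part that depends on the meta-data layout; a debugger that reads the meta-data directly would have to reproduce that search itself.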
gdb has an internal representation of
the types of the variables and functions in the program being
debugged. Those are read from the symbol-table section of the
executable file. To some extent this information duplicates the
meta-data that we already need in the program's address space. We
can save some file space if we avoid putting duplicate meta-data in
the symbol table section, and instead extend gdb
so it can get the information it needs from the running
process. This also makes gdb start-up faster, since it makes it easier
to only create type information when needed.
Potentially duplicated meta-data includes the source line
numbers. This is because a Java program needs to be able to do a stack
trace, even without an external debugger. Ideally, the stack trace
should include source line numbers. Therefore, it is best to put the
line numbers in a special read-only section of the executable. This
would be pointed to by the method meta-data, where both gdb and the
internal Java stack dumper can get at them. (For embedded systems one
would probably leave line numbers out of the run-time, and only keep
them in the debug section of the executable file.)
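To show what an internal stack dumper needs from the line-number data, here is a small sketch using the StackTraceElement API that later Java releases (1.4 and up) provide. The class StackDump and its output format are hypothetical. A frame whose line number was left out, as suggested for embedded systems, is printed without one.

```java
// Hypothetical in-process stack dumper built on StackTraceElement.
final class StackDump {
    // Formats one frame. The line number is shown only when the run-time
    // has it; getLineNumber() is negative when it is unavailable.
    static String formatFrame(StackTraceElement f) {
        String where = (f.getLineNumber() >= 0 && f.getFileName() != null)
                ? f.getFileName() + ":" + f.getLineNumber()
                : "line unknown";
        return f.getClassName() + "." + f.getMethodName() + " (" + where + ")";
    }

    // Prints every frame recorded in the given Throwable.
    static void dump(Throwable t) {
        for (StackTraceElement f : t.getStackTrace()) {
            System.out.println("  at " + formatFrame(f));
        }
    }

    public static void main(String[] args) {
        dump(new Throwable());
    }
}
```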
Extracting symbolic information from the process rather than from
the executable file is also more flexible, since it makes it easier to
also support new classes that are loaded in at run-time. While the
first releases will concentrate on debugging pre-compiled Java code,
we will want to debug bytecodes that have been dynamically loaded into
the VM. This problem is eased if the VM uses JIT (as Kaffe does),
since in that case the representation of dynamically-(JIT-)compiled
Java code is the same as pre-compiled code. However, we still need to
provide hooks so that gdb knows when a new
class is loaded into the VM.
Long-term, it might be useful to download Java code into gdb itself (so we can extend gdb using Java), but that requires integrating a
Java evaluator into gdb.
4.2 Profiling
One problem with Java is the lack of profiling tools. This makes it
difficult to track down the "hot-spots" in an
application. Using GCC to compile Java to native code lets us use
existing profiling tools, such as gprof,
and the gcov coverage analyzer.