No Silver Bullet - Garbage Collection for Java in Embedded Systems


  Contents Next >

Definitions

Garbage collection relies on the observation that a previously allocated memory location that is no longer referenced by any pointers is no longer accessible to the application and is therefore reclaimable. The role of a garbage collector is to identify and recycle these inaccessible allocated memory chunks. There are two families of garbage collecting techniques -- reference counting and tracing [1].

Reference counting keeps track of the number of references to a memory cell. The cell is considered reclaimable whenever this number becomes zero. The compiler or the runtime system is responsible for adjusting the reference count every time a pointer is updated or copied. The main advantage of reference counting is that the computational overhead is distributed throughout the execution of the program, which ensures a smooth response time. Reclaiming an isolated cell doesn't imply accessing other cells, and there are optimizations that reduce the cost of deleting the last reference to an entire group of cells. The major weakness of reference counting is its inability to reclaim cyclic structures such as doubly-linked lists or trees featuring leaf nodes with pointers back to their roots.

Reference counting can be optimized to reduce the number of times reference count fields have to be updated and to reduce the size of the reference count field. It can also be modified to detect pointers that are part of a cycle or can even be combined with a tracing algorithm, described below, in order to reclaim cyclic data.

Reachable (live) memory locations are those that are directly or indirectly pointed to by a primordial set of pointers called the roots. Tracing algorithms start from the roots and examine each allocated memory cells' connectivity.

There are two kinds of tracing algorithms: "mark and sweep" and "copying." Mark and sweep marks active memory cells (those that are reachable through pointer reference, starting with the roots) whenever the amount of free memory goes below a certain threshold, and then sweeps through the allocated cells. When sweeping is complete, unmarked cells are returned to the available memory pool. The advantage of this approach is its ability to reclaim cyclic data structures. But if implemented as described, it may not be acceptable for an application with some real-time constraints for reasons I will explain below.

The copying algorithm uses a heap divided into halves called semi-spaces. One semi-space, called the current space, is used for memory allocation. When garbage collection occurs, active memory cells are traced from the current space and copied into the second space. When done, the second space contains only live cells (non-garbage) and becomes the new active space. The cells get automatically compacted as a result of this process. Unreachable cells (garbage) are not copied and are thus naturally reclaimed. The copying algorithm ensures fast allocation of new cells and reduces fragmentation. On the other hand, it halves the memory available to the application and touches all the living memory cells during the copy algorithm, which may lead to a considerable number of page faults on a system using pagination.

Tracing garbage collectors can be either accurate/exact (able to distinguish pointer from non-pointer data) or conservative (consider every word encountered as a potential pointer). The type of tracing garbage collector you use -- exact or conservative -- affects the way pointers are identified within the location where roots can be found (registers, stack, static areas) and the way the garbage collector scans cells looking for pointers to other cells.

A conservative garbage collector needs to examine every memory word as a potential pointer. This process isn't guaranteed to succeed for all instances of the considered words and may lead to misidentification errors. This implies that memory cells that are actually garbage may accidentally be retained. Furthermore, a fully conservative garbage collector can't venture moving memory blocks around, wrongly updating memory references it thinks point to them. It implies that a fully conservative garbage collector can't be a fully compacting collector.

Conservative collectors can be altered to be mostly copying: objects that are solely accessible from other heap-allocated object are copied. They are deployed on systems where no cooperation from the compiler or the runtime system is expected to help the collector tell whether a word is a pointer or not.


  Contents Next >