Definitions
Garbage collection relies on the observation that a previously
allocated memory location that is no longer referenced by any pointers
is no longer accessible to the application and is therefore
reclaimable. The role of a garbage collector is to identify and
recycle these inaccessible allocated memory chunks. There are two
families of garbage collecting techniques -- reference counting and
tracing [1].
Reference counting keeps track of the number of references to a memory
cell. The cell is considered reclaimable whenever this number becomes
zero. The compiler or the runtime system is responsible for adjusting
the reference count every time a pointer is updated or copied. The
main advantage of reference counting is that the computational
overhead is distributed throughout the execution of the program, which
ensures a smooth response time. Reclaiming an isolated cell doesn't
imply accessing other cells, and there are optimizations that reduce
the cost of deleting the last reference to an entire group of cells.
The major weakness of reference counting is its inability to reclaim
cyclic structures such as doubly-linked lists or trees featuring leaf
nodes with pointers back to their roots.
Reference counting can be optimized to reduce the number of times
reference count fields have to be updated and to reduce the size of
the reference count field. It can also be modified to detect pointers
that are part of a cycle or can even be combined with a tracing
algorithm, described below, in order to reclaim cyclic data.
Reachable (live) memory locations are those that are directly or
indirectly pointed to by a primordial set of pointers called the
roots. Tracing algorithms start from the roots and examine each
allocated memory cells' connectivity.
There are two kinds of tracing algorithms: "mark and sweep" and
"copying." Mark and sweep marks active memory cells (those that are
reachable through pointer reference, starting with the roots) whenever
the amount of free memory goes below a certain threshold, and then
sweeps through the allocated cells. When sweeping is complete,
unmarked cells are returned to the available memory pool. The
advantage of this approach is its ability to reclaim cyclic data
structures. But if implemented as described, it may not be acceptable
for an application with some real-time constraints for reasons I will
explain below.
The copying algorithm uses a heap divided into halves called
semi-spaces. One semi-space, called the current space, is used for
memory allocation. When garbage collection occurs, active memory
cells are traced from the current space and copied into the second
space. When done, the second space contains only live cells
(non-garbage) and becomes the new active space. The cells get
automatically compacted as a result of this process. Unreachable
cells (garbage) are not copied and are thus naturally reclaimed. The
copying algorithm ensures fast allocation of new cells and reduces
fragmentation. On the other hand, it halves the memory available to
the application and touches all the living memory cells during the
copy algorithm, which may lead to a considerable number of page faults
on a system using pagination.
Tracing garbage collectors can be either accurate/exact (able to
distinguish pointer from non-pointer data) or conservative (consider
every word encountered as a potential pointer). The type of tracing
garbage collector you use -- exact or conservative -- affects the way
pointers are identified within the location where roots can be found
(registers, stack, static areas) and the way the garbage collector
scans cells looking for pointers to other cells.
A conservative garbage collector needs to examine every memory word as
a potential pointer. This process isn't guaranteed to succeed for all
instances of the considered words and may lead to misidentification
errors. This implies that memory cells that are actually garbage may
accidentally be retained. Furthermore, a fully conservative garbage
collector can't venture moving memory blocks around, wrongly updating
memory references it thinks point to them. It implies that a fully
conservative garbage collector can't be a fully compacting collector.
Conservative collectors can be altered to be mostly copying: objects
that are solely accessible from other heap-allocated object are
copied. They are deployed on systems where no cooperation from the
compiler or the runtime system is expected to help the collector tell
whether a word is a pointer or not.
|