The Heisenberg Debugging Technology



The Cygnus Solution: Introspect

Introspect is an extension to GDB which provides flexible, efficient, and non-intrusive debugging for embedded systems. Introspect consists of code in GDB, and a software 'agent' running on the target system, which together allow developers to specify 'tracepoints' (analogous to breakpoints). A tracepoint is a code location, and data to be collected whenever execution reaches that location. Introspect has a number of nice properties:

  • Data collection is not instantaneous, but it is quick. Recording the data for a typical tracepoint takes a few thousand machine cycles --- less than the time required to send a single character over a 9600 baud serial line. This latency is small enough to be tolerable when debugging many embedded applications.
  • You choose what data to collect, and when to collect it, interactively at debug time. You can examine data collected, choose new tracepoints, and begin collection immediately, without recompiling and reloading the program.
  • You can use arbitrary source language expressions to specify which data to collect. This notation is both familiar and powerful.
  • You have the full power of GDB available to examine logged data. Once the user has selected a trace event, all GDB commands work normally, as long as they refer only to data (registers and memory values) actually collected at that event.
  • Introspect requires no special-purpose hardware, like emulators or analyzers. It is implemented completely in software.
  • Introspect's instrumentation can be removed as easily as it was added, so that if you are not actively recording data, it need not affect the software in the field at all.
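To put the first property in perspective, the comparison with a serial line can be worked out directly. The sketch below is only an illustration: it assumes 8N1 framing (one start bit, eight data bits, one stop bit, so ten bits per character), and the function name and the 25 MHz clock used in the note that follows are hypothetical, not taken from any particular target.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per character, assuming 8N1 framing:
   1 start bit + 8 data bits + 1 stop bit.  */
#define BITS_PER_CHAR 10u

/* Return the number of CPU cycles that elapse while one character
   is sent at BAUD bits per second, on a CPU clocked at CPU_HZ.
   The result is rounded down.  */
uint64_t
cycles_per_serial_char (uint64_t cpu_hz, uint32_t baud)
{
  assert (baud > 0);
  return cpu_hz * BITS_PER_CHAR / baud;
}
```

Under these assumptions, one character at 9600 baud takes a little over a millisecond; on a 25 MHz processor, cycles_per_serial_char (25000000, 9600) comes to roughly 26,000 cycles, so a tracepoint costing a few thousand cycles fits comfortably within that time.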

 

For example, suppose you want to observe the behavior of the following code:

    struct point {
      double x, y;
    };

    /* A vector is an array of points. N is the number of
       points, and p points to the first point in the array. */
    struct vector {
      int n;
      struct point *p;
    };

    /* A binary tree of vectors, ordered by KEY. */
    struct tree {
      struct tree *left, *right;
      int key;
      struct vector *vector;
    };

    /* Return the node in TREE whose key is KEY.
       Return zero if there is no such node. */
    struct tree *
    find (struct tree *tree, int key)
    {
      if (! tree)
        return 0;

      if (key < tree->key)
        return find (tree->left, key);
      else if (key > tree->key)
        return find (tree->right, key);
      else
        return tree;
    }


 

Each time the 'find' function is invoked, you would like to see the call stack, the tree node being visited, and (to make things interesting) the last point in that tree node's vector.

We could certainly do this with an ordinary debugger. Many debuggers will let you set a breakpoint at a certain location, and then list some commands or macros that will be executed when the breakpoint is hit. You could use these to "collect" the values of program variables.

However, most debuggers don't provide any convenient way to record the data collected. Moreover, although the debugger could collect data much faster than a human could, it would often not be fast enough for some purposes --- communication between the debugger and the target is usually slow, and the typical debugger's macro language is not especially efficient.

Analyzing the collected data, on the other hand, is something that a normal debugger is good at. Debuggers know how to reinterpret a set of raw register values and memory contents as source-level constructs like stack frames, variables with names and types, data structures, and so on. Given the name of a variable, a debugger knows how to look up the binding currently in scope for that name, determine its size and type, and find its value in a register, on the stack, or in static memory.

So we see that a debugger could be used for two of the three tasks involved in trace debugging, but would not really be good at the central task --- the actual data collection. Suppose we delegate that task to a separate trace collection agent, running on the target system and configured by GDB? Then a trace debugging experiment might look something like this:

  • Specify the trace experiment:

Using the debugger, the user places tracepoints in the program. For each tracepoint, the user specifies the data to be collected using source code names and expressions, just as one would in a print, watch, or display command.

  • Run the experiment:

After downloading the tracepoints to the trace collection agent, the debugger allows the program to run. Each time a tracepoint is reached, the trace collection agent (which may in fact be linked directly into the program) wakes up, quickly records the desired data in a memory buffer on the target board, and allows the program to resume. (Note that this involves no interaction with the debugger, and therefore no communication over slow serial links.)

  • Analyze the results:

By querying the trace collection agent, the debugger can access the collected data, and "replay" the tracepoint events. The contents of each record in the trace buffer (each corresponding to the execution of a tracepoint) can be displayed in sequence or in any order.

Although intrusive, this method will affect the timing of the running system far less than would be the case if the user, or even the debugger, were involved in collecting the data. No matter how long the trace collection agent requires to service an interrupt and collect the data, it will surely be less time than would be required to send a message over a serial line, or move a human's finger over a keyboard! The degree of intrusiveness can be reduced by careful optimization of the trace collection agent.

Let's look at the three phases in more detail.

Specification phase:

Using the traditional debugging model, you might ask your debugger to stop at the beginning of the 'find' function, display the function call stack, and show the values of '*tree' and 'tree->vector.p[tree->vector.n - 1]'.

Using GDB, you might accomplish that task using commands like these:

    (gdb) break find
    (gdb) commands
    > where
    > print *tree
    > print tree->vector.p[tree->vector.n - 1]
    > continue
    > end
    (gdb)

Suppose instead you wanted to set up a trace experiment to collect the same values. The analogous commands might look like this:

    (gdb) trace find
    (gdb) actions
    > collect $stack
    > collect $locals
    > collect *tree
    > collect tree->vector.p[tree->vector.n - 1]
    > end
    (gdb)

In both cases, GDB does not immediately do anything, other than to remember what the user wants to happen later. Nothing really happens until the program is run, and 'find' is called. Then, in the case of the breakpoint, the backtrace, '*tree', and the vector's point are displayed right away. In the case of the tracepoint, the values are stored in the trace buffer for later retrieval.

Special syntax is provided for collecting certain commonly useful sets of data:

    > collect $regs      // all registers
    > collect $locals    // all locals and arguments
    > collect $stack     // a fixed-size chunk of stack

Collecting a chunk of the program stack is especially useful during the analysis phase (see below).


 

The collection phase:

Each time the program reaches a tracepoint, the tracepoint's data is logged in a buffer on the target machine. Each log entry is called an 'event'; each event contains the number of the tracepoint reached, and any register values and memory contents needed to evaluate the tracepoint's expressions.

It is important to understand that an event does not simply record the values of each expression to be collected. Rather, it records everything GDB might need to re-evaluate that expression later. In the example above, to collect '*tree', the event would record both the register containing the variable 'tree', and the memory the tree node occupies.
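As a rough illustration of what such an event might hold, here is a minimal sketch in C. The actual layout used by the Introspect agent is not described here, so every name, size, and limit below is an assumption made for the example. In particular, this sketch simply refuses new events when the buffer is full, which may not be what a real agent does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS      16   /* hypothetical register count */
#define MAX_BLOCKS     4   /* memory blocks saved per event */
#define MAX_BLOCK_LEN 32   /* bytes saved per memory block */
#define MAX_EVENTS     8   /* capacity of the trace buffer */

/* A saved copy of LEN bytes of target memory starting at ADDR.  */
struct mem_block {
  uintptr_t addr;
  size_t len;
  unsigned char bytes[MAX_BLOCK_LEN];
};

/* One trace event: which tracepoint was hit, plus the raw
   registers and memory needed to re-evaluate its expressions.  */
struct trace_event {
  int tracepoint;
  uintptr_t regs[NUM_REGS];
  int n_blocks;
  struct mem_block blocks[MAX_BLOCKS];
};

struct trace_buffer {
  int n_events;
  struct trace_event events[MAX_EVENTS];
};

/* Start a new event for TRACEPOINT, saving the NUM_REGS values in
   REGS.  Return the event, or NULL if the buffer is full.  */
struct trace_event *
event_begin (struct trace_buffer *buf, int tracepoint,
             const uintptr_t *regs)
{
  struct trace_event *ev;

  assert (buf && regs);
  if (buf->n_events == MAX_EVENTS)
    return NULL;
  ev = &buf->events[buf->n_events++];
  ev->tracepoint = tracepoint;
  memcpy (ev->regs, regs, sizeof ev->regs);
  ev->n_blocks = 0;
  return ev;
}

/* Save LEN bytes starting at SRC in EV, remembering their address.
   Return 0 on success, or -1 if the block does not fit.  */
int
event_save_memory (struct trace_event *ev, const void *src, size_t len)
{
  struct mem_block *b;

  assert (ev && src);
  if (ev->n_blocks == MAX_BLOCKS || len > MAX_BLOCK_LEN)
    return -1;
  b = &ev->blocks[ev->n_blocks++];
  b->addr = (uintptr_t) src;
  b->len = len;
  memcpy (b->bytes, src, len);
  return 0;
}
```

The point of keeping addresses alongside the raw bytes is that the debugger can later answer "what was at this address?" for any expression, rather than only replaying the expressions named in the tracepoint.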

To begin collection, we use GDB's 'tstart' command (which downloads the trace experiment to the trace collection agent), and then let the program run:

    (gdb) tstart
    (gdb) continue

As the program runs, the agent will collect trace data.

Analysis phase:

Again using the traditional debugging model, you might:

1) Run until you reach a breakpoint

2) Note where you are in the program

3) Look at the values of data and/or registers

4) Continue to the next breakpoint

If instead you were debugging the results of a trace experiment, you would:

1) Select a particular tracepoint event to examine

2) Note where that event occurred

3) Look at the values of collected data and/or registers

4) Select another tracepoint event

Continuing our example above, we can use the 'tfind start' command to select the first recorded event:

    (gdb) tfind start
    Tracepoint 1, find (tree=0x8049a50, key=5) at samp.c:24
    24        if (! tree)

Since we have collected '$stack', we can use GDB's 'where' command to show the currently active frames. '$stack' saves only a fixed (configurable) number of bytes from the top of the stack, but usually saves enough to capture the top few frames.

    (gdb) where
    #0 find (tree=0x8049a50, key=5) at samp.c:24
    #1 0x8048744 in main () at main.c:8

Since we have collected '*tree', we can examine that data structure.

    (gdb) print *tree
    $1 = {left = 0x80499b0, right = 0x8049870, key = 100, vector = 0x8049a68}
    (gdb) print tree->key
    $2 = 100
    (gdb) print tree->left
    $3 = (struct tree *) 0x80499b0

Note that only those objects actually collected are available for inspection. Although the left subtree was collected at the next tracepoint event, it was not collected in this one:

    (gdb) print *tree->left
    Data not collected.

However, in order to collect 'tree->vector.p[tree->vector.n - 1]', the agent had to collect both 'tree->vector.p' and 'tree->vector.n', so the entire 'tree->vector' structure is covered. Since the data is available, we can print it normally:

    (gdb) print *tree->vector
    $4 = {n = 2, p = 0x8049a78}

Introspect does not collect the entirety of every object mentioned in the expression. Rather, it collects the final value of the expression, along with any other data needed to evaluate the expression. Thus, although the last point in the vector was collected, none of the other points in the vector are available --- they were never referenced while evaluating the expression.

    (gdb) print tree->vector.p[1]
    $5 = {x = 3, y = -46}
    (gdb) print tree->vector.p[0]
    Data not collected.
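One way to picture this behavior is to list the memory reads that evaluating the expression performs. The sketch below is not how Introspect itself is implemented. It is a hand-written illustration for this one expression, with hypothetical helper names (note_read, collect_last_point, is_collected), and it tracks memory only. The variable 'tree' itself is assumed to live in a register, which an agent would save separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct point { double x, y; };
struct vector { int n; struct point *p; };
struct tree {
  struct tree *left, *right;
  int key;
  struct vector *vector;
};

#define MAX_RANGES 8

/* A range of memory that was read: LEN bytes starting at ADDR.  */
struct range { uintptr_t addr; size_t len; };

struct collected {
  int n;
  struct range r[MAX_RANGES];
};

/* Note that the LEN bytes at ADDR were read.  */
static void
note_read (struct collected *c, const void *addr, size_t len)
{
  assert (c->n < MAX_RANGES);
  c->r[c->n].addr = (uintptr_t) addr;
  c->r[c->n].len = len;
  c->n++;
}

/* Record every memory read needed to evaluate
   tree->vector->p[tree->vector->n - 1].  */
void
collect_last_point (struct collected *c, const struct tree *tree)
{
  const struct vector *v;

  assert (tree && tree->vector && tree->vector->n > 0);
  note_read (c, &tree->vector, sizeof tree->vector);
  v = tree->vector;
  note_read (c, &v->n, sizeof v->n);
  note_read (c, &v->p, sizeof v->p);
  note_read (c, &v->p[v->n - 1], sizeof (struct point));
}

/* Return nonzero if the LEN bytes at ADDR lie entirely inside
   one of the recorded ranges.  */
int
is_collected (const struct collected *c, const void *addr, size_t len)
{
  uintptr_t a = (uintptr_t) addr;
  int i;

  for (i = 0; i < c->n; i++)
    if (a >= c->r[i].addr && a + len <= c->r[i].addr + c->r[i].len)
      return 1;
  return 0;
}
```

With a two-point vector, the recorded ranges cover the 'vector' field of the node, the 'n' and 'p' fields of the vector, and the last point. The first point is never read, so a later request for it has nothing to draw on.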

So far, we've been inspecting the first tracepoint event. Let's walk forward through a few events, to see where the tree search ended. The 'tfind' command, given no arguments, selects the next trace event record:

    (gdb) tfind
    Tracepoint 1, find (tree=0x80499b0, key=5) at samp.c:24
    24        if (! tree)
    (gdb) where
    #0 find (tree=0x80499b0, key=5) at samp.c:24
    #1 0x80484fa in find (tree=0x8049a50, key=5) at samp.c:28
    #2 0x8048744 in main () at main.c:8
    (gdb) print *tree
    $6 = {left = 0x8049950, right = 0x80498f0, key = 3, vector = 0x80499c8}
    (gdb) tfind
    Tracepoint 1, find (tree=0x80498f0, key=5) at samp.c:24
    24        if (! tree)
    (gdb) where
    #0 find (tree=0x80498f0, key=5) at samp.c:24
    #1 0x8048523 in find (tree=0x80499b0, key=5) at samp.c:30
    #2 0x80484fa in find (tree=0x8049a50, key=5) at samp.c:28
    #3 0x8048744 in main () at main.c:8
    (gdb) print *tree
    $7 = {left = 0x0, right = 0x0, key = 5, vector = 0x8049908}

Note that successive events record the growing stack as function 'find' walks the tree recursively. Since we have found the tree node we were looking for, this is the last call to 'find' and the last tracepoint event in the log:

    (gdb) tfind
    Target failed to find requested trace event.
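The three events in this session correspond to three entries into 'find'. The sketch below rebuilds just the search path seen above (keys 100, 3, and 5) and counts the calls. It is an illustration under several assumptions: the 'vector' field is left out, the other subtrees visible in the session are omitted because their keys are not shown, and the counter 'find_calls' and the helper 'calls_to_find' are hypothetical stand-ins for a tracepoint at the top of 'find'.

```c
#include <assert.h>
#include <stddef.h>

/* A cut-down tree node: the vector field is omitted here.  */
struct tree {
  struct tree *left, *right;
  int key;
};

/* How many times find has been entered.  This stands in for the
   events a tracepoint at the top of find would log.  */
static int find_calls;

/* Return the node in TREE whose key is KEY.
   Return zero if there is no such node.  */
struct tree *
find (struct tree *tree, int key)
{
  find_calls++;

  if (! tree)
    return 0;

  if (key < tree->key)
    return find (tree->left, key);
  else if (key > tree->key)
    return find (tree->right, key);
  else
    return tree;
}

/* Search ROOT for KEY and return the number of entries into find,
   which is the number of events a tracepoint there would record.  */
int
calls_to_find (struct tree *root, int key)
{
  struct tree *node;

  find_calls = 0;
  node = find (root, key);
  assert (node == NULL || node->key == key);
  return find_calls;
}
```

Searching for key 5 enters 'find' once for each of the nodes 100, 3, and 5, matching the three logged events. A search for a missing key would log one extra event, for the final call with a null 'tree'.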

Because all the tracepoint events are stored in a buffer which may be accessed at random, Introspect provides the (somewhat eerie) ability to travel backwards in time. For example, the command 'tfind -' will select the tracepoint event immediately preceding the current event. As earlier events are selected, the program will appear to 'un-make' its recursive calls to 'find':

    (gdb) tfind -
    Tracepoint 1, find (tree=0x80499b0, key=5) at samp.c:24
    24        if (! tree)
    (gdb) where
    #0 find (tree=0x80499b0, key=5) at samp.c:24
    #1 0x80484fa in find (tree=0x8049a50, key=5) at samp.c:28
    #2 0x8048744 in main () at main.c:8
    (gdb) tfind -
    Tracepoint 1, find (tree=0x8049a50, key=5) at samp.c:24
    24        if (! tree)
    (gdb) where
    #0 find (tree=0x8049a50, key=5) at samp.c:24
    #1 0x8048744 in main () at main.c:8

Since all the events are available in the agent's buffer, you can examine them in any order you want. After examining one event, you could travel backwards in time, and examine the previous event.
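Random access of this kind amounts to moving an index over the logged events. The following sketch models the three forms of 'tfind' used above as operations on a cursor. The type and function names are hypothetical. The choice to leave the selection unchanged when a request fails is an assumption, made to match the session above, where 'tfind -' after the failed 'tfind' steps back from the last event.

```c
#include <assert.h>

/* A cursor over N_EVENTS logged events.  CURRENT is the index of
   the selected event, or -1 if no event has been selected yet.  */
struct trace_cursor {
  int n_events;
  int current;
};

/* Like 'tfind start': select the first event.  Return its index,
   or -1 if the log is empty.  */
int
cursor_start (struct trace_cursor *c)
{
  assert (c && c->n_events >= 0);
  if (c->n_events == 0)
    return -1;
  c->current = 0;
  return c->current;
}

/* Like 'tfind' with no arguments: select the next event.  Return
   its index, or -1 (leaving the selection alone) if there is none.  */
int
cursor_next (struct trace_cursor *c)
{
  assert (c && c->n_events >= 0);
  if (c->current + 1 >= c->n_events)
    return -1;
  c->current++;
  return c->current;
}

/* Like 'tfind -': select the previous event.  Return its index,
   or -1 (leaving the selection alone) if there is none.  */
int
cursor_prev (struct trace_cursor *c)
{
  assert (c && c->n_events >= 0);
  if (c->current <= 0)
    return -1;
  c->current--;
  return c->current;
}
```

With three logged events, stepping forward from the start visits indices 0, 1, and 2, a fourth step fails, and two steps backward return to index 0, just as in the session.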

Using all of these familiar commands in combination with GDB's built-in scripting language, the user can generate a listing or report of the trace results, in any format desired.

