The Heisenberg Debugging Technology


The Problem

The traditional way to debug a program using a debugger is to stop the program, examine its state, let it run some more, and so on. Each time you stop the program, aeons go by, from the machine's point of view.

This isn't a problem if your program is interacting only with other systems that will wait for it. However, if your code has important real-time deadlines to meet, the system may fall out of sync. Even worse, if your program is controlling a piece of machinery with physical parts (disk heads, robot arms) in motion, depending on your program to control them, you may do real damage by stopping the program. Worse yet, if the system you are debugging is deployed in the field, and controlling a traffic light, an elevator, or a motor vehicle, lives could be lost while you step through your code!

This amounts to what could be called the "Heisenberg Principle of Software Development": debugging the system may change its behavior.

Debugging aids such as emulators and logic analyzers are very good at reducing or nearly eliminating this intrusive effect, but they require expensive hardware, and many developers don't really know how to use them. The learning curve can be steep, since these tools usually use quite a different metaphor from the traditional "stop, look around, and continue" cycle of breakpoint-based debugging. Some of the high-end (expensive) versions of these tools are at least loosely integrated with a debugger, but many of the middle- and low-end ones are not (or only superficially so).

The poor man's approach to debugging time-critical code, code running in the field, and sometimes code that only manifests a bug at odd intervals, has often been to instrument the source code by hand with instructions that will write debugging state information out to a console or to a file or buffer. We call this method "printf debugging", and it has an ancient and honorable history. However, as useful as it may be, it is also cumbersome and limited; to add or remove a trace, you must go through an entire edit / compile / load cycle. The output is only as readable and informative as you make it. Unless you're really dedicated to the technique (and keep libraries of debugging routines), you must reinvent the wheel every time you use it, and you usually have to remove it all before you deploy your software to the field.
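To make the technique concrete, here is a minimal sketch of hand-written trace instrumentation in C. Everything in it is an assumption for illustration only: the names TRACE, NO_TRACE, trace_buf, trace_dump, and set_motor_speed, as well as the buffer sizes, are hypothetical and not taken from any particular project. Records go into a small in-memory ring buffer rather than to the console, so the running program pays only for one formatted write per trace, and the buffer can be dumped after the fact.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TRACE_SLOTS 8   /* hypothetical depth of the ring buffer */
#define TRACE_TEXT  64  /* hypothetical maximum length of one record */

static char trace_buf[TRACE_SLOTS][TRACE_TEXT];
static unsigned trace_count;  /* total number of records ever written */

/* Building with -DNO_TRACE strips every trace, e.g. for a field release. */
#ifdef NO_TRACE
#define TRACE(...) ((void)0)
#else
/* Format one record into the next slot, overwriting the oldest when full. */
#define TRACE(...) \
    ((void)snprintf(trace_buf[trace_count++ % TRACE_SLOTS], TRACE_TEXT, __VA_ARGS__))
#endif

/* Write the surviving records, oldest first, to the given stream. */
static void trace_dump(FILE *out)
{
    assert(out != NULL);
    unsigned first = trace_count > TRACE_SLOTS ? trace_count - TRACE_SLOTS : 0;
    for (unsigned i = first; i < trace_count; i++)
        fprintf(out, "[%u] %s\n", i, trace_buf[i % TRACE_SLOTS]);
}

/* Hypothetical time-critical routine, instrumented by hand. */
static int set_motor_speed(int rpm)
{
    TRACE("set_motor_speed: rpm=%d", rpm);
    if (rpm > 3000) {
        TRACE("clamped to 3000");
        rpm = 3000;
    }
    return rpm;
}
```

A caller would invoke set_motor_speed() as usual and later call trace_dump(stderr) to inspect what happened. The sketch also shows the drawbacks described above: adding or removing a single TRACE line means editing and rebuilding the program, and the output is only as informative as the format strings written by hand.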

These methods all share the advantage that you can let your program run at close to native speed, and then analyze what it did after the fact. How can we obtain this advantage without the disadvantages of expensive hardware, unfamiliar interfaces, and non-reusability?

