Issue #10 August 2005

Debugging code with strace

Have you ever racked your brain to find that bug? You know, the bug that you can't find in the source code, but which appears with deliberate consistency when your code is compiled and run. Hacker, meet strace. strace is a utility that allows you to trace system calls and signals for a given command and its arguments. What commands, you ask? Whatcha got?

strace is free software governed by a BSD-style license. It was originally written by Paul Kranenburg for SunOS and was inspired by the SunOS trace utility. The SunOS version was ported to Linux by Branko Lankester, who also wrote the Linux kernel support. In 1993, Rick Sladkey merged strace 2.5 for SunOS and the second release of strace for Linux, added many of the features of truss(1) from SVR4, producing an strace that worked on both platforms. Today, strace is maintained by Wichert Akkerman and Roland McGrath. strace is available in Red Hat® Enterprise Linux® 4 at version 4.5.8, Fedora™ Core 4 at version 4.5.11, and is also available at version 4.5.12 from SourceForge.net. Version 4.5.12 is discusses in this article.

Installing strace

If you don't have strace installed and are running Red Hat Enterprise Linux 4, execute the following command as root to install it:

up2date strace

To install strace on Fedora Core 4, execute the following command as root:

yum install strace

Alternatively, building strace is easy. Download strace-4.5.12.tar.bz2 from your closest SourceForge.net mirror.

Decompress and untar the contents of strace-4.5.12.tar.bz2:

tar -xjvf strace-4.5.12.tar.bz2

Change directories to strace-4.5.12:

cd strace-4.5.12

Run configure:

./configure

You will see the following output:

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking build system type...
<snip>
checking whether sys_siglist is declared... yes
checking whether _sys_siglist is declared... yes
checking for perl... /usr/bin/perl
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: executing depfiles commands

Next, run make:

make

Run make install as root to install strace:

make install

When installing from the compiled source, strace will be installed in /usr/local/bin/ by default, but you can specify an installation prefix other than /usr/local/ by giving configure the option --prefix=PATH. (strace is installed in /usr/bin/ when installing from the Red Hat Enterprise Linux or Fedora Core RPMs.)

Tracing helloworld

strace works by intercepting and recording the system calls used by a process and the signals it receives. For each system call, the name, arguments, and return value are printed to standard error (stderr) or to a file specified. How about some examples?

To not obscure the workings of strace, I used the simplest program I could think of.

#include "stdio.h"

main()
{
 printf("Hello World \n");
}

A quick compile:

cc -o helloworld helloworld.c

and voila! The command ./helloworld now outputs:

Hello World

This may look like a picture of simplicity, and really it is, but now let's fire up strace. strace will trace and write to stderr all system calls and signals:

strace ./helloworld

Output of strace ./helloworld

Note:
I've added line numbers to make the strace output more readable, but this isn't written into strace.

The first thing we see is execve. execve() executes the program pointed to by filename. Next, uname() gets the name and type of the system I used for testing. Then our program asks for some memory (brk), and memory maps the shared libraries it needs to operate, like the dynamic loader and libc, all of which we'll get back to in a second. Line 23 of the strace output finally reveals the mission of our helloworld program (had we not seen the source): write to file descriptor 1 (stdout) some output and then report how many bytes that output is and what they are. It's not hard to see that using strace can help you track down a particular call or signal in a program. Why is this useful? Well, if you're writing any program slightly more complex than helloworld.c and it's crashing with little or no information as to what happened, strace can help you find that crash in progress.

Tracing date

Let's consider one such slightly more complex, but altogether common example. The date program makes all kind of system calls other than outputting the date and time. Run strace on the date command;

strace date

Output of strace date

We see execve(), uname(), and the same linker and threading information (assuming it's the same system as determined by uname()), and then the real work begins. Line 44 shows that local.archive is invoked and our locale is determined from a list of possible locales (line 48). On line 61, date calls clock_gettime() to determine what the "real time" is, then our timezone is determined (line 62) via a read at line 66. This all culminates in a write on line 73 in the same basic format as our Hello World example, except that in this case date is producing the output based on the configuration of a file, rather than our input.

Now about all that linker, threading, and memory output. We see a few references to different components of ld. ld is the GNU linker, and its job is to combine object and archive files, relocate their data, and resolve symbol references. Running ld is usually the last step in compiling a program. We also see a lot of memory mapping and unmapping going on. Every time we read or write something, we'll see mmap (in this case mmap2) and munmap. This is our program placing files and/or devices in memory. This is extremely useful in debugging memory intensive programs because you can see exactly where in the program you've written maps specific information from reads and writes into memory. More information on mmap2 is available at linuxinfor.com.

Tracing a process ID

Now, if you're a sysadmin you might be sitting there thinking, "What's this all got to do with me?" Well, you're responsible for the working order of your systems, and therefore you have to test system software before and after upgrades. Often sysadmins find themselves in the position of having to be an after-market debugger: find the bug, send in a report, yadda yadda. That's what we do. Well, strace can help.

One of the cooler options in strace is the -p argument, which allows you to attach to an existing process ID and track the system calls it makes. For example, suppose one day, you've found a new version of Bash is available. You update your system and begin noodling through your regular daily routine the next morning, but you find that this new version exhibits some unexpected behavior.

While in a shell, get the process ID of the shell with the command:

echo $$

Let's say the process ID is 12307. In another shell, start strace with the -p option:

strace -p 12307

which will return:

Process 12307 attached - interrupt to quit
read(0,

The unfinished read() function sits waiting for input. Suppose the user executes the date command. The first thing we see, since we are attached to the shell process, is those characters being typed in:

Process 12307 attached - interrupt to quit
read(0, "d", 1)                         = 1
write(2, "d", 1)                        = 1
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
read(0, "a", 1)                         = 1
write(2, "a", 1)                        = 1
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
read(0, "t", 1)                         = 1
write(2, "t", 1)                        = 1
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
read(0, "e", 1)                         = 1
write(2, "e", 1)                        = 1
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
read(0,

Again, read() waits. The user hits Enter.

Output after user hits enter

Sifting through rt_ and ioctl calls, we find our prompt written out at line 43, having been written to the shell itself back at line 18. This write serves as a nice demarcation for all the other system calls going on, including the undocumented rt_ calls. You can try this at home. You get the picture. You can trace all manners of bugs with strace.

Conclusion

strace is a very useful diagnostic and debugging tool. Programmers will find it useful for bug isolation, and system administrators will find it invaluable when tracking down problems in common programs.

The power of strace comes from the ability to locate bugs when either the source is not available or when a recompile is impractical. This does have the potential benefit of allowing you to see how non-open source software works. While some consider this to be an "evil" attribute, really it can only serve to provide the programmer/administrator with more insight into the workings of such a program. After all, the methods and programming styles used within the software can still remain a mystery without further forensic consideration. strace something.

About the author

Matt Frye is a UNIX/Linux system administrator living in North Carolina. He is Chairman of the North Carolina System Administrators and is an active member of the Triangle Linux User Group. In his spare time, he enjoys fly fishing and mental Kung Foo.