Issue #15 January 2006

Using valgrind to detect and prevent application memory problems

by Michael Behm


Note: This article has been corrected post-publication. An earlier version quoted out-of-date information about valgrind. We regret any inconvenience this has caused.

C, and to a lesser extent C++, are sometimes referred to as "portable assembly languages." This means that they are portable across platforms, but are low-level enough to comfortably deal with hardware and raw bits in memory. This makes them particularly suited for writing systems software such as operating systems, databases, network servers, and data/language processors. However, the runtime model of C/C++ does not include any checking of pointer use, so errors can easily creep in. Or, as valgrind creator Julian Seward puts it, "Valgrind is only useful because C and C++ are such crappy programming languages."1

Several kinds of pointer-use errors are known to every C/C++ programmer: accessing freed objects, going past buffer boundaries, and dereferencing NULL or other bad pointers can each result in a spectrum of effects, including random glitches, outright crashes, and security breaches (stack-smashing buffer overrun errors, where malevolent input causes pointers to go beyond their bounds and corrupt memory). Such bugs can induce hard-to-debug delayed failures.

"I used valgrind a bit while developing gcjx—I used it to verify that my memory management approach was working properly. It has been invaluable, and caught several real bugs."

Tom Tromey
Principal Software Engineer
Red Hat

Using valgrind

valgrind executes your program in a virtual machine, keeping track of the memory blocks that have been allocated, initialized, and freed. It simulates every instruction a program executes, which reveals errors not only in an application, but also in all supporting dynamically-linked libraries, including the GNU C library, the X client libraries, Qt (if you work with KDE), and so on.

valgrind can detect invalid write accesses, attempted reads of uninitialized memory, and the use of undefined values:

Using uninitialized memory.

Uninitialized data can come from uninitialized local variables or from malloc'ed blocks before your program has written data to that area.

Reading/writing memory after it has been freed.
Reading/writing after the end of malloc'ed blocks.

Errors that occur from attempting to access data at the wrong time or the wrong place.

Reading/writing inappropriate areas on the stack.

This can happen in two ways:

  • Illegal read/write errors occur when you try to access an address that is not in the address range of your program.

  • Invalid free errors occur when you try to free areas that have not been allocated.

Memory leaks.

Here, pointers to malloc'ed blocks are lost, so the memory is never freed.

Mismatched allocate/free functions.

In C++ you can choose from more than one function to allocate memory, as long as you follow these rules to free that memory:

  • Memory allocated with malloc, calloc, realloc, valloc, or memalign must be deallocated with free.

  • Memory allocated with new[] must be deallocated with delete[].

  • Memory allocated with new must be deallocated with delete.

Memory errors can be difficult to detect as the symptoms of the error may occur far from the cause. However, valgrind can detect all of the above errors, as well as:

  • Errors that occur because of invalid system-call parameters, such as passing uninitialized or unaddressable memory to a system call.

  • Overlapping src and dst pointers in memcpy() and related functions.

  • Some errors resulting from non-conformance to the POSIX pthreads API.
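To make two of these error classes concrete, here is a minimal sketch in C. The function names (dup_buggy, dup_fixed, drop_first_buggy, drop_first_fixed) are hypothetical and exist only for illustration. Each buggy function sits next to a corrected version, and Memcheck would be expected to flag only the buggy ones.

```c
#include <stdlib.h>
#include <string.h>

/* BUGGY: the block is one byte too small, so strcpy() writes the
 * terminating '\0' just past the end of the malloc'ed block. */
char *dup_buggy(const char *s)
{
    char *copy = malloc(strlen(s));        /* forgot + 1 for the '\0' */
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* FIXED: room for every character plus the terminator. */
char *dup_fixed(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* BUGGY: whenever s holds two or more characters, the source and
 * destination regions overlap, which memcpy() does not permit. */
void drop_first_buggy(char *s)
{
    memcpy(s, s + 1, strlen(s));
}

/* FIXED: memmove() is defined for overlapping regions. */
void drop_first_fixed(char *s)
{
    memmove(s, s + 1, strlen(s));
}
```

Both buggy versions often appear to work when run natively, which is exactly why the symptoms tend to show up far from the cause.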

Valgrind overview

valgrind consists of a core, which simulates the operation of an x86 CPU in software, and a series of tools for debugging and profiling. The tools include:

"The memcheck tool has done a lot to improve overall code quality and to cut down development time. But I especially like that it is so easy to write your own, specialized tools—this opens many more possibilities."

Uli Drepper
Consulting Engineer
Red Hat

Memcheck

In addition to checking every read and write of memory, memcheck detects memory-management problems in programs by tracking all malloc/new calls and the corresponding free/delete calls.

Note that programs may run up to 50 times slower when you use memcheck.
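As an illustration of the malloc/free pairing that memcheck tracks, the following sketch shows a leak and its fix. The helper names (make_greeting, greeting_length_leaky, greeting_length) are assumptions made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly malloc'ed "Hello, <name>"; the caller must free() it. */
char *make_greeting(const char *name)
{
    size_t size = strlen("Hello, ") + strlen(name) + 1;
    char *greeting = malloc(size);
    if (greeting != NULL)
        snprintf(greeting, size, "Hello, %s", name);
    return greeting;
}

/* LEAKY: the only pointer to the block is lost when the function
 * returns, so the malloc has no corresponding free. */
size_t greeting_length_leaky(const char *name)
{
    char *greeting = make_greeting(name);
    return greeting != NULL ? strlen(greeting) : 0;
}

/* FIXED: every successful malloc is paired with a free. */
size_t greeting_length(const char *name)
{
    char *greeting = make_greeting(name);
    size_t length = greeting != NULL ? strlen(greeting) : 0;
    free(greeting);                        /* free(NULL) is a no-op */
    return length;
}
```

A program that calls greeting_length_leaky() and is run with --leak-check=yes should show the lost block in the leak report, along with the call chain that allocated it.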

Cachegrind

Cachegrind, the cache profiler, simulates the I1, D1 and L2 caches in the CPU so that it can pinpoint the sources of cache misses in the code. It can show the number of cache misses, memory references, and instructions accruing to each line of source code, with per-function, per-module, and whole-program summaries. It can also show counts for each individual x86 instruction.

Cachegrind is complemented by the KCacheGrind visualization tool (http://kcachegrind.sourceforge.net/cgi-bin/show.cgi), a KDE application that graphs these profiling results.
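A classic case where a cache profiler helps is array traversal order. The sketch below is an assumed example, not taken from the valgrind documentation: both functions compute the same sum over a hypothetical 1024 x 1024 grid, but one walks memory sequentially and the other jumps a full row between accesses.

```c
enum { ROWS = 1024, COLS = 1024 };

static int grid[ROWS][COLS];

/* Fills the grid with small, predictable values. */
void fill_grid(void)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            grid[r][c] = (r + c) % 7;
}

/* Row by row: consecutive accesses touch adjacent addresses. */
long sum_by_rows(void)
{
    long sum = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            sum += grid[r][c];
    return sum;
}

/* Column by column: each access is COLS * sizeof(int) bytes away
 * from the previous one, so cache lines are reused poorly. */
long sum_by_columns(void)
{
    long sum = 0;
    for (int c = 0; c < COLS; c++)
        for (int r = 0; r < ROWS; r++)
            sum += grid[r][c];
    return sum;
}
```

Under --tool=cachegrind, the column-wise loop would be expected to account for most of the data-cache read misses, and the per-line counts should point at the sum += grid[r][c] statement inside it.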

Note:
Some minor tools (corecheck, lackey, and Nulgrind) are supplied mainly to illustrate how to create simple tools.

Previous releases of valgrind included other tools, such as addrcheck and helgrind, that are currently broken.

Valgrind limitations and dependencies

valgrind has a number of limitations:

  • False-positives are known; false negatives are a possibility.

  • valgrind is up to 50 times slower than native execution and increases your memory footprint.

  • valgrind runs on X86, AMD64 and PPC32 machines running kernel 2.4.X or 2.6.X and glibc 2.2.X or 2.3.X. This covers the vast majority of Linux installations.

Before running valgrind

You should recompile your application and libraries with debugging info enabled (the -g flag) so that valgrind knows which function a particular piece of code belongs to.

You should also set the -fno-inline option, which makes it easier to see the function-call chain and to navigate in large C++ applications. You do not have to use this option, but doing so helps valgrind produce more accurate and usable error reports.

Note:
Sometimes optimization levels at -O2 and above generate code that leads Memcheck to wrongly report uninitialised value errors. The best solution is to turn off optimization altogether, but as this often makes things unmanageably slow, a compromise is to use -O. This gets you the majority of the benefits of higher optimization levels while keeping relatively small the chances of false errors from Memcheck.

Running valgrind

To run valgrind, use:

valgrind [ --tool=toolname ] commandname

Memcheck is the default --tool.

For example:

valgrind --tool=memcheck ls -l

Errors are reported before the associated operation actually happens. If you are using a tool (Memcheck) that does address checking and your program attempts to read from address zero, the tool will emit a message to this effect, then the program will die with a segmentation fault.

Valgrind processing options

valgrind has many processing options that you may find useful:

--log-file=filename

Writes the commentary to filename.pid<pidnumber>, where <pidnumber> is the process ID. This is helpful when running valgrind on a tree of processes at once, as each process writes to its own logfile.

-v

Reports how many times each error occurred. When execution finishes, all the reports are printed out, sorted by their occurrence counts. This makes it easy to see which errors have occurred most frequently.

--leak-check=yes

Searches for memory leaks when the program being tested exits. The option --leak-check=full is very useful with Memcheck.

--error-limit=no

Disables the cutoff for error reports (300 different errors or 30000 errors in total—after suppressed errors are removed).

Note:
These cutoff limits are set in vg_include.h and can be modified.

How to read the valgrind error report

Here is a sample of output for a test program:

01 ==21333== Invalid read of size 4
02 ==21333==  at 0x80484F6: print (valg_eg.c:7)
03 ==21333==  by 0x8048561: main (valg_eg.c:16)
04 ==21333==  Address 0x40C9104C is 0 bytes after a block of size 40 malloc'ed
05 ==21333==  at 0x40046824: malloc (vg_clientfuncs.c:100)
06 ==21333==  by 0x8048524: main (valg_eg.c:12)

21333 is the process ID. The remaining text is described below:

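  • 01: The program made an invalid read of 4 bytes.

  • 02: The read occurs at line 7 of valg_eg.c, in the function print.

  • 03: The function print was called from main, at line 16 of valg_eg.c.

  • 04: The address that was read lies immediately after the end of a 40-byte block allocated with malloc.

  • 05-06: That block was allocated by the malloc call at line 12 of valg_eg.c, in main.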
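The source of valg_eg.c is not shown, so the following is only a hypothetical reconstruction of the kind of code that produces such a report. The names run and count are assumptions, with run standing in for the original main, and the line numbers will not match the report above. It allocates ten ints (40 bytes where an int is 4 bytes) and lets the caller choose how many of them print reads.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sums and prints the first n ints of a. */
long print(const int *a, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];   /* n = 11 on a 10-int block reads a[10]: 4 bytes,
                          0 bytes after the end of the block */
    printf("sum = %ld\n", sum);
    return sum;
}

/* Allocates 10 ints, fills them with 0..9, and sums `count` of them.
 * count == 10 is correct; count == 11 reproduces the off-by-one read. */
long run(int count)
{
    enum { N = 10 };
    int *a = malloc(N * sizeof *a);
    if (a == NULL)
        return -1;
    for (int i = 0; i < N; i++)
        a[i] = i;
    long sum = print(a, count);
    free(a);
    return sum;
}
```

Calling run(11) from main under memcheck should give a report of the same shape as the sample: an invalid read of size 4 inside print, followed by the call chain of the malloc that created the block.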

Using valgrind to debug libgcj

While using valgrind to help track down a possible memory leak in the libgtk libraries, developers initially saw a lot of output pointing to the GC (garbage collection) routines inside the libgcj libraries. They were concerned at first that they had found major problems with gcj because they saw the following error over and over:

==5278== Use of uninitialized value of size 4
==5278==    at 0x55AE01F: GC_mark_and_push_stack (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55AE1DF: GC_push_all_eager (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55AF48A: GC_push_current_stack (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55B59D8: GC_with_callee_saves_pushed (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55B5A12: GC_generic_push_regs (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55AF535: GC_push_roots (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55AEB23: GC_mark_some (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55A6517: GC_stopped_mark (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55A6E29: GC_try_to_collect_inner (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55B05E6: GC_init_inner (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55B06D7: GC_init (in /usr/lib/libgcj.so.6.0.0)
==5278==    by 0x55AAB9B: GC_init_gcj_malloc (in /usr/lib/libgcj.so.6.0.0)

It turns out that GC routines by their very nature do some "quasi-legal" things while performing their work. Because GC works by skating close to the legal line in de-allocating/freeing memory, valgrind flags the operation—it reports anything the least bit shady even though all really is well. Unfortunately, this generates a large amount of output to wade through to get to the real problem.

Fortunately, there is this website: http://gcc.gnu.org/wiki/Debugging%20tips%20for%20libgcj. At the very end is a "valgrind" section that contains a link to a "suppression file" that valgrind can be pointed to when it starts up. To use that file, simply cut and paste it into an editor and save it. When starting valgrind, add the option --suppressions=path-to-suppression-file. valgrind uses this suppression file to filter the "errors" it finds in the GC routines, which greatly reduces the amount of output it generates. The output is reduced by over 50% during some debugging sessions.

Mudflap: An alternative to valgrind

mudflap is a tool that has functionality similar to valgrind.

mudflap is a new compiler option in GCC 4.x that detects memory-handling problems. It does this by looking for unsafe source-level pointer operations. Constructs are replaced with expressions that normally evaluate to the same value, but include parts that refer to libmudflap, the mudflap runtime. To evaluate memory accesses, the runtime maintains a database of valid memory objects.

Consider this code:

int a[10];
int b[10];

int main(void) {
   return a[11];
}

valgrind detects no error; the compiler allocates memory for a and b consecutively, and an access of a[11] will likely read b[1].

However, you can use GCC 4.x from FC4 or higher to compile the code with mudflap enabled:

gcc -o bug bug.c -g -fmudflap -lmudflap

When executed, the program fails and mudflap prints a warning.

mudflap can also print a list of memory leaks through its -print-leaks option. However, mudflap does not detect reads of uninitialized memory.

Mudflap instrumentation and its runtime cost extra time and memory. At build time, the compiler needs to process the instrumentation code. When running, it takes time to perform the checks, and memory to represent the object database. The behavior of the application has a strong impact on the run-time slowdown through the lookup cache hit rate, the overall number of checks, the number of objects tracked in the database, and their rates of change.

Resources

About the author

Michael Behm is a technical writer at Red Hat.