It used to be tricky to run GDB and connect Valgrind to it, but that’s not the case any more. The Valgrind 3.21.0 release brings great new features that make it much easier and more convenient to use Valgrind and GDB together in a single terminal. In this article, I review some of those new features and show you how to launch Valgrind from inside a GDB session. I also demonstrate the advantages of using --multi mode together with the Valgrind Python monitor commands on a small example program containing intentional bugs.

vgdb --multi mode

Valgrind has a built-in GDB server and can act as a remote target for GDB. It can help to find invalid memory writes, use of uninitialized variables, memory leaks, and many other bugs. Serving as an intermediary between Valgrind and GDB, vgdb is a small program but an important one. Valgrind's gdbserver only works after the process has already started, so you need vgdb to communicate with GDB and set up the initial connection.

You used to have to run Valgrind and GDB in two separate terminals, and then connect them together using a special command. In fact, I previously wrote an article about setting up and using GDB and Valgrind together. But vgdb --multi changes this, allowing you to start Valgrind from inside GDB itself.

GDB uses a remote protocol to communicate with debug agents such as gdbserver. In multi mode, GDB’s extended remote protocol enables communication between GDB and vgdb. After setting up the initial connection, vgdb starts Valgrind with the arguments it got from GDB. Then vgdb forwards the communication it receives from GDB to Valgrind.

Running Valgrind from inside GDB has many advantages. GDB is an interactive program, but Valgrind is not. When debugging a program under GDB, you can stop a program, step through it, inspect the program state, and so on. When running a program under Valgrind, however, you get the diagnostics at the end. Sometimes, it might not be easy to determine the exact point at which a problem happened. With vgdb --multi, you can inspect an issue (such as an invalid write) at the point it happened. You can set a breakpoint with GDB and ask Valgrind to provide various diagnostics. You could ask Valgrind whether some variable has been initialized, for example, or you can scan for memory leaks.

Another advantage is that running Valgrind from inside GDB makes it easy to stay inside GDB, so the debugged program can be rerun easily. This allows you to keep breakpoints, settings, scripts, GDB history, and so on. The --multi mode is, however, still experimental and has some limitations (which we are currently working on).

Current limitations of --multi mode

The vgdb --multi feature uses GDB's remote debugging protocol. It was originally designed for remote debugging, but we wanted to make it useful for local debugging, too. We encountered several challenges while attempting to provide a good user experience.

The main problem is that the file descriptors associated with STDIN and STDOUT are being used for GDB and vgdb communication, so a debugged program can’t read from STDIN and output to STDOUT. To avoid this problem, it's possible to run GDB and vgdb in two different terminals, connecting them with the --port option. To do that, you can choose some port, for example 5555 (but you can choose anything). Start vgdb with the --multi option:

vgdb --multi --port=5555
listening on port 5555 …

Then start GDB and connect with the target extended-remote command, specifying the same port. Set the remote exec-file to tell Valgrind the name of the program to run. Here, I use /bin/echo as a simple example of the debugged program:

gdb
(gdb) target extended-remote :5555
Remote debugging using :5555

(gdb) set remote exec-file /bin/echo
(gdb) run
Starting program:  
warning: remote target does not support file transfer, attempting to access files from local filesystem.
Reading symbols from /usr/bin/echo...
Reading symbols from .gnu_debugdata for /usr/bin/echo...

This makes the setup more complicated than usual, but you’re still getting all of the advantages of only having to start things up once.

The other drawback of --multi mode is that you need to perform a set of preparatory steps before being able to launch Valgrind from inside GDB. You must use set remote exec-file to define the debugged program and then set sysroot /. Only after that is it possible to actually launch Valgrind with the command target extended-remote | vgdb --multi.

There is a plan to simplify this process by implementing a new target valgrind command, which would take care of all the preparatory steps.

New GDB Valgrind Python commands

A truly useful feature was implemented for Valgrind 3.21.0: Python code, provided by Valgrind, is now automatically loaded by GDB when using Valgrind's built-in gdbserver. This code defines GDB front end commands that correspond to the Valgrind monitor commands.

Monitor commands are not specific to Valgrind. They're commands that send special requests to gdbserver. Valgrind has its own set of monitor commands for Valgrind's gdbserver. 

Monitor commands are normally plain text strings that are sent to gdbserver. But the new Python code provides much better integration with GDB’s command line interface. With this Python script, you get GDB auto-completion, command-specific help, and the ability to search for a command or command help matching a regex. For relevant monitor commands, GDB evaluates the command arguments. This is incredibly helpful when using monitor commands and makes a huge difference to the user experience. Here's an example.

This code has intentional errors:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

typedef struct foo {
    unsigned char flag1;
    unsigned char flag2;
} foo;

typedef struct bar {
   int foo;
   long bar;
} bar;

void setup_foo(struct foo *s)
{
    s->flag2 = 1;
}

int main(void)
{
   foo s;
   bar b1, b2;
   FILE *file;
   int file2;
   setup_foo(&s);                                                             /* Line 34 */
   if (s.flag1 || s.flag2)                                                     /* Line 35 */   
        printf("hey\n");

   file = fopen("temporary_file", "w");
   fprintf(file, "Printing to the  file!\n");
   fprintf(file, "flag1 value: ");
   putw(s.flag1, file);
   fclose(file);

   b1.foo = 1;
   b1.bar = 2;

   /* This is fine. */
   b2 = b1;
   file2 = open("temp2", O_CREAT | O_RDWR);
   write (file2, &b2, sizeof(b2));
   return 0;
}

Compile this program with debugging symbols by using the -g option:

$ gcc -g example.c -o ex

Launch Valgrind from inside GDB

Start by running the program you want to debug under GDB, and then add a few additional commands.

gdb  -q ./ex
Reading symbols from ./ex…

Next, use the preparation steps I've demonstrated in this article:

(gdb) set remote exec-file ./ex
(gdb) set sysroot /
(gdb) target extended-remote | vgdb --multi --vargs -q
Remote debugging using | vgdb --multi --vargs -q
(gdb) start
Temporary breakpoint 2 at 0x401206: file example.c, line 34.
Starting program: /root/valgrind/ex
relaying data between gdb and process 532003
Loaded /usr/share/gdb/auto-load/valgrind-monitor.py
Type "help valgrind" for more info.

Breakpoint 1, main () at example.c:34
34   setup_foo(&s);

Looking at the GDB output, notice that the valgrind-monitor.py script was loaded. It is what implements the new Python monitor commands. Everything after --vargs is passed on to Valgrind as extra arguments. In this case, the -q option makes Valgrind quiet. You could also have used run instead of start, but start sets an initial breakpoint on main.

It's possible to use GDB's -ex option to run everything in one step:

gdb -ex 'set remote exec-file ./prog' -ex 'set sysroot /' \
-ex 'target extended-remote | vgdb --multi --vargs -q' ./prog

GDB help

GDB's help command has been augmented with information about the new Valgrind features. This is one of the advantages that the new Python script brings. There's no need to search for the documentation elsewhere; just type help. Here are several examples:

(gdb) help valgrind
valgrind, v, vg
Front end GDB command for Valgrind gdbserver monitor commands.
Usage: valgrind VALGRIND_MONITOR_COMMAND [ARG...]
VALGRIND_MONITOR_COMMAND is a valgrind subcommand, matching a
gdbserver Valgrind monitor command.
ARG... are optional arguments.  They depend on the VALGRIND_MONITOR_COMMAND.)

Type "help memcheck" or "help mc" for memcheck specific commands.
Type "help helgrind" or "help hg" for helgrind specific commands.
Type "help callgrind" or "help cg" for callgrind specific commands.
Type "help massif" or "help ms" for massif specific commands.
[...]

This command lists some generic Valgrind Python commands first, then it lists how to get help for commands specific to various Valgrind tools. Note that memcheck is the default Valgrind tool (you can change to a different one with the --tool argument).

(gdb) help memcheck
memcheck, mc
Front end GDB command for Valgrind memcheck gdbserver monitor commands.
Usage: memcheck MEMCHECK_MONITOR_COMMAND [ARG...]
MEMCHECK_MONITOR_COMMAND is a memcheck subcommand, matching
a gdbserver Valgrind memcheck monitor command.
ARG... are optional arguments.  They depend on the MEMCHECK_MONITOR_COMMAND.

List of memcheck subcommands:

memcheck block_list -- Show the list of blocks for a leak search loss record.
memcheck check_memory -- Command to check memory accessibility.
memcheck get_vbits -- Print validity bits for LEN (default 1) bytes at ADDR.
memcheck leak_check -- Execute a memcheck leak search.
memcheck make_memory -- Prefix command to change memory accessibility.
memcheck who_points_at -- Show places pointing inside LEN (default 1) bytes at ADDR.
memcheck xb -- Print validity bits for LEN (default 1) bytes at ADDR.
memcheck xtmemory -- Dump xtree memory profile in FILENAME (default xtmemory.kcg.%p.%n).
[...]

When using Valgrind alone, options to Valgrind must be provided at startup, but when using Valgrind with GDB, certain options can be changed dynamically. Here's an example of adding a leak check:

(gdb) valgrind v.clo
  dynamically changeable options:
-v --verbose -q --quiet -d --stats --vgdb=no --vgdb=yes --vgdb=full
--vgdb-poll --vgdb-error --vgdb-stop-at --error-markers --show-error-list -s
--show-below-main --time-stamp --trace-children --child-silent-after-fork

(gdb) valgrind v.clo --leak-check=yes
==215307== Handling new value --leak-check=yes for option --leak-check=yes
(gdb)

Using GDB Valgrind Python commands for monitor

As shown below, Valgrind’s output is conveniently interleaved with GDB's output. (In situations where you don't want interleaved output, it's still possible to run vgdb --multi in a separate terminal where a connection to GDB is established using the --port option.)

(gdb) c
Continuing.
==532003== Conditional jump or move depends on uninitialised value(s)
==532003== at 0x401218: main (example.c:35)
==532003==
==532003== (action on error) vgdb me ...

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000000401218 in main () at example.c:35
35   if (s.flag1 || s.flag2)

The SIGTRAP signal informs you that something is wrong. Valgrind emulates a "trapping instruction" by generating a SIGTRAP signal. This SIGTRAP is sent by Valgrind only when attached to GDB, so that GDB can intercept the signal. Otherwise, Valgrind would simply print the error.

Checking for uninitialized values

Valgrind is complaining about an uninitialised value. There’s a monitor command, xb <addr> [<len>], which reports the validity of the bytes of a variable. Previously, you had to print the variable's address first and then pass it to the monitor command:

(gdb) print &s.flag1
$4 = (int *) 0x1ffefff400
(gdb) print sizeof (s.flag1)
$6 = 1
(gdb) monitor xb 0x1ffefff400 1
ff
0x1FFEFFFE02:    0x00

But things are easier now with the new Valgrind monitor Python extension. The GDB front end command evaluates both the address argument and the variable size. To use the Python extension, you need to use the memcheck command instead of monitor. The monitor command can still be used, of course, but without the advantages of the Python extension.

(gdb) memcheck xb &s sizeof(s.flag1)
      ff
0x1FFEFFFE02:    0x00

The output from xb is in hexadecimal, so ff equals 1111 1111 in binary, where every set bit marks an undefined bit. Memcheck tracks definedness at the bit level. This tells you that s.flag1 is entirely undefined. Looking at the example program code, you can see that only flag2 was initialized. More details can be found in the Memcheck monitor command documentation.

The memcheck who_points_at command shows the places (registers or memory) that point inside a range of bytes at a given address, which is useful when investigating possibly leaked blocks. As was the case with the memcheck xb command, the Python extension obtains the variable's address, a nice improvement over the corresponding monitor command.

(gdb) memcheck who_points_at &s.flag1
==777282== Searching for pointers to 0x1ffefffe02
==777282== tid 1 register RDI pointing at 0x1ffefffe02

Investigating the stacktraces

Running Valgrind from inside GDB makes it easier to investigate Valgrind stacktraces, because you see them immediately in the GDB terminal.

(gdb) c
Continuing.
==532111== Continuing ...
==532111== Syscall param write(buf) points to uninitialised byte(s)
==532111== at 0x4963FB4: write (write.c:26)
…
==532111== by 0x48D8C12: fclose@@GLIBC_2.2.5 (iofclose.c:53)
==532111== by 0x401230: main (example.c:36)
==532111==  Address 0x4aa52c4 is 36 bytes inside a block of size 4,096 alloc'd
==532111== at 0x484182F: malloc (vg_replace_malloc.c:431)
…
==532111== by 0x48DA101: fwrite (iofwrite.c:39)
==532111== by 0x4011F4: main (example.c:33)

After continuing, you can see two stacktraces. The first stacktrace shows main calling fclose, which flushes the buffered data to make sure everything is written. That ends in the write syscall, which gets a buffer that includes some undefined bytes. The second stacktrace then shows you where that problematic block of memory comes from.

==532111== by 0x4011F4: main (example.c:33)

This line refers to the fprintf call, which is where the buffer that is later passed to write was allocated.

Address 0x4aa52c4 is 36 bytes inside a block of size 4,096 alloc'd

The get_vbits command is useful to check the definedness of the address.

(gdb) memcheck get_vbits 0x4aa52c4
ff

You can see that the byte at this address is uninitialized. In this example, the first string "Printing to the  file!\n" is 23 bytes, and the next, "flag1 value: ", is 13 bytes (23 + 13 = 36). The next thing written is s.flag1, which is uninitialized.

Monitoring potential memory leaks

Scanning for memory leaks produces a lot of output. It isn't a definitive leak check, because the program is still running and the memory could still be freed later.

(gdb) memcheck leak_check full reachable any
==777217== 472 bytes in 1 blocks are still reachable in loss record 1 of 3
==777217== at 0x484182F: malloc (vg_replace_malloc.c:431)
==777217== by 0x48D9557: __fopen_internal (iofopen.c:65)
==777217== by 0x4011D5: main (example.c:32)
==777217==
==777217== 4,096 bytes in 1 blocks are still reachable in loss record 2 of 3
==777217== at 0x484182F: malloc (vg_replace_malloc.c:431)
…
==777217== by 0x48DB32C: puts (ioputs.c:40)
==777217== by 0x4011C6: main (example.c:30)
==777217==
==777217== 4,096 bytes in 1 blocks are still reachable in loss record 3 of 3
==777217== at 0x484182F: malloc (vg_replace_malloc.c:431)
…
==777217== by 0x48DA101: fwrite (iofwrite.c:39)
==777217== by 0x4011F4: main (example.c:33)

When examining possible memory leaks, the block_list command might be useful. This command shows the list of blocks belonging to the loss record.

(gdb) memcheck block_list 2
==777217== 4,096 bytes in 1 blocks are still reachable in loss record 2 of 3
==777217== at 0x484182F: malloc (vg_replace_malloc.c:431)
…
==777217== by 0x48DB32C: puts (ioputs.c:40)
==777217== by 0x4011C6: main (example.c:30)
==777217== 0x4AA4040[4096]

Identifying gaps with Valgrind

After continuing for the last time, Valgrind complains about writing some uninitialised bytes with the write system call.

==777282== Continuing ...
==777282== Syscall param write(buf) points to uninitialised byte(s)
==777282== at 0x4963FB4: write (write.c:26)
==777282== by 0x40127C: main (example.c:45)
==777282==  Address 0x1ffefffde4 is on thread 1's stack
==777282==  in frame #1, created by main (example.c:22)
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000004963fb4 in write () from /lib64/libc.so.6

But how is this possible? Valgrind points to line 45 of the example.c code, where the structure b2 is written to the file. When looking at the code, you can see that the structure b2 is a copy of structure b1, which has two members, both initialized. Look at the structures by using the xb command again. You got the SIGTRAP signal inside the glibc write function, so you need to use finish on this function to get back to example.c again.

(gdb) finish
Run till exit from #0  0x0000000004963fb4 in write () from /lib64/libc.so.6
==777282== Continuing ...
main () at example.c:47
47   return 0;
(gdb) memcheck xb &b1 sizeof(b1)
      00      00      00      00      ff      ff      ff      ff
0x1FFEFFFDF0:    0x01    0x00    0x00    0x00    0x00    0x00    0x00    0x00
      00      00      00      00      00      00      00      00
0x1FFEFFFDF8:    0x02    0x00    0x00    0x00    0x00    0x00    0x00    0x00

The first four bytes are initialized and the four bytes after them are not. The reason for the gap is that the ABI for some architectures has alignment requirements. In this case, the int field is 4 bytes wide and the long field is 8 bytes wide, and the long field is aligned to an 8-byte boundary on this particular architecture so that it can be loaded efficiently. The first member of the b1 structure (foo) has type int, which should be 4 bytes. GDB can help you verify that by printing its size:

(gdb) p sizeof(b1.foo)
$2 = 4

This means there is a hole after the first member. It’s also possible to see this hole with the pahole program:

pahole ./ex
…
struct bar {
    int foo; /*0 4 */
    /* XXX 4 bytes hole, try to pack */
    long int bar; /*8 8 */

    /* size: 16, cachelines: 1, members: 2 */
    /* sum members: 12, holes: 1, sum holes: 4 */
    /* last cacheline: 16 bytes */
};

A structure can contain gaps between fields that are never initialized. You can avoid this warning by zeroing out the whole structure with memset before filling in its fields.

Summary

Using Valgrind and GDB together can be extremely useful for debugging memory-related problems. It used to be complicated to use Valgrind and GDB together, but recent work has made it easier. Everything shown in this article works on Fedora 38 with Valgrind 3.21.0. I'm working on even better integration, trying to solve the input and output issue, and the upstream developers are working towards a distinct Valgrind target to simplify launching Valgrind from GDB.

