Debugging Memory Usage with Valgrind for Fun and Profit

Everyone is familiar with ‘malloc()’ and ‘free()’, and with their reputation for being dangerous when misused. Memory leaks and other allocation-related bugs are some of the most common and pernicious issues affecting software. With the advent of more sophisticated compilers and the burgeoning use of interpreted languages, memory leaks are becoming less common, but they are still something that anyone working in C will probably face. Since C lets the programmer do more or less whatever they please with their pointers, and the language enforces no memory-safety semantics, it’s easy to mismanage a few bytes of memory in a program of non-trivial size.

Memory management being so common an issue, a set of tools has naturally developed to help us deal with it. One of these tools, Valgrind, can be incredibly helpful in the quest for perfect memory management. If you don’t have a working Valgrind install, one can be obtained as follows. As of the writing of this article the current version is 3.11.0.

wget http://www.valgrind.org/downloads/valgrind-3.11.0.tar.bz2
tar -jxvf valgrind-3.11.0.tar.bz2
# move into the untarred directory
cd valgrind-3.11.0
./configure
# installing into the default prefix may require root privileges
make && make install

Now that we have Valgrind, we need some code that mismanages memory. One of Valgrind’s abilities is detecting memory leaks, so let’s start with a program that leaks.

#include <stdlib.h>

int main() {
    void* unfreed;
    int idx;
    for (idx=0; idx<5; idx++) {
        unfreed = malloc(100);  /* overwrites the previous pointer, leaking that block */
    }
    return 0;
}

While Valgrind works with nearly any binary (try it out on some of the coreutils), in practice it’s helpful to have debugging symbols so that its reports can name source files and line numbers, so it’s worth compiling the program under test with -g. Once you have a binary, it’s time to break out Valgrind. Valgrind is really a suite of tools, so your invocation has to say which one you want. For this example, you want to check for leaks, so select the Memcheck tool (it is also the default) and turn on full leak checking. This should look something like

valgrind --tool=memcheck --leak-check=full ./program_name

Running this will print information about the program’s memory (mis)usage that should look something like the listing below.

...
==24810== Command: ./program_name
==24810== 
==24810== 
==24810== HEAP SUMMARY:
==24810==     in use at exit: 500 bytes in 5 blocks
==24810==   total heap usage: 5 allocs, 0 frees, 500 bytes allocated
==24810== 
==24810== 500 bytes in 5 blocks are definitely lost in loss record 1 of 1
==24810==    at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==24810==    by 0x400525: main (test.c:7)
==24810== 
==24810== LEAK SUMMARY:
==24810==    definitely lost: 500 bytes in 5 blocks
==24810==    indirectly lost: 0 bytes in 0 blocks
==24810==      possibly lost: 0 bytes in 0 blocks
==24810==    still reachable: 0 bytes in 0 blocks
==24810==         suppressed: 0 bytes in 0 blocks
...
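
The loss record points at the malloc() call inside the loop (test.c:7): every iteration overwrites unfreed, so the block from the previous pass can never be released. As a minimal sketch of one possible fix, each block can be freed before its pointer goes away. The function name allocate_and_release and its parameters are made up for this example; they are not part of the program above.

```c
#include <stdlib.h>

/* Hypothetical helper for illustration: allocates `count` blocks of
 * `size` bytes and frees each one before the pointer is reused.
 * Returns the number of blocks that were allocated and freed. */
int allocate_and_release(int count, size_t size)
{
    int released = 0;

    for (int idx = 0; idx < count; idx++) {
        void *block = malloc(size);
        if (block == NULL) {
            break;          /* allocation failed, nothing to free */
        }
        /* ... use the block here ... */
        free(block);        /* release it before the next iteration */
        released++;
    }
    return released;
}
```

Calling allocate_and_release(5, 100) from main() performs the same five 100-byte allocations as the leaking version. Built without optimization and run under the same Valgrind invocation, the heap summary should then show as many frees as allocs.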

Valgrind is kind enough to show us that the allocation on line 7 leaks 500 bytes across 5 different allocations. Knowing how much and where your program is leaking is a big step up from the nothing at all you typically have to go on. This gives me an excellent opportunity to quote the legendary Billy Mays: “but wait, there’s more!” Valgrind can also tell you when you use invalid memory. Time for another broken program.

#include <stdlib.h>

int main()
{
    char *x = malloc(10);
    x[10] = 'a';  /* one past the end: valid indices are 0 through 9 */
    return 0;
}

The mistake here is pretty obvious, but it’s also a startlingly easy class of error to make. Write enough MATLAB, with its 1-based indexing, and all your indices go to hell. Fortunately, Valgrind can detect invalid heap accesses. If we run the program under Valgrind, we can see that it catches the out-of-bounds write.

valgrind --tool=memcheck --leak-check=yes ./program_name
...
==9814==  Invalid write of size 1
==9814==    at 0x804841E: main (program_name.c:6)
==9814==  Address 0x1BA3607A is 0 bytes after a block of size 10 alloc'd
==9814==    at 0x1B900DD0: malloc (vg_replace_malloc.c:131)
==9814==    by 0x804840F: main (program_name.c:5)
...
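
The report says the write landed 0 bytes after a 10-byte block, that is, at index 10 of an array whose valid indices are 0 through 9. One way to make this kind of slip harder is to route writes through a small bounds-checked helper. The sketch below assumes you are happy to carry the buffer length around alongside the pointer; checked_store is a hypothetical name for this example, not something Valgrind or the C library provides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper for illustration: stores `value` at buf[index]
 * only when `index` lies inside a buffer of `len` bytes.
 * Returns true if the store happened, false if it was rejected. */
bool checked_store(char *buf, size_t len, size_t index, char value)
{
    if (buf == NULL || index >= len) {
        return false;       /* out of bounds: refuse to write */
    }
    buf[index] = value;
    return true;
}
```

With a 10-byte buffer, checked_store(x, 10, 10, 'a') is rejected instead of writing past the end, while index 9 is accepted. This does not replace running Memcheck; it only turns one class of silent corruption into a failure the caller can see.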

If you order today for the low low price of free, Memcheck will also let you know when you’re using uninitialized memory. For example, the fragment

int x;
if (x==0) {...}

will generate a warning along the lines of

==17943== Conditional jump or move depends on uninitialised value(s)
==17943==    at 0x804840A: main (program_name.c:6)
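
In real code this warning often comes from a variable that is assigned on some paths but not others. As a small sketch of the usual cure, give the variable a defined value up front so every path that reads it sees something meaningful. The function find_first below is invented for this example.

```c
#include <stddef.h>

/* Hypothetical example: returns the index of the first element equal
 * to `needle`, or -1 when it does not occur in `values`. */
int find_first(const int *values, size_t count, int needle)
{
    int found = -1;  /* initialised, so the "not found" path is defined */

    for (size_t i = 0; i < count; i++) {
        if (values[i] == needle) {
            found = (int)i;
            break;
        }
    }
    return found;
}
```

Had found been declared without an initialiser, a caller branching on the result for a missing needle would be branching on an indeterminate value, which is the situation the warning above describes.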

In fact, Memcheck tracks uninitialized values as they are copied around the program and, with --track-origins=yes, can report where an uninitialized value came from. Also worth mentioning is its ability to detect double frees.
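
A double free usually comes from two code paths that each believe they own the same pointer. One common defensive pattern, sketched here under the assumption that the pointer lives in a small owning struct, is to reset it to NULL on release so that a second release is harmless. The buffer type and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical owning wrapper around a heap block. */
typedef struct {
    char  *data;
    size_t len;
} buffer;

/* Allocates `len` bytes; returns false and leaves the buffer empty on failure. */
bool buffer_init(buffer *b, size_t len)
{
    b->data = malloc(len);
    b->len  = (b->data != NULL) ? len : 0;
    return b->data != NULL;
}

/* Frees the block and clears the pointer. Calling this twice is safe
 * because free(NULL) is defined to do nothing. */
void buffer_release(buffer *b)
{
    free(b->data);
    b->data = NULL;
    b->len  = 0;
}
```

This only protects releases that go through the same buffer object; copies of the raw pointer held elsewhere can still be freed twice, and that is exactly the case where Memcheck’s report earns its keep.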

As powerful as Memcheck is, Valgrind has additional capabilities, including thread error detection (Helgrind and DRD), call graph generation (Callgrind), and cache and branch-prediction profiling (Cachegrind). See the docs for more information. Also, for those interested in the LLVM ecosystem, AddressSanitizer is an interesting tool for memory analysis using clang.
