4. A Closer View
4.1. Why Valgrind?
As noted above, memory management is prone to errors that are hard to detect. Common errors include:
Use of uninitialized memory
Reading/writing memory after it has been freed
Reading/writing off the end of malloc'd blocks
Reading/writing inappropriate areas on the stack
Memory leaks -- where pointers to malloc'd blocks are lost forever
Mismatched use of malloc/new/new[] vs free/delete/delete[]
Some misuses of the POSIX pthreads API
These errors usually lead to crashes.
This is where Valgrind helps. Valgrind works directly with executables, so there is no need to recompile, relink or modify the program to be checked. It reports memory errors and leaks in the program and points out the spots where they occur.
Valgrind simulates every single instruction your program executes. For this reason, Valgrind finds errors not only in your application but also in all supporting dynamically-linked (.so-format) libraries, including the GNU C library, the X client libraries, Qt if you work with KDE, and so on. These libraries, the GNU C library among them, may themselves contain memory access violations.
4.2. Usage
4.2.1. Invoking Valgrind
The checking may be performed by simply placing the word valgrind just before the normal command used to invoke the program. For example:
#valgrind ps -ax
Valgrind provides a large number of options. We deliberately skip them here to keep this article from becoming boring.
The output contains the usual output of ps -ax together with a detailed report from Valgrind. Any memory-related error is pointed out in the error report.
4.2.2. How to Identify the Error from the Error Report
Consider the output of Valgrind for some test program:
==1353== Invalid read of size 4
==1353==    at 0x80484F6: print (valg_eg.c:7)
==1353==    by 0x8048561: main (valg_eg.c:16)
==1353==    by 0x4026D177: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==1353==    by 0x80483F1: free@@GLIBC_2.0 (in /home/deepu/valg/a.out)
==1353==    Address 0x40C9104C is 0 bytes after a block of size 40 alloc'd
==1353==    at 0x40046824: malloc (vg_clientfuncs.c:100)
==1353==    by 0x8048524: main (valg_eg.c:12)
==1353==    by 0x4026D177: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==1353==    by 0x80483F1: free@@GLIBC_2.0 (in /home/deepu/valg/a.out)
Here, 1353 is the process ID. This part of the error report says that an invalid 4-byte read occurred at line 7 of valg_eg.c, in the function print. The function print was called by main at line 16 of the same file. The function main was called by __libc_start_main at line 129 of ../sysdeps/generic/libc-start.c, which in turn was called by free@@GLIBC_2.0 in the file /home/deepu/valg/a.out. The second half of the report gives the same kind of details for the malloc call that allocated the block: the address that was read lies 0 bytes after a 40-byte block allocated at line 12 of valg_eg.c.
4.2.3. Types of Errors with Examples
Valgrind can really detect only two types of errors: use of illegal addresses and use of undefined values. Nevertheless, this is enough to discover all sorts of memory management problems in a program. Some common errors are given below.
4.2.3.1. Use of uninitialized memory
Sources of uninitialized data are:
Local variables that have not been initialized.
The contents of malloc'd blocks, before something is written there.
This is not a problem with calloc, since it initializes every allocated byte to 0. The new operator in C++ is similar to malloc: fields of the created object are uninitialized unless a constructor sets them.
Sample program:
    #include <stdlib.h>

    int main()
    {
        int p, t;

        if (p == 5)    /* Error occurs here */
            t = p + 1;
        return 0;
    }
Here p is never initialized, so it may contain a random (garbage) value, and Valgrind reports an error at the condition check. Valgrind reports the use of an uninitialized value in two situations:
When it is used to determine the outcome of a conditional branch, e.g. 'if (p == 5)' in the above program.
When it is used to generate a memory address, e.g. if the above program declared an integer array a[10] and executed 'a[p] = 1'.
4.2.3.2. Illegal read/write
Illegal read/write errors occur when you try to read from or write to an address that is not in the address range allocated to your program.
Sample program:
    #include <stdlib.h>

    int main()
    {
        int *p, i, a;

        p = malloc(10*sizeof(int));
        p[11] = 1;     /* invalid write error */
        a = p[11];     /* invalid read error */
        free(p);
        return 0;
    }
Here you are trying to read from and write to the address p+11, that is, 11*sizeof(int) bytes past the start of the block. The block holds only 10 ints (indices 0 to 9), so this address is not allocated to the program.
4.2.3.3. Invalid free
Valgrind keeps track of the blocks allocated to your program with malloc/new, so it can easily check whether the argument to free/delete is valid.
Sample program:
    #include <stdlib.h>

    int main()
    {
        int *p, i;

        p = malloc(10*sizeof(int));
        for (i = 0; i < 10; i++)
            p[i] = i;
        free(p);
        free(p);       /* Error: p has already been freed */
        return 0;
    }
Valgrind checks the address given as the argument to free. If that address has already been freed, you will be told that the free is invalid.
4.2.3.4. Mismatched Use of Functions
In C++ you can allocate and free memory using more than one function, but the following rules must be followed:
If allocated with malloc, calloc, realloc, valloc or memalign, you must deallocate with free.
If allocated with new[], you must deallocate with delete[].
If allocated with new, you must deallocate with delete.
Sample program:
    #include <stdlib.h>

    int main()
    {
        int *p, i;

        p = (int *) malloc(10*sizeof(int));
        for (i = 0; i < 10; i++)
            p[i] = i;
        delete(p);     /* Error: function mismatch */
        return 0;
    }
Output by valgrind is:
==1066== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==1066== malloc/free: in use at exit: 0 bytes in 0 blocks.
==1066== malloc/free: 1 allocs, 1 frees, 40 bytes allocated.
==1066== For a detailed leak analysis, rerun with: --leak-check=yes
==1066== For counts of detected errors, rerun with: -v
From the above output it is clear that 0 bytes in 0 blocks are in use at exit, which means that the malloc'd block has been freed by delete. Therefore this is not a problem on Linux, but the program may crash on some other platform.
4.2.3.5. Errors Due to Invalid System Call Parameters
Valgrind checks all parameters to system calls.
Sample program:
    #include <stdlib.h>
    #include <unistd.h>

    int main()
    {
        int *p;

        p = malloc(10);
        read(0, p, 100);   /* Error: unaddressable bytes */
        free(p);
        return 0;
    }
==1045== Syscall param read(buf) contains unaddressable byte(s)
==1045==    at 0x4032AF44: __libc_read (in /lib/i686/libc-2.2.2.so)
==1045==    by 0x4026D177: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==1045==    by 0x80483E1: read@@GLIBC_2.0 (in /home/deepu/valg/a.out)
Here, buf = p contains the address of a 10-byte block. The read system call tries to read 100 bytes from standard input and place them at p, but the bytes after the first 10 are unaddressable.
4.2.3.6. Memory Leak Detection
Consider the following program:
    #include <stdlib.h>

    int main()
    {
        int *p, i;

        p = malloc(5*sizeof(int));
        for (i = 0; i < 5; i++)
            p[i] = i;
        return 0;
    }
==1048== LEAK SUMMARY:
==1048==    definitely lost: 20 bytes in 1 blocks.
==1048==    possibly lost: 0 bytes in 0 blocks.
==1048==    still reachable: 0 bytes in 0 blocks.
In the above program p contains the address of a 20-byte block, but the block is not freed anywhere in the program, so the pointer to it is lost forever when main returns. This is known as a memory leak. The leak summary is obtained by running Valgrind with the option --leak-check=yes.
4.2.4. How to Suppress Errors
Valgrind detects numerous problems in many programs which come pre-installed on your GNU/Linux system. You can't easily fix these, but you don't want to see these errors (and yes, there are many!). So Valgrind reads a list of errors to suppress at startup, from a suppression file ending in .supp.
Suppression files may be modified. This is useful if part of your project contains errors you can't or don't want to fix, yet you don't want to continuously be reminded of them. The format of the file is as follows.
    {
        Error name
        Type
        fun:function name, which contains the error to suppress
        fun:function name, which calls the function specified above
    }
Error name can be any name.
    Type = ValueN, if the error is an uninitialized value error.
         = AddrN, if it is an address error (N = sizeof(data type)).
         = Free, if it is a free error (e.g. mismatched free).
         = Cond, if the error is due to an uninitialized CPU condition code.
         = Param, if it is an invalid system call parameter error.
You can then run the program with:
valgrind --suppressions=path/to/the/supp_file.supp testprog
4.3. Limitations and Dependencies of Valgrind
No software is free from limitations, and Valgrind is no exception, although most programs work fine with it. The limitations are listed below.
Programs run 25 to 50 times slower.
Increased memory consumption.
Highly optimized code (compiled with the -O1 or -O2 options) may sometimes fool Valgrind.
Valgrind relies on the dynamic linking mechanism.
Valgrind is closely tied to details of the CPU, the operating system and, to a lesser extent, the compiler and basic C libraries. Presently Valgrind works only on the Linux platform (kernels 2.2.X or 2.4.X) on x86s. Glibc 2.1.X or 2.2.X is also required for Valgrind.