Simple use of gdb

This example is of a program with two .c files. One is example.c:

#include <stdio.h>

int main()
{
    char *s = NULL;
    extern void f(char *q);
    int i = 12;
    i += 5;
    f(s);
    printf("%s, world\n", s);
    return(0);
}

The other is example2.c, whose function f() is called from main() above:

#include <string.h>

void f(char *q)
{
    q[0] = 'h';
    q[1] = 'i';
    q[2] = '\0';
}

Below we intersperse a sample terminal session (the lines beginning with a "$" or "(gdb)" prompt, and the output that follows them) with commentary.

We are happily working on our program. We have previously compiled both .c files to .o files, creating example.o and example2.o. Now we edit example2.c, and so we recompile that file and re-link (probably via a Makefile, but that's not the point right now).

$ gcc -Wall -c example2.c
$ gcc example.o example2.o
$ ./a.out
Segmentation fault
$

Ok, let's run gdb. First of all, gdb works best on files compiled to contain extra information about variable names, function names, etc. These things are not present in machine language, but the a.out file format can support having this auxiliary information elsewhere in the file. The gcc option "-g" includes this information. The most debugging information is present if -g is given both in the compile stage and in the link stage. In the current case, we can just recompile everything with -g and re-link with -g in one fell swoop:

$ gcc -g example.c example2.c
$

If you are working on an assignment for which I have supplied a Makefile, it probably already contains -g in the compile and link stages so you wouldn't need this step. Have a look.

Now we run gdb. The gdb command is normally given one argument, which is the a.out-format file to debug. Then you can start it running with the "run" command.
(If you want to specify any members of argv, list them after the run command, e.g. if you would type "./a.out -a file", instead to gdb you type "run -a file"; you can also type things like "run <file".)

$ gdb a.out
[copyright garbage deleted][footnote]

(gdb) run
Starting program: /u/ajr/a.out

Program received signal SIGSEGV, Segmentation fault.
0x080483f2 in f (q=0x0) at example2.c:5
5	    q[0] = 'h';
(gdb)

This tells us that the problem occurred in a function named "f", at line 5 of example2.c. (It also goes on to show us line 5 of example2.c.) The machine-language addresses (in base 16, indicated with "0x") are displayed but they're not much use to us now... except for the value of 'q', which we can see is zero. We can guess that this is the null pointer value. (A null pointer is always written as zero in C source code. Its bit representation at the machine-language level is usually zero too, but it need not be: some machines need a different representation.)

Hopefully this would be enough to make you realize what's going on and fix the problem, if you even did make such a feeble error in the first place. But let's press on with gdb to give a tour of the other features you might use in investigating the problem further. Of course you can quit any time once you realize what your error was.

So, first, the "where" command, which shows the chain of function calls that led to this point (the call stack) and what parameters each call was passed. If the initial display isn't good enough to have your "aha!" moment to see your error, "where" often completes the revelation.

(gdb) where
#0  0x080483f2 in f (q=0x0) at example2.c:5
#1  0x080483d2 in main () at example.c:9
(gdb)

Gdb finds the source code based on the file names embedded in the debugging part of the a.out file format. If you haven't edited the .c files since you last compiled (and you should get into the habit of typing "make" before running your program or running gdb on your program), then you can use the "list" command:

(gdb) list
1	#include <string.h>
2	
3	void f(char *q)
4	{
5	    q[0] = 'h';
6	    q[1] = 'i';
7	    q[2] = '\0';
8	}
(gdb)

Now, we are in the function "f", but we are also in "main()" (as shown by the "where" command output above). The active function calls form a stack, in which we can move up and down with the "up" and "down" commands.

(gdb) up
#1  0x080483d2 in main () at example.c:9
9	    f(s);
(gdb)

We now see that "list" will give us a listing of part of example.c, rather than example2.c, since we're examining our program's execution from the point of view of being in main().

(gdb) list
4	{
5	    char *s = NULL;
6	    extern void f(char *q);
7	    int i = 12;
8	    i += 5;
9	    f(s);
10	    printf("%s, world\n", s);
11	    return(0);
12	}
(gdb)

We can display the values of variables with the "print" command. You can print variables in the current function. You can adjust the current function with "up" and "down". You can print all sorts of expressions, not just simple variables; for example, you could print something like "p->a[3]->z", if that were a valid expression in the current function context.

(gdb) print i
$1 = 17
(gdb)

The "list" command takes an optional line number to start listing from that line.

(gdb) list 1
1	#include <stdio.h>
2	
3	int main()
4	{
5	    char *s = NULL;
6	    extern void f(char *q);
7	    int i = 12;
8	    i += 5;
9	    f(s);
10	    printf("%s, world\n", s);
(gdb)

Let's go back into f(). Remember that in the paused execution of our program, we are simultaneously in main() and in f(). The "up" and "down" commands just change the meaning of commands such as "print" and "list".

(gdb) down
#0  0x080483f2 in f (q=0x0) at example2.c:5
5	    q[0] = 'h';
(gdb)

Now we'll set a "break point", where the execution will automatically pause at a certain place. In the example so far, the execution paused when it ran into this fatal segmentation exception, but for better probing we can choose where we want this pausing to happen.

(gdb) break f
Breakpoint 1 at 0x80483ef: file example2.c, line 5.
(gdb)

That sets a break point at the beginning of f. We can also set a break point by original source code line number. Let's do that in main().

(gdb) up
#1  0x080483d2 in main () at example.c:9
9	    f(s);
(gdb) break 8
Breakpoint 2 at 0x80483c3: file example.c, line 8.
(gdb)

And now let's see what happens with that. Note how it warns us before killing our current process and starting a new one.

(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /u/ajr/a.out

Breakpoint 2, main () at example.c:8
8	    i += 5;
(gdb)

So now we're stopped at this earlier point, and we can probe some stuff. Below we print i, then we use the "step" command to execute one more line of C code and then automatically stop for a further print command, verifying that the "+5" worked as we expected.

(gdb) print i
$2 = 12
(gdb) step
9	    f(s);
(gdb) print i
$3 = 17
(gdb)

When we get tired of stepping, we can "un-break" with the "cont" (continue) command:

(gdb) cont
Continuing.

Breakpoint 1, f (q=0x0) at example2.c:5
5	    q[0] = 'h';
(gdb)

and then it goes until the next break point.

A control-C will also break into gdb, if this signal is not being caught by your program. Or if the program exits you will be returned to the gdb prompt, although you can't do much at that point other than "run" again.

When you're done, you'll want the command "quit":

(gdb) quit
The program is running.  Exit anyway? (y or n) y
$

So that's gdb.

If you're not seeing file names, line numbers, variable names, etc., you didn't compile with '-g'. Note that I recompiled all .c files above; "gcc -g example.o example2.o" would not have sufficed.

Suppose example2.c instead said something like "strcpy(q, ...)". In that case we'd get less-immediately useful initial output from gdb; it would say:

Program received signal SIGSEGV, Segmentation fault.
0xb760dcb7 in memcpy () from /lib/libc.so.6
(gdb)

This tells us that the problem occurred in a function named "memcpy", in libc -- the C library. (Apparently strcpy is implemented in terms of memcpy() in this version of the C library.)
In this case we'd probably have to type "where" before we saw anything very useful. It would show this:

(gdb) where
#0  0xb760dcb7 in memcpy () from /lib/libc.so.6
#1  0x0804843d in f (q=0x0) at example2.c:5
#2  0x08048402 in main () at example.c:9
(gdb)

and we could proceed from there.

You may wonder why I did "char *s = NULL;" in my example. What if s were uninitialized, which sounds like a more likely error than mistakenly explicitly initializing it to NULL? Well, in that case we might not have received a segmentation exception. It could, for example, coincidentally have contained the address of some other string area in memory, and the fault might have been harder to diagnose.

In general, you can't rely on testing to uncover bugs, and you certainly can't assume they will be uncovered in a clear and localized way. It's always better to think than to test. Testing can show the presence of bugs, but it can't show their absence. If your program is out of control, you might be able to bring it under control by thinking, but you won't be able to get it under control by testing. You're much better off to put the thought into it in the first place and keep it under control and working at all times, and fix bugs when they're newly introduced rather than attempting a big debugging session later. When I divide assignments into a suggested sequence of implementation, you should test your program thoroughly at each stage.

[footnote] The copyright status of gdb is indeed important, and the free software movement is important, but it's a stupid time at which to output information about it. GNU copyright information in man pages is also stupid and annoying. It violates the software tools principle of "Don't clutter output with extraneous information." Sometimes we're considering what software to use or to install or to recommend, and then software freedom is an important issue to consider; but when we're actually running gdb, that is not the task we are engaged in so the copyright information is extraneous.