GDB: Logging Function Parameters, Part 1

November 1, 2011

Sometimes an application’s release version crashes in particular scenarios, but the debug version does not, and to make matters worse, the call trace gets corrupted. It is then very difficult to find the cause of the crash. Tracing a particular argument may help you understand the state of the application before it went “crazy”. This two-part article discusses a mechanism to log the parameters, including user-defined types, of various function calls in a release-mode application. The limitations (in this context) of tools like ltrace are also discussed.

In this first part of the series, I will discuss the basic concepts and GDB commands (that will also be used in Part 2) to write a script that will log the desired parameters in an application.

Basic concepts

In my previous articles, we have looked at how GDB can be used to analyse or modify a binary at runtime. Many readers have asked for more neat tricks that can be done with GDB; hence this article.

There is a lot of information available on the Web (and in the GDB man page) about GDB commands and their syntax/uses. I believe the best way to understand any tool is to see how you can use it to solve a particular problem in a unique way. So, let’s try to understand how GDB can be used to log/modify the parameters of the target function in a release-mode application.

At first, you may not find this a very attractive use of GDB, so I had better present an example.

Note: The details discussed here have been tested on Fedora 15 on the x86 platform, with GCC version 4.6.0 configured to produce 32-bit targets. The GDB version is 7.2.90.20110429-36.fc15. The sample binaries produced in the article are all 32-bit.

Here is a short story — I think experienced developers can relate to this, while freshers can get a feel for it.

Software is written to behave in a reasonable, deterministic way; however, expectations are not always met — that’s what we call a “bug”! Bugs make customers go crazy, and your management too. The only ones who can’t afford to go crazy are the developers — you and I! The customer issue is replicated in-house, but the root cause is hard to track down. Experience makes one create a debug version of the application, if possible — but the debug version does not replicate the problem!

Now, you have limited options. The next step is obviously a code review, which may not always be helpful. If the application has a built-in logging mechanism, often that is not of any use either. So what is to be done? Trying to debug a non-debug version is not a very straightforward task. So, let’s at least try to establish some relationship between the code segment that might be causing the problem, and the control flow that leads into the faulty code segment.

This control flow can be analysed by tracing changes made to a specific data structure — so, target such a data structure (read that as, function parameter), and observe the control flow.

I hope this use-case illustrates my point. You may also like to simulate some other behaviour by modifying certain function parameters, just to have fun with an application. To follow along, you need to know some generic GDB concepts, along with certain commands:

A basic understanding of what a symbol table is, and how GDB uses it.
file and exec-file commands.
symbol-file and add-symbol-file commands.
call, the GDB command to call runtime APIs from GDB, in the context of the inferior (the process being debugged in GDB).
How to load a shared library that was not originally linked to the application.

Application symbol table

Every executable format has a symbol table; however, I will focus on the ELF format used by Linux. This symbol table gives important information about various symbols in the application, including global variables, local variables, function names, complete function signatures, etc. When an application is compiled in debug mode (passing the -g option to GCC), a rich symbol table is written to the executable.

However, only a limited set of details is written into the symbol table of an executable compiled/linked in release mode. For example, let us suppose an application has a function F taking three arguments — an int, a float, and a pointer to a user-defined data type (UDT) uType. Here is the signature (some would prefer the word “prototype”) of this function:

void F ( int, float, struct uType *);

If this application is compiled in debug mode, the symbol table will have the information that the function F returns a void, and takes three parameters: int, float, and a pointer to UDT uType. However, if compiled in release mode, the symbol table will only have information that there is a function F. Period.

There is no information about parameters and their types. Going one step further, if such an application is stripped (with the strip command), then the symbol table may not have any knowledge of function F at all (the runtime image of the process is not concerned with symbols; everything happens in terms of memory addresses).

Then why is this symbol table important? It is used mainly by debuggers such as GDB, for example to put breakpoints into the application. When we set a breakpoint on a function, the debugger consults the symbol table to find the address for that function, so it can set the breakpoint at that address. Doing symbol-name-to-address translation involves a complex set of operations.
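
To make the name-to-address idea concrete, here is a minimal sketch in C of a toy symbol table and a lookup over it. The type toy_symbol, the function toy_lookup, the table contents and the addresses are all made up for illustration; real ELF symbol tables and GDB's internal data structures are far more elaborate.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of one symbol-table entry: a name and the address it maps to.
 * Hypothetical type, not the real ELF or GDB layout. */
struct toy_symbol {
    const char   *name;
    unsigned long address;
};

/* Made-up table for a program that defines main and the function F. */
static const struct toy_symbol toy_table[] = {
    { "main", 0x1000UL },
    { "F",    0x1040UL },
};

static const size_t toy_table_len = sizeof toy_table / sizeof toy_table[0];

/* Return the address recorded for name, or 0 if the table has no such entry. */
unsigned long toy_lookup(const struct toy_symbol *table, size_t count,
                         const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].address;
    }
    return 0;
}
```

A lookup for a name that is missing from the table returns 0, which mirrors a debugger being unable to place a breakpoint by name once the symbol has been stripped.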

Why ltrace does not suffice

ltrace can be used to trace arguments to functions that are called — but with some limitations. It has a very clean interface as well. It has a configuration file, /etc/ltrace.conf, with system-wide settings. A user-specific configuration file, .ltrace.conf, is in the user’s home directory, in which users need to add the prototypes of the functions to be traced.

Earlier, the argument types that could be specified were limited to built-in data types like int, float, C-style strings, pointers, etc. Lately, support has been added for UDTs. ltrace can’t change parameters’ values; it gives a read-only view.

Occasionally, though, you may need to change parameter values — for fun, or maybe other better reasons. I will cover that topic in the second part of the series.

Moreover, if you continue to use ltrace without playing around with other things (like we are doing here), then you won’t understand how ltrace works!

GDB commands

exec-file vs file

Both commands accept a filename parameter — the file that will be executed when GDB is given the run command. The exec-file command does not load the symbol table of the file, while the file command does load the symbol table into memory.

By default, GDB follows a two-stage symbol-loading mechanism; at start-up, GDB may not load the full symbol-table information, letting it start faster, and later load the rest of the symbol table(s) as needed. Consequently, you may not be able to put breakpoints at all the symbols (before the symbol is actually loaded).

To override this behaviour and load the entire symbol table in one go, you can use the -readnow option. You may also like to explore the -mapped option on systems with memory-mapped file support, though we will not get into those details here.

symbol-file vs add-symbol-file

Here is the simplified syntax of these commands:

symbol-file filename [ -readnow ] [ -mapped ]
add-symbol-file filename address [ -readnow ] [ -mapped ]

The symbol-file command reads (just) the symbol table from filename. The old symbol table, if any, is discarded, and all breakpoints are removed. The add-symbol-file command adds symbol table information from filename without discarding already-loaded symbols. To use add-symbol-file, you need to load filename into the process address space by some other means. The address parameter is the in-memory location at which filename was earlier loaded.

These commands have some other options; please refer to the documentation for more details.

call

The call command can be used to invoke other functions in the context of the inferior. For example, let us suppose you have set a breakpoint. After hitting the breakpoint, GDB stops and waits for your next instruction, in which case you can use call to invoke any function accessible to the inferior. This allows you to inject code at arbitrary points in the application — altering the flow of control! The most important thing is that this call will execute in the context of the application, and so will change the application state.
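
As an illustration of the kind of function one might invoke with call, here is a minimal sketch of a small logging helper. Everything in it is an assumption made for the example: the article does not give the fields of uType, so the layout below is invented, and the names describe_utype and log_utype are hypothetical.

```c
#include <stdio.h>

/* Hypothetical layout for uType; the real fields are not given in the text. */
struct uType {
    int  id;
    char label[16];
};

/* Format *u into buf (at most len bytes, NUL-terminated).
 * Returns the number of characters the full text needs, or -1 on bad input. */
int describe_utype(const struct uType *u, char *buf, size_t len)
{
    if (u == NULL || buf == NULL || len == 0)
        return -1;
    /* %.15s stops reading label after 15 characters even if it is unterminated */
    return snprintf(buf, len, "uType{id=%d, label=%.15s}", u->id, u->label);
}

/* Print one log line to stderr; small enough to invoke by hand from a debugger. */
int log_utype(const struct uType *u)
{
    char line[64];
    int n = describe_utype(u, line, sizeof line);

    if (n < 0)
        return -1;
    fprintf(stderr, "%s\n", line);
    return n;
}
```

With such a function present in the process, stopping at a breakpoint and issuing something like call log_utype(ptr), where ptr is the address of the structure, would print its fields in the context of the inferior. How to get such a function into a release-mode process is a separate question.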

Loading an unlinked shared library

Let us suppose the application is linked to a shared library L. When the application is run, the runtime environment detects that library L is needed and loads it, and GDB reads its symbols automatically. But what if you need to load a shared library X that is not related to the program at all: it is neither linked to the application, nor does the application load it using the dlopen mechanism?

This could be done by code injection, which is an advanced technique, so we will not discuss it here. Interested readers should look at this excellent article: ‘Code Injection into Running Linux Application’.

The second technique involves using the environment variable LD_PRELOAD, which we will discuss here. LD_PRELOAD can be used to load a shared library before the application itself. For example, to load /home/user/libX.so before the program starts, from the shell, run the following command:

LD_PRELOAD=/home/user/libX.so; export LD_PRELOAD

Inside GDB, this can be done with:

set environment LD_PRELOAD /home/user/libX.so

When LD_PRELOAD is set to a valid shared library and the program is started with run, the shared library is loaded into the process address space (by the runtime loader of the OS), and GDB loads the symbol table of this library.
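
For experimenting with LD_PRELOAD, a library as small as the following sketch is enough. The file name libX.c and the names libx_loaded and libx_init are hypothetical; the constructor attribute is a GCC extension (also accepted by Clang) that asks for the function to run when the shared object is loaded, before main() starts.

```c
#include <stdio.h>

/* Becomes 1 once the constructor below has run. */
int libx_loaded = 0;

/* GCC extension: run this function when the object is loaded, before main(). */
__attribute__((constructor))
static void libx_init(void)
{
    libx_loaded = 1;
    fprintf(stderr, "libX: loaded into the process\n");
}
```

Assuming the file is saved as libX.c, it can be built with gcc -shared -fPIC libX.c -o libX.so (adding -m32 on a 64-bit host to match a 32-bit application). If the message appears on stderr when the program starts, the preload has taken effect.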

An example: traceMe.c

Let’s build a sample application, which we will call traceMe.c:

#include <stdio.h>
#include <stdlib.h>   /* atoi(), rand(), srand() */
#include <time.h>     /* time() */

/*
 * This is a user-defined type
 */
typedef struct st{
        int val;
        char arrayArg[15];
} ST;

/* Seed carried over from one call of func() to the next */
static unsigned int prevSeed;

/* This recursive function is meant to simulate a function call with
 * different parameters every time, so we can log the parameters later on.
 *
 * We take only a user-defined type pointer.
 */
int func(ST *st) {
        static int counter=0;
        counter++;

        printf ("\n Entered [%-5d] with st->val=%-15d, st->arrayArg=%s",
                    counter, st->val, st->arrayArg);

        if( st->val <=0 ) {
            return 0;
        }
        else {
            srand(prevSeed);

            /* rand() so we get some random calls, % so not too much randomness */
            st->val=st->val /  ((rand() % 3)  + 2);
            sprintf(st->arrayArg,"%014d", st->val);
            /* unsigned arithmetic, so an overflowing product simply wraps :-) */
            prevSeed = (unsigned int)st->val * (unsigned int)rand();
            return func (st);  /* Recursive call */
        }
}
int main(int argc, char *argv[]) {
        /* No check on argc and argv, to keep it simple*/
        ST st;
        st.val=atoi(argv[1]);
        sprintf(st.arrayArg, "%014d", st.val);
        prevSeed=time(NULL);
        func(&st);
        printf("\n");
        return 0;
}

Let’s compile and link this program to produce a 32-bit binary:

gcc traceMe.c -o traceMe_rel
gcc -g traceMe.c -o traceMe_debug
strip traceMe_rel -o traceMe_strip

Analysis

We have three binaries available: traceMe_rel (release-mode), traceMe_debug (debug-mode), and traceMe_strip (stripped version). Have a look at the sizes of these binaries:

-rwxrwxr-x. 1 raman raman 7098 Sep 20 07:30 traceMe_debug
-rwxrwxr-x. 1 raman raman 5742 Sep 20 07:30 traceMe_rel
-rwxrwxr-x. 1 raman raman 3820 Sep 20 07:31 traceMe_strip

The debug version is the biggest because it has a full symbol table; the release version is smaller because it has less in the symbol table; the stripped version is the smallest since strip has removed the symbol table altogether (only the dynamic symbols needed at run time remain).

Let’s gather some interesting information about these binaries. I load traceMe_debug (marker comments Dn) and traceMe_rel (marker comments Rn) in GDB and execute them; the resulting sessions are shown below:

[raman@Chalotra gdbTrace]$ gdb   -quiet
(gdb) set verbose on           #D1
(gdb) file ./traceMe_debug      #D2
Reading symbols from /home/raman/LFY/gdbTrace/traceMe_debug...done.#D3

(gdb) ptype func                           #D4
Reading in symbols for traceMe.c...done.   #D5
type = int (ST *)                          #D6
(gdb) ptype ST                            #D7
type = struct st {
    int val;
    char arrayArg[15];
}                                          #D8
(gdb) br main                              #D9
Breakpoint 1 at 0x80485bb: file traceMe.c, line 51.      #D10
(gdb) r 100
Starting program: /home/raman/LFY/gdbTrace/traceMe_debug 100
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from system-supplied DSO at 0x110000...(no debugging symbols found)...done.
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6

Breakpoint 1, main (argc=2, argv=0xbffff6b4) at traceMe1.c:51
51              st.val=atoi(argv[1]);
Missing separate debuginfos, use: debuginfo-install glibc-2.13.90-9.i686
(gdb)

[raman@Chalotra gdbTrace]$ gdb   -quiet
(gdb) set verbose on                  #R1
(gdb) file ./traceMe_rel              #R2
Reading symbols from /home/raman/LFY/gdbTrace/traceMe_rel...(no debugging symbols found)...done.      #R3
(gdb) ptype func                     #R4
type = int ()                         #R6
(gdb) ptype ST                       #R7
No symbol table is loaded.  Use the "file" command.    #R8
(gdb) br main                         #R9
Breakpoint 1 at 0x80485b5             #R10
(gdb) r 100
Starting program: /home/raman/LFY/gdbTrace/traceMe_rel 100
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from system-supplied DSO at 0x110000...(no debugging symbols found)...done.
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6

Breakpoint 1, 0x080485b5 in main ()
Missing separate debuginfos, use: debuginfo-install glibc-2.13.90-9.i686
(gdb)

Let’s look at how, why and where GDB behaved differently with the two binaries:

D1 and R1: set verbose on makes GDB output internal information about what it is doing — for example, messages like Reading symbols.... This helps clearly understand GDB behaviour.
D2/D3 and R2/R3: file is used to specify the application to debug. Note that for D2, GDB responded with ...done (#D3), which means it read symbols from traceMe_debug. However, for R2, it says (no debugging symbols found) (#R3).
D4 and R4: ptype is used to see what the type of func is. This command can be used to gather information about variables, functions, UDTs, etc.
D5: Surprise! There is no ‘R5’. Why? This is Stage 2 of the loading of traceMe_debug’s symbol table; file in D2 only did Stage 1 loading. R5 is missing since there is no Stage 2: the release-mode binary doesn’t have that much information.
D6 and R6 (the output of ptype func): In the debug version, GDB knows that func takes ST * , but in the release version, it only knows that func is a function and lacks details about argument types.
D7 and R7: We use ptype on the UDT ST. As expected, GDB prints a correct description of ST in the debug version (#D8), but complains (#R8) about the release version. This will be discussed further in the next part.
D9/D10 and R9/R10: br is used to set a breakpoint on main(). As expected, GDB prints the description of the inserted breakpoint. However, the level of detail printed differs; for the debug version, GDB shows the address, source filename containing main, and the line number where main appears in the source file (#D10). However, for the release version, GDB just prints the address of the function.

Summing up

If you are looking for a simple, portable, hassle-free but less powerful way to log function arguments, ltrace is for you. However, if you are looking for more power, then this GDB information should have been a good start!

In the subsequent article on the subject, I aim to: run traceMe_rel (the release version), log the arguments to func, and print the complete argument data (both fields of the UDT). Also, I plan to provide a way to change argument values. In other words, we need to do something so that at the command #R7, GDB doesn’t complain, but gives the exact type of ST. Once this is done, life will be much easier. See you next month!
