Modify-function-return-value hack! — Part 2


In my previous article on this topic, we discussed some guidelines on how to write a secure application in C. The article focused on functions defined in the application itself. Now, we will look to the system, and see how an application (written in C) interacts with the system C runtime library (LIBC), and the vulnerability of LIBC function calls.

An application needs resources and services to complete a well-defined task, and these services are provided by the OS (let’s call it the “execution environment”). For example, the printf function is provided by the runtime C library LIBC. This is a classic example of code reuse, and a modular approach towards solving a problem.

To make an application secure, we have to ensure that:

  1. The application logic/design, in itself, is secure.
  2. The execution environment (we limit ourselves to LIBC here) is secure.
  3. The interface between application and LIBC (the execution environment that is providing services) is secure.

We already discussed (1) in the previous article; I will discuss some of the other techniques in this article. We assume that (2) is secure enough. So, as mentioned in the “What Next” section of the previous article, (3), the application/LIBC interface, is the subject matter of this article. The focus is on GLIBC (GNU LIBC), although the concepts are generic. The tools used are: gdb v7.0-3.fc12, gcc v4.4.2 20091027, glibc v2.11.

Basic concepts

A typical C program relies on library functions such as printf, strcpy, strcat, strcmp, scanf, etc. All these functions are defined in the C runtime library provided by the system (normally, in the /lib/ directory). The common convention is to name this library libc.so.X, where X is the version number. This file is usually a symbolic link to the actual library file (for example, libc.so.6 pointing to libc-2.11.so).

Let’s suppose that a call to function F (defined in GLIBC) originating from the application is intercepted and tampered with! The level of tampering can vary; e.g., a hacker can completely replace the GLIBC function with a custom implementation (using the dynamic linker’s library interposition feature, such as LD_PRELOAD), or can alter the flow of the application without providing a new function definition. The application should not be blamed here; it is the interface that has been manipulated.

We will also see how ltrace can be used for application tracing, and discuss how functions are defined in a gdb script.

Modifying hackMe.c: hackMe2.c

Let’s rewrite the earlier program hackMe.c, naming it hackMe2.c. The newly added code consists of the obfuscating #define block, the fake functions and the two helper macros.

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

/* 
   This block of defines obfuscates all the functions
   e.g., authenticate will appear as FUNC4 in compiled binary 
*/
#define fake1         FUNC1
#define fake2         FUNC2
#define fake3         FUNC3
#define authenticate  FUNC4
#define fake5         FUNC5
#define fake_str      TEST

char fake_str[100];
int i;

#define DEFINE_FUNC(FNAME, RETTYPE, ARG, RETVAL, STRVAL) \
	RETTYPE FNAME   (ARG)  \
	{	\
		strcpy(fake_str, STRVAL); \
		i=strcmp(fake_str, "123"); \
		return (RETVAL);	\
	}
#define CALL_FUNC(FNAME, ARG) FNAME(ARG)

DEFINE_FUNC( fake1, int, int i, 1,"fake1_str")
DEFINE_FUNC( fake2, float, float i, 2, "fake2_str")
DEFINE_FUNC( fake3, long, long i, 3, "fake3_str")
DEFINE_FUNC( fake5, double, double i, 5, "fake5_str")

/*
    Returns 0 on success, and
    Returns 1 on failure
*/
int authenticate (char *test)
{

    CALL_FUNC(fake1,1);

    if(strcmp(test,"PASS") == 0 )
        return 0; /* success*/
    else
        return 1; /*  fail */
}

/* USAGE:  ./hackMe2 FAIL */
int main(int argc, char *argv[])
{
     int retVal = -1;

     if (argc< 2)
     {
         printf ("\n USAGE: %s <PASS|FAIL>", argv[0]);
         exit (-2);
     }
     /* skipping any checks on argv to keep it simple*/

     CALL_FUNC(fake2,2);
     retVal = authenticate(argv[1]);
     CALL_FUNC(fake3,3);

     if( retVal == 0)
        printf("\n Authenticated ... program continuing...\n");
     else
     {
        printf("\n Wrong Input, exiting...\n");
        CALL_FUNC(fake5,4);

        exit (retVal);
     }
     /*
     .
     .
     Rest of the program
     .
     .
     */
}

Compile and link the program to produce a 32-bit binary (on a 64-bit system, this requires adding the -m32 option to the gcc command), then strip the binary, as follows:

gcc hackMe2.c -o hackMe2
strip hackMe2

Fake code has been included (not in the best way, but good enough for demonstration purposes): the functions fake1, fake2, fake3 and fake5, and the variable fake_str. Two macros, DEFINE_FUNC and CALL_FUNC, define and call the fake functions, respectively. Every fake function calls strcpy and strcmp. In summary, hackMe2.c differs from hackMe.c as follows:

  • Obfuscation — Guessing a symbol name, in the application, is harder for a hacker.
  • Fake code — A little harder for a hacker to find the code to be targeted.
  • Stripped binary — The hacker can’t see any symbols inside the application; consequently, debugging is even more difficult. Note that this is the second obstacle level that the hacker faces — the first being the binary in release mode. But, as discussed in the previous article, this may invite debugging challenges in the field.

With these modifications, hackMe2.c is better than hackMe.c, if evaluated on a security basis (let’s not bother about the data structures book that taught us code size and runtime efficiency; that book didn’t tell us anything about hackers, by the way).

Note: Please remember that the fake functions in this example do not make a good design — these are for illustration purposes only; e.g., the use of global variables is not a good design. So please don’t use such functions in your application.
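
To make the two macros more concrete, here is a sketch of what the preprocessor should produce for DEFINE_FUNC( fake1, int, int i, 1,"fake1_str") once the obfuscating #defines are applied. It is written out by hand for illustration; running gcc -E on hackMe2.c should show the equivalent text. Note that the parameter i hides the global i inside the function.

```c
#include <string.h>

/* Obfuscated names, as in hackMe2.c */
#define fake1     FUNC1
#define fake_str  TEST

char fake_str[100];   /* the preprocessor turns this into: char TEST[100]; */
int i;

/* Hand-written expansion of:
   DEFINE_FUNC( fake1, int, int i, 1,"fake1_str") */
int FUNC1(int i)
{
    strcpy(TEST, "fake1_str");   /* copy the fake string into the global buffer */
    i = strcmp(TEST, "123");     /* result lands in the parameter i and is discarded */
    return (1);                  /* RETVAL from the macro invocation */
}
```

Only the names FUNC1 and TEST can ever reach the binary, which is the point of the obfuscation block.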

Analysing the victim binary

Let’s use nm and see if anything can be found in hackMe2:

[raman@localhost article]$ nm hackMe2
nm: hackMe2: no symbols
[raman@localhost article]$

This is as we expect after we stripped the binary. Note that even main is not known to nm. You can run the objdump command with options -T and/or -R on hackMe2 and see what symbols hackMe2 needs from GLIBC.

Can we use gdb to debug hackMe2? The answer is no, not directly: because the application contains no visible symbols, gdb can’t set a breakpoint on any of its functions by name.

Peeking inside GLIBC

So far, the hacker hasn’t had much luck with this application, so it’s time to look at the GLIBC functions that the application calls. Almost every application calls into GLIBC, and hackMe2 is no exception.

We need a tool that can look into the GLIBC calls; fortunately, ltrace is a tool that excels at this. ltrace, a library call tracer tool, is shipped with (almost) every Linux distribution. Here is the output of running hackMe2 under ltrace.

[raman@localhost article]$ ltrace  ./hackMe2 WRONGPASSWORD
__libc_start_main(0x80485de, 2, 0xbfaf1054, 0x80486a0, 0x8048690 <unfinished ...>
memcpy(0x80499c0, "fake2_str", 10)                                                               = 0x80499c0
strcmp("fake2_str", "123")                                                                       = 1
memcpy(0x80499c0, "fake1_str", 10)                                                               = 0x80499c0
strcmp("fake1_str", "123")                                                                       = 1
strcmp("WRONGPASSWORD", "PASS")                                                             = 1
memcpy(0x80499c0, "fake3_str", 10)                                                               = 0x80499c0
strcmp("fake3_str", "123")                                                                       = 1
puts("\n Wrong Input, exiting..."
 Wrong Input, exiting...
)                                                               = 26
memcpy(0x80499c0, "fake5_str", 10)                                                               = 0x80499c0
strcmp("fake5_str", "123")                                                                       = 1
exit(1 <unfinished ...>
+++ exited (status 1) +++
[raman@localhost article]$

Looking at the output, we find several calls to functions defined in GLIBC. (The strcpy calls from the source show up as memcpy, most likely because gcc replaced each strcpy of a constant string with an equivalent memcpy.) However, we are interested only in calls that involve our input (i.e., the password string WRONGPASSWORD). There is only one such call:

strcmp("WRONGPASSWORD", "PASS")      =1      (#T)

“1” is the return value of strcmp; any non-zero value means the two strings differ, which here means the authentication fails.
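
Since the whole attack hinges on strcmp’s return convention, a minimal sketch may help: strcmp returns 0 when the two strings are equal, and a non-zero value otherwise. The helper name strings_match is hypothetical and is not part of hackMe2.c.

```c
#include <string.h>

/* Hypothetical helper: returns 1 when the strings are equal, 0 otherwise. */
int strings_match(const char *a, const char *b)
{
    /* strcmp returns 0 only when both strings are identical */
    return strcmp(a, b) == 0;
}
```

With this helper, strings_match("PASS", "PASS") is 1 and strings_match("WRONGPASSWORD", "PASS") is 0. The C standard only fixes the sign of a non-zero strcmp result, so the exact value 1 seen in the trace should not be relied upon on other systems.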

By this time, you may have realised that a hacker will try to apply the modify-function-return-value hack, discussed in the previous article, on the strcmp function defined in GLIBC! This is an important point: applying this hack to a function defined in an application is different from applying it on functions defined in GLIBC. That’s because the application is under the control of the application developer, but GLIBC is not. We (the developers) have not left any hole in the application, as far as applying this hack is concerned — but we can’t stop hackers applying this hack on GLIBC functions!

Hacking ‘strcmp’ in GLIBC

The hacker’s goal is clear: override strcmp and return 0 irrespective of the arguments passed — but this should be done only for the case #T. Other calls to strcmp should be left unchanged. So, here is the gdb script (arg_strcmp.gdb) that does the magic. It runs the victim application providing a wrong password, and then forcefully applies the modify-function-return-value hack to the desired strcmp call.

# Raman Deep

file ./hackMe2

################################# DEFINITIONS:START
set var  $_isEq=0

# Yes! GDB_STRCMP, below, is a gdb function. 
# Function that provides strcmp-like functionality for gdb script; 
# this function will be used to match the password string provided in command line argument 
# with the string argument of strcmp in program
define GDB_STRCMP
set var  $_i=0
set var  $_c1= *(unsigned char *) ($arg0 + $_i)
set var  $_c2= *(unsigned char *) ($arg1 + $_i)
while (  ($_c1 != 0x0) && ($_c2 != 0x0) && ($_c1 == $_c2) )

#printf "\n i=%d, addr1=%x(%d,%c), addr2=%x(%d,%c)", $_i, ($arg0 + $_i),$_c1, $_c1, ($arg1 + $_i), $_c2,$_c2
set  $_i++
set  $_c1= *(unsigned char *) ($arg0 + $_i)
set  $_c2= *(unsigned char *) ($arg1 + $_i)

#while end
end

if( $_c1 == $_c2)
set $_isEq=1
else
set $_isEq=0
end

#GDB_STRCMP end
end
################################# DEFINITIONS:ENDS

br __libc_start_main
r WRONGPASSWORD
br strcmp
c
while 1
up
set var $argOne=*(int *)($esp)
set var $argTwo=*(int *)($esp+4)
set var $_myStr="WRONGPASSWORD"
printf "\n strcmp((0x%x)\"%s\" , (0x%x)\"%s\") \n\n",$argOne, $argOne, $argTwo, $argTwo
GDB_STRCMP $argOne $_myStr

if ( $_isEq == 1)
printf "\n\t--> THIS IS OF MY INTEREST -> I AM GOING TO MAKE IT PASS <--\n"
stepi
step
printf "\n\t--> EAX=%d before HACK!!, setting this to 0 <--\n", $eax
set $eax=0
printf "\n\t--> Set...EAX=%d <--\n",  $eax
set $_isEq=0
end

c

#while end
end
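
For readers who find gdb’s scripting syntax hard to follow, the GDB_STRCMP function defined in the script above corresponds roughly to the C function below. This is only an illustrative rendering; the name gdb_strcmp_is_equal is hypothetical, and the return value stands in for what the script stores in $_isEq.

```c
/* C rendering of GDB_STRCMP: returns the value the script stores in $_isEq
   (1 when the strings are equal, 0 otherwise). */
int gdb_strcmp_is_equal(const char *arg0, const char *arg1)
{
    int i = 0;
    unsigned char c1 = (unsigned char)arg0[i];
    unsigned char c2 = (unsigned char)arg1[i];

    /* Advance while neither string has ended and the characters agree */
    while (c1 != 0x0 && c2 != 0x0 && c1 == c2) {
        i++;
        c1 = (unsigned char)arg0[i];
        c2 = (unsigned char)arg1[i];
    }

    /* The loop stops with c1 == c2 only when both are the terminating NUL */
    return c1 == c2;
}
```

Unlike strcmp, this reports only equality and not ordering, which is all the script needs in order to recognise the call of interest.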

This script defines a function named GDB_STRCMP, which takes two arguments (each being the address of the start of a C-style string) and sets the convenience variable $_isEq to 1 if the two strings match, or to 0 if they differ.

A breakpoint is set on strcmp. Every time this breakpoint is hit, the script moves up one frame to the caller, reads the two arguments of strcmp from the stack, prints their addresses and contents, and compares the first argument with our input (WRONGPASSWORD). A match means we have found #T, so the script lets strcmp run to completion and then overwrites its return value in EAX with 0 (which means success).

The following session shows the output of running hackMe2 in gdb using the arg_strcmp.gdb script. One unusual thing: the stack trace shows strcmp being called from exit. This is wrong; because the binary is stripped, gdb has no symbols for the application’s own functions and mislabels the caller’s address with a symbol it can still see.

[raman@localhost article]$ gdb -x ../gdbScripts/arg_strcmp.gdb -quiet  -batch
Breakpoint 1 at 0x8048364

Breakpoint 1, 0x009faad6 in __libc_start_main () from /lib/libc.so.6
Breakpoint 2 at 0xa5a200

Breakpoint 2, 0x00a5a200 in strcmp () from /lib/libc.so.6
#1  0x080484fc in exit ()

 strcmp((0x80499c0)"fake2_str" , (0x8048762)"123")

Breakpoint 2, 0x00a5a200 in strcmp () from /lib/libc.so.6
#1  0x080484bb in exit ()

 strcmp((0x80499c0)"fake1_str" , (0x8048762)"123")

Breakpoint 2, 0x00a5a200 in strcmp () from /lib/libc.so.6
#1  0x080485cc in exit ()

 strcmp((0xbffff566)"WRONGPASSWORD" , (0x8048784)"PASS")

	--> THIS IS OF MY INTEREST -> I AM GOING TO MAKE IT PASS <--
0x00a5a204 in strcmp () from /lib/libc.so.6
Single stepping until exit from function strcmp,
which has no line number information.
0x080485cc in exit ()

	--> EAX=1 before HACK!!, setting this to 0 <--

	--> Set...EAX=0 <--

Breakpoint 2, 0x00a5a200 in strcmp () from /lib/libc.so.6
#1  0x08048549 in exit ()

 strcmp((0x80499c0)"fake3_str" , (0x8048762)"123")

 Authenticated ...program continuing...

Program exited with code 051.
/home/raman/gdbScripts/arg_strcmp.gdb:59: Error in sourced command file:
No stack.
[raman@localhost article]$

Countermeasures

One solution to this problem is to inline the calls to strcmp, so that the comparison no longer goes through a GLIBC call that can be intercepted. A compiler may do this implicitly, but this would be compiler/system dependent. So, it’s better to achieve the same effect explicitly by avoiding the call to strcmp altogether. This can be done by defining a custom function such as my_strcmp (which should be obfuscated) with the same functionality as GLIBC’s strcmp. However, my_strcmp would not call any function in GLIBC; it would implement its own logic using plain C statements. The authenticate function (or any other sensitive function) would then call my_strcmp instead of strcmp.
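
A minimal sketch of this countermeasure might look as follows. It assumes the same authenticate contract as hackMe2.c (0 on success, 1 on failure); the body of my_strcmp is one possible implementation, not the only one, and the fake calls are left out for brevity.

```c
/* Same contract as strcmp, but implemented with plain C statements,
   so the comparison never crosses the application/GLIBC boundary. */
static int my_strcmp(const char *s1, const char *s2)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    /* Walk both strings while they agree and the first has not ended */
    while (*p1 != '\0' && *p1 == *p2) {
        p1++;
        p2++;
    }
    /* 0 if equal; otherwise the sign of the first differing character */
    return (int)*p1 - (int)*p2;
}

/* Returns 0 on success and 1 on failure, as in hackMe2.c */
int authenticate(char *test)
{
    if (my_strcmp(test, "PASS") == 0)
        return 0; /* success */
    else
        return 1; /* fail */
}
```

Rerunning ltrace on the rebuilt binary is a quick way to check that no strcmp call involving the password remains. In a real build, my_strcmp would also be renamed through the #define block, like the other functions.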

We have now learned how gdb can be used to play with functions defined in GLIBC to alter the behaviour of an application. This article does not address all the security concerns that an application has to take care of in the real world, but it should increase awareness among software developers.

What next…

I hope readers find this helpful; I have found the techniques discussed very useful in application debugging. I have tried to explain, in detail, everything used in this article. However, I believe gdb scripts are complex beasts, so I will probably try to write an article on gdb scripts.

Disclaimer: The information provided in this article is only for educational purposes, and should not be used to attack/hack any application that is not owned by you.

All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.