GDB for Reverse Engineering
The GNU Debugger (GDB) is a powerful tool for reverse engineering and dynamic analysis of ELF binaries. This post walks through the basic usage of this command-line debugger so you can build familiarity with it. It is a good stepping stone to another post of mine that covers a basic x86-64 buffer overflow exploit.
Understanding the basic command line use-cases for GDB
1. Loading the ELF Binary into GDB:
You can load a binary into GDB for dynamic analysis with the following syntax: gdb ./filename
Once GDB starts, you can begin inspecting the binary.
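If you want to run the same inspection against several binaries, GDB can also be driven non-interactively. The sketch below is one way to do that from Python, using GDB's -q, -batch, and -ex options. The helper names gdb_batch_argv and run_gdb_batch are my own and are not part of GDB.

```python
import subprocess


def gdb_batch_argv(binary, commands):
    """Build an argv list that runs each GDB command against `binary`, then exits."""
    argv = ["gdb", "-q", "-batch"]      # -q: no banner, -batch: exit when done
    for command in commands:
        argv += ["-ex", command]        # -ex: execute one GDB command
    argv.append(binary)
    return argv


def run_gdb_batch(binary, commands):
    """Run GDB non-interactively and return its output (needs gdb on PATH)."""
    result = subprocess.run(
        gdb_batch_argv(binary, commands),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

For example, run_gdb_batch("./filename", ["info files"]) would return whatever GDB prints for that command, assuming gdb is installed and ./filename exists.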
2. Basic Commands and Information
You can display the entry point and the address range of each section with: info files
You can list the functions in the binary to spot names that may be of interest: info functions
You can retrieve a list of global and static variables that may hold important data: info variables
You can disassemble a function and read its contents with: disas function_name
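When a binary has no debug information, info functions typically prints a "Non-debugging symbols:" list of address and name pairs. Here is a minimal sketch that turns that kind of listing into a lookup table. The sample text and its addresses are made up for illustration, and the parser only handles this simple two-column layout.

```python
import re

# Illustrative sample only: these addresses are invented, not from a real binary.
SAMPLE_OUTPUT = """\
Non-debugging symbols:
0x0000000000401000  _init
0x0000000000401030  puts@plt
0x0000000000401126  main
"""

SYMBOL_LINE = re.compile(r"^(0x[0-9a-fA-F]+)\s+(\S+)$")


def parse_function_list(text):
    """Map each function name to its address for 'ADDRESS  NAME' lines."""
    functions = {}
    for line in text.splitlines():
        match = SYMBOL_LINE.match(line.strip())
        if match:
            functions[match.group(2)] = int(match.group(1), 16)
    return functions


functions = parse_function_list(SAMPLE_OUTPUT)
```

With a table like this you can pick out interesting names (main, anything with "check" or "flag" in it) and jump straight to disassembling them.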
3. Setting Breakpoints
Breakpoints are crucial for pausing the execution at specific points to inspect the binary's behavior.
You can set a breakpoint on function names with the following syntax: break function_name
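You can set a breakpoint at an offset within a function: break *main+56
or at an absolute address: break *0x400123 (the * tells GDB that what follows is an address rather than a function name or line number, and the 0x prefix marks it as hexadecimal)
You can remove a breakpoint with: delete N (where N is the breakpoint number shown by info breakpoints); delete with no argument removes all breakpoints.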
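A breakpoint written as a function name plus an offset and one written as an absolute address can refer to the same instruction. The small sketch below shows the arithmetic. The helper name break_at_offset and the address used for main are hypothetical.

```python
def break_at_offset(symbol_address, offset):
    """Return the `break *ADDRESS` command for an instruction at symbol+offset."""
    return f"break *{symbol_address + offset:#x}"


# Hypothetical: suppose `info functions` reported main at 0x40113a.
command = break_at_offset(0x40113A, 56)
```

Here 56 is decimal, so the target is 0x40113a + 0x38. Keeping offsets straight between decimal and hexadecimal is a common source of misplaced breakpoints.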
4. Running and Stepping Through the Program
Once you’ve set a breakpoint, you can run the program up to that point with: run (or r for short)
Once the program hits a breakpoint, you can “step through” it one line or one instruction at a time, inspect its state at each stop, and see how changes affect its output:
You can move to the next source line without stepping into function calls with: next (use nexti to advance a single machine instruction instead, which is what you need when the binary has no debug information)
You can step into a function call to inspect its execution with: step (or stepi for a single machine instruction)
You can continue running until the next breakpoint with: continue (or c for short)
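The difference between stepping over a call and stepping into it is easiest to see on a toy model. The sketch below is not how GDB is implemented. It is a made-up six-instruction "program" with two methods named after GDB's instruction-level commands, stepi (step into) and nexti (step over), to show where each one stops.

```python
# Toy model only: (opcode, operand) pairs stand in for machine instructions.
PROGRAM = [
    ("mov", None),   # 0: start of main
    ("call", 4),     # 1: call the helper at index 4
    ("mov", None),   # 2: back in main after the call
    ("halt", None),  # 3: end of main
    ("mov", None),   # 4: start of helper
    ("ret", None),   # 5: return to the caller
]


class ToyCpu:
    def __init__(self, program):
        self.program = program
        self.pc = 0                 # index of the next instruction
        self.return_stack = []      # saved return addresses

    def stepi(self):
        """Execute exactly one instruction, following calls into the callee."""
        op, arg = self.program[self.pc]
        if op == "call":
            self.return_stack.append(self.pc + 1)
            self.pc = arg
        elif op == "ret":
            self.pc = self.return_stack.pop()
        elif op != "halt":
            self.pc += 1
        return self.pc

    def nexti(self):
        """Execute one instruction, but run any call through to its return."""
        depth = len(self.return_stack)
        self.stepi()
        while len(self.return_stack) > depth:
            self.stepi()
        return self.pc


into = ToyCpu(PROGRAM)
into.stepi()                  # now at the call
stepped_into = into.stepi()   # lands inside the helper

over = ToyCpu(PROGRAM)
over.stepi()                  # now at the call
stepped_over = over.nexti()   # lands on the instruction after the call
```

Stepping into the call stops at index 4 (inside the helper), while stepping over it stops at index 2 (back in main).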
5. Inspecting Registers and Memory
GDB allows you to inspect registers and memory, which is useful for examining input validation, confirming buffer overflows, and following cryptographic routines.
You can view register values with: info registers
This displays the current values of the CPU registers (eax, ebx, esp, and eip on 32-bit targets; rax, rbx, rsp, and rip on 64-bit targets; and so on). Below is a list of the core registers you will find on the two architectures most commonly associated with the binaries you will analyze:
These are the x86 (32-bit) registers you should slowly become familiar with:
EAX (Accumulator): The accumulator is primarily used for arithmetic operations and I/O. It often holds the return value of functions. Many instructions use EAX as an implicit operand.
EBX (Base): The base register is generally used as a pointer to data in memory, especially for data structures and arrays. It can also hold base addresses for calculations.
ECX (Count): This register is often used as a counter in loops and string operations. Certain instructions can use ECX implicitly, especially those that operate on strings or perform repeated actions.
EDX (Data): The data register is used in arithmetic operations, particularly for multiplication and division. It can hold additional data or carry results from these operations.
ESI (Source Index): The source index register is commonly used for string and memory operations. It points to the source data when performing operations that move data between memory and registers.
EDI (Destination Index): The destination index register is used in conjunction with ESI for memory operations, pointing to the destination for data being transferred or manipulated.
EBP (Base Pointer): The base pointer register points to the base of the current stack frame. It is crucial for accessing local variables and function parameters during function calls.
ESP (Stack Pointer): The stack pointer register points to the top of the stack. It is updated automatically as values are pushed onto or popped off the stack during function calls and returns.
EIP (Instruction Pointer): The instruction pointer holds the address of the next instruction to be executed. It is automatically updated by the CPU as instructions are executed, allowing the program to flow correctly.
These are the x64 (64-bit) registers that you should slowly become familiar with:
RAX (Accumulator): The accumulator is used for arithmetic operations and I/O. It often holds the return value of functions. Many instructions use RAX as an implicit operand.
RBX (Base): The base register is generally used as a pointer to data in memory, especially for data structures and arrays. It can also hold base addresses for calculations.
RCX (Count): This register is often used as a counter in loops and string operations. Certain instructions can use RCX implicitly, especially those that operate on strings or perform repeated actions.
RDX (Data): The data register is used in arithmetic operations, particularly for multiplication and division. It can hold additional data or carry results from these operations.
RSI (Source Index): The source index register is commonly used for string and memory operations. It points to the source data when performing operations that move data between memory and registers.
RDI (Destination Index): The destination index register is used in conjunction with RSI for memory operations, pointing to the destination for data being transferred or manipulated.
RBP (Base Pointer): The base pointer register points to the base of the current stack frame. It is crucial for accessing local variables and function parameters during function calls.
RSP (Stack Pointer): The stack pointer register points to the top of the stack. It is updated automatically as values are pushed onto or popped off the stack during function calls and returns.
RIP (Instruction Pointer): The instruction pointer holds the address of the next instruction to be executed. It is automatically updated by the CPU as instructions are executed, allowing the program to flow correctly.
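The two lists overlap more than they might appear to: on x86-64, EAX is simply the low 32 bits of RAX, and AL is the low 8 bits. The sketch below models how the CPU treats those views when an instruction writes to them. The function names are my own, and this models instruction behavior in 64-bit mode rather than any particular GDB command.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def read_eax(rax):
    """EAX is the low 32 bits of RAX."""
    return rax & MASK32


def write_eax(rax, value):
    """A 32-bit write (e.g. mov eax, 1) clears the upper 32 bits of RAX.

    The old `rax` is accepted only for symmetry with write_al; it is discarded.
    """
    return value & MASK32


def write_al(rax, value):
    """An 8-bit write to AL leaves the other 56 bits of RAX untouched."""
    return (rax & ~0xFF & MASK64) | (value & 0xFF)


rax = 0x1122334455667788
```

This is why a value that looks wrong in rax can look right in eax: you may be reading stale upper bits, or a 32-bit write may have wiped them.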
You can examine memory starting at the address held in a register with: x/200xb $rsp (this prints 200 bytes in hexadecimal, starting at the stack pointer)
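Reading a byte-by-byte dump takes some practice because x86 and x86-64 are little-endian: the lowest byte of a value comes first in memory. The following sketch lays bytes out roughly the way an x/xb dump does and decodes eight of them as one 64-bit value. The stack contents and addresses are invented for illustration, and the layout is only an approximation of GDB's.

```python
import struct


def format_xb(start_address, data, per_line=8):
    """Lay out bytes as 'ADDRESS: 0x.. 0x..' rows, eight bytes per row."""
    lines = []
    for offset in range(0, len(data), per_line):
        chunk = data[offset:offset + per_line]
        cells = "\t".join(f"0x{byte:02x}" for byte in chunk)
        lines.append(f"{start_address + offset:#x}:\t{cells}")
    return lines


def as_u64_le(data, offset=0):
    """Decode 8 bytes at `offset` as an unsigned little-endian 64-bit value."""
    return struct.unpack_from("<Q", data, offset)[0]


# Invented stack snapshot: eight 'A' bytes of input, then a saved address.
stack = b"A" * 8 + bytes.fromhex("7211400000000000")
dump = format_xb(0x7FFC0000, stack)
saved_address = as_u64_le(stack, 8)
```

In the dump the second row starts with 0x72 0x11 0x40, which reads "backwards" compared with the decoded value 0x401172. Spotting your own input bytes (0x41 for 'A') next to a value like that is how you confirm an overflow is getting close to a saved address.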
6. Modifying Program Execution
In CTF challenges, you may need to change register or memory values to bypass checks. You can change a register’s value with: set $eax = 0x1
This sets EAX to 1, which can affect program flow if, for example, a later comparison or branch depends on that register. You can also modify memory directly with: set {int}0x601050 = 42
which writes the integer 42 to that specific memory address.
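To make the {int} part concrete: it tells GDB to treat the address as holding an int, so the write covers four bytes rather than one. Here is a minimal model of that write on a plain byte buffer, assuming a 4-byte little-endian int as on x86 and x86-64. The set_int helper and the eight-byte "memory" region are hypothetical.

```python
import struct


def set_int(memory, base_address, address, value):
    """Write a 4-byte little-endian int into `memory` at `address`.

    `memory` is a bytearray standing in for the bytes that start at
    `base_address` in the debugged process.
    """
    struct.pack_into("<i", memory, address - base_address, value)


# Eight zeroed bytes pretending to live at 0x601050.
memory = bytearray(8)
set_int(memory, 0x601050, 0x601050, 42)
```

After the write, the first four bytes are 2a 00 00 00 and the rest are untouched. If the value you need to overwrite is a single byte or an 8-byte pointer, a different type in the braces (such as char or long) changes how many bytes are written.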
7. Extract Information from Variable Values:
You can use GDB to print the value of a variable in a binary with: print variable_name
Whether this works depends on the variable’s type and on whether the binary still contains its symbols; CTF binaries are often stripped, in which case you have to work from raw addresses instead.
You can also read data directly from a memory address with: x/s 0x601050
which prints the NUL-terminated string stored at that address.
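The s format reads bytes from the given address until it reaches a zero byte, which is how C marks the end of a string. A short sketch of the same idea on a byte buffer follows. The helper name and the buffer contents are made up.

```python
def read_c_string(memory, offset=0):
    """Return the text from `offset` up to (not including) the next NUL byte.

    Raises ValueError if no NUL byte follows `offset`.
    """
    end = memory.index(b"\x00", offset)
    return memory[offset:end].decode("latin-1")


# Invented buffer: a string, its terminator, then unrelated bytes.
buffer = b"flag{demo}\x00junk"
text = read_c_string(buffer)
```

If the address you point x/s at is not actually a string, the output is whatever bytes happen to precede the next zero, so garbage output usually means the address or the type is wrong.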
8. Exiting GDB
Once you are done with your analysis, you can exit GDB with: quit (or q for short)
9. Enhance your debugging experience by expanding on GDB’s command line utility:
GDB-PEDA and GEF (GDB Enhanced Features) are extensions built on GDB’s Python scripting support. They add a richer display of registers, stack, and disassembly, along with helper commands for exploit development. GEF supports more architectures and a broader set of modern use cases, which may make it the better general choice. Pwntools is great for automating exploitation, since it can launch or attach GDB from Python scripts.
GDB-PEDA: longld/peda: PEDA - Python Exploit Development Assistance for GDB (github.com)
PWNTOOLS: Gallopsled/pwntools: CTF framework and exploit development library (github.com)
Now that you’ve explored GDB's command line utilities and learned how to interpret its outputs, you're well on your way to using these skills in CTFs and reverse engineering. The insights from this guide, along with our previous x64 basic buffer overflow post, will set you up for success in uncovering hidden flags and tackling complex challenges. Keep practicing, and happy hacking!