Reverse Engineering
Methodology
- Description and hints/try to determine topic
- Surface-Level Analysis
- file/strings -t x -w can provide more clues
- Static Analysis
- Look at imports/exports
- Work backwards from comparisons/notable strings
- Dynamic Analysis
Common reversing formats
- Flag is plaintext in memory, input is compared
- Input is modified, compared to flag in program
- Flag is modified, compared to input
- Some mathematical condition must be met to satisfy equation
- User must provide input and key, program will decipher
- Ciphertext is given, and the program was used to encode it
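For the "input is modified, compared to flag in program" format, the expected input can usually be recovered by undoing each step in reverse order, provided every step is invertible. The sketch below assumes a made-up check that XORs each byte with 0x42 and then adds its index; the constant and the names `transform` and `invert` are illustrative only and not taken from any particular challenge.

```python
KEY = 0x42  # hypothetical XOR key, as if read from the disassembly


def transform(data: bytes) -> bytes:
    """What the (hypothetical) binary does to the input before comparing."""
    return bytes(((b ^ KEY) + i) & 0xFF for i, b in enumerate(data))


def invert(target: bytes) -> bytes:
    """Undo the steps in reverse order: subtract the index, then XOR."""
    return bytes(((b - i) & 0xFF) ^ KEY for i, b in enumerate(target))


# In a real challenge these bytes would be dumped from the binary's data.
stored = transform(b"flag{example}")
recovered = invert(stored)
print(recovered)
```

The same approach applies to "flag is modified, compared to input": run the forward transformation on the stored bytes instead of the inverse.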
Common ways to solve problems
- Look at all comparisons, is there a check on length? Does it check the whole string at once? Does it go character by character?
- Break at final comparison, see how user input and program memory have changed
- Determine if each character in input is affected the same way, may be trivial to decode through simple substitution
- If a string is checked one byte at a time, it may be possible to set a breakpoint inside the check function and see what input causes the function to be called one more time than other inputs
- Step through encryption/check functions to see what they are doing, may be easier than static analysis
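The byte-at-a-time idea above can be sketched without a debugger. Here a hypothetical `check` function stands in for the binary, and the number of bytes it compares plays the role of a breakpoint hit counter; `SECRET`, `check`, and `recover` are assumptions made for illustration.

```python
import string

SECRET = b"s3cr3t"  # stands in for the value hidden in the binary (hypothetical)


def check(guess: bytes):
    """Hypothetical byte-by-byte check; returns (ok, bytes compared)."""
    compared = 0
    if len(guess) != len(SECRET):
        return False, compared
    for g, s in zip(guess, SECRET):
        compared += 1  # in a debugger: one more breakpoint hit
        if g != s:
            return False, compared
    return True, compared


def recover(length: int, alphabet: bytes) -> bytes:
    known = b""
    for pos in range(length):
        for c in alphabet:
            guess = (known + bytes([c])).ljust(length, b"\x00")
            ok, compared = check(guess)
            # A correct byte lets the loop run one comparison further.
            if ok or compared > pos + 1:
                known += bytes([c])
                break
        else:
            raise ValueError(f"no candidate found for position {pos}")
    return known


alphabet = (string.ascii_letters + string.digits).encode()
print(recover(len(SECRET), alphabet))
```

Against a real target the count would come from a breakpoint hit counter or an instruction count, and the required length would first be found from the length check.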
Scripts
Tools
file
strings -t x -w
nm
objdump -D
strace
ltrace
readelf -a
- ANGR - symbolic execution
- Pintools - dynamic binary instrumentation / Wrapper
- Z3 - constraint solver
- Triton - combines the three above
- secREtary
- Radare2 / notes
- GDB GUI (in case you really love GUIs, can install with pip)
- Tenet (IDA-Pro plugin)
Static Analysis
- Ghidra
- Compiler Explorer
- HxD/Okteta/rehex
- IDA/JEB/RetDec
Dynamic Analysis
- Linux - GDB/Peda/PwnDbg/GEF
- Windows - OllyDbg/WinDbg/x32Dbg/x64Dbg/VS Debugger/Immunity Debugger
- WinDbg Preview
Practice
Links
Topics
Ghidra
- pwndra
- Ghidra Patch Diff Correlator Project
- Lots of helpful built-in scripts in script manager
- Customize layout with windows
IDA
- Use \ to get rid of casts
- Click on a variable and press n to rename
- F5 to decompile
- Shift+F12 for strings
- Click on a function and press x to see what calls that function (helpful for working backwards)
- Double click declarations at the top of the function to see the function’s stack
GDB
Command line arguments:
-ex "[cmd]" (run command on start)
-x [file] (run file with commands on start)
-p [pid] (attach to process)
Prompt commands:
r (run the program)
r arg1 arg2 $(python3 -c "print('\x61\x72\x67\x33')") (run program with arguments)
r < <(python3 -c "print('firstline\nsecondline')") (write to stdin)
info func (display all known functions, updates during program execution)
b main (break at function)
b *0x80482018 (break at address)
tb (temporary breakpoint)
so = tb *($rip+1)
info break (see breakpoint locations and IDs)
del break 1 (delete breakpoint with ID 1)
ignore 1 100 (ignore breakpoint 1, 100 times)
info stack
info reg
info files
info var
set [var] = ([type])[addr] (great for structs, use print to display contents)
bt (backtrace of where you are at currently)
up (useful after backtrace to go back through stack frames)
info frame (stack view)
p $esp (print value of stack pointer)
p $ebp (print value of base pointer)
x/wx $esp (print what is @ ESP)
x/18x $esp (display 18 double words at the stack pointer)
x/18x 0x7fffffffe0c0 (if you hit enter again, it will continue to print up in memory)
x/s 0x7fffffffe0c0 (print string at address)
set {int}0x7fffffffe0c0 = 0x12342018 (set memory value)
set $rip = 0x40082f (set register value)
set {char [4]} 0x08040000 = "Ace"
search AAAA
search -x [hex in little endian]
traceinst cmp (run until next cmp operation)
disass main (disassemble a function)
c (continue, run until next breakpoint)
s (step)
s 100 (step 100 times)
n (like step, but step over function calls)
fin (run until out of current function)
<enter> repeat last command
attach [proc_id] or gdb -p [proc_id] (use gdb to examine a running process)
define [alias] (create a list of aliased commands to run)
show user (see commands that are defined)
watch -l *(int *)0xaddr (rwatch for read, awatch for read/written)
start or starti (good for PIE)
b *($rebase(addr)) (good for PIE)
layout split/layout regs (see everything in the current view)
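For the `search -x [hex in little endian]` entry above, a value seen in a register has to be byte-swapped before searching memory for it. A small sketch using only the standard library; the helper name `le_hex` is made up here, and the exact argument format expected by `search` depends on which GDB extension is loaded.

```python
import struct


def le_hex(value: int, width: int = 8) -> str:
    """Return value as little-endian hex; width is the size in bytes (4 or 8)."""
    fmt = {4: "<I", 8: "<Q"}[width]
    return struct.pack(fmt, value).hex()


print(le_hex(0x7fffffffe0c0))       # 8-byte pointer -> c0e0ffffff7f0000
print(le_hex(0x12342018, width=4))  # 4-byte value   -> 18203412
```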
PWNDBG
pwndbg (list commands)
arena (show heap)
bins
fastbins, smallbins, etc.
heap (best detail)
vmmap
search -s [string]
search -p [pointer]
vis
vis_heap_chunks [num] [addr_chunk]
x32Dbg/x64Dbg
downloadsym
gets symbols from Microsoft symbol store, allows you to specify library and store address as args, by default gets everything
WinDbg
- Cheatsheet
?/.help/.hh [text] (help)
g/F5 (go)
p/F10 (step over)
t/F8 (step into)
pt (step out)
u (unassemble)
d(b/w/d/q/...) [L#] (display)
- ex. dd poi(esp)
dt (display type)
- ex. dt -r ntdll!_TEB @$teb
- ex. ?? sizeof(ntdll!_TEB)
e(b/w/d/q/...) (edit)
s (search)
r (registers)
r eax=[val]
b(p/u/a/l/c/...) (breakpoints)
- ex. bp ntdll!Function ".if (poi(eax + 0x08) == 0) {g} .else {.printf \"Message: %p\", poi(esp);.echo;g}"
lm (loaded modules)
? ff - 8 (evaluate expression)
- pseudo-registers
Windows
- Often the main function in Ghidra is toward the end of the entry function, in a line like uVar4 = FUN_004010000(), and main is the first function in the text section
VBA
.NET
- Decompile An Assembly In C#
- Flare VM
- Don’t forget about wide characters when looking for strings (-e b/-e l)
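GNU strings takes `-e l` for 16-bit little-endian and `-e b` for 16-bit big-endian characters; Windows binaries mostly use the little-endian form. When strings is not available, a rough equivalent can be sketched in Python. The function name `wide_strings` and the sample bytes are made up for illustration, and only printable ASCII encoded as UTF-16LE is matched.

```python
import re


def wide_strings(data: bytes, min_len: int = 4):
    """Yield printable UTF-16LE strings of at least min_len characters."""
    # One printable ASCII byte followed by a zero byte, repeated.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("utf-16-le")


sample = b"\x01\x02" + "Password".encode("utf-16-le") + b"\xff\xff" + b"abc"
print(list(wide_strings(sample)))
```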
Linux
ELF Sections
.text: code.
.data: initialised data.
.rodata: initialised read-only data.
.bss: uninitialized data.
.plt: PLT (Procedure Linkage Table) (IAT equivalent).
.got: GOT entries dedicated to dynamically linked global variables.
.got.plt: GOT entries dedicated to dynamically linked functions.
.symtab: global symbol table.
.dynamic: Holds all needed information for dynamic linking.
.dynsym: symbol tables dedicated to dynamically linked symbols.
.strtab: string table of .symtab section.
.dynstr: string table of .dynsym section.
.interp: RTLD embedded string.
.rel.dyn: global variable relocation table.
.rel.plt: function relocation table.
Java / Android
- jadx-gui
- jd-gui
- All of the java stuff here and here
- First, look at AndroidManifest.xml
- Second, lib/
- jadx/apktool/rej.jar/jdb on linux – jdb overview – jdb examples
- d2j-dex2jar converts apk to jar
- compile java using javac -g XXX.java
ARM
* See below for running ARM binaries *
Different Architecture
- QEMU/QIRA – Monitor
- Debugging with QEMU
- 1st terminal: qemu-system-x86_64 -s -S -k en-us -m 512 -drive format=raw,file=floppy.img
- or: qemu-system-mips -M mips -s -S -bios ./flash -nographic -m 16M -monitor /dev/null
- 2nd terminal: gdb; target remote localhost:1234; c
- May have to use: gdb-multiarch -ex "set architecture [arch]" -ex "set endian [endian]" -ex "target remote localhost:1234" # (or put commands in file and use -x)
- ARM:
- RUNNING ARM BINARIES ON X86 WITH QEMU-USER
- ex:
qemu-arm -L /usr/arm-linux-gnueabihf -g 1234 ./binary
gdb-multiarch -q -ex 'set architecture arm' -ex 'file ./binary' -ex 'target remote localhost:1234'
- pwntools ex:
r = process(["qemu-arm", "-L", "/usr/arm-linux-gnueabihf", "-g", "1234", "./binary"])
gdb-multiarch -q -ex 'set architecture arm' -ex 'file ./binary' -ex 'target remote localhost:1234'
C++
Rust
GoLang
Compiled Python
- .pyc extension
- pycdc
- uncompyle
- uncompyle6
- decompyle3
- If you want to cheat because you’re totally getting around to installing that Python2 virtualenv but just don’t have it on this machine right now:
import dis;dis.dis(codeObject)
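To get a code object out of a .pyc for `dis.dis`, the file header has to be skipped before unmarshalling. The sketch below assumes the .pyc was produced by the same Python 3.7+ interpreter that runs the script (the header is 16 bytes there; older versions use shorter headers, and marshal data is not portable across versions). The helper name `load_code` and the throwaway demo module are made up for illustration.

```python
import dis
import io
import marshal
import os
import py_compile
import tempfile


def load_code(pyc_path: str):
    """Load the top-level code object from a .pyc built by this Python (3.7+)."""
    with open(pyc_path, "rb") as f:
        f.read(16)  # skip header: magic, flags, then mtime+size or hash
        return marshal.load(f)


# Build a throwaway .pyc so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo.py")
    with open(src, "w") as f:
        f.write("def check(s):\n    return s == 'flag'\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "demo.pyc"))
    code_object = load_code(pyc)

out = io.StringIO()
dis.dis(code_object, file=out)  # also disassembles nested functions
print(out.getvalue())
```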
JavaScript
WASM
- You should be able to see the source in developer tools
- Video writeup to manipulate the code and run it
x86
- Return value stored in *AX
32-bit Stack layout
Highest Address
-------------
| arg n |
| arg ... |
| arg 1 |
| arg 0 |
-------------
| ret |
-------------
| old_ebp | <-ebp
-------------
| var 0 |
| var 1 |
| var ... |
| var n | <-esp
-------------
Lowest Address
64-bit integer/pointer argument order (System V AMD64 calling convention, e.g. Linux):
rdi, rsi, rdx, rcx, r8, r9, then stack
Pseudo-instructions
call x
- push return addr
- mov eip, x
push x
- sub esp, 4
- mov [esp], x
ret
- pop eip
pop x
- mov x, [esp]
- add esp, 4
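The pseudo-instructions above can be traced with a toy model. This is a sketch only: the class `Cpu32`, the starting register values, and the 5-byte call length (the size of a near `call rel32`) are assumptions for illustration, and memory is just a dictionary keyed by address.

```python
class Cpu32:
    """Toy model: two registers plus a dict acting as memory."""

    def __init__(self, esp: int = 0x1000, eip: int = 0x400000):
        self.esp, self.eip, self.mem = esp, eip, {}

    def push(self, value: int) -> None:
        self.esp -= 4                   # sub esp, 4
        self.mem[self.esp] = value      # mov [esp], x

    def pop(self) -> int:
        value = self.mem[self.esp]      # mov x, [esp]
        self.esp += 4                   # add esp, 4
        return value

    def call(self, target: int, insn_len: int = 5) -> None:
        self.push(self.eip + insn_len)  # push return addr
        self.eip = target               # mov eip, x

    def ret(self) -> None:
        self.eip = self.pop()           # pop eip


cpu = Cpu32()
cpu.call(0x401000)
print(hex(cpu.eip), hex(cpu.esp))  # 0x401000 0xffc
cpu.ret()
print(hex(cpu.eip), hex(cpu.esp))  # 0x400005 0x1000
```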
ANGR
- Helper
- ANGR is not great with C++
My process for ANGR
- Setup
- Docker image
alias angr='sudo docker run --rm -v "$(pwd)"/files:/mnt -it angr/angr' (place in ~/.bashrc)
$ mkdir files
$ cp [binary] files/
$ wget -P files https://raw.githubusercontent.com/m4dSt4cks/CTF_Resources/master/Reversing/angr_helper.py
$ code files
$ angr
(angr) cd /mnt
(angr) python angr_helper.py
Dynamic Binary Instrumentation
Tools
Z3
Pintools
- Binary instrumentation framework from intel
- Wrapper
GDB Python
* See other files *
Ghidra Python
* See other files *
Packed
UPX
upx -d [file]
- Oneshot?
IDA
- Run in IDA debugger until you get to Original Entry Point
- Debugger->Take memory snapshot
- Stop debugger
- Options->General->Analysis->Reanalyze program
Patching
- Use a disassembler and a hex editor (unless it does patching directly)
- Use pwntools
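As a plain-Python alternative to the options above, a same-length byte patch can be sketched with the standard library. The helper `patch_bytes` and the instruction bytes are made up for illustration; the example flips a short `jne` (0x75) to `je` (0x74).

```python
def patch_bytes(image: bytes, offset: int, expected: bytes, replacement: bytes) -> bytes:
    """Return a copy of image with replacement written at offset.

    Refuses to patch if the bytes at offset are not the expected ones or if
    the sizes differ, since shifting later bytes would corrupt the binary.
    """
    if len(expected) != len(replacement):
        raise ValueError("patch must keep the same length")
    if image[offset:offset + len(expected)] != expected:
        raise ValueError("unexpected bytes at offset, wrong offset?")
    patched = bytearray(image)
    patched[offset:offset + len(replacement)] = replacement
    return bytes(patched)


# Hypothetical snippet: cmp eax, 0 ; jne +5 ; nop
code = bytes.fromhex("83f800" "7505" "90")
patched = patch_bytes(code, 3, b"\x75", b"\x74")
print(patched.hex())  # 83f800740590
```

For a real binary, read the file in "rb" mode, patch, and write the result to a new file. Note that the offset is a file offset, which generally differs from the virtual address shown in a disassembler.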
This post is licensed under CC BY 4.0 by the author.