Reverse engineering is the process of analyzing a system, software, or device to understand its
internal workings, design, and functionality without access to its source code or documentation.
In cybersecurity and malware analysis, it's an essential skill for understanding how malicious
software operates.
Forward vs Reverse Engineering
Forward Engineering: Design → Source Code → Binary → Execution. You control the entire process.
Reverse Engineering: Binary → Analysis → Understanding → Recreation. You work backwards from the result.
Applications in Malware Analysis
Threat Intelligence: Understanding attack vectors and techniques
Incident Response: Analyzing suspicious files and binaries
Detection Engineering: Creating signatures and behavioral rules
Attribution: Identifying malware families and threat actors
Legal Considerations: Reverse engineering may be restricted by law in some
jurisdictions (DMCA, CFAA in the US). Always ensure you have proper authorization and are working
in a controlled environment. Only analyze malware in isolated lab environments.
Ethics Note: Reverse engineering skills are powerful. Use them responsibly
for defensive security, research, and education - never for unauthorized access or malicious purposes.
2. Assembly Language Fundamentals
Assembly language is a low-level programming language that provides a human-readable representation
of machine code. Understanding assembly is crucial for reverse engineering because compiled programs
are ultimately executed as machine instructions.
x86/x64 Registers
Registers are small, fast storage locations inside the CPU. The main registers are listed below by their 64-bit/32-bit names and traditional roles:
RAX/EAX: Accumulator
RBX/EBX: Base
RCX/ECX: Counter
RDX/EDX: Data
RSI/ESI: Source Index
RDI/EDI: Destination Index
RBP/EBP: Base Pointer
RSP/ESP: Stack Pointer
RIP/EIP: Instruction Pointer
Common Instructions
Instruction | Syntax | Description | Example
MOV | MOV dest, src | Copy data from source to destination | MOV rax, 5
PUSH | PUSH value | Push value onto the stack | PUSH rbx
POP | POP dest | Pop value from stack to destination | POP rcx
CALL | CALL address | Call a function at address | CALL 0x401000
RET | RET | Return from function | RET
JMP | JMP address | Unconditional jump to address | JMP 0x401020
JE/JZ | JE address | Jump if equal (or zero) | JE 0x401030
JNE/JNZ | JNE address | Jump if not equal (or not zero) | JNE 0x401040
CMP | CMP op1, op2 | Compare two operands (sets flags) | CMP rax, 10
TEST | TEST op1, op2 | Bitwise AND (sets flags, doesn't store) | TEST rax, rax
ADD | ADD dest, src | Add source to destination | ADD rax, 5
SUB | SUB dest, src | Subtract source from destination | SUB rax, 3
XOR | XOR dest, src | Exclusive OR operation | XOR rax, rax
LEA | LEA dest, [address] | Load effective address | LEA rax, [rbp-8]
Calling Conventions
Windows x64 (Microsoft ABI)
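First 4 integer args: RCX, RDX, R8, R9
First 4 floating point args: XMM0, XMM1, XMM2, XMM3 (slots are positional: the Nth argument uses either the Nth integer register or the Nth XMM register)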
Additional args: pushed on stack (right to left)
Return value: RAX (or XMM0 for floating point)
Caller must allocate 32 bytes of shadow space
Linux x64 (System V ABI)
First 6 integer args: RDI, RSI, RDX, RCX, R8, R9
Floating point args: XMM0-XMM7
Additional args: pushed on stack (right to left)
Return value: RAX (or XMM0 for floating point)
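When you see a CALL in a disassembly, knowing the convention tells you where to look for the arguments. The helper below is a minimal sketch of that lookup for integer and pointer arguments only. It ignores floating point values, structs passed by value, and variadic functions, and the names `INT_ARG_REGS` and `place_integer_args` are made up for this example.

```python
# Integer/pointer argument registers, in order, for each convention.
INT_ARG_REGS = {
    "windows_x64": ["RCX", "RDX", "R8", "R9"],
    "sysv_x64": ["RDI", "RSI", "RDX", "RCX", "R8", "R9"],
}


def place_integer_args(abi, arg_count):
    """Return where each of the first arg_count integer args is passed."""
    regs = INT_ARG_REGS[abi]
    placements = []
    for index in range(arg_count):
        if index < len(regs):
            placements.append(regs[index])
        else:
            # Arguments beyond the register set are passed on the stack.
            placements.append("stack[%d]" % (index - len(regs)))
    return placements


print(place_integer_args("windows_x64", 5))
print(place_integer_args("sysv_x64", 3))
```

A five-argument call therefore puts its last argument on the stack on Windows, while the same call on Linux still fits entirely in registers.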
3. Disassemblers and Debuggers
Tools are essential for reverse engineering. Disassemblers convert machine code back into
assembly language, while debuggers allow you to execute and inspect programs step-by-step.
Tool Comparison Chart
IDA Pro: Industry standard disassembler; excellent decompiler (Hex-Rays); cross-platform support; extensive plugin ecosystem; commercial (expensive)
Ghidra: Free and open source (NSA); built-in decompiler; multi-architecture support; collaborative features; Java-based (requires a JDK)
x64dbg: Modern Windows debugger; user-friendly interface; active development; plugin support; free and open source
OllyDbg: Classic Windows debugger; simple interface; good for malware analysis; 32-bit focused; development ceased
GDB: GNU debugger for Linux; command-line based; powerful scripting; GEF/PEDA extensions; free and open source
Binary Ninja: Modern disassembler; fast and responsive UI; built-in IL (intermediate language); Python API; commercial (affordable)
Typical Workflow
Step 1: Static Analysis (Disassembler)
Load binary into IDA/Ghidra
Analyze strings and imports
Identify interesting functions
Study control flow graphs
Use decompiler for high-level view
Step 2: Dynamic Analysis (Debugger)
Load binary in x64dbg/GDB
Set breakpoints at key locations
Step through execution
Inspect memory and registers
Monitor API calls and behavior
Step 3: Iterate
Alternate between static and dynamic analysis
Verify hypotheses from static with dynamic behavior
Pro Tip: Start with static analysis to understand the overall structure,
then use dynamic analysis to verify your hypotheses and understand runtime behavior.
The combination of both approaches is more powerful than either alone.
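The "analyze strings" step of static analysis is easy to try outside a disassembler. The following is a minimal sketch of a printable-ASCII string extractor, similar in spirit to the Unix `strings` utility. The function name and the sample bytes are invented for illustration, and real tools also handle UTF-16 and other encodings.

```python
import re


def extract_ascii_strings(data, min_length=4):
    """Return runs of printable ASCII at least min_length bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical bytes standing in for part of a binary file.
sample = b"\x00\x01kernel32.dll\x00\xffab\x00IsDebuggerPresent\x00"
print(extract_ascii_strings(sample))
```

Library and API names recovered this way are often the first hint about which functions deserve a closer look in the disassembler.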
4. Understanding Control Flow
Control flow refers to the order in which individual instructions are executed. Understanding
how high-level constructs (if/else, loops) translate to assembly is crucial for reverse engineering.
If/Else Statements in Assembly
; C Code:
if (x == 5) {
y = 10;
} else {
y = 20;
}
; Assembly equivalent:
CMP eax, 5 ; Compare x with 5
JNE else_block ; Jump if not equal
MOV ebx, 10 ; y = 10 (if block)
JMP end_if ; Skip else block
else_block:
MOV ebx, 20 ; y = 20 (else block)
end_if:
; Continue execution
Loop Recognition
; C Code: for (i = 0; i < 10; i++)
XOR ecx, ecx ; i = 0 (ECX is loop counter)
loop_start:
CMP ecx, 10 ; Compare i with 10
JGE loop_end ; Jump if greater or equal
; Loop body here
INC ecx ; i++
JMP loop_start ; Jump back to start
loop_end:
; Continue after loop

; C Code: while (x > 0)
while_start:
CMP eax, 0 ; Compare x with 0
JLE while_end ; Jump if less or equal
; Loop body here
JMP while_start ; Jump back to condition
while_end:
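Both loops above share one visual signature: a jump whose target lies at a lower address than the jump itself. A rough heuristic for spotting loops in a listing is to look for those backward jumps. The sketch below assumes instructions are given as (address, mnemonic, jump target) tuples. The addresses are hypothetical values chosen for the for-loop example, and `find_backward_jumps` is an illustrative name.

```python
def find_backward_jumps(instructions):
    """Return (jump_address, target_address) for jumps that go backwards."""
    backward = []
    for address, mnemonic, target in instructions:
        is_jump = mnemonic.upper().startswith("J")
        if is_jump and target is not None and target <= address:
            backward.append((address, target))
    return backward


# The for loop from the listing above, with made-up addresses.
FOR_LOOP = [
    (0x1000, "XOR", None),    # i = 0
    (0x1002, "CMP", None),    # loop_start
    (0x1005, "JGE", 0x100B),  # forward jump: leaves the loop
    (0x1007, "INC", None),    # i++
    (0x1009, "JMP", 0x1002),  # backward jump: closes the loop
    (0x100B, "NOP", None),    # loop_end
]

print(find_backward_jumps(FOR_LOOP))
```

Only the JMP at the bottom is reported. The forward JGE is the loop exit, which is the usual shape of a compiled for or while loop.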
Function Calls and Returns
; Function prologue and epilogue (code inside the called function)
PUSH rbp ; Save base pointer
MOV rbp, rsp ; Set up new stack frame
SUB rsp, 32 ; Allocate stack space
; Function body
MOV rsp, rbp ; Restore stack pointer
POP rbp ; Restore base pointer
RET ; Return to caller
Guided Walkthrough: Building Your First CFG
Let's decompose a simple assembly snippet into a control flow graph (CFG), step by step.
CMP eax, 0 ; Compare eax with zero
JE is_zero ; Jump if equal
MOV ebx, 1 ; Set ebx to 1
JMP done ; Skip to end
is_zero:
XOR ebx, ebx ; Set ebx to 0
done:
RET ; Return
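One common way to derive the graph is to mark "leaders" (the first instruction, every jump target, and every instruction that follows a jump), cut the listing into basic blocks at those leaders, and then connect the blocks. The sketch below applies that idea to the snippet above. It is a simplified illustration rather than how any particular disassembler works, and the tuple layout and the name `build_cfg` are assumptions made for this example.

```python
# (label, mnemonic, operand) for each instruction of the snippet above.
SNIPPET = [
    (None, "CMP", "eax, 0"),
    (None, "JE", "is_zero"),
    (None, "MOV", "ebx, 1"),
    (None, "JMP", "done"),
    ("is_zero", "XOR", "ebx, ebx"),
    ("done", "RET", ""),
]

CONDITIONAL_JUMPS = {"JE", "JZ", "JNE", "JNZ", "JGE", "JLE", "JA"}


def build_cfg(instructions):
    """Return (blocks, edges); blocks are keyed by their first index."""
    count = len(instructions)
    label_index = {lab: i for i, (lab, _, _) in enumerate(instructions) if lab}

    # Step 1: find leaders, the instructions that start a basic block.
    leaders = {0}
    for i, (_, mnemonic, operand) in enumerate(instructions):
        if mnemonic == "JMP" or mnemonic in CONDITIONAL_JUMPS:
            leaders.add(label_index[operand])
            if i + 1 < count:
                leaders.add(i + 1)
        elif mnemonic == "RET" and i + 1 < count:
            leaders.add(i + 1)

    # Step 2: each block runs from one leader up to the next.
    starts = sorted(leaders)
    blocks = {}
    for n, start in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else count
        blocks[start] = list(range(start, end))

    # Step 3: add edges based on the last instruction of each block.
    edges = set()
    for start, members in blocks.items():
        last = members[-1]
        _, mnemonic, operand = instructions[last]
        if mnemonic == "JMP":
            edges.add((start, label_index[operand]))
        elif mnemonic in CONDITIONAL_JUMPS:
            edges.add((start, label_index[operand]))  # branch taken
            if last + 1 < count:
                edges.add((start, last + 1))          # fall through
        elif mnemonic != "RET" and last + 1 < count:
            edges.add((start, last + 1))              # fall through
    return blocks, edges


blocks, edges = build_cfg(SNIPPET)
print(sorted(blocks))
print(sorted(edges))
```

For this snippet the result is four blocks: the compare-and-branch block, the MOV/JMP block, the is_zero block, and the final RET. The first block has two outgoing edges, which is the diamond shape of an if/else.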
5. Recognizing C Constructs in Assembly
When reverse engineering, you'll often need to recognize how high-level C constructs
appear in assembly code. This skill helps you understand the program's logic faster.
Variables and Data Types
; int x = 5;
MOV DWORD PTR [rbp-4], 5
; char c = 'A';
MOV BYTE PTR [rbp-5], 41h ; 0x41 = 'A'
; long long value = 1000000;
MOV QWORD PTR [rbp-16], 0F4240h
; float f = 3.14;
MOVSS xmm0, DWORD PTR [.float_constant]
MOVSS DWORD PTR [rbp-20], xmm0
Structures and Arrays
; struct Person { int age; char name[20]; };
; person.age = 25;
LEA rax, [rbp-32] ; Load address of struct
MOV DWORD PTR [rax], 25 ; Set age field (offset 0)
; strcpy(person.name, "John"); (Windows x64 convention assumed)
LEA rax, [rbp-32] ; Load struct address
ADD rax, 4 ; Offset to name field
MOV rcx, rax ; First argument (RCX): destination
LEA rdx, [.string] ; Second argument (RDX): source string
CALL strcpy
; int arr[5]; arr[2] = 10;
LEA rax, [rbp-40] ; Load array base address
MOV DWORD PTR [rax+8], 10 ; arr + (2 * 4 bytes)
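The constants 4 and 8 in the listing above are field and element offsets, and you can check offsets like these without compiling any C. The sketch below uses Python's standard `ctypes` module to describe the same `Person` layout. It assumes a platform where `int` is 4 bytes, which holds for mainstream x86/x64 systems.

```python
import ctypes


class Person(ctypes.Structure):
    # Mirrors: struct Person { int age; char name[20]; };
    _fields_ = [
        ("age", ctypes.c_int),
        ("name", ctypes.c_char * 20),
    ]


age_offset = Person.age.offset            # offset used by MOV DWORD PTR [rax], 25
name_offset = Person.name.offset          # offset added before the strcpy call
struct_size = ctypes.sizeof(Person)
arr_2_offset = 2 * ctypes.sizeof(ctypes.c_int)  # arr[2] in an int array

print(age_offset, name_offset, struct_size, arr_2_offset)
```

Working in the other direction is the more common task: when a disassembly keeps accessing [reg], [reg+4], and [reg+24], those offsets are clues to the fields of a structure you have to reconstruct.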
String Operations
; strlen(str) - count characters until null terminator
MOV rdi, str_ptr ; String address in RDI
XOR rcx, rcx ; Counter = 0
strlen_loop:
CMP BYTE PTR [rdi+rcx], 0
JE strlen_done
INC rcx
JMP strlen_loop
strlen_done:
; Length now in RCX

; strcmp(str1, str2) - compare strings
MOV rsi, str1_ptr
MOV rdi, str2_ptr
strcmp_loop:
MOVZX rax, BYTE PTR [rsi]
MOVZX rdx, BYTE PTR [rdi]
CMP rax, rdx
JNE strcmp_done
TEST rax, rax
JE strcmp_done
INC rsi
INC rdi
JMP strcmp_loop
strcmp_done:
SUB rax, rdx ; Return difference
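A useful exercise when you meet loops like these is to rewrite them by hand in a high-level language and confirm the behavior matches the library function you suspect. The following is one such transliteration, with each line annotated with the instruction it models. Memory is represented as a Python `bytes` object and pointers as indexes into it, which is a simplification for illustration.

```python
def strlen_model(memory, start):
    """Model of the strlen loop: count bytes until the null terminator."""
    count = 0                            # XOR rcx, rcx
    while memory[start + count] != 0:    # CMP BYTE PTR [rdi+rcx], 0 / JE
        count += 1                       # INC rcx
    return count                         # length ends up in RCX


def strcmp_model(memory, s1, s2):
    """Model of the strcmp loop: return the difference at the first mismatch."""
    while True:
        a = memory[s1]                   # MOVZX rax, BYTE PTR [rsi]
        b = memory[s2]                   # MOVZX rdx, BYTE PTR [rdi]
        if a != b or a == 0:             # CMP/JNE, then TEST/JE
            return a - b                 # SUB rax, rdx
        s1 += 1                          # INC rsi
        s2 += 1                          # INC rdi


# Two null-terminated strings laid out back to back: "John" and "Joan".
memory = b"John\x00Joan\x00"
print(strlen_model(memory, 0))
print(strcmp_model(memory, 0, 5))
```

The comparison returns a positive number here because 'h' sorts after 'a', matching the sign convention of the C library's strcmp.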
Pointer Dereferencing
; int *ptr = &x;
LEA rax, [rbp-4] ; Get address of x
MOV QWORD PTR [rbp-16], rax ; Store in ptr
; *ptr = 10;
MOV rax, QWORD PTR [rbp-16] ; Load pointer value
MOV DWORD PTR [rax], 10 ; Dereference and assign
; y = *ptr;
MOV rax, QWORD PTR [rbp-16] ; Load pointer
MOV edx, DWORD PTR [rax] ; Dereference
MOV DWORD PTR [rbp-8], edx ; Store in y
Interactive: Match the C to Assembly
Drag the C code snippets to their corresponding assembly implementations:
6. Anti-Analysis Techniques
Malware authors employ various techniques to make analysis difficult. Understanding these
methods helps you recognize and defeat them during reverse engineering.
Anti-Debugging Tricks
; Technique 1: IsDebuggerPresent API
CALL IsDebuggerPresent
TEST eax, eax
JNE debugger_detected ; Jump if debugger present
; Technique 2: PEB (Process Environment Block) check
MOV rax, QWORD PTR gs:[60h] ; Get PEB address
CMP BYTE PTR [rax+2], 0 ; Check BeingDebugged flag
JNE debugger_detected
; Technique 3: Timing check
RDTSC ; Read timestamp counter into EDX:EAX
MOV r8, rax ; Save timestamp (low 32 bits only; high half is in EDX)
; Some instructions here
RDTSC ; Read again
SUB rax, r8 ; Calculate difference
CMP rax, 1000 ; Too slow = debugger?
JA debugger_detected
; Technique 4: Exception-based detection
INT 3 ; Software breakpoint
; If we reach here, exception was handled by debugger
; Check for VMware via I/O port
MOV eax, 564D5868h ; 'VMXh' magic value
MOV ebx, 0
MOV ecx, 10
MOV edx, 5658h ; 'VX'
IN eax, dx ; VMware I/O port
CMP ebx, 564D5868h ; Check response
JE vm_detected
; Check for VirtualBox via registry/files
LEA rcx, vbox_path
CALL GetFileAttributesA
CMP eax, -1
JNE vm_detected ; File exists = VirtualBox
; CPUID-based VM detection
MOV eax, 1
CPUID
BT ecx, 31 ; Check hypervisor bit
JC vm_detected
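Several of the checks above leave recognizable traces in a binary: API names in the import data, the VMware magic constant as an immediate operand, and the opcode bytes for RDTSC (0F 31) and CPUID (0F A2). The sketch below is a deliberately naive triage scan for those traces. It is an illustration only: short byte patterns produce false positives in real files, packed samples hide all of this, and the names `find_indicators` and the indicator lists are made up for this example.

```python
import struct

API_NAME_INDICATORS = [b"IsDebuggerPresent", b"GetFileAttributesA"]
OPCODE_INDICATORS = {
    "RDTSC": b"\x0f\x31",
    "CPUID": b"\x0f\xa2",
}
# 0x564D5868 ('VMXh') as it appears in little-endian instruction bytes.
VMWARE_MAGIC = struct.pack("<I", 0x564D5868)


def find_indicators(data):
    """Return labels for anti-analysis traces found in raw bytes."""
    hits = []
    for name in API_NAME_INDICATORS:
        if name in data:
            hits.append(name.decode("ascii"))
    for label, opcode in OPCODE_INDICATORS.items():
        if opcode in data:
            hits.append(label)
    if VMWARE_MAGIC in data:
        hits.append("VMware magic")
    return hits


# Hypothetical bytes: MOV eax, 564D5868h; RDTSC; then an API name string.
sample = b"\xb8" + VMWARE_MAGIC + b"\x0f\x31" + b"\x00IsDebuggerPresent\x00"
print(find_indicators(sample))
```

Treat a hit as a pointer to where to look in the disassembler, not as proof. Legitimate software also calls IsDebuggerPresent and executes CPUID.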
Common Obfuscation Patterns
Dead Code Insertion: Adding code that never executes
Opaque Predicates: Conditional branches with predetermined outcomes
Control Flow Flattening: Converting control flow into switch statements
String Encryption: Decrypting strings at runtime
API Hashing: Resolving APIs by hash instead of name
Instruction Substitution: Replacing simple instructions with complex equivalents
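API hashing is easier to recognize once you have seen it from the author's side. The sketch below shows the general idea using a rotate-right-by-13 additive hash, a scheme often seen in shellcode. This particular hash is an assumption chosen for illustration, and real samples use many variants. The function names and the export list are hypothetical.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Hash an API name: rotate the running hash right by 13, add each byte."""
    result = 0
    for byte in name.encode("ascii"):
        result = (ror32(result, 13) + byte) & 0xFFFFFFFF
    return result


def resolve_by_hash(target_hash, export_names):
    """Walk an export list and return the name whose hash matches."""
    for name in export_names:
        if ror13_hash(name) == target_hash:
            return name
    return None


# The malware stores only the hash; the analyst rebuilds the mapping.
exports = ["CreateFileA", "IsDebuggerPresent", "VirtualAlloc"]
stored_hash = ror13_hash("VirtualAlloc")
print(hex(stored_hash), resolve_by_hash(stored_hash, exports))
```

In a disassembly this shows up as a loop over a DLL's export table with a rotate and an add inside it, followed by a comparison against a 32-bit constant. Precomputing hashes for common API names lets you label those constants.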
Analysis Warning: When encountering anti-analysis techniques, document them
carefully. Use patches, plugins, or modified environments to bypass them. Always maintain
detailed notes about what techniques were used and how you defeated them.
7. Practical Workflow and Best Practices
A systematic approach to reverse engineering increases efficiency and ensures you don't
miss critical details. Here's a proven workflow used by professional malware analysts.
Congratulations! You've completed the Reverse Engineering Basics module. You now have
foundational knowledge of assembly language, disassemblers, debuggers, and common
anti-analysis techniques.
Next Steps:
Practice analyzing real malware samples in a safe lab environment