Slide 1

Slide 1 text

[email protected] Malware Sandbox Emulation in Python

Slide 2

Slide 2 text

>_cat ./Bio
• Master's degree, CSIE, NTUST
• Security Researcher - chrO.ot, TDOHacker
• Speaker - BlackHat, DEFCON, VXCON, HITCON, SITCON, iThome

Slide 3

Slide 3 text

>_cat ./intro
1. The challenges of Anti-Virus techniques
2. The implementation of WinAPI CreateProcess()
3. Build our own emulator for exe files (PE)
   - CPU? Emulate a thread via Unicorn in Python
   - Deal with exe file mapping, IAT, EAT in our sandbox
   - Challenges of a malware sandbox, and solutions
4. Recap

Slide 4

Slide 4 text

The challenges of Anti-Virus techniques

Slide 5

Slide 5 text

>_Anti-Virus
• Today's Anti-Virus products verify files with a technique called Malware Signature Detection
• Anti-Virus products store all virus signatures in a cloud database
• The best-known rule format for malware signatures is YARA
[Diagram: AntiVirus scanning the File System: avideo.scr, finance.docx, malware.exe]

Slide 6

Slide 6 text

>_YARA-Rule
rule silent_banker : banker
{
    meta:
        description = "This is just an example"
        threat_level = 3
        in_the_wild = true
    strings:
        $a = {6A 40 68 00 30 00 00 6A 14 8D 91}
        $b = {8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9}
        $c = "UVODFRYSIHLNWPEJXQZAKCBGMT"
    condition:
        $a or $b or $c
}
virustotal.github.io/yara
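The rule's condition ($a or $b or $c) boils down to searching the file for fixed byte patterns. A minimal sketch of that idea in plain Python follows; it is not the YARA engine, and the helper names `matched_patterns` and `is_silent_banker` are made up for illustration. The last lines also hint at the weakness of signatures: changing a single byte of a pattern is enough for a variant to slip past.

```python
# Hypothetical helpers: only the byte patterns come from the rule on this slide.
SIGNATURES = {
    "$a": bytes.fromhex("6A 40 68 00 30 00 00 6A 14 8D 91"),
    "$b": bytes.fromhex("8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9"),
    "$c": b"UVODFRYSIHLNWPEJXQZAKCBGMT",
}


def matched_patterns(data):
    """Return the names of all patterns that occur anywhere in data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]


def is_silent_banker(data):
    """Mirror the rule's condition: $a or $b or $c."""
    return bool(matched_patterns(data))


# A fake "file" that embeds pattern $b between padding bytes.
sample = b"\x90" * 8 + SIGNATURES["$b"] + b"\xCC" * 4
# A "variant" of the same file with one byte of the pattern changed.
variant = sample.replace(b"\x6A\x4E", b"\x6A\x4F")

print(matched_patterns(sample))
print(is_silent_banker(variant))
```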

Slide 7

Slide 7 text

>_Challenge?
As malware technology improves and more malware source code is released, a large number of malware variants have appeared in the wild. ┐(´∀`)┌
This makes it hard for Anti-Virus products to detect every malicious file with the Malware-Signature-Detection technique.
[Diagram: AntiVirus scanning the File System: ma1ware.docx, malware.exe, m4lW4.re.scr]

Slide 8

Slide 8 text

What ...? Only Detect via Signature?

Slide 9

Slide 9 text

>_Solution?

Slide 10

Slide 10 text

>_Solution?

Slide 11

Slide 11 text

>_from AV to RCE landave.io/2018/06/f-secure-anti-virus-remote-code-execution-via-solid-rar-unpacking

Slide 12

Slide 12 text

A Cool Idea: Building a malware sandbox?

Slide 13

Slide 13 text

>_Cuckoo? cuckoosandbox.org

Slide 14

Slide 14 text

>_Cuckoo? cuckoosandbox.org

Slide 15

Slide 15 text

A Cool Idea: Building a malware emulator?

Slide 16

Slide 16 text

>_AV = more RCE? googleprojectzero.blogspot.com/2015/06/analysis-and-exploitation-of-eset.html

Slide 17

Slide 17 text

>_Sandbox? •Cuckoo-like Sandbox •Emulator-like Sandbox •VM-like Sandbox

Slide 18

Slide 18 text

>_Sandbox? •Cuckoo-like Sandbox •Emulator-like Sandbox •VM-like Sandbox

Slide 19

Slide 19 text

The Final Goal ⚑ Get all the behavior of malware files without execution

Slide 20

Slide 20 text

General Compiler

Slide 21

Slide 21 text

General Compiler
Source.cpp → (Compiler) → Assembly Codes → (Assembler) → Object Files → (Linker) → Main.exe

Slide 22

Slide 22 text

>_cat msgbox.c
#include <windows.h>

int main()
{
    MessageBoxA(0, "hi there.", "info", 0);
    return 0;
}

Slide 23

Slide 23 text

based on x86 Calling Convention
#include <windows.h>

int main()
{
    MessageBoxA(0, "hi there.", "info", 0);
    return 0;
}

Arguments are pushed right to left:
    push 0
    push "info"
    push "hi there."
    push 0
    call MessageBoxA
    xor eax, eax
    ret
en.wikipedia.org/wiki/X86_calling_conventions

Slide 24

Slide 24 text

>_Compiler
    push 0
    push "info"
    push "hi there."
    push 0
    call MessageBoxA
    xor eax, eax
    ret
.rdata section:
    0xdead: "info"
    0xbeef: "hi there."
.idata section (Import Address Table):
    0xcafe: 0x7630EA99

Slide 25

Slide 25 text

>_Compiler
    push 0
    push offset "info"
    push offset "hi there."
    push 0
    call MessageBoxA
    xor eax, eax
    ret
.rdata section:
    0xdead: "info"
    0xbeef: "hi there."
.idata section (Import Address Table):
    0xcafe: 0x7630EA99

Slide 26

Slide 26 text

>_Compiler
    push 0
    push 0x40dead
    push 0x40beef
    push 0
    call ds:0x40cafe
    xor eax, eax
    ret
.rdata section:
    0xdead: "info"
    0xbeef: "hi there."
.idata section (Import Address Table):
    0xcafe: 0x7630EA99

Slide 27

Slide 27 text

>_Assembler
    push 0            ; 6A 00
    push 0x40dead     ; 68 AD DE 40 00
    push 0x40beef     ; 68 EF BE 40 00
    push 0            ; 6A 00
    call ds:0x40cafe  ; FF 15 FE CA 40 00
    xor eax, eax      ; 33 C0
    ret               ; C3

Slide 28

Slide 28 text

Main.exe
.text Section: 6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3
.rdata Section: 0xdead: "info", 0xbeef: "hi there."
.idata Section: 0xcafe: 0x7630EA99
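The .text bytes above can be reproduced by hand, which makes the little-endian encoding of the 32-bit operands visible. This is only a sketch of the five encodings used in this example (the helper names are invented here), not a general assembler; the talk itself lists Keystone for real assembling.

```python
import struct


def push_imm8(value):
    """push imm8 -> opcode 6A followed by one signed byte."""
    return b"\x6A" + struct.pack("<b", value)


def push_imm32(value):
    """push imm32 -> opcode 68 followed by a little-endian dword."""
    return b"\x68" + struct.pack("<I", value)


def call_mem32(address):
    """call dword ptr [address] -> FF 15 followed by the slot address."""
    return b"\xFF\x15" + struct.pack("<I", address)


XOR_EAX_EAX = b"\x33\xC0"
RET = b"\xC3"

text_section = b"".join([
    push_imm8(0),          # uType
    push_imm32(0x40DEAD),  # "info"
    push_imm32(0x40BEEF),  # "hi there."
    push_imm8(0),          # hWnd
    call_mem32(0x40CAFE),  # call through the Import Address Table slot
    XOR_EAX_EAX,
    RET,
])

print(" ".join("%02X" % byte for byte in text_section))
```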

Slide 29

Slide 29 text

The Implementation of WinAPI CreateProcess()

Slide 30

Slide 30 text

>_Process?
1) Create the process via CreateProcess()
2) Map the file into memory
3) Create the first thread of this process: point register eax to AddressOfEntry, point ebx+8 (PEB base + 8) to the image base, and point eip to ntdll!LdrInitializeThunk
[Diagram: Kernel (ring0) / Application (ring3). Process: iexplorer.exe (.data section, .text section, AddressOfEntry), ntdll.dll, kernel32.dll, ...]

Slide 31

Slide 31 text

>_Process?
Process: iexplorer.exe (.text section), ntdll.dll, kernel32.dll, ...
ntdll!LdrInitializeThunk: fix import address table, fix export directory, apply relocation, etc.
Call Stack
--------------
_LdrpSnapModule
_LdrpMapAndSnapDependency
_LdrpMapDllWithSectionHandle
_LdrpLoadKnownDll
_LdrpFindOrPrepareLoadingModule
_LdrpLoadDllInternal
_LdrpLoadDll
_LdrLoadDll
_LdrpInitializeProcess
__LdrpInitialize
_LdrInitializeThunk
LdrpMapAndSnapDependency: fixes the import address table for every loaded DLL image

Slide 32

Slide 32 text

>_ntdll!LdrpSnapModule

Slide 33

Slide 33 text

>_Process?
ntdll!LdrInitializeThunk [email protected] Process iexplorer.exe .text section ntdll.dll kernel32.dll ... ntdll!RtlUserThreadStart
Note: RtlUserThreadStart is the entry point of every thread. We can hijack a thread by writing a shellcode address into the global variable 'LdrDelegatedRtlUserThreadStart'.

Slide 34

Slide 34 text

>_Quick Recap
1. When a process is created, the kernel maps each section into memory at its expected address
2. Next, the kernel creates a new thread for this process. This thread calls ntdll!LdrInitializeThunk to repair the Import Address Table (IAT), the Export Address Table (EAT) and relocations
3. Finally, the thread enters func@AddressOfEntry

Slide 35

Slide 35 text

Build our own emulator for *.exe file

Slide 36

Slide 36 text

>_for Emulator?
1. File Mapping - Code must be able to fetch data from the other sections (e.g., code in one section reading data in another)
2. Repair IAT, Relocation - Repair the Import Address Table so that the malware calls WinAPI at the correct addresses. In an emulator it is easy to place the PE image at its expected image base, so we do not need to care about relocation.
3. Thread Simulation - We need to create a fake CPU unit that runs every single instruction, which lets us monitor all behavior.

Slide 37

Slide 37 text

>_Emu in Python
1. File Mapping - Code must be able to fetch data from the other sections (e.g., code in one section reading data in another) ➛ via PEFile.py
2. Repair IAT - Repair the Import Address Table so that the malware calls WinAPI at the correct addresses ➛ via Unicorn.py + PEFile.py + Keystone.py
3. Thread Simulation - We need to create a fake CPU unit that runs every single instruction, which lets us monitor all behavior ➛ via Unicorn.py

Slide 38

Slide 38 text

>_Unicorn.py
Unicorn is a lightweight multi-platform, multi-architecture CPU emulator framework. Highlight features:
• Multi-architectures: Arm, Arm64, M68K, Mips, Sparc, x86, & x86_64.
• Implemented in pure C language, with bindings for Crystal, Clojure, Visual Basic, Perl, Rust, Haskell, Ruby, Python, Java, Go, .NET, Delphi/Pascal & MSVC available.
• Native support for Windows & *nix (with Mac OSX, Linux, *BSD & Solaris confirmed).
• High performance by using Just-In-Time compiler technique.
• Thread-safe by design.
...
Unicorn is based on QEMU, but it goes much further with a lot more to offer.

Slide 39

Slide 39 text

>_Unicorn.py

Slide 40

Slide 40 text

Challenge 1 >_File Mapping

Slide 41

Slide 41 text

Exe File (PE)
DOS Program
NtHeader
    File Header: .NumberOfSections
    OptionalHeader: .AddressOfEntryPoint, .ImageBase (0x400000), .SizeOfHeaders, .SizeOfImage, .DataDirectory
Section Header 1 (.text)
Section Header 2
Section Header 3
    sizeof(Section Header) = IMAGE_SIZEOF_SECTION_HEADER = 40 (fixed)
    (everything up to here is covered by SizeOfHeaders)
Section Data 1 (.text)
...

Slide 42

Slide 42 text

Exe File (PE)
DOS Program, NtHeader, Section Header 1 (.text), Section Header 2, Section Header 3, Section Data 1 (.text), ...
SectionHeader[index] = PIMAGE_SECTION_HEADER(
    NtHeader + sizeof(IMAGE_NT_HEADERS) + IMAGE_SIZEOF_SECTION_HEADER * index
);
IMAGE_SECTION_HEADER: .VirtualAddress, .SizeOfRawData, .PointerToRawData
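The pointer arithmetic on this slide can be sketched with the standard `struct` module alone. One assumption to flag: instead of `sizeof(IMAGE_NT_HEADERS)`, the sketch adds the 24 bytes of signature plus file header and the `SizeOfOptionalHeader` field, which gives the same offset for a standard 32-bit header. The function name `section_headers` is made up for illustration; the talk itself relies on pefile for this step.

```python
import struct

IMAGE_SIZEOF_SECTION_HEADER = 40


def section_headers(pe_bytes):
    """Return the fields needed for file mapping from every section header."""
    # e_lfanew at offset 0x3C of the DOS header points to the NT header.
    e_lfanew, = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    # File header fields, relative to the NT header.
    number_of_sections, = struct.unpack_from("<H", pe_bytes, e_lfanew + 6)
    size_of_optional_header, = struct.unpack_from("<H", pe_bytes, e_lfanew + 20)
    # The section table follows the optional header.
    table = e_lfanew + 24 + size_of_optional_header

    headers = []
    for index in range(number_of_sections):
        offset = table + IMAGE_SIZEOF_SECTION_HEADER * index
        name, virtual_size, virtual_address, raw_size, raw_pointer = \
            struct.unpack_from("<8sIIII", pe_bytes, offset)
        headers.append({
            "Name": name.rstrip(b"\0").decode("ascii", "replace"),
            "VirtualSize": virtual_size,
            "VirtualAddress": virtual_address,
            "SizeOfRawData": raw_size,
            "PointerToRawData": raw_pointer,
        })
    return headers
```

A typical call would be `section_headers(open(path, "rb").read())`, where `path` is whatever sample is being analysed.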

Slide 43

Slide 43 text

>_File Mapping

Slide 44

Slide 44 text

>_File Mapping
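File mapping amounts to copying the headers and each section's raw bytes to the section's VirtualAddress inside a buffer of SizeOfImage bytes. The sketch below does that on a plain bytearray so it runs without any dependency; `map_image` and the dictionary keys are illustrative names, and the tiny "file" at the bottom is synthetic. In the emulator, the resulting buffer would then be written at ImageBase with Unicorn's `mem_map` and `mem_write`.

```python
def map_image(pe_bytes, size_of_image, size_of_headers, sections):
    """Lay a PE file out the way the loader would, as a flat bytearray.

    sections is a list of dicts with VirtualAddress, SizeOfRawData and
    PointerToRawData, i.e. the three fields shown on the previous slides.
    """
    image = bytearray(size_of_image)

    # The headers keep the same offsets in memory as in the file.
    headers = pe_bytes[:size_of_headers]
    image[0:len(headers)] = headers

    for section in sections:
        start = section["PointerToRawData"]
        raw = pe_bytes[start:start + section["SizeOfRawData"]]
        rva = section["VirtualAddress"]
        if rva + len(raw) > size_of_image:
            raise ValueError("section does not fit into SizeOfImage")
        # File offset -> relative virtual address.
        image[rva:rva + len(raw)] = raw
    return image


# Synthetic example: 0x200 bytes of headers, then one section of raw data.
file_bytes = b"MZ" + bytes(0x1FE) + bytes.fromhex("6A 00 C3") + bytes(0x1FD)
sections = [{"Name": ".text", "VirtualAddress": 0x1000,
             "SizeOfRawData": 0x200, "PointerToRawData": 0x200}]
image = map_image(file_bytes, 0x2000, 0x200, sections)
print(image[0x1000:0x1003].hex())
```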

Slide 45

Slide 45 text

Challenge 2 >_Repair Import Address Table

Slide 46

Slide 46 text

>_ls ./agenda
Exe File (PE)
DOS Program
NtHeader
    File Header: .NumberOfSections
    OptionalHeader: .AddressOfEntryPoint, .ImageBase (0x400000), .SizeOfHeaders, .SizeOfImage, .DataDirectory
Section Header 1 (.text)
Section Header 2
Section Header 3
    sizeof(Section Header) = IMAGE_SIZEOF_SECTION_HEADER = 40 (fixed)
    (everything up to here is covered by SizeOfHeaders)
Section Data 1 (.text)
...

Slide 47

Slide 47 text

>_ls ./agenda
Exe File (PE): DOS Program, NtHeader, OptionalHeader, ...
OptionalHeader.DataDirectory: IMAGE_DATA_DIRECTORY[16]
    index 0:  Export Directory
    index 1:  Import Directory
    index 2:  Resource Directory
    index 3:  Exception Directory
    index 4:  Security Directory
    index 5:  Base Relocation Table
    ...
    index 12: Import Address Table
    ...
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
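Reading the DataDirectory array is a matter of locating it inside the optional header and unpacking (VirtualAddress, Size) pairs. A minimal sketch with the standard library follows; `data_directories` is a hypothetical helper name, and the offsets 96 and 112 are the positions of the array in the 32-bit (PE32) and 64-bit (PE32+) optional headers.

```python
import struct


def data_directories(pe_bytes):
    """Return {index: (VirtualAddress, Size)} for every data directory."""
    e_lfanew, = struct.unpack_from("<I", pe_bytes, 0x3C)
    optional = e_lfanew + 24              # signature (4) + file header (20)
    magic, = struct.unpack_from("<H", pe_bytes, optional)
    if magic == 0x10B:                    # PE32
        first = optional + 96
    elif magic == 0x20B:                  # PE32+
        first = optional + 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes sits right before the array.
    count, = struct.unpack_from("<I", pe_bytes, first - 4)

    entries = {}
    for index in range(min(count, 16)):
        # Each IMAGE_DATA_DIRECTORY is two DWORDs: VirtualAddress, Size.
        entries[index] = struct.unpack_from("<II", pe_bytes, first + 8 * index)
    return entries
```

With this layout, `data_directories(data)[1]` is the Import Directory and `data_directories(data)[12]` is the Import Address Table.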

Slide 48

Slide 48 text

>_ls ./agenda
Exe File (PE): DOS Program, NtHeader, OptionalHeader, ...
OptionalHeader.DataDirectory, index 12 (Import Address Table): .VirtualAddress, .Size
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

Slide 49

Slide 49 text

>_ls ./agenda DOS Program OptionalHeader NtHeader ... DataDirectory index Import Address Table ... 12 ... .VirtualAddress .Size IMAGE_IMPORT_DESCRIPTOR Exe File (PE)

Slide 50

Slide 50 text

IMAGE_IMPORT_DESCRIPTOR Array IMAGE_IMPORT_DESCRIPTOR 1 IMAGE_IMPORT_DESCRIPTOR 2 IMAGE_IMPORT_DESCRIPTOR 3 ... \x00\x00\x00\x00\x00\x00\x00 Fixed Size Fixed Size Fixed Size OptionalHeader DataDirectory index Import Address Table 12 .VirtualAddress .Size sizeof(IMAGE_IMPORT_DESCRIPTOR) NT Header sizeof(Descriptor Array) Exe File (PE)

Slide 51

Slide 51 text

IMAGE_IMPORT_DESCRIPTOR
typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD Characteristics;      // 0 for terminating null import descriptor
        DWORD OriginalFirstThunk;   // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
    } DUMMYUNIONNAME;
    DWORD TimeDateStamp;            // 0 if not bound,
                                    // -1 if bound, and real date\time stamp
                                    //   in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                    // O.W. date/time stamp of DLL bound to (Old BIND)
    DWORD ForwarderChain;           // -1 if no forwarders
    DWORD Name;
    DWORD FirstThunk;               // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;
typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;
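Since the structure is five DWORDs (20 bytes) and the array ends with an all-zero descriptor, walking it takes only a short loop. The sketch assumes `image` is an already mapped image indexed by RVA (for example the bytearray produced by a file-mapping step); `import_descriptors` and `read_cstring` are illustrative names, not part of pefile or Unicorn.

```python
import struct

# OriginalFirstThunk, TimeDateStamp, ForwarderChain, Name, FirstThunk
DESCRIPTOR = struct.Struct("<5I")


def read_cstring(image, rva):
    """Read a NUL-terminated ASCII string starting at rva."""
    end = image.index(b"\0", rva)
    return bytes(image[rva:end]).decode("ascii")


def import_descriptors(image, descriptor_rva):
    """Walk the IMAGE_IMPORT_DESCRIPTOR array until the null terminator."""
    result = []
    offset = descriptor_rva
    while True:
        original_first_thunk, _stamp, _chain, name_rva, first_thunk = \
            DESCRIPTOR.unpack_from(image, offset)
        if original_first_thunk == 0 and name_rva == 0 and first_thunk == 0:
            break                       # terminating null descriptor
        result.append({
            "Name": read_cstring(image, name_rva),
            "OriginalFirstThunk": original_first_thunk,
            "FirstThunk": first_thunk,
        })
        offset += DESCRIPTOR.size       # fixed size: 20 bytes per descriptor
    return result
```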

Slide 52

Slide 52 text

Exe File (PE): NT Header, OptionalHeader, DataDirectory index 12 (Import Address Table): .VirtualAddress, .Size
IMAGE_IMPORT_DESCRIPTOR 1: .OriginalFirstThunk, .FirstThunk, .Name (User32.dll)
IMAGE_IMPORT_BY_NAME: MessageBoxA
typedef struct _IMAGE_IMPORT_BY_NAME {
    WORD Hint;
    CHAR Name[1];
} IMAGE_IMPORT_BY_NAME;

Slide 53

Slide 53 text

xor eax, eax ret push 0 push 0x40dead push 0x40beef push 0 call ds:0x40cafe IMAGE_IMPORT_DESCRIPTOR 1 OptionalHeader DataDirectory index Import Address Table 12 .VirtualAddress .Size NT Header .OriginalFirstThunk .FirstThunk = 0xcafe .Name (User32.dll) IMAGE_IMPORT_BY_NAME: MessageBoxA Exe File (PE)

Slide 54

Slide 54 text

IMAGE_IMPORT_DESCRIPTOR 1 OptionalHeader DataDirectory index Import Address Table 12 .VirtualAddress .Size NT Header .FirstThunk = 0xcafe .FirstThunk .Name (User32.dll) IMAGE_IMPORT_BY_NAME: MessageBoxA HANDLE mod = LoadLibrary("User32.dll"); GetProcAddress(mod, "MessageBoxA") = 0x7547EA99 0x7547EA99 Exe File (PE)

Slide 55

Slide 55 text

Exe File (PE): NT Header, OptionalHeader, DataDirectory index 12 (Import Address Table): .VirtualAddress, .Size
IMAGE_IMPORT_DESCRIPTOR 1: .Name (User32.dll), .FirstThunk = 0xcafe
Thunk Array (each 32-bit thunk is 4 bytes wide):
    .Thunk = 0xcafe  MessageBoxA: *(uint32_t *)0xcafe = 0x7547EA99
    .Thunk = 0xcb02  LoadStringA: *(uint32_t *)0xcb02 = 0x75130D4D
    .Thunk = 0xcb06  KillTimer:   *(uint32_t *)0xcb06 = 0x754364C7
    .Thunk = 0xcb0a  Last Thunk:  *(uint32_t *)0xcb0a = NULL
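Repairing one DLL's thunk array can be sketched as follows: read each name through OriginalFirstThunk, ask a resolver for its address, and write that address into the matching 4-byte slot at FirstThunk + 4 * i. This assumes a 32-bit image mapped into a bytearray indexed by RVA; `repair_iat` and the `resolve` callback are hypothetical names. In an emulator, `resolve` would typically return the address of a fake API stub rather than a real DLL export.

```python
import struct

IMAGE_ORDINAL_FLAG32 = 0x80000000


def repair_iat(image, original_first_thunk, first_thunk, resolve):
    """Patch one DLL's 32-bit IAT in place and return {name: slot_rva}."""
    patched = {}
    index = 0
    while True:
        thunk, = struct.unpack_from("<I", image, original_first_thunk + 4 * index)
        if thunk == 0:                          # NULL thunk ends the array
            break
        if thunk & IMAGE_ORDINAL_FLAG32:
            name = "#%d" % (thunk & 0xFFFF)     # import by ordinal
        else:
            # IMAGE_IMPORT_BY_NAME: WORD Hint, then the NUL-terminated name.
            end = image.index(b"\0", thunk + 2)
            name = bytes(image[thunk + 2:end]).decode("ascii")
        slot = first_thunk + 4 * index
        struct.pack_into("<I", image, slot, resolve(name))
        patched[name] = slot
        index += 1
    return patched
```

For the slide's example, `resolve` could simply be the `__getitem__` of a dictionary such as `{"MessageBoxA": 0x7547EA99, ...}`.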

Slide 56

Slide 56 text

>_My Win32 Internal ˉ ̶̡̭̭ (´∀`๑) ˉ ̶̡̭̭

Slide 57

Slide 57 text

>_Repair IAT

Slide 58

Slide 58 text

>_Repair IAT

Slide 59

Slide 59 text

Challenge 3 >_Emulation Thread

Slide 60

Slide 60 text

>_Thread
Registers: eax 41414141, ebx 42424242, ecx 43434343, edx 44444444, ..., esp 7ffffffc, ebp 7ffffffc, eip 401000
Main.exe .text Section, addr @ 401000: 6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3

Slide 61

Slide 61 text

>_Periodic Table? sparksandflames.com/files/x86InstructionChart.html

Slide 62

Slide 62 text

>_Thread
addr @ 401000: 6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3
Decoded via the x86 Instruction Set:
    push 0
    push 0x40dead
    push 0x40beef
    push 0
    call ds:0x40cafe
    xor eax, eax
    ret
Registers: eax 41414141, ebx 42424242, ecx 43434343, edx 44444444, ..., esp 7ffffffc, ebp 7ffffffc, eip 401000
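To make the "fake CPU" idea concrete, here is a toy fetch-decode-execute loop that understands only the five opcodes used on this slide. It is a teaching sketch, not how Unicorn works internally: memory is one flat bytearray, the stack is placed just below 0x410000 instead of at 7ffffffc to keep the buffer small, and the indirect call is intercepted and logged instead of jumping into a DLL (the stdcall cleanup of four arguments is hard-coded). All names are invented for illustration.

```python
import struct


class TinyCpu:
    """Toy x86 interpreter for: push imm8/imm32, call [mem], xor eax,eax, ret."""

    def __init__(self, memory, eip, esp):
        self.mem, self.eip, self.esp = memory, eip, esp
        self.eax = 0x41414141
        self.api_calls = []          # (target address, [arg0..arg3])

    def push(self, value):
        self.esp -= 4
        struct.pack_into("<I", self.mem, self.esp, value & 0xFFFFFFFF)

    def pop(self):
        value, = struct.unpack_from("<I", self.mem, self.esp)
        self.esp += 4
        return value

    def step(self):
        opcode = self.mem[self.eip]
        if opcode == 0x6A:                                   # push imm8
            imm, = struct.unpack_from("<b", self.mem, self.eip + 1)
            self.eip += 2
            self.push(imm)
        elif opcode == 0x68:                                 # push imm32
            imm, = struct.unpack_from("<I", self.mem, self.eip + 1)
            self.eip += 5
            self.push(imm)
        elif opcode == 0xFF and self.mem[self.eip + 1] == 0x15:
            # call dword ptr [slot]: read the target from the IAT slot,
            # then log the call and pop the four stdcall arguments.
            slot, = struct.unpack_from("<I", self.mem, self.eip + 2)
            target, = struct.unpack_from("<I", self.mem, slot)
            self.eip += 6
            self.api_calls.append((target, [self.pop() for _ in range(4)]))
            self.eax = 1                                     # fake return value
        elif opcode == 0x33 and self.mem[self.eip + 1] == 0xC0:
            self.eax = 0                                     # xor eax, eax
            self.eip += 2
        elif opcode == 0xC3:                                 # ret
            self.eip = self.pop()
        else:
            raise NotImplementedError(hex(opcode))


EXIT = 0                                   # sentinel return address
mem = bytearray(0x410000)
code = bytes.fromhex("6A00 68ADDE4000 68EFBE4000 6A00 FF15FECA4000 33C0 C3")
mem[0x401000:0x401000 + len(code)] = code
struct.pack_into("<I", mem, 0x40CAFE, 0x7630EA99)            # IAT slot

cpu = TinyCpu(mem, eip=0x401000, esp=0x410000)
cpu.push(EXIT)
while cpu.eip != EXIT:
    cpu.step()
print(cpu.api_calls, hex(cpu.eax))
```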

Slide 63

Slide 63 text

>_hook

Slide 64

Slide 64 text

>_hook

Slide 65

Slide 65 text

>_Emulation

Slide 66

Slide 66 text

>_Downloader

Slide 67

Slide 67 text

Slide 68

Slide 68 text

Challenge 4 >_TEB, PEB, LDR, ...

Slide 69

Slide 69 text

>_Repair IAT

Slide 70

Slide 70 text

>_Challenge 4

Slide 71

Slide 71 text

Slide 72

Slide 72 text

>_Bypass blog.trendmicro.com/trendlabs-security-intelligence/new-emotet-hijacks-windows-api-evades-sandbox-analysis

Slide 73

Slide 73 text

>_AV Leak www.youtube.com/watch?v=a6yOwvFds78

Slide 74

Slide 74 text

>_Sandbox
• VM-like Sandbox
    • Slow, resource consuming
    • Easy to identify
• Loader-like Sandbox
    • Not easy to implement the whole system API
• Cuckoo-like Sandbox
    • Not an isolated environment, vulnerable

Slide 75

Slide 75 text

Thanks! [email protected] Slide Github @aaaddress1 Facebook