Malware Sandbox Emulation in Python

adr
August 01, 2018

PoC of the virtual machine designed for this session: github.com/aaaddress1/vtMal

When well-known anti-virus products scan for malware, they usually rely on known signatures to decide whether an executable file is malicious. Malware techniques keep improving, however. With the growing number of malware variants and new kinds of malware, signature-based detection, long the core technique of anti-virus software, can no longer stop every malicious program. Should we stop here and let attackers do whatever they want?

This session proposes a concept that differs from the signature-based detection at the core of traditional anti-virus software: a virtualization sandbox, written in Python, for running malware. Before the user actually opens and executes a program, it is first run inside a fake, virtual Windows OS environment, so it cannot infect the user's real OS environment.

The session shows how to implement a simple EXE emulator in Python (for a single executable running a single thread). It analyzes the Windows EXE structure (the PE format) and simulates how the system program loader (PE Loader) creates a new process, e.g. Section Mapping, IAT Apply, Section Relocation, etc. Unicorn Engine is used to manage memory and to emulate a thread executing CPU instructions, and all of the program's behavior is recorded inside the virtual environment.

Transcript

  1. aaaddress1@chroot.org Malware Sandbox Emulation in Python

  2. >_cat ./Bio
     • Master's degree in CSIE, NTUST
     • Security Researcher - chrO.ot, TDOHacker
     • Speaker - BlackHat, DEFCON, VXCON - HITCON, SITCON, iThome
  3. >_cat ./intro
     1. The challenges of Anti-Virus techniques
     2. The implementation of WinAPI CreateProcess()
     3. Build our own emulator for exe file (PE)
        - CPU? and emulate thread via Unicorn in Python
        - Deal with exe file mapping, IAT, EAT in our sandbox
        - Challenges of Malware Sandbox, and solution
     4. Recap
  4. aaaddress1@chroot.org The challenges of Anti-Virus techniques

  5. >_Anti-Virus
     • Popular Anti-Virus products verify files using a technique named Malware Signature Detection
     • Anti-Virus products store all virus signatures in a cloud database
     • The most famous rule format for malware signatures is YARA
     [Diagram: AntiVirus scanning the File System: avideo.scr, finance.docx, malware.exe]
  6. >_YARA-Rule (virustotal.github.io/yara)
     rule silent_banker : banker
     {
         meta:
             description = "This is just an example"
             threat_level = 3
             in_the_wild = true
         strings:
             $a = {6A 40 68 00 30 00 00 6A 14 8D 91}
             $b = {8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9}
             $c = "UVODFRYSIHLNWPEJXQZAKCBGMT"
         condition:
             $a or $b or $c
     }
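To make the idea of signature detection concrete, here is a minimal Python sketch that checks a buffer against the three patterns from the example rule on slide 6. It is not how YARA itself is implemented. `SIGNATURES`, `scan`, and the sample buffers are hypothetical names and data made up for illustration, and the rule's condition (`$a or $b or $c`) is approximated by reporting every pattern that occurs.

```python
# Patterns copied from the silent_banker example rule: two hex strings, one text string.
SIGNATURES = {
    "$a": bytes.fromhex("6A 40 68 00 30 00 00 6A 14 8D 91"),
    "$b": bytes.fromhex("8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9"),
    "$c": b"UVODFRYSIHLNWPEJXQZAKCBGMT",
}


def scan(data: bytes) -> list:
    """Return the names of all patterns found in data; non-empty means 'flagged'."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]


# Hypothetical sample: some filler bytes with pattern $b embedded in the middle.
sample = b"MZ" + b"\x90" * 16 + SIGNATURES["$b"] + b"\x00" * 8
# Hypothetical variant: the same file with one byte of the matched region changed.
variant = sample.replace(b"\x6A\x4E", b"\x6A\x4F")

print(scan(sample))   # ['$b']
print(scan(variant))  # []
```

Changing a single byte inside the matched region is enough to hide the variant from this scanner, which is the weakness of pure signature matching that the following slides discuss.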
  7. >_Challenge? With the improvement in malware technology, more and more malware source code is released, and a large number of malware variants have been developed in the wild. ┐(´∀`)┌ This makes it hard for Anti-Virus products to detect all malicious files with the Malware Signature Detection technique. [Diagram: AntiVirus scanning the File System: ma1ware.docx, malware.exe, m4lW4.re.scr]
  8. What ...? Only Detect via Signature?

  9. >_Solution?

  10. >_Solution?

  11. >_from AV to RCE landave.io/2018/06/f-secure-anti-virus-remote-code-execution-via-solid-rar-unpacking

  12. A Cool Idea: Building a malware sandbox?

  13. >_Cuckoo? cuckoosandbox.org

  14. >_Cuckoo? cuckoosandbox.org

  15. A Cool Idea: Building a malware emulator?

  16. >_AV = more RCE? googleprojectzero.blogspot.com/2015/06/analysis-and-exploitation-of-eset.html

  17. >_Sandbox? •Cuckoo-like Sandbox •Emulator-like Sandbox •VM-like Sandbox

  18. >_Sandbox? •Cuckoo-like Sandbox •Emulator-like Sandbox •VM-like Sandbox

  19. The Final Goal ⚑ Get all the behavior of malware files without execution
  20. aaaddress1@chroot.org General Compiler

  21. General Compiler: Source.cpp → Compiler → Assembly Codes → Assembler → Object Files → Linker → Main.exe
  22. >_cat msgbox.c
      #include <Windows.h>
      int main() {
          MessageBoxA( 0, "hi there.", "info", 0 );
          return 0;
      }
  23. based on x86 Calling Convention (en.wikipedia.org/wiki/X86_calling_conventions)
      #include <Windows.h>
      int main() {
          MessageBoxA( 0, "hi there.", "info", 0 );
          return 0;
      }
      push 0
      push "info"
      push "hi there."
      push 0
      call MessageBoxA
      xor eax, eax
      ret
  24. >_Compiler
      push 0
      push "info"
      push "hi there."
      push 0
      call MessageBoxA
      xor eax, eax
      ret
      .rdata section: 0xdead: "info", 0xbeef: "hi there."
      .idata section (Import Address Table): 0xcafe: 0x7630EA99
  25. >_Compiler
      push 0
      push offset "info"
      push offset "hi there."
      push 0
      call MessageBoxA
      xor eax, eax
      ret
      .rdata section: 0xdead: "info", 0xbeef: "hi there."
      .idata section (Import Address Table): 0xcafe: 0x7630EA99
  26. >_Compiler
      push 0
      push 0x40dead
      push 0x40beef
      push 0
      call ds:0x40cafe
      xor eax, eax
      ret
      .rdata section: 0xdead: "info", 0xbeef: "hi there."
      .idata section (Import Address Table): 0xcafe: 0x7630EA99
  27. >_Assembler
      push 0           ; 6A 00
      push 0x40dead    ; 68 AD DE 40 00
      push 0x40beef    ; 68 EF BE 40 00
      push 0           ; 6A 00
      call ds:0x40cafe ; FF 15 FE CA 40 00
      xor eax, eax     ; 33 C0
      ret              ; C3
  28. Main.exe
      .text Section: 6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3
      .rdata Section: 0xdead: "info", 0xbeef: "hi there."
      .idata Section: 0xcafe: 0x7630EA99
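The byte sequence in the .text section of slide 28 can be reproduced by hand, which shows how each instruction is encoded and why the addresses appear in little-endian order. The sketch below is only an illustration using the standard `struct` module: the helper names (`push_imm8`, `push_imm32`, `call_mem32`) are hypothetical, and it covers just the four instruction forms used in this example rather than a general assembler.

```python
import struct

IMAGE_BASE = 0x400000  # image base used throughout the slides


def push_imm8(value: int) -> bytes:
    """push imm8: opcode 6A followed by one byte."""
    return bytes([0x6A, value & 0xFF])


def push_imm32(value: int) -> bytes:
    """push imm32: opcode 68 followed by a little-endian dword."""
    return b"\x68" + struct.pack("<I", value)


def call_mem32(address: int) -> bytes:
    """call dword ptr [address]: FF 15 followed by the little-endian address."""
    return b"\xFF\x15" + struct.pack("<I", address)


code = b"".join([
    push_imm8(0),                     # uType
    push_imm32(IMAGE_BASE + 0xDEAD),  # "info" in .rdata
    push_imm32(IMAGE_BASE + 0xBEEF),  # "hi there." in .rdata
    push_imm8(0),                     # hWnd
    call_mem32(IMAGE_BASE + 0xCAFE),  # IAT slot of MessageBoxA in .idata
    b"\x33\xC0",                      # xor eax, eax
    b"\xC3",                          # ret
])

print(code.hex(" ").upper())
# 6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3
```

Note that the `call` goes through the memory slot at 0x40cafe instead of a fixed target; that indirection is what lets a loader (or an emulator) decide later where MessageBoxA actually lives.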
  29. aaaddress1@chroot.org The Implementation of WinAPI CreateProcess()

  30. >_Process? Kernel (ring0) / Application (ring3)
      1) create process via CreateProcess()
      2) map the file into memory
      3) create the first thread of this process, point register eax to AddressOfEntry, point ebx+8 (PEB base + 8) to the image base, and point eip to ntdll!LdrInitializeThunk
      [Diagram: Process containing iexplorer.exe (.data section, .text section, AddressOfEntry), ntdll.dll, kernel32.dll, ...]
  31. >_Process? [Diagram: Process containing iexplorer.exe (.text section), ntdll.dll, kernel32.dll, ...]
      ntdll!LdrInitializeThunk: fix import address table, fix export directory, apply relocation, etc.
      LdrpMapAndSnapDependency: fix import address table for every loaded dll image
      Call Stack
      --------------
      _LdrpSnapModule
      _LdrpMapAndSnapDependency
      _LdrpMapDllWithSectionHandle
      _LdrpLoadKnownDll
      _LdrpFindOrPrepareLoadingModule
      _LdrpLoadDllInternal
      _LdrpLoadDll
      _LdrLoadDll
      _LdrpInitializeProcess
      __LdrpInitialize
      _LdrInitializeThunk
  32. >_ntdll!LdrpSnapModule

  33. >_Process? ntdll!LdrInitializeThunk → ntdll!RtlUserThreadStart → AddressOfEntry@.text [Diagram: Process containing iexplorer.exe (.text section), ntdll.dll, kernel32.dll, ...] Note: RtlUserThreadStart is the entry point of every thread. We can hijack a thread by writing a shellcode address into the global variable 'LdrDelegatedRtlUserThreadStart'.
  34. >_Quick Recap
      1. Once a process is created, the kernel maps each section into memory at its expected address
      2. Next, the kernel creates a new thread for this process. This thread calls ntdll!LdrInitializeThunk to repair the Import Address Table (IAT), the Export Address Table (EAT), and relocations
      3. Finally, the thread enters func@AddressOfEntry
  35. aaaddress1@chroot.org Build our own emulator for *.exe file

  36. >_for Emulator?
      1. File Mapping - Code must be able to fetch data from the other sections (e.g., codes@.text read text@.rdata)
      2. Repair IAT, Relocation - Repair the Import Address Table so that malware calls WinAPI at the correct addresses. In an emulator it is easy to place the PE image at its expected image base, so we don't care about relocation.
      3. Thread Simulation - We need to create a fake CPU unit to run every single instruction, which allows us to monitor all behavior.
  37. >_Emu in Python
      1. File Mapping - Code must be able to fetch data from the other sections (e.g., codes@.text read text@.rdata) ➛ via PEFile.py
      2. Repair IAT - Repair the Import Address Table so that malware calls WinAPI at the correct addresses ➛ via Unicorn.py + PEFile.py + Keystone.py
      3. Thread Simulation - We need to create a fake CPU unit to run every single instruction, which allows us to monitor all behavior ➛ via Unicorn.py
  38. >_Unicorn.py Unicorn is a lightweight multi-platform, multi-architecture CPU emulator framework. Highlight features:
      • Multi-architectures: Arm, Arm64, M68K, Mips, Sparc, x86, & x86_64.
      • Implemented in pure C language, with bindings for Crystal, Clojure, Visual Basic, Perl, Rust, Haskell, Ruby, Python, Java, Go, .NET, Delphi/Pascal & MSVC available.
      • Native support for Windows & *nix (with Mac OSX, Linux, *BSD & Solaris confirmed).
      • High performance by using Just-In-Time compiler technique.
      • Thread-safe by design.
      ... Unicorn is based on QEMU, but it goes much further with a lot more to offer.
  39. >_Unicorn.py

  40. Challenge 1 >_File Mapping

  41. Exe File (PE): DOS Program | NtHeader: File Header (.NumberOfSections), OptionalHeader (.AddressOfEntryPoint, .ImageBase (0x400000), .SizeOfImage, .SizeOfHeaders, .DataDirectory) ... | Section Header 1 (.text) ..., Section Header 2, Section Header 3 | Section Data 1 (.text). [DOS Program through the section headers: SizeOfHeaders] sizeof(Section Header) = IMAGE_SIZEOF_SECTION_HEADER = 40 (fixed)
  42. Exe File (PE): DOS Program | NtHeader ... | Section Header 1 (.text) ..., Section Header 2, Section Header 3 | Section Data 1 (.text). SectionHeader[i] = PIMAGE_SECTION_HEADER( NtHeader + sizeof(IMAGE_NT_HEADERS) + IMAGE_SIZEOF_SECTION_HEADER * i ); IMAGE SECTION HEADER fields: .VirtualAddress, .SizeOfRawData, .PointerToRawData
  43. >_File Mapping

  44. >_File Mapping
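The file-mapping step on slides 41 to 44 can be sketched in a few lines. The talk does this with pefile. The version below is a hedged stand-in that uses only the standard `struct` module and assumes a 32-bit (PE32) optional header layout. `map_image` and `build_demo_pe` are hypothetical names, and the demo "file" is a synthetic buffer with one `.text` section, not a valid executable. The section table is located as NtHeader + 24 + SizeOfOptionalHeader, which equals the slide's NtHeader + sizeof(IMAGE_NT_HEADERS) for a standard PE32 header.

```python
import struct

SECTION_HEADER_SIZE = 40  # IMAGE_SIZEOF_SECTION_HEADER


def map_image(raw: bytes):
    """Lay a PE32 file out in memory the way a loader would; returns (image_base, image)."""
    nt, = struct.unpack_from("<I", raw, 0x3C)              # e_lfanew: offset of NT headers
    if raw[nt:nt + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    num_sections, = struct.unpack_from("<H", raw, nt + 6)  # FileHeader.NumberOfSections
    opt_size, = struct.unpack_from("<H", raw, nt + 20)     # FileHeader.SizeOfOptionalHeader
    opt = nt + 24                                          # start of OptionalHeader
    image_base, = struct.unpack_from("<I", raw, opt + 28)  # OptionalHeader.ImageBase
    size_of_image, size_of_headers = struct.unpack_from("<II", raw, opt + 56)

    image = bytearray(size_of_image)
    image[:size_of_headers] = raw[:size_of_headers]        # headers are mapped at RVA 0
    table = opt + opt_size                                 # first section header
    for i in range(num_sections):
        hdr = table + SECTION_HEADER_SIZE * i
        # VirtualAddress, SizeOfRawData, PointerToRawData
        vaddr, raw_size, raw_ptr = struct.unpack_from("<III", raw, hdr + 12)
        image[vaddr:vaddr + raw_size] = raw[raw_ptr:raw_ptr + raw_size]
    return image_base, image


def build_demo_pe() -> bytes:
    """Build a tiny synthetic PE32-like buffer with a single .text section (demo only)."""
    raw = bytearray(0x400)
    raw[0:2] = b"MZ"
    nt = 0x80
    struct.pack_into("<I", raw, 0x3C, nt)
    raw[nt:nt + 4] = b"PE\x00\x00"
    struct.pack_into("<H", raw, nt + 6, 1)                  # one section
    struct.pack_into("<H", raw, nt + 20, 224)               # PE32 optional header size
    opt = nt + 24
    struct.pack_into("<I", raw, opt + 28, 0x400000)         # ImageBase
    struct.pack_into("<II", raw, opt + 56, 0x2000, 0x200)   # SizeOfImage, SizeOfHeaders
    hdr = opt + 224
    raw[hdr:hdr + 8] = b".text\x00\x00\x00"
    struct.pack_into("<III", raw, hdr + 12, 0x1000, 0x200, 0x200)
    raw[0x200:0x203] = b"\x33\xC0\xC3"                      # xor eax, eax ; ret
    return bytes(raw)


base, image = map_image(build_demo_pe())
print(hex(base), len(image), image[0x1000:0x1003].hex())
```

The raw bytes stored at file offset 0x200 end up at RVA 0x1000 in the mapped image. In an emulator, `image` would then be written to emulated memory at `base`.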

  45. Challenge 2 >_Repair Import Address Table

  46. >_ls ./agenda Exe File (PE): DOS Program | NtHeader: File Header (.NumberOfSections), OptionalHeader (.AddressOfEntryPoint, .ImageBase (0x400000), .SizeOfImage, .SizeOfHeaders, .DataDirectory) ... | Section Header 1 (.text) ..., Section Header 2, Section Header 3 | Section Data 1 (.text). [DOS Program through the section headers: SizeOfHeaders] sizeof(Section Header) = IMAGE_SIZEOF_SECTION_HEADER = 40 (fixed)
  47. >_ls ./agenda Exe File (PE): DOS Program | NtHeader | OptionalHeader ... | DataDirectory = IMAGE_DATA_DIRECTORY[16], by index: 0 Export Directory, 1 Import Directory, 2 Resource Directory, 3 Exception Directory, 4 Security Directory, 5 Base Relocation Table, ..., 12 Import Address Table, ... typedef struct _IMAGE_DATA_DIRECTORY { DWORD VirtualAddress; DWORD Size; } IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
  48. >_ls ./agenda Exe File (PE): DOS Program | NtHeader | OptionalHeader ... | DataDirectory index 12, Import Address Table (.VirtualAddress, .Size) ... typedef struct _IMAGE_DATA_DIRECTORY { DWORD VirtualAddress; DWORD Size; } IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
  49. >_ls ./agenda Exe File (PE): DOS Program | NtHeader | OptionalHeader ... | DataDirectory index 12, Import Address Table (.VirtualAddress, .Size) → IMAGE_IMPORT_DESCRIPTOR
  50. Exe File (PE): NT Header | OptionalHeader | DataDirectory index 12, Import Address Table (.VirtualAddress, .Size = sizeof(Descriptor Array)) → IMAGE_IMPORT_DESCRIPTOR Array: IMAGE_IMPORT_DESCRIPTOR 1 (Fixed Size), IMAGE_IMPORT_DESCRIPTOR 2 (Fixed Size), IMAGE_IMPORT_DESCRIPTOR 3 (Fixed Size), ..., \x00\x00\x00\x00\x00\x00\x00 terminator. Each entry is sizeof(IMAGE_IMPORT_DESCRIPTOR) bytes.
  51. IMAGE_IMPORT_DESCRIPTOR
      typedef struct _IMAGE_IMPORT_DESCRIPTOR {
          union {
              DWORD Characteristics;    // 0 for terminating null import descriptor
              DWORD OriginalFirstThunk; // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
          } DUMMYUNIONNAME;
          DWORD TimeDateStamp;          // 0 if not bound,
                                        // -1 if bound, and real date\time stamp
                                        // in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                        // O.W. date/time stamp of DLL bound to (Old BIND)
          DWORD ForwarderChain;         // -1 if no forwarders
          DWORD Name;
          DWORD FirstThunk;             // RVA to IAT (if bound this IAT has actual addresses)
      } IMAGE_IMPORT_DESCRIPTOR;
      typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;
  52. Exe File (PE): NT Header | OptionalHeader | DataDirectory index 12, Import Address Table (.VirtualAddress, .Size) | IMAGE_IMPORT_DESCRIPTOR 1: .OriginalFirstThunk, .FirstThunk, .Name (User32.dll) | IMAGE_IMPORT_BY_NAME: MessageBoxA. typedef struct _IMAGE_IMPORT_BY_NAME { WORD Hint; CHAR Name[1]; } IMAGE_IMPORT_BY_NAME;
  53. Exe File (PE): NT Header | OptionalHeader | DataDirectory index 12, Import Address Table (.VirtualAddress, .Size) | IMAGE_IMPORT_DESCRIPTOR 1: .OriginalFirstThunk, .FirstThunk = 0xcafe, .Name (User32.dll) | IMAGE_IMPORT_BY_NAME: MessageBoxA
      push 0
      push 0x40dead
      push 0x40beef
      push 0
      call ds:0x40cafe
      xor eax, eax
      ret
  54. Exe File (PE): NT Header | OptionalHeader | DataDirectory index 12, Import Address Table (.VirtualAddress, .Size) | IMAGE_IMPORT_DESCRIPTOR 1: .FirstThunk = 0xcafe, .Name (User32.dll) | IMAGE_IMPORT_BY_NAME: MessageBoxA. HANDLE mod = LoadLibrary("User32.dll"); GetProcAddress(mod, "MessageBoxA") = 0x7547EA99
  55. Exe File (PE): NT Header | OptionalHeader | DataDirectory index 12, Import Address Table (.VirtualAddress, .Size) | IMAGE_IMPORT_DESCRIPTOR 1: .Name (User32.dll), .FirstThunk = 0xcafe | Thunk Array (one 4-byte entry per imported function):
      MessageBoxA: *(uint32_t *)0xcafe = 0x7547EA99
      LoadStringA: *(uint32_t *)0xcb02 = 0x75130D4D
      KillTimer: *(uint32_t *)0xcb06 = 0x754364C7
      Last Thunk: *(uint32_t *)0xcb0a = NULL
  56. >_My Win32 Internal ˉ ̶̡̭̭ (´∀`๑) ˉ ̶̡̭̭

  57. >_Repair IAT

  58. >_Repair IAT
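The IAT repair walk from slides 50 to 55 can be sketched as follows. This is a simplified illustration under stated assumptions, not the talk's implementation: it works on an already-mapped 32-bit image held in a `bytearray`, handles only the plain name/ordinal cases, and assigns made-up addresses instead of loading real DLLs. `repair_iat`, `resolve`, `handlers`, and `FAKE_API_BASE` are hypothetical names. In a real file the RVA of the descriptor array would be read from the import entry of the DataDirectory; here it is simply passed in.

```python
import struct

FAKE_API_BASE = 0x70000000   # assumed unmapped region used for fake API addresses
handlers = {}                # fake address -> "dll!function", for later hooking


def read_cstr(image, rva: int) -> str:
    """Read a NUL-terminated ASCII string starting at rva."""
    end = image.index(b"\x00", rva)
    return bytes(image[rva:end]).decode("ascii")


def resolve(dll: str, func: str) -> int:
    """Stand-in for LoadLibrary/GetProcAddress: hand out one fake address per import."""
    addr = FAKE_API_BASE + 0x10 * len(handlers)
    handlers[addr] = "%s!%s" % (dll, func)
    return addr


def repair_iat(image: bytearray, import_rva: int, resolver) -> dict:
    """Fill every IAT slot reachable from the descriptor array; returns {slot_rva: name}."""
    patched = {}
    desc = import_rva
    while True:
        oft, _stamp, _fwd, name_rva, first_thunk = struct.unpack_from("<5I", image, desc)
        if name_rva == 0 and first_thunk == 0:       # zeroed descriptor ends the array
            break
        dll = read_cstr(image, name_rva)
        lookup = oft or first_thunk                  # some linkers leave OriginalFirstThunk 0
        index = 0
        while True:
            entry, = struct.unpack_from("<I", image, lookup + 4 * index)
            if entry == 0:                           # NULL thunk ends this DLL's list
                break
            if entry & 0x80000000:
                func = "#%d" % (entry & 0xFFFF)      # import by ordinal
            else:
                func = read_cstr(image, entry + 2)   # IMAGE_IMPORT_BY_NAME: skip WORD Hint
            slot = first_thunk + 4 * index
            struct.pack_into("<I", image, slot, resolver(dll, func))
            patched[slot] = "%s!%s" % (dll, func)
            index += 1
        desc += 20                                   # sizeof(IMAGE_IMPORT_DESCRIPTOR)
    return patched


# Synthetic mapped image with a single import: User32.dll!MessageBoxA.
image = bytearray(0x200)
struct.pack_into("<5I", image, 0x100, 0x140, 0, 0, 0x180, 0x160)  # descriptor 1
struct.pack_into("<I", image, 0x140, 0x1A0)   # OriginalFirstThunk[0] -> IMAGE_IMPORT_BY_NAME
struct.pack_into("<I", image, 0x160, 0x1A0)   # FirstThunk[0], same value before repair
image[0x180:0x18B] = b"User32.dll\x00"
image[0x1A2:0x1AE] = b"MessageBoxA\x00"       # Hint (2 bytes, zero) sits at 0x1A0

patched = repair_iat(image, 0x100, resolve)
print({hex(k): v for k, v in patched.items()})
```

Pointing each slot at a unique fake address, instead of at real DLL code, is one way an emulator can later recognize which API the sample is trying to call.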

  59. Challenge 3 >_Emulation Thread

  60. >_Thread
      Registers: eax 41414141, ebx 42424242, ecx 43434343, edx 44444444, ..., esp 7ffffffc, ebp 7ffffffc, eip 401000
      Main.exe .text Section, addr @ 401000: 6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3
  61. >_Periodic Table? sparksandflames.com/files/x86InstructionChart.html

  62. >_Thread
      addr @ 401000: 6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3
      decoded via x86 Instruction Set:
      push 0
      push 0x40dead
      push 0x40beef
      push 0
      call ds:0x40cafe
      xor eax, eax
      ret
      Registers: eax 41414141, ebx 42424242, ecx 43434343, edx 44444444, ..., esp 7ffffffc, ebp 7ffffffc, eip 401000
  63. >_hook

  64. >_hook

  65. >_Emulation
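The talk uses Unicorn as the "fake CPU". To show the idea without any dependency, the sketch below is a toy pure-Python interpreter that understands only the five instruction encodings of the msgbox example and treats a jump to a fake API address as a hook. It is not Unicorn and not the talk's code. `ToyCpu`, `FAKE_API`, and `RETURN_SENTINEL` are hypothetical names, and the register and stack values follow slide 60.

```python
import struct

CODE_ADDR, IAT_SLOT, STACK_TOP = 0x401000, 0x40CAFE, 0x7FFFFFFC
FAKE_API = 0x70000000          # assumed address written into the repaired IAT slot
RETURN_SENTINEL = 0xDEADC0DE   # hypothetical return address that stops the emulation


class ToyCpu:
    def __init__(self):
        self.mem = {}                                   # sparse byte-addressed memory
        self.eax, self.esp, self.eip = 0x41414141, STACK_TOP, CODE_ADDR
        self.api_log = []                               # recorded API calls

    def write(self, addr, data):
        for i, byte in enumerate(data):
            self.mem[addr + i] = byte

    def read32(self, addr):
        return struct.unpack("<I", bytes(self.mem.get(addr + i, 0) for i in range(4)))[0]

    def push(self, value):
        self.esp -= 4
        self.write(self.esp, struct.pack("<I", value & 0xFFFFFFFF))

    def pop(self):
        value = self.read32(self.esp)
        self.esp += 4
        return value

    def step(self):
        op, nxt = self.mem[self.eip], self.mem.get(self.eip + 1, 0)
        if op == 0x6A:                                  # push imm8 (sign-extended)
            self.push(nxt - 0x100 if nxt & 0x80 else nxt)
            self.eip += 2
        elif op == 0x68:                                # push imm32
            self.push(self.read32(self.eip + 1))
            self.eip += 5
        elif op == 0xFF and nxt == 0x15:                # call dword ptr [addr]
            target = self.read32(self.read32(self.eip + 2))
            self.push(self.eip + 6)                     # return address
            self.eip = target
        elif op == 0x33 and nxt == 0xC0:                # xor eax, eax
            self.eax = 0
            self.eip += 2
        elif op == 0xC3:                                # ret
            self.eip = self.pop()
        else:
            raise NotImplementedError("opcode %02X at %08X" % (op, self.eip))

    def run(self):
        while self.eip != RETURN_SENTINEL:
            if self.eip == FAKE_API:                    # hook: the sample "called" MessageBoxA
                ret = self.pop()
                args = [self.pop() for _ in range(4)]   # stdcall: callee removes 4 arguments
                self.api_log.append(("MessageBoxA", args))
                self.eax, self.eip = 1, ret             # fake return value, resume caller
            else:
                self.step()


cpu = ToyCpu()
cpu.write(CODE_ADDR, bytes.fromhex(
    "6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3"))
cpu.write(IAT_SLOT, struct.pack("<I", FAKE_API))        # the repaired IAT entry
cpu.push(RETURN_SENTINEL)                               # where main's final ret lands
cpu.run()
print(cpu.api_log)
```

The log shows the call and its four arguments in declaration order (hWnd, text pointer, caption pointer, type) without any real MessageBoxA ever running. That is the kind of behavior record the emulator is after.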

  66. >_Downloader

  67. aaaddress1@chroot.org Demo

  68. Challenge 4 >_TEB, PEB, LDR, ...

  69. >_Repair IAT

  70. >_Challenge 4

  71. aaaddress1@chroot.org Recap

  72. >_Bypass blog.trendmicro.com/trendlabs-security-intelligence/new-emotet-hijacks-windows-api-evades-sandbox-analysis

  73. >_AV Leak www.youtube.com/watch?v=a6yOwvFds78

  74. >_Sandbox
      • VM-like Sandbox
        • Slow, resource consuming
        • Easy to identify
      • Loader-like Sandbox
        • Not easy to implement the whole system API
      • Cuckoo-like Sandbox
        • Not an isolated environment, vulnerable
  75. Thanks! aaaddress1@chroot.org Slide Github @aaaddress1 Facebook