Malware Sandbox Emulation in Python

adr
August 01, 2018

PoC of the virtual machine designed for this session: github.com/aaaddress1/vtMal

Many well-known anti-virus products rely on known signatures to decide whether an executable file is malicious. Malware techniques keep advancing, however: with variants and entirely new kinds of malware appearing in large numbers, signature matching, long the core technique of anti-virus software, can no longer stop every threat. Should we simply stand by and let attackers do whatever they want?

This session presents a concept that departs from the signature matching at the core of traditional anti-virus software: a virtualized sandbox, written in Python, that runs the malware. Before the user actually opens the file, the malware is first executed inside a fake, virtual Windows environment, so it cannot infect the user's real operating system.

The session shows how to build a simple EXE emulator in Python (targeting a single executable). It dissects the Windows EXE structure (the PE format) and emulates how the system program loader (PE Loader) creates a new process, e.g. section mapping, applying the IAT, and section relocation. Unicorn Engine handles memory management and emulates a thread executing assembly instructions, and every behavior of the program is recorded inside the virtual environment.

Transcript

  1. Malware Sandbox Emulation
    in Python

  2. >_cat ./Bio
    • Master's degree in CSIE, NTUST
    • Security Researcher - chrO.ot, TDOHacker
    • Speaker
      - BlackHat, DEFCON, VXCON
      - HITCON, SITCON, iThome

  3. >_cat ./intro
    1. The challenges of Anti-Virus techniques
    2. The implementation of WinAPI CreateProcess()
    3. Build our own emulator for exe files (PE)
       - CPU? Emulating a thread via Unicorn in Python
       - Dealing with exe file mapping, IAT, EAT in our sandbox
       - Challenges of a malware sandbox, and solutions
    4. Recap

  4. The challenges
    of Anti-Virus techniques

  5. >_Anti-Virus
    • Mainstream anti-virus products verify files with a
    technique called malware signature detection
    • Anti-virus products store all virus signatures in a
    cloud database
    • The best-known rule format for malware signatures is YARA
    [Diagram: AntiVirus scanning a File System that holds avideo.scr, finance.docx and malware.exe]

  6. >_YARA-Rule
    rule silent_banker : banker {
        meta:
            description = "This is just an example"
            threat_level = 3
            in_the_wild = true
        strings:
            $a = {6A 40 68 00 30 00 00 6A 14 8D 91}
            $b = {8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9}
            $c = "UVODFRYSIHLNWPEJXQZAKCBGMT"
        condition:
            $a or $b or $c
    }
    virustotal.github.io/yara
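
To make the idea of signature detection concrete, here is a minimal sketch in plain Python that mirrors the rule's `$a or $b or $c` condition with simple substring searches. It is not how YARA itself is implemented; the function names `match_signatures` and `is_silent_banker` are hypothetical and exist only for this illustration.

```python
from typing import List

# Byte patterns copied from the YARA rule on the slide.
SIGNATURES = {
    "$a": bytes.fromhex("6A 40 68 00 30 00 00 6A 14 8D 91"),
    "$b": bytes.fromhex("8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9"),
    "$c": b"UVODFRYSIHLNWPEJXQZAKCBGMT",
}


def match_signatures(data: bytes) -> List[str]:
    """Return the names of all signatures found anywhere in data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]


def is_silent_banker(data: bytes) -> bool:
    """Mirror the rule's condition: $a or $b or $c."""
    return bool(match_signatures(data))
```

A file that contains one of the patterns is flagged, but changing a single byte inside the pattern is enough to slip past this kind of check, which is the weakness the following slides discuss.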

  7. >_Challenge?
    As malware technology improves, more and more malware
    source code is released, and a large number of malware
    variants have been developed in the wild.
    ┐(´∀`)┌ This makes it hard for anti-virus products to detect
    every malicious file with malware signature detection alone.
    [Diagram: AntiVirus scanning a File System that holds ma1ware.docx, malware.exe and m4lW4.re.scr]

  8. What ...?
    Only Detect via Signature?

  9. >_Solution?

  10. >_Solution?

  11. >_from AV to RCE
    landave.io/2018/06/f-secure-anti-virus-remote-code-execution-via-solid-rar-unpacking

  12. A Cool Idea:
    Building a malware sandbox?

  13. >_Cuckoo?
    cuckoosandbox.org

  14. >_Cuckoo?
    cuckoosandbox.org

  15. A Cool Idea:
    Building a malware emulator?

  16. >_AV = more RCE?
    googleprojectzero.blogspot.com/2015/06/analysis-and-exploitation-of-eset.html

  17. >_Sandbox?
    •Cuckoo-like Sandbox
    •Emulator-like Sandbox
    •VM-like Sandbox

  18. >_Sandbox?
    •Cuckoo-like Sandbox
    •Emulator-like Sandbox
    •VM-like Sandbox

  19. The Final Goal ⚑
    Capture all the behavior of
    malware files without executing them for real

  20. General Compiler

  21. General Compiler
    Source.cpp ➛ [Compiler] ➛ Assembly Codes ➛ [Assembler] ➛ Object Files ➛ [Linker] ➛ Main.exe

  22. >_cat msgbox.c
    #include <Windows.h>
    int main()
    {
        MessageBoxA(
            0, "hi there.", "info", 0
        );
        return 0;
    }

  23. based on x86 Calling Convention
    #include <Windows.h>
    int main() {
        MessageBoxA(
            0,
            "hi there.",
            "info", 0
        );
        return 0;
    }
    ; equivalent assembly (arguments pushed right to left)
    push 0
    push "info"
    push "hi there."
    push 0
    call MessageBoxA
    xor eax, eax
    ret
    en.wikipedia.org/wiki/X86_calling_conventions

  24. >_Compiler
    push 0
    push "info"
    push "hi there."
    push 0
    call MessageBoxA
    xor eax, eax
    ret
    .rdata section
      0xdead: "info"
      0xbeef: "hi there."
    .idata section (Import Address Table)
      0xcafe: 0x7630EA99

  25. >_Compiler
    push 0
    push offset "info"
    push offset "hi there."
    push 0
    call MessageBoxA
    xor eax, eax
    ret
    .rdata section
      0xdead: "info"
      0xbeef: "hi there."
    .idata section (Import Address Table)
      0xcafe: 0x7630EA99

  26. >_Compiler
    push 0
    push 0x40dead
    push 0x40beef
    push 0
    call ds:0x40cafe
    xor eax, eax
    ret
    .rdata section
      0xdead: "info"
      0xbeef: "hi there."
    .idata section (Import Address Table)
      0xcafe: 0x7630EA99

  27. >_Assembler
    push 0 ; 6A 00
    push 0x40dead ; 68 AD DE 40 00
    push 0x40beef ; 68 EF BE 40 00
    push 0 ; 6A 00
    call ds:0x40cafe ; FF 15 FE CA 40 00
    xor eax, eax ; 33 C0
    ret ; C3

  28. Main.exe
    .text Section
      6A 00
      68 AD DE 40 00
      68 EF BE 40 00
      6A 00
      FF 15 FE CA 40 00
      33 C0
      C3
    .rdata Section
      0xdead: "info"
      0xbeef: "hi there."
    .idata Section
      0xcafe: 0x7630EA99
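
The byte sequences in the .text section follow directly from the x86 encodings: an opcode followed by a little-endian immediate or address. The sketch below rebuilds them with the standard `struct` module. The helper names (`push_imm8`, `push_imm32`, `call_mem32`) are hypothetical, and only these three encodings are covered; a real assembler such as Keystone handles the full instruction set.

```python
import struct


def push_imm8(value: int) -> bytes:
    """push imm8: opcode 6A followed by one signed byte."""
    return b"\x6a" + struct.pack("<b", value)


def push_imm32(value: int) -> bytes:
    """push imm32: opcode 68 followed by a little-endian dword."""
    return b"\x68" + struct.pack("<I", value)


def call_mem32(address: int) -> bytes:
    """call dword ptr [address]: FF 15 followed by the absolute address."""
    return b"\xff\x15" + struct.pack("<I", address)


XOR_EAX_EAX = b"\x33\xc0"
RET = b"\xc3"

# Same instruction order as the assembler slide.
code = (
    push_imm8(0)
    + push_imm32(0x40DEAD)
    + push_imm32(0x40BEEF)
    + push_imm8(0)
    + call_mem32(0x40CAFE)
    + XOR_EAX_EAX
    + RET
)
print(" ".join("%02X" % byte for byte in code))
```

Note how 0x40dead becomes `AD DE 40 00`: the least significant byte is stored first.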

  29. The Implementation
    of WinAPI CreateProcess()

  30. >_Process?
    Kernel (ring0) / Application (ring3)
    1) create the process via CreateProcess()
    2) map the file into memory
    3) create the first thread of this process,
       point register eax to AddressOfEntry,
       point ebx+8 (PEB base + 8) to the image base,
       and point eip to ntdll!LdrInitializeThunk
    [Diagram: the Process holds iexplorer.exe (.data section, .text section, AddressOfEntry), ntdll.dll, kernel32.dll, ...]

  31. >_Process?
    [Diagram: the Process holds iexplorer.exe (.text section), ntdll.dll, kernel32.dll, ...]
    ntdll!LdrInitializeThunk:
    fix import address table,
    fix export directory,
    apply relocation, etc
    Call Stack
    --------------
    _LdrpSnapModule
    _LdrpMapAndSnapDependency
    _LdrpMapDllWithSectionHandle
    _LdrpLoadKnownDll
    _LdrpFindOrPrepareLoadingModule
    _LdrpLoadDllInternal
    _LdrpLoadDll
    _LdrLoadDll
    _LdrpInitializeProcess
    __LdrpInitialize
    _LdrInitializeThunk
    LdrpMapAndSnapDependency:
    fix import address table for
    every loaded dll image

  32. >_ntdll!LdrpSnapModule

  33. >_Process?
    ntdll!LdrInitializeThunk
    [email protected]
    Process
    iexplorer.exe
    .text section
    ntdll.dll
    kernel32.dll
    ...
    ntdll!RtlUserThreadStart
    note: RtlUserThreadStart is the entry point of every thread.
    We can hijack a thread by writing a shellcode address into the
    global variable 'LdrDelegatedRtlUserThreadStart'.

  34. >_Quick Recap
    1. Once a process is created, the kernel maps each section
    into memory at its expected address
    2. Next, the kernel creates a new thread for this process.
    This thread calls ntdll!LdrInitializeThunk to
    repair the Import Address Table (IAT), the Export Address
    Table (EAT) and relocations
    3. Finally, the thread enters the executable's entry point
    (AddressOfEntry)

  35. Build our own emulator
    for *.exe file

  36. >_for Emulator?
    1. File Mapping - Code needs to fetch information from
    other sections (e.g., code in one section reading data
    in another), so every section must be mapped
    2. Repair IAT, Relocation - Repair the Import Address Table
    so malware calls WinAPI at the correct address. In an
    emulator it is easy to place the PE image at its
    expected image base, so we don't need to care about relocation.
    3. Thread Simulation - We need a fake CPU to run every
    single instruction, which lets us monitor all behavior.

  37. >_Emu in Python
    1. File Mapping - Code needs to fetch information from
    other sections (e.g., code in one section reading data
    in another) ➛ via PEFile.py
    2. Repair IAT - Repair the Import Address Table so malware
    calls WinAPI at the correct address ➛ via Unicorn.py +
    PEFile.py + Keystone.py
    3. Thread Simulation - We need a fake CPU to run every
    single instruction, which lets us monitor all
    behavior ➛ via Unicorn.py

  38. Unicorn is a lightweight multi-platform, multi-architecture CPU
    emulator framework.
    Highlight features:
    • Multi-architectures: Arm, Arm64, M68K, Mips, Sparc, x86, &
    x86_64.
    • Implemented in pure C language, with bindings for Crystal,
    Clojure, Visual Basic, Perl, Rust, Haskell, Ruby, Python, Java,
    Go, .NET, Delphi/Pascal & MSVC available.
    • Native support for Windows & *nix (with Mac OSX, Linux, *BSD &
    Solaris confirmed).
    • High performance by using Just-In-Time compiler technique.
    • Thread-safe by design.

    ...
    Unicorn is based on QEMU, but it goes much further with a lot more
    to offer.
    >_Unicorn.py

  39. >_Unicorn.py

  40. Challenge 1
    >_File Mapping

  41. Exe File (PE)
    DOS Program
    NtHeader
      File Header
        .NumberOfSections
      OptionalHeader
        .AddressOfEntryPoint
        .ImageBase (0x400000)
        .SizeOfHeaders
        .SizeOfImage
        .DataDirectory
    Section Header 1 (.text)
    Section Header 2
    Section Header 3
    ...
    Section Data 1 (.text)
    ...
    sizeof(Section Header) = IMAGE_SIZEOF_SECTION_HEADER = 40 (fixed)
    SizeOfHeaders spans the DOS Program, the NtHeader and all Section Headers

  42. Exe File (PE)
    DOS Program | NtHeader | ... | Section Header 1 (.text) | Section Header 2 | Section Header 3 | ... | Section Data 1 (.text) | ...
    SectionHeader[index] = PIMAGE_SECTION_HEADER(
        NtHeader +
        sizeof(IMAGE_NT_HEADERS) +
        IMAGE_SIZEOF_SECTION_HEADER * index
    );
    IMAGE SECTION HEADER
      .VirtualAddress
      .SizeOfRawData
      .PointerToRawData
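
The section-table formula above can be sketched with nothing but the standard `struct` module. This is a minimal illustration, not the talk's code (the slides use pefile for this step). `parse_sections` computes the offset as `4 + 20 + SizeOfOptionalHeader`, which equals `sizeof(IMAGE_NT_HEADERS)` for a standard 32-bit image. `build_demo_pe` is a hypothetical helper that fabricates a header-only fake PE so the example can run without a real executable.

```python
import struct
from typing import List, NamedTuple

IMAGE_SIZEOF_SECTION_HEADER = 40


class Section(NamedTuple):
    name: str
    virtual_address: int
    size_of_raw_data: int
    pointer_to_raw_data: int


def parse_sections(pe: bytes) -> List[Section]:
    """Walk the section table that follows the NT headers."""
    if pe[:2] != b"MZ":
        raise ValueError("missing DOS signature")
    (nt_header,) = struct.unpack_from("<I", pe, 0x3C)  # e_lfanew
    if pe[nt_header:nt_header + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (number_of_sections,) = struct.unpack_from("<H", pe, nt_header + 6)
    (size_of_optional_header,) = struct.unpack_from("<H", pe, nt_header + 20)
    # Signature (4) + IMAGE_FILE_HEADER (20) + optional header.
    first = nt_header + 4 + 20 + size_of_optional_header
    sections = []
    for index in range(number_of_sections):
        offset = first + IMAGE_SIZEOF_SECTION_HEADER * index
        name, _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", pe, offset)
        sections.append(
            Section(name.rstrip(b"\0").decode("ascii"), vaddr, raw_size, raw_ptr))
    return sections


def build_demo_pe() -> bytes:
    """Hypothetical helper: a header-only fake PE32 with two sections."""
    dos = bytearray(64)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 64)              # e_lfanew
    file_header = struct.pack("<HHIIIHH", 0x14C, 2, 0, 0, 0, 224, 0x102)
    optional = bytearray(224)
    struct.pack_into("<H", optional, 0, 0x10B)         # PE32 magic
    struct.pack_into("<I", optional, 28, 0x400000)     # ImageBase

    def section(name: bytes, vaddr: int, raw_size: int, raw_ptr: int) -> bytes:
        return struct.pack("<8sIIIIIIHHI", name, raw_size, vaddr,
                           raw_size, raw_ptr, 0, 0, 0, 0, 0)

    table = (section(b".text", 0x1000, 0x200, 0x400)
             + section(b".rdata", 0x2000, 0x200, 0x600))
    return bytes(dos) + b"PE\0\0" + file_header + bytes(optional) + table


for entry in parse_sections(build_demo_pe()):
    print(entry)
```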

  43. >_File Mapping

  44. >_File Mapping
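
The code shown on these slides did not survive in the transcript. As a rough sketch of what file mapping involves, the function below lays a PE file out the way it would sit in memory: headers at offset 0 and each section's raw data copied to its VirtualAddress. The name `map_image` and its parameters are assumptions made for this illustration; the resulting buffer is what an emulator would then write into its address space at ImageBase.

```python
from typing import Iterable, Tuple


def map_image(pe: bytes, size_of_headers: int, size_of_image: int,
              sections: Iterable[Tuple[int, int, int]]) -> bytearray:
    """Build the in-memory image of a PE file.

    sections holds (VirtualAddress, SizeOfRawData, PointerToRawData) triples.
    """
    image = bytearray(size_of_image)
    headers = pe[:size_of_headers]
    image[:len(headers)] = headers
    for virtual_address, size_of_raw_data, pointer_to_raw_data in sections:
        raw = pe[pointer_to_raw_data:pointer_to_raw_data + size_of_raw_data]
        if virtual_address + len(raw) > size_of_image:
            raise ValueError("section does not fit inside SizeOfImage")
        # File offset -> relative virtual address.
        image[virtual_address:virtual_address + len(raw)] = raw
    return image
```

With Unicorn, the mapped buffer would presumably be copied into emulated memory with `mem_map` followed by `mem_write` at the image base.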

  45. Challenge 2
    >_Repair Import Address Table

  46. >_ls ./agenda
    Exe File (PE)
    DOS Program
    NtHeader
      File Header
        .NumberOfSections
      OptionalHeader
        .AddressOfEntryPoint
        .ImageBase (0x400000)
        .SizeOfHeaders
        .SizeOfImage
        .DataDirectory
    Section Header 1 (.text)
    Section Header 2
    Section Header 3
    ...
    Section Data 1 (.text)
    ...
    sizeof(Section Header) = IMAGE_SIZEOF_SECTION_HEADER = 40 (fixed)
    SizeOfHeaders spans the DOS Program, the NtHeader and all Section Headers

  47. >_ls ./agenda
    Exe File (PE): DOS Program, NtHeader, OptionalHeader, ..., DataDirectory
    DataDirectory is IMAGE_DATA_DIRECTORY[16]:
      index  entry
      0      Export Directory
      1      Import Directory
      2      Resource Directory
      3      Exception Directory
      4      Security Directory
      5      Base Relocation Table
      ...
      12     Import Address Table
      ...
    typedef struct _IMAGE_DATA_DIRECTORY {
        DWORD VirtualAddress;
        DWORD Size;
    } IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
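
Reading one DataDirectory entry is a fixed-offset lookup. The sketch below assumes a 32-bit (PE32) optional header, where NumberOfRvaAndSizes sits at offset 92 and the DataDirectory array starts at offset 96; the function name `data_directory` is hypothetical. In winnt.h, index 1 (IMAGE_DIRECTORY_ENTRY_IMPORT) locates the IMAGE_IMPORT_DESCRIPTOR array, while index 12 (IMAGE_DIRECTORY_ENTRY_IAT) covers the thunk table that the loader overwrites with resolved addresses.

```python
import struct
from typing import Tuple

IMAGE_DIRECTORY_ENTRY_EXPORT = 0
IMAGE_DIRECTORY_ENTRY_IMPORT = 1
IMAGE_DIRECTORY_ENTRY_BASERELOC = 5
IMAGE_DIRECTORY_ENTRY_IAT = 12

# Offsets inside a PE32 optional header (assumption: 32-bit image).
NUMBER_OF_RVA_AND_SIZES_OFFSET = 92
DATA_DIRECTORY_OFFSET = 96
SIZEOF_IMAGE_DATA_DIRECTORY = 8


def data_directory(optional_header: bytes, index: int) -> Tuple[int, int]:
    """Return (VirtualAddress, Size) of DataDirectory[index]."""
    (count,) = struct.unpack_from(
        "<I", optional_header, NUMBER_OF_RVA_AND_SIZES_OFFSET)
    if not 0 <= index < count:
        raise IndexError("no such data directory")
    return struct.unpack_from(
        "<II", optional_header,
        DATA_DIRECTORY_OFFSET + SIZEOF_IMAGE_DATA_DIRECTORY * index)
```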

  48. >_ls ./agenda
    DOS Program
    OptionalHeader
    NtHeader
    ...
    DataDirectory
    index
    Import Address Table
    ...
    12
    ...
    typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
    } IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
    .VirtualAddress .Size
    Exe File (PE)

  49. >_ls ./agenda
    DOS Program
    OptionalHeader
    NtHeader
    ...
    DataDirectory
    index
    Import Address Table
    ...
    12
    ...
    .VirtualAddress
    .Size
    IMAGE_IMPORT_DESCRIPTOR
    Exe File (PE)

  50. IMAGE_IMPORT_DESCRIPTOR Array
    IMAGE_IMPORT_DESCRIPTOR 1
    IMAGE_IMPORT_DESCRIPTOR 2
    IMAGE_IMPORT_DESCRIPTOR 3
    ...
    \x00\x00\x00\x00\x00\x00\x00
    Fixed Size
    Fixed Size
    Fixed Size
    OptionalHeader DataDirectory
    index
    Import Address Table
    12
    .VirtualAddress
    .Size
    sizeof(IMAGE_IMPORT_DESCRIPTOR)
    NT Header
    sizeof(Descriptor Array)
    Exe File (PE)

  51. IMAGE_IMPORT_DESCRIPTOR
    typedef struct _IMAGE_IMPORT_DESCRIPTOR {
        union {
            DWORD Characteristics;    // 0 for terminating null import descriptor
            DWORD OriginalFirstThunk; // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
        } DUMMYUNIONNAME;
        DWORD TimeDateStamp;          // 0 if not bound,
                                      // -1 if bound, and real date/time stamp
                                      // in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                      // O.W. date/time stamp of DLL bound to (Old BIND)
        DWORD ForwarderChain;         // -1 if no forwarders
        DWORD Name;
        DWORD FirstThunk;             // RVA to IAT (if bound this IAT has actual addresses)
    } IMAGE_IMPORT_DESCRIPTOR;
    typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;

  52. IMAGE_IMPORT_DESCRIPTOR 1
    OptionalHeader DataDirectory
    index
    Import Address Table
    12
    .VirtualAddress
    .Size
    NT Header
    .OriginalFirstThunk
    .FirstThunk
    .Name (User32.dll)
    IMAGE_IMPORT_BY_NAME: MessageBoxA
    typedef struct _IMAGE_IMPORT_BY_NAME {
    WORD Hint;
    CHAR Name[1];
    } IMAGE_IMPORT_BY_NAME;
    Exe File (PE)

  53. xor eax, eax
    ret
    push 0
    push 0x40dead
    push 0x40beef
    push 0
    call ds:0x40cafe
    IMAGE_IMPORT_DESCRIPTOR 1
    OptionalHeader DataDirectory
    index
    Import Address Table
    12
    .VirtualAddress
    .Size
    NT Header
    .OriginalFirstThunk
    .FirstThunk = 0xcafe
    .Name (User32.dll)
    IMAGE_IMPORT_BY_NAME: MessageBoxA
    Exe File (PE)

  54. IMAGE_IMPORT_DESCRIPTOR 1
    OptionalHeader DataDirectory
    index
    Import Address Table
    12
    .VirtualAddress
    .Size
    NT Header
    .FirstThunk = 0xcafe
    .FirstThunk
    .Name (User32.dll)
    IMAGE_IMPORT_BY_NAME: MessageBoxA
    HANDLE mod = LoadLibrary("User32.dll");
    GetProcAddress(mod, "MessageBoxA") = 0x7547EA99
    0x7547EA99
    Exe File (PE)

  55. Thunk Array
    Exe File (PE): NT Header ➛ OptionalHeader ➛ DataDirectory (index 12, Import Address Table: .VirtualAddress, .Size)
    IMAGE_IMPORT_DESCRIPTOR 1
      .Name (User32.dll)
      .FirstThunk = 0xcafe
    Thunk array (4 bytes per entry):
      MessageBoxA: *(uint32_t *)0xcafe = 0x7547EA99
      LoadStringA: *(uint32_t *)0xcb02 = 0x75130D4D
      KillTimer: *(uint32_t *)0xcb06 = 0x754364C7
      Last thunk: *(uint32_t *)0xcb0a = NULL
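
The walk over descriptors and thunks can be sketched as follows for a 32-bit image that is already mapped into a `bytearray`. This is an illustration under stated assumptions rather than the talk's implementation: `repair_iat` and `read_cstring` are hypothetical names, and the `resolve` callback stands in for LoadLibrary plus GetProcAddress. In an emulator it would return the address of a fake stub instead of a real API.

```python
import struct
from typing import Callable, Dict

IMAGE_ORDINAL_FLAG32 = 0x80000000
SIZEOF_IMPORT_DESCRIPTOR = 20


def read_cstring(image: bytearray, rva: int) -> str:
    """Read a NUL-terminated ASCII string at the given RVA."""
    return image[rva:image.index(b"\0", rva)].decode("ascii")


def repair_iat(image: bytearray, import_rva: int,
               resolve: Callable[[str, str], int]) -> Dict[int, str]:
    """Fill every IAT slot and return {slot RVA: "dll!function"}."""
    patched = {}
    descriptor = import_rva
    while True:
        original_first_thunk, _stamp, _chain, name_rva, first_thunk = \
            struct.unpack_from("<5I", image, descriptor)
        if name_rva == 0 and first_thunk == 0:   # all-zero terminator
            break
        dll = read_cstring(image, name_rva)
        lookup = original_first_thunk or first_thunk
        slot = first_thunk
        while True:
            (thunk,) = struct.unpack_from("<I", image, lookup)
            if thunk == 0:                       # last thunk is NULL
                break
            if thunk & IMAGE_ORDINAL_FLAG32:     # import by ordinal
                function = "#%d" % (thunk & 0xFFFF)
            else:                                # IMAGE_IMPORT_BY_NAME
                function = read_cstring(image, thunk + 2)  # skip WORD Hint
            struct.pack_into("<I", image, slot, resolve(dll, function))
            patched[slot] = "%s!%s" % (dll, function)
            lookup += 4
            slot += 4
        descriptor += SIZEOF_IMPORT_DESCRIPTOR
    return patched
```

Returning the slot-to-name map is a convenience for the next step: when the emulated thread later calls through a patched slot, the sandbox can tell which API was requested.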

  56. >_My Win32 Internal (´∀`๑)

  57. >_Repair IAT

  58. >_Repair IAT

  59. Challenge 3
    >_Emulation Thread

  60. >_Thread
    Registers
      eax 41414141
      ebx 42424242
      ecx 43434343
      edx 44444444
      ... ...
      esp 7ffffffc
      ebp 7ffffffc
      eip 401000
    Main.exe .text Section, addr @ 401000:
      6A 00
      68 AD DE 40 00
      68 EF BE 40 00
      6A 00
      FF 15 FE CA 40 00
      33 C0
      C3

  61. >_Periodic Table?
    sparksandflames.com/files/x86InstructionChart.html

  62. >_Thread
    addr @ 401000, decoded via the x86 Instruction Set:
      6A 00              push 0
      68 AD DE 40 00     push 0x40dead
      68 EF BE 40 00     push 0x40beef
      6A 00              push 0
      FF 15 FE CA 40 00  call ds:0x40cafe
      33 C0              xor eax, eax
      C3                 ret
    Registers
      eax 41414141
      ebx 42424242
      ecx 43434343
      edx 44444444
      ... ...
      esp 7ffffffc
      ebp 7ffffffc
      eip 401000
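
To show what "emulating a thread" means at the smallest scale, here is a toy interpreter in pure Python that understands only the five encodings on this slide. It is a stand-in for illustration, not Unicorn and not the talk's code: the class `TinyX86`, the `hooks` table and the stack address 0x40F000 are all assumptions (the slide's 0x7ffffffc would need a sparse memory model). A call through the IAT slot is intercepted in Python, which is where a sandbox records behavior.

```python
import struct
from typing import Callable, Dict, List, Tuple


class TinyX86:
    """Toy 32-bit interpreter covering only the opcodes on the slide."""

    def __init__(self, memory: bytearray, eip: int, esp: int) -> None:
        self.mem, self.eip, self.esp = memory, eip, esp
        self.eax = 0x41414141
        # Fake API address -> (argument count, Python handler).
        self.hooks: Dict[int, Tuple[int, Callable[[List[int]], int]]] = {}

    def read32(self, address: int) -> int:
        return struct.unpack_from("<I", self.mem, address)[0]

    def push(self, value: int) -> None:
        self.esp -= 4
        struct.pack_into("<I", self.mem, self.esp, value & 0xFFFFFFFF)

    def pop(self) -> int:
        value = self.read32(self.esp)
        self.esp += 4
        return value

    def step(self) -> None:
        opcode = self.mem[self.eip]
        if opcode == 0x6A:                                   # push imm8
            imm = struct.unpack_from("<b", self.mem, self.eip + 1)[0]
            self.eip += 2
            self.push(imm)
        elif opcode == 0x68:                                 # push imm32
            imm = self.read32(self.eip + 1)
            self.eip += 5
            self.push(imm)
        elif self.mem[self.eip:self.eip + 2] == b"\xff\x15":  # call [addr]
            target = self.read32(self.read32(self.eip + 2))
            self.eip += 6
            if target in self.hooks:
                argc, handler = self.hooks[target]
                args = [self.read32(self.esp + 4 * i) for i in range(argc)]
                self.eax = handler(args) & 0xFFFFFFFF
                self.esp += 4 * argc       # stdcall: callee pops arguments
            else:
                self.push(self.eip)
                self.eip = target
        elif self.mem[self.eip:self.eip + 2] == b"\x33\xc0":  # xor eax, eax
            self.eax = 0
            self.eip += 2
        elif opcode == 0xC3:                                 # ret
            self.eip = self.pop()
        else:
            raise NotImplementedError("opcode %#04x" % opcode)

    def run(self, stop_at: int) -> None:
        while self.eip != stop_at:
            self.step()


def read_cstring(memory: bytearray, address: int) -> str:
    return memory[address:memory.index(b"\0", address)].decode("ascii")


memory = bytearray(0x410000)          # flat toy address space


def write(address: int, data: bytes) -> None:
    memory[address:address + len(data)] = data


write(0x401000, bytes.fromhex(
    "6A 00 68 AD DE 40 00 68 EF BE 40 00 6A 00 FF 15 FE CA 40 00 33 C0 C3"))
write(0x40DEAD, b"info\0")
write(0x40BEEF, b"hi there.\0")
write(0x40CAFE, struct.pack("<I", 0x7547EA99))   # repaired IAT slot

behavior = []


def message_box_a(args: List[int]) -> int:
    _hwnd, text, caption, _kind = args
    behavior.append(("MessageBoxA", read_cstring(memory, text),
                     read_cstring(memory, caption)))
    return 1


EXIT = 0xFFFFFFFF                     # sentinel return address
cpu = TinyX86(memory, eip=0x401000, esp=0x40F000)
cpu.hooks[0x7547EA99] = (4, message_box_a)
cpu.push(EXIT)
cpu.run(stop_at=EXIT)
print(behavior)
```

A real CPU emulator replaces the `step` method with full instruction decoding, while the hook table plays the role of the per-API handlers a sandbox has to provide.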

  63. >_hook

  64. >_hook

  65. >_Emulation

  66. >_Downloader

  67. [no text on slide]

  68. Challenge 4
    >_TEB, PEB, LDR, ...

  69. >_Repair IAT

  70. >_Challenge 4

  71. [no text on slide]

  72. >_Bypass
    blog.trendmicro.com/trendlabs-security-intelligence/new-emotet-hijacks-windows-api-evades-sandbox-analysis

  73. >_AV Leak
    www.youtube.com/watch?v=a6yOwvFds78

  74. >_Sandbox
    • VM-like Sandbox
      • Slow, resource consuming
      • Easily identified
    • Loader-like Sandbox
      • Hard to implement the entire system API
    • Cuckoo-like Sandbox
      • Not an isolated environment, vulnerable

  75. Thanks!
    [email protected]
    Slide
    Github @aaaddress1
    Facebook
