WIP Presentation

Yuma Kurogome

November 16, 2013

Transcript

  1. Background • Analysis of malware is becoming more difficult – APT – Kernel rootkits – Code obfuscation • Many obfuscation tools/methods • No good deobfuscation tool available
  2. LLVM • Compiler infrastructure written in C++ • Has many ways to optimize code • Frontend → Middle end → Backend
  3. LLVM • Frontend – Generate parse tree – Generate LLVM IR • Middle end – Optimize LLVM IR – Many methods available • Backend – Generate x86 code from optimized LLVM IR • If there were an x86 frontend...
  4. Related works (1/5) Dagger: decompilation to LLVM IR, Ahmed Bougacha, Presentation, 2013 European LLVM Conference, Apr 2013 • LLVM IR decompiler • Focuses on semantic gaps between x86 and LLVM IR – LLVM IR is designed around Static Single Assignment (SSA) form • Binary → Mir (own IR) → LLVM IR • Virtual Operand Expansion
  5. Related works (2/5) OptiCode: Machine Code Deobfuscation for Malware Analysis, Nguyen Anh Quynh, Presentation, SyScan SG, Apr 2013 • Same motivation • Supports many obfuscation techniques – Inserted dead instructions – Inserted NOP-semantic instructions – Inserted unreachable code – Inserted branch instructions to the next instruction
  6. Related works (2/5) • Own x86 frontend (details unknown) and default LLVM optimizer – Generate a control flow graph (CFG) consisting of basic blocks (BB) from machine code – Constant folding – Eliminate dead store instructions – Combine instructions – Simplify CFG – Merge BBs
  7. Related works (2/5) • Opaque predicates – LLVM cannot deal with them – Inserted conditions whose value is always true (or always false) • Theorem prover (SMT solver) – Proves the satisfiability/validity of a logical formula – Can generate a model if satisfiable • Generate logical formulas from LLVM IR
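
To make the opaque-predicate idea concrete, here is a minimal sketch that decides whether a branch condition is constant. Note the substitution: instead of an SMT solver it enumerates every 8-bit input, which only works for tiny bit widths. The function name and the example predicates are assumptions chosen for illustration.

```python
from typing import Callable, Optional

def classify_predicate(pred: Callable[[int], bool], bits: int = 8) -> Optional[bool]:
    """Return True/False if pred is constant over all inputs, else None.

    Brute-force stand-in for asking a solver whether pred (or its
    negation) is unsatisfiable.
    """
    results = {bool(pred(x)) for x in range(1 << bits)}
    return results.pop() if len(results) == 1 else None

# x * (x + 1) is a product of consecutive integers, hence always even.
always_true = lambda x: ((x * (x + 1)) & 0xFF) % 2 == 0
# A square is 0 or 1 modulo 4, never 2.
always_false = lambda x: ((x * x) & 0xFF) % 4 == 2
# A genuine, input-dependent condition.
real_branch = lambda x: x > 100

verdicts = [classify_predicate(p) for p in (always_true, always_false, real_branch)]
# verdicts == [True, False, None]
```

A branch classified as `True` or `False` can be replaced by an unconditional jump, after which the untaken side becomes unreachable code.
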
  8. Related works (3/5) KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs, Cristian Cadar, Daniel Dunbar, Dawson Engler, OSDI 2008 • Source → LLVM bitcode • Branch recording – Obtain the current path condition using a theorem prover (STP solver) – Execute every branch
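
The branch-recording idea can be sketched on a toy program: follow both sides of every branch, record the conditions taken so far as the path condition, and ask a solver for an input that satisfies it. This is not KLEE's implementation. The decision-tree encoding, the single 8-bit input, and the exhaustive search standing in for STP are all assumptions of this sketch.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple, Union

Cond = Callable[[int], bool]
# A node is either a leaf label or (description, condition, then_node, else_node).
Node = Union[str, Tuple[str, Cond, "Node", "Node"]]
Path = Tuple[Tuple[str, Cond, bool], ...]

def explore(node: Node, path: Path = ()) -> Iterator[Tuple[Path, str]]:
    """Yield (path_condition, leaf) for every path, taking both branch sides."""
    if isinstance(node, str):
        yield path, node
        return
    desc, cond, then_node, else_node = node
    yield from explore(then_node, path + ((desc, cond, True),))
    yield from explore(else_node, path + ((desc, cond, False),))

def solve(path: Path, bits: int = 8) -> Optional[int]:
    """Stand-in for a solver: smallest input satisfying the path condition."""
    for x in range(1 << bits):
        if all(cond(x) == taken for _, cond, taken in path):
            return x
    return None  # path condition is unsatisfiable

def generate_tests(program: Node) -> Dict[str, Optional[int]]:
    return {leaf: solve(path) for path, leaf in explore(program)}

program: Node = (
    "x > 10", lambda x: x > 10,
    ("x % 2 == 0", lambda x: x % 2 == 0, "A", "B"),
    ("x == 200", lambda x: x == 200, "unreachable", "C"),
)
tests = generate_tests(program)
# tests == {"A": 12, "B": 11, "unreachable": None, "C": 0}
```

The `None` entry corresponds to an infeasible path: no input is both at most 10 and equal to 200, which is exactly the kind of dead branch a deobfuscator wants to discard.
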
  9. Related works (4/5) QEMU, a Fast and Portable Dynamic Translator, Fabrice Bellard, USENIX, 2005 • Dynamic binary translation • Dynamic translator (now Tiny Code Generator) – Target code → IR → host code – Disassembler – Micro-operations – Mapping • Similar to LLVM :)
  10. Related works (5/5) Dynamically Translating x86 to LLVM using QEMU, Vitaly Chipounov, George Candea, 2010 • QEMU backend for LLVM ≒ x86 frontend for LLVM • LLVM Code Dictionary instead of Host Code Dictionary – Referred to when mapping
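
The dictionary lookup during mapping can be pictured roughly as follows. This is a loose sketch, not the paper's design: the micro-op names, the field names, and the use of textual templates are hypothetical, and real QEMU micro-operations differ. Only the emitted lines follow actual LLVM IR syntax.

```python
from typing import Dict, List, Tuple

# Hypothetical micro-op names mapped to LLVM IR text templates.
LLVM_DICTIONARY: Dict[str, str] = {
    "movi": "{dst} = add i32 0, {imm}",
    "add": "{dst} = add i32 {a}, {b}",
    "xor": "{dst} = xor i32 {a}, {b}",
}

MicroOp = Tuple[str, Dict[str, object]]

def translate(micro_ops: List[MicroOp]) -> List[str]:
    """Look each micro-op up in the dictionary and fill in its operands."""
    lines = []
    for name, fields in micro_ops:
        template = LLVM_DICTIONARY[name]  # KeyError means an unmapped micro-op
        lines.append(template.format(**fields))
    return lines

ir = translate([
    ("movi", {"dst": "%t0", "imm": 5}),
    ("add", {"dst": "%t1", "a": "%t0", "b": "%eax"}),
])
# ir == ["%t0 = add i32 0, 5", "%t1 = add i32 %t0, %eax"]
```

Swapping the dictionary's values from host machine code to LLVM IR is what turns the translator's backend into something that behaves like an x86 frontend for LLVM.
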
  11. Approach • Previously I thought of using the IDA disassembler... • Writing an x86 frontend from scratch: aborted... • Modify the QEMU binary translator • Generate LLVM IR • Optimize! • Symbolic execution?