WIP Presentation

Yuma Kurogome

November 16, 2013

Transcript

  1. Background
     • Analysis of malware is becoming difficult
       – APT
       – Kernel rootkit
       – Code obfuscation
     • Many obfuscation tools/methods
     • No good deobfuscation tool available
  2. LLVM
     • Compiler infrastructure written in C++
     • Provides many ways to optimize code
     • Frontend → Middle-end → Backend
  3. LLVM
     • Frontend
       – Generate parse tree
       – Generate LLVM IR
     • Middle-end
       – Optimize LLVM IR
       – Many methods available
     • Backend
       – Generate x86 code from optimized LLVM IR
     • What if there were an x86 frontend...
  4. Related works (1/5)
     Dagger: decompilation to LLVM IR, Ahmed Bougacha, Presentation, 2013 European LLVM Conference, Apr 2013
     • LLVM IR decompiler
     • Focuses on the semantic gaps between x86 and LLVM IR
       – LLVM IR is designed in Static Single Assignment (SSA) form
     • Binary → Mir (own IR) → LLVM IR
     • Virtual Operand Expansion
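The SSA gap mentioned on this slide can be illustrated with a toy renaming pass. This is only a sketch for straight-line code: the tuple format and the `to_ssa` name are assumptions made for illustration and are not taken from Dagger.

```python
from collections import defaultdict

def to_ssa(instructions):
    """Rename registers in straight-line code so each name is assigned once.

    Each instruction is (dest, op, src1, src2); a source is either a
    register name (str) or an integer constant.
    """
    version = defaultdict(int)   # number of writes seen per register
    current = {}                 # register -> its latest SSA name
    result = []
    for dest, op, a, b in instructions:
        # Sources read the most recent version; unwritten registers are inputs.
        a = current.get(a, a) if isinstance(a, str) else a
        b = current.get(b, b) if isinstance(b, str) else b
        version[dest] += 1
        new_name = f"{dest}.{version[dest]}"
        current[dest] = new_name
        result.append((new_name, op, a, b))
    return result

# x86-style code keeps overwriting eax; SSA gives every write a fresh name.
code = [
    ("eax", "add", "ebx", 1),
    ("eax", "mul", "eax", 2),
    ("ecx", "add", "eax", "ebx"),
]
ssa = to_ssa(code)
```

Code with branches would also need phi nodes where control flow merges, which this sketch leaves out.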
  5. Related works (2/5)
     OptiCode: Machine Code Deobfuscation for Malware Analysis, Nguyen Anh Quynh, Presentation, SyScan SG, Apr 2013
     • Same motivation
     • Supports many obfuscation techniques
       – Inserted dead instructions
       – Inserted NOP-semantic instructions
       – Inserted unreachable code
       – Inserted branch instructions to the next instruction
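Two of the techniques listed above (NOP-semantic instructions and branches to the next instruction) can be removed with a simple peephole pass. The sketch below is not OptiCode's method; the `(address, mnemonic, operands)` format is hypothetical, and 32-bit x86 is assumed so that `mov reg, reg` on the same register has no effect.

```python
def strip_junk(instructions):
    """Drop NOP-semantic instructions and jumps to the next instruction.

    Each instruction is (address, mnemonic, operands), where operands is a
    tuple of register names or, for jmp, a one-element tuple with the target.
    """
    cleaned = []
    for i, (addr, mnemonic, operands) in enumerate(instructions):
        next_addr = instructions[i + 1][0] if i + 1 < len(instructions) else None
        # NOP-semantic: no effect on registers, flags, or memory.
        if mnemonic == "nop":
            continue
        if mnemonic in ("mov", "xchg") and len(operands) == 2 and operands[0] == operands[1]:
            continue
        # An unconditional jump to the following instruction changes nothing.
        if mnemonic == "jmp" and operands == (next_addr,):
            continue
        cleaned.append((addr, mnemonic, operands))
    return cleaned

program = [
    (0x00, "mov", ("eax", "ebx")),
    (0x02, "nop", ()),
    (0x03, "mov", ("ecx", "ecx")),
    (0x05, "jmp", (0x07,)),
    (0x07, "add", ("eax", "ecx")),
]
cleaned = strip_junk(program)
```

This is a single pass over a flat list; dead instructions and unreachable code need data-flow and control-flow information, which is where an optimizer comes in.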
  6. Related works (2/5)
     • Own x86 frontend (details unknown) and default LLVM optimizer
       – Generate a control flow graph (CFG) consisting of basic blocks (BB) from machine code
       – Constant folding
       – Eliminate dead store instructions
       – Combine instructions
       – Simplify CFG
       – Merge BB
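To show how two of these passes undo inserted junk, here is a minimal sketch on a toy three-address IR. It is not LLVM's implementation: the IR format and function names are hypothetical, only straight-line side-effect-free code is handled, and dead assignments to variables stand in for dead stores to memory.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def fold_constants(instructions):
    """Forward pass: evaluate operations whose operands are all known constants."""
    known = {}       # variable -> constant value
    folded = []
    for dest, op, a, b in instructions:
        a = known.get(a, a)
        b = known.get(b, b)
        if op == "const":
            known[dest] = a
            folded.append((dest, "const", a, None))
        elif isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)
            folded.append((dest, "const", known[dest], None))
        else:
            known.pop(dest, None)    # dest no longer holds a known constant
            folded.append((dest, op, a, b))
    return folded

def eliminate_dead_stores(instructions, live_out):
    """Backward pass: drop assignments whose result is never read afterwards."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(instructions):
        if dest not in live:
            continue                 # overwritten or never used
        live.discard(dest)
        live.update(x for x in (a, b) if isinstance(x, str))
        kept.append((dest, op, a, b))
    kept.reverse()
    return kept

junk = [
    ("a", "const", 6, None),
    ("b", "const", 7, None),
    ("c", "mul", "a", "b"),      # foldable: 6 * 7
    ("d", "add", "c", "x"),      # depends on the unknown input x
    ("c", "sub", "c", "c"),      # dead: c is not read again
]
optimized = eliminate_dead_stores(fold_constants(junk), live_out={"d"})
```

Five instructions collapse to the single one that contributes to the live result `d`.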
  7. Related works (2/5)
     • Opaque predicate
       – The LLVM optimizer cannot deal with it
       – An inserted condition whose value is always true (or always false)
     • Theorem prover (SMT solver)
       – Proves the satisfiability/validity of a logical formula
       – Can generate a model if the formula is satisfiable
     • Generate a logical formula from LLVM IR
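The check a solver performs on an opaque predicate can be sketched without a solver by enumerating every input of a small bit width. Exhaustive enumeration here is a stand-in for the SMT solver, feasible only for tiny widths, and the function names are hypothetical.

```python
def classify_predicate(predicate, bits=8):
    """Classify a one-input predicate by trying every value of the given width."""
    outcomes = {bool(predicate(x)) for x in range(1 << bits)}
    if outcomes == {True}:
        return "always true"       # valid: the false branch is dead
    if outcomes == {False}:
        return "always false"      # unsatisfiable: the true branch is dead
    return "input dependent"

def find_model(predicate, bits=8):
    """Return one input that satisfies the predicate, or None if there is none."""
    for x in range(1 << bits):
        if predicate(x):
            return x
    return None

# x * (x + 1) is a product of consecutive integers, so it is always even.
opaque_true = classify_predicate(lambda x: (x * (x + 1)) % 2 == 0)
# A square is 0 or 1 modulo 4, so this can never hold.
opaque_false = classify_predicate(lambda x: (x * x) % 4 == 2)
# A genuine branch depends on the input.
real_branch = classify_predicate(lambda x: x > 100)
```

An SMT solver reaches the same verdicts symbolically, which is what makes the approach usable for 32-bit and 64-bit values.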
  8. Related works (3/5)
     KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs, Cristian Cadar, Daniel Dunbar, Dawson Engler, OSDI 2008
     • Source → LLVM bitcode
     • Branch recording
       – Obtain the current path condition using a theorem prover (STP solver)
       – Execute every branch
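Branch recording can be sketched as a walk over a tree of branches that collects the condition for each path and then asks for an input satisfying it. This is a toy, not KLEE's design: the tree encoding is hypothetical, a brute-force search over 0..255 replaces STP, and infeasible paths are detected after the walk rather than pruned at the fork.

```python
def explore(node, path=()):
    """Yield (label, path_condition) for every path through a branch tree.

    A node is ("leaf", label) or ("branch", name, test, then_node, else_node),
    where test is a Python predicate over the single input x.
    """
    if node[0] == "leaf":
        yield node[1], path
        return
    _, name, test, then_node, else_node = node
    # Fork: follow both outcomes and record which one was taken.
    yield from explore(then_node, path + ((name, test, True),))
    yield from explore(else_node, path + ((name, test, False),))

def solve(path, domain=range(256)):
    """Return an input satisfying every recorded condition, or None."""
    for x in domain:
        if all(bool(test(x)) == taken for _, test, taken in path):
            return x
    return None

program = (
    "branch", "x > 10", lambda x: x > 10,
    ("branch", "x < 5", lambda x: x < 5,
        ("leaf", "unreachable"),
        ("leaf", "big")),
    ("leaf", "small"),
)

# One concrete test input per path, or None when the path is infeasible.
test_inputs = {label: solve(path) for label, path in explore(program)}
```

The path that requires both `x > 10` and `x < 5` has no satisfying input, which is how unreachable code shows up in this setting.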
  9. Related works (4/5)
     QEMU, a Fast and Portable Dynamic Translator, Fabrice Bellard, USENIX, 2005
     • Dynamic binary translation
     • Dynamic translator (now the Tiny Code Generator)
       – Target code → IR → host code
       – Disassembler
       – Micro-operations
       – Mapping
     • Similar to LLVM :)
  10. Related works (5/5)
     Dynamically Translating x86 to LLVM using QEMU, Vitaly Chipounov, George Candea, 2010
     • QEMU backend for LLVM ≒ x86 frontend for LLVM
     • LLVM Code Dictionary instead of Host Code Dictionary
       – Referred to when mapping
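The dictionary idea can be sketched as a table lookup: each guest instruction is lowered to a micro-op, and each micro-op is looked up in a dictionary that holds LLVM IR text instead of host machine code. The micro-op names, the three guest instructions, and the templates below are hypothetical and far simpler than what QEMU or the cited work uses.

```python
from itertools import count

# Hypothetical micro-ops for three guest instructions: (operation, left, right).
# "reg" stands for the instruction's register operand.
MICRO_OPS = {
    "inc": ("add", "reg", "1"),
    "dec": ("sub", "reg", "1"),
    "neg": ("sub", "0", "reg"),
}

# The code dictionary: one LLVM IR template per micro-op.
LLVM_DICTIONARY = {
    "add": "{dst} = add i32 {a}, {b}",
    "sub": "{dst} = sub i32 {a}, {b}",
}

def translate(guest_code):
    """Translate (mnemonic, register) pairs into lines of LLVM IR text."""
    fresh = count(1)
    current = {}     # guest register -> SSA value currently holding it
    lines = []
    for mnemonic, reg in guest_code:
        op, left, right = MICRO_OPS[mnemonic]
        value = current.get(reg, f"%{reg}")   # first use reads the input value
        a = value if left == "reg" else left
        b = value if right == "reg" else right
        dst = f"%t{next(fresh)}"              # fresh name keeps the output in SSA form
        lines.append(LLVM_DICTIONARY[op].format(dst=dst, a=a, b=b))
        current[reg] = dst
    return lines

ir = translate([("inc", "eax"), ("neg", "eax"), ("dec", "ebx")])
```

Swapping `LLVM_DICTIONARY` for a table of host instructions would give the ordinary translator; only the dictionary changes, not the mapping step.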
  11. Approach
     • Previously I considered using the IDA disassembler...
     • Writing an x86 frontend from scratch was abandoned...
     • Modify the QEMU binary translator
     • Generate LLVM IR
     • Optimize!
     • Symbolic execution?