Slide 1

Slide 1 text

WIP Presentation
Using LLVM for malware deobfuscation
B1 Yuma Kurogome (@ntddk) a.k.a. gomachan
Supervisor: none

Slide 2

Slide 2 text

Contents
■ Background
■ Purpose
■ Related work
■ Approach
■ Implementation
■ Problem
■ Future work

Slide 3

Slide 3 text

Background
■ Malware analysis is becoming more difficult
  - APT
  - Botnets
  - Code obfuscation, etc.
■ Many obfuscation tools and methods exist
■ No good deobfuscation tool is available

Slide 4

Slide 4 text

Purpose
■ Build a useful deobfuscator
  - Use the code optimizer of LLVM
  - Implement an x86 frontend
    ➔ It is difficult to build an AST from x86 native code
[Diagram labels: x86 Frontend, x86]

Slide 5

Slide 5 text

Related work
OptiCode: Machine Code Deobfuscation for Malware Analysis, Nguyen Anh Quynh, Presentation, SyScan SG, Apr 2013
■ Supports many obfuscation techniques
  - Inserted dead instructions
  - Inserted NOP-semantic instructions
  - Inserted unreachable code
  - Inserted branch instructions that jump to the next instruction
■ Own x86 frontend (details unknown) and the default LLVM optimizer
  - Generate a control flow graph (CFG) consisting of basic blocks (BB) from machine code
  - Constant folding
  - Eliminate dead store instructions
  - Combine instructions
  - Simplify the CFG
  - Merge BBs
In this work, I aimed to reproduce OptiCode.
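The junk-insertion techniques listed above can be illustrated with a minimal sketch. This is not OptiCode's implementation (its frontend details are unknown); the tuple-based instruction format and the names `is_semantic_nop` and `strip_junk` are hypothetical, and the code assumes 32-bit x86 semantics and a list sorted by address.

```python
from typing import List, Tuple

# Hypothetical toy instruction: (address, mnemonic, operands).
Insn = Tuple[int, str, Tuple[str, ...]]


def is_semantic_nop(insn: Insn) -> bool:
    """Return True for instructions that change neither registers nor memory."""
    _, op, args = insn
    if op == "nop":
        return True
    # mov r, r and xchg r, r copy a register onto itself (32-bit assumption).
    if op in ("mov", "xchg") and len(args) == 2 and args[0] == args[1]:
        return True
    # lea r, [r] reloads a register with its own value.
    if op == "lea" and len(args) == 2 and args[1] == f"[{args[0]}]":
        return True
    return False


def strip_junk(code: List[Insn]) -> List[Insn]:
    """Drop NOP-semantic instructions and jumps to the next instruction."""
    kept = []
    for i, insn in enumerate(code):
        _, op, args = insn
        next_addr = code[i + 1][0] if i + 1 < len(code) else None
        if is_semantic_nop(insn):
            continue
        # A jmp whose target is the following instruction is a plain fall-through.
        if op == "jmp" and next_addr is not None and args == (hex(next_addr),):
            continue
        kept.append(insn)
    return kept
```

Arithmetic such as `add eax, 0` is deliberately not treated as a NOP here because it still updates the flags; a real tool would need flag liveness to remove it safely.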

Slide 6

Slide 6 text

Related work
Dynamically Translating x86 to LLVM using QEMU, Vitaly Chipounov, George Candea, 2010
■ QEMU has a dynamic translator (now the Tiny Code Generator)
  - Target code → IR → host code
  - Disassembler
  - Micro-operations
  - Mapping
■ Uses an LLVM code dictionary instead of the host code dictionary
  - Referred to when mapping

Slide 7

Slide 7 text

Approach
1. Read obfuscated code
2. Dynamic translation
3. LLVM bitcode
4. Generate BB and CFG
5. Optimize
6. Generate deobfuscated code
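Step 4 (generating BBs and a CFG) can be sketched with the textbook "leaders" algorithm. This is an illustration under stated assumptions, not the code used in this work: the instruction tuples, the `BRANCHES` set, and `build_cfg` are hypothetical, the input is assumed to be sorted by address, and every branch target is assumed to be an instruction in the list.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy instruction: (address, mnemonic, branch target or None).
Insn = Tuple[int, str, Optional[int]]

BRANCHES = {"jmp", "jz", "jnz"}


def build_cfg(code: List[Insn]) -> Tuple[Dict[int, List[Insn]], Dict[int, List[int]]]:
    """Split a linear instruction list into basic blocks and connect them."""
    addrs = [addr for addr, _, _ in code]

    # Leaders: the first instruction, every branch target, and every
    # instruction that follows a branch or a return.
    leaders = {addrs[0]}
    for i, (_, op, target) in enumerate(code):
        if op in BRANCHES or op == "ret":
            if target is not None:
                leaders.add(target)
            if i + 1 < len(code):
                leaders.add(addrs[i + 1])

    # Each leader starts a new block that runs until the next leader.
    blocks: Dict[int, List[Insn]] = {}
    current = addrs[0]
    for insn in code:
        if insn[0] in leaders:
            current = insn[0]
            blocks[current] = []
        blocks[current].append(insn)

    # Edges: the taken branch, plus fall-through unless the block
    # ends in an unconditional jmp or a ret.
    starts = sorted(blocks)
    edges: Dict[int, List[int]] = {}
    for n, start in enumerate(starts):
        _, op, target = blocks[start][-1]
        successors = []
        if op in BRANCHES and target is not None:
            successors.append(target)
        if op not in ("jmp", "ret") and n + 1 < len(starts):
            successors.append(starts[n + 1])
        edges[start] = successors
    return blocks, edges
```

`build_cfg` returns the blocks keyed by their start address and, for each block, the start addresses of its successors. Indirect jumps and calls are ignored in this sketch.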

Slide 8

Slide 8 text

Implementation
■ Modify the QEMU dynamic translator
  - Tiny Code Generator (TCG) ➔ BB
  - Easy to map registers to LLVM IR
  - Generate the CFG from the LLVMContext class
■ Use the LLVM optimizer (obfuscation technique ➔ passes that undo it)
  - Inserted dead code ➔ -dse, -simplifycfg
  - Substitution with equivalent instructions ➔ -constprop, -instcombine
  - Reordered instructions ➔ -instcombine
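To show why optimizer passes undo these obfuscations, here is a toy sketch of constant folding and dead-assignment elimination on straight-line code. It is only loosely analogous to what -constprop and dead-code removal do inside LLVM and does not call LLVM at all; the three-address tuple format, the `OPS` table, and both function names are hypothetical, and values are assumed to be 32-bit.

```python
from typing import Dict, List, Set, Tuple, Union

Operand = Union[int, str]
# Hypothetical toy three-address instruction: dest = a <op> b.
Insn = Tuple[str, str, Operand, Operand]

OPS = {
    "mov": lambda a, b: a,  # b is unused for mov
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "xor": lambda a, b: a ^ b,
}


def fold_constants(code: List[Insn]) -> List[Insn]:
    """Replace operations on known constants with a single mov of the result."""
    known: Dict[str, int] = {}
    out: List[Insn] = []
    for dest, op, a, b in code:
        # Substitute variables whose value is already known.
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if isinstance(a, int) and isinstance(b, int):
            value = OPS[op](a, b) & 0xFFFFFFFF  # 32-bit wraparound (assumption)
            known[dest] = value
            out.append((dest, "mov", value, 0))
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
            out.append((dest, op, a, b))
    return out


def drop_dead_assignments(code: List[Insn], live_out: Set[str]) -> List[Insn]:
    """Walk backwards and keep only assignments whose result is read later."""
    live = set(live_out)
    kept: List[Insn] = []
    for dest, op, a, b in reversed(code):
        if dest not in live:
            continue  # result is never read afterwards
        live.discard(dest)
        live.update(x for x in (a, b) if isinstance(x, str))
        kept.append((dest, op, a, b))
    kept.reverse()
    return kept
```

Running `fold_constants` first and then `drop_dead_assignments` with the set of registers that matter after the block collapses chains of junk arithmetic into the few assignments that actually affect the result.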

Slide 9

Slide 9 text

Problem
■ The obfuscation methods covered by OptiCode can be deobfuscated
  - Except opaque predicates
However,
■ The QEMU dynamic translator has problems
  - Dependence on context
  - Cannot interpret the Win32 API
  - Overhead
■ Optimice is more sophisticated than my work
  - Deobfuscation plugin for IDA
  - Uses the CFG and BBs generated by IDA
  - Overcomes the problems of my work
■ The evaluation method is ambiguous...
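The opaque-predicate limitation can be made concrete with a small example. This is a generic textbook predicate, not one taken from any sample analyzed in this work; `opaque_true` and `obfuscated` are hypothetical names.

```python
def opaque_true(x: int) -> bool:
    """Always True: the product of two consecutive integers is even."""
    return (x * (x + 1)) % 2 == 0


def obfuscated(x: int) -> int:
    # The condition depends on the runtime value of x, so plain constant
    # folding cannot decide it, and the junk branch survives optimization.
    if opaque_true(x):
        return x + 1  # real code path
    return x - 1  # junk path, never executed
```

Removing the junk branch requires proving the predicate for every `x`, which calls for semantic reasoning rather than pattern-level optimization.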

Slide 10

Slide 10 text

Future work
■ Continue this research for TERM
  - How can we deobfuscate malware?
■ Establish an evaluation method
■ Bring in semantics
  - Abstract interpretation
  - Predicate logic
  - There is little existing research...