Slide 1

Slide 1 text

Interacting Software Debugging with Symbolic Execution Graduate student: 陳威伯 Advisor: Prof. 黃世昆

Slide 2

Slide 2 text

Outline • Motivation • Objective • Background • Design and Implementation • Evaluation • Conclusion • Future work

Slide 3

Slide 3 text

Motivation • Related work: Qira and Ponce • Qira-like debugging (without symbolic execution) • Ponce-like interactive symbolic execution (without scripting)

Slide 4

Slide 4 text

Qira: QEMU Interactive Runtime Analyser • A timeless debugger • Initially developed at Google by George Hotz, with work continued at CMU • Uses a patched QEMU to generate the trace • Records the state differences between assembly instructions • Communicates with the browser over a WebSocket, sending updated program information

Slide 5

Slide 5 text

Qira UI

Slide 6

Slide 6 text

Qira structure

Slide 7

Slide 7 text

Qira without symbolic execution • Planned to support symbolic execution: • https://github.com/BinaryAnalysisPlatform/qira/blob/master/tracers/angr/angr_trace.py • Symbolic execution is not implemented • The GitHub project is deprecated • Qira uses the basic tracing functionality of QEMU • QEMU argument: -d in_asm • http://www.droid-developers.org/wiki/QEMU

Slide 8

Slide 8 text

Ponce • Ponce is an IDA Pro plugin that lets users perform taint analysis and symbolic execution over binaries • GitHub: https://github.com/illera88/Ponce • Implemented with Triton

Slide 9

Slide 9 text

Ponce example

Slide 10

Slide 10 text

Ponce running example

Slide 11

Slide 11 text

Need something similar to Ponce for scripting • Provide symbolic execution functionality in a debugger • Integrate with exploit-generation scripts • GDB is chosen as the debugger in which to implement symbolic execution

Slide 12

Slide 12 text

Outline • Motivation • Objective • Background • Design and Implementation • Evaluation • Conclusion • Future work

Slide 13

Slide 13 text

Objective • Interactive debugger with symbolic execution • QEMU • Difficulties

Slide 14

Slide 14 text

Interactive debugger with symbolic execution • Continues Qira's idea of adding symbolic execution • Some experiments based on QEMU • Use the -d in_asm parameter to generate a trace • Feed the trace to the Triton engine

Slide 15

Slide 15 text

QEMU • QEMU is a generic, open source machine emulator and virtualizer • Two modes • System (target-softmmu) • User (target-linux-user) • We choose QEMU user mode • Targets • x86 • x86_64 • arm • … • Triton only supports x86 and x86_64

Slide 16

Slide 16 text

QEMU

Slide 17

Slide 17 text

QEMU -d in_asm

Slide 18

Slide 18 text

Difficulties • QEMU is used as a tracer to generate an assembly trace • Some instructions in the trace are not in a form that Triton accepts • QEMU prints operands as absolute addresses in the trace • Ex: call 0x4000805c00 • This does not work for some instructions • Some instructions need a relative address in the operand
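
The relative-operand issue can be illustrated with the x86 near call (opcode E8), whose operand is a 32-bit displacement measured from the end of the instruction rather than an absolute target. The sketch below is not SymGDB or QEMU code: `encode_call_rel32` and `decode_call_rel32` are hypothetical helpers, and the addresses are made up for illustration.

```python
import struct

CALL_REL32_LEN = 5  # 1 opcode byte (0xE8) + 4 displacement bytes


def encode_call_rel32(insn_addr, target):
    """Encode `call target`, placed at insn_addr, as E8 + rel32."""
    # The displacement is relative to the address of the NEXT instruction.
    disp = target - (insn_addr + CALL_REL32_LEN)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of rel32 range")
    return b"\xe8" + struct.pack("<i", disp)


def decode_call_rel32(insn_addr, code):
    """Recover the absolute target, i.e. what a trace would print."""
    if len(code) != CALL_REL32_LEN or code[0:1] != b"\xe8":
        raise ValueError("not a call rel32 encoding")
    (disp,) = struct.unpack("<i", code[1:])
    return insn_addr + CALL_REL32_LEN + disp


code = encode_call_rel32(0x400000, 0x400010)
print(code.hex())                              # e80b000000
print(hex(decode_call_rel32(0x400000, code)))  # 0x400010
```

The same absolute target yields different bytes at different instruction addresses, which is why a trace that only prints `call <absolute address>` has to be converted back before the bytes are handed to an engine that works on opcodes.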

Slide 19

Slide 19 text

Outline • Motivation • Objective • Background • Design and Implementation • Evaluation • Conclusion • Future work

Slide 20

Slide 20 text

Background • Symbolic execution • Triton • Triton Structure • Triton Tracer • AST representations • Static single assignment form • Symbolic execution engine • SMT solver Interface

Slide 21

Slide 21 text

Symbolic execution • Symbolic execution is a means of analyzing a program to determine what inputs cause each part of a program to execute • System-level • S2E • User-level • angr • Triton • Code-based • KLEE

Slide 22

Slide 22 text

Symbolic execution • [Figure: execution tree branching on the condition Z == 12, with leaves fail() and "OK"]

Slide 23

Slide 23 text

Triton • A dynamic binary analysis framework written in C++ • Developed by Jonathan Salwan • Triton components • Tracer • AST representations • Symbolic execution engine • SMT solver interface

Slide 24

Slide 24 text

Triton Structure

Slide 25

Slide 25 text

Triton Tracer • The tracer provides: • Current opcode executed • State context (registers and memory) • The control flow is translated into AST representations • Pin tracer support

Slide 26

Slide 26 text

AST representations • Triton converts the x86 and x86-64 instruction set semantics into AST representations • Triton's expressions are in SSA form • Instruction: add rax, rdx • Expression: ref!41 = (bvadd ((_ extract 63 0) ref!40) ((_ extract 63 0) ref!39)) • ref!41 is the new expression of the RAX register • ref!40 is the previous expression of the RAX register • ref!39 is the previous expression of the RDX register

Slide 27

Slide 27 text

AST representations • mov al, 1 • mov cl, 10 • mov dl, 20 • xor cl, dl • add al, cl
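
A minimal sketch of how these five instructions could be turned into SSA references, with a concrete value tracked next to each expression. This is not Triton's implementation: the `Emulator` class is hypothetical, only the destination register is modeled, and the reference numbers differ from what Triton would assign because Triton also creates expressions for flags and the program counter.

```python
MASK8 = 0xFF  # al, cl and dl are 8-bit registers


class Emulator:
    def __init__(self):
        self.exprs = []   # SMT-LIB-like text of each reference; index = ref id
        self.values = []  # concrete value of each reference
        self.regs = {}    # register name -> id of its latest reference

    def _new_ref(self, reg, text, value):
        self.exprs.append(text)
        self.values.append(value & MASK8)
        self.regs[reg] = len(self.exprs) - 1

    def mov_imm(self, dst, imm):
        self._new_ref(dst, "(_ bv%d 8)" % imm, imm)

    def _binop(self, smt_op, fn, dst, src):
        a, b = self.regs[dst], self.regs[src]
        text = "(%s ref!%d ref!%d)" % (smt_op, a, b)
        self._new_ref(dst, text, fn(self.values[a], self.values[b]))

    def xor(self, dst, src):
        self._binop("bvxor", lambda x, y: x ^ y, dst, src)

    def add(self, dst, src):
        self._binop("bvadd", lambda x, y: x + y, dst, src)


emu = Emulator()
emu.mov_imm("al", 1)
emu.mov_imm("cl", 10)
emu.mov_imm("dl", 20)
emu.xor("cl", "dl")
emu.add("al", "cl")

for ref_id, text in enumerate(emu.exprs):
    print("ref!%d = %s" % (ref_id, text))
print("al =", emu.values[emu.regs["al"]])  # al = 31
```

Each write to a register creates a fresh reference instead of overwriting the old one, so the final expression for al can be followed back through ref!3 to the two immediates loaded into cl and dl.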

Slide 28

Slide 28 text

Static single assignment form • Each variable is assigned exactly once • y := 1 • y := 2 • x := y • Turns into: • y1 := 1 • y2 := 2 • x1 := y2
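
The renaming on this slide can be sketched in a few lines. The function `to_ssa` is a hypothetical helper for straight-line code only; it does not handle branches, which would need φ-functions at join points.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def to_ssa(assignments):
    """Rename a straight-line list of (target, expression) pairs into SSA form."""
    version = {}  # variable name -> number of assignments seen so far
    out = []
    for target, expr in assignments:
        # A use on the right-hand side refers to the latest version of the variable.
        def rename(match):
            name = match.group(0)
            return "%s%d" % (name, version[name]) if name in version else name

        renamed = IDENT.sub(rename, expr)
        # Each assignment creates a new version of its target.
        version[target] = version.get(target, 0) + 1
        out.append(("%s%d" % (target, version[target]), renamed))
    return out


program = [("y", "1"), ("y", "2"), ("x", "y")]
for target, expr in to_ssa(program):
    print("%s := %s" % (target, expr))
```

For the three assignments on the slide this prints y1 := 1, y2 := 2 and x1 := y2, one per line.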

Slide 29

Slide 29 text

Symbolic execution engine • The symbolic engine maintains: • a table of symbolic register states • a map of symbolic memory states • a global set of all symbolic references
Step | Register | Instruction | Set of symbolic expressions
init | eax = UNSET | None | ⊥
1 | eax = φ1 | mov eax, 0 | {φ1=0}
2 | eax = φ2 | inc eax | {φ1=0,φ2=φ1+1}
3 | eax = φ3 | add eax, 5 | {φ1=0,φ2=φ1+1,φ3=φ2+5}
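
The bookkeeping in this table can be sketched as a register table plus a growing set of expressions. The `SymbolicEngine` class below is a hypothetical illustration, not Triton's engine; it covers registers only, and `phi1` in the output stands for φ1 on the slide.

```python
class SymbolicEngine:
    def __init__(self):
        self.registers = {}    # register name -> id of the expression it holds
        self.expressions = {}  # id -> expression text (the global set of references)

    def _assign(self, reg, text):
        new_id = len(self.expressions) + 1
        self.expressions[new_id] = text
        self.registers[reg] = new_id
        return new_id

    def mov_imm(self, reg, imm):
        return self._assign(reg, str(imm))

    def inc(self, reg):
        return self._assign(reg, "phi%d+1" % self.registers[reg])

    def add_imm(self, reg, imm):
        return self._assign(reg, "phi%d+%d" % (self.registers[reg], imm))

    def dump(self):
        items = sorted(self.expressions.items())
        return "{" + ",".join("phi%d=%s" % (i, e) for i, e in items) + "}"


engine = SymbolicEngine()
engine.mov_imm("eax", 0)
engine.inc("eax")
engine.add_imm("eax", 5)
print("eax = phi%d" % engine.registers["eax"])  # eax = phi3
print(engine.dump())                            # {phi1=0,phi2=phi1+1,phi3=phi2+5}
```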

Slide 30

Slide 30 text

SMT solver Interface

Slide 31

Slide 31 text

Outline • Motivation • Objective • Background • Design and Implementation • Evaluation • Conclusion • Future work

Slide 32

Slide 32 text

Design and Implementation • Symbolic Support for GDB (SymGDB) • SymGDB System Structure • Implementation of System Internals • Relationship between SymGDB classes • Supported Commands • Symbolic Execution Process in GDB • Symbolic Environment • symbolic argv

Slide 33

Slide 33 text

Symbolic Support for GDB (SymGDB) • Uses the Python API for GDB • https://sourceware.org/gdb/onlinedocs/gdb/Python-API.html • Source the Python script in .gdbinit • Get the debugged program's state by calling the Python API • Pass the current program state to Triton • Set the symbolic variables • Set the target address • Run symbolic execution and get the output • Inject the result back into the debugged program's state

Slide 34

Slide 34 text

SymGDB System Structure

Slide 35

Slide 35 text

Implementation of System Internals • Three classes in SymGDB • Arch(), GdbUtil(), Symbolic() • Arch() • Provides the pointer size and register names for each architecture • GdbUtil() • Read and write memory, read and write registers • Get the memory mapping of the program • Get the filename and detect the architecture • Get the argument list • Symbolic() • Set a constraint on the pc register • Run symbolic execution

Slide 36

Slide 36 text

Relationship between SymGDB classes

Slide 37

Slide 37 text

Supported Commands • Each command inherits from the gdb.Command class • Make data symbolic • symbolize argv • symbolize memory [address] [size] • Set target address • target [address] • Run symbolic execution • triton
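
A minimal sketch of how one of these commands could be registered through `gdb.Command`. The class body and the `parse_symbolize` helper are assumptions made for illustration, not SymGDB's source. The `gdb` module only exists inside a GDB process, so the import is guarded to let the argument parsing be tried in a plain Python interpreter.

```python
try:
    import gdb  # only importable inside a GDB process
except ImportError:
    gdb = None  # lets the argument parsing below run outside GDB


def parse_symbolize(arg):
    """Parse `argv` or `memory [address] [size]` into a tuple."""
    words = arg.split()
    if words == ["argv"]:
        return ("argv",)
    if len(words) == 3 and words[0] == "memory":
        # Base 0 accepts both decimal and 0x-prefixed hexadecimal numbers.
        return ("memory", int(words[1], 0), int(words[2], 0))
    raise ValueError("usage: symbolize argv | symbolize memory [address] [size]")


if gdb is not None:
    class Symbolize(gdb.Command):
        """Hypothetical re-creation of the `symbolize` command."""

        def __init__(self):
            super(Symbolize, self).__init__("symbolize", gdb.COMMAND_DATA)

        def invoke(self, arg, from_tty):
            try:
                request = parse_symbolize(arg)
            except ValueError as err:
                raise gdb.GdbError(str(err))
            # A real plugin would record the request for the symbolic engine here.
            print("would symbolize: %r" % (request,))

    Symbolize()  # instantiating the class registers the command with GDB

print(parse_symbolize("memory 0x7ffe1000 8"))
```

Sourcing such a file from .gdbinit makes `symbolize argv` and `symbolize memory 0x7ffe1000 8` available at the GDB prompt.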

Slide 38

Slide 38 text

Symbolic Execution Process in GDB • gdb.execute("info registers", to_string=True) to get the registers • gdb.selected_inferior().read_memory(address, length) to get memory • setConcreteMemoryAreaValue and setConcreteRegisterValue to set the Triton state • At each instruction, use isRegisterSymbolized to check whether the pc register is symbolized • Set the target address as the constraint • Call getModel to get the answer • gdb.selected_inferior().write_memory(address, buf, length) to inject the answer back into the debugged program's state
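
The first step, turning the text returned by `gdb.execute("info registers", to_string=True)` into values, can be sketched as below. `parse_info_registers` is a hypothetical helper and `SAMPLE` is illustrative text in the usual `name  hex  natural` layout, not captured output; inside GDB the text would come from the `gdb.execute` call instead.

```python
import re

# Register name, whitespace, then the raw value in hexadecimal.
REG_LINE = re.compile(r"^(\w+)\s+(0x[0-9a-fA-F]+)\b")


def parse_info_registers(text):
    """Map register names to integers; lines without a hex column are skipped."""
    regs = {}
    for line in text.splitlines():
        match = REG_LINE.match(line)
        if match:
            regs[match.group(1)] = int(match.group(2), 16)
    return regs


# Illustrative sample only (assumed layout, shortened to three registers).
SAMPLE = """\
rax            0x1c                28
rip            0x4005b4            0x4005b4 <main+4>
eflags         0x246               [ PF ZF IF ]
"""

registers = parse_info_registers(SAMPLE)
for name in sorted(registers):
    print("%-6s 0x%x" % (name, registers[name]))
```

Each parsed value would then be handed to the engine through the register-setting call named on the slide.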

Slide 39

Slide 39 text

Symbolic Environment: symbolic argv • Use "info proc all" to get the stack start address • Examine the memory content from the stack start address
Stack entry | Meaning
argc | argument counter (integer)
argv[0] | program name (pointer)
argv[1] … argv[argc-1] | program args (pointers)
null | end of args (integer)
env[0] … env[n] | environment variables (pointers)
null | end of environment (integer)
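
Walking this layout can be sketched with `struct`. The function `parse_initial_stack` is hypothetical and assumes a little-endian target, a buffer that begins exactly at the argc slot, and one pointer-sized slot per entry; the addresses in the sample buffer are made up.

```python
import struct


def parse_initial_stack(buf, pointer_size=8):
    """Return (argc, argv pointers, env pointers) from raw stack bytes."""
    fmt = "<Q" if pointer_size == 8 else "<I"

    def word(index):
        return struct.unpack_from(fmt, buf, index * pointer_size)[0]

    argc = word(0)
    argv = [word(1 + i) for i in range(argc)]
    if word(1 + argc) != 0:
        raise ValueError("argv is not null-terminated where argc says it ends")
    env = []
    index = 2 + argc  # first slot after the null that ends argv
    while word(index) != 0:
        env.append(word(index))
        index += 1
    return argc, argv, env


# Synthetic 64-bit stack: argc=2, two argv pointers, null, two env pointers, null.
stack = struct.pack("<7Q", 2, 0x7ffe1000, 0x7ffe1008, 0, 0x7ffe1010, 0x7ffe1020, 0)
argc, argv, env = parse_initial_stack(stack)
print(argc, [hex(p) for p in argv], [hex(p) for p in env])
```

Inside GDB the buffer would come from reading inferior memory at the located stack address; the string that argv[1] points to is the region to make symbolic.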

Slide 40

Slide 40 text

Outline • Motivation • Objective • Background • Design and Implementation • Evaluation • Conclusion • Future work

Slide 41

Slide 41 text

Evaluations • Examples • crackme hash • crackme xor • GDB commands • Combined with Peda • Comparisons with triton • Comparisons with Ponce

Slide 42

Slide 42 text

crackme hash • Source: https://github.com/illera88/Ponce/blob/master/examples/crackme_hash.cpp • The program passes argv[1] to the check function • In the check function, argv[1] is XORed with serial (a fixed string) • If the sum of the XORed results equals 0xABCD • print "Win" • else • print "fail"

Slide 43

Slide 43 text

crackme hash

Slide 44

Slide 44 text

crackme hash

Slide 45

Slide 45 text

crackme hash

Slide 46

Slide 46 text

crackme xor • Source: https://github.com/illera88/Ponce/blob/master/examples/crackme_xor.cpp • The program passes argv[1] to the check function • In the check function, each byte of argv[1] is XORed with 0x55 • If the XORed result does not equal the corresponding byte of serial (a fixed string) • return 1 • print "fail" • else • go to the next loop iteration • If the program goes through the whole loop • return 0 • print "Win"
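
The check described on this slide can be modeled in a few lines to show why each loop iteration yields a solvable constraint on one input byte. This is a sketch of the slide's description only, not a port of crackme_xor.cpp: `SERIAL` is a hypothetical value chosen for the example, and the real program may transform the input differently.

```python
KEY = 0x55

# Hypothetical serial, NOT the one in crackme_xor.cpp.
SERIAL = bytes(c ^ KEY for c in b"hello")


def check(candidate):
    """Return 0 if every byte matches, 1 at the first mismatch."""
    if len(candidate) < len(SERIAL):
        return 1
    for i in range(len(SERIAL)):
        if candidate[i] ^ KEY != SERIAL[i]:
            return 1  # the "fail" path
    return 0  # the "Win" path


def solve():
    """XOR is its own inverse, so each constraint pins down exactly one byte."""
    return bytes(b ^ KEY for b in SERIAL)


answer = solve()
print(answer, "Win" if check(answer) == 0 else "fail")
```

A symbolic engine reaches the same answer without knowing the inverse in advance: it collects one branch condition per iteration and asks the solver for a model that satisfies all of them.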

Slide 47

Slide 47 text

crackme xor

Slide 48

Slide 48 text

crackme xor

Slide 49

Slide 49 text

crackme xor

Slide 50

Slide 50 text

GDB commands

Slide 51

Slide 51 text

GDB commands

Slide 52

Slide 52 text

Combined with Peda • Same demo video as crackme hash • Use find (a peda command) to find the address of argv[1] • Use symbolize memory argv[1]_address argv[1]_length to symbolize the argv[1] memory

Slide 53

Slide 53 text

Combined with Peda

Slide 54

Slide 54 text

Comparisons with triton • A pre-written Triton script needs 100+ lines for similar functionality • Execution cannot be stopped at an arbitrary point while the script runs • A pre-written Triton script needs the following steps: • Load binary • Initialize registers • Symbolize memory • Define examination point • Define constraint • In SymGDB, these steps are simplified by the information GDB provides

Slide 55

Slide 55 text

Comparisons with triton
Item | Triton | SymGDB
Pre-written script lines | 100+ | 1-10+
Load binary | 10+ lines | Automatically
Initialize registers | 2-16 lines | Automatically
Symbolize memory | 10-30 lines | Symbolize command
Define examination point | Yes | Automatically
Define constraint | 10+ lines | Using pc register instead

Slide 56

Slide 56 text

Comparisons with Ponce • Unlike Ponce, SymGDB can restart from a breakpoint • Due to a limitation of Ponce, it can start symbolic execution from a breakpoint only once • SymGDB can be combined with GDB commands to provide scripting functionality • SymGDB works with peda and other powerful GDB plugins

Slide 57

Slide 57 text

Comparisons with Ponce
Feature | Ponce | SymGDB
Restart from breakpoint | No | Yes
Scripting interface | No | Yes
Command line interface | No | Yes
Integration with peda | No | Yes

Slide 58

Slide 58 text

Outline • Motivation • Objective • Background • Design and Implementation • Evaluation • Conclusion • Future work

Slide 59

Slide 59 text

Conclusions • Symbolic Execution Support for Software Debuggers • First GDB Integration with Symbolic Execution • With the Triton Symbolic Execution Engine • Integration with Other Exploit Development Tools • With Flexibility to Interface with Peda and Pwntools • Scripting and Restart Execution Support

Slide 60

Slide 60 text

Outline • Motivation • Objective • Background • Design and Implementation • Evaluation • Conclusion • Future work

Slide 61

Slide 61 text

Future work • Triton only supports Python 2 • However, GDB is shipped with Python 3 by default • GDB must be recompiled to use the SymGDB plugin • Try to integrate with Pwntools • The current Pwntools codebase has some problems with GDB

Slide 62

Slide 62 text

Q & A

Slide 63

Slide 63 text

Thank you