Slide 1

Slide 1 text

Introduction to Fuzzing Ren Kimura contact@ricsec.co.jp https://ricsec.co.jp

Slide 2

Slide 2 text

Ren Kimura
Ricerca Security, Inc. CEO
twitter: @RKX1209 mail: renk@ricsec.co.jp
CVE-2019-14247, CVE-2019-14248, CVE-2019-14249, CVE-2019-14250, CVE-2019-16161, CVE-2019-16162, CVE-2019-16163, CVE-2019-16164, CVE-2019-16165, CVE-2019-16166, CVE-2019-16167, CVE-2019-19725

Slide 3

Slide 3 text

Background
Software components are becoming larger and more complicated. → Many people are trying to automate the analysis process.
In 2016, DARPA hosted the Cyber Grand Challenge (CGC), a fully automated hacking challenge. → Almost all winners used fuzzing in the vulnerability detection process.

Slide 4

Slide 4 text

How to automate hacking?
Vulnerability Finding (“Fuzzing”, “Static Analysis”, “Symbolic Execution”) → Crash Inputs
Triage / Evaluation (“Triage”, “Exploitability”, “Bug Reproduction”) → Triaged / PoC
Exploit Generation (“Automatic Exploit Generation” (AEG)) → Exploit Code

Slide 5

Slide 5 text

Mutation based Black box Fuzzing Vulnerability Finding

Slide 6

Slide 6 text

Black-box Fuzzing (Synopsys Defensics, zzuf)
Black-box fuzzers randomly generate a large number of inputs and execute the program with them.
Example inputs: “a” “kqeqert” “G\x13\x02” “iohbpofi9qnpiof” “3i129074g”
They do not observe program behavior, so they learn nothing and continue dumb generation forever.

Slide 7

Slide 7 text

Black-box Fuzzing
[Diagram: Fuzzer (Initial Seed, Mutator) → Target Program (PUT)]
The fuzzer mutates the initial seed, generates inputs, and executes the program with them.
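
The black-box loop can be sketched in a few lines of Python. This is a minimal illustration, not how zzuf or Defensics are implemented: a Python callable stands in for the PUT, and `mutate`, `toy_target`, and `blackbox_fuzz` are hypothetical names chosen for this example.

```python
import random

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte of the seed with a random value."""
    if not seed:
        return bytes([rng.randrange(256)])
    data = bytearray(seed)
    data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)

def toy_target(data: bytes) -> None:
    """Hypothetical PUT: 'crashes' (raises) when the first byte is 'G'."""
    if data[:1] == b"G":
        raise RuntimeError("crash")

def blackbox_fuzz(target, seed: bytes, iterations: int, rng: random.Random) -> list:
    """Always mutate the same initial seed; nothing is learned from the target."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)  # keep the crashing input
    return crashes

crashes = blackbox_fuzz(toy_target, b"a", 5000, random.Random(0))
```

Every iteration starts again from the same seed, so an input that got closer to a bug is never reused.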

Slide 8

Slide 8 text

Input Generation (Defensics, zzuf)
They mutate an initial seed (Initial.png, Initial.jpeg, ...) to generate new random inputs.
[Diagram: Initial Seed → Mutation → ・・・]

Slide 9

Slide 9 text

Bitflip mutation
Flip the n-th bit of the input. Many fuzzers (AFL, zzuf, ...) use it.
0 0 1 0 1 0 0 1 → BitFlip → 0 0 1 0 1 1 0 1
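
A minimal sketch of a bitflip in Python. The function name `bitflip` and the convention that bit 0 is the most significant bit of the first byte are assumptions made for this example.

```python
def bitflip(data: bytes, n: int) -> bytes:
    """Return a copy of data with the n-th bit flipped.

    Bit 0 is taken to be the most significant bit of byte 0 (an assumption).
    """
    out = bytearray(data)
    out[n // 8] ^= 0x80 >> (n % 8)
    return bytes(out)

# Flip bit 5 of the byte 00101001, giving 00101101.
flipped = bitflip(bytes([0b00101001]), 5)
```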

Slide 10

Slide 10 text

Byteflip mutation
Flip the n-th byte of the input (invert all 8 bits). Many fuzzers (AFL, zzuf, ...) use it.
0 0 1 0 1 0 0 1 0 0 1 0 1 1 0 1 → Byteflip → 1 1 0 1 0 1 1 0
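
The same idea at byte granularity, as a minimal Python sketch. `byteflip` is a hypothetical helper name.

```python
def byteflip(data: bytes, n: int) -> bytes:
    """Return a copy of data with every bit of the n-th byte inverted."""
    out = bytearray(data)
    out[n] ^= 0xFF
    return bytes(out)

# Inverting 00101001 gives 11010110; the second byte is left untouched.
flipped = byteflip(bytes([0b00101001, 0b00101101]), 0)
```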

Slide 11

Slide 11 text

Arithmetic mutation
Add, subtract, multiply, or divide the n-th byte by an extreme integer (256, U32_MAX, U32_MIN).
0 0 1 0 1 0 0 1 0 0 1 0 1 1 0 1 1 1 0 1 0 1 1 0
Arithmetic operation: + - * /
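
A minimal Python sketch of an arithmetic mutation. It makes several assumptions that the slide does not state: the value at the chosen offset is a 32-bit little-endian unsigned integer, results wrap modulo 2^32, U32_MIN is 0, and division by zero is skipped. `arith_mutate` is a hypothetical name.

```python
import operator
import struct

U32_MAX = 0xFFFFFFFF
U32_MIN = 0  # assumption: the smallest unsigned 32-bit value
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,
}

def arith_mutate(data: bytes, offset: int, op: str, operand: int) -> bytes:
    """Apply `op operand` to the u32 at `offset`, wrapping modulo 2**32."""
    (value,) = struct.unpack_from("<I", data, offset)
    if op == "/" and operand == 0:
        return data  # skip division by zero instead of raising
    result = OPS[op](value, operand) & U32_MAX
    out = bytearray(data)
    struct.pack_into("<I", out, offset, result)
    return bytes(out)

# 1 + U32_MAX wraps around to 0, the kind of overflow this mutation hunts for.
wrapped = arith_mutate(b"\x01\x00\x00\x00", 0, "+", U32_MAX)
```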

Slide 12

Slide 12 text

Insert/Delete mutation
Insert extra bytes at the n-th offset, or delete an n-byte subset of the input.
0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 1 → Insert/Delete → 0 0 1 0 1 1 0 1 1 1 0 1 0 1 1 0
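
Both operations are simple slicing in Python. A minimal sketch, with `insert_bytes` and `delete_bytes` as hypothetical helper names:

```python
def insert_bytes(data: bytes, offset: int, extra: bytes) -> bytes:
    """Insert `extra` so that it starts at position `offset`."""
    return data[:offset] + extra + data[offset:]

def delete_bytes(data: bytes, offset: int, n: int) -> bytes:
    """Remove `n` bytes starting at position `offset`."""
    return data[:offset] + data[offset + n:]

longer = insert_bytes(b"GET", 1, b"\x00\x00")
shorter = delete_bytes(b"GET / HTTP", 3, 2)
```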

Slide 13

Slide 13 text

Hands-on 1 (zzuf)
“Using zzuf directly”: https://fuzzing-project.org/tutorial1.html
sudo apt-get install zzuf
zzuf -s -T
zzuf -s 0:1000000 -c -C 0 -q -T 3 objdump -x win9x.exe
Try other combinations, such as readelf -a /bin/ls, file, ...

Slide 14

Slide 14 text

Mutation based Grey box Fuzzing Vulnerability Finding

Slide 15

Slide 15 text

Grey-box Fuzzing
[Diagram: Fuzzer (Initial Seed, Mutator) → Target Program (PUT) → Feedback → Fuzzer]
Grey-box fuzzers generate a large number of inputs in a smart way, guided by feedback.

Slide 16

Slide 16 text

How does the fuzzer get feedback?
[Diagram: Fuzzer (Initial Seed, Mutator) → Target Program (PUT) → Feedback → Fuzzer]
The fuzzer gets some kind of feedback from program execution.

Slide 17

Slide 17 text

Feedback mechanism (edge coverage)
The compiler instruments every basic block with a unique random number.
Basic blocks: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Block IDs: 2536 124 6210 8147 297 4010

Slide 18

Slide 18 text

Feedback mechanism (edge coverage)
Calculate a hash key for each edge from the random numbers along the program path, and map the hash keys to memory:
key = nextBB ^ (prevBB >> 1)
Basic blocks: if (input[0] == ‘G’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Block IDs: 2536 124 297 4010
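
The edge key can be sketched in Python as follows. This is an illustration of the formula above, not AFL's actual instrumentation: the map size of 2^16, the saturating counters, and the name `record_path` are assumptions for this example.

```python
MAP_SIZE = 1 << 16  # assumed size of the coverage map

def record_path(block_ids, bitmap=None) -> bytearray:
    """Update the coverage map for one execution, given the visited block IDs."""
    if bitmap is None:
        bitmap = bytearray(MAP_SIZE)
    prev = 0
    for cur in block_ids:
        key = (cur ^ (prev >> 1)) % MAP_SIZE  # key = nextBB ^ (prevBB >> 1)
        bitmap[key] = min(bitmap[key] + 1, 255)  # saturating hit counter (assumption)
        prev = cur
    return bitmap

# The shift makes the key direction-sensitive: A -> B and B -> A map to different slots.
forward = record_path([2536, 124])
backward = record_path([124, 2536])
```

Without the shift, `A ^ B` would equal `B ^ A`, and an edge from a block to itself would always hash to 0.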

Slide 19

Slide 19 text

Hands-on 2 (afl-gcc on binutils)
https://tunnelshade.in/blog/2018/01/afl-internals-compile-time-instrumentation/
git clone https://github.com/google/AFL
cd AFL && make
wget https://ftp.gnu.org/gnu/binutils/binutils-2.26.tar.gz
tar xf binutils-2.26.tar.gz
cd binutils-2.26
CC=/path/to/AFL/afl-gcc ./configure --disable-werror
make

Slide 20

Slide 20 text

Hands-on 2 (afl-gcc on binutils)
https://tunnelshade.in/blog/2018/01/afl-internals-compile-time-instrumentation/
gdb -q binutils-2.26/binutils/nm-new
(gdb) disas main
Dump of assembler code for function main:
   0x0000000000031ba0 <+0>: lea -0x98(%rsp),%rsp
   0x0000000000031ba8 <+8>: mov %rdx,(%rsp)
   0x0000000000031bac <+12>: mov %rcx,0x8(%rsp)
   0x0000000000031bb1 <+17>: mov %rax,0x10(%rsp)
   0x0000000000031bb6 <+22>: mov $0xab10,%rcx
   0x0000000000031bbd <+29>: callq 0x3caf8 <__afl_maybe_log>

Slide 21

Slide 21 text

Feedback mechanism (edge coverage)
If we only have a binary executable, an emulator instruments every basic block with a unique random number.
Basic blocks: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Block IDs: 2536 124 6210 8147 297 4010

Slide 22

Slide 22 text

Black-box vs Grey-box Fuzzer
Control Flow Graph (CFG) of target program (legend: Unconditional Branch, Conditional Branch, Vulnerable)
Conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Inputs on the graph: “GET” “HTTP” “501” “HTTP1.?”

Slide 23

Slide 23 text

Black-box Fuzzer (zzuf)
Initial Seed “a” → Mutation → “kq”, “\xfeXp\x2a”, ・・・
CFG conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Inputs on the graph: “GET” “HTTP” “501” “HTTP1.?”

Slide 24

Slide 24 text

Black-box Fuzzer (zzuf)
Initial Seed “a” → Mutation → “kq”, ・・・, “G”
Generated “G” from the initial seed.
CFG conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Inputs on the graph: “G” “GET” “HTTP” “501” “HTTP1.?”

Slide 25

Slide 25 text

Black-box Fuzzer (zzuf)
Initial Seed “a” → Mutation → “kq”, ・・・, “GE”
Generated “GE” from the initial seed.
CFG conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Inputs on the graph: “GE” “GET” “HTTP” “501” “HTTP1.?”

Slide 26

Slide 26 text

Black-box Fuzzer (zzuf)
Initial Seed “a” → Mutation → “kq”, ・・・, “GET”
Generated “GET” from the initial seed.
CFG conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Inputs on the graph: “GET” “HTTP” “501” “HTTP1.?”

Slide 27

Slide 27 text

Grey-box Fuzzer (AFL, libFuzzer): Coverage information + Genetic Algorithm (GA)
Black-box: Initial Seed “a” → Mutation → “kq”, ・・・, “GET”
Grey-box: Initial Seed “a” → Mutation → “G” → Mutation → “GE” → Mutation → “GET”
Each generation applies selection and mutation based on a fitness function.

Slide 28

Slide 28 text

Mutator
[Diagram: Fuzzer (Initial Seed, Mutator) → Target Program (PUT) → Feedback → Fuzzer]
The mutator generates inputs by mutating the initial seed or parent seeds.
Initial Seed “a” → Mutation → “G” → Mutation → “GE” → Mutation → “GET”

Slide 29

Slide 29 text

Grey-box Fuzzer (AFL, libFuzzer): Coverage information + Genetic Algorithm (GA)
Fitness Function: if a generated input leads to new program coverage, F(input) = 1.
Seed Scheduler (Selection): which inputs should be mutated?
Mutation: generate the next inputs by mutating parent inputs.
Generations: Initial Seed “a” → Mutation → “G” → Mutation → “GE” → Mutation → “GET”
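
The three parts (fitness function, seed scheduler, mutation) fit into one short loop. The Python sketch below is a toy model under stated assumptions, not AFL's or libFuzzer's algorithm: the PUT is a hypothetical function that returns the set of branch IDs it reached, the scheduler picks parents uniformly at random, and the mutator only overwrites or appends one byte.

```python
import random

def toy_target(data: bytes) -> frozenset:
    """Hypothetical PUT: returns the IDs of the nested branches the input reaches."""
    covered = {0}
    if data[:1] == b"G":
        covered.add(1)
        if data[1:2] == b"E":
            covered.add(2)
            if data[2:3] == b"T":
                covered.add(3)
    return frozenset(covered)

def mutate(parent: bytes, rng: random.Random) -> bytes:
    """Either append one random byte or overwrite one existing byte."""
    data = bytearray(parent)
    if not data or rng.random() < 0.3:
        data.append(rng.randrange(256))
    else:
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)

def greybox_fuzz(target, seed: bytes, iterations: int, rng: random.Random):
    queue = [seed]                   # inputs kept as parents for later generations
    seen = set(target(seed))         # all coverage observed so far
    for _ in range(iterations):
        parent = rng.choice(queue)   # seed scheduler (uniform choice, an assumption)
        child = mutate(parent, rng)
        coverage = target(child)
        if not coverage <= seen:     # F(child) = 1: the child reached something new
            seen |= coverage
            queue.append(child)
    return queue, seen

queue, seen = greybox_fuzz(toy_target, b"a", 100000, random.Random(0))
```

Because “G” and “GE” are kept in the queue, each later stage only has to guess one more byte instead of guessing all three at once.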

Slide 30

Slide 30 text

Grey-box Fuzzer (AFL, libFuzzer): Coverage information + Genetic Algorithm (GA)
Many state-of-the-art fuzzers use a genetic algorithm with coverage information.
Seeds: “NULL” (Initial Seed), “a”, “G”, “GE”, “GET”, “GET<\x00”, “HTTP”, “HTTP1.x”, “501”
CFG conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Inputs on the graph: “GET” “GET<\x00” “HTTP” “501” “HTTP1.?”

Slide 31

Slide 31 text

Hands-on 3 (AFL)
https://github.com/google/AFL
./afl-fuzz -i -t

Slide 32

Slide 32 text

Hands-on 3 (AFL example, nm)
https://github.com/google/AFL
mkdir initial
cp /bin/ls initial/
./afl-fuzz -i initial/ -t 10000 -m 1024 -o output -- binutils-2.26/binutils/nm-new @@
ls /queue # Show all saved seeds
ls /crashes # Show crash inputs

Slide 33

Slide 33 text

Generation based Fuzzing Vulnerability Finding

Slide 34

Slide 34 text

Highly structured inputs
[Diagram: Fuzzer → JavaScript, HTML → Firefox, Chrome (Target Program)]

Slide 35

Slide 35 text

Generation-based Fuzzing
They generate a large number of inputs from a rule, such as a file format (PIT) or a grammar.
[Diagram: Rule → Generation → ・・・]

Slide 36

Slide 36 text

Generation-based Fuzzing
[Diagram: Fuzzer (Rule, Generator) → Target Program (PUT)]
The fuzzer generates inputs from the rule and executes the program with them.
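
A rule-driven generator can be sketched as a random expansion of a grammar. The grammar below is a made-up, HTTP-like toy for illustration only; it is not a Peach PIT and not a grammar from any of the tools named here. `GRAMMAR` and `generate` are hypothetical names.

```python
import random

# Hypothetical toy grammar. The first alternative of every rule is non-recursive,
# so falling back to it guarantees termination.
GRAMMAR = {
    "<request>": [["<method>", " ", "<path>", " HTTP/1.", "<digit>"]],
    "<method>": [["GET"], ["POST"]],
    "<path>": [["/"], ["/", "<word>"]],
    "<word>": [["a"], ["b"], ["a", "<word>"]],
    "<digit>": [["0"], ["1"]],
}

def generate(symbol: str, rng: random.Random, depth: int = 0, max_depth: int = 8) -> str:
    """Expand `symbol` by picking random alternatives until only terminals remain."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    options = GRAMMAR[symbol]
    # Past the depth limit, take the first (non-recursive) alternative.
    expansion = options[0] if depth >= max_depth else rng.choice(options)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in expansion)

rng = random.Random(0)
samples = [generate("<request>", rng) for _ in range(20)]
```

Every sample is syntactically valid by construction, which is what lets generation-based fuzzers get past the parser of a highly structured target.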

Slide 37

Slide 37 text

Hands-on 4 (PEACH)
Read: https://github.com/MozillaSecurity/peach
Run PEACH on Firefox:
./peach.py -pit Pits///.xml -target Pits/Targets/firefox.xml -run Browser

Slide 38

Slide 38 text

XNU kernel fuzzing with API inference
They infer data dependencies between API calls and generate valid stub code.
HyungSeok Han, IMF: Inferred Model-based Fuzzer [CCS17]

Slide 39

Slide 39 text

White box Fuzzing Vulnerability Finding

Slide 40

Slide 40 text

White-box Fuzzing
[Diagram: Fuzzer (Initial Seed, Executor, SMT solver) → Target Program (PUT)]
White-box fuzzers generate inputs by solving path constraints with an SMT solver.
Patrice Godefroid, Automated Whitebox Fuzz Testing [NDSS08]

Slide 41

Slide 41 text

How does it work?
Control Flow Graph (CFG) of target program (legend: Conditional Branch, Unconditional Branch, Vulnerable)
Conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Inputs on the graph: “GET<\x00” “HTTP” “501” “HTTP1.?”

Slide 42

Slide 42 text

White-box Fuzzer (SAGE)
Initial seed: “a”
(1) Execute the program with the initial input “a”
(2) Build constraints from the trace: (input[0] != ‘G’) & (strcmp(input, “HTTP”)) & (!isinteger(input))
(3) Negate one of the constraints: (input[0] == ‘G’) & (strcmp(input, “HTTP”)) & (!isinteger(input))
(4) Solve the constraints with an SMT solver → next input “G”
CFG conditions: if (input[0] == ‘G’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Inputs on the graph: “GET<\x00” “HTTP” “501” “HTTP1.?”

Slide 43

Slide 43 text

White-box Fuzzer (SAGE)
Next seed: “G”
(1) Execute the program with the input “G”
(2) Build constraints from the trace: (input[0] == ‘G’) & (input[1] != ‘E’) & (input[2] != ‘T’)
(3) Negate one of the constraints: (input[0] == ‘G’) & (input[1] == ‘E’) & (input[2] != ‘T’)
(4) Solve the constraints with an SMT solver → next input “GE”
CFG conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’)
Inputs on the graph: “G” “GET<\x00” “HTTP” “501” “HTTP1.?”

Slide 44

Slide 44 text

White-box Fuzzer (SAGE)
Next seed: “GE”
(1) Execute the program with the input “GE”
(2) Build constraints from the trace: (input[0] == ‘G’) & (input[1] == ‘E’) & (input[2] != ‘T’)
(3) Negate one of the constraints: (input[0] == ‘G’) & (input[1] == ‘E’) & (input[2] == ‘T’)
(4) Solve the constraints with an SMT solver → next input “GET”
CFG conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’)
Inputs on the graph: “GE” “GET<\x00” “HTTP” “501” “HTTP1.?”

Slide 45

Slide 45 text

White-box Fuzzer (SAGE)
Next seed: “GET”
Congratulations! We found 3 program paths, reached by the inputs “G”, “GE”, and “GET”.
This technique is also called “Dynamic Symbolic Execution” (DSE).
CFG conditions: if (input[0] == ‘G’) if (input[1] == ‘E’) if (input[2] == ‘T’)
Inputs on the graph: “GET” “GET<\x00” “HTTP” “501” “HTTP1.?”
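
The execute / build constraints / negate / solve cycle can be sketched in Python. This is a toy model, not SAGE: the PUT is reduced to three nested byte comparisons, and a trivial hand-written `solve` function stands in for the SMT solver (it only handles constraints of the form input[i] == c). All names are hypothetical.

```python
# Toy PUT: nested checks `if (input[i] == c)`, modelled as (index, expected byte).
CHECKS = [(0, ord("G")), (1, ord("E")), (2, ord("T"))]

def execute(data: bytes) -> list:
    """Run the toy PUT and return the path constraints as (index, value, holds)."""
    trace = []
    for index, value in CHECKS:
        holds = len(data) > index and data[index] == value
        trace.append((index, value, holds))
        if not holds:
            break  # nested ifs: the inner checks are never reached
    return trace

def solve(constraints: list) -> bytes:
    """Stand-in for an SMT solver; only supports input[i] == c constraints."""
    size = max(index for index, _, _ in constraints) + 1
    data = bytearray(size)
    for index, value, must_hold in constraints:
        if must_hold:
            data[index] = value
    return bytes(data)

def explore(seed: bytes) -> list:
    """Repeat: execute, negate the failed branch condition, solve, re-run."""
    inputs = [seed]
    current = seed
    while True:
        trace = execute(current)
        if all(holds for _, _, holds in trace):
            return inputs  # every check passed; nothing left to negate
        index, value, _ = trace[-1]
        negated = trace[:-1] + [(index, value, True)]  # flip the failed constraint
        current = solve(negated)
        inputs.append(current)

found = explore(b"a")
```

In this toy run, `found` contains the seed “a” followed by the same inputs as the walkthrough: “G”, “GE”, and “GET”.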

Slide 46

Slide 46 text

White-box vs Grey-box Fuzzer
A white-box fuzzer entails significant computational cost (building constraints, SMT solving), which shows up as performance overhead.
Thousands of constraints (C1, C2, C3, ..., C2360, C2361, ...) go into the SMT queries (over 2000 queries):
{ C1 & C2 & C3 & C4 & C5 … C100 & C101 & C102 … C1000 & C1001 & C1002 … C2360 & C2361 & C2362 … }

Slide 47

Slide 47 text

White-box vs Grey-box Fuzzer
A grey-box fuzzer struggles to get past a comparison against a long magic number:
if (input == 0xdeadbeefcafebabe) { crash(); }
Grey-box way: Mutation “0x0” → “0xdeadbeefcafebabe”, P(crash) = 1/(2^64)
White-box way: SMT solve “0x0” → “0xdeadbeefcafebabe”
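
To put P(crash) = 1/(2^64) in perspective, the short Python calculation below estimates how long blind guessing would take. It assumes each execution draws an independent, uniformly random 64-bit value, and the throughput of one million executions per second is a made-up figure for illustration. `expected_years` is a hypothetical helper.

```python
MAGIC_BITS = 64
SECONDS_PER_YEAR = 365 * 24 * 3600

def expected_years(execs_per_second: float, bits: int = MAGIC_BITS) -> float:
    """Expected time, in years, to hit one specific `bits`-bit value by uniform guessing."""
    expected_tries = 2 ** bits  # mean of a geometric distribution with p = 1 / 2**bits
    return expected_tries / execs_per_second / SECONDS_PER_YEAR

# Hypothetical throughput: 1,000,000 executions per second.
years = expected_years(1_000_000)
```

Under these assumptions the expected wait is on the order of hundreds of thousands of years, whereas a solver derives the value directly from the comparison.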

Slide 48

Slide 48 text

Hybrid Fuzzing (Driller)
Grey-box fuzzing + Dynamic Symbolic Execution (DSE)
Control Flow Graph (CFG) of target program: if (input == 0xdeadbeefcafebabe) if (input[0] == ‘A’) if (input[2] < 10)
Nick Stephens, Driller: Augmenting Fuzzing Through Selective Symbolic Execution [NDSS16]

Slide 49

Slide 49 text

Feedback Mechanism Vulnerability Finding

Slide 50

Slide 50 text

Edge Coverage by Compiler, Emulator/Intel PT (AFL/kAFL)
Calculate hash keys based on the random numbers along the program path.
Basic blocks: if (input[0] == ‘G’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Block IDs: 2536 124 297 4010
Sergej Schumilo, kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels [USENIX Security 17]

Slide 51

Slide 51 text

Markov model on static analysis (VUzzer)
Build the CFG from the binary executable by static analysis, then treat it as a Markov model.
Conditions: if (input[0] == ‘G’) if (isinteger(input)) if (!strcmp(input, “HTTP”))
Edge probabilities: 1.0 0.5 0.5 0.5 0.5 0.5 0.5 0.5 1.0 1.0 0.5 0.5 0.5 0.5
P(path) = 1.0 * 0.5 * 0.5 * 0.5 * 0.5 = 0.0625
F(path) = 1/P(path) = 1/0.0625 = 16
Sanjay Rawat, VUzzer: Application-aware Evolutionary Fuzzing [NDSS 17]
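
The path score is a product of edge probabilities followed by a reciprocal. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and this shows only the calculation on the slide, not VUzzer's full weighting scheme.

```python
from math import prod

def path_probability(edge_probs) -> float:
    """P(path): product of the transition probabilities along the path."""
    return prod(edge_probs)

def fitness(edge_probs) -> float:
    """F(path) = 1 / P(path): rarely taken paths score higher."""
    return 1.0 / path_probability(edge_probs)

edges = [1.0, 0.5, 0.5, 0.5, 0.5]
p = path_probability(edges)  # 0.0625
f = fitness(edges)           # 16.0
```

Each extra conditional branch with probability 0.5 doubles the score, so inputs that reach deeply nested code are favored.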

Slide 52

Slide 52 text

Dynamic Binary Rewriting (Chizpurfle, RetroWrite)
The Stalker server rewrites code blocks to insert additional instructions that reveal basic block coverage.
They use Frida to rewrite Android system services.
Antonio, Chizpurfle: A Gray-Box Android Fuzzer for Vendor Service Customizations [ISSRE17]
Sushant Dinesh, RetroWrite: Statically Instrumenting COTS Binaries for Fuzzing and Sanitization [S&P20]

Slide 53

Slide 53 text

State-of-the-art Fuzzing Vulnerability Finding

Slide 54

Slide 54 text

Smart Grey-box Fuzzing (AFLSmart, Nautilus)
[Diagram: Fuzzer (Rule, Mutator) → Target Program (PUT) → Feedback → Fuzzer]
They generate a large number of inputs based on a rule, guided by feedback.

Slide 55

Slide 55 text

Smart mutation with rule
Mutate inputs with knowledge about the grammar or file format.
Semantics mutation: int f(int arg) { return g(arg); } → short f(int arg) { return h(arg); }
File Format (PIT): Van-Thuan Pham, Smart Greybox Fuzzing [TSE20]
Grammar File (EBNF): Cornelius Aschermann, NAUTILUS: Fishing for Deep Bugs with Grammars [NDSS19]

Slide 56

Slide 56 text

Hands-on 6 (AFLSmart)
Read: https://github.com/aflsmart/aflsmart
Find a crash input of WavPack with AFLSmart:
./afl-fuzz -h -i -o -w peach -g -x -- @@
ls /queue # Show all saved seeds
ls /crashes # Show crash inputs

Slide 57

Slide 57 text

Group work (Bug hunting in the Real World)
Choose your favorite real-world programs.
Try to find bugs or vulnerabilities in them.
You can use any fuzzer. You should try many (program, fuzzer) combinations!

Slide 58

Slide 58 text

Group work (Bug hunting in the Real World)
afl-qemu (AFL against binaries): https://github.com/google/AFL
VUzzer (fuzzing with taint analysis): https://github.com/vusec/vuzzer
kAFL (fuzzing the Linux kernel): https://github.com/RUB-SysSec/kAFL
Nautilus (grammar fuzzing): https://github.com/RUB-SysSec/nautilus
Driller (hybrid fuzzing): https://blog.grimm-co.com/post/guided-fuzzing-with-driller/

Slide 59

Slide 59 text

https://ricsec.co.jp