Introduction to Fuzzing

Ren Kimura
February 13, 2020

Basic Fuzzing Training by Ren Kimura, CEO of Ricerca Security, Inc.

Transcript

  1. Ren Kimura

    CEO of Ricerca Security, Inc. Twitter: @RKX1209. Mail: [email protected]
    CVEs: CVE-2019-14247, CVE-2019-14248, CVE-2019-14249, CVE-2019-14250, CVE-2019-16161, CVE-2019-16162, CVE-2019-16163, CVE-2019-16164, CVE-2019-16165, CVE-2019-16166, CVE-2019-16167, CVE-2019-19725
  2. Background

    Software components keep getting larger and more complicated, so many people are trying to automate the analysis process. In 2016, DARPA hosted the Cyber Grand Challenge (CGC), a fully automated hacking challenge. Almost all of the winners used fuzzing in their vulnerability detection process.
  3. How to automate hacking?

    Vulnerability Finding ("Fuzzing", "Static Analysis", "Symbolic Execution") → Crash Inputs
    Triage / Evaluation ("Triage", "Exploitability", "Bug Reproduction") → Triaged / POC
    Exploit Generation ("Automatic Exploit Generation", AEG) → Exploit Code
  4. Black-box fuzzing (Synopsys Defensics, zzuf)

    They randomly generate a large number of inputs and execute the program with them, for example "a", "kqeqert", "G\x13\x02", "iohbpofi9qnpiof", "3i129074g". They do not observe program behavior and learn nothing, so they continue dumb generation forever.
  5. Black-box fuzzing

    Diagram: Initial Seed → Mutator (Fuzzer) → Target Program (PUT). They mutate the initial seed to generate inputs and execute the program with them.
  6. Input generation (Defensics, zzuf)

    They mutate the initial seed (Initial.png, Initial.jpeg, ...) to generate new random inputs.
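
The black-box loop described in the slides above could be sketched roughly as follows. This is a minimal illustration, not how zzuf or Defensics are implemented: `target`, `mutate`, and `blackbox_fuzz` are hypothetical names, the "program" is a toy function that simulates a crash, and the only mutation is a single random byte overwrite.

```python
import random


def target(data: bytes) -> None:
    """Hypothetical program under test: 'crashes' on inputs starting with G."""
    if data[:1] == b"G":
        raise RuntimeError("simulated crash")


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte of a non-empty seed."""
    out = bytearray(seed)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def blackbox_fuzz(seed: bytes, iterations: int, rng: random.Random) -> list:
    """Mutate the initial seed over and over and keep only crashing inputs."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)  # always derived from the initial seed
        try:
            target(candidate)
        except RuntimeError:
            crashes.append(candidate)  # a crash is all the fuzzer ever observes
    return crashes


if __name__ == "__main__":
    found = blackbox_fuzz(b"aaaa", 5000, random.Random(0))
    print(len(found), "crashing inputs")
```

Note that nothing learned in one iteration influences the next one, which is the "learn nothing" property the slides point out.
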
  7. Bitflip mutation

    Flip the n-th bit of the input. Many fuzzers (AFL, zzuf, ...) use it. Example input bits: 00101001 00101101.
  8. Byteflip mutation

    Flip the n-th byte of the input. Many fuzzers (AFL, zzuf, ...) use it. Example: in 00101001 00101101, flipping the byte 00101001 gives 11010110.
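
As a rough sketch of the two flip mutations above, the following assumes that bit 0 is the most significant bit of byte 0 (the slides do not fix a bit order), and the function names are illustrative only.

```python
def bitflip(data: bytes, n: int) -> bytes:
    """Return a copy of data with its n-th bit inverted (bit 0 = MSB of byte 0)."""
    out = bytearray(data)
    out[n // 8] ^= 0x80 >> (n % 8)
    return bytes(out)


def byteflip(data: bytes, n: int) -> bytes:
    """Return a copy of data with every bit of its n-th byte inverted."""
    out = bytearray(data)
    out[n] ^= 0xFF
    return bytes(out)


if __name__ == "__main__":
    sample = bytes([0b00101001, 0b00101101])  # the bit pattern used in the slides
    print(format(byteflip(sample, 0)[0], "08b"))  # 11010110
```

Both operations are their own inverse, so applying the same flip twice returns the original input.
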
  9. Arithmetic mutation

    Add, subtract, multiply or divide the n-th byte by an extreme integer such as 256, U32_MAX or U32_MIN. Figure: input bytes 00101001 00101101 11010110 with an arithmetic operation (+, -, *, /) applied.
  10. Insert/Delete mutation

    Insert extra bytes at the n-th offset, or delete an n-byte subset of the input. Figure: bit strings 00101001 00101001 and 00101101 11010110 before and after an Insert/Delete step.
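
The arithmetic and insert/delete mutations from the last two slides might look like this in a simplified form. The sketch assumes 8-bit wraparound and covers only addition and subtraction; multiplication, division, and the 32-bit constants mentioned on the slide are left out, and all function names are hypothetical.

```python
def arith(data: bytes, n: int, delta: int) -> bytes:
    """Add delta (may be negative) to the n-th byte, wrapping around at 8 bits."""
    out = bytearray(data)
    out[n] = (out[n] + delta) % 256
    return bytes(out)


def insert(data: bytes, offset: int, extra: bytes) -> bytes:
    """Insert extra bytes at the given offset."""
    return data[:offset] + extra + data[offset:]


def delete(data: bytes, offset: int, n: int) -> bytes:
    """Delete n bytes starting at the given offset."""
    return data[:offset] + data[offset + n:]


if __name__ == "__main__":
    sample = b"\x29\x2d"
    print(arith(sample, 1, 1).hex())        # 292e
    print(insert(sample, 1, b"\xd6").hex())  # 29d62d
    print(delete(sample, 0, 1).hex())        # 2d
```
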
  11. Hands-on 1 (zzuf): "Using zzuf directly" (https://fuzzing-project.org/tutorial1.html)

    Install: sudo apt-get install zzuf
    Usage: zzuf -s <range> -T <timeout> <program> <initial seed>
    Example: zzuf -s 0:1000000 -c -C 0 -q -T 3 objdump -x win9x.exe
    Try other combinations, like readelf -a /bin/ls, file, ...
  12. Grey-box fuzzing

    Diagram: Initial Seed → Mutator (Fuzzer) → Target Program (PUT), with Feedback from the target back to the fuzzer. They generate a large number of inputs in a smart way, guided by feedback.
  13. How does the fuzzer get feedback?

    Diagram: Initial Seed → Mutator (Fuzzer) → Target Program (PUT) → Feedback. They get some kind of feedback from program execution.
  14. Feedback mechanism (edge coverage)

    The compiler instruments every basic block with a unique random number. Figure: branch conditions if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")) with block IDs 2536, 124, 6210, 8147, 297, 4010.
  15. Feedback mechanism (edge coverage)

    Calculate hash keys from the random numbers along the program path, then map the hash keys to memory: key = nextBB ^ prevBB >> 1. Figure: branch conditions if (input[0] == 'G'), if (isinteger(input)), if (!strcmp(input, "HTTP")) with block IDs 2536, 124, 297, 4010.
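
To make the hash-key formula concrete, here is a small sketch that walks a path of block IDs and records one key per edge. It assumes a 64 KiB map and an initial previous-block value of 0 (both are assumptions of this sketch, not stated on the slide), and `edge_keys` is an illustrative name. In Python, as in C, `>>` binds tighter than `^`, so the formula reads as `nextBB ^ (prevBB >> 1)`.

```python
MAP_SIZE = 1 << 16  # assumed size of the coverage map


def edge_keys(path):
    """Return the hash key of every edge along a path of basic-block IDs."""
    keys = []
    prev = 0  # assumed start value before the first block
    for block in path:
        keys.append((block ^ prev >> 1) % MAP_SIZE)  # key = nextBB ^ prevBB >> 1
        prev = block
    return keys


def update_bitmap(bitmap: bytearray, path) -> None:
    """Count each edge hit in the shared map, saturating at 255."""
    for key in edge_keys(path):
        bitmap[key] = min(bitmap[key] + 1, 255)


if __name__ == "__main__":
    print(edge_keys([2536, 124, 297, 4010]))  # block IDs from the slide
```

Shifting the previous ID before the XOR makes the key direction-sensitive, so the edge A → B hashes differently from B → A.
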
  16. Hands-on 2 (afl-gcc on binutils) (https://tunnelshade.in/blog/2018/01/afl-internals-compile-time-instrumentation/)

    git clone https://github.com/google/AFL
    cd AFL && make
    wget https://ftp.gnu.org/gnu/binutils/binutils-2.26.tar.gz
    tar xf binutils-2.26.tar.gz
    cd binutils-2.26
    CC=/path/to/AFL/afl-gcc ./configure --disable-werror
    make
  17. Hands-on 2 (afl-gcc on binutils) (https://tunnelshade.in/blog/2018/01/afl-internals-compile-time-instrumentation/)

    gdb -q binutils-2.26/binutils/nm-new
    (gdb) disas main
    Dump of assembler code for function main:
    0x0000000000031ba0 <+0>: lea -0x98(%rsp),%rsp
    0x0000000000031ba8 <+8>: mov %rdx,(%rsp)
    0x0000000000031bac <+12>: mov %rcx,0x8(%rsp)
    0x0000000000031bb1 <+17>: mov %rax,0x10(%rsp)
    0x0000000000031bb6 <+22>: mov $0xab10,%rcx
    0x0000000000031bbd <+29>: callq 0x3caf8 <__afl_maybe_log>
  18. Feedback mechanism (edge coverage)

    If we only have a binary executable, an emulator instruments every basic block with a unique random number instead. Figure: branch conditions if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")) with block IDs 2536, 124, 6210, 8147, 297, 4010.
  19. Black box vs Grey box Fuzzer

    Figure: Control Flow Graph (CFG) of the target program. Legend: Unconditional Branch, Conditional Branch, Vulnerable. Branch conditions: if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")). Inputs reaching the nodes: "GET", "HTTP", "501", "HTTP1.?".
  20. Black-box Fuzzer (zzuf)

    Mutating the initial seed "a" yields inputs such as "kq", "\xfeXp\x2a", ... Figure: the same CFG, with branch conditions if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")) and node inputs "GET", "HTTP", "501", "HTTP1.?".
  21. Black-box Fuzzer (zzuf)

    Mutating the initial seed "a" yields "kq", ..., and eventually "G". Generated "G" from the initial seed. The node "GET" is not reached yet.
  22. Black-box Fuzzer (zzuf)

    Mutating the initial seed "a" yields "kq", ..., and eventually "GE". Generated "GE" from the initial seed. The node "GET" is not reached yet.
  23. Black-box Fuzzer (zzuf)

    Mutating the initial seed "a" yields "kq", ..., and eventually "GET". Generated "GET" from the initial seed.
  24. Grey-box Fuzzer (AFL, libFuzzer): coverage information + Genetic Algorithm (GA)

    Black box: the initial seed "a" is mutated into "kq", ..., "GET" directly. Grey box: the initial seed "a" is mutated into "G", then "GE", then "GET", one generation at a time, with selection and mutation based on a fitness function.
  25. Mutator

    Diagram: Initial Seed → Mutator (Fuzzer) → Target Program (PUT) → Feedback. They generate inputs by mutating the initial seed or parent seeds: "a" → "G" → "GE" → "GET".
  26. Grey-box Fuzzer (AFL, libFuzzer): coverage information + Genetic Algorithm (GA)

    Fitness function: if a generated input leads to new program coverage, F(input) = 1.
    Seed scheduler (selection): which inputs should be mutated?
    Mutation: generate the next inputs by mutating parent inputs.
    Generations: initial seed "a" → "G" → "GE" → "GET".
  27. Grey-box Fuzzer (AFL, libFuzzer): coverage information + Genetic Algorithm (GA)

    Many state-of-the-art fuzzers use a genetic algorithm with coverage information. Figure: starting from the initial seed "NULL", the seed tree grows through "a", "G", "GE", "GET", "GET<\x00", "HTTP", "HTTP1.x" and "501", covering the CFG nodes "GET<\x00", "HTTP", "501", "HTTP1.?" behind the branch conditions if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")).
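
The coverage-guided loop from the last few slides can be sketched as below. This is a deliberately simplified illustration, not AFL's or libFuzzer's algorithm: the toy `target` only models the 'G', 'E', 'T' checks, mutations are deterministic single-byte overwrites and appends rather than random ones, and the seed scheduler is a plain FIFO queue. All names are hypothetical.

```python
def target(data: bytes) -> set:
    """Toy program under test: return the set of branch IDs the input reaches."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("G"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("E"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("T"):
                covered.add(3)
    return covered


def mutations(seed: bytes):
    """Yield every single-byte overwrite and every single-byte append."""
    for i in range(len(seed)):
        for value in range(256):
            yield seed[:i] + bytes([value]) + seed[i + 1:]
    for value in range(256):
        yield seed + bytes([value])


def greybox_fuzz(initial: bytes, max_execs: int = 100_000):
    """Keep an input as a new parent only if it reaches new coverage."""
    queue = [initial]
    seen = set(target(initial))
    execs = 0
    index = 0
    while index < len(queue) and execs < max_execs:
        parent = queue[index]  # seed scheduler: simple FIFO selection
        index += 1
        for child in mutations(parent):
            execs += 1
            coverage = target(child)
            if not coverage <= seen:  # fitness: F(input) = 1 on new coverage
                seen |= coverage
                queue.append(child)
    return queue, seen


if __name__ == "__main__":
    queue, seen = greybox_fuzz(b"a")
    print(queue)
```

Because each saved input becomes a parent for the next generation, the fuzzer only has to guess one correct byte at a time instead of all three at once.
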
  28. Hands-on 3 (AFL) (https://github.com/google/AFL)

    git clone https://github.com/google/AFL
    cd AFL && make
    ./afl-fuzz -i <initial seed dir> -t <time out> -m <memory limit> -o <output dir> -- <command line> @@
    ls <output dir>/queue # Show all saved seeds
    ls <output dir>/crashes # Show crash inputs
  29. Hands-on 3 (AFL example, nm) (https://github.com/google/AFL)

    mkdir initial
    cp /bin/ls initial/
    ./afl-fuzz -i initial/ -t 10000 -m 1024 -o output -- binutils-2.26/binutils/nm-new @@
    ls <output dir>/queue # Show all saved seeds
    ls <output dir>/crashes # Show crash inputs
  30. Generation-based fuzzing

    They generate a large number of inputs from a rule, such as a file format description (PIT) or a grammar.
  31. Generation-based fuzzing

    Diagram: Rule → Generator (Fuzzer) → Target Program (PUT). They generate inputs from the rule and execute the program with them.
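
As an illustration of generating inputs from a rule, the sketch below expands a tiny hand-written grammar for an HTTP-like request line. The grammar, the symbol names, and the `generate` function are all invented for this example; this is not the Peach Pit format, only the general idea of rule-driven generation.

```python
import random

# Hypothetical grammar: each symbol maps to a list of alternative expansions.
# The first alternative of every symbol is non-recursive so expansion can stop.
GRAMMAR = {
    "<request>": [["<method>", " ", "<path>", " ", "<version>"]],
    "<method>": [["GET"], ["POST"], ["HEAD"]],
    "<path>": [["/"], ["/", "<segment>"], ["/", "<segment>", "<path>"]],
    "<segment>": [["index"], ["a"], ["img"]],
    "<version>": [["HTTP/1.0"], ["HTTP/1.1"]],
}


def generate(symbol: str, rng: random.Random, depth: int = 0, max_depth: int = 8) -> str:
    """Expand a symbol into a string by picking alternatives at random."""
    if symbol not in GRAMMAR:
        return symbol  # terminal text is emitted as-is
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        options = options[:1]  # force the non-recursive alternative
    expansion = rng.choice(options)
    return "".join(generate(part, rng, depth + 1, max_depth) for part in expansion)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(generate("<request>", rng))
```

Every generated string is valid with respect to the rule, which is why generation-based fuzzers tend to get past input parsers that reject most randomly mutated data.
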
  32. Hands-on 4 (PEACH)

    Read: https://github.com/MozillaSecurity/peach
    Run PEACH on Firefox:
    ./peach.py -pit Pits/<component>/<format>/<name>.xml -target Pits/Targets/firefox.xml -run Browser
  33. XNU kernel fuzzing with API inferring

    They infer data dependencies between API calls and generate valid stub code. HyungSeok Han, "IMF: Inferred Model-based Fuzzer" [CCS17]
  34. White-box fuzzing

    Diagram: Initial Seed → Executor and SMT solver (Fuzzer) → Target Program (PUT). They generate inputs by solving constraints with an SMT solver. Patrice Godefroid, "Automated Whitebox Fuzz Testing" [NDSS08]
  35. How does it work?

    Figure: Control Flow Graph (CFG) of the target program. Legend: Conditional Branch, Unconditional Branch, Vulnerable. Branch conditions: if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")). Inputs reaching the nodes: "GET<\x00", "HTTP", "501", "HTTP1.?".
  36. White-box Fuzzer (SAGE)

    Initial seed: "a"
    (1) Execute the program with the initial input "a".
    (2) Build constraints from the trace: (input[0] != 'G') & (strcmp(input, "HTTP")) & (!isinteger(input))
    (3) Negate one of the constraints: (input[0] == 'G') & (strcmp(input, "HTTP")) & (!isinteger(input))
    (4) Solve the constraints with an SMT solver → next input "G"
  37. White-box Fuzzer (SAGE)

    Next seed: "G"
    (1) Execute the program with the input "G".
    (2) Build constraints from the trace: (input[0] == 'G') & (input[1] != 'E') & (input[2] != 'T')
    (3) Negate one of the constraints: (input[0] == 'G') & (input[1] == 'E') & (input[2] != 'T')
    (4) Solve the constraints with an SMT solver → next input "GE"
  38. White-box Fuzzer (SAGE)

    Next seed: "GE"
    (1) Execute the program with the input "GE".
    (2) Build constraints from the trace: (input[0] == 'G') & (input[1] == 'E') & (input[2] != 'T')
    (3) Negate one of the constraints: (input[0] == 'G') & (input[1] == 'E') & (input[2] == 'T')
    (4) Solve the constraints with an SMT solver → next input "GET"
  39. White-box Fuzzer (SAGE)

    Next seed: "GET". Congratulations! We found 3 program paths, reached by the inputs "G", "GE" and "GET". This technique is also called "Dynamic Symbolic Execution" (DSE).
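
The execute, collect, negate, solve cycle walked through above can be imitated with a very small sketch. It is not SAGE and uses no real SMT solver: `solve` is a stand-in that only understands byte-equality constraints, `trace` models just the 'G', 'E', 'T' checks, and every name here is hypothetical.

```python
MAGIC = b"GET"  # the three byte comparisons from the slides


def trace(data: bytes) -> list:
    """(1)+(2) Run the toy target and record constraints as (index, byte, holds)."""
    constraints = []
    for index, expected in enumerate(MAGIC):
        holds = len(data) > index and data[index] == expected
        constraints.append((index, expected, holds))
        if not holds:
            break  # later comparisons are not reached on this path
    return constraints


def solve(constraints: list) -> bytes:
    """(4) Stand-in for an SMT solver: build an input satisfying the equalities."""
    out = bytearray()
    for index, expected, holds in constraints:
        if holds:
            while len(out) <= index:
                out.append(0)
            out[index] = expected
    return bytes(out)


def explore(seed: bytes) -> list:
    """Repeat the cycle until no constraint on the path is left to negate."""
    inputs = [seed]
    current = seed
    while True:
        constraints = trace(current)
        if all(holds for _, _, holds in constraints):
            return inputs
        index, expected, _ = constraints[-1]
        constraints[-1] = (index, expected, True)  # (3) negate the failed check
        current = solve(constraints)
        inputs.append(current)


if __name__ == "__main__":
    print(explore(b"a"))
```

A real white-box fuzzer derives the constraints symbolically from the executed instructions and hands them to an SMT solver, which is where the cost discussed in the following slides comes from.
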
  40. White box vs Grey box Fuzzer

    A white-box fuzzer entails significant computational cost (building constraints, SMT solving). A single path can carry thousands of constraints (C1 & C2 & C3 & ... & C1000 & C1001 & ... & C2360 & C2361 & C2362 ...), which leads to over 2000 SMT queries and a large performance overhead.
  41. White box vs Grey box Fuzzer

    A grey-box fuzzer struggles to get past a comparison against a long magic number: if (input == 0xdeadbeefcafebabe) { crash(); }
    Grey-box way: mutation from "0x0" to "0xdeadbeefcafebabe" succeeds with P(crash) = 1/(2^64).
    White-box way: SMT solving goes from "0x0" to "0xdeadbeefcafebabe" directly.
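
To get a feel for what P(crash) = 1/(2^64) means in practice, the following back-of-the-envelope sketch assumes uniformly random guesses of the full 64-bit value and a throughput of 10,000 executions per second. Both are assumptions made for this example, not figures from the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def expected_years(bits: int, execs_per_second: int) -> float:
    """Expected time to hit one specific value of the given width by random guessing."""
    expected_execs = 2 ** bits  # mean of a geometric distribution with p = 1 / 2**bits
    return expected_execs / execs_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = expected_years(64, 10_000)
    print(f"about {years:.3g} years")
```

Under these assumptions the expected wait is on the order of tens of millions of years, whereas a solver obtains the magic value from the comparison itself.
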
  42. Hybrid Fuzzing (Driller)

    Combines grey-box fuzzing with Dynamic Symbolic Execution (DSE). Figure: Control Flow Graph (CFG) of the target program with the conditions if (input == 0xdeadbeefcafebabe), if (input[0] == 'A'), if (input[2] < 10). Nick Stephens, "Driller: Augmenting Fuzzing Through Selective Symbolic Execution" [NDSS16]
  43. Edge coverage by compiler, emulator or Intel PT (AFL/kAFL)

    Calculate hash keys from the random numbers along the program path. Figure: branch conditions if (input[0] == 'G'), if (isinteger(input)), if (!strcmp(input, "HTTP")) with block IDs 2536, 124, 297, 4010. Sergej Schumilo, "kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels" [USENIX17]
  44. Markov model on static analysis (VUzzer)

    Build the CFG from the binary executable by static analysis, then attach a Markov model to it: each edge carries a transition probability of 1.0 or 0.5. Figure: branch conditions if (input[0] == 'G'), if (isinteger(input)), if (!strcmp(input, "HTTP")).
    P(path) = 1.0*0.5*0.5*0.5*0.5 = 0.0625
    F(path) = 1/P(path) = 1/0.0625 = 16
    Sanjay Rawat, "VUzzer: Application-aware Evolutionary Fuzzing" [NDSS17]
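
The path probability and fitness from the slide above amount to a product and a reciprocal. The sketch below reproduces only that arithmetic with the slide's numbers; the function names are illustrative and VUzzer's actual weighting is not claimed to match this in detail.

```python
from math import prod


def path_probability(transition_probs) -> float:
    """P(path): product of the transition probabilities along the path."""
    return prod(transition_probs)


def fitness(transition_probs) -> float:
    """F(path) = 1 / P(path): paths that are harder to reach score higher."""
    return 1 / path_probability(transition_probs)


if __name__ == "__main__":
    path = [1.0, 0.5, 0.5, 0.5, 0.5]  # edge probabilities from the slide
    print(path_probability(path), fitness(path))  # 0.0625 16.0
```
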
  45. Dynamic Binary Rewriting (Chizpurfle, RetroWrite)

    The Stalker server rewrites each code block to insert additional instructions that reveal basic block coverage. They use Frida to rewrite Android system services.
    Antonio, "Chizpurfle: A Gray-Box Android Fuzzer for Vendor Service Customizations" [ISSRE17]
    Sushant Dinesh, "RetroWrite: Statically Instrumenting COTS Binaries for Fuzzing and Sanitization" [S&P20]
  46. Smart grey-box fuzzing (AFLSmart, Nautilus)

    Diagram: Rule → Mutator (Fuzzer) → Target Program (PUT) → Feedback. They generate a large number of inputs based on a rule, guided by feedback.
  47. Smart mutation with rule

    Mutate inputs with knowledge about the grammar or file format. Rule inputs: Grammar File (EBNF), File Format (PIT).
    Semantics mutation example: int f(int arg) { return g(arg); } becomes short f(int arg) { return h(arg); }
    Van-Thuan Pham, "Smart Greybox Fuzzing" [TSE20]
    Cornelius Aschermann, "NAUTILUS: Fishing for Deep Bugs with Grammars" [NDSS19]
  48. Hands-on 6 (AFLSmart)

    Read: https://github.com/aflsmart/aflsmart
    Find a crash input of WavPack with AFLSmart:
    ./afl-fuzz -h -i <initial seed> -o <output dir> -w peach -g <input model file> -x <dictionary file> -- <command line> @@
    ls <output dir>/queue # Show all saved seeds
    ls <output dir>/crashes # Show crash inputs
  49. Group work (Bug hunting in the Real World)

    Choose your favorite real-world programs and try to find bugs or vulnerabilities in them. You can use any fuzzer. You should try many (program, fuzzer) combinations!
  50. Group work (Bug hunting in the Real World)

    afl-qemu (AFL against binaries): https://github.com/google/AFL
    VUzzer (fuzzing with taint analysis): https://github.com/vusec/vuzzer
    kAFL (fuzzing the Linux kernel): https://github.com/RUB-SysSec/kAFL
    Nautilus (grammar fuzzing): https://github.com/RUB-SysSec/nautilus
    Driller (hybrid fuzzing): https://blog.grimm-co.com/post/guided-fuzzing-with-driller/