3 Background In 2016, DARPA hosted a fully automated hacking challenge. Software components are becoming larger and more complicated. → Many people are trying to automate the analysis process. In 2016, DARPA hosted the Cyber Grand Challenge (CGC). → Almost all of the winners used fuzzing in their vulnerability detection process.
6 Black-box Fuzzing (Synopsys Defensics, zzuf) They randomly generate a large number of inputs and execute the program with them. Example inputs: "a" "kqeqert" "G\x13\x02" "iohbpofi9qnpiof" "3i129074g" They do not observe program behavior, so they learn nothing and continue dumb generation forever.
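The black-box loop described above can be sketched in a few lines. This is only an illustration: `target` is a hypothetical stand-in for the program under test (it is not part of Defensics or zzuf), and the input length limit of 16 bytes is an arbitrary assumption.

```python
import random


def target(data: bytes) -> None:
    """Hypothetical program under test: crashes only on inputs starting with b'GET'."""
    if data[:3] == b"GET":
        raise RuntimeError("crash")


def blackbox_fuzz(iterations: int, seed: int = 0) -> int:
    """Feed random byte strings to the target and count crashes.

    No information about program behavior flows back into input generation,
    so iteration N is no smarter than iteration 1.
    """
    rng = random.Random(seed)
    crashes = 0
    for _ in range(iterations):
        length = rng.randint(1, 16)  # assumed maximum input length
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except RuntimeError:
            crashes += 1
    return crashes


if __name__ == "__main__":
    print(blackbox_fuzz(10_000))
```

Because each byte is drawn uniformly, the chance that a single random input starts with "GET" is about 1 in 256^3, which is why purely random generation rarely gets past even a short keyword check.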
17 Feedback mechanism (edge coverage) The compiler instruments every basic block with a unique random number. Branch conditions: if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")). Basic block IDs: 2536, 124, 6210, 8147, 297, 4010.
18 Feedback mechanism (edge coverage) Calculate hash keys from the random numbers along the program path, then map the hash keys to memory: key = nextBB ^ (prevBB >> 1). Branch conditions on the path: if (input[0] == 'G'), if (isinteger(input)), if (!strcmp(input, "HTTP")). Basic block IDs on the path: 2536, 124, 297, 4010.
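The key computation can be tried out on the block IDs shown on the slide. The sketch below is a Python model of the idea, not AFL's actual instrumentation code: the function names are hypothetical, the map size of 64 KiB and the saturating hit counter are assumptions of this sketch, and the visiting order of the IDs is assumed to be the order in which they are listed.

```python
MAP_SIZE = 1 << 16  # assumed coverage map size (64 KiB)


def edge_keys(block_ids, map_size=MAP_SIZE):
    """Return one hash key per executed edge: key = nextBB ^ (prevBB >> 1)."""
    keys = []
    prev = 0  # no previous block before the first one
    for cur in block_ids:
        keys.append((cur ^ (prev >> 1)) % map_size)
        prev = cur
    return keys


def coverage_map(block_ids, map_size=MAP_SIZE):
    """Record hit counts per edge key in a byte map (saturating at 255 in this sketch)."""
    bitmap = bytearray(map_size)
    for key in edge_keys(block_ids, map_size):
        bitmap[key] = min(bitmap[key] + 1, 255)
    return bitmap


if __name__ == "__main__":
    path = [2536, 124, 297, 4010]  # block IDs from the slide, assumed order
    print(edge_keys(path))
```

Shifting the previous block ID by one bit makes the key direction-sensitive: the edge A → B hashes to a different slot than B → A, which a plain XOR of the two IDs would not achieve.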
19 Hands-on 2 (afl-gcc on binutils)
https://tunnelshade.in/blog/2018/01/afl-internals-compile-time-instrumentation/
git clone https://github.com/google/AFL
cd AFL && make
wget https://ftp.gnu.org/gnu/binutils/binutils-2.26.tar.gz
tar xf binutils-2.26.tar.gz
cd binutils-2.26
CC=/path/to/AFL/afl-gcc ./configure --disable-werror
make
21 Feedback mechanism (edge coverage) If we only have a binary executable, an emulator instruments every basic block with a unique random number. Branch conditions: if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")). Basic block IDs: 2536, 124, 6210, 8147, 297, 4010.
22 Black-box vs Grey-box Fuzzer [Figure: Control Flow Graph (CFG) of the target program. Legend: unconditional branch, conditional branch, vulnerable. Inputs: "GET", "HTTP", "501", "HTTP1.?". Branch conditions: if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")).]
29 Grey-box Fuzzer (AFL, libFuzzer) Coverage information + Genetic Algorithm (GA). Generation: the initial seed "a" evolves into "G", "GE", and "GET" through successive mutations. Fitness function: if a generated input leads to new program coverage, F(input) = 1. Seed scheduler (selection): which inputs should be mutated? Mutation: generate the next inputs by mutating parent inputs.
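The three GA components on this slide (fitness function, seed scheduler, mutation) fit into one short loop. The following is a minimal sketch under stated assumptions, not AFL's or libFuzzer's implementation: `run_target` is a hypothetical instrumented target that reports which branches it took, the scheduler picks parents uniformly at random, and the mutator only appends or overwrites a single byte.

```python
import random


def run_target(data: bytes) -> frozenset:
    """Hypothetical instrumented target: returns the set of branches taken."""
    covered = set()
    if len(data) > 0 and data[0] == ord("G"):
        covered.add("G")
        if len(data) > 1 and data[1] == ord("E"):
            covered.add("E")
            if len(data) > 2 and data[2] == ord("T"):
                covered.add("T")
    return frozenset(covered)


def mutate(parent: bytes, rng: random.Random) -> bytes:
    """Mutation: append one random byte or overwrite one existing byte."""
    data = bytearray(parent)
    if rng.randrange(2) == 0 or not data:
        data.append(rng.randrange(256))
    else:
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)


def greybox_fuzz(seed: bytes, iterations: int, rng_seed: int = 0):
    """Coverage-guided loop: keep a child only if it reaches new coverage."""
    rng = random.Random(rng_seed)
    queue = [seed]
    global_coverage = set(run_target(seed))
    for _ in range(iterations):
        parent = rng.choice(queue)           # seed scheduler (uniform in this sketch)
        child = mutate(parent, rng)          # mutation
        coverage = run_target(child)
        if not coverage <= global_coverage:  # fitness: F(input) = 1 on new coverage
            global_coverage |= coverage
            queue.append(child)              # child becomes a parent for later rounds
    return queue, global_coverage


if __name__ == "__main__":
    saved, covered = greybox_fuzz(b"a", 100_000)
    print(saved, sorted(covered))
```

The point of the feedback is that each saved input fixes one more byte, so the fuzzer solves three 1-in-256 problems in sequence rather than one 1-in-256^3 problem.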
31 Hands-on 3 (AFL)
https://github.com/google/AFL
git clone https://github.com/google/AFL
cd AFL && make
./afl-fuzz -i -t -m -o -- @@
ls /queue # Show all saved seeds
ls /crashes # Show crash inputs
38 XNU kernel fuzzing with API inference. They infer data dependencies between API calls and generate valid stub code. HyungSeok Han, IMF: Inferred Model-based Fuzzer [CCS17]
41 How does it work? [Figure: Control Flow Graph (CFG) of the target program. Legend: conditional branch, unconditional branch, vulnerable. Inputs: "GET<\x00", "HTTP", "501", "HTTP1.?". Branch conditions: if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")).]
45 White-box Fuzzer (SAGE) It is also called "Dynamic Symbolic Execution" (DSE). Branch conditions: if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'). Inputs: "GET<\x00", "HTTP", "501", "HTTP1.?". Next seed: "GET". Congratulations! We found three program paths, reached by the inputs "G", "GE", and "GET".
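The way a white-box fuzzer walks from one seed to "G", "GE", and "GET" can be modeled with a toy concolic loop. This is a simplified sketch, not how SAGE is implemented: the target and all function names are hypothetical, and because every path constraint here is a single-byte equality, the "solver" is a direct assignment rather than a real SMT solver.

```python
from typing import List, Optional, Tuple

Constraint = Tuple[int, int, bool]  # (input index, expected byte, branch taken?)


def run_concolic(data: bytes) -> List[Constraint]:
    """Hypothetical target: run on a concrete input and record the path constraints."""
    path = []
    for index, expected in enumerate(b"GET"):
        taken = len(data) > index and data[index] == expected
        path.append((index, expected, taken))
        if not taken:
            break  # execution leaves the chain of checks here
    return path


def next_input(data: bytes, path: List[Constraint]) -> Optional[bytes]:
    """Negate the last failed branch and build an input that satisfies it."""
    index, expected, taken = path[-1]
    if taken:
        return None  # deepest branch already reached, nothing left to flip
    new_data = bytearray(data[:index])  # prefix that satisfied the earlier constraints
    new_data.append(expected)           # "solve" input[index] == expected
    return bytes(new_data)


def explore(seed: bytes) -> List[bytes]:
    """Repeat execute / collect constraints / solve until no branch is left to flip."""
    seeds = [seed]
    current = seed
    while True:
        candidate = next_input(current, run_concolic(current))
        if candidate is None:
            return seeds
        seeds.append(candidate)
        current = candidate


if __name__ == "__main__":
    print(explore(b"a"))
```

Each round needs exactly one execution to get past a check, whereas a mutation-based fuzzer needs many random tries for the same step.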
47 White-box vs Grey-box Fuzzer It is hard for a grey-box fuzzer to get past a comparison against a long magic number. if (input == 0xdeadbeefcafebabe) { crash(); } Grey-box way: mutate "0x0" and hope to hit "0xdeadbeefcafebabe"; for a random 64-bit value, P(crash) = 1/(2^64). White-box way: an SMT solver derives "0xdeadbeefcafebabe" from "0x0" directly.
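To put P(crash) = 1/(2^64) into perspective, the expected number of uniformly random guesses before a hit is 2^64. The short calculation below converts that into wall-clock time; the throughput of one million executions per second is a purely hypothetical figure chosen for illustration, and the function name is made up for this sketch.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def expected_years(bits: int, execs_per_sec: float) -> float:
    """Expected time to guess a `bits`-wide magic number by uniform random mutation."""
    expected_tries = 2 ** bits  # mean of a geometric distribution with p = 1 / 2**bits
    return expected_tries / execs_per_sec / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Assumed throughput: 1,000,000 executions per second (hypothetical).
    print(f"{expected_years(64, 1_000_000):,.0f} years")
```

A solver, by contrast, reads the constraint input == 0xdeadbeefcafebabe off the executed path and produces the satisfying value in a single query.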
48 Hybrid Fuzzing (Driller) Grey-box fuzzing combined with Dynamic Symbolic Execution (DSE). [Figure: Control Flow Graph (CFG) of the target program. Branch conditions: if (input == 0xdeadbeefcafebabe), if (input[0] == 'A'), if (input[2] < 10).] Nick Stephens, Driller: Augmenting Fuzzing Through Selective Symbolic Execution [NDSS16]
50 Edge Coverage by Compiler, Emulator/Intel PT (AFL/kAFL) Calculate hash keys from the random numbers along the program path. Branch conditions on the path: if (input[0] == 'G'), if (isinteger(input)), if (!strcmp(input, "HTTP")). Basic block IDs on the path: 2536, 124, 297, 4010. Sergej Schumilo, kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels [USENIX Security 17]
52 Dynamic Binary Rewriting (Chizpurfle, RetroWrite) Chizpurfle uses Frida to rewrite Android system services: the Stalker server rewrites each code block, inserting additional instructions that reveal basic block coverage. Antonio, Chizpurfle: A Gray-Box Android Fuzzer for Vendor Service Customizations [ISSRE17]. Sushant Dinesh, RetroWrite: Statically Instrumenting COTS Binaries for Fuzzing and Sanitization [S&P20]
54 Smart Grey-box Fuzzing (AFLSmart, Nautilus) They generate a large number of inputs based on rules, guided by feedback. [Figure: Fuzzer (Rule, Mutator), Target Program (PUT), Feedback.]
55 Smart mutation with rules. Mutate inputs using knowledge of the grammar or file format. Semantic mutation example: int f(int arg) { return g(arg); } becomes short f(int arg) { return h(arg); } File format (PIT): Van-Thuan Pham, Smart Greybox Fuzzing [TSE20]. Grammar file (EBNF): Cornelius Aschermann, NAUTILUS: Fishing for Deep Bugs with Grammars [NDSS19]
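The mutation from `int f(int arg) { return g(arg); }` to `short f(int arg) { return h(arg); }` can be reproduced with a toy grammar. This sketch only illustrates the idea of rule-based generation: the grammar, its nonterminal names, and the `generate` helper are hypothetical and do not reflect the actual input formats of AFLSmart (Peach PIT) or Nautilus.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of alternative expansions.
GRAMMAR = {
    "<func>": [["<type>", " f(int arg) { return ", "<callee>", "(arg); }"]],
    "<type>": [["int"], ["short"]],
    "<callee>": [["g"], ["h"]],
}


def generate(symbol: str, rng: random.Random) -> str:
    """Expand a symbol recursively; anything not in the grammar is a terminal."""
    if symbol not in GRAMMAR:
        return symbol
    expansion = rng.choice(GRAMMAR[symbol])
    return "".join(generate(part, rng) for part in expansion)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(generate("<func>", rng))
```

Every generated input is syntactically valid by construction, so the target's parser accepts it and the fuzzer's effort goes into the code behind the parser instead of into rejected garbage.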
57 Group work (Bug hunting in Real World) Choose your favorite real-world programs. Try to find bugs or vulnerabilities in them. You can use any fuzzer. You should try many (program, fuzzer) combinations!
58 Group work (Bug hunting in Real World)
afl-qemu (AFL against binaries): https://github.com/google/AFL
VUzzer (fuzzing with taint analysis): https://github.com/vusec/vuzzer
kAFL (fuzzing the Linux kernel): https://github.com/RUB-SysSec/kAFL
Nautilus (grammar fuzzing): https://github.com/RUB-SysSec/nautilus
Driller (hybrid fuzzing): https://blog.grimm-co.com/post/guided-fuzzing-with-driller/