Slide 1

Slide 1 text

Systematic Assessment of Fuzzers using Mutation Analysis
Philipp Görz (1) @[email protected]
Björn Mathis (1) @bjrnmath
Keno Hassler (1)
Emre Güler (2) @emrexgueler
Thorsten Holz (1) @thorstenholz
Andreas Zeller (1) @andreaszeller
Rahul Gopinath (3) @[email protected]
(1) CISPA Helmholtz Center for Information Security, Germany
(2) Ruhr-University Bochum, Germany
(3) University of Sydney, Australia

Slide 2

Slide 2 text

Fuzz Testing / Fuzzing https://lcamtuf.coredump.cx/afl/

Slide 3

Slide 3 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
USENIX — Systematic Assessment of Fuzzers using Mutation Analysis

Slide 4

Slide 4 text

Evaluating Fuzzers - Coverage? https://github.com/gcovr/gcovr

Slide 5

Slide 5 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults

Slide 6

Slide 6 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage

Slide 7

Slide 7 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔

Slide 8

Slide 8 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —

Slide 9

Slide 9 text

Evaluating Fuzzers - Finding New Bugs? https://www.cve.org/

Slide 10

Slide 10 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —

Slide 11

Slide 11 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs

Slide 12

Slide 12 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘

Slide 13

Slide 13 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘

Slide 14

Slide 14 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔

Slide 15

Slide 15 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —

Slide 16

Slide 16 text

Evaluating Fuzzers - Refinding Known Bugs?

Slide 17

Slide 17 text

Evaluating Fuzzers - Refinding Known Bugs? https://hexhive.epfl.ch/magma/

Slide 18

Slide 18 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —

Slide 19

Slide 19 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs

Slide 20

Slide 20 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔

Slide 21

Slide 21 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘

Slide 22

Slide 22 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘         ✘

Slide 23

Slide 23 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘         ✘                ✔

Slide 24

Slide 24 text

Mutation Testing / Mutation Analysis
Fuzzing Your Test Suite

Slide 25

Slide 25 text

Mutation Testing / Mutation Analysis
unsigned int len = message_length(msg);
if (len < MAX_BUF_LEN) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}

Slide 26

Slide 26 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len < MAX_BUF_LEN) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}

Slide 27

Slide 27 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len ② < >= MAX_BUF_LEN) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}

Slide 28

Slide 28 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len ② < >= MAX_BUF_LEN ③ + 16) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}

Slide 29

Slide 29 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len ② < >= MAX_BUF_LEN ③ + 16) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}
✔

Slide 30

Slide 30 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len ② < >= MAX_BUF_LEN ③ + 16) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}
✔

Slide 31

Slide 31 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len ② < >= MAX_BUF_LEN ③ + 16) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}
✔ ✘

Slide 32

Slide 32 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len ② < >= MAX_BUF_LEN ③ + 16) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}
✔ ✘

Slide 33

Slide 33 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len ② < >= MAX_BUF_LEN ③ + 16) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}
✔ ✘ ✔

Slide 34

Slide 34 text

Mutation Testing / Mutation Analysis
① unsigned int len = message_length(msg);
if (len ② < >= MAX_BUF_LEN ③ + 16) {
    copy_message(msg);
} else {
    // Invalid length, handle error
}
✔ ✘ ✔ ?
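
To make the notion of "killing" a mutant concrete, the following sketch models mutations ② and ③ as variants of the length check. It is an illustration under assumptions, not code from the talk: MAX_BUF_LEN is set to 64, copying a message of MAX_BUF_LEN bytes or more is treated as a fault (standing in for a crash or sanitizer report), and the names variant_t, would_copy, faults, and kills are hypothetical. Mutation ① is left out because the transcript does not show what it changes.

```c
#include <stdbool.h>

#define MAX_BUF_LEN 64u /* assumed buffer size, not given on the slide */

/* The original length check and two of its mutated variants. */
typedef enum { ORIGINAL, MUTANT_GE, MUTANT_PLUS_16 } variant_t;

/* Returns true if this variant would call copy_message for a message of length len. */
bool would_copy(variant_t v, unsigned int len) {
    switch (v) {
    case MUTANT_GE:      return len >= MAX_BUF_LEN;       /* mutation 2: < becomes >= */
    case MUTANT_PLUS_16: return len < MAX_BUF_LEN + 16u;  /* mutation 3: bound + 16   */
    default:             return len < MAX_BUF_LEN;        /* unmutated check          */
    }
}

/* Assumed oracle: copying MAX_BUF_LEN bytes or more overflows the buffer. */
bool faults(variant_t v, unsigned int len) {
    return would_copy(v, len) && len >= MAX_BUF_LEN;
}

/* An input kills a mutant if it faults in the mutant but not in the original. */
bool kills(variant_t mutant, unsigned int len) {
    return faults(mutant, len) && !faults(ORIGINAL, len);
}
```

Under these assumptions, kills(MUTANT_GE, 64) and kills(MUTANT_PLUS_16, 70) both hold, while an input of length 10 kills neither mutant: the test generator has to reach the boundary region before the mutation becomes observable.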

Slide 35

Slide 35 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘         ✘                ✔

Slide 36

Slide 36 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘         ✘                ✔
Mutation Testing

Slide 37

Slide 37 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘         ✘                ✔
Mutation Testing  ✔

Slide 38

Slide 38 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘         ✘                ✔
Mutation Testing  ✔           ✔

Slide 39

Slide 39 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘         ✘                ✔
Mutation Testing  ✔           ✔         ✔

Slide 40

Slide 40 text

Evaluating Fuzzers
                  Comparable  Unbiased  Custom Subjects  Guaranteed Faults
Coverage          ✔           —         —                —
New Bugs          ✘           ✘         ✔                —
Known Bugs        ✔           ✘         ✘                ✔
Mutation Testing  ✔           ✔         ✔                ✘

Slide 41

Slide 41 text

What’s the Problem?

Slide 42

Slide 42 text

What’s the Problem?
• Computationally Expensive!

Slide 43

Slide 43 text

What’s the Problem?
• Computationally Expensive!
• Mutation Testing: Execute Test Generator (Fuzzer) for each Mutation

Slide 44

Slide 44 text

What’s the Problem?
• Computationally Expensive!
• Mutation Testing: Execute Test Generator (Fuzzer) for each Mutation
• Fuzzing: The More Executions the Better

Slide 45

Slide 45 text

Contributions

Slide 46

Slide 46 text

Contributions
• Reduce Computational Costs

Slide 47

Slide 47 text

Contributions
• Reduce Computational Costs
• Split Phases

Slide 48

Slide 48 text

Contributions
• Reduce Computational Costs
• Split Phases
• Coverage Fuzzing

Slide 49

Slide 49 text

Contributions
• Reduce Computational Costs
• Split Phases
• Coverage Fuzzing
• Mutation Fuzzing

Slide 50

Slide 50 text

Contributions
• Reduce Computational Costs
• Split Phases
• Coverage Fuzzing
• Mutation Fuzzing
• Supermutants

Slide 51

Slide 51 text

Contributions
• Reduce Computational Costs
• Split Phases
• Coverage Fuzzing
• Mutation Fuzzing
• Supermutants
• Evaluate Multiple Mutations with one Fuzzing Run

Slide 52

Slide 52 text

Contributions
• Reduce Computational Costs
• Split Phases
• Coverage Fuzzing
• Mutation Fuzzing
• Supermutants
• Evaluate Multiple Mutations with one Fuzzing Run
• Mutation Operators

Slide 53

Slide 53 text

Contributions
• Reduce Computational Costs
• Split Phases
• Coverage Fuzzing
• Mutation Fuzzing
• Supermutants
• Evaluate Multiple Mutations with one Fuzzing Run
• Mutation Operators
• Traditional Operators

Slide 54

Slide 54 text

Contributions
• Reduce Computational Costs
• Split Phases
• Coverage Fuzzing
• Mutation Fuzzing
• Supermutants
• Evaluate Multiple Mutations with one Fuzzing Run
• Mutation Operators
• Traditional Operators
• Security Specific Operators
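
The supermutant idea listed above can be sketched as a grouping problem. The rule used here is an assumption inferred from the slides: two mutations may share one supermutant only if no seed input covers both, so that a crash can still be attributed to a single mutation. The greedy first-fit strategy, the bitmask encoding (at most 64 seed inputs), and the name group_supermutants are illustrative choices, not the authors' implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Greedy first-fit grouping of mutants into supermutants (illustrative only).
 *
 * cover[i]    bitmask, bit s set if seed input s reaches mutant i
 * n           number of mutants
 * group_of    output, group_of[i] is the supermutant index of mutant i
 * group_mask  scratch space with n entries, union of cover masks per group
 *
 * Returns the number of supermutants created. */
size_t group_supermutants(const uint64_t *cover, size_t n,
                          size_t *group_of, uint64_t *group_mask) {
    size_t groups = 0;
    for (size_t i = 0; i < n; i++) {
        size_t g = 0;
        /* Skip groups that already hold a mutant covered by a shared seed. */
        while (g < groups && (group_mask[g] & cover[i]) != 0) {
            g++;
        }
        if (g == groups) {
            group_mask[groups++] = 0; /* open a new supermutant */
        }
        group_mask[g] |= cover[i];
        group_of[i] = g;
    }
    return groups;
}
```

With cover masks {1, 2, 1, 6, 4}, the five mutants fit into two supermutants, so two fuzzing runs would replace five. A mutant with an empty mask joins the first group in this sketch; a real pipeline would presumably filter uncovered mutants out beforehand.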

Slide 55

Slide 55 text

Results
• Coverage Accounts for most Mutants Detected

Slide 56

Slide 56 text

Results
• Coverage Accounts for most Mutants Detected
• ASAN Moderately Increases Number of Killed Mutants

Slide 57

Slide 57 text

Results
• Coverage Accounts for most Mutants Detected
• ASAN Moderately Increases Number of Killed Mutants
• Mutations are Coupled to Real Faults

Slide 58

Slide 58 text

Code
Code is Publicly Available!
Interested? Talk to Us!
SBFT’24?!
github.com/CISPA-SysSec/mua_fuzzer_bench

Slide 59

Slide 59 text

Recap of four earlier slides: Mutation Testing / Mutation Analysis; Evaluating Fuzzers; Contributions; Code
github.com/CISPA-SysSec/mua_fuzzer_bench

Slide 61

Slide 61 text

Compilation Procedure (diagram)
Labels: Subject (bitcode file), Mutation Finder, Location Executable, Mutation IDs, Mutation IDs for a Supermutant, Mutator, Supermutant (bitcode file), Base Compiler, Fuzzer Compiler, Unmutated Executable, Mutated Executable, Instrumented Mutated Executable, Result of Subject, Result of Supermutant

Slide 62

Slide 62 text

Checking Procedure (diagram)
Labels: Benchmark Manager, Seeds, Fuzzer(s), Unmutated Executable, Mutated Executable, Instrumented Mutated Executable, Crashing Input, Mutation covered?, Mutation killed?
1. Check if Seeds (after Phase I) already kill mutation(s)
2. Use Seeds to start Fuzzer (each fuzzer is initialized with its respective seeds after Phase I)
3. Fuzz using the fuzzer's respective executable
   Run input to get covered mutations
4. Check if found Crashing Input kills Mutant
   Run input to check that crash does not happen in unmutated executable
   Run input to check if crash can be confirmed
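
Step 4 of the checking procedure can be sketched as a small decision function. This is a minimal illustration, assuming each executable is wrapped in a callback that reports whether it crashes on a given input. The names crashes_fn, verdict_t, and check_crashing_input are hypothetical, and the three demo_* functions are toy stand-ins for running real binaries.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical runner: returns true if the executable crashes on the input. */
typedef bool (*crashes_fn)(const unsigned char *input, size_t len);

typedef enum { NOT_KILLED, KILLED, ORIGINAL_CRASH } verdict_t;

/* Decides whether a crashing input found by a fuzzer kills the mutant. */
verdict_t check_crashing_input(crashes_fn run_mutated, crashes_fn run_unmutated,
                               const unsigned char *input, size_t len) {
    if (!run_mutated(input, len)) {
        return NOT_KILLED;      /* the crash could not be confirmed */
    }
    if (run_unmutated(input, len)) {
        return ORIGINAL_CRASH;  /* the crash is not caused by the mutation */
    }
    return KILLED;
}

/* Toy stand-ins for real executables, used only for demonstration. */
bool demo_mutated(const unsigned char *input, size_t len) {
    (void)input;
    return len > 8; /* pretend the mutant crashes on long inputs */
}

bool demo_unmutated_ok(const unsigned char *input, size_t len) {
    (void)input;
    (void)len;
    return false;
}

bool demo_unmutated_buggy(const unsigned char *input, size_t len) {
    (void)input;
    (void)len;
    return true;
}
```

Separating ORIGINAL_CRASH from NOT_KILLED is a design choice of this sketch: an input that also crashes the unmutated executable points at a fault in the subject itself rather than at the mutation.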

Slide 63

Slide 63 text

ASan Percentages (bar chart)
Panels: aflpp, honggfuzz, libfuzzer
Subjects: cares_name, cares_parse_reply, curl, guetzli, libevent, re2, woff2_new
Bars per subject: default, asan
X-axis: Percentage of Covered Mutations that are Killed (0% to 40%)
Legend (Found By): asan, default, both
Value labels, in extraction order: 2.7% 24.7% 24.7% 0.0% 5.5% 21.9% 21.9% 0.0% 5.6% 21.1% 21.1% 0.0% 16.2% 32.3% 32.3% 1.8% 16.3% 32.1% 32.1% 1.8% 18.0% 31.5% 31.5% 0.9% 7.0% 22.2% 22.2% 0.6% 7.5% 21.9% 21.9% 0.5% 7.5% 22.4% 22.4% 0.6% 6.7% 23.3% 23.3% 3.0% 7.4% 25.0% 25.0% 2.5% 7.3% 25.4% 25.4% 2.0% 12.4% 18.4% 18.4% 0.6% 12.6% 18.3% 18.3% 0.6% 12.1% 18.5% 18.5% 0.6% 10.4% 35.8% 35.8% 1.7% 10.4% 35.0% 35.0% 1.8% 10.0% 35.4% 35.4% 1.2% 3.7% 17.1% 17.1% 2.9% 3.6% 17.2% 17.2% 3.0% 3.0% 16.8% 16.8% 1.1%

Slide 64

Slide 64 text

Supermutants Computational Reduction
Subject              #Mutants  #Supermutants  Factor
Curl                 29,118    5,804          5.02
Guetzli              22,961    13,040         1.76
Woff2 (New)          40,914    5,930          6.90
Cares (Name)         4,822     550            8.77
Cares (Parse Reply)  4,822     1,288          3.74
libevent             17,234    864            19.95
re2                  21,407    9,670          2.21
Sum                  141,278   37,146         3.80
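
The Factor column appears to be the ratio of mutants to supermutants, which can be checked against the numbers on the slide. The sketch below is a worked calculation under that reading; row_t, ROWS, reduction_factor, and overall_factor are names made up for this example.

```c
#include <stddef.h>

/* One row of the reduction table (counts copied from the slide). */
typedef struct {
    const char *subject;
    unsigned long mutants;
    unsigned long supermutants;
} row_t;

const row_t ROWS[] = {
    {"Curl", 29118, 5804},
    {"Guetzli", 22961, 13040},
    {"Woff2 (New)", 40914, 5930},
    {"Cares (Name)", 4822, 550},
    {"Cares (Parse Reply)", 4822, 1288},
    {"libevent", 17234, 864},
    {"re2", 21407, 9670},
};
enum { NUM_ROWS = sizeof ROWS / sizeof ROWS[0] };

/* Mutants per supermutant: how many single-mutant runs one run replaces. */
double reduction_factor(unsigned long mutants, unsigned long supermutants) {
    return (double)mutants / (double)supermutants;
}

/* Reduction factor over all subjects combined (the "Sum" row). */
double overall_factor(void) {
    unsigned long mutants = 0, supermutants = 0;
    for (size_t i = 0; i < NUM_ROWS; i++) {
        mutants += ROWS[i].mutants;
        supermutants += ROWS[i].supermutants;
    }
    return reduction_factor(mutants, supermutants);
}
```

The overall factor comes out at roughly 3.80, matching the Sum row. The per-subject spread (1.76 for Guetzli, 19.95 for libevent) is the reason to compute it per row as well.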

Slide 65

Slide 65 text

Wallclock Time
                 CPU (Years)  4 Servers (Days)
Seed Collection  1.99         3.50
Default          14.37        25.22
Seed + Default   16.36        28.72
ASAN             15.16        26.61
24 Hours Runs    7.42         13.02
Sum              38.95 Years  68.34 Days
Four servers: Intel Xeon Gold 6230R CPU (52 cores) and 188 GB RAM.
Note that evaluating a single fuzzer takes 4.09 CPU years with our chosen subjects ("Seed + Default" / #Fuzzers).
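
The relation between the two columns can be reproduced with a short calculation. The sketch assumes a 365-day year, all 4 × 52 = 208 cores busy for the whole run, and four fuzzers (the four named on the 24 Hour Runs slide). The function names are hypothetical.

```c
/* Converts CPU years into wall-clock days on a cluster, assuming every core
 * is fully used and a year has 365 days (both are assumptions). */
double cpu_years_to_days(double cpu_years, unsigned servers,
                         unsigned cores_per_server) {
    return cpu_years * 365.0 / (double)(servers * cores_per_server);
}

/* CPU years needed per fuzzer when the cost is shared evenly. */
double cpu_years_per_fuzzer(double cpu_years, unsigned fuzzers) {
    return cpu_years / (double)fuzzers;
}
```

For example, cpu_years_to_days(38.95, 4, 52) gives about 68.3 days, in line with the Sum row, and cpu_years_per_fuzzer(16.36, 4) gives the 4.09 CPU years quoted in the note. Small differences in the last digit come from the rounded inputs.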

Slide 66

Slide 66 text

24 Hour Runs
Prog                 Total  AFL  AFL++  libFuzzer  Honggfuzz
re2                  104    0    0      0          0
Woff2 (New)          104    0    0      0          1
Curl                 104    0    0      1          0
Guetzli              104    0    0      0          1
Libevent             104    0    0      0          0
Cares (Name)         66     0    0      0          0
Cares (Parse Reply)  104    0    0      0          0
Mutants killed during 24 hour runs on 104 stubborn mutants for each subject using ASAN.

Slide 67

Slide 67 text

Not Independent Mutants
Program              afl    aflpp  honggfuzz  libfuzzer
Curl                 4,850  5,836  4,851      3,852
Guetzli              10     24     16         0
Libevent             0      2      0          0
re2                  39     66     37         47
Woff2 (New)          26     46     56         48
Cares (Name)         4      0      0          0
Cares (Parse Reply)  2      4      4          0
Number of mutants that were covered together with other mutants (i.e., mutants wrongly thought independent).