High Speed Bug Discovery with Fuzzing

High Speed Bug Discovery with Fuzzing Craig Stuntz ∈ Improving

Slides https://speakerdeck.com/craigstuntz

Sad Bunny https://www.flickr.com/photos/climber85268/446766956

Spoilers!
- Why should I care? (Because it's surprisingly effective at finding bugs in software.)
- What is it? (A simple, property-based randomized testing technique.)
- When should I use it? (Integration testing complex systems with infinite input values.)
- How do I get started? (I'll suggest a bunch of tools.)
- Should I write my own? (Yes, and I have stories!)

400 Crashes, 106 Distinct Security Bugs in Adobe Flash Player
https://security.googleblog.com/2011/08/fuzzing-at-scale.html

325 C Compiler bugs in GCC, Clang, & Others https://www.flux.utah.edu/paper/yang-pldi11

“…our most prolific bug-finding tool.” Robert Guo http://queue.acm.org/detail.cfm?ref=rss&id=3059007

DARPA Cyber Grand Challenge Mayhem, Xandra, Mechanical Phish, Galactica http://blogs.grammatech.com/the-cyber-grand-challenge

What is it? https://commons.wikimedia.org/wiki/File:Rabbit_american_fuzzy_lop_buck_white.jpg

Prevent Regressions Bug Discovery Help with Code Design Meets Specifications

Integration testing Unit testing Formal verification Exploratory testing

Fuzzing Integration testing Unit testing Formal verification Exploratory testing

How Many Cases Should We Test?
- One: Unit Testing
- Only the Most Interesting: Fuzzing
- Every Possible Case: Formal Verification

Property -Based Random Testing QuickCheck Chaos Monkey Fuzzing Scientist

Corpus (Collection of Examples)

System Under Test

Property

Magic!

Corpus
- A few handwritten examples
- Fuzzing databases
- Harvest from test suites, defect reports
- Harvest from public Internet

System Under Test
- A function
- Entire application
- Part of OS kernel

Properties
- Does it crash?
- Does it hang?
- Is the output "valid"?
- Does execution trip an address or memory sanitizer?
- Does the output match some other system?

Magic
- Mutation of corpus
- Coverage guidance
- Lots of test runs
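The four pieces above (corpus, system under test, property, and "magic") can be sketched as one small loop. This is a minimal illustration, not how afl works internally: the target is Python's standard `json` module standing in for a real system under test, the property is "never raises anything other than a clean rejection", and the function names `mutate`, `system_under_test`, and `fuzz` are made up for this example. There is no coverage guidance here, only blind mutation.

```python
import json
import random


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of data with one randomly chosen bit flipped."""
    if not data:
        return data
    out = bytearray(data)
    pos = rng.randrange(len(out) * 8)
    out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)


def system_under_test(data: bytes) -> None:
    """Stand-in target (an assumption for this sketch): parse bytes as JSON."""
    try:
        json.loads(data)
    except ValueError:
        pass  # rejecting bad input cleanly is acceptable behavior


def fuzz(corpus, runs: int, seed: int = 0):
    """Mutate corpus entries and collect inputs that violate the property."""
    rng = random.Random(seed)
    failures = []
    for _ in range(runs):
        case = mutate(rng.choice(corpus), rng)
        try:
            system_under_test(case)
        except Exception as exc:  # property: no unexpected exception
            failures.append((case, exc))
    return failures
```

Calling `fuzz([b'{ "a": "bc" }'], 1000)` returns the list of failing inputs; an empty list means the property held for every mutated case that was tried.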

Possible Inputs (diagram labels: Random Inputs, Interesting Inputs, Random Inputs with Profile Guidance)

Getting Started with afl
- Compile system under test with instrumentation
- Place corpus input(s) in a folder
- Invoke afl
- Wait for bugs
https://fuzzing-project.org/tutorial3.html

Compile with Instrumentation
$ ./configure CC="afl-gcc" \
    CXX="afl-g++" \
    --disable-shared; \
  make

Place Corpus in Folder
$ mkdir in
$ cd in
$ cat > foo.json
{ "a": "bc" }
^D
$ cd ..

Invoke afl
$ afl-fuzz -i in -o out \
    my_json_parser @@
- in: folder containing corpus
- out: findings go here
- my_json_parser: system under test
- "@@" means "the current test case"

some_project
├── in
└─┬ out
  ├── queue
  ├── crashes
  └── hangs

afl In a Nutshell

afl Fuzz Strategies
- Walking bit flips (try flipping each bit in input individually)
- Walking byte flips (try flipping each contiguous set of 8 bits)
- Simple arithmetic (increment or decrement bytes in the file by certain small values)
- Known integers (replace bytes with "problematic" 8, 16, and 32 bit integers like 0 and FF)
- Profile-guided stacked tweaks and test case splicing (magic!)
https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html
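The "simple arithmetic" and "known integers" strategies can be sketched for the 8-bit case as below. This is an illustration only: the names `arith8` and `interest8` echo the strategy names used later in the deck, but the bodies are my own, and both `max_delta` and the `INTERESTING_8` table are small assumed values rather than afl's actual constants.

```python
# Illustrative subset of "problematic" 8-bit values (an assumption, not afl's table).
INTERESTING_8 = [0x00, 0x01, 0x7F, 0x80, 0xFF]


def arith8(data: bytes, max_delta: int = 4):
    """Yield copies of data with each byte incremented and decremented
    by 1..max_delta, wrapping around modulo 256."""
    for i in range(len(data)):
        for delta in range(1, max_delta + 1):
            for signed in (delta, -delta):
                out = bytearray(data)
                out[i] = (out[i] + signed) % 256
                yield bytes(out)


def interest8(data: bytes):
    """Yield copies of data with each byte replaced by a known
    problematic value, skipping replacements that change nothing."""
    for i in range(len(data)):
        for value in INTERESTING_8:
            if value != data[i]:
                out = bytearray(data)
                out[i] = value
                yield bytes(out)
```

Both functions are generators, so a fuzzer can pull test cases one at a time instead of materializing every mutation up front.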

Walking Bit Flip
Original     01010101
Flip bit 0   01010100
Flip bit 1   01010111
Flip bit 2   01010001
<etc.>

Walking 2 Bit Flip
Original        01010101
Flip bits 1,0   01010110
Flip bits 2,1   01010011
Flip bits 3,2   01011001
<etc.>
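A walking bit flip can be written as a short generator. The sketch below numbers bits the way the slide does (bit 0 is the least significant bit of the last byte); that numbering is a choice made for this example and is not taken from afl's source. The function name `walking_bit_flips` is hypothetical.

```python
def walking_bit_flips(data: bytes, width: int = 1):
    """Yield copies of data with `width` adjacent bits flipped,
    moving the window one bit position at a time."""
    total_bits = len(data) * 8
    for start in range(total_bits - width + 1):
        out = bytearray(data)
        for bit in range(start, start + width):
            # Bit 0 is the least significant bit of the last byte.
            byte_index = len(data) - 1 - bit // 8
            out[byte_index] ^= 1 << (bit % 8)
        yield bytes(out)
```

For a single input byte `0x55` (binary 01010101), `width=1` yields 8 test cases and `width=2` yields 7, because the window has to fit inside the input.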

Trace Execution Paths A B D C E F

Trace Execution Paths A B D C E F ?

When Should I Use It? When is it useful? https://it.wikipedia.org/wiki/File:Coniglio_ariete.JPG

                 Unit Tests                        Fuzzing
Useful For       Preventing Regressions, Design    Finding New Bugs
Tests            Functions                         Any Level
Test Examples    Hand-selected values              Corpus + Mutation
Execution Time   Milliseconds                      Weeks
Magic?           No                                Yes

Instrumentation for profile guidance

$ ./configure CC="afl-gcc" \
    CXX="afl-g++" \
    --disable-shared; \
  make

$ pip install python-afl
afl.init()
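A harness built around `afl.init()` might look like the sketch below. The target (the standard `json` module) and the names `run_one` and `main` are assumptions for illustration. The import is guarded so the file also runs without python-afl installed, in which case there is simply no instrumentation. The `os._exit(0)` at the end skips interpreter teardown, which, as far as I recall, is what the python-afl README recommends for speed.

```python
import json
import os
import sys

try:
    import afl  # provided by the python-afl package
except ImportError:
    afl = None  # harness still runs standalone, without instrumentation


def run_one(data: bytes) -> None:
    """Feed one test case to the target; ValueError counts as a clean rejection."""
    try:
        json.loads(data)
    except ValueError:
        pass


def main() -> None:
    """Read one test case from stdin and run it under afl if available."""
    if afl is not None:
        afl.init()  # start the fork server after the expensive imports
    run_one(sys.stdin.buffer.read())
    os._exit(0)  # exit immediately, skipping interpreter cleanup
```

To use it, call `main()` at the bottom of the script and launch it with something like `py-afl-fuzz -i in -o out -- python harness.py`. Any uncaught exception other than `ValueError` then shows up as a crash.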

Properties a.k.a. Specifications https://lorinhochstein.wordpress.com/2014/06/04/crossing-the-river-with-tla/

STJSON
A JSON Parser in Swift 3 compliant with RFC 7159
STJSON was written along with the article Parsing JSON is a Minefield.
Basic usage:
var p = STJSONParser(data: data)
do {
    let o = try p.parse()
} catch let e {
    print(e)
}
Instantiation with options:
var p = STJSON(data:data, maxParserDepth:1024, options:[.useUnicodeReplacementCharacter])
https://github.com/nst/STJSON
https://github.com/CraigStuntz/Fizil/tree/master/StJson

Can You Fuzz the Domain?

Dumb Fuzzer Mangling Byte Arrays

public byte[] ResizePng( byte[] image) {

public boolean SomeFunction( SomeEnum firstArg, int secondArg) { ✗

Smart Fuzzing MongoDB Expression Grammar http://queue.acm.org/detail.cfm?ref=rss&id=3059007

public Assembly Compile( AbstractNode syntaxTree) {

Worth the Wait? Results Can Take Weeks

Isn’t it just for Security? https://www.flickr.com/photos/wocintechchat/25721078480/

How do I get started?

How to Get Started with Fuzzing
1. Find a program to test
2. Find a fuzzer
3. Find a corpus
4. Choose a property
5. Let it run!

Fuzzers https://commons.wikimedia.org/wiki/File:Holland_Lop_with_Broken_Orange_Coloring.jpg

libfuzzer Fuzz testing for LLVM compilers http://llvm.org/docs/LibFuzzer.html

$ cargo install cargo-fuzz
$ cargo fuzz init

$ gzip -c /bin/bash > sample.gz
$ while true
  do
    radamsa sample.gz > fuzzed.gz      ← Fuzz the corpus
    gzip -dc fuzzed.gz > /dev/null     ← Execute S.O.T.
    test $? -gt 127 && break           ← Check a property
  done                                 ← Repeat a lot!
https://github.com/aoh/radamsa

ClusterFuzz Submit a fuzzer, win a bounty https://security.googleblog.com/2016/08/guided-in-process-fuzzing-of-chrome.html

OSS-Fuzz Submit your project https://github.com/google/oss-fuzz

burp, ZAP

Corpus https://en.wikipedia.org/wiki/File:Long_Room_Interior,_Trinity_College_Dublin,_Ire

"We didn't call it fuzzing back in the 1950s, but it was our standard practice to test programs by inputting decks of punch cards taken from the trash."
-Gerald M. Weinberg
http://secretsofconsulting.blogspot.com/2017/02/fuzz-testing-and-fuzz-history.html

Fuzzing SQLite with afl
- Start with a single test case:
  create table t1(one smallint);
  insert into t1 values(1);
  select * from t1;
- Add a list of reserved words from documentation
- Then extract SQL statements from SQLite unit tests (550 files at around 220 bytes each)
https://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html

Properties

Don’t Crash or Hang
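Checking "don't crash or hang" from outside the process can be sketched with the standard `subprocess` module. The function name `classify` and the timeout value are assumptions for this example, and the crash test relies on POSIX behavior, where a negative return code means the child was killed by a signal. On Windows the check would need to look for specific exit codes instead.

```python
import subprocess
import sys


def classify(argv, data: bytes, timeout: float = 10.0) -> str:
    """Run argv with data on stdin and report 'ok', 'crash', or 'hang'."""
    try:
        proc = subprocess.run(
            argv, input=data, capture_output=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return "hang"  # the child was killed after exceeding the timeout
    # POSIX assumption: negative return code == terminated by a signal.
    return "crash" if proc.returncode < 0 else "ok"
```

For example, `classify([sys.executable, "-c", "import os; os.abort()"], b"")` runs a child that aborts itself. A real fuzzer would save the offending input in a `crashes` or `hangs` folder at this point.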

Sanitizers and Canaries https://docs.google.com/presentation/d/19OSgb1N9Ezef39Blb-5lkzycq7-tMtAvy825FofyrmY

Validators

End to End
Input → f(input) → f⁻¹(f(input)) → Output
Same as Input?
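The end-to-end (round trip) property can be sketched with `json.dumps` playing the role of f and `json.loads` playing f⁻¹. The generator `random_value` is a made-up helper that produces only values which JSON can represent exactly (no floats, no tuples), which is an assumption that keeps the property honest.

```python
import json
import random
import string


def random_text(rng: random.Random) -> str:
    """Return a short random string of printable characters."""
    return "".join(rng.choice(string.printable) for _ in range(rng.randint(0, 8)))


def random_value(rng: random.Random, depth: int = 0):
    """Generate a small random value that JSON can represent exactly."""
    kinds = ["int", "str", "bool", "none"]
    if depth < 3:
        kinds += ["list", "dict"]  # limit nesting so generation terminates
    kind = rng.choice(kinds)
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        return random_text(rng)
    if kind == "bool":
        return rng.choice([True, False])
    if kind == "none":
        return None
    if kind == "list":
        return [random_value(rng, depth + 1) for _ in range(rng.randint(0, 3))]
    return {random_text(rng): random_value(rng, depth + 1)
            for _ in range(rng.randint(0, 3))}


def round_trips(value) -> bool:
    """Property: decoding the encoded value gives back the original."""
    return json.loads(json.dumps(value)) == value
```

A test run is then just `all(round_trips(random_value(rng)) for _ in range(1000))` with a seeded `random.Random`. Any value for which it returns `False` is a bug in either the encoder or the decoder.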

Legacy Code https://commons.wikimedia.org/wiki/File:Blue-punch-card-front-horiz.png

One of The Things …Is Not Like the Others!

Should I write my own?

Heck Yeah! https://github.com/CraigStuntz/Fizil

Interesting Stuff I Learned While Writing a Fuzzer
- F# bitwise operations
- How to instrument .NET code
- dnSpy is awesome
- Same input -> Same code -> Different paths
- Strong naming is painful
- Unicode is also painful
- MemoryMappedFile performance is straight-up awful

let jsonNetResult =
    try
        JsonConvert.DeserializeObject<obj>(str) |> ignore  // ← Test
        Success
    with
    // ⬑ Special case error stuff
    | :? JsonReaderException as jre -> jre.Message |> Error
    | :? JsonSerializationException as jse -> jse.Message |> Error
    | :? System.FormatException as fe ->
        if fe.Message.StartsWith("Invalid hex character") // hard coded in Json.NET
        then fe.Message |> Error
        else reraise()

// ← Set up
use proc = new Process()
proc.StartInfo.FileName <- executablePath
inputMethod.BeforeStart proc testCase.Data
proc.StartInfo.UseShellExecute <- false
proc.StartInfo.RedirectStandardOutput <- true
proc.StartInfo.RedirectStandardError <- true
proc.StartInfo.EnvironmentVariables.Add(SharedMemory.environmentVariableName, sharedMemoryName)
let output = new System.Text.StringBuilder()
let err = new System.Text.StringBuilder()
proc.OutputDataReceived.Add(fun args -> output.Append(args.Data) |> ignore)
proc.ErrorDataReceived.Add (fun args -> err.Append(args.Data) |> ignore)
proc.Start() |> ignore
inputMethod.AfterStart proc testCase.Data
// ← Read results
proc.BeginOutputReadLine()
proc.BeginErrorReadLine()
proc.WaitForExit()
let exitCode = proc.ExitCode
// ← Important bit
let crashed = exitCode = WinApi.ClrUnhandledExceptionCode

/// An ordered list of functions to use when starting with a single piece of
/// example data and producing new examples to try
let private allStrategies = [
    bitFlip 1
    bitFlip 2
    bitFlip 4
    byteFlip 1
    byteFlip 2
    byteFlip 4
    arith8
    arith16
    arith32
    interest8
    interest16
]

let totalBits = bytes.Length * 8
let testCases = seq {
    for bit = 0 to totalBits - flipBits do
        let newBytes = Array.copy bytes
        let firstByte = bit / 8
        let firstByteMask, secondByteMask = bitMasks(bit, flipBits)
        // Fuzz one byte; ^^^ means xor
        let newFirstByte = bytes.[firstByte] ^^^ firstByteMask
        newBytes.[firstByte] <- newFirstByte
        let secondByte = firstByte + 1
        if secondByteMask <> 0uy && secondByte < bytes.Length
        then
            let newSecondByte = bytes.[secondByte] ^^^ secondByteMask
            newBytes.[secondByte] <- newSecondByte
        yield newBytes
}

private static void F(string arg)
{
    Console.WriteLine("f");
    Console.Error.WriteLine("Error!");
    Environment.Exit(1);
}

private static void F(string arg)
{
    instrument.Trace(29875);  // ← Random number
    Console.WriteLine("f");
    Console.Error.WriteLine("Error!");
    Environment.Exit(1);
}

private static void F(string arg)
{
#if MANUAL_INSTRUMENTATION
    instrument.Trace(29875);
#endif
    Console.WriteLine("f");
    Console.Error.WriteLine("Error!");
    Environment.Exit(1);
}
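What a trace call with a per-location random number buys you can be sketched as follows. The bookkeeping below follows the scheme described in the afl technical details document (bump a counter at `cur_location ^ prev_location`, then store `cur_location >> 1` as the previous location). Whether Fizil's `Trace` does exactly this is an assumption, and the `Instrument` class and its method names are hypothetical.

```python
MAP_SIZE = 1 << 16  # one counter per possible 16-bit edge identifier


class Instrument:
    """Records which (previous block, current block) edges were executed."""

    def __init__(self) -> None:
        self.shared_mem = bytearray(MAP_SIZE)
        self.prev_location = 0

    def trace(self, cur_location: int) -> None:
        """Count the edge from the previous location to cur_location."""
        index = (cur_location ^ self.prev_location) % MAP_SIZE
        self.shared_mem[index] = (self.shared_mem[index] + 1) % 256
        # Shifting keeps A->B distinct from B->A and A->A distinct from B->B.
        self.prev_location = cur_location >> 1

    def edges(self) -> set:
        """Return the set of map slots hit at least once."""
        return {i for i, hits in enumerate(self.shared_mem) if hits}
```

A fuzzer keeps a test case in its queue when `edges()` for that run contains a slot no earlier run has hit. That is the "coverage guidance" part of the magic: the same blocks visited in a different order produce different slots, so new paths are noticed even when no new code is reached.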

let stringify (ob: obj) : string = JsonConvert.SerializeObject(ob)

let stringify (ob: obj) : string =
    JsonConvert.SerializeObject(ob)

// Method: System.String\u0020Program::stringify(System.Object)
.body stringify {
    arg_02_0 [generated]
    arg_07_0 [generated]
    nop()
    arg_02_0 = ldloc(ob)
    arg_07_0 = call(JsonConvert::SerializeObject, arg_02_0)
    ret(arg_07_0)
}

// Method: System.String\u0020Program::stringify(System.Object)
.body stringify {
    arg_05_0 [generated]
    arg_0C_0 [generated]
    arg_11_0 [generated]
    arg_05_0 = ldc.i4(23831)
    call(Instrument::Trace, arg_05_0)
    nop()
    arg_0C_0 = ldloc(ob)
    arg_11_0 = call(JsonConvert::SerializeObject, arg_0C_0)
    ret(arg_11_0)
}

let private insertTraceInstruction(ilProcessor: ILProcessor, before: Instruction, state) =
    let compileTimeRandom = state.Random.Next(0, UInt16.MaxValue |> Convert.ToInt32)
    let ldArg = ilProcessor.Create(OpCodes.Ldc_I4, compileTimeRandom)
    let callTrace = ilProcessor.Create(OpCodes.Call, state.Trace)
    ilProcessor.InsertBefore(before, ldArg)
    ilProcessor.InsertAfter (ldArg, callTrace)

This margin is too narrow to contain a try/finally example, so see: https://goo.gl/W4y7JH

http://www.json.org/

https://tools.ietf.org/html/rfc4627

http://www.ecma-international.org/ecma-262/5.1/#sec-15.12

http://www.ecma-international.org/publications/standards/Ecma-404.htm

https://github.com/nst/STJSON

https://github.com/CraigStuntz/Fizil/blob/master/StJson/StJsonParser.fs

{ "a" : "bc" }

Standard Accepts, Json.NET Rejects
Value: 88888888888888888888888888888888888888888888888888 88888888888888888888888888888888888888888888888888 88888888888888888888888888888888888888888888888888 88888888888888888888888888888888888888888888888888 88888888888888888888888888888888888888888888888888
Standard Says: No limit
Json.NET: MaximumJavascriptIntegerCharacterLength = 380;

Standard Rejects, Json.NET Accepts
Value: [,,,]
Standard Says: A JSON value MUST be an object, array, number, or string, or one of the following three literal names: false null true
Json.NET: [null, null, null, null]
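Disagreements like the two above are what a "does the output match some other system?" property finds. The sketch below compares two parsers on the same inputs and reports the inputs where one accepts and the other rejects. It does not use Json.NET or STJSON. As a stand-in, it pits Python's default `json.loads`, which accepts the non-standard literals `NaN` and `Infinity`, against a stricter configuration of the same function. The names `lenient`, `strict`, `accepts`, and `disagreements` are hypothetical.

```python
import json


def lenient(text: str):
    """Python's default parser: accepts NaN, Infinity, and -Infinity."""
    return json.loads(text)


def _reject_constant(name: str):
    """Hook used by strict(): refuse literals the JSON grammar lacks."""
    raise ValueError(f"non-standard constant: {name}")


def strict(text: str):
    """Same parser, configured to reject the non-standard literals."""
    return json.loads(text, parse_constant=_reject_constant)


def accepts(parser, text: str) -> bool:
    """True if the parser returns normally, False on a clean rejection."""
    try:
        parser(text)
        return True
    except ValueError:
        return False


def disagreements(inputs):
    """Return the inputs on which the two parsers give different verdicts."""
    return [t for t in inputs if accepts(lenient, t) != accepts(strict, t)]
```

Feeding the fuzzer's output through `disagreements` turns each mismatch into a question for a human: which parser is right according to the standard?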

let private removeStrongName (assemblyDefinition : AssemblyDefinition) =
    let name = assemblyDefinition.Name;
    name.HasPublicKey <- false;
    name.PublicKey <- Array.empty;
    assemblyDefinition.Modules |> Seq.iter (
        fun moduleDefinition ->
            moduleDefinition.Attributes <- moduleDefinition.Attributes &&& ~~~ModuleAttributes.StrongNameSigned)
    let aptca = assemblyDefinition.CustomAttributes.FirstOrDefault(
        fun attr -> attr.AttributeType.FullName = typeof<System.Security.AllowPartiallyTrustedCallersAttribute>.FullName)
    assemblyDefinition.CustomAttributes.Remove aptca |> ignore
    assembly.MainModule.AssemblyReferences
    |> Seq.filter (fun reference -> Set.contains reference.Name assembliesToInstrument)
    |> Seq.iter (fun reference -> reference.PublicKeyToken <- null )

"If marked BeforeFieldInit then the type's initializer method is executed at, or sometime before, first access to any static field defined for that type."
-ECMA-335, Common Language Infrastructure (CLI), Partition I

Unicode
Original JSON: { "a" : "bc" }
ASCII Bytes: 7B 20 22 61 22 20 3A 20 22 62 63 22 20 7D
UTF-8 with Byte Order Mark: EF BB BF 7B 20 22 61 22 20 3A 20 22 62 63 22 20 7D
UTF-16 BE with BOM: FE FF 00 7B 00 20 00 22 00 61 00 22 00 20 00 3A 00 20 00 22 00 62 00 63 00 22 00 20 00 7D
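The byte sequences above can be reproduced with Python's standard codecs, which is a handy way to seed a corpus with the same document in several encodings. Note that the listed bytes include a space (20) before the colon (3A), so the sketch uses `{ "a" : "bc" }` as its text. The helper names `encodings` and `hexdump` are made up for this example.

```python
import codecs

TEXT = '{ "a" : "bc" }'  # matches the byte listings, space before the colon


def encodings(text: str) -> dict:
    """Return the text encoded three ways, with BOMs where noted."""
    return {
        "ascii": text.encode("ascii"),
        "utf-8-bom": codecs.BOM_UTF8 + text.encode("utf-8"),
        "utf-16-be-bom": codecs.BOM_UTF16_BE + text.encode("utf-16-be"),
    }


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return data.hex(" ").upper()
```

Printing `hexdump(v)` for each value of `encodings(TEXT)` gives one line per encoding. A parser that only ever saw the ASCII form in its unit tests may behave differently on the other two.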

Resources
- MongoDB's JavaScript Fuzzer http://queue.acm.org/detail.cfm?ref=rss&id=3059007
- afl technical details http://lcamtuf.coredump.cx/afl/technical_details.txt
- afl Help Email List [email protected]
- Fizil https://github.com/CraigStuntz/Fizil
WTF, ACM?

Thank You!
- Michał Zalewski, for afl documentation
- Rehearsal audiences, employees of
  - Dynamit
  - Improving
  - Ineffable Solutions

Craig Stuntz [email protected] www.craigstuntz.com @craigstuntz http://www.meetup.com/Papers-We-Love-Columbus/
