High Speed Bug Discovery with Fuzzing

Unit testing helps prevent regressions and guide design, but it doesn't do a great job of helping you with exploratory testing. How can you find hidden defects in your code without a lot of manual analysis? Fuzzing is a simple but surprisingly effective technique which has been responsible for finding nearly all of the security vulnerabilities uncovered in Flash over the past five years. But it's not limited to finding security defects! The technique has been used very successfully to stabilize the Microsoft document importers for Open Office and to check C++ compiler standards compliance. You'll leave this talk knowing when to use fuzzing to test your application, which tools you should use, how to implement a fuzzer from scratch, and when other techniques are a better choice.

Craig Stuntz

May 05, 2017

Transcript

  1. High Speed Bug Discovery
    with Fuzzing
    Craig Stuntz ∈ Improving

  2. Slides
    https://speakerdeck.com/craigstuntz

  6. Sad
    Bunny
    https://www.flickr.com/photos/climber85268/446766956

  7. Spoilers!
    Why should I care?
    (because it’s surprisingly effective at finding bugs in software)
    What is it?
    (a simple, property-based randomized testing technique)
    When should I use it?
    (integration testing complex systems with infinite input values)
    How do I get started?
    (I’ll suggest a bunch of tools)
    Should I write my own?
    (yes, and I have stories!)

  8. Why?

  9. 400 Crashes, 106 Distinct Security Bugs
    in Adobe Flash Player
    https://security.googleblog.com/2011/08/fuzzing-at-scale.html

  10. 325 C Compiler bugs
    in GCC, Clang, & Others
    https://www.flux.utah.edu/paper/yang-pldi11

  11. “…our most prolific bug-finding tool.”
    Robert Guo
    http://queue.acm.org/detail.cfm?ref=rss&id=3059007

  12. DARPA Cyber Grand Challenge
    Mayhem, Xandra, Mechanical Phish, Galactica
    http://blogs.grammatech.com/the-cyber-grand-challenge

  13. What is it?
    https://commons.wikimedia.org/wiki/File:Rabbit_american_fuzzy_lop_buck_white.jpg

  14. Prevent Regressions
    Bug Discovery
    Help with Code Design
    Meets Specifications

  19. Prevent Regressions
    Bug Discovery
    Help with Code Design
    Meets Specifications
    Integration testing
    Unit testing
    Formal verification
    Exploratory testing

  20. Prevent Regressions
    Bug Discovery
    Help with Code Design
    Meets Specifications
    Fuzzing
    Integration testing
    Unit testing
    Formal verification
    Exploratory testing

  21. How Many Cases Should We Test?
    One: Unit Testing
    Only the Most Interesting: Fuzzing
    Every Possible Case: Formal Verification

  22. Property-Based Random Testing
    QuickCheck
    Chaos Monkey
    Fuzzing
    Scientist

  24. Corpus
    (Collection of Examples)

  25. System Under Test

  26. Property

  27. Magic!

  28. Corpus
    A few handwritten examples
    Fuzzing databases
    Harvest from test suites, defect reports
    Harvest from public Internet

  29. System Under Test
    A function
    Entire application
    Part of OS kernel

  30. Properties
    Does it crash?
    Does it hang?
    Is the output “valid”?
    Does execution trip an address or memory sanitizer?
    Does the output match some other system?
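
The first two properties ("Does it crash?", "Does it hang?") can be checked from outside the system under test. The following is a minimal sketch in Python, not something from the talk: the function name `check_properties`, the choice of feeding the test case on standard input, and the default timeout are all assumptions made for illustration. It relies on the POSIX convention that a process killed by a signal reports a negative return code.

```python
import subprocess
import sys


def check_properties(command, test_case, timeout_seconds=5.0):
    """Run `command` with `test_case` on stdin and classify the outcome.

    Hypothetical helper: returns "hang", "crash", or "ok".
    """
    try:
        result = subprocess.run(
            command,
            input=test_case,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return "hang"
    # On POSIX a negative return code means the child was killed by a signal
    # (SIGSEGV, SIGABRT, ...), the usual meaning of "crash" for native code.
    if result.returncode < 0:
        return "crash"
    return "ok"


if __name__ == "__main__":
    # A stand-in target that simply reads its input and exits normally.
    target = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
    print(check_properties(target, b'{ "a": "bc" }'))
```

Sanitizer and output-validity checks need cooperation from the target (a special build, or a validator for its output), so they are left out of this sketch.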

  31. Magic
    Mutation of corpus
    Coverage guidance
    Lots of test runs

  32. Possible Inputs
    Random Inputs
    Interesting Inputs
    Random Inputs with Profile Guidance

  33. Getting Started with afl
    - Compile system under test with instrumentation
    - Place corpus input(s) in a folder
    - Invoke afl
    - Wait for bugs
    https://fuzzing-project.org/tutorial3.html

  34. Compile with Instrumentation
    $ ./configure CC="afl-gcc" \
        CXX="afl-g++" \
        --disable-shared; \
      make

  35. Place Corpus in Folder
    $ mkdir in
    $ cd in
    $ cat > foo.json
    { "a": "bc" }
    ^D
    $ cd ..

  36. Invoke afl
    $ afl-fuzz -i in -o out \
        my_json_parser @@
    -i in: folder containing corpus
    -o out: findings go here
    my_json_parser: system under test
    “@@” means “the current test case”

  39. some_project
    ├── in
    └─┬ out
      ├── queue
      ├── crashes
      └── hangs

  40. afl In a Nutshell

  42. afl Fuzz Strategies
    Walking bit flips
    (try flipping each bit in input individually)
    Walking byte flips
    (try flipping each contiguous set of 8 bits)
    Simple arithmetic
    (increment or decrement bytes in the file by certain small values)
    Known integers
    (replace bytes with “problematic” 8, 16, and 32 bit integers like 0 and FF)
    Profile-guided stacked tweaks and test case splicing
    (magic!)
    https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html
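
Two of the deterministic strategies listed above, simple arithmetic and known integers, are small enough to sketch directly. This is an illustration in Python and not afl's own code: the function names, the delta range `ARITH_MAX`, and the short `INTERESTING_8` table are assumptions (afl's real tables are larger and also cover 16 and 32 bit values).

```python
ARITH_MAX = 4                                   # assumed small delta range
INTERESTING_8 = [0x00, 0x01, 0x7F, 0x80, 0xFF]  # assumed "problematic" byte values


def simple_arithmetic(data, max_delta=ARITH_MAX):
    """Yield copies of `data` with one byte incremented or decremented (mod 256)."""
    for index in range(len(data)):
        for delta in range(1, max_delta + 1):
            for signed_delta in (delta, -delta):
                mutated = bytearray(data)
                mutated[index] = (mutated[index] + signed_delta) % 256
                yield bytes(mutated)


def known_integers(data, values=INTERESTING_8):
    """Yield copies of `data` with one byte replaced by a known troublesome value."""
    for index in range(len(data)):
        for value in values:
            if data[index] != value:  # skip mutations that would change nothing
                mutated = bytearray(data)
                mutated[index] = value
                yield bytes(mutated)


if __name__ == "__main__":
    for candidate in simple_arithmetic(b"\x00", max_delta=1):
        print(candidate.hex())
```

Each generator produces one candidate test case per mutation, so the fuzzer can run the system under test on every candidate and keep the ones that look interesting.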

  43. Walking Bit Flip
    Original       01010101
    Flip bit 0     01010100
    Flip bit 1     01010111
    Flip bit 2     01010001

    Walking 2 Bit Flip
    Original       01010101
    Flip bits 0,1  01010110
    Flip bits 2,1  01010011
    Flip bits 3,2  01011001
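
A walking bit flip is an XOR with a mask that slides along the input one position at a time. The sketch below is a Python illustration, not the talk's F# implementation; `walking_bit_flips` is a hypothetical name, and it numbers bits the way this slide does, with bit 0 as the least significant bit of the last byte.

```python
def walking_bit_flips(data, flip_bits=1):
    """Yield copies of `data` with `flip_bits` adjacent bits inverted, one position at a time."""
    total_bits = len(data) * 8
    as_int = int.from_bytes(data, "big")
    mask = (1 << flip_bits) - 1          # e.g. 0b1 for one bit, 0b11 for two
    for low_bit in range(total_bits - flip_bits + 1):
        flipped = as_int ^ (mask << low_bit)
        yield flipped.to_bytes(len(data), "big")


if __name__ == "__main__":
    original = bytes([0b01010101])
    for candidate in walking_bit_flips(original, flip_bits=1):
        print(format(candidate[0], "08b"))
```

Treating the whole input as one integer keeps the sketch short and lets a mask straddle a byte boundary without special cases; a production fuzzer would mutate the buffer in place instead.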

  44. Trace Execution Paths
    A
    B D
    C
    E
    F

  46. Trace Execution Paths
    A
    B D
    C
    E
    F
    ?
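
One way to tell execution paths apart is the edge-counting scheme described in afl's technical details: every block gets a random ID, and each transition bumps a counter at `current ^ previous`, after which `previous` becomes `current >> 1`. The Python sketch below illustrates that idea under assumptions: the block IDs are made-up values, the edges between blocks A to F are guessed because the slide's arrows are not in the transcript, and the counters saturate at 255 for simplicity.

```python
MAP_SIZE = 1 << 16  # 64 kB map, the size afl's documentation describes


class CoverageMap:
    """Records which (previous block, current block) transitions were executed."""

    def __init__(self):
        self.hits = bytearray(MAP_SIZE)
        self.previous = 0

    def trace(self, location_id):
        edge = (location_id ^ self.previous) % MAP_SIZE
        self.hits[edge] = min(self.hits[edge] + 1, 255)  # saturating counter
        # Shifting makes A->B differ from B->A and keeps A->A from hashing to 0.
        self.previous = location_id >> 1

    def edges(self):
        return frozenset(i for i, count in enumerate(self.hits) if count)


def run_path(block_ids, path):
    """Simulate executing the blocks named in `path` and return the edges seen."""
    coverage = CoverageMap()
    for block in path:
        coverage.trace(block_ids[block])
    return coverage.edges()


if __name__ == "__main__":
    # Hypothetical IDs standing in for the compile-time random numbers.
    ids = {"A": 29875, "B": 23831, "C": 4242, "D": 51966, "E": 7, "F": 65000}
    known = run_path(ids, "ABCE")
    candidate = run_path(ids, "ADCE")
    print("new edges found:", len(candidate - known))
```

A test case whose edge set contains anything not seen before is worth keeping in the queue, which is the "guidance" part of coverage-guided fuzzing.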

  47. When Should I Use It?
    When is it useful?
    https://it.wikipedia.org/wiki/File:Coniglio_ariete.JPG

  48.               Unit Tests                       Fuzzing
    Useful For      Preventing Regressions, Design   Finding New Bugs
    Tests           Functions                        Any Level
    Test Examples   Hand-selected values             Corpus + Mutation
    Execution Time  Milliseconds                     Weeks
    Magic?          No                               Yes

  49. Instrumentation
    for profile guidance

  50. $ ./configure CC="afl-gcc" \
        CXX="afl-g++" \
        --disable-shared; \
      make

  51. $ pip install python-afl
    ———
    afl.init()
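
For a Python target, a harness built around `afl.init()` might look like the sketch below. It is an assumption-laden example rather than the speaker's code: the target (Python's built-in `json` module), the function name `parse_test_case`, and the guarded import are all choices made here so that the parsing function can still be imported and tested on a machine without python-afl.

```python
import json
import sys

try:
    import afl  # provided by the python-afl package
except ImportError:
    afl = None  # python-afl not installed; parse_test_case can still be used directly


def parse_test_case(data):
    """System under test: True if `data` parses as JSON, False if it is rejected."""
    try:
        json.loads(data)
    except ValueError:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueErrors:
        # a clean rejection, not a bug.
        return False
    return True


def main():
    afl.init()  # code before this call runs once; afl measures what follows
    parse_test_case(sys.stdin.buffer.read())


if __name__ == "__main__" and afl is not None:
    main()
```

With python-afl installed, a script like this is normally launched through its `py-afl-fuzz` wrapper, for example `py-afl-fuzz -i in -o out -- python harness.py` (the file name `harness.py` is hypothetical). Any exception other than `ValueError` escapes the harness and is reported as a crash.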

  52. Properties
    a.k.a. Specifications
    https://lorinhochstein.wordpress.com/2014/06/04/crossing-the-river-with-tla/

  55. STJSON
    A JSON Parser in Swift 3 compliant with RFC 7159
    STJSON was written along with the article Parsing JSON is a Minefield.
    Basic usage:
    var p = STJSONParser(data: data)
    do {
        let o = try p.parse()
    } catch let e {
        print(e)
    }
    Instantiation with options:
    var p = STJSON(data:data,
                   maxParserDepth:1024,
                   options:[.useUnicodeReplacementCharacter])
    https://github.com/nst/STJSON
    https://github.com/CraigStuntz/Fizil/tree/master/StJson

  56. Can You Fuzz the
    Domain?

  57. Dumb Fuzzer
    Mangling Byte Arrays

  58. public byte[] ResizePng(
    byte[] image)
    {

  59. public boolean SomeFunction(
    SomeEnum firstArg,
    int secondArg)
    {

  60. Smart Fuzzing
    MongoDB Expression Grammar
    http://queue.acm.org/detail.cfm?ref=rss&id=3059007

  61. public Assembly Compile(
    AbstractNode syntaxTree)
    {

  62. Worth the Wait?
    Results Can Take Weeks

  63. Isn’t it just for Security?
    https://www.flickr.com/photos/wocintechchat/25721078480/

  64. How do
    I get
    started?

  65. How to Get Started with Fuzzing
    1. Find a program to test
    2. Find a fuzzer
    3. Find a corpus
    4. Choose a property
    5. Let it run!

  66. Fuzzers
    https://commons.wikimedia.org/wiki/File:Holland_Lop_with_Broken_Orange_Coloring.jpg

  67. libFuzzer
    Fuzz testing for LLVM compilers
    http://llvm.org/docs/LibFuzzer.html

  68. $ cargo install cargo-fuzz
    $ cargo fuzz init

  69. $ gzip -c /bin/bash > sample.gz
    $ while true
      do
        radamsa sample.gz > fuzzed.gz     ← Fuzz the corpus
        gzip -dc fuzzed.gz > /dev/null    ← Execute S.O.T.
        test $? -gt 127 && break          ← Check a property
      done                                ← Repeat a lot!
    https://github.com/aoh/radamsa
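
The same four steps (fuzz the corpus, execute the system under test, check a property, repeat) can be written as an in-process loop. The Python sketch below is an analogue of the shell loop, not a port of radamsa: `mutate` is a deliberately crude stand-in that overwrites one random byte, the target is Python's own `gzip.decompress`, and the property is "corrupt input is rejected with one of the expected exception types" instead of an exit code above 127. All names are hypothetical.

```python
import gzip
import random
import zlib

# Exceptions that mean "input rejected cleanly" (gzip.BadGzipFile is an OSError).
EXPECTED_ERRORS = (OSError, EOFError, zlib.error)


def mutate(data, rng):
    """Crude stand-in for radamsa: overwrite one randomly chosen byte."""
    mutated = bytearray(data)
    mutated[rng.randrange(len(mutated))] = rng.randrange(256)
    return bytes(mutated)


def fuzz(sample, iterations, seed=0):
    """Return (input, exception) pairs where decompression failed unexpectedly."""
    rng = random.Random(seed)  # seeded so that a finding can be reproduced
    findings = []
    for _ in range(iterations):              # repeat a lot
        fuzzed = mutate(sample, rng)         # fuzz the corpus
        try:
            gzip.decompress(fuzzed)          # execute the system under test
        except EXPECTED_ERRORS:
            pass                             # property holds: clean rejection
        except Exception as error:           # property violated: record it
            findings.append((fuzzed, error))
    return findings


if __name__ == "__main__":
    sample = gzip.compress(b"High Speed Bug Discovery with Fuzzing" * 10)
    print(len(fuzz(sample, iterations=1000)), "unexpected failures")
```

Running in-process is much faster than spawning `gzip` each time, but it cannot observe a hard crash of the interpreter itself; the subprocess approach on the slide can.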

  70. ClusterFuzz
    Submit a fuzzer, win a bounty
    https://security.googleblog.com/2016/08/guided-in-process-fuzzing-of-chrome.html

  71. afl

  72. OSS-Fuzz
    Submit your project
    https://github.com/google/oss-fuzz

  73. burp, ZAP

  74. Corpus
    https://en.wikipedia.org/wiki/File:Long_Room_Interior,_Trinity_College_Dublin,_Ire

  75. We didn't call it fuzzing back in the 1950s, but it was
    our standard practice to test programs by inputting
    decks of punch cards taken from the trash.
    -Gerald M. Weinberg
    http://secretsofconsulting.blogspot.com/2017/02/fuzz-testing-and-fuzz-history.html

  77. Fuzzing SQLite with afl
    Start with a single test case:
    create table t1(one smallint);
    insert into t1 values(1);
    select * from t1;
    Add a list of reserved words from documentation
    Then extract SQL statements from SQLite unit tests
    (550 files at around 220 bytes each)
    https://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html

  78. Properties

  79. Don’t Crash
    or Hang

  80. Sanitizers
    and Canaries
    https://docs.google.com/presentation/d/19OSgb1N9Ezef39Blb-5lkzycq7-tMtAvy825FofyrmY

  81. Validators

  82. End to End
    Input → f(input) → f⁻¹(f(input)) → Output
    Same as Input?
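
An end-to-end property needs a function and its inverse; no hand-written expected outputs are required. As a small Python sketch (the pairing of `zlib.compress` with `zlib.decompress` and both function names are choices made for this example, not something from the talk):

```python
import random
import zlib


def round_trips(data):
    """End-to-end property: f_inverse(f(data)) == data."""
    return zlib.decompress(zlib.compress(data)) == data


def check_round_trip(trials=1000, max_length=64, seed=0):
    """Return the first random input that violates the property, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randrange(max_length + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        if not round_trips(data):
            return data
    return None


if __name__ == "__main__":
    print("counterexample:", check_round_trip())
```

The same shape works for any serializer and parser pair, an encoder and decoder, or an encrypt and decrypt routine.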

  83. Legacy Code
    https://commons.wikimedia.org/wiki/File:Blue-punch-card-front-horiz.png

  84. One of The Things
    …Is Not Like the Others!

  85. Should I
    write my
    own?

  86. Heck Yeah!
    https://github.com/CraigStuntz/Fizil

  87. Interesting Stuff I Learned While Writing a Fuzzer
    - F# bitwise operations
    - How to instrument .NET code
    - dnSpy is awesome
    - Same input -> Same code -> Different paths
    - Strong naming is painful
    - Unicode is also painful
    - MemoryMappedFile performance is straight-up awful

  88. let jsonNetResult =
        try
            JsonConvert.DeserializeObject(str) |> ignore
            Success
        with
        | :? JsonReaderException as jre -> jre.Message |> Error
        | :? JsonSerializationException as jse -> jse.Message |> Error
        | :? System.FormatException as fe ->
            if fe.Message.StartsWith("Invalid hex character") // hard coded in Json.NET
            then fe.Message |> Error
            else reraise()
    ← Test
    ⬑ Special case error stuff

  89. use proc = new Process()
    proc.StartInfo.FileName <- executablePath
    inputMethod.BeforeStart proc testCase.Data
    proc.StartInfo.UseShellExecute <- false
    proc.StartInfo.RedirectStandardOutput <- true
    proc.StartInfo.RedirectStandardError <- true
    proc.StartInfo.EnvironmentVariables.Add(SharedMemory.environmentVariableName, sharedMemoryName)
    let output = new System.Text.StringBuilder()
    let err = new System.Text.StringBuilder()
    proc.OutputDataReceived.Add(fun args -> output.Append(args.Data) |> ignore)
    proc.ErrorDataReceived.Add (fun args -> err.Append(args.Data) |> ignore)
    proc.Start() |> ignore
    inputMethod.AfterStart proc testCase.Data
    proc.BeginOutputReadLine()
    proc.BeginErrorReadLine()
    proc.WaitForExit()
    let exitCode = proc.ExitCode
    let crashed = exitCode = WinApi.ClrUnhandledExceptionCode
    ← Set up
    ← Read results
    ← Important bit

  90. /// An ordered list of functions to use when starting with a single piece of
    /// example data and producing new examples to try
    let private allStrategies = [
        bitFlip 1
        bitFlip 2
        bitFlip 4
        byteFlip 1
        byteFlip 2
        byteFlip 4
        arith8
        arith16
        arith32
        interest8
        interest16
    ]

  91. let totalBits = bytes.Length * 8
    let testCases = seq {
        for bit = 0 to totalBits - flipBits do
            let newBytes = Array.copy bytes
            let firstByte = bit / 8
            let firstByteMask, secondByteMask = bitMasks(bit, flipBits)
            let newFirstByte = bytes.[firstByte] ^^^ firstByteMask
            newBytes.[firstByte] <- newFirstByte
            let secondByte = firstByte + 1
            if secondByteMask <> 0uy && secondByte < bytes.Length
            then
                let newSecondByte = bytes.[secondByte] ^^^ secondByteMask
                newBytes.[secondByte] <- newSecondByte
            yield newBytes
    }
    Fuzz one byte →
    ^^^ means xor

  94. private static void F(string arg)
    {
        Console.WriteLine("f");
        Console.Error.WriteLine("Error!");
        Environment.Exit(1);
    }

  95. private static void F(string arg)
    {
        instrument.Trace(29875);    ← Random number
        Console.WriteLine("f");
        Console.Error.WriteLine("Error!");
        Environment.Exit(1);
    }

  96. private static void F(string arg)
    {
    #if MANUAL_INSTRUMENTATION
        instrument.Trace(29875);
    #endif
        Console.WriteLine("f");
        Console.Error.WriteLine("Error!");
        Environment.Exit(1);
    }

  97. let stringify (ob: obj) : string =
        JsonConvert.SerializeObject(ob)

  98. let stringify (ob: obj) : string =
        JsonConvert.SerializeObject(ob)
    // Method: System.String\u0020Program::stringify(System.Object)
    .body stringify {
        arg_02_0 [generated]
        arg_07_0 [generated]
        nop()
        arg_02_0 = ldloc(ob)
        arg_07_0 = call(JsonConvert::SerializeObject, arg_02_0)
        ret(arg_07_0)
    }

  99. let stringify (ob: obj) : string =
        JsonConvert.SerializeObject(ob)
    // Method: System.String\u0020Program::stringify(System.Object)
    .body stringify {
        arg_02_0 [generated]
        arg_07_0 [generated]
        nop()
        arg_02_0 = ldloc(ob)
        arg_07_0 = call(JsonConvert::SerializeObject, arg_02_0)
        ret(arg_07_0)
    }
    // Method: System.String\u0020Program::stringify(System.Object)
    .body stringify {
        arg_05_0 [generated]
        arg_0C_0 [generated]
        arg_11_0 [generated]
        arg_05_0 = ldc.i4(23831)
        call(Instrument::Trace, arg_05_0)
        nop()
        arg_0C_0 = ldloc(ob)
        arg_11_0 = call(JsonConvert::SerializeObject, arg_0C_0)
        ret(arg_11_0)
    }

  101. let private insertTraceInstruction(ilProcessor: ILProcessor, before: Instruction, state) =
        let compileTimeRandom = state.Random.Next(0, UInt16.MaxValue |> Convert.ToInt32)
        let ldArg = ilProcessor.Create(OpCodes.Ldc_I4, compileTimeRandom)
        let callTrace = ilProcessor.Create(OpCodes.Call, state.Trace)
        ilProcessor.InsertBefore(before, ldArg)
        ilProcessor.InsertAfter (ldArg, callTrace)
    This margin is too narrow to contain a try/finally example, so see:
    https://goo.gl/W4y7JH
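
The same idea, inserting "load a random constant, call the trace function" at the start of a method, can be tried without an IL rewriter. The sketch below does it to Python source with the standard `ast` module. It is an analogue for illustration only and not how Fizil works: `TraceInserter`, `instrument`, and the `__trace__` hook name are hypothetical.

```python
import ast
import random


class TraceInserter(ast.NodeTransformer):
    """Insert `__trace__(<random id>)` as the first statement of every function."""

    def __init__(self, seed=0):
        self.random = random.Random(seed)

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # instrument nested functions as well
        # The ID is chosen once, at instrumentation time, like compileTimeRandom.
        location_id = self.random.randrange(0, 0xFFFF)
        call = ast.Expr(
            value=ast.Call(
                func=ast.Name(id="__trace__", ctx=ast.Load()),
                args=[ast.Constant(value=location_id)],
                keywords=[],
            )
        )
        node.body.insert(0, call)
        return node


def instrument(source, trace):
    """Compile `source` with trace calls inserted and return its namespace."""
    tree = TraceInserter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)  # new nodes need line numbers to compile
    namespace = {"__trace__": trace}
    exec(compile(tree, "<instrumented>", "exec"), namespace)
    return namespace


if __name__ == "__main__":
    seen = []
    module = instrument("def f(x):\n    return x + 1\n", seen.append)
    print(module["f"](1), seen)
```

Because the trace call becomes the first statement, a docstring in the original function stops being a docstring; a more careful version would insert after it.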

  106. http://www.json.org/

  107. https://tools.ietf.org/html/rfc4627

  108. http://www.ecma-international.org/ecma-262/5.1/#sec-15.12

  109. http://www.ecma-international.org/publications/standards/Ecma-404.htm

  110. https://tools.ietf.org/html/rfc7158

  111. https://tools.ietf.org/html/rfc7159

  112. https://github.com/nst/STJSON

  113. https://github.com/CraigStuntz/Fizil/blob/master/StJson/StJsonParser.fs

  114. { "a" : "bc" }

  117. Standard Accepts, Json.NET Rejects
    Value:
    88888888888888888888888888888888888888888888888888
    88888888888888888888888888888888888888888888888888
    88888888888888888888888888888888888888888888888888
    88888888888888888888888888888888888888888888888888
    88888888888888888888888888888888888888888888888888
    Standard Says: No limit
    Json.NET: MaximumJavascriptIntegerCharacterLength = 380;

  118. Standard Rejects, Json.NET Accepts
    Value: [,,,]
    Standard Says:
    A JSON value MUST be an object, array, number, or string, or one of
    the following three literal names:
    false null true
    Json.NET: [null, null, null, null]
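
Disagreements like these are easy to probe for in any parser. As an illustration with a parser that is not part of the talk, the Python sketch below asks the standard library's `json` module about three inputs: a 400-digit integer (longer than the 380-character limit quoted for Json.NET), the `[,,,]` value from this slide, and `NaN`, which the RFC grammar does not permit. The helper name `accepts` is hypothetical.

```python
import json


def accepts(text):
    """Return True if Python's json module parses `text` without raising."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a ValueError
        return False
    return True


if __name__ == "__main__":
    print("400-digit integer:", accepts("8" * 400))
    print("[,,,]:", accepts("[,,,]"))
    print("NaN:", accepts("NaN"))
```

Python's parser takes the long integer, rejects `[,,,]`, and by default accepts `NaN`, so it has its own "standard rejects, parser accepts" case. Feeding the same fuzzed inputs to two parsers and comparing verdicts is how such differences surface.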

  120. let private removeStrongName (assemblyDefinition : AssemblyDefinition) =
        let name = assemblyDefinition.Name;
        name.HasPublicKey <- false;
        name.PublicKey <- Array.empty;
        assemblyDefinition.Modules |> Seq.iter (
            fun moduleDefinition ->
                moduleDefinition.Attributes <-
                    moduleDefinition.Attributes &&& ~~~ModuleAttributes.StrongNameSigned)
        let aptca = assemblyDefinition.CustomAttributes.FirstOrDefault(
            fun attr -> attr.AttributeType.FullName
                        = typeof<System.Security.AllowPartiallyTrustedCallersAttribute>.FullName)
        assemblyDefinition.CustomAttributes.Remove aptca |> ignore
        assembly.MainModule.AssemblyReferences
        |> Seq.filter (fun reference -> Set.contains reference.Name assembliesToInstrument)
        |> Seq.iter (fun reference ->
            reference.PublicKeyToken <- null
        )

  121. “If marked BeforeFieldInit then the type’s initializer
    method is executed at, or sometime before, first
    access to any static field defined for that type.”
    -ECMA-335, Common Language
    Infrastructure (CLI), Partition I

  122. Unicode
    Original JSON
    { "a" : "bc" }
    ASCII Bytes
    7B 20 22 61 22 20 3A 20 22 62 63 22 20 7D
    UTF-8 with Byte Order Mark
    EF BB BF 7B 20 22 61 22 20 3A 20 22 62 63 22 20 7D
    UTF-16 BE with BOM
    FE FF 00 7B 00 20 00 22 00 61 00 22 00 20 00 3A 00 20 00 22
    00 62 00 63 00 22 00 20 00 7D
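
The byte listings can be reproduced with a few lines of Python, which is a convenient way to build corpus files in each encoding. This is a small sketch; the helper name `hex_bytes` is made up, and the text used is the one the listed bytes actually encode (note the `20` before `3A`, a space before the colon).

```python
import codecs


def hex_bytes(data):
    """Format bytes as space-separated upper-case hex, like the listing above."""
    return data.hex(" ").upper()


text = '{ "a" : "bc" }'
ascii_bytes = text.encode("ascii")
utf8_with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
utf16_be_with_bom = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

if __name__ == "__main__":
    print(hex_bytes(ascii_bytes))
    print(hex_bytes(utf8_with_bom))
    print(hex_bytes(utf16_be_with_bom))
```

A byte-level mutator sees three quite different inputs here even though the JSON text is the same, which is one reason encodings make parser fuzzing painful.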

  124. Resources
    MongoDB’s JavaScript Fuzzer
    http://queue.acm.org/detail.cfm?ref=rss&id=3059007
    afl technical details
    http://lcamtuf.coredump.cx/afl/technical_details.txt
    afl Help Email List
    [email protected]
    Fizil
    https://github.com/CraigStuntz/Fizil
    WTF, ACM?

  125. Thank You!
    - Michał Zalewski, for afl documentation
    - Rehearsal audiences, employees of
    - Dynamit
    - Improving
    - Ineffable Solutions

  126. Craig Stuntz
    [email protected]
    www.craigstuntz.com
    @craigstuntz
    http://www.meetup.com/Papers-We-Love-Columbus/
