Compiler Fuzzing through Deep Learning (ISSTA'18)

Paper: https://chriscummins.cc/issta18

Random program generation (fuzzing) is an effective technique for discovering bugs in compilers, but successful fuzzers require extensive development effort for every language supported by the compiler and often leave parts of the language space untested.

We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03x less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL versus the state-of-the-art taking 9 man months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.

Chris Cummins

July 16, 2018

Transcript

  1. Compiler Fuzzing through Deep Learning
    https://chriscummins.cc/issta18

  2. Chris Cummins (University of Edinburgh)
    Pavlos Petoumenos (University of Edinburgh)
    Alastair Murray (Codeplay Software)
    Hugh Leather (University of Edinburgh)

  3. fuzzing a compiler
    circa [McKeeman98]
    [Diagram: generated programs (int main(int argc, char** argv) { ... }) are fed to clang6.0]
    Outcomes: a.out; error: use of undeclared …; Timeout; Crash

  4. fuzzing a compiler
    circa [McKeeman98]
    [Diagram: generated programs (int main(int argc, char** argv) { ... }) are fed to clang6.0]
    Outcomes: a.out; error: use of undeclared …; Timeout; Crash (Bug!)

  5. fuzzing a compiler
    circa [McKeeman98]
    [Diagram: generated programs (int main(int argc, char** argv) { ... }) are fed to clang6.0]
    Outcomes: a.out; error: use of undeclared …; Timeout (Bug??); Crash (Bug!)

  6. differential testing compilers
    circa [McKeeman98]
    [Diagram: the same generated programs (int main(int argc, char** argv) { ... }) are compiled with gcc5.5, clang6.0, and clang3.6, each producing an a.out]

  7. differential testing compilers
    circa [McKeeman98]
    [Diagram: the same generated programs (int main(int argc, char** argv) { ... }) are compiled with gcc5.5, clang6.0, and clang3.6, each producing an a.out]
    Running the three binaries:
    $ ./a.out
    42
    $ ./a.out
    42
    $ ./a.out
    -14522312

  8. differential testing compilers
    circa [McKeeman98]
    [Diagram: the same generated programs (int main(int argc, char** argv) { ... }) are compiled with gcc5.5, clang6.0, and clang3.6, each producing an a.out]
    Running the three binaries:
    $ ./a.out
    42
    $ ./a.out
    42
    $ ./a.out
    -14522312
    Majority rules
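
The "majority rules" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the harness used in the talk: the function name `majority_vote` and the `results` dictionary are hypothetical, and the slide does not say which compiler produced the odd output, so the assignment to clang3.6 below is arbitrary.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (majority_output, deviating_compilers).

    `outputs` maps a compiler name to the output of the binary it built.
    Returns (None, []) when no output is shared by a strict majority,
    because then there is no trustworthy reference to compare against.
    """
    if not outputs:
        return None, []
    winner, votes = Counter(outputs.values()).most_common(1)[0]
    if votes * 2 <= len(outputs):
        return None, []
    deviants = sorted(name for name, out in outputs.items() if out != winner)
    return winner, deviants


# Hypothetical results: which compiler deviates is chosen arbitrarily here.
results = {"gcc5.5": "42", "clang6.0": "42", "clang3.6": "-14522312"}
majority, suspects = majority_vote(results)
print(majority, suspects)
```

The same function works for build outcomes if the dictionary values are outcome labels (for example "a.out" versus an error message) rather than program output.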

  9. differential testing compilers
    circa [McKeeman98]
    [Diagram: the same generated programs (int main(int argc, char** argv) { ... }) are compiled with gcc5.5, clang6.0, and clang3.6]
    Also works for build failures

  10. differential testing compilers
    circa [McKeeman98]
    [Diagram: the same generated programs (int main(int argc, char** argv) { ... }) are compiled with gcc5.5, clang6.0, and clang3.6]
    Outcomes: error: use of undeclared …; error: use of undeclared …; a.out
    Also works for build failures

  11. differential testing compilers
    circa [McKeeman98]
    [Diagram: the same generated programs (int main(int argc, char** argv) { ... }) are compiled with gcc5.5, clang6.0, and clang3.6]
    Outcomes: error: use of undeclared …; error: use of undeclared …; a.out
    Also works for build failures
    Input programs: hard to generate!

  12. an ideal fuzzer
    1. Cheap
       Easy to implement and extend
       (Languages and features grow quickly)
    2. Interpretable Testcases
       Necessary for triage
       (i.e. 45 lines or less [Sun2016])
    3. Plausible Output
       Representative of handwritten code
       (So that bugs get fixed)

  13. state-of-the-art: CLSmith
    Random grammar enumeration.
    Extensive static analyses support subset of OpenCL features.
    Targets compiler middle ends.
    Incredibly effective! 100s of bugs to date.
    https://github.com/ChrisLidbury/CLSmith
    #include "CLSmith.h"
    struct S0 {
      int32_t g_4[4][10];
      ...
    };
    kernel void A(global ulong *r) {
      int i, j, k;
      struct S0 c_1856;
      struct S0* p_1855 = &c_1856;
      c_1856 = c_1857;
      func_1(p_1855);
      barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
      for (i = 0; i < 4; i++)
        for (j = 0; j < 10; j++)
          …>g_4[i][j], "p_1855->g_4[i][j]", print_hash_value);
      result[get_linear_global_id()] = crc64_context ^ 0xFFFFFFFFFFFFFFFFUL;
    }

  14. state-of-the-art: CLSmith
    https://github.com/ChrisLidbury/CLSmith
    1. Cheap ✖ nope!
       Years to develop! 50k lines of C++.
       Each PL feature engineered by hand.
    2. Interpretable Testcases ✖ nope!
       Avg. 1200 lines (excluding headers).
       Requires reduction: ~4 hours / test.
    3. Plausible Output ✖ nope!
       Unusual and restricted combinations of PL features.
       87 dials control “shape” of output - hand tuned.

  15. contributions
    Automatic inference of fuzzers from examples.
    102x less code than state-of-art.
    Similar bug finding power, simpler test cases.

  16. our approach
    [Pipeline diagram with stages: Training Corpus, Vocabulary Encoding, Neural Network, Synthesizer, Harness, Generated Test Cases, Testing Campaign, Voting Heuristics]

  17. our approach
    [Pipeline diagram, Training Corpus stage]
    Mined from
    • 1k repos
    • 10k files
    • 2.0M LOC
    Filtered by oracle compiler.

  19. vocabulary encoding
    1. Preprocess.
    2. Enforce code style.
    3. Rename variables.
    4. Rename functions.
    Before:
    #define MY_CONST 3.14
    // A very simple kernel.
    kernel void Foo(global float* input, const float x) {
    input[get_global_id(0)] *= MY_CONST + x;
    }
    After:
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }

  20. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Encoded:

  21. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    Encoded: 0

  22. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    Encoded: 0 1

  23. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    Encoded: 0 1 2

  24. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    Encoded: 0 1 2 1

  25. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    Encoded: 0 1 2 1 3

  26. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    ( 4
    Encoded: 0 1 2 1 3 4

  27. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    ( 4
    global 5
    Encoded: 0 1 2 1 3 4 5

  28. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    ( 4
    global 5
    Encoded: 0 1 2 1 3 4 5 1

  29. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    ( 4
    global 5
    float 6
    Encoded: 0 1 2 1 3 4 5 1 6

  30. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    ( 4
    global 5
    float 6
    * 7
    Encoded: 0 1 2 1 3 4 5 1 6 7

  31. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    ( 4
    global 5
    float 6
    * 7
    Encoded: 0 1 2 1 3 4 5 1 6 7 1

  32. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    ( 4
    global 5
    float 6
    * 7
    a 8
    Encoded: 0 1 2 1 3 4 5 1 6 7 1 8

  33. vocabulary encoding
    kernel void A(global float* a, const float b) {
    a[get_global_id(0)] *= 3.14 + b;
    }
    Vocab:
    Token Index
    kernel 0
    [space] 1
    void 2
    A 3
    ( 4
    global 5
    float 6
    * 7
    a 8
    , 9
    const 10
    b 11
    ) 12
    { 13
    \n 14
    [ 15
    get_global_id 16
    0 17
    ] 18
    = 19
    3 20
    . 21
    1 22
    4 23
    + 24
    ; 25
    Encoded: 0 1 2 1 3 4 5 1 6 7 1 8 ...
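
The table built up over the preceding slides can be reproduced with a small greedy tokenizer. The sketch below is an assumption-laden illustration rather than the actual DeepSmith encoder: the list `MULTI_CHAR_TOKENS`, the function name `encode`, and the two-space indent in `KERNEL` are all chosen for this example. It matches known multi-character tokens first, falls back to single characters, and assigns indices in order of first appearance.

```python
# Multi-character tokens assumed for this example; everything else is
# encoded one character at a time.
MULTI_CHAR_TOKENS = ["kernel", "void", "global", "float", "const", "get_global_id"]


def encode(source, multi_char_tokens=MULTI_CHAR_TOKENS):
    """Greedily tokenize `source` and return (vocab, encoded)."""
    # Try longer tokens first so that "get_global_id" is matched whole.
    candidates = sorted(multi_char_tokens, key=len, reverse=True)
    vocab, encoded = {}, []
    pos = 0
    while pos < len(source):
        token = next(
            (t for t in candidates if source.startswith(t, pos)), source[pos]
        )
        # A new token receives the next free index; a known token keeps its own.
        encoded.append(vocab.setdefault(token, len(vocab)))
        pos += len(token)
    return vocab, encoded


KERNEL = (
    "kernel void A(global float* a, const float b) {\n"
    "  a[get_global_id(0)] *= 3.14 + b;\n"
    "}"
)
vocab, encoded = encode(KERNEL)
print(encoded[:12])
```

Because indices are assigned on first appearance, decoding is just a lookup in the inverted vocabulary, which is what the synthesizer needs later to turn predicted indices back into text.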

  36. neural network
    Input: 30M token corpus (0 1 2 ...)
    [Diagram: 181 tokens → 512x2 LSTM → 181 tokens]
    Learns probability distribution over corpus.
    < 500 lines of code, 12 hours training on GPU.

  39. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: 0 1 2 1
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void

  41. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: 0 1 2 1
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A

  42. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: 0 1 2 1 3
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A

  44. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: 0 1 2 1 3
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(

  45. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: 0 1 2 1 3 4
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(

  47. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: 0 1 2 1 3 4
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(global

  48. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: … 1 2 1 3 4 5
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(global

  51. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: … 1 2 1 3 4 5
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(global

  52. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: … 1 2 1 3 4 5 1
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(global

  54. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: … 1 2 1 3 4 5 1
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(global int

  55. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: … 1 2 1 3 4 5 1 …
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(global int

  56. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    Input: … 1 2 1 3 4 5 1 …
    Output: [Bar chart: probability (0 to 1) of each next token: kernel, ' ', void, A, (, global, int, double, float]
    Decoded: kernel void A(global int* a) {
    /* snip */ }
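
Steps 1 and 2 amount to a sampling loop with a bracket counter. The following is a minimal sketch under stated assumptions: `sample_program` and `toy_model` are hypothetical names, and `toy_model` is a hand-written stand-in for the trained LSTM that returns multi-character chunks rather than single vocabulary tokens. Only the control flow (sample from the predicted distribution, append, stop when braces balance) is meant to mirror the slides.

```python
import random


def sample_program(next_token_probs, seed_text, rng, max_tokens=1000):
    """Sample tokens after `seed_text` until the { } brackets balance.

    `next_token_probs(text)` stands in for the model: it returns a dict
    mapping each candidate next token to its probability.
    Returns the program text, or None if `max_tokens` is reached first.
    """
    text = seed_text
    depth = seed_text.count("{") - seed_text.count("}")
    opened = "{" in seed_text
    for _ in range(max_tokens):
        probs = next_token_probs(text)
        tokens = list(probs)
        weights = [probs[t] for t in tokens]
        # Draw one token at random, weighted by its predicted probability.
        token = rng.choices(tokens, weights=weights)[0]
        text += token
        depth += token.count("{") - token.count("}")
        opened = opened or "{" in token
        # Stop once a block has been opened and every brace is closed again.
        if opened and depth == 0:
            return text
    return None


def toy_model(text):
    """Hypothetical stand-in for the trained network."""
    if "{" not in text:
        return {"global int* a) {": 1.0}
    return {" a[0] += 1;": 0.6, " if (a[0]) {": 0.1, " }": 0.3}


program = sample_program(toy_model, "kernel void A(", random.Random(0))
print(program)
```

The `max_tokens` cap is a design choice for the sketch: a sampled sequence is not guaranteed to close its braces, so the loop needs some bound.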

  57. synthesizer + harness
    1. Seed the model with the start of a program.
    2. Predict tokens until { } brackets balance.
    3. Can we parse signature?
    Yes: Generate input data, compile and run it.
    No: Compile it but don’t run it.
    Decoded: kernel void A(global int* a) {
    /* snip */ }
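
Step 3 can be illustrated with a small signature parser. This is a sketch under assumptions, not the harness from the talk: the regular expressions cover only simple argument forms (optional address space, optional `const`, a type, an optional `*`, a name), the names `parse_signature` and `harness_plan` are hypothetical, and the generated input values are arbitrary.

```python
import re

# Simplified argument grammar: [address space] [const] type (* | space) name
ARG = re.compile(
    r"^(?:(?:__)?(?:global|local|constant)\s+)?(?:const\s+)?"
    r"(\w+)(\s*\*\s*|\s+)(\w+)$"
)


def parse_signature(source):
    """Return [(type, is_pointer, name), ...] or None if unparseable."""
    match = re.search(r"kernel\s+void\s+\w+\s*\(([^)]*)\)", source)
    if match is None:
        return None
    args = []
    for raw in (part.strip() for part in match.group(1).split(",")):
        if not raw:
            continue
        arg = ARG.match(raw)
        if arg is None:
            return None
        type_name, separator, name = arg.groups()
        args.append((type_name, "*" in separator, name))
    return args


def harness_plan(source, buffer_size=4):
    """Decide how to exercise a test case, following step 3 of the slide."""
    args = parse_signature(source)
    if args is None:
        # Signature not understood: still a useful compile-only test.
        return "compile only", None
    # Pointer arguments get a small buffer, scalars get a single value.
    inputs = {
        name: list(range(buffer_size)) if is_pointer else buffer_size
        for _, is_pointer, name in args
    }
    return "compile and run", inputs


print(harness_plan("kernel void A(global int* a) {\n/* snip */ }"))
```

Falling back to "compile only" keeps test cases that the parser cannot handle in play, which matters because the slides report many bugs in parsing and error handling.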

  60. our approach
    [Pipeline diagram, Voting Heuristics stage]
    Standard majority voting.
    False-positive filtering of runtime behavior:
    combination of off-the-shelf tools and ad-hoc filters.
    Took ~1 dev-day to develop.

  62. how well does it work?

  63. testing campaign
    10 OpenCL compilers
    3 GPUs, 5 CPUs, Xeon Phi, Emulator
    Test with optimizations on / off
    Treat as separate testbeds
    48 hours per testbed
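
As a quick sanity check on the campaign size, the numbers on this slide can be multiplied out. The sketch assumes that each of the 10 compilers was tested in both optimization settings for the full 48 hours; the compiler names are placeholders.

```python
from itertools import product

# Placeholder names for the 10 OpenCL compilers on the slide.
compilers = [f"compiler_{i}" for i in range(1, 11)]
optimizations = ["on", "off"]

# Each (compiler, optimization setting) pair counts as a separate testbed.
testbeds = list(product(compilers, optimizations))
hours_per_testbed = 48
total_hours = len(testbeds) * hours_per_testbed

print(len(testbeds), total_hours)  # 20 960
```

Under that assumption the total comes to 960 hours, in line with the roughly 1,000 hours of automated testing quoted in the abstract.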

  64. results overview
    Errors in every compiler!
    [Bar chart: Num results (log scale, 1 to 10,000) by category: Compiler crash, Build Timeout, Build Failure, Program Crash, Wrong Output. Bar labels: 69, 252, 51, 860, 7,040]

  65. 67 bug reports to date…
    … crashes during parsing / compilation
    void A() {void* a; uint4 b=0; b=(b>b)?a:a }
    Affects: Intel OpenCL SDK 1.2.0.25
    kernel void A(global int* a) {
    int b = get_global_id(0);
    a[b] = (6 * 32) + 4 * (32 / 32) + a;
    }
    Affects: Beignet 1.3
    “Bad code” finds bugs in error handling

  66. 67 bug reports to date…
    … crashes during type checking
    kernel void A() {
    __builtin_astype(d, uint4);
    }
    Affects: 6 / 10 compilers we tested
    Unexpected outcome: Learning from handwritten code leads to bugs found in compiler builtins!

  67. 67 bug reports to date…
    … errors in optimizers
    kernel void A(global double* a, global double* b,
    global double* c, int d, int e) {
    double f;
    int g = get_global_id(0);
    if (g < e - d - 1)
    c[g] = (((e) / d) % 5) % (e + d);
    }
    Affects: Intel OpenCL SDK 1.2.0.25
    CLSmith doesn’t allow thread-dependent control flow.

  68. try it for yourself!
    runs in docker
    $ docker run chriscummins/opencl_fuzz
    code and paper on GitHub
    https://chriscummins.cc/issta18

  69. read the paper for …
    comparison to CLSmith
    LLVM robustness experiments
    extending to a second language
    + more!
    code and paper on GitHub
    https://chriscummins.cc/issta18

  70. Compiler Fuzzing through Deep Learning
    https://chriscummins.cc/issta18
    Problem: Compiler inputs are hard to generate
    Use DL to infer PL generator from examples
    102x less code than state-of-art, 3.03x faster
    Lots of bugs!
