Machine Learning & Compilers

Chris Cummins
September 09, 2016

Predictive modeling using machine learning is an effective method for building compiler heuristics, but there is a shortage of benchmarks. Typical machine learning experiments outside of the compilation field train over thousands or millions of examples. In machine learning for compilers, however, there are typically only a few dozen common benchmarks available. This limits the quality of learned models, as they have very sparse training data for what are often high-dimensional feature spaces.

In this talk I present CLgen, a tool for generating benchmarks for predictive modeling.

Transcript

  1. Machine
    Learning
    & Compilers

  2. Optimisation heuristics
    are too coarse.
    (a.k.a. it doesn’t take much to do better than -O3)

  3. Better than -O3
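    #!/bin/sh
    # Compile with 20 randomly chosen flags on top of -O1, then time the result.
    # The binary is only timed if the compile succeeded.
    while true; do
      sort --random-sort < "cflags.txt" | head -n 20 | xargs gcc -O1 app.c && time ./a.out
    done
    After 200 attempts, ~5% improvement.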
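
The same random search can be sketched in Python. This is an illustrative version, not the script from the talk: `random_flag_search` is a name chosen here, and `measure` is a hypothetical callback that would compile `app.c` with the given flags and return its runtime.

```python
import random


def random_flag_search(flags, measure, attempts=200, k=20, seed=0):
    """Keep the fastest of `attempts` random k-flag subsets.

    `measure` is a hypothetical callback: it takes a list of flags and
    returns a runtime in seconds (for example by compiling and timing).
    """
    rng = random.Random(seed)
    best_time, best_flags = float("inf"), None
    for _ in range(attempts):
        # Draw k distinct flags, like `sort --random-sort | head -n k`.
        candidate = rng.sample(flags, min(k, len(flags)))
        elapsed = measure(candidate)
        if elapsed < best_time:
            best_time, best_flags = elapsed, candidate
    return best_time, best_flags
```

Passing the timing step in as a callback keeps the search loop separate from how a build is actually run and measured.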

  4. *actual homework

  5. 200 attempts sounds like a lot…
    (200 x 10 x 30 s ≈ 16 hrs)

  6. Drop in the ocean
    250 GCC flags ≥ 2^250 options ≈ 10^75
    Atoms in the universe ≈ 10^80
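
The size of that search space can be checked with a few lines of Python. This is a small sketch that assumes each of the 250 flags is a simple on/off switch; the function name is chosen here for illustration.

```python
import math


def log10_search_space(num_binary_flags):
    """Return log10 of 2**num_binary_flags, the number of on/off combinations."""
    return num_binary_flags * math.log10(2)


exponent = log10_search_space(250)
print(f"2^250 is about 10^{exponent:.1f}")
```

Flags that take more than two values only make the space larger, which is why the slide states a lower bound.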

  7. Exhaustive search is
    not really* practical.
    *in this universe

  8. Machine Learning
    Estimate y = f(x)

  10. Machine Learning
    Estimate y = f(x)
    Optimisations: Cflags, Workgroup size, CPU or GPU
    Features: # instructions, Arithmetic density, Dataset size
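
As a rough illustration of estimating y = f(x), the sketch below predicts a device from two features with a 1-nearest-neighbour rule. The model choice, the function name and the training examples are all assumptions made up for this example; they are not data or methods from the talk.

```python
import math


def predict_device(training_set, features):
    """1-nearest-neighbour: return the label of the closest training example."""
    _, label = min(training_set, key=lambda example: math.dist(example[0], features))
    return label


# Made-up, pre-normalised examples: (arithmetic density, dataset size) -> device.
toy_training_set = [
    ((0.1, 0.2), "CPU"),
    ((0.2, 0.1), "CPU"),
    ((0.8, 0.9), "GPU"),
    ((0.9, 0.7), "GPU"),
]

print(predict_device(toy_training_set, (0.85, 0.8)))
```

With only a handful of training points, large parts of the feature space are decided by whichever example happens to be nearest, which is the sparsity problem the talk goes on to describe.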

  11. Machine Learning
    Estimate y = f(x)
    The idea.
    [Plot labels: Use a GPU, Use a CPU]

  12. Machine Learning
    Estimate y = f(x)
    The reality.
    (same data, wrong conclusions)
    [Plot labels: Use a GPU, Use a CPU]

  13. Synthesizing Benchmarks
    for Predictive Modeling

  14. Why?

    There aren’t enough benchmarks

  16. Why?

    More benchmarks = better models

  18. Why?

    No adequate solution

  20. How?
    Teach an AI to program from GitHub

  21. Implementation

  22. [Diagram: the CLgen and CLdrive pipeline. Labelled components: GitHub software repositories, search engine, content files, rejection filter, source normalizer, language corpus, training parameters, LSTM network, synthesis parameters, synthesizer, rejection filter, synthesized benchmarks, argument extractor, benchmark parameters, benchmark driver, synthesized payloads, dynamic checker, performance results.]

  24. We teach an AI to code by showing it lots of code.
    Huge repository of public knowledge: GitHub.
    And they have an API :-)
    $ curl https://api.github.com/search/repositories\?q\=opencl\&sort\=stars\&order\=desc
    {
    "total_count": 3155,
    "incomplete_results": false,
    "items": [
    {
    "id": 7296244,
    "name": "lwjgl3",
    "full_name": "LWJGL/lwjgl3",
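
The same query can be built and its response read in Python. This is a minimal sketch: the endpoint and the `q`, `sort` and `order` parameters come from the curl command above, while the function names are chosen here and no request is actually sent.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(keyword, sort="stars", order="desc"):
    """Return the repository-search URL for one keyword."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": keyword, "sort": sort, "order": order})


def repository_names(response):
    """Pull the owner/name strings out of a parsed search response."""
    return [item["full_name"] for item in response.get("items", [])]


print(build_search_url("opencl"))
```

Fetching the URL and decoding the JSON is left out so the sketch runs offline; `repository_names` expects the decoded dictionary.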

  25. OpenCL is not a first-class language.
    Search repositories using loose keyword terms.
    e.g. opencl, nvidia, gpu, cl, amd.
    Recursively iterate over git trees to get .cl files.
    /src/guassian.cl
    #include
    /include/common.h
    #include "detail/math.cl"
    Foo:MyOpenCLRepo
    (0.6% miss rate)
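
Picking the `.cl` files out of a repository listing might look like the sketch below. It assumes the listing is a decoded response from GitHub's Git Trees API requested recursively, where each entry carries a `path` and a `type`; the function name is chosen here, and include resolution is not covered.

```python
def opencl_paths(tree_response, suffix=".cl"):
    """Return the paths of all files in a tree listing that end in `suffix`."""
    return [
        entry["path"]
        for entry in tree_response.get("tree", [])
        # "blob" entries are files; "tree" entries are directories.
        if entry.get("type") == "blob" and entry["path"].endswith(suffix)
    ]


listing = {
    "tree": [
        {"path": "src", "type": "tree"},
        {"path": "src/guassian.cl", "type": "blob"},
        {"path": "include/common.h", "type": "blob"},
        {"path": "include/detail/math.cl", "type": "blob"},
    ]
}
print(opencl_paths(listing))
```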

  26. /* Copyright (C) 2004 Joe Bloggs */
    //
    // DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
    // TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
    //
    // 0. You just DO WHAT THE FUCK YOU WANT TO.
    #define CLAMPING
    #define THRESHOLD_MIN 0.0f
    #define THRESHOLD_MAX 1.0f
    float myclamp(float in) {
    #ifdef CLAMPING
    return in > THRESHOLD_MAX ? THRESHOLD_MAX : in < THRESHOLD_MIN ? THRESHOLD_MIN : in;
    #else
    return in;
    #endif // CLAMPING
    }
    // Do something really flipping cool
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    //
    //
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }
    x 8078 files
    2.8 million lines

  27. [CLgen / CLdrive pipeline diagram, repeated from slide 22.]

  29. /* Copyright (C) 2004 Joe Bloggs */
    //
    // DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
    // TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
    //
    // 0. You just DO WHAT THE FUCK YOU WANT TO.
    #define CLAMPING
    #define THRESHOLD_MIN 0.0f
    #define THRESHOLD_MAX 1.0f
    float myclamp(float in) {
    #ifdef CLAMPING
    return in > THRESHOLD_MAX ? THRESHOLD_MAX : in < THRESHOLD_MIN ? THRESHOLD_MIN : in;
    #else
    return in;
    #endif // CLAMPING
    }
    // Do something really flipping cool
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    //
    //
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }

  30. /* Copyright (C) 2004 Joe Bloggs */
    //
    // DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
    // TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
    //
    // 0. You just DO WHAT THE FUCK YOU WANT TO.
    #define CLAMPING
    #define THRESHOLD_MIN 0.0f
    #define THRESHOLD_MAX 1.0f
    float myclamp(float in) {
    #ifdef CLAMPING
    return in > THRESHOLD_MAX ? THRESHOLD_MAX : in < THRESHOLD_MIN ? THRESHOLD_MIN : in;
    #else
    return in;
    #endif // CLAMPING
    }
    // Do something really flipping cool
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    //
    //
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }
    Is this real, valid OpenCL?
    Can we minimise non-functional variance?

  31. /* Copyright (C) 2004 Joe Bloggs */
    //
    // DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
    // TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
    //
    // 0. You just DO WHAT THE FUCK YOU WANT TO.
    #define CLAMPING
    #define THRESHOLD_MIN 0.0f
    #define THRESHOLD_MAX 1.0f
    float myclamp(float in) {
    #ifdef CLAMPING
    return in > THRESHOLD_MAX ? THRESHOLD_MAX : in < THRESHOLD_MIN ? THRESHOLD_MIN : in;
    #else
    return in;
    #endif // CLAMPING
    }
    // Do something really flipping cool
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    //
    //
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }
    Is this real, valid OpenCL?
    Can we minimise non-functional variance?
    Strip comments

  33. #define CLAMPING
    #define THRESHOLD_MIN 0.0f
    #define THRESHOLD_MAX 1.0f
    float myclamp(float in) {
    #ifdef CLAMPING
    return in > THRESHOLD_MAX ? THRESHOLD_MAX : in < THRESHOLD_MIN ? THRESHOLD_MIN : in;
    #else
    return in;
    #endif
    }
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }
    Strip comments
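
A comment-stripping pass can be sketched with a regular expression. This is an assumption about how such a step could work, not the tool's actual implementation: it also matches string and character literals so that comment markers inside them are left alone, and the names are chosen here.

```python
import re

# Matches comments, plus string/char literals so that comment markers
# inside a literal are not treated as comments.
_TOKEN = re.compile(
    r'//[^\n]*|/\*.*?\*/|"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'',
    re.DOTALL,
)


def strip_comments(source):
    """Replace each comment with a single space and keep literals intact."""
    def replace(match):
        token = match.group(0)
        return " " if token.startswith("/") else token
    return _TOKEN.sub(replace, source)


print(strip_comments("int id = get_global_id(0); // work-item index\n"))
```

Replacing a comment with a space rather than nothing avoids gluing two tokens together when a block comment sits between them.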

  34. #define CLAMPING
    #define THRESHOLD_MIN 0.0f
    #define THRESHOLD_MAX 1.0f
    float myclamp(float in) {
    #ifdef CLAMPING
    return in > THRESHOLD_MAX ? THRESHOLD_MAX : in < THRESHOLD_MIN ? THRESHOLD_MIN : in;
    #else
    return in;
    #endif
    }
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }
    Strip comments
    Preprocess

  36. Strip comments
    float myclamp(float in) {
    return in > 1.0f ? 1.0f : in < 0.0f ? 0.0f : in;
    }
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }
    Preprocess

  37. Strip comments
    float myclamp(float in) {
    return in > 1.0f ? 1.0f : in < 0.0f ? 0.0f : in;
    }
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }
    Preprocess
    Does it compile?
    Does it contain instructions?

  39. Strip comments
    float myclamp(float in) {
    return in > 1.0f ? 1.0f : in < 0.0f ? 0.0f : in;
    }
    __kernel void findAllNodesMergedAabb(__global float* in, __global float* out, int num_elems)
    {
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = myclamp(in[id]);
    }
    }
    Preprocess
    Does it compile?
    Does it contain instructions?
    Rewrite function names

  41. float A(float in) {
    return in > 1.0f ? 1.0f : in < 0.0f ? 0.0f : in;
    }
    __kernel void B(__global float* in, __global float* out, int num_elems)
    {
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = A(in[id]);
    }
    }
    Rewrite function names
    Strip comments
    Preprocess
    Does it compile?
    Does it contain instructions?

  42. float A(float in) {
    return in > 1.0f ? 1.0f : in < 0.0f ? 0.0f : in;
    }
    __kernel void B(__global float* in, __global float* out, int num_elems)
    {
    int id = get_global_id(0);
    if (id < num_elems)
    {
    out[id] = A(in[id]);
    }
    }
    Rewrite function names
    Strip comments
    Preprocess
    Does it compile?
    Does it contain instructions?
    Rewrite variable names

  44. float A(float a) {
    return a > 1.0f ? 1.0f : a < 0.0f ? 0.0f : a;
    }
    __kernel void B(__global float* a, __global float* b, int c)
    {
    int d = get_global_id(0);
    if (d < c)
    {
    b[d] = A(a[d]);
    }
    }
    Rewrite variable names
    Rewrite function names
    Strip comments
    Preprocess
    Does it compile?
    Does it contain instructions?
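
The renaming steps can be approximated as follows. This is a deliberately simplified sketch: it assumes the caller already knows which identifiers to rename and uses whole-word regex substitution, whereas a real normalizer would need a parser to respect scopes, keywords and built-ins. Both function names are chosen here.

```python
import re
import string
from itertools import count, product


def short_names(alphabet):
    """Yield a, b, ..., z, aa, ab, ... over the given alphabet."""
    for length in count(1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def rename_identifiers(source, identifiers, alphabet=string.ascii_lowercase):
    """Rename identifiers, in order of first declaration, to short names."""
    mapping = dict(zip(identifiers, short_names(alphabet)))
    if not mapping:
        return source
    # \b keeps "in" from matching inside "int" or "id" inside "get_global_id".
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda match: mapping[match.group(1)], source)


body = "int id = get_global_id(0); if (id < num_elems) out[id] = in[id];"
print(rename_identifiers(body, ["in", "out", "num_elems", "id"]))
```

Function names could be handled the same way by passing `string.ascii_uppercase` as the alphabet.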

  45. float A(float a) {
    return a > 1.0f ? 1.0f : a < 0.0f ? 0.0f : a;
    }
    __kernel void B(__global float* a, __global float* b, int c)
    {
    int d = get_global_id(0);
    if (d < c)
    {
    b[d] = A(a[d]);
    }
    }
    Rewrite variable names
    Rewrite function names
    Strip comments
    Preprocess
    Does it compile?
    Does it contain instructions?
    Enforce code style

  47. float A(float a) {
    return a > 1.0f ? 1.0f : a < 0.0f ? 0.0f : a;
    }
    __kernel void B(__global float* a, __global float* b, int c) {
    int d = get_global_id(0);
    if (d < c) {
    b[d] = A(a[d]);
    }
    }
    Rewrite variable names
    Rewrite function names
    Strip comments
    Preprocess
    Enforce code style
    Does it compile?
    Does it contain instructions?

  48. [CLgen / CLdrive pipeline diagram, repeated from slide 22.]

  50. Forward Pass Backward Pass Update Weights
    Network Design
    2048 nodes, 3 layers
    Stochastic Gradient Descent
    Initial learning rate 0.005
    Train for 50 epochs
    Learning rate decay every 5 epochs
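
The learning-rate schedule can be written as a step decay. The initial rate and the five-epoch interval come from the slide; the decay factor is not stated there, so the `decay_factor=0.5` default below is a placeholder assumption, as is the function name.

```python
def learning_rate(epoch, initial=0.005, decay_every=5, decay_factor=0.5):
    """Step decay: scale the rate by decay_factor once every decay_every epochs.

    decay_factor is a placeholder value; the slide does not state it.
    """
    return initial * decay_factor ** (epoch // decay_every)


# Rates for the first epoch of each five-epoch block over a 50-epoch run.
schedule = [learning_rate(epoch) for epoch in range(0, 50, 5)]
print(schedule)
```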

  52. Machine Learning
    Estimate y = f(x)
    Distribution of characters
    1278595 lines of OpenCL

  53. [CLgen / CLdrive pipeline diagram, repeated from slide 22.]

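  55. Kernel Synthesis
    def synthesize_kernel(predict_next_character):
        # Seed: a kernel signature up to and including its opening brace.
        S = '__kernel void A(__global float* a) {'
        depth = 1  # the seed has opened one block
        while depth > 0:
            c = predict_next_character(S)  # ask the model for one more character
            if c == '{':
                depth += 1
            if c == '}':
                depth -= 1
            S += c
        return S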
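
To see the brace-depth loop terminate, the sketch below runs it against a stand-in predictor that simply replays a fixed string. The replay predictor is purely hypothetical and takes the place of the trained LSTM; the function names are chosen here.

```python
def synthesize_kernel(predict_next_character,
                      seed="__kernel void A(__global float* a) {"):
    """Append predicted characters until the seed's opening brace is closed."""
    text = seed
    depth = 1  # the seed has opened one block
    while depth > 0:
        c = predict_next_character(text)
        if c == "{":
            depth += 1
        elif c == "}":
            depth -= 1
        text += c
    return text


def make_replay_predictor(characters):
    """Hypothetical stand-in for the model: replays characters one by one."""
    remaining = iter(characters)
    return lambda _text_so_far: next(remaining)


predictor = make_replay_predictor(" if (a) { a[0] = 1.0f; } } never reached")
kernel = synthesize_kernel(predictor)
print(kernel)
```

Sampling stops at the brace that closes the kernel body, so anything the predictor would have produced afterwards is never requested.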

  56. Demo
    (you had to be there)

  57. __kernel void A(__global float* a,
    __global float* b,
    __global float* c,
    const int d) {
    int e = get_global_id(0);
    float f = 0.0;
    for (int g = 0; g < d; g++) {
    c[g] = 0.0f;
    }
    barrier(1);
    a[get_global_id(0)] = 2*b[get_global_id(0)];
    }

  58. __kernel void A(__global float* a,
    __global float* b,
    __global float* c,
    const int d) {
    int e = get_global_id(0);
    if (e >= d) {
    return;
    }
    c[e] = a[e] + b[e] + 2 * a[e] + b[e] + 4;
    }

  59. __kernel void A(__global float* a,
    __global float* b,
    __global float* c,
    const int d) {
    unsigned int e = get_global_id(0);
    float16 f = (float16)(0.0);
    for (unsigned int g = 0; g < d; g++) {
    float16 h = a[g];
    f.s0 += h.s0;
    f.s1 += h.s1;
    /* snip ... */
    f.sE += h.sE;
    f.sF += h.sF;
    }
    b[e] = f.s0 + f.s1 + f.s2 + f.s3 + f.s4 +
    f.s5 + f.s6 + f.s7 + f.s8 + f.s9 + f.sA +
    f.sB + f.sC + f.sD + f.sE + f.sF;
    }

  60. [CLgen / CLdrive pipeline diagram, repeated from slide 22.]

  62. __kernel void A(__global float* a,
    __global float* b,
    __global float* c,
    const int d) {
    int e = get_global_id(0);
    if (e >= d) {
    return;
    }
    c[e] = a[e] + b[e] + 2 * a[e] + b[e] + 4;
    }

  64. [CLgen / CLdrive pipeline diagram, repeated from slide 22.]

  66. _A = random_payload(_A) # generate two independent random inputs
    _B = random_payload(_B)
    _C = copy(_A) # duplicate each input
    _D = copy(_B)
    A_ = k(_A) # compute outputs
    B_ = k(_B)
    C_ = k(_C)
    D_ = k(_D)
    # differential test
    assert (A_ != _A || B_ != _B) else NO_OUTPUTS
    assert (A_ != B_ || C_ != D_) else INPUT_INSENSITIVE
    assert (A_ == C_ && B_ == D_) else NON_DETERMINISTIC
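
The differential test can be made runnable by modelling a kernel as a plain Python function from a list to a list. That modelling, the `Verdict` enum and the function names are assumptions for illustration; the three checks follow the assertions on the slide, with the second pair of inputs taken as copies of the first pair.

```python
import random
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    NO_OUTPUTS = "no outputs"
    INPUT_INSENSITIVE = "input insensitive"
    NON_DETERMINISTIC = "non-deterministic"


def check_kernel(kernel, size=16, seed=0):
    """Run `kernel` on two random payloads and a copy of each, then compare."""
    rng = random.Random(seed)
    in_a = [rng.random() for _ in range(size)]
    in_b = [rng.random() for _ in range(size)]
    in_c, in_d = list(in_a), list(in_b)  # copies of the two inputs
    # Each run gets its own copy so a kernel cannot alter the saved inputs.
    out_a, out_b, out_c, out_d = (kernel(list(p)) for p in (in_a, in_b, in_c, in_d))
    if out_a == in_a and out_b == in_b:
        return Verdict.NO_OUTPUTS
    if out_a == out_b and out_c == out_d:
        return Verdict.INPUT_INSENSITIVE
    if out_a != out_c or out_b != out_d:
        return Verdict.NON_DETERMINISTIC
    return Verdict.OK


print(check_kernel(lambda xs: [2 * x for x in xs]))
```

A kernel that returns its input unchanged is reported as having no outputs, and one that returns a constant is reported as input insensitive.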

  67. [CLgen / CLdrive pipeline diagram, repeated from slide 22.]

  68. Does it work?
    (yes)

  69. 52%

  71. 7 programs, 1,000 synthetic benchmarks. 1.27x faster

  72. 71 programs, 1,000 synthetic benchmarks. 2.66x faster

  73. Good Things
    Basically* language agnostic.
    35 million repos on GitHub. We’re using 0.00004%.
    Generates 2000 OpenCL benchmarks per machine per day.
    Bad Things
    No support for things declared outside of kernel scope.
    Undirected almost to a fault.
    AMD rage.

  74. Thanks!
    More benchmarks = Better models
    No general way of creating benchmarks
    I taught a Neural Network to program from GitHub
    Improves state of the art by 3.38x
