Compilers and ML - intro

My first presentation around compilers and how their construction relates to ML.

Ioannis Petrousov

January 18, 2023

Transcript

  1. Compilers and ML
    What’s their connection?
    Ioannis Petrousov
    18.01.2023

  2. $ whoami
    Ioannis Petrousov
    ● 2017: Graduated from UOWM while working full-time.
    ● Moved to the Netherlands.
    ● .
    ● . (several adventures later)
    ● .
    ● 2022: Started working as a freelancer

  3. Disclaimer
    - Notes from the book: A Practical Approach to Compiler Construction - Des Watson
    - This presentation is for me to more-or-less verify what I’ve learned.
    - Expects some interaction!

  4. TOC
    1. What is a compiler?
    2. Compiler building blocks.
    3. How compilers are made.

  5. How I got into compilers
    - I felt like a fool when trying to understand George Hotz’s streams.
    - What’s tinycore?
    - How do you compile something to run on the M1 Neural Engine?
    - How do you compile something for the Google Coral Edge TPU?
    - What is the LLVM backend and how can you use it?

  7. Which compilers do you know?

  8. What is a compiler?
    Compiler
    The secret lies within the compiler.

  9. Building blocks of a compiler
    It’s a black box that takes logic written in a higher-level programming language as input and produces the same logic in machine code.
    Compiler: Frontend → IR → Backend

  10. Compiler frontend
    Frontend (read → process → write)
    1. Lexical analysis (read)
    - Group the source characters into lexical tokens
    - OUTPUT: stream of lexical tokens
    2. Syntax analysis (process)
    - INPUT: stream of lexical tokens
    - Reduce the token stream according to the language's BNF grammar rules
    - OUTPUT: the program's syntax tree
    3. Semantic analysis (write)
    - INPUT: Abstract Syntax Tree
    - Traverses the tree and annotates it with:
    - types, operator overloads, scopes, etc.
    - OUTPUT: Intermediate Representation (IR)

  11. Lexical analysis
    Frontend
    1. Lexical analysis (read)
    - Group the source characters into lexical tokens
    - OUTPUT: stream of lexical tokens
    INPUT
    while (i <= 100) {
    tot += a[i]; /* form vector total */
    i++;
    }
    OUTPUT
    while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }

  12. How does the lexer identify tokens
    - Regex
    - Keywords
    - switch/case statements (handwritten)
    - Etc…
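    A handwritten lexer along these lines could look like the sketch below. It is an illustration, not the book's code: the `Token` type, the `TokKind` names, and `next_token` are made up for this example, and only the keyword `while` plus the operators from the earlier loop are recognised.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token kinds for a tiny C-like language (hypothetical names). */
typedef enum { TOK_KEYWORD, TOK_IDENT, TOK_INT, TOK_OP, TOK_EOF } TokKind;

typedef struct {
    TokKind kind;
    char text[32]; /* lexemes longer than 31 characters are not handled here */
} Token;

/* Reads one token starting at *src and advances *src past it. */
Token next_token(const char **src) {
    assert(src != NULL && *src != NULL);
    const char *p = *src;
    Token t = { TOK_EOF, "" };
    size_t n = 0;

    /* Whitespace and block comments are skipped: they never reach the parser. */
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (p[0] == '/' && p[1] == '*') {
            p += 2;
            while (*p && !(p[0] == '*' && p[1] == '/')) p++;
            if (*p) p += 2;
        } else {
            break;
        }
    }
    if (*p == '\0') { *src = p; return t; }

    if (isalpha((unsigned char)*p) || *p == '_') {
        /* Identifier or keyword: letters, digits, underscores. */
        while ((isalnum((unsigned char)*p) || *p == '_') && n < sizeof t.text - 1)
            t.text[n++] = *p++;
        t.text[n] = '\0';
        t.kind = strcmp(t.text, "while") == 0 ? TOK_KEYWORD : TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        /* Integer constant. */
        while (isdigit((unsigned char)*p) && n < sizeof t.text - 1)
            t.text[n++] = *p++;
        t.text[n] = '\0';
        t.kind = TOK_INT;
    } else {
        /* Operators and punctuation, decided with a switch on the first character. */
        switch (*p) {
        case '<':
        case '+':
            t.text[n++] = *p++;
            /* Two-character forms: <=, += and ++ */
            if (*p == '=' || (t.text[0] == '+' && *p == '+'))
                t.text[n++] = *p++;
            break;
        default:
            t.text[n++] = *p++;
            break;
        }
        t.text[n] = '\0';
        t.kind = TOK_OP;
    }
    *src = p;
    return t;
}
```

    Calling `next_token` repeatedly on the loop from the lexical analysis slide yields the same token sequence as listed there, with the comment dropped. Real lexers are often generated from regular expressions instead, for example with flex.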

  13. Syntax analysis
    Frontend
    2. Syntax analysis (process)
    - INPUT: stream of lexical tokens
    - Reduce the token stream according to the language's BNF grammar rules
    - OUTPUT: the program's syntax tree
    INPUT
    while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }
    PROCESS
    In simpler words, it checks whether the token sequence adheres to the programming language’s syntax rules.

  14. How does the syntax analyser know the syntax of a language?
    In other words, how is a programming language defined?

  15. How to learn a language
    Programmer: Pick a language → Read its syntax → Write a hello world program
    Skills
    - Rust
    - C
    - Go
    Skills
    - Rust
    - Go

  16. EBNF
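    - Similarly, a syntax analyser knows the syntax of a language through a “document”.
    - The “document” is written in a meta-language called EBNF.
    - EBNF defines the syntax of a programming language.
    EBNF example
    <postal-address> ::= <name-part> <street-address> <zip-part>
    <name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL> | <personal-part> <name-part>
    <personal-part> ::= <initial> "." | <first-name>
    <street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>
    <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>
    <opt-suffix-part> ::= "Sr." | "Jr." | <roman-numeral> | ""
    <opt-apt-num> ::= <apt-num> | ""
    https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form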

  17. Syntax analysis (AST)
    Frontend
    2. Syntax analysis (process)
    - INPUT: stream of lexical tokens
    - Reduce the token stream according to the language's BNF grammar rules
    - OUTPUT: the program's syntax tree
    INPUT
    int (reserved word), i (identifier), =, 5 (constant), ;
    REDUCTION
    <statement> ::= <type> <identifier> = <digit> ;
    <type> ::= int | float
    <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    OUTPUT
    statement
        =
      i   5
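    The reduction of `int i = 5;` into a small tree can be sketched in C as below. This is a simplified illustration under stated assumptions: the input is already tokenized into strings, the only accepted form is type, identifier, `=`, single digit, `;`, and the names `Node`, `parse_declaration`, and `free_tree` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* One node of the syntax tree: a label plus up to two children. */
typedef struct Node {
    char label[16];
    struct Node *left, *right;
} Node;

static Node *new_node(const char *label, Node *left, Node *right) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL); /* a sketch: abort on allocation failure */
    strncpy(n->label, label, sizeof n->label - 1);
    n->label[sizeof n->label - 1] = '\0';
    n->left = left;
    n->right = right;
    return n;
}

/* Checks the tokens against: type identifier "=" digit ";"
   Returns the "=" subtree (identifier on the left, constant on the right),
   or NULL when the tokens do not match the rule. */
Node *parse_declaration(const char *tok[], size_t ntok) {
    if (ntok != 5) return NULL;
    if (strcmp(tok[0], "int") != 0 && strcmp(tok[0], "float") != 0) return NULL;
    if (!isalpha((unsigned char)tok[1][0])) return NULL;
    if (strcmp(tok[2], "=") != 0) return NULL;
    if (!isdigit((unsigned char)tok[3][0]) || tok[3][1] != '\0') return NULL;
    if (strcmp(tok[4], ";") != 0) return NULL;
    return new_node("=", new_node(tok[1], NULL, NULL), new_node(tok[3], NULL, NULL));
}

/* Releases a tree built by parse_declaration. */
void free_tree(Node *n) {
    if (n == NULL) return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

    For the tokens `int`, `i`, `=`, `5`, `;` the function returns a node labelled `=` with children `i` and `5`. Swapping `=` and `5` makes it return NULL, which is the syntax check in action.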

  18. Semantic analysis
    Frontend
    3. Semantic analysis (write)
    - INPUT: Abstract Syntax Tree
    - Traverses the tree and annotates it with:
    - types, operator overloads, scopes, etc.
    - OUTPUT: Intermediate Representation (IR)
    OUTPUT
    i var value=100
    C const value=6
    name string value=qwerty
    https://cs.lmu.edu/~ray/notes/ir/
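    The three entries above read like symbol table records (name, kind, value). A minimal sketch of such a table in C follows. It assumes a fixed-size array with linear lookup and a single scope; `SymbolTable`, `symtab_insert`, and `symtab_lookup` are hypothetical names, not taken from the book.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64

typedef enum { SYM_VAR, SYM_CONST, SYM_STRING } SymKind;

typedef struct {
    char name[32];
    SymKind kind;
    char value[32]; /* value kept as text to keep the sketch short */
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

/* Linear search by name; returns NULL when the name is unknown. */
Symbol *symtab_lookup(SymbolTable *t, const char *name) {
    assert(t != NULL && name != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            return &t->entries[i];
    return NULL;
}

/* Adds a symbol; returns 0 on success, -1 if the name is already
   declared or the table is full. */
int symtab_insert(SymbolTable *t, const char *name, SymKind kind, const char *value) {
    assert(t != NULL && name != NULL && value != NULL);
    if (t->count == MAX_SYMBOLS || symtab_lookup(t, name) != NULL)
        return -1;
    Symbol *s = &t->entries[t->count++];
    strncpy(s->name, name, sizeof s->name - 1);
    s->name[sizeof s->name - 1] = '\0';
    s->kind = kind;
    strncpy(s->value, value, sizeof s->value - 1);
    s->value[sizeof s->value - 1] = '\0';
    return 0;
}
```

    Rejecting a second insert of the same name is one of the simplest semantic checks: it catches a redeclared variable, which the syntax analyser alone would accept.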

  19. Intermediate Representation (IR)
    - A combination of the Syntax Tree + Symbol table.
    - No standard way of constructing one.
    - Dependent on the Frontend and Backend blocks.
    - There are multiple IRs out there.
    IR

  20. Modular blocks
    - Compiler blocks are interchangeable.
    - A frontend must know how to write an IR.
    - A backend must know how to translate an IR into machine code.
    Frontend → IR → Backend
    Frontends: Frontend-1, Frontend-2, Frontend-3
    IRs: IR-1, IR-2
    Backends: Backend-1, Backend-2
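    The interchangeability can be illustrated with function pointers: any frontend that writes the shared IR can be paired with any backend that reads it. The sketch below is a toy under loud assumptions: the "language" is a single number, the IR is one constant, and the output mnemonics (`PUSH`, `MOV R0`) are invented for illustration and do not belong to a real instruction set.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* The shared IR: here just one integer constant. */
typedef struct { long constant; } IR;

/* A frontend turns source text into IR; a backend turns IR into target text. */
typedef IR (*Frontend)(const char *source);
typedef int (*Backend)(const IR *ir, char *out, size_t cap);

/* Frontend-1: source is a decimal literal. */
IR frontend_decimal(const char *source) {
    IR ir = { strtol(source, NULL, 10) };
    return ir;
}

/* Frontend-2: source is a hexadecimal literal. */
IR frontend_hex(const char *source) {
    IR ir = { strtol(source, NULL, 16) };
    return ir;
}

/* Backend-1: a made-up stack machine. */
int backend_stack(const IR *ir, char *out, size_t cap) {
    return snprintf(out, cap, "PUSH %ld", ir->constant);
}

/* Backend-2: a made-up register machine. */
int backend_register(const IR *ir, char *out, size_t cap) {
    return snprintf(out, cap, "MOV R0, %ld", ir->constant);
}

/* The driver only knows the IR, not which frontend or backend it was given.
   Returns the number of characters the backend produced. */
int compile(Frontend fe, Backend be, const char *source, char *out, size_t cap) {
    assert(fe != NULL && be != NULL && out != NULL);
    IR ir = fe(source);
    return be(&ir, out, cap);
}
```

    With two frontends and two backends, `compile` covers four combinations without any block knowing about the others. For example, `compile(frontend_hex, backend_stack, "ff", buf, sizeof buf)` writes `PUSH 255` into `buf`.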

  21. Practically
    https://gcc.gnu.org/frontends.html
    https://llvm.org/

  22. Compiler Backend
    Backend
    - Is concerned with:
    - Optimization
    - Code generation
    - Optimization targets:
    - Memory
    - Power
    - Speed
    - Generates machine code for the target processor.
    Targets: RISC, CISC, VM
    https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/risccisc/

  23. Approaches towards code generation
    Backend
    - One-off custom generator that pattern-matches against the IR and its data structures.
    - Apply peephole optimization to the IR instructions.
    - Use regex on the IR and replace the matches with machine code.
    - Replace matching text with machine code.
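    Peephole optimization looks at a small window of adjacent instructions and rewrites wasteful patterns. The sketch below assumes a made-up accumulator-machine IR (`OP_LOAD`, `OP_STORE`, and so on are hypothetical) and straight-line code without branch targets. It is one possible illustration, not the book's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* A made-up IR for an accumulator machine:
   LOAD a    : acc = mem[a]      STORE a   : mem[a] = acc
   ADD_IMM k : acc = acc + k     MUL_IMM k : acc = acc * k
   SHL_IMM k : acc = acc << k */
typedef enum { OP_LOAD, OP_STORE, OP_ADD_IMM, OP_MUL_IMM, OP_SHL_IMM } Op;

typedef struct {
    Op op;
    int arg;
} Instr;

/* Rewrites code[0..n) in place and returns the new instruction count.
   Assumes straight-line code: no instruction is a jump target. */
size_t peephole(Instr *code, size_t n) {
    assert(code != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        Instr in = code[i];

        /* acc + 0 and acc * 1 change nothing: drop the instruction. */
        if ((in.op == OP_ADD_IMM && in.arg == 0) ||
            (in.op == OP_MUL_IMM && in.arg == 1))
            continue;

        /* Strength reduction: acc * 2 becomes acc << 1. */
        if (in.op == OP_MUL_IMM && in.arg == 2) {
            in.op = OP_SHL_IMM;
            in.arg = 1;
        }

        /* STORE a followed by LOAD a: acc already holds mem[a], drop the LOAD. */
        if (in.op == OP_LOAD && out > 0 &&
            code[out - 1].op == OP_STORE && code[out - 1].arg == in.arg)
            continue;

        code[out++] = in;
    }
    return out;
}
```

    Given `LOAD 0, ADD_IMM 0, MUL_IMM 2, STORE 1, LOAD 1, MUL_IMM 1, STORE 2`, the pass leaves `LOAD 0, SHL_IMM 1, STORE 1, STORE 2`. Production compilers apply many more patterns and usually repeat the pass until nothing changes.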

  24. Approaches towards compiler construction

  25. Elementary language
    EBNF definition of an elementary language.
    S → A z | z
    A → x A | B
    B → y
    Usage examples
    xyz
    xxxxyz
    xxxxz (rejected: y expected)
    z

  26. Handwritten syntax analyser
    #include <stdio.h>
    #include <stdlib.h>

    int ch; /* current input character (lookahead) */

    void a(void);
    void b(void);

    /* Report a syntax error and stop. */
    void error(const char *msg) {
      printf("Error - found character %c - %s\n", ch, msg);
      exit(1);
    }

    /* B → y */
    void b(void) {
      if (ch == 'y') ch = getchar();
      else error("y expected");
    }

    /* A → x A | B */
    void a(void) {
      if (ch == 'x') {
        ch = getchar();
        a();
      }
      else b();
    }

    /* S → A z | z */
    void s(void) {
      if (ch == 'z') ch = getchar();
      else {
        a();
        if (ch != 'z') error("z expected");
        else ch = getchar();
      }
      printf("Success!\n");
    }

    int main(int argc, char *argv[]) {
      ch = getchar();
      s();
      return 0;
    }

    $ ./simpletopdown
    xyz
    Success!
    $ ./simpletopdown
    xxxxyz
    Success!
    $ ./simpletopdown
    xxxxz
    Error - found character z - y expected
    $ ./simpletopdown
    z
    Success!
    Source: A Practical Approach to Compiler Construction (Undergraduate Topics in Computer Science)

  27. Compiler construction tools
    - yacc
    - flex
    - bison
    Calculator example on terminal

  28. What you should be able to answer
    - What is a compiler?
    - Compiler building blocks.
    - How is Java different from C?
    - How to navigate compiler pages: llvm.org || gcc.gnu.org
    - How to make your own compilers (more or less).

  29. ML and Compilers
    ML accelerator hardware
    - GPUs
    - Google Coral Edge TPU
    - Apple Neural Engine
    - Custom-made FPGA accelerators
    - …
    The field is open for disruption.
    - Custom accelerator architectures require custom compilers.
    - Compilers must be able to optimize ML code.

  30. Find my notes on compilers on GitHub
    https://github.com/gpetrousov/compilers
