Compilers and ML - intro

My first presentation on compilers and how their construction relates to ML.

Ioannis Petrousov

January 18, 2023

Transcript

  1. $ whoami
     Ioannis Petrousov
     - 2017: Graduated from UOWM while working FT.
     - Moved to the Netherlands.
     - … (several adventures later) …
     - 2022: Started working as a freelancer.
  2. Disclaimer
     - Notes from the book "A Practical Approach to Compiler Construction" by Des Watson.
     - This presentation is for me to more or less verify what I’ve learned.
     - Expects some interaction!
  3. How I got into compilers
     - I felt like a fool when trying to understand George Hotz’s streams.
     - What’s tinycore?
     - How do you compile something to run on the M1 neural engine?
     - How do you compile something for the Google Coral Edge TPU?
     - What is the LLVM backend and how can you use it?
  5. Building blocks of a compiler
     A compiler is a black box that takes logic written in a higher-level programming language as input and produces the same logic in machine code.
     Compiler: Frontend → IR → Backend
  6. Compiler frontend
     1. Lexical analysis (read)
        - Groups the source characters into lexical tokens
        - OUTPUT: groups of lexical tokens
     2. Syntax analysis (process)
        - INPUT: groups of lexical tokens
        - Reduces the token groups according to the language's BNF rules
        - OUTPUT: the program's syntax tree
     3. Semantic analysis (write)
        - INPUT: Abstract Syntax Tree
        - Traverses the tree and inserts the following: types, operator overloads, scopes, etc.
        - OUTPUT: Intermediate Representation (IR)
  7. Lexical analysis (Frontend: read)
     1. Lexical analysis
        - Groups the source characters into lexical tokens
        - OUTPUT: groups of lexical tokens
     INPUT:
         while (i <= 100) {
             tot += a[i]; /* form vector total */
             i++;
         }
     OUTPUT:
         while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }
  8. How does the lexer identify tokens?
     - Regex
     - Keywords
     - Etc…
     - Case .. switch (manually written)
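The manually written `case .. switch` approach from slide 8 can be sketched roughly as follows. This is a minimal sketch, not the lexer from the book: `Token`, `TokenKind`, `push_char` and `next_token` are hypothetical names, `while` is the only reserved word it knows, and it only handles the symbols that appear in the loop on slide 7.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical token kinds: just enough for the loop on slide 7. */
typedef enum { T_KEYWORD, T_IDENT, T_INT, T_OP, T_PUNCT, T_EOF, T_ERROR } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* lexeme, truncated if longer than 31 characters */
} Token;

/* Append c to the token text if there is room left. */
static void push_char(Token *tok, size_t *n, char c) {
    if (*n < sizeof tok->text - 1) tok->text[(*n)++] = c;
    tok->text[*n] = '\0';
}

/* Read one token starting at *src and advance *src past it. */
Token next_token(const char **src) {
    Token tok = { T_EOF, "" };
    const char *p = *src;
    size_t n = 0;

    /* Skip whitespace and comments. */
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (p[0] != '/' || p[1] != '*') break;
        p += 2;
        while (*p && !(p[0] == '*' && p[1] == '/')) p++;
        if (*p) p += 2;
    }

    if (*p == '\0') {
        /* End of input: tok is already T_EOF. */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_') push_char(&tok, &n, *p++);
        tok.kind = strcmp(tok.text, "while") == 0 ? T_KEYWORD : T_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p)) push_char(&tok, &n, *p++);
        tok.kind = T_INT;
    } else {
        switch (*p) {
        case '<': case '+':
            push_char(&tok, &n, *p++);
            /* Two-character operators: <=, += and ++ */
            if (*p == '=' || (tok.text[0] == '+' && *p == '+'))
                push_char(&tok, &n, *p++);
            tok.kind = T_OP;
            break;
        case '(': case ')': case '{': case '}': case '[': case ']': case ';':
            push_char(&tok, &n, *p++);
            tok.kind = T_PUNCT;
            break;
        default:
            push_char(&tok, &n, *p++);
            tok.kind = T_ERROR;
            break;
        }
    }
    *src = p;
    return tok;
}
```

Calling `next_token` repeatedly on the loop from slide 7 until it returns `T_EOF` yields the 18 tokens listed there, with the comment skipped. A generated lexer would typically replace the `switch` with regular expressions, but the token stream it hands to the syntax analyser has the same shape.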
  9. Syntax analysis (Frontend: process)
     2. Syntax analysis
        - INPUT: groups of lexical tokens
        - Reduces the token groups according to the language's BNF rules
        - OUTPUT: the program's syntax tree
     INPUT:
         while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }
     PROCESS:
         In simpler words, it performs a syntax check to see whether the symbols adhere to the programming language’s syntax rules.
  10. How does the syntax analyser know the syntax of a language?
      In other words, how is a programming language defined?
  11. How to learn a language
      Programmer → Pick a language → Read its syntax → Write a hello world program
      Skills: Rust, C, Go
      Skills: Rust, Go
  12. EBNF
      - Similarly, a syntax analyser knows the syntax of a language through a “document”.
      - The “document” is written in a meta-language called EBNF.
      - EBNF defines the syntax of a programming language.
      EBNF example:
          <postal-address> ::= <name-part> <street-address> <zip-part>
          <name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL> | <personal-part> <name-part>
          <personal-part> ::= <initial> "." | <first-name>
          <street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>
          <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>
          <opt-suffix-part> ::= "Sr." | "Jr." | <roman-numeral> | ""
          <opt-apt-num> ::= <apt-num> | ""
      https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form
  13. Syntax analysis (AST) (Frontend: process)
      2. Syntax analysis
         - INPUT: groups of lexical tokens
         - Reduces the token groups according to the language's BNF rules
         - OUTPUT: the program's syntax tree
      INPUT:
          int (reserved word), i (identifier), =, 5 (constant), ;
      REDUCTION:
          <statement> ::= <type> <identifier> = <digit>;
          <type> ::= int | float
          <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
      Syntax tree nodes: statement, =, i, 5
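The reduction on slide 13 can be sketched as one small matching function per grammar rule. This is only an illustration under a few assumptions: it works directly on characters instead of a token stream, the slide does not define `<identifier>`, so it is assumed here to be one or more letters, and all function names (`parse_statement`, `match_type` and so on) are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Each match_* function returns a pointer just past the text it
   matched, or NULL if the rule does not apply at p. */

static const char *skip_ws(const char *p) {
    while (isspace((unsigned char)*p)) p++;
    return p;
}

/* <type> ::= int | float */
static const char *match_type(const char *p) {
    if (strncmp(p, "int", 3) == 0 && !isalnum((unsigned char)p[3])) return p + 3;
    if (strncmp(p, "float", 5) == 0 && !isalnum((unsigned char)p[5])) return p + 5;
    return NULL;
}

/* <identifier>: assumed to be one or more letters (not defined on the slide). */
static const char *match_identifier(const char *p) {
    const char *start = p;
    while (isalpha((unsigned char)*p)) p++;
    return p > start ? p : NULL;
}

/* <digit> ::= 0 | 1 | ... | 9 */
static const char *match_digit(const char *p) {
    return isdigit((unsigned char)*p) ? p + 1 : NULL;
}

/* A literal character from the grammar, such as '=' or ';'. */
static const char *match_char(const char *p, char c) {
    return *p == c ? p + 1 : NULL;
}

/* <statement> ::= <type> <identifier> = <digit>;
   Returns 1 if the whole input reduces to <statement>, 0 otherwise. */
int parse_statement(const char *src) {
    const char *p = match_type(skip_ws(src));
    if (!p) return 0;
    p = match_identifier(skip_ws(p));
    if (!p) return 0;
    p = match_char(skip_ws(p), '=');
    if (!p) return 0;
    p = match_digit(skip_ws(p));
    if (!p) return 0;
    p = match_char(skip_ws(p), ';');
    if (!p) return 0;
    return *skip_ws(p) == '\0';
}
```

With these rules `parse_statement("int i = 5;")` returns 1, while `parse_statement("int i = 55;")` returns 0, because the grammar on the slide allows a single `<digit>` only. A real syntax analyser would also build the tree nodes while matching instead of only answering yes or no.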
  14. Semantic analysis (Frontend: write)
      3. Semantic analysis
         - INPUT: Abstract Syntax Tree
         - Traverses the tree and inserts the following: types, operator overloads, scopes, etc.
         - OUTPUT: Intermediate Representation (IR)
      INPUT: (syntax tree figure)
      https://cs.lmu.edu/~ray/notes/ir/
      OUTPUT:
          i     var     value=100
          C     const   value=6
          name  string  value=qwerty
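The OUTPUT table on slide 14 is essentially a symbol table. A minimal sketch of one is shown below, assuming a fixed-size array and values stored as text; `SymbolTable`, `symtab_insert` and `symtab_lookup` are hypothetical names, and real compilers usually use hash tables with one table per scope.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64

/* The three kinds that appear in the slide's table. */
typedef enum { SYM_VAR, SYM_CONST, SYM_STRING } SymbolKind;

typedef struct {
    char name[32];
    SymbolKind kind;
    char value[32];   /* kept as text to keep the sketch short */
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

/* Linear search by name; returns NULL if the name is unknown. */
Symbol *symtab_lookup(SymbolTable *t, const char *name) {
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            return &t->entries[i];
    return NULL;
}

/* Returns 0 on success, -1 if the table is full, the name is already
   declared, or name/value do not fit in an entry. */
int symtab_insert(SymbolTable *t, const char *name, SymbolKind kind,
                  const char *value) {
    if (t->count == MAX_SYMBOLS || symtab_lookup(t, name) != NULL) return -1;
    if (strlen(name) >= sizeof t->entries[0].name) return -1;
    if (strlen(value) >= sizeof t->entries[0].value) return -1;

    Symbol *s = &t->entries[t->count++];
    strcpy(s->name, name);
    s->kind = kind;
    strcpy(s->value, value);
    return 0;
}
```

Inserting `i`, `C` and `name` with the values from the slide reproduces its three rows. Rejecting a second declaration of `i` is one example of the kind of check semantic analysis performs while it walks the tree.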
  15. Intermediate Representation (IR)
      - A combination of the Syntax Tree + Symbol table.
      - No standard way of constructing one.
      - Dependent on the Frontend and Backend blocks.
      - There are multiple IRs out there.
  16. Modular blocks
      - Compiler blocks are interchangeable.
      - A frontend must know how to write an IR.
      - A backend must know how to translate an IR into machine code.
      Frontend → IR → Backend
      Frontends: Frontend-1, Frontend-2, Frontend-3
      IRs: IR-1, IR-2
      Backends: Backend-1, Backend-2
  17. Compiler Backend
      - Is concerned with:
        - Optimization
        - Code generation
      - Optimization targets:
        - Memory
        - Power
        - Speed
      - Generates machine code for the specific target processor.
      Backend targets: RISC, CISC, VM
      https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/risccisc/
  18. Approaches towards code generation (Backend)
      - One-off custom generator which does pattern matching against the IR and data structures.
      - Using peephole optimization on the IR instructions.
      - Use regex on the IR and replace it with machine code.
        - Replace matching text with machine code.
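The peephole approach from slide 18 can be sketched as a pass that looks at a small window of neighbouring instructions and drops the ones that have no effect. The IR here is an assumption made for illustration: a flat list of instructions for a single-accumulator machine in which `STORE` leaves the accumulator unchanged. `Instr`, `Opcode` and `peephole` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical accumulator-machine IR. */
typedef enum { OP_LOAD, OP_STORE, OP_ADDI, OP_MULI } Opcode;

typedef struct {
    Opcode op;
    char name[16];   /* variable for LOAD/STORE, unused otherwise */
    int imm;         /* immediate for ADDI/MULI, unused otherwise */
} Instr;

/* Rewrites code[0..n) in place and returns the new length. */
size_t peephole(Instr *code, size_t n) {
    size_t out = 0;   /* number of instructions kept so far */

    for (size_t i = 0; i < n; i++) {
        /* Adding 0 or multiplying by 1 changes nothing. */
        if ((code[i].op == OP_ADDI && code[i].imm == 0) ||
            (code[i].op == OP_MULI && code[i].imm == 1))
            continue;

        /* LOAD x directly after STORE x: the value is still in the
           accumulator, so the load is redundant. */
        if (code[i].op == OP_LOAD && out > 0 &&
            code[out - 1].op == OP_STORE &&
            strcmp(code[out - 1].name, code[i].name) == 0)
            continue;

        code[out++] = code[i];
    }
    return out;
}
```

For the sequence `LOAD a; ADDI 0; STORE t; LOAD t; MULI 2; STORE b` the pass keeps `LOAD a; STORE t; MULI 2; STORE b`. The redundant load is compared against the last kept instruction rather than the previous input instruction, so patterns that only become adjacent after an earlier removal are still caught.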
  19. Elementary language
      EBNF definition of an elementary language:
          S → Az | z
          A → xA | B
          B → y
      Example inputs: xyz, xxxxyz, xxxxz, z
      (xxxxz is not in the language: a y must come before the final z.)
  20. Handwritten syntax analyser

          #include <stdio.h>
          #include <stdlib.h>

          int ch; /* current input character */

          /* Report the offending character and stop. */
          void error(char *msg) {
              printf("Error - found character %c - %s\n", ch, msg);
              exit(1);
          }

          /* B → y */
          void b() {
              if (ch == 'y') ch = getchar();
              else error("y expected");
          }

          /* A → xA | B */
          void a() {
              if (ch == 'x') {
                  ch = getchar();
                  a();
              }
              else b();
          }

          /* S → Az | z */
          void s() {
              if (ch == 'z') ch = getchar();
              else {
                  a();
                  if (ch != 'z') error("z expected");
                  else ch = getchar();
              }
              printf("Success!\n");
          }

          int main(int argc, char *argv[]) {
              ch = getchar();
              s();
              return 0;
          }

      Example runs:

          $ ./simpletopdown
          xyz
          Success!
          $ ./simpletopdown
          xxxxyz
          Success!
          $ ./simpletopdown
          xxxxz
          Error - found character z - y expected
          $ ./simpletopdown
          z
          Success!

      Source: A Practical Approach to Compiler Construction (Undergraduate Topics in Computer Science)
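For experimenting with the grammar S → Az | z, A → xA | B, B → y without piping input into a program, the same recursive-descent idea can be written against a string. This is a variation on the slide's analyser, not the book's code: `accepts`, `parse_s`, `parse_a` and `parse_b` are hypothetical names, the recursion in A is written as a loop, and the whole string must be consumed, which is stricter than the `getchar` version.

```c
#include <stddef.h>

/* Each parse_* function returns a pointer just past the text it
   matched, or NULL if the input does not fit the rule. */

/* B → y */
static const char *parse_b(const char *p) {
    return *p == 'y' ? p + 1 : NULL;
}

/* A → xA | B, with the recursion on x written as a loop */
static const char *parse_a(const char *p) {
    while (*p == 'x') p++;
    return parse_b(p);
}

/* S → Az | z */
static const char *parse_s(const char *p) {
    if (*p == 'z') return p + 1;
    p = parse_a(p);
    return (p != NULL && *p == 'z') ? p + 1 : NULL;
}

/* Returns 1 if the whole string belongs to the language, 0 otherwise. */
int accepts(const char *input) {
    const char *end = parse_s(input);
    return end != NULL && *end == '\0';
}
```

`accepts` returns 1 for `xyz`, `xxxxyz` and `z`, and 0 for `xxxxz`, matching the runs shown on slide 20. Returning a result instead of calling `exit` makes the recognizer easy to call from tests.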
  21. What you should be able to answer
      - What is a compiler?
      - What are the building blocks of a compiler?
      - How is Java different from C?
      - How to navigate compiler pages: llvm.org || gcc.gnu.org
      - How to make your own compilers (more or less).
  22. ML and Compilers
      ML accelerator hardware:
      - GPUs
      - Google Coral Edge TPU.
      - Apple Neural Engine.
      - Custom made FPGA accelerators.
      - …
      The field is open for disruption:
      - Custom accelerator architectures require custom compilers.
      - Compilers must be able to optimize ML code.