Compilers and ML - intro

My first presentation around compilers and how their construction relates to ML.

Ioannis Petrousov

January 18, 2023

Transcript

  1. $ whoami

    Ioannis Petrousov
    - 2017: Graduated from UOWM while working full-time.
    - Moved to the Netherlands.
    - … (several adventures later) …
    - 2022: Started working as a freelancer.
  2. Disclaimer

    - Notes from the book "A Practical Approach to Compiler Construction" by Des Watson.
    - This presentation is for me to more or less verify what I’ve learned.
    - Expect some interaction!
  3. How I got into compilers

    - I felt like a fool when trying to understand George Hotz’s streams.
    - What’s tinycore?
    - How do you compile something to run on the M1 Neural Engine?
    - How do you compile something for the Google Coral Edge TPU?
    - What is the LLVM backend and how can you use it?
  5. Building blocks of a compiler

    A compiler is a black box that takes as input logic written in a higher-level programming language and produces the same logic in machine code.

    Diagram: Compiler = Frontend → IR → Backend
  6. Compiler frontend

    The frontend runs three steps (read, process, write):
    1. Lexical analysis (read)
       - Groups the source characters into lexical tokens.
       - OUTPUT: a sequence of lexical tokens.
    2. Syntax analysis (process)
       - INPUT: the sequence of lexical tokens.
       - Reduces the tokens according to the language's BNF rules.
       - OUTPUT: the program's syntax tree.
    3. Semantic analysis (write)
       - INPUT: the abstract syntax tree.
       - Traverses the tree and annotates it with types, operator overloads, scopes, etc.
       - OUTPUT: the Intermediate Representation (IR).
  7. Lexical analysis

    Frontend step 1 (read): lexical analysis
    - Groups the source characters into lexical tokens.
    - OUTPUT: a sequence of lexical tokens.

    INPUT:
      while (i <= 100) { tot += a[i]; /* form vector total */ i++; }

    OUTPUT:
      while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }
  8. How does the lexer identify tokens?

    - Regex
    - Keywords
    - Etc…
    - switch/case statements (manually written)
  9. Syntax analysis

    Frontend step 2 (process): syntax analysis
    - INPUT: the sequence of lexical tokens.
    - Reduces the tokens according to the language's BNF rules.
    - OUTPUT: the program's syntax tree.

    INPUT:
      while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }

    PROCESS:
      In simpler words, it performs a syntax check to see whether the tokens adhere to the programming language’s syntax rules.
  10. How does the syntax analyser know the syntax of a language?

    In other words, how is a programming language defined?
  11. How to learn a language

    Programmer → Pick a language → Read its syntax → Write a hello world program

    Diagram labels:
    - Skills: Rust, C, Go
    - Skills: Rust, Go
  12. EBNF

    - Similarly, a syntax analyser knows the syntax of a language through a “document”.
    - The “document” is written in a meta-language called EBNF.
    - EBNF defines the syntax of a programming language.

    Example (written in BNF, the notation that EBNF extends):
      <postal-address>  ::= <name-part> <street-address> <zip-part>
      <name-part>       ::= <personal-part> <last-name> <opt-suffix-part> <EOL> | <personal-part> <name-part>
      <personal-part>   ::= <initial> "." | <first-name>
      <street-address>  ::= <house-num> <street-name> <opt-apt-num> <EOL>
      <zip-part>        ::= <town-name> "," <state-code> <ZIP-code> <EOL>
      <opt-suffix-part> ::= "Sr." | "Jr." | <roman-numeral> | ""
      <opt-apt-num>     ::= <apt-num> | ""

    https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form
  13. Syntax analysis (AST)

    Frontend step 2 (process): syntax analysis
    - INPUT: the sequence of lexical tokens.
    - Reduces the tokens according to the language's BNF rules.
    - OUTPUT: the program's syntax tree.

    INPUT:
      int (reserved word), i (identifier), =, 5 (constant), ;

    REDUCTION:
      <statement> ::= <type> <identifier> = <digit>;
      <type>      ::= int | float
      <digit>     ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

    Syntax tree (node labels from the diagram): statement, =, i, 5
  14. Semantic analysis

    Frontend step 3 (write): semantic analysis
    - INPUT: the abstract syntax tree.
    - Traverses the tree and annotates it with types, operator overloads, scopes, etc.
    - OUTPUT: the Intermediate Representation (IR).

    INPUT: syntax tree diagram (https://cs.lmu.edu/~ray/notes/ir/)

    OUTPUT (IR):
      i     var     value=100
      C     const   value=6
      name  string  value=qwerty
  15. Intermediate Representation (IR)

    - A combination of the syntax tree and the symbol table.
    - There is no standard way of constructing one.
    - It depends on the frontend and backend blocks.
    - There are multiple IRs out there.
  16. Modular blocks

    - Compiler blocks are interchangeable.
    - A frontend must know how to write an IR.
    - A backend must know how to translate an IR into machine code.

    Diagram: Frontend → IR → Backend
    Diagram labels: Frontend-1, Frontend-2, Frontend-3, IR-1, IR-2, Backend-1, Backend-2
  17. Compiler Backend

    - Is concerned with:
      - Optimization
      - Code generation
    - Optimization targets:
      - Memory
      - Power
      - Speed
    - Generates machine code for the specific target processor.

    Backend targets: RISC, CISC, VM
    https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/risccisc/
  18. Approaches towards code generation

    - A one-off custom generator that pattern-matches against the IR and its data structures.
    - Peephole optimization on the IR instructions.
    - Regex matching on the IR, replacing the matches with machine code.
    - Replacing matching text with machine code.
  19. Elementary language

    EBNF definition of an elementary language:
      S → Az | z
      A → xA | B
      B → y

    Usage examples: xyz, xxxxyz, xxxxz, z
  20. Handwritten syntax analyser

      #include <stdio.h>
      #include <stdlib.h>

      int ch;  /* current input character (lookahead) */

      /* Report a syntax error and stop. */
      void error(const char *msg) {
        printf("Error - found character %c - %s\n", ch, msg);
        exit(1);
      }

      /* B → y */
      void b(void) {
        if (ch == 'y') ch = getchar();
        else error("y expected");
      }

      /* A → xA | B */
      void a(void) {
        if (ch == 'x') {
          ch = getchar();
          a();
        }
        else b();
      }

      /* S → Az | z */
      void s(void) {
        if (ch == 'z') ch = getchar();
        else {
          a();
          if (ch != 'z') error("z expected");
          else ch = getchar();
        }
        printf("Success!\n");
      }

      int main(void) {
        ch = getchar();
        s();
        return 0;
      }

    Example runs:
      $ ./simpletopdown
      xyz
      Success!
      $ ./simpletopdown
      xxxxyz
      Success!
      $ ./simpletopdown
      xxxxz
      Error - found character z - y expected
      $ ./simpletopdown
      z
      Success!

    Source: A Practical Approach to Compiler Construction (Undergraduate Topics in Computer Science)
  21. What you should be able to answer

    - What is a compiler?
    - What are the building blocks of a compiler?
    - How is Java different from C?
    - How to navigate compiler pages: llvm.org || gcc.gnu.org
    - How to make your own compiler (more or less).
  22. ML and Compilers

    ML accelerator hardware:
    - GPUs
    - Google Coral Edge TPU
    - Apple Neural Engine
    - Custom-made FPGA accelerators
    - …

    The field is open for disruption:
    - Custom accelerator architectures require custom compilers.
    - Compilers must be able to optimize ML code.