$ whoami
Ioannis Petrousov
● 2017: Graduated from UOWM while working full-time.
● Moved to the Netherlands.
● .
● . (several adventures later)
● .
● 2022: Started working as a freelancer.
Disclaimer
- Notes from the book: A Practical Approach to Compiler Construction, by Des Watson.
- This presentation is for me to more-or-less verify what I’ve learned.
- Expect some interaction!
How I got into compilers
- I felt like a fool when trying to understand George Hotz’s streams.
- What’s tinycore?
- How do you compile something to run on the M1 neural engine?
- How do you compile something for the Google Coral Edge TPU?
- What is the LLVM backend and how can you use it?
Building blocks of a compiler
A compiler is a black box that takes logic written in a higher-level programming language as input and produces the same logic as machine code.
[Diagram: Compiler = Frontend → IR → Backend]
Compiler frontend
1. Lexical analysis (read)
  - Groups the source characters into lexical tokens.
  - OUTPUT: a stream of lexical tokens.
2. Syntax analysis (process)
  - INPUT: the stream of lexical tokens.
  - Reduces the tokens according to the production rules of the language's BNF grammar.
  - OUTPUT: the program's syntax tree.
3. Semantic analysis (write)
  - INPUT: the Abstract Syntax Tree.
  - Traverses the tree and annotates it with types, operator overloads, scopes, etc.
  - OUTPUT: Intermediate Representation (IR).
Lexical analysis
Frontend step 1 (read): groups the source characters into lexical tokens.
INPUT:
while (i <= 100) { tot += a[i]; /* form vector total */ i++; }
OUTPUT:
while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }
Note that the comment is discarded: it does not appear in the token stream.
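To make the slide concrete, here is a minimal sketch of a lexical analyser for the while-loop above. It is an illustration, not the book's implementation: the names `TOKEN_SPEC`, `RESERVED` and `tokenize` are made up for this example, and the set of operators it knows is limited to the ones that appear in the input.

```python
import re

# Hypothetical token table for this example only.
# Order matters: comments and whitespace are tried first, and
# two-character operators come before single-character symbols.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),
    ("SKIP", r"\s+"),
    ("INT", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"<=|\+=|\+\+|[()\[\]{};]"),
]
MASTER = re.compile("|".join(f"(?P<{k}>{p})" for k, p in TOKEN_SPEC), re.DOTALL)
RESERVED = {"while"}


def tokenize(source):
    """Group the characters of `source` into (kind, text) lexical tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("COMMENT", "SKIP"):
            continue  # comments and whitespace never reach the parser
        if kind == "NAME":
            kind = "reserved word" if text in RESERVED else "identifier"
        elif kind == "INT":
            kind = "integer constant"
        else:
            kind = "symbol"
        tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for token in tokenize("while (i <= 100) { tot += a[i]; /* form vector total */ i++; }"):
        print(token)
```

Running it on the slide's input yields the same token sequence as the OUTPUT listed above, with the comment dropped.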
Syntax analysis
Frontend step 2 (process):
- INPUT: the stream of lexical tokens.
- PROCESS: reduces the tokens according to the production rules of the language's BNF grammar.
- OUTPUT: the program's syntax tree.
INPUT:
while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }
In simpler words, it performs a syntax check to see whether the tokens adhere to the programming language’s syntax rules.
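As a rough sketch of what this step does, the recursive-descent parser below checks the token sequence from this slide and builds a syntax tree out of nested tuples. Everything here is an assumption made for illustration: the tiny grammar (one `while` loop, a `<=` condition, `+=` and `++` statements), the tuple shapes, and the names `Parser` and `parse`. Real parsers, including table-driven bottom-up ones that perform the "reductions" mentioned above, handle far more.

```python
class Parser:
    """Recursive-descent parser for a toy grammar (hypothetical, this slide only):

    while_stmt ::= "while" "(" operand "<=" operand ")" "{" { statement } "}"
    statement  ::= operand "+=" operand ";" | operand "++" ";"
    operand    ::= INT | NAME [ "[" operand "]" ]
    """

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, text):
        if self.peek() != text:
            raise SyntaxError(f"expected {text!r}, found {self.peek()!r}")
        self.pos += 1

    def operand(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        if tok.isdigit():
            return ("int", int(tok))
        if not tok.isidentifier() or tok == "while":
            raise SyntaxError(f"expected an operand, found {tok!r}")
        if self.peek() == "[":
            self.expect("[")
            index = self.operand()
            self.expect("]")
            return ("index", tok, index)
        return ("name", tok)

    def statement(self):
        target = self.operand()
        if target[0] == "int":
            raise SyntaxError("cannot assign to a constant")
        if self.peek() == "++":
            self.expect("++")
            self.expect(";")
            return ("increment", target)
        self.expect("+=")
        value = self.operand()
        self.expect(";")
        return ("add_assign", target, value)

    def while_stmt(self):
        self.expect("while")
        self.expect("(")
        left = self.operand()
        self.expect("<=")
        right = self.operand()
        self.expect(")")
        self.expect("{")
        body = []
        while self.peek() != "}":
            if self.peek() is None:
                raise SyntaxError("missing closing '}'")
            body.append(self.statement())
        self.expect("}")
        return ("while", ("<=", left, right), body)


def parse(tokens):
    """Return the syntax tree, or raise SyntaxError if the tokens break the grammar."""
    parser = Parser(tokens)
    tree = parser.while_stmt()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

Calling `parse` on the token texts `while ( i <= 100 ) { tot += a [ i ] ; i ++ ; }` returns a tree whose root is the `while` node. Dropping the `)` raises `SyntaxError`, which is the "syntax check" described above.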
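EBNF
- A syntax analyser knows the syntax of a language through a “document”.
- The “document” is written in a meta-language called EBNF.
- EBNF defines the syntax of a programming language.
Example (plain BNF for a postal address, from the page linked below; EBNF adds shorthand for repetition and optional parts):
 <postal-address> ::= <name-part> <street-address> <zip-part>
      <name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL> | <personal-part> <name-part>
  <personal-part> ::= <initial> "." | <first-name>
 <street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>
       <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>
<opt-suffix-part> ::= "Sr." | "Jr." | <roman-numeral> | ""
    <opt-apt-num> ::= <apt-num> | ""
https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form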
Semantic analysis
Frontend step 3 (write):
- INPUT: Abstract Syntax Tree.
- Traverses the tree and annotates it with types, operator overloads, scopes, etc.
- OUTPUT: Intermediate Representation (IR).
INPUT: [Figure: syntax tree; see https://cs.lmu.edu/~ray/notes/ir/]
OUTPUT (symbol table):
i     var     value=100
C     const   value=6
name  string  value=qwerty
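A minimal sketch of the traversal described above, assuming a syntax tree made of nested tuples and a symbol table that records a kind (`var` or `const`) and a type for each name. The node shapes, the `Symbol` and `SemanticError` classes and the `check` function are all hypothetical, and the `int` and `string` types given to `i`, `C` and `name` are my reading of the table on this slide, not something the slide states.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    kind: str  # "var" or "const"
    type: str  # "int" or "string"


class SemanticError(Exception):
    pass


def check(node, symbols):
    """Return the type of an expression node (None for a statement).

    Raises SemanticError for undeclared names, mismatched types and
    assignments to constants.
    """
    tag = node[0]
    if tag == "int":
        return "int"
    if tag == "str":
        return "string"
    if tag == "name":
        if node[1] not in symbols:
            raise SemanticError(f"undeclared identifier {node[1]!r}")
        return symbols[node[1]].type
    if tag == "add":
        left = check(node[1], symbols)
        right = check(node[2], symbols)
        if left != right:
            raise SemanticError(f"cannot add {left} and {right}")
        return left
    if tag == "assign":
        target = node[1]
        target_type = check(("name", target), symbols)
        if symbols[target].kind == "const":
            raise SemanticError(f"cannot assign to constant {target!r}")
        value_type = check(node[2], symbols)
        if value_type != target_type:
            raise SemanticError(f"cannot assign {value_type} to {target_type} {target!r}")
        return None
    raise SemanticError(f"unknown node {tag!r}")


# Symbol table modelled on the slide (types are assumed).
SYMBOLS = {
    "i": Symbol("var", "int"),
    "C": Symbol("const", "int"),
    "name": Symbol("var", "string"),
}
```

With this table, `check(("assign", "i", ("add", ("name", "i"), ("name", "C"))), SYMBOLS)` passes, while adding `i` to `name` or assigning to `C` raises `SemanticError`. The syntax check alone would have accepted all three.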
Intermediate Representation (IR)
- A combination of the syntax tree and the symbol table.
- There is no standard way of constructing one.
- Its design depends on the frontend and backend blocks.
- There are multiple IRs out there.
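Since there is no standard IR, the sketch below shows just one possible shape: a flat list of three-address instructions produced from an expression tree. This is a form I am swapping in for illustration, not the tree-plus-symbol-table IR from the slide. The tuple layout `(dest, op, left, right)`, the temporary names `t1`, `t2`, ... and the `lower` function are all assumptions.

```python
import itertools


def lower(expr):
    """Lower a nested-tuple expression tree into three-address instructions.

    Returns (code, result), where code is a list of (dest, op, left, right)
    tuples and result names the place holding the final value.
    """
    code = []
    temps = (f"t{n}" for n in itertools.count(1))

    def visit(node):
        tag = node[0]
        if tag == "int":
            return str(node[1])
        if tag == "name":
            return node[1]
        if tag in ("+", "*"):
            left = visit(node[1])
            right = visit(node[2])
            dest = next(temps)  # every operation gets a fresh temporary
            code.append((dest, tag, left, right))
            return dest
        raise ValueError(f"unknown node {tag!r}")

    result = visit(expr)
    return code, result


if __name__ == "__main__":
    # a + b * 2
    tree = ("+", ("name", "a"), ("*", ("name", "b"), ("int", 2)))
    for instruction in lower(tree)[0]:
        print(instruction)
```

For `a + b * 2` this produces `t1 = b * 2` followed by `t2 = a + t1`. A flat form like this is easier for a backend to walk than a tree.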
Modular blocks
- Compiler blocks are interchangeable.
- A frontend must know how to write an IR.
- A backend must know how to translate an IR into machine code.
[Diagram: Frontend → IR → Backend; blocks shown: Frontend-1, Frontend-2, Frontend-3, IR-1, IR-2, Backend-1, Backend-2]
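The interchangeability can be sketched with two toy frontends that write the same IR and two toy backends that read it. Everything is hypothetical: the IR is a nested tuple, the "source languages" are postfix and prefix arithmetic, and the "targets" are a made-up stack machine and a parenthesised text form standing in for a second target.

```python
OPS = {"+": "ADD", "*": "MUL"}


def frontend_postfix(text):
    """Frontend-1: postfix arithmetic such as '2 3 + 4 *' to the shared IR."""
    stack = []
    for tok in text.split():
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append((tok, left, right))
        else:
            stack.append(("int", int(tok)))
    if len(stack) != 1:
        raise SyntaxError("malformed postfix expression")
    return stack[0]


def frontend_prefix(text):
    """Frontend-2: prefix arithmetic such as '* + 2 3 4' to the same IR."""
    tokens = iter(text.split())

    def parse():
        tok = next(tokens)
        if tok in OPS:
            left = parse()
            right = parse()
            return (tok, left, right)
        return ("int", int(tok))

    tree = parse()
    if next(tokens, None) is not None:
        raise SyntaxError("trailing tokens after prefix expression")
    return tree


def backend_stack(ir):
    """Backend-1: emit instructions for a made-up stack machine."""
    if ir[0] == "int":
        return [f"PUSH {ir[1]}"]
    op, left, right = ir
    return backend_stack(left) + backend_stack(right) + [OPS[op]]


def backend_text(ir):
    """Backend-2: emit a fully parenthesised infix string."""
    if ir[0] == "int":
        return str(ir[1])
    op, left, right = ir
    return f"({backend_text(left)} {op} {backend_text(right)})"
```

Because both frontends agree on the IR, either one can be paired with either backend: `backend_stack(frontend_prefix("* + 2 3 4"))` and `backend_stack(frontend_postfix("2 3 + 4 *"))` give the same instruction list.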
Approaches towards code generation (Backend)
- A one-off custom generator that pattern-matches against the IR and its data structures.
- Peephole optimization applied to the IR instructions.
- Regular expressions applied to a textual IR, replacing each match with machine code.
- Replacing matching text with machine code.
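Two of the approaches above can be sketched together: a one-off generator that pattern-matches each IR instruction, followed by a peephole pass over the generated code. The IR layout `(dest, op, left, right)`, the single register `r0` and the `LOAD`/`ADD`/`MUL`/`STORE` instruction set are invented for this example and do not correspond to any real machine.

```python
OPCODES = {"+": "ADD", "*": "MUL"}


def select(ir):
    """Pattern-match each (dest, op, left, right) tuple into three instructions."""
    asm = []
    for dest, op, left, right in ir:
        if op not in OPCODES:
            raise ValueError(f"no pattern for operator {op!r}")
        asm.append(("LOAD", "r0", left))
        asm.append((OPCODES[op], "r0", right))
        asm.append(("STORE", "r0", dest))
    return asm


def peephole(asm):
    """Drop a LOAD that directly follows a STORE of the same register to the same place.

    After 'STORE r0, x' the register still holds x, so 'LOAD r0, x' is redundant.
    """
    out = []
    for ins in asm:
        prev = out[-1] if out else None
        if prev and ins[0] == "LOAD" and prev[0] == "STORE" and prev[1:] == ins[1:]:
            continue
        out.append(ins)
    return out


if __name__ == "__main__":
    # t1 = b * 2; t2 = t1 + a
    ir = [("t1", "*", "b", "2"), ("t2", "+", "t1", "a")]
    for ins in peephole(select(ir)):
        print(*ins)
```

The naive generator emits six instructions for the two IR tuples. The peephole pass removes the reload of `t1`, leaving five.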
What you should be able to answer
- What is a compiler?
- What are the building blocks of a compiler?
- How is Java different from C?
- How to navigate compiler pages: llvm.org || gcc.gnu.org
- How to make your own compiler (more or less).
ML and Compilers
ML accelerator hardware:
- GPUs.
- Google Coral Edge TPU.
- Apple Neural Engine.
- Custom-made FPGA accelerators.
- …
The field is open for disruption:
- Custom accelerator architectures require custom compilers.
- Compilers must be able to optimize ML code.