Compilers and ML - intro

Compilers and ML: What's their connection?
Ioannis Petrousov, 18.01.2023

$ whoami
Ioannis Petrousov
- 2017: Graduated from UOWM while working full-time.
- Moved to the Netherlands.
- (several adventures later)
- 2022: Started working as a freelancer.

Disclaimer
- Notes from the book: A Practical Approach to Compiler Construction, by Des Watson.
- This presentation is for me to more or less verify what I've learned.
- Expects some interaction!

TOC
1. What is a compiler?
2. Compiler building blocks.
3. How compilers are made.

How I got into compilers
- I felt like a fool when trying to understand George Hotz's streams.
- What's tinycore?
- How do you compile something to run on the M1 Neural Engine?
- How do you compile something for the Google Coral Edge TPU?
- What is the LLVM backend and how can you use it?

Which compilers do you know?

What is a compiler?
The secret lies within the compiler.

Building blocks of a compiler
A compiler is a black box which takes as input logic written in a higher-level programming language and produces the same logic in some machine code.
Compiler: Frontend -> IR -> Backend

Compiler frontend
1. Lexical analysis (read)
   - Groups the input characters into lexical tokens.
   - OUTPUT: a sequence of lexical tokens
2. Syntax analysis (process)
   - INPUT: the sequence of lexical tokens
   - Reduces the token sequence according to the language's BNF rules.
   - OUTPUT: the program's syntax tree
3. Semantic analysis (write)
   - INPUT: Abstract Syntax Tree
   - Traverses the tree and annotates it with types, operator overloads, scopes, etc.
   - OUTPUT: Intermediate Representation (IR)

Lexical analysis (read)
- Groups the input characters into lexical tokens.
- OUTPUT: a sequence of lexical tokens

INPUT
    while (i <= 100) {
        tot += a[i]; /* form vector total */
        i++;
    }

OUTPUT
    while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }

How does the lexer identify tokens?
- Regex
- Keywords
- Etc…
- A manually written switch/case
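The manually written switch/case option can be sketched roughly as follows. This is a minimal illustration, not the book's lexer: the names `Token`, `TokenKind` and `next_token` are assumptions made up for this example, and it only knows the handful of symbols that appear in the while-loop slide.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Token kinds needed for the while-loop example (illustrative names). */
typedef enum { T_KEYWORD, T_IDENT, T_INT, T_OP, T_PUNCT, T_EOF } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* longer lexemes are split; fine for a sketch */
} Token;

/* Reads one token starting at *src and advances *src past it. */
static Token next_token(const char **src) {
    assert(src != NULL && *src != NULL);
    const char *p = *src;
    Token t = { T_EOF, "" };
    size_t n = 0;

    /* Skip whitespace and C-style comments: they produce no tokens. */
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (p[0] == '/' && p[1] == '*') {
            p += 2;
            while (*p && !(p[0] == '*' && p[1] == '/')) p++;
            if (*p) p += 2;
        } else {
            break;
        }
    }

    if (*p == '\0') { *src = p; return t; }

    if (isalpha((unsigned char)*p) || *p == '_') {
        /* Identifier or reserved word. */
        while ((isalnum((unsigned char)*p) || *p == '_') && n < sizeof t.text - 1)
            t.text[n++] = *p++;
        t.text[n] = '\0';
        t.kind = strcmp(t.text, "while") == 0 ? T_KEYWORD : T_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        /* Integer constant. */
        while (isdigit((unsigned char)*p) && n < sizeof t.text - 1)
            t.text[n++] = *p++;
        t.text[n] = '\0';
        t.kind = T_INT;
    } else {
        switch (*p) {
        case '<': case '+':
            /* One- or two-character operators: <, <=, +, +=, ++ */
            t.text[n++] = *p++;
            if (*p == '=' || (t.text[0] == '+' && *p == '+'))
                t.text[n++] = *p++;
            t.kind = T_OP;
            break;
        default:
            /* Everything else is treated as single-character punctuation. */
            t.text[n++] = *p++;
            t.kind = T_PUNCT;
            break;
        }
        t.text[n] = '\0';
    }

    *src = p;
    return t;
}
```

Calling `next_token` repeatedly on the while-loop source until it returns `T_EOF` yields the same token sequence as the OUTPUT on the lexical analysis slide; the comment is skipped and never becomes a token.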

Syntax analysis (process)
- INPUT: the sequence of lexical tokens
- Reduces the token sequence according to the language's BNF rules.
- OUTPUT: the program's syntax tree

INPUT
    while (reserved word), (, i (identifier), <=, 100 (integer constant), ), {, tot (identifier), +=, a (identifier), [, i (identifier), ], ;, i (identifier), ++, ;, }

In simpler words, it performs a syntax check to see whether the tokens adhere to the programming language's syntax rules.

How does the syntax analyser know the syntax of a language?
In other words, how is a programming language defined?

How to learn a language
Programmer:
1. Pick a language.
2. Read its syntax.
3. Write a hello world program.
Skills before: Rust, Go
Skills after: Rust, C, Go

EBNF
- Similarly, a syntax analyser knows the syntax of a language through a "document".
- The "document" is written in a meta-language called EBNF.
- EBNF defines the syntax of a programming language.

Example (written in plain BNF notation):
    <postal-address> ::= <name-part> <street-address> <zip-part>
    <name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL> | <personal-part> <name-part>
    <personal-part> ::= <initial> "." | <first-name>
    <street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>
    <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>
    <opt-suffix-part> ::= "Sr." | "Jr." | <roman-numeral> | ""
    <opt-apt-num> ::= <apt-num> | ""
Source: https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form

Syntax analysis (AST)
- INPUT: the sequence of lexical tokens
- Reduces the token sequence according to the language's BNF rules.
- OUTPUT: the program's syntax tree

INPUT
    int (reserved word), i (identifier), =, 5 (constant), ;

REDUCTION
    <statement> ::= <type> <identifier> = <digit>;
    <type> ::= int | float
    <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Resulting tree: a statement node whose root is = with the children i and 5.
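To make the reduction concrete, here is a small sketch of a parser for exactly this three-rule grammar. It is an illustration under assumptions: `Statement`, `read_word` and `parse_statement` are hypothetical names, the parser works directly on characters instead of lexer tokens to stay short, and it does not reject reserved words used as identifiers.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* The "tree" for one statement: a type, an identifier and a digit. */
typedef struct {
    char type[8];    /* "int" or "float" */
    char name[32];   /* identifier */
    int  digit;      /* single digit 0..9, as in the slide's grammar */
} Statement;

static const char *skip_ws(const char *p) {
    while (isspace((unsigned char)*p)) p++;
    return p;
}

/* Copies one alphanumeric word into buf.
 * Returns the position after the word, or NULL if there is none. */
static const char *read_word(const char *p, char *buf, size_t cap) {
    size_t n = 0;
    p = skip_ws(p);
    if (!isalpha((unsigned char)*p)) return NULL;
    while (isalnum((unsigned char)*p)) {
        if (n + 1 >= cap) return NULL;   /* word too long for buf */
        buf[n++] = *p++;
    }
    buf[n] = '\0';
    return p;
}

/* <statement> ::= <type> <identifier> = <digit>;
 * Returns 1 and fills *out on success, 0 on a syntax error. */
static int parse_statement(const char *src, Statement *out) {
    char word[32];
    const char *p;
    assert(src != NULL && out != NULL);

    /* <type> ::= int | float */
    p = read_word(src, word, sizeof word);
    if (!p || (strcmp(word, "int") != 0 && strcmp(word, "float") != 0)) return 0;
    strcpy(out->type, word);

    /* <identifier> */
    p = read_word(p, out->name, sizeof out->name);
    if (!p) return 0;

    /* "=" */
    p = skip_ws(p);
    if (*p++ != '=') return 0;

    /* <digit> ::= 0 | 1 | ... | 9 */
    p = skip_ws(p);
    if (!isdigit((unsigned char)*p)) return 0;
    out->digit = *p++ - '0';

    /* ";" followed by nothing but whitespace */
    p = skip_ws(p);
    if (*p++ != ';') return 0;
    return *skip_ws(p) == '\0';
}
```

With this sketch, `parse_statement("int i = 5;", &s)` succeeds and fills in the type, the identifier and the digit, while an input such as `int i = 55;` is rejected because the grammar only allows a single digit.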

Semantic analysis (write)
- INPUT: Abstract Syntax Tree
- Traverses the tree and annotates it with types, operator overloads, scopes, etc.
- OUTPUT: Intermediate Representation (IR)

Example symbol table in the OUTPUT:
    i     var     value=100
    C     const   value=6
    name  string  value=qwerty
Source: https://cs.lmu.edu/~ray/notes/ir/
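A symbol table like the one on the slide could be sketched as below. This is only an assumption about how such a table might look: `SymbolTable`, `declare`, `assign` and `lookup` are hypothetical names, values are stored as plain strings, and there is a single flat scope. The two checks shown (no redeclaration, no assignment to a constant) are typical examples of what semantic analysis verifies.

```c
#include <assert.h>
#include <string.h>

typedef enum { SYM_VAR, SYM_CONST, SYM_STRING } SymKind;

typedef struct {
    char name[32];
    SymKind kind;
    char value[32];   /* kept as text to keep the sketch small */
} Symbol;

typedef struct {
    Symbol entries[64];
    int count;
} SymbolTable;

/* Returns the entry for name, or NULL if it was never declared. */
static Symbol *lookup(SymbolTable *t, const char *name) {
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0) return &t->entries[i];
    return NULL;
}

/* Adds a new symbol. Returns 0 on redeclaration, overflow or oversized text. */
static int declare(SymbolTable *t, const char *name, SymKind kind, const char *value) {
    assert(t != NULL && name != NULL && value != NULL);
    if (t->count >= 64 || lookup(t, name)) return 0;
    if (strlen(name) >= sizeof t->entries[0].name) return 0;
    if (strlen(value) >= sizeof t->entries[0].value) return 0;
    Symbol *s = &t->entries[t->count++];
    strcpy(s->name, name);
    s->kind = kind;
    strcpy(s->value, value);
    return 1;
}

/* Semantic check: only declared, non-constant symbols may be assigned. */
static int assign(SymbolTable *t, const char *name, const char *value) {
    assert(t != NULL && name != NULL && value != NULL);
    Symbol *s = lookup(t, name);
    if (!s || s->kind == SYM_CONST) return 0;
    if (strlen(value) >= sizeof s->value) return 0;
    strcpy(s->value, value);
    return 1;
}
```

Declaring `i`, `C` and `name` with the values from the slide reproduces the table above; a later `assign(&t, "C", "7")` is refused because `C` is a constant.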

Intermediate Representation (IR)
- A combination of the syntax tree and the symbol table.
- There is no standard way of constructing one.
- Depends on the frontend and backend blocks.
- There are multiple IRs out there.

Modular blocks
- Compiler blocks are interchangeable.
- A frontend must know how to write an IR.
- A backend must know how to translate an IR into machine code.

Diagram: Frontend -> IR -> Backend, with several frontends (Frontend-1, Frontend-2, Frontend-3) and several backends (Backend-1, Backend-2) around the IRs (IR-1, IR-2).

Practically
- GCC frontends: https://gcc.gnu.org/frontends.html
- LLVM: https://llvm.org/

Compiler backend
- Is concerned with:
  - Optimization
  - Code generation
- Optimization targets:
  - Memory
  - Power
  - Speed
- Generates machine code for the specific target processor.
- Targets: RISC, CISC, VM
Source: https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/risccisc/

Approaches towards code generation
- A one-off custom generator which does pattern matching against the IR and its data structures.
- Applying peephole optimization to the IR instructions.
- Using regex on the IR to find patterns and replace them with machine code.
- Replacing matching text with machine code.
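As a rough illustration of the peephole idea, the sketch below slides over a short instruction list and rewrites small local patterns. The instruction set (`OP_*`), the `Instr` layout and the chosen patterns are assumptions made up for this example; a real backend would work on its own IR and a much larger pattern set.

```c
#include <assert.h>
#include <stddef.h>

/* A made-up instruction set. For LOAD/STORE, imm is a memory address. */
typedef enum { OP_LOAD, OP_STORE, OP_ADD_IMM, OP_MUL_IMM, OP_SHL_IMM } Op;

typedef struct {
    Op op;
    int reg;
    int imm;
} Instr;

/* Rewrites code in place and returns the new instruction count. */
static size_t peephole(Instr *code, size_t n) {
    assert(code != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        Instr in = code[i];

        /* x + 0 and x * 1 have no effect: drop them. */
        if (in.op == OP_ADD_IMM && in.imm == 0) continue;
        if (in.op == OP_MUL_IMM && in.imm == 1) continue;

        /* x * 2 becomes a shift left by one. */
        if (in.op == OP_MUL_IMM && in.imm == 2) {
            in.op = OP_SHL_IMM;
            in.imm = 1;
        }

        /* A load right after a store of the same register to the same
         * address is redundant: the register already holds the value. */
        if (in.op == OP_LOAD && out > 0 &&
            code[out - 1].op == OP_STORE &&
            code[out - 1].reg == in.reg &&
            code[out - 1].imm == in.imm)
            continue;

        code[out++] = in;
    }
    return out;
}
```

Each rule only looks at one or two adjacent instructions, which is what makes it a peephole: the optimizer never needs to understand the whole program.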

Approaches towards compiler construction

Elementary language
EBNF definition of an elementary language:
    S → A z | z
    A → x A | B
    B → y

Usage examples:
- xyz
- xxxxyz
- xxxxz (invalid: the y is missing)
- z

Handwritten syntax analyser

    #include <stdio.h>
    #include <stdlib.h>

    int ch;              /* current input character */

    void a(void);        /* forward declarations: s() calls a(), a() calls b() */
    void b(void);

    /* Report a syntax error and stop. */
    void error(const char *msg) {
        printf("Error - found character %c - %s\n", ch, msg);
        exit(1);
    }

    /* B -> y */
    void b(void) {
        if (ch == 'y') ch = getchar();
        else error("y expected");
    }

    /* A -> x A | B */
    void a(void) {
        if (ch == 'x') {
            ch = getchar();
            a();
        }
        else b();
    }

    /* S -> A z | z */
    void s(void) {
        if (ch == 'z') ch = getchar();
        else {
            a();
            if (ch != 'z') error("z expected");
            else ch = getchar();
        }
        printf("Success!\n");
    }

    int main(int argc, char *argv[]) {
        ch = getchar();
        s();
        return 0;
    }

Example runs (input is read from standard input):

    $ ./simpletopdown
    xyz
    Success!
    $ ./simpletopdown
    xxxxyz
    Success!
    $ ./simpletopdown
    xxxxz
    Error - found character z - y expected
    $ ./simpletopdown
    z
    Success!

Source: A Practical Approach to Compiler Construction (Undergraduate Topics in Computer Science)

Compiler construction tools
- yacc
- flex
- bison
Calculator example on terminal

What you should be able to answer
- What is a compiler?
- What are the building blocks of a compiler?
- How is Java different from C?
- How to navigate compiler pages: llvm.org || gcc.gnu.org
- How to make your own compiler (more or less).

ML and Compilers
ML accelerator hardware:
- GPUs
- Google Coral Edge TPU
- Apple Neural Engine
- Custom-made FPGA accelerators
- …
The field is open for disruption.
- Custom accelerator architectures require custom compilers.
- Compilers must be able to optimize ML code.

Find my notes on compilers on github https://github.com/gpetrousov/compilers
