Slide 1

Slide 1 text

Creating a language using only assembly language
Kernel/VM Tanken-tai #11
Koichi Nakamura

Slide 2

Slide 2 text

Code
• https://github.com/nineties/amber

Slide 3

Slide 3 text

Profile
• Koichi Nakamura
• twitter: @9_ties
• developing an IoT device
• http://idein.jp

Slide 4

Slide 4 text

I was a compiler writer
• wrote compilers as student lab projects
• a MinCaml compiler in OCaml
• a MinCaml compiler in Haskell: https://github.com/nineties/Choco
• studied optimizing compilers at graduate school
• wrote compilers for special-purpose CPUs

Slide 5

Slide 5 text

Wanted to create my own language
• name: “Amber”
• It was “rowl” at first.
• I wanted to enjoy the creation process itself.
• How could I?

Slide 6

Slide 6 text

Let’s play with limitations
1. Use assembly language only (no high-level languages like C).
2. No libraries (libc etc.).
3. No code generators (flex/bison etc.).

Slide 7

Slide 7 text

Strategy: Bootstrapping
• Write language 1 in assembly language
• Write a slightly higher-level language 2 in language 1
• Write Amber in language k
• Write Amber in Amber
here now

Slide 8

Slide 8 text

What’s the point?
• For fun.
• To cultivate knowledge, techniques and know-how of compiler writing.
• But it’s not a cost-effective way to study...
• To feel a sense of gratitude and respect for predecessors.

Slide 9

Slide 9 text

I’ll show the outline of my development process.

Slide 10

Slide 10 text

1. Created “rowl0” in assembly language

Slide 11

Slide 11 text

Made a language slightly higher-level than assembly
• language name: rowl0
• compiler name: rlc

Slide 12

Slide 12 text

From regular expressions of tokens

Slide 13

Slide 13 text

Wrote a state transition diagram

Slide 14

Slide 14 text

Converted it to a jump table

Slide 15

Slide 15 text

And wrote the lexer
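
The path from regular expressions to a state transition diagram to a jump table can be illustrated with a small table-driven lexer. This is only a sketch in Python, not the rlc lexer itself (which is written in assembly): the token classes, the state names and the `TRANSITIONS` table are assumptions made up for the example.

```python
# Hypothetical token rules: int = [0-9]+, ident = [A-Za-z_][A-Za-z0-9_]*
START, IN_INT, IN_IDENT = range(3)

def char_class(ch):
    """Map a character to the column of the transition table."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "alpha"
    return "other"

# TRANSITIONS[state][class] -> next state; a missing entry ends the token.
TRANSITIONS = {
    START: {"digit": IN_INT, "alpha": IN_IDENT},
    IN_INT: {"digit": IN_INT},
    IN_IDENT: {"alpha": IN_IDENT, "digit": IN_IDENT},
}
TOKEN_KIND = {IN_INT: "int", IN_IDENT: "ident"}

def tokenize(text):
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        state, start = START, pos
        # Follow the table until no transition exists for the next character.
        while pos < len(text):
            nxt = TRANSITIONS[state].get(char_class(text[pos]))
            if nxt is None:
                break
            state = nxt
            pos += 1
        if state == START:
            # No rule matched: emit the character as a one-character symbol.
            tokens.append(("symbol", text[pos]))
            pos += 1
        else:
            tokens.append((TOKEN_KIND[state], text[start:pos]))
    return tokens
```

In assembly the dictionary lookups would become indexed loads from a table of addresses, which is what makes the jump-table form convenient to write by hand.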

Slide 16

Slide 16 text

Wrote rowl0’s syntax in BNF

Slide 17

Slide 17 text

Then wrote the parser • recursive descent method

Slide 18

Slide 18 text

Generates code while parsing
• writing memory management is difficult at this stage.
• generates code without building syntax trees.
(diagram labels: parsing, code generation)
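
Generating code during recursive descent, with no syntax tree, can be sketched as follows. The arithmetic grammar and the stack-style output instructions (`push`, `add`, ...) are assumptions chosen for the example; rlc handles the rowl0 grammar and emits assembly.

```python
import re

def compile_expr(source):
    """Emit stack-machine instructions for an arithmetic expression in one pass."""
    tokens = re.findall(r"\d+|[-+*/()]", source)
    pos = 0
    out = []  # emitted instructions; no tree is ever built

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():  # expr ::= term (("+" | "-") term)*
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            out.append("add" if op == "+" else "sub")

    def term():  # term ::= factor (("*" | "/") factor)*
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            out.append("mul" if op == "*" else "div")

    def factor():  # factor ::= number | "(" expr ")"
        tok = advance()
        if tok == "(":
            expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok.isdigit():
            out.append(f"push {tok}")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return out
```

Each parsing function emits its instructions as soon as its operands have been parsed, so the only state kept is the token position and the call stack. That is why this style needs no heap allocation for tree nodes.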

Slide 19

Slide 19 text

Completed the first language “rowl0”!
• no symbol tables.
• function params must be p0,p1,p2,...
• to use local variables, allocate stack memory with “allocate(n)” then use x0,x1,x2,...

Slide 20

Slide 20 text

2. Created a LISP “rowl-core” in “rowl0”

Slide 21

Slide 21 text

Made a LISP temporarily
• language name: rowl-core
• interpreter name: rlci
• easy to implement
• productivity improvement

Slide 22

Slide 22 text

Wrote lexer and parser

Slide 23

Slide 23 text

Writing became more comfortable

Slide 24

Slide 24 text

Wrote eval

Slide 25

Slide 25 text

No memory management
• no malloc or free; mmap and munmap are the only functions
1. Does not recover garbage memory
2. Allocates fresh memory for new objects
3. So, it will die eventually
• As long as it can compile the next-generation compilers, that’s no problem.
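
The "allocate fresh memory, never reclaim" policy amounts to a bump allocator. The sketch below is an illustration only: the class name, the chunk size and the use of a `bytearray` as a stand-in for an mmap'ed region are assumptions, not details of rlci.

```python
class BumpAllocator:
    """Hands out fresh memory only; nothing is ever freed or reused."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = []          # each chunk stands in for one mmap'ed region
        self.offset = chunk_size  # forces a new chunk on the first allocation

    def allocate(self, size):
        if size > self.chunk_size:
            raise MemoryError("object larger than a chunk")
        if self.offset + size > self.chunk_size:
            # Current chunk is full: map a new one and start at its beginning.
            self.chunks.append(bytearray(self.chunk_size))
            self.offset = 0
        address = (len(self.chunks) - 1, self.offset)
        self.offset += size       # bump the pointer; there is no free()
        return address
```

Memory use grows monotonically, so a long-running program eventually exhausts memory. For a bootstrap interpreter whose only job is to compile the next stage, the process exits before that matters.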

Slide 26

Slide 26 text

Completed a LISP “rowl-core”!
• rich functions
• lambda, map etc.
• macros

Slide 27

Slide 27 text

3. Created a language for writing the “VM” in “rowl-core”

Slide 28

Slide 28 text

Decided to create a VM for the next generation
• Created a language just for writing the virtual machine.
• Defined it as a DSL in the LISP “rowl-core”
• No need to write a lexer or parser!

Slide 29

Slide 29 text

Wrote the compiler like this

Slide 30

Slide 30 text

Now I could use higher-order functions • productivity was improved a lot

Slide 31

Slide 31 text

4. Created a virtual machine “rlvm” in the DSL

Slide 32

Slide 32 text

Wrote the VM’s code in the DSL like this

Slide 33

Slide 33 text

Wrote a garbage collector
• Copying GC
• Cheney’s algorithm
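
Cheney’s algorithm copies live objects breadth-first into a new space, using the to-space itself as the work queue. The following is a minimal sketch under simplifying assumptions: the heap is a Python list of objects, an object is a list of fields, and a pointer is a hypothetical `Ref(index)` value. The real rlvm collector works on raw memory and is not shown in the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    """A pointer to the object stored at heap[index]."""
    index: int

def collect(from_space, roots):
    """Copy everything reachable from roots; return (to_space, new_roots)."""
    to_space = []
    forwarding = {}  # from-space index -> to-space index

    def copy(ref):
        # Copy an object the first time it is reached, then reuse its new address.
        if ref.index not in forwarding:
            forwarding[ref.index] = len(to_space)
            to_space.append(list(from_space[ref.index]))
        return Ref(forwarding[ref.index])

    new_roots = [copy(root) for root in roots]

    # Objects between `scan` and the end of to_space are copied but not yet
    # fixed up. Scanning them may append more objects, which extends the queue.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        for i, field in enumerate(obj):
            if isinstance(field, Ref):
                obj[i] = copy(field)
        scan += 1
    return to_space, new_roots
```

Because the queue is the to-space itself, the collector needs no recursion and no separate mark stack, and cycles are handled by the forwarding table.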

Slide 34

Slide 34 text

Wrote primitive functions

Slide 35

Slide 35 text

An application of meta-programming • The table of instructions of the VM

Slide 36

Slide 36 text

Generates various code from the table
• reflects changes to the instructions automatically
• It is very easy to build this kind of mechanism with LISP
(diagram: vm_instructions feeds the eval loop of the VM, the linker, the disassembler, the assembler, and the assembler used internally in Amber)
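
The idea of deriving several tools from one instruction table can be sketched in a few lines. The three-instruction table, the integer encoding and the function names below are assumptions for illustration; rlvm’s real table has 186 instructions and is processed by LISP macros.

```python
def _push(stack, value):
    stack.append(value)

def _add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def _mul(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)

# Single source of truth: (mnemonic, operand count, handler).
# The opcode is simply the position in the table.
INSTRUCTIONS = [("push", 1, _push), ("add", 0, _add), ("mul", 0, _mul)]
OPCODE = {name: i for i, (name, _, _) in enumerate(INSTRUCTIONS)}

def assemble(lines):
    code = []
    for line in lines:
        name, *args = line.split()
        opcode = OPCODE[name]
        if len(args) != INSTRUCTIONS[opcode][1]:
            raise ValueError(f"wrong operand count for {name}")
        code.append(opcode)
        code.extend(int(arg) for arg in args)
    return code

def disassemble(code):
    lines, pc = [], 0
    while pc < len(code):
        name, arity, _ = INSTRUCTIONS[code[pc]]
        operands = code[pc + 1:pc + 1 + arity]
        lines.append(" ".join([name] + [str(x) for x in operands]))
        pc += 1 + arity
    return lines

def run(code):
    stack, pc = [], 0
    while pc < len(code):
        _, arity, handler = INSTRUCTIONS[code[pc]]
        handler(stack, *code[pc + 1:pc + 1 + arity])
        pc += 1 + arity
    return stack
```

Adding a row to `INSTRUCTIONS` updates the assembler, the disassembler and the eval loop at once, which is the property the slide describes.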

Slide 37

Slide 37 text

Wrote instruction sets

Slide 38

Slide 38 text

Floating-point arithmetic

Slide 39

Slide 39 text

Multi-precision integer arithmetic

Slide 40

Slide 40 text

Exception handling

Slide 41

Slide 41 text

Delimited Continuation

Slide 42

Slide 42 text

Completed the virtual machine “rlvm”!
• 186 instructions
• stack machine
• copying GC
• exception handling
• shift/reset delimited continuation
• floating-point arithmetic, multi-precision arithmetic

Slide 43

Slide 43 text

5. Created a tool chain for “rlvm”

Slide 44

Slide 44 text

There were no programming tools for “rlvm”
• Created a tool chain for the VM
• a programming language “rowl1”
• its compiler
• assembler
• disassembler
• linker

Slide 45

Slide 45 text

Wrote “rowl1”, the assembler and the compiler
• Defined as a DSL in “rowl-core”

Slide 46

Slide 46 text

Wrote the linker and disassembler
• Wrote these tools in “rowl1”, so they run on “rlvm”
• The linker requires GC since it uses a lot of memory

Slide 47

Slide 47 text

Example outputs of the disassembler

Slide 48

Slide 48 text

Ready to program on “rlvm”!
• writing programs for rlvm
• disassembling bytecode
• supports separate compilation
• Reached the starting line

Slide 49

Slide 49 text

6. Wrote “Amber” in “rowl1”

Slide 50

Slide 50 text

Started developing “Amber”
• dynamic scripting language
• instance-based object-oriented system
• runs on rlvm

Slide 51

Slide 51 text

Wrote an assembler
• The former assembler assembles code ahead of time and runs on rlci
• This assembler assembles code just in time and runs on rlvm
• fills in addresses by backpatching
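
Backpatching lets a one-pass assembler emit a jump before the target address is known: it leaves a hole, records where the hole is, and fills it once the label has been defined. The sketch below assumes a made-up textual syntax (`name:` for labels, space-separated operands) and a flat list as the code buffer; it is not Amber’s internal assembler.

```python
def assemble_with_labels(lines):
    """One-pass assembly; forward label references are patched at the end."""
    code = []
    labels = {}   # label name -> address in `code`
    fixups = []   # (position of the hole in `code`, label name)

    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = len(code)
            continue
        op, *args = line.split()
        code.append(op)
        for arg in args:
            if arg.lstrip("-").isdigit():
                code.append(int(arg))
            elif arg in labels:
                code.append(labels[arg])       # backward reference: known already
            else:
                fixups.append((len(code), arg))
                code.append(None)              # hole to be filled later

    for position, name in fixups:
        if name not in labels:
            raise KeyError(f"undefined label: {name}")
        code[position] = labels[name]          # backpatch the hole
    return code
```

For example, `assemble_with_labels(["jump end", "push 1", "end:", "halt"])` first emits `jump` with a hole, then fills the hole with address 4 once `end` is seen.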

Slide 52

Slide 52 text

Wrote the object system
• slots, messages and parent delegation
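
An instance-based object system with slots, messages and parent delegation can be modeled roughly as below. The class name `Obj` and the methods `lookup` and `send` are hypothetical names for this sketch; Amber’s actual object layout and message protocol are not described in the slides.

```python
class Obj:
    """An object is a bag of slots plus an optional parent to delegate to."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def lookup(self, name):
        # Walk the parent chain until some object has the slot.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def send(self, name, *args):
        # A message finds a slot by delegation; callable slots act as methods
        # and always receive the original receiver, not the object that
        # happened to hold the slot.
        value = self.lookup(name)
        if callable(value):
            return value(self, *args)
        return value
```

A short usage example: with `point = Obj(x=1, y=2, total=lambda self: self.send("x") + self.send("y"))` and `child = Obj(parent=point, x=10)`, the call `child.send("total")` finds `total` on the parent but reads `x` from the child.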

Slide 53

Slide 53 text

Wrote Amber’s core features on top of the object system
• dynamic pattern-matching engine
• mechanism of partial function fusion

Slide 54

Slide 54 text

Wrote the compiler
• Made the Amber compiler one of the Amber objects
(diagram: Amber’s core system consists of the VM, the object system, the pattern-matching engine and the compiler; labels: matching of syntax tree, resource management)

Slide 55

Slide 55 text

Wrote closure-conversion

Slide 56

Slide 56 text

Wrote parsers
• compiles parsers at run-time
• each parser is a usual Amber object (closure)
(diagram: Amber core system consists of the VM, the object system, the pattern-matching engine and the compiler; label: compile parsers)

Slide 57

Slide 57 text

Very simple syntax
1. literals are expressions
2. for a symbol h and expressions e1,...,en (n>=0), h{e1, ..., en} is an expression
3. there is no other form of Amber expression

Slide 58

Slide 58 text

Used the Packrat parsing method
• scannerless
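
A scannerless Packrat parser for the `h{e1, ..., en}` form can be sketched as follows. Packrat parsing is recursive descent over a PEG with every (rule, position) result memoized; here `functools.lru_cache` provides the memo table. The literal forms (decimal integers and double-quoted strings), the symbol pattern, the whitespace handling and the tuple representation of nodes are all assumptions for this example, not Amber’s actual rules.

```python
import re
from functools import lru_cache

SYMBOL = re.compile(r"[A-Za-z_]\w*")
LITERAL = re.compile(r'\d+|"[^"]*"')
SPACES = re.compile(r"\s*")

def parse(text):
    """Parse one expression; nodes are returned as (head, (args...)) tuples."""

    def skip_ws(pos):
        return SPACES.match(text, pos).end()

    @lru_cache(maxsize=None)  # memo table: one entry per (rule, position)
    def expr(pos):            # expr <- ws (node / literal) ws
        pos = skip_ws(pos)
        result = node(pos) or literal(pos)
        if result is None:
            return None
        value, pos = result
        return value, skip_ws(pos)

    @lru_cache(maxsize=None)
    def node(pos):            # node <- symbol "{" (expr ("," expr)*)? "}"
        m = SYMBOL.match(text, pos)
        if not m or text[m.end():m.end() + 1] != "{":
            return None
        pos = m.end() + 1
        args = []
        first = expr(pos)
        if first is None:
            pos = skip_ws(pos)
        else:
            args.append(first[0])
            pos = first[1]
            while text[pos:pos + 1] == ",":
                nxt = expr(pos + 1)
                if nxt is None:
                    return None
                args.append(nxt[0])
                pos = nxt[1]
        if text[pos:pos + 1] != "}":
            return None
        return (m.group(), tuple(args)), pos + 1

    @lru_cache(maxsize=None)
    def literal(pos):         # literal <- integer / string
        m = LITERAL.match(text, pos)
        if not m:
            return None
        tok = m.group()
        return (tok[1:-1] if tok.startswith('"') else int(tok)), m.end()

    result = expr(0)
    if result is None or result[1] != len(text):
        raise SyntaxError("invalid expression")
    return result[0]
```

There is no separate token stream: each rule matches characters directly, which is what "scannerless" refers to. The memo table guarantees that each rule is evaluated at most once per input position, even when alternatives backtrack.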

Slide 59

Slide 59 text

Encoding/decoding floating-point literals was difficult
• wrote them myself because of the “no libc” limitation (no strtod, sprintf)
• they require the multi-precision integer arithmetic which I wrote before
• example: “3.14” is 0x40091eb851eb851f
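
Why decoding a literal needs multi-precision integers can be seen from a correctly rounded decimal-to-double conversion. The sketch below uses Python’s built-in big integers in place of a hand-written bignum library and handles only positive decimals without an exponent whose value is a normal double. These restrictions and the function name are assumptions for the example, not a description of Amber’s routine.

```python
import struct

def decimal_to_double_bits(text):
    """Return the IEEE 754 binary64 bit pattern nearest to a decimal string."""
    whole, _, frac = text.partition(".")
    numerator = int(whole + frac)      # "3.14" -> 314
    denominator = 10 ** len(frac)      # "3.14" -> 100
    if numerator == 0:
        raise ValueError("zero is not handled in this sketch")

    # Choose exponent so that 2**52 <= value / 2**exponent < 2**53.
    exponent = numerator.bit_length() - denominator.bit_length() - 52

    def scaled(e):
        # Exact fraction n/d equal to value / 2**e, using integers only.
        if e >= 0:
            return numerator, denominator << e
        return numerator << -e, denominator

    n, d = scaled(exponent)
    if n < (d << 52):
        exponent -= 1
        n, d = scaled(exponent)

    mantissa, remainder = divmod(n, d)
    # Round to nearest, ties to even.
    if 2 * remainder > d or (2 * remainder == d and mantissa & 1):
        mantissa += 1
    if mantissa == 1 << 53:            # rounding overflowed into the next binade
        mantissa >>= 1
        exponent += 1

    biased = exponent + 52 + 1023
    return (biased << 52) | (mantissa - (1 << 52))

def reference_bits(value):
    """Bit pattern produced by the host's own float conversion, for comparison."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]

print(hex(decimal_to_double_bits("3.14")))  # 0x40091eb851eb851f
```

The intermediate values `n` and `d` exceed 64 bits even for short literals, which is why exact rounding cannot be done with machine integers alone.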

Slide 60

Slide 60 text

The Amber interpreter is complete!
• dynamic scripting language
• runs on rlvm
• instance-based object-oriented system
• dynamic pattern-matching engine
• partial function fusion
• lexical closure
• I got a modern programming language!

Slide 61

Slide 61 text

7. Created Amber’s standard library

Slide 62

Slide 62 text

Amber has strong self-extensibility
• Amber’s simple syntax is extended in a standard library
• amber/lib/syntax/parse.ab
• Builds its syntax during the boot sequence

Slide 63

Slide 63 text

Only has a very simple syntax at first
• used string literals for comments because there is no syntax for comments yet

Slide 64

Slide 64 text

Defines a syntax for defining syntaxes

Slide 65

Slide 65 text

Defines Amber’s syntax with the syntax

Slide 66

Slide 66 text

Builds a macro system

Slide 67

Slide 67 text

Gives meanings to syntaxes by macros

Slide 68

Slide 68 text

Now Amber has a rich syntax

Slide 69

Slide 69 text

Extends the object system

Slide 70

Slide 70 text

Now Amber has a rich object system
• Inheritance, mix-in etc.

Slide 71

Slide 71 text

Development is currently suspended
• No plans for further updates
• Try the following commands to invoke the Amber shell (Linux only)
• See the output of the make command
  % git clone https://github.com/nineties/amber.git
  % cd amber
  % make; sudo make install
  % amber

Slide 72

Slide 72 text

Summary
(diagram: rowl0 and its compiler rlc; rowl-core and its interpreter rlci; rowl-core as a language for writing the VM; rlvm; rowl1 with its compiler, linker and disassembler; the Amber interpreter; edges labeled impl., run and self-extension; legend: language, tool)
• I managed to reach a relatively high-level language. I feel satisfied.
