I was a compiler writer
• wrote compilers in student lab courses
• MinCaml compiler in OCaml
• MinCaml compiler in Haskell
https://github.com/nineties/Choco
• studied optimizing compilers at graduate school
• wrote compilers for special purpose CPUs
Slide 5
Slide 5 text
Wanted to create my own language
• name: “Amber”
• It was called “rowl” at first.
• I wanted to enjoy the creation process itself.
• How could I?
Slide 6
Slide 6 text
Let’s play with limitations
1. Use assembly language only (no high-level languages like C).
2. No libraries (libc etc.).
3. No code generators (flex/bison etc.).
Slide 7
Slide 7 text
Strategy: Bootstrapping
Write language 1 in assembly language
Write a slightly higher-level language 2 in language 1
Write Amber in language k
Write Amber in Amber (here now)
Slide 8
Slide 8 text
What’s the point?
• For fun.
• To cultivate knowledge, techniques and know-how of compiler writing.
• But it’s not a cost-effective way to study...
• To feel a sense of gratitude and respect for predecessors.
Slide 9
Slide 9 text
I’ll show the outline of my development process.
Slide 10
Slide 10 text
1. Created “rowl0” in assembly language
Slide 11
Slide 11 text
Made a language slightly higher-level than assembly.
• language name: rowl0
• compiler name: rlc
Slide 12
Slide 12 text
From regular expressions of tokens
Slide 13
Slide 13 text
Wrote a state transition diagram
Slide 14
Slide 14 text
Converted to jump table
Slide 15
Slide 15 text
And wrote the lexer
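The regular expressions, state diagram and jump table above can be sketched as a table-driven lexer. This is a minimal illustration in Python, not rowl0's actual assembly code; the token kinds, the state names and the `tokenize` function are assumptions. The dictionary lookup plays the role the jump table would play in assembly.

```python
# Character classes index the columns of the transition table.
def char_class(c):
    if c.isalpha() or c == "_":
        return "alpha"
    if c.isdigit():
        return "digit"
    if c.isspace():
        return "space"
    return "other"

# TABLE[state][class] -> next state; a missing entry means "token ends here".
TABLE = {
    "start": {"alpha": "ident", "digit": "int"},
    "ident": {"alpha": "ident", "digit": "ident"},
    "int": {"digit": "int"},
}

def tokenize(src):
    tokens, i = [], 0
    while i < len(src):
        if src[i].isspace():
            i += 1
            continue
        state, begin = "start", i
        # Follow transitions for as long as the table has an entry.
        while i < len(src) and char_class(src[i]) in TABLE[state]:
            state = TABLE[state][char_class(src[i])]
            i += 1
        if state == "start":          # no transition: single-character symbol
            state, i = "symbol", i + 1
        tokens.append((state, src[begin:i]))
    return tokens
```

For example, `tokenize("x1 = 42;")` yields an identifier, a symbol, an integer and another symbol.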
Slide 16
Slide 16 text
Wrote rowl0’s syntax by BNF
Slide 17
Slide 17 text
Then wrote the parser
• recursive descent method
Slide 18
Slide 18 text
Generates code while parsing
• generates code without building syntax trees.
• writing memory management is difficult at this stage.
[Figure labels: parsing, code generation]
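A rough sketch of generating code during recursive descent, with no syntax tree in between. The expression grammar, the stack-machine mnemonics (`push`, `add`, ...) and the name `compile_expr` are assumptions chosen for illustration; rowl0's real grammar and output are not shown in the slides.

```python
import re

def compile_expr(src):
    """Parse an arithmetic expression and emit stack-machine code in one pass."""
    tokens = re.findall(r"\d+|[-+*/()]", src)
    pos = 0
    code = []            # emitted instructions; no syntax tree is ever built

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():          # expr := term (('+' | '-') term)*
        term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            term()
            code.append("add" if op == "+" else "sub")

    def term():          # term := factor (('*' | '/') factor)*
        factor()
        while peek() in ("*", "/"):
            op = peek()
            advance()
            factor()
            code.append("mul" if op == "*" else "div")

    def factor():        # factor := INT | '(' expr ')'
        tok = peek()
        if tok == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif tok is not None and tok.isdigit():
            advance()
            code.append(f"push {tok}")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    expr()
    if peek() is not None:
        raise SyntaxError(f"trailing token {peek()!r}")
    return code
```

Each parsing function emits its instruction as soon as its operands have been parsed, so the output comes out in postfix order.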
Slide 19
Slide 19 text
Completed the first language “rowl0”!
• no symbol tables.
• function parameters must be named p0, p1, p2, ...
• to use local variables, allocate stack memory with “allocate(n)”, then use x0, x1, x2, ...
Slide 20
Slide 20 text
2. Created a LISP “rowl-core” in “rowl0”
Slide 21
Slide 21 text
Made a temporary LISP
• language name: rowl-core
• interpreter name: rlci
• easy to implement
• productivity improvement
Slide 22
Slide 22 text
Wrote lexer and parser
Slide 23
Slide 23 text
Writing became more comfortable
Slide 24
Slide 24 text
Wrote eval
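As a rough idea of what an eval function for a small LISP involves, here is a sketch over nested Python lists. The special forms (`quote`, `if`, `lambda`) and the contents of `GLOBAL_ENV` are assumptions; rowl-core's own eval was written in rowl0 and is not reproduced here.

```python
def evaluate(x, env):
    """Evaluate an s-expression given as nested Python lists."""
    if isinstance(x, str):                 # symbol: look it up
        return env[x]
    if not isinstance(x, list):            # number: self-evaluating
        return x
    head = x[0]
    if head == "quote":
        return x[1]
    if head == "if":
        _, cond, then, alt = x
        return evaluate(then if evaluate(cond, env) else alt, env)
    if head == "lambda":
        _, params, body = x
        # The closure captures env; arguments shadow outer bindings.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    fn = evaluate(head, env)               # function application
    return fn(*[evaluate(arg, env) for arg in x[1:]])

GLOBAL_ENV = {"+": lambda a, b: a + b, "<": lambda a, b: a < b}
```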
Slide 25
Slide 25 text
No memory management
• mmap and munmap are the only memory functions (there is no malloc or free)
1. Does not recover garbage memory
2. Allocates fresh memory for new objects
3. So, it will die eventually
• As long as it can compile the next-generation compilers, that’s no problem.
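The "allocate and never free" policy can be sketched as a bump allocator over one anonymous mapping. This is only an illustration using Python's `mmap` module; the class name `BumpAllocator`, the 8-byte alignment and the fixed heap size are assumptions, not details from rlci.

```python
import mmap

class BumpAllocator:
    """Hands out fresh memory and never reclaims it."""

    def __init__(self, size):
        self.heap = mmap.mmap(-1, size)    # anonymous mapping, as with mmap(2)
        self.size = size
        self.top = 0                       # offset of the next free byte

    def alloc(self, nbytes):
        nbytes = (nbytes + 7) & ~7         # round up to keep 8-byte alignment
        if self.top + nbytes > self.size:
            raise MemoryError("heap exhausted; there is no garbage collection")
        offset = self.top
        self.top += nbytes
        return offset
```

Allocation is a single addition, which is why this is acceptable for a short-lived bootstrap interpreter.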
Slide 26
Slide 26 text
Completed a LISP “rowl-core”!
• rich functions
• lambda, map etc.
• macros
Slide 27
Slide 27 text
3. Created a language for writing the “VM” in “rowl-core”
Slide 28
Slide 28 text
Decided to create a VM for the next generation
• Created a language just for writing the virtual machine.
• Defined it as a DSL in the LISP “rowl-core”
• No need to write a lexer and parser!
Slide 29
Slide 29 text
Wrote the compiler like this
Slide 30
Slide 30 text
Now I could use higher-order functions
• productivity was improved a lot
Slide 31
Slide 31 text
4. Created a virtual machine “rlvm” in the DSL
Slide 32
Slide 32 text
Wrote the VM’s code in the DSL like this
Slide 33
Slide 33 text
Wrote a garbage collector
• Copying GC
• Cheney’s algorithm
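A compact sketch of Cheney's algorithm on a toy heap. The object layout (a list of fields, with `("ref", index)` tuples as pointers) is an assumption made for readability, and a dictionary stands in for the forwarding pointers that a real collector would store inside the from-space objects.

```python
def cheney_collect(from_space, roots):
    """Copy everything reachable from `roots` into a fresh to-space.

    An object is a list of fields; a field is an int or a ("ref", index) pair
    pointing at another object in the same space.
    """
    to_space = []
    forward = {}                           # from-space index -> to-space index

    def copy(ref):
        _, old = ref
        if old not in forward:             # first visit: copy shallowly
            forward[old] = len(to_space)
            to_space.append(list(from_space[old]))
        return ("ref", forward[old])

    new_roots = [copy(r) for r in roots]
    scan = 0                               # objects before `scan` are fully fixed up
    while scan < len(to_space):            # to_space doubles as the work queue
        obj = to_space[scan]
        for i, field in enumerate(obj):
            if isinstance(field, tuple):
                obj[i] = copy(field)
        scan += 1
    return to_space, new_roots
```

The scan pointer chasing the end of to-space is what makes the traversal breadth-first without needing a separate stack or queue.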
Slide 34
Slide 34 text
Wrote primitive functions
Slide 35
Slide 35 text
An application of meta-programming
• The table of instructions of the VM
Slide 36
Slide 36 text
Generates various code from the table
• reflects changes to instructions automatically
• It is very easy to build this kind of mechanism with LISP
[Diagram: the vm_instructions table generates the eval loop of the VM, the linker, the disassembler, the assembler, and the assembler used internally in Amber]
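The idea of deriving several tools from one instruction table can be sketched as follows. The four instructions, their opcodes and the one-byte operand encoding are made up for this example; rlvm's real instruction set is not listed in the slides.

```python
# One table row per instruction: (mnemonic, opcode, number of operand bytes).
VM_INSTRUCTIONS = [
    ("halt", 0x00, 0),
    ("push", 0x01, 1),
    ("add",  0x02, 0),
    ("jump", 0x03, 1),
]

# Both lookup tables are derived from the single table above.
BY_NAME = {name: (op, n) for name, op, n in VM_INSTRUCTIONS}
BY_OPCODE = {op: (name, n) for name, op, n in VM_INSTRUCTIONS}

def assemble(lines):
    out = bytearray()
    for line in lines:
        name, *operands = line.split()
        opcode, n = BY_NAME[name]
        if len(operands) != n:
            raise ValueError(f"{name} takes {n} operand(s)")
        out.append(opcode)
        out.extend(int(x) for x in operands)
    return bytes(out)

def disassemble(code):
    lines, pc = [], 0
    while pc < len(code):
        name, n = BY_OPCODE[code[pc]]
        operands = code[pc + 1 : pc + 1 + n]
        lines.append(" ".join([name, *map(str, operands)]))
        pc += 1 + n
    return lines
```

Adding a row to `VM_INSTRUCTIONS` updates the assembler and the disassembler at once, which is the property the slide describes.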
There were no programming tools for “rlvm”
• Created a tool chain for the VM
• a programming language “rowl1”
• its compiler
• assembler
• disassembler
• linker
Slide 45
Slide 45 text
Wrote “rowl1”, assembler and compiler
• Defined as a DSL in “rowl-core”
Slide 46
Slide 46 text
Wrote linker and disassembler
• Wrote these tools in “rowl1”, so they run on “rlvm”
• The linker requires GC since it uses a lot of memory
Slide 47
Slide 47 text
Example outputs of the disassembler
Slide 48
Slide 48 text
Ready to program on “rlvm”!
• writing programs for rlvm
• disassembling of byte-codes
• supports separate compilation
• Reached the starting line
Slide 49
Slide 49 text
6. Wrote “Amber” in “rowl1”
Slide 50
Slide 50 text
Started developing “Amber”
• dynamic scripting language
• instance-based object-oriented system
• runs on rlvm
Slide 51
Slide 51 text
Wrote an assembler
• The former assembler assembles code ahead of time and runs on rlci
• This assembler assembles code just in time and runs on rlvm
• fills in addresses by backpatching
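Backpatching can be sketched as a single pass that leaves holes for labels it has not seen yet and fills them in afterwards. The textual instruction format, the `label:` syntax and the function name `assemble_with_labels` are assumptions for illustration only.

```python
def assemble_with_labels(lines):
    """Single-pass assembly; forward jump targets are patched once known."""
    code = []
    labels = {}        # label name -> address
    pending = []       # (index of placeholder in code, label name)
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = len(code)
            continue
        op, *args = line.split()
        code.append(op)
        for arg in args:
            if arg.isdigit():
                code.append(int(arg))
            else:
                pending.append((len(code), arg))
                code.append(None)          # placeholder until the label is defined
    for index, name in pending:            # backpatch every recorded hole
        code[index] = labels[name]
    return code
```

Here a jump to a label defined later in the input still gets the right address, without a separate first pass to collect labels.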
Slide 52
Slide 52 text
Wrote the object system
• slots, messages and parent delegation
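Slots, messages and parent delegation can be illustrated with a small prototype-style object. The class `Obj` and its `lookup` and `send` methods are hypothetical names for this sketch and do not describe Amber's actual object representation.

```python
class Obj:
    """A prototype-style object: named slots plus an optional parent."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def lookup(self, name):
        obj = self
        while obj is not None:             # walk the delegation chain
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(f"message not understood: {name}")

    def send(self, name, *args):
        value = self.lookup(name)
        # Methods receive the original receiver, not the object that holds them.
        return value(self, *args) if callable(value) else value
```

Passing the original receiver to an inherited method is what distinguishes delegation from simply calling the parent's function.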
Slide 53
Slide 53 text
Wrote Amber’s core features on top of the object system
• dynamic pattern-matching engine
• mechanism of partial function fusion
Slide 54
Slide 54 text
Wrote the compiler
• Made the Amber compiler one of the Amber objects
[Diagram: Amber’s core system: VM, object system, pattern-matching engine, compiler. Labels: matching of syntax tree, resource management]
Slide 55
Slide 55 text
Wrote closure-conversion
Slide 56
Slide 56 text
Wrote parsers
• compiles parsers at run-time
• each parser is an ordinary Amber object (closure)
[Diagram: Amber core system (VM, object system, pattern-matching engine, compiler); the compiler compiles the parsers]
Slide 57
Slide 57 text
very simple syntax
1. literals are expressions
2. for a symbol h and expressions e1,..,en (n>=0),
h{e1, ..., en} is an expression
3. there are no other forms of Amber expressions
Slide 58
Slide 58 text
Used Packrat parsing method
• scannerless
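A small sketch of a scannerless, memoized (packrat-style) parser for the `h{e1, ..., en}` form. It assumes integer literals only and alphabetic symbols, and returns `(head, args)` tuples; these choices and the function name `parse` are assumptions, not Amber's implementation.

```python
from functools import lru_cache

def parse(src):
    """Parse `h{e1, ..., en}` expressions; returns nested (head, args) tuples."""

    def skip(i):                           # scannerless: whitespace is handled inline
        while i < len(src) and src[i].isspace():
            i += 1
        return i

    @lru_cache(maxsize=None)               # packrat memo: one result per position
    def expr(i):
        i = skip(i)
        j = i
        while j < len(src) and src[j].isdigit():
            j += 1
        if j > i:                          # rule 1: integer literal
            return int(src[i:j]), j
        while j < len(src) and (src[j].isalpha() or src[j] == "_"):
            j += 1
        if j == i or skip(j) >= len(src) or src[skip(j)] != "{":
            return None
        head, args, k = src[i:j], [], skip(j) + 1
        if src[skip(k):skip(k) + 1] != "}":     # non-empty argument list
            while True:
                result = expr(k)
                if result is None:
                    return None
                arg, k = result
                args.append(arg)
                k = skip(k)
                if src[k:k + 1] != ",":
                    break
                k += 1
        k = skip(k)
        if src[k:k + 1] != "}":
            return None
        return (head, tuple(args)), k + 1   # rule 2: h{e1, ..., en}

    result = expr(0)
    if result is None or skip(result[1]) != len(src):
        raise SyntaxError("parse error")
    return result[0]
```

With a grammar this small there is almost no backtracking, so the memo table does little work; it starts to matter once ordered choices retry the same input position.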
Slide 59
Slide 59 text
Encoding/decoding floating-point literals was difficult
• wrote the conversions myself because of the “no libc” limitation (no strtod, sprintf)
• they require multi-precision integer arithmetic, which I had written before
“3.14” ⇔ 0x40091eb851eb851f
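To show why multi-precision integers are needed, here is a sketch of decoding a decimal literal into IEEE 754 binary64 bits using integer arithmetic only. Python's built-in big integers stand in for the hand-written multi-precision library; the function name is an assumption, and only positive values in the normal range without an exponent part are handled.

```python
def decimal_to_double_bits(text):
    """Convert a positive decimal literal such as "3.14" to IEEE 754 binary64 bits."""
    whole, _, frac = text.partition(".")
    num = int(whole + frac)                # all digits as one big integer
    den = 10 ** len(frac)                  # the value is exactly num / den
    if num == 0:
        return 0
    exp = 0                                # invariant: value == (num / den) * 2**exp
    while num >= den << 53:                # too large: halve the ratio
        den <<= 1
        exp += 1
    while num < den << 52:                 # too small: double the ratio
        num <<= 1
        exp -= 1
    q, r = divmod(num, den)                # q now has exactly 53 significant bits
    if 2 * r > den or (2 * r == den and q & 1):
        q += 1                             # round to nearest, ties to even
    if q == 1 << 53:                       # rounding overflowed the significand
        q >>= 1
        exp += 1
    biased = exp + 52 + 1023               # exponent of the leading bit, plus bias
    return (biased << 52) | (q & ((1 << 52) - 1))
```

`hex(decimal_to_double_bits("3.14"))` gives `0x40091eb851eb851f`, the value shown on the slide.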
Slide 60
Slide 60 text
The Amber interpreter is complete!
• dynamic scripting language
• runs on rlvm
• instance-based object-oriented system
• dynamic pattern-matching engine
• partial function fusion
• lexical closures
• I got a modern programming language!
Slide 61
Slide 61 text
7. Created Amber’s standard library
Slide 62
Slide 62 text
Amber has strong self-extensibility
• Amber’s simple syntax is extended in the standard library
• amber/lib/syntax/parse.ab
• It builds its syntax during the boot sequence
Slide 63
Slide 63 text
Only has a very simple syntax at first
• used string literals for comments because there is no syntax for comments
Slide 64
Slide 64 text
Defines a syntax for defining syntaxes
Slide 65
Slide 65 text
Defines Amber’s syntax with the syntax
Slide 66
Slide 66 text
Builds macro system
Slide 67
Slide 67 text
Gives meanings to syntaxes by macros
Slide 68
Slide 68 text
Now Amber has a rich syntax
Slide 69
Slide 69 text
Extends object system
Slide 70
Slide 70 text
Now Amber has a rich object system
• Inheritance, mix-ins etc.
Slide 71
Slide 71 text
Development is now suspended
• No plans for further updates
• Try the following commands to launch the Amber shell (Linux only)
• See the output of the make command
% git clone https://github.com/nineties/amber.git
% cd amber
% make; sudo make install
% amber
Slide 72
Slide 72 text
Summary
[Diagram: the bootstrapping chain. Languages: rowl0, rowl-core, lang. for writing VM, rowl1, Amber. Tools: as, rlc, rlci, compilers, rlvm, linker, disassembler, Amber interpreter. Edges: impl., run, self-extension]
• I managed to reach a relatively high-level language. I feel satisfied.