Creating a language using only assembly language

Koichi Nakamura

June 11, 2015

Transcript

  1. Creating a language
    using only
    assembly language.
    Kernel/VM Tanken-tai #11
    Koichi Nakamura

  2. Code
    • https://github.com/nineties/amber

  3. Profile
    • Koichi Nakamura
    • twitter: @9_ties
    • developing an IoT device
    • http://idein.jp

  4. I was a compiler writer
    • wrote compilers in student lab courses
    • MinCaml compiler in OCaml
    • MinCaml compiler in Haskell
    https://github.com/nineties/Choco
    • studied optimizing compilers in graduate school
    • wrote compilers for special purpose CPUs

  5. Wanted to create my own language
    • name: “Amber”
    • It was “rowl” at first.
    • I wanted to enjoy the creation process itself.
    • How could I?

  6. Let’s play with limitations
    1. Use assembly language only (no high-level languages like C).
    2. No libraries (libc etc.).
    3. No code generators (flex/bison etc.).

  7. Strategy: Bootstrapping
    Write language 1 in assembly language
    Write a slightly higher-level language 2 in language 1
    Write Amber in language k
    Write Amber in Amber (here now)

  8. What’s the point?
    • For fun.
    • To cultivate knowledge, techniques, and know-how of compiler writing.
    • But it’s not a cost-effective way to study...
    • To feel a sense of gratitude and respect for predecessors.

  9. I’ll show the outline of my development process.

  10. 1. Created “rowl0” in assembly language

  11. Made a language slightly higher-level than asm.
    • language name: rowl0
    • compiler name: rlc

  12. From regular expressions of tokens

  13. Wrote a state transition diagram

  14. Converted it to a jump table

  15. And wrote the lexer
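
The path from slides 12 to 15 (token regular expressions, state transition diagram, jump table, lexer) can be sketched as below. This is a minimal Python sketch, not rowl0's assembly code: the two token kinds, the state names, and `tokenize` are assumptions, since the slide images are not part of this transcript.

```python
# Hypothetical token set: IDENT = [A-Za-z_][A-Za-z0-9_]*, INT = [0-9]+.
# States of the transition diagram.
START, IN_IDENT, IN_INT = 0, 1, 2
ACCEPTING = {IN_IDENT: "IDENT", IN_INT: "INT"}


def char_class(c):
    """Map a character to a column of the transition table."""
    if c.isascii() and c.isdigit():
        return "digit"
    if c.isascii() and (c.isalpha() or c == "_"):
        return "alpha"
    return "other"


# The "jump table": TABLE[state][class] gives the next state.
# A missing entry means there is no transition.
TABLE = {
    START: {"alpha": IN_IDENT, "digit": IN_INT},
    IN_IDENT: {"alpha": IN_IDENT, "digit": IN_IDENT},
    IN_INT: {"digit": IN_INT},
}


def tokenize(src):
    tokens, i = [], 0
    while i < len(src):
        if src[i].isspace():
            i += 1
            continue
        state, start = START, i
        # Follow transitions for as long as the table allows (longest match).
        while i < len(src):
            nxt = TABLE[state].get(char_class(src[i]))
            if nxt is None:
                break
            state, i = nxt, i + 1
        if state not in ACCEPTING:
            raise SyntaxError(f"unexpected character {src[i]!r} at {i}")
        tokens.append((ACCEPTING[state], src[start:i]))
    return tokens
```

In assembly the table would typically be an array of code addresses indexed by character class, so each lookup becomes an indirect jump; the dictionary here only mirrors that shape.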

  16. Wrote rowl0’s syntax in BNF

  17. Then wrote the parser
    • recursive descent method

  18. Generates code while parsing
    • writing memory management is difficult at this stage.
    • generates code without building syntax trees.
    [Figure labels: parsing, code generation]
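
Generating code during recursive descent, with no syntax tree in between, can look roughly like this. The grammar (integer arithmetic with `+`, `-`, `*` and parentheses) and the stack-machine instruction names are assumptions chosen for illustration; rowl0's real grammar is on slides that this transcript does not include.

```python
import re


def compile_expr(src):
    """Parse an arithmetic expression and emit stack code in one pass."""
    # Toy tokenizer: characters outside the token set are silently skipped.
    tokens = re.findall(r"\d+|[-+*()]", src)
    pos = 0
    code = []  # emitted instructions; no syntax tree is ever built

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok is not None and tok.isdigit():
            code.append(("push", int(tok)))
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        factor()
        while peek() == "*":
            advance()
            factor()
            code.append(("mul",))  # emitted as soon as both operands are parsed

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append(("add",) if op == "+" else ("sub",))

    expr()
    if peek() is not None:
        raise SyntaxError(f"trailing token {peek()!r}")
    return code
```

Because each instruction is appended the moment its operands have been parsed, the parser needs no heap-allocated tree nodes, which fits the remark that memory management is hard to write at this stage.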

  19. Completed the first language “rowl0”!
    • no symbol tables.
    • function params must be p0, p1, p2, ...
    • to use local variables, allocate stack memory with “allocate(n)”, then use x0, x1, x2, ...

  20. 2. Created a LISP “rowl-core” in “rowl0”

  21. Made a LISP as a temporary step
    • language name: rowl-core
    • interpreter name: rlci
    • easy to implement
    • productivity improvement

  22. Wrote lexer and parser

  23. Writing became more comfortable

  24. Wrote eval

  25. No memory management
    • mmap and munmap are the only memory functions (no malloc, free)
    1. Does not reclaim garbage memory
    2. Allocates fresh memory for new objects
    3. So it will eventually die
    • That is no problem as long as it can compile the next-generation compiler.
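
An allocator that only ever hands out fresh memory is often called a bump allocator. The sketch below assumes a single anonymous mapping and uses Python's `mmap` module as a stand-in for the raw system call; `BumpAllocator` and its fields are hypothetical names, and how rlci actually carves up its mappings is not shown in the slides.

```python
import mmap


class BumpAllocator:
    """Hand out fresh memory from one mapped region; never reclaim it."""

    def __init__(self, size):
        self.region = mmap.mmap(-1, size)  # anonymous mapping
        self.size = size
        self.top = 0  # offset of the next free byte

    def alloc(self, nbytes, align=8):
        # Round the current top up to the requested alignment.
        start = (self.top + align - 1) // align * align
        if start + nbytes > self.size:
            # Garbage is never recovered, so exhaustion is fatal.
            raise MemoryError("region exhausted; nothing is ever freed")
        self.top = start + nbytes
        return start  # offset into self.region
```

Allocation is a single addition and comparison, which keeps the first interpreter small. The cost is that the process can only run until the region fills up.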

  26. Completed a LISP “rowl-core”!
    • rich functions
    • lambda, map etc.
    • macros

  27. 3. Created a language for writing the VM in “rowl-core”

  28. Decided to create a VM for the next generation
    • Created a language just for writing the virtual machine.
    • Defined it as a DSL in the LISP “rowl-core”
    • No need to write a lexer and parser!

  29. Wrote the compiler like this

  30. Now I could use higher-order functions
    • productivity was improved a lot

  31. 4. Created a virtual machine “rlvm” in the DSL

  32. Wrote the VM code in the DSL like this

  33. Wrote a garbage collector
    • Copying GC
    • Cheney’s algorithm
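
Cheney's algorithm copies live objects breadth-first from one semispace to another, using the to-space itself as the work queue. The sketch below models the heap as Python lists instead of raw memory; `Ref`, `FORWARD`, and `cheney_collect` are hypothetical names, and rlvm's real object layout is not shown in this transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A pointer: index of an object in the current space."""
    addr: int


FORWARD = "forwarded"  # hypothetical tag marking an already-copied object


def cheney_collect(from_space, roots):
    """Copy everything reachable from roots into a fresh to-space.

    from_space: list of objects, each a list of fields (ints or Ref).
    Returns (to_space, new_roots). Copied from-space objects are
    overwritten with forwarding markers.
    """
    to_space = []

    def forward(value):
        if not isinstance(value, Ref):
            return value  # immediate value, not a pointer
        obj = from_space[value.addr]
        if obj and obj[0] == FORWARD:  # already copied: reuse new address
            return Ref(obj[1])
        new_addr = len(to_space)
        to_space.append(list(obj))  # shallow copy; fields are fixed up later
        from_space[value.addr] = [FORWARD, new_addr]
        return Ref(new_addr)

    new_roots = [forward(r) for r in roots]
    scan = 0  # scan pointer; len(to_space) plays the role of the free pointer
    while scan < len(to_space):
        obj = to_space[scan]
        for i, field in enumerate(obj):
            obj[i] = forward(field)
        scan += 1
    return to_space, new_roots
```

No explicit stack or queue is needed: objects between the scan pointer and the free pointer are exactly the ones copied but not yet scanned. Unreachable objects are simply never copied.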

  34. Wrote primitive functions

  35. An application of meta-programming
    • The table of instructions of the VM

  36. Generates various code from the table
    • reflects changes to instructions automatically
    • It is very easy to build this kind of mechanism with LISP
    [Figure: the vm_instructions table feeding the eval loop of the VM, the linker, the disassembler, the assembler, and the assembler used internally in Amber]
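
The idea of deriving several tools from one instruction table can be sketched as follows. The three instructions, their opcodes, and the function names are assumptions for illustration; the real table has 186 entries and is written in LISP, not Python.

```python
# Hypothetical instruction table: (mnemonic, opcode, operand count, behaviour).
VM_INSTRUCTIONS = [
    ("push", 0, 1, lambda st, n: st.append(n)),
    ("add", 1, 0, lambda st: st.append(st.pop() + st.pop())),
    ("mul", 2, 0, lambda st: st.append(st.pop() * st.pop())),
]

# Everything below is derived from the table, so adding an instruction
# to VM_INSTRUCTIONS updates all three tools at once.
BY_NAME = {name: (op, argc) for name, op, argc, _ in VM_INSTRUCTIONS}
BY_OPCODE = {op: (name, argc, fn) for name, op, argc, fn in VM_INSTRUCTIONS}


def assemble(program):
    code = []
    for name, *args in program:
        op, argc = BY_NAME[name]
        if len(args) != argc:
            raise ValueError(f"{name} takes {argc} operand(s)")
        code += [op, *args]
    return code


def disassemble(code):
    program, pc = [], 0
    while pc < len(code):
        name, argc, _ = BY_OPCODE[code[pc]]
        program.append((name, *code[pc + 1 : pc + 1 + argc]))
        pc += 1 + argc
    return program


def run(code):
    stack, pc = [], 0
    while pc < len(code):
        _, argc, fn = BY_OPCODE[code[pc]]
        fn(stack, *code[pc + 1 : pc + 1 + argc])
        pc += 1 + argc
    return stack
```

For example, `run(assemble([("push", 2), ("push", 3), ("add",)]))` returns `[5]`, and `disassemble` inverts `assemble` on any valid program.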

  37. Wrote instruction sets

  38. Floating-point arithmetic

  39. Multi-precision integer arithmetic

  40. Exception handling

  41. Delimited Continuation

  42. Completed the virtual machine “rlvm”!
    • 186 instructions
    • stack machine
    • copying GC
    • exception handling
    • shift/reset delimited continuation
    • floating-point arithmetic, multi-precision arithmetic

  43. 5. Created a tool chain for “rlvm”

  44. There were no programming tools for “rlvm”
    • Created a tool chain for the VM
    • a programming language “rowl1”
    • its compiler
    • assembler
    • disassembler
    • linker

  45. Wrote “rowl1”, assembler and compiler
    • Defined as a DSL of “rowl-core”

  46. Wrote linker and disassembler
    • Wrote these tools in “rowl1”, so they run on “rlvm”
    • The linker requires GC since it uses a lot of memory

  47. Example outputs of the disassembler

  48. Ready to program on “rlvm”!
    • writing programs for rlvm
    • disassembling of byte-codes
    • supports separate compilation
    • Reached the starting line

  49. 6. Wrote “Amber” in “rowl1”

  50. Started developing “Amber”
    • dynamic scripting language
    • instance-based object-oriented system
    • runs on rlvm

  51. Wrote an assembler
    • The former assembler assembles code ahead of time and runs on rlci
    • This assembler assembles code just in time and runs on rlvm
    • fills in addresses by backpatching
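
Backpatching means emitting a placeholder wherever a jump target is not known yet and filling it in once the label's address is. A minimal sketch, assuming a made-up textual syntax (`name:` for labels, `op arg` for instructions) and a hypothetical function name; Amber's internal assembler works on rlvm byte-code instead.

```python
def assemble_with_backpatching(lines):
    """Assemble in one pass, then patch label references.

    Output: flat list [mnemonic, operand, ...]; label operands become
    absolute indices into that list.
    """
    code, labels, fixups = [], {}, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(code)  # address of the next instruction
            continue
        op, *args = line.split()
        code.append(op)
        for arg in args:
            if arg.lstrip("-").isdigit():
                code.append(int(arg))
            else:
                fixups.append((len(code), arg))  # remember where the hole is
                code.append(None)  # placeholder address
    for pos, name in fixups:  # backpatch every hole
        if name not in labels:
            raise KeyError(f"undefined label {name!r}")
        code[pos] = labels[name]
    return code
```

This variant resolves all holes at the end of the input. An assembler that runs just in time could equally patch each hole as soon as its label is defined.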

  52. Wrote the object system
    • slots, messages and parent delegation
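
Slots, messages, and parent delegation are the usual ingredients of an instance-based (prototype) object system. The sketch below shows the general mechanism in Python; `Obj`, `lookup`, and `send` are hypothetical names and do not reflect Amber's actual implementation.

```python
class Obj:
    """An instance-based object: a bag of slots plus an optional parent."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def lookup(self, name):
        obj = self
        while obj is not None:  # walk the delegation chain
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(f"message not understood: {name}")

    def send(self, name, *args):
        value = self.lookup(name)
        # Methods receive the original receiver, not the slot's owner,
        # so a child's slots override the parent's inside inherited methods.
        return value(self, *args) if callable(value) else value


# Usage: a prototype with a method, and a child that overrides one slot.
point = Obj(x=1, y=2, total=lambda self: self.send("x") + self.send("y"))
child = Obj(parent=point, x=10)
```

Here `child.send("total")` finds `total` on the parent but evaluates it with `child` as the receiver, so it sees `x = 10` from the child and `y = 2` from the parent.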

  53. Wrote Amber’s core features on top of the object system
    • dynamic pattern-matching engine
    • mechanism of partial function fusion

  54. Wrote the compiler
    • Made the Amber compiler itself an Amber object
    [Figure: Amber’s core system (VM, object system, pattern-matching engine, compiler); annotations: matching of syntax tree, resource management]

  55. Wrote closure-conversion

  56. Wrote parsers
    • compiles parsers at run time
    • each parser is an ordinary Amber object (closure)
    [Figure: Amber core system (VM, object system, pattern-matching engine, compiler) compiling parsers]

  57. Very simple syntax
    1. literals are expressions
    2. for a symbol h and expressions e1, ..., en (n >= 0),
    h{e1, ..., en} is an expression
    3. Amber has no other form of expression

  58. Used Packrat parsing method
    • scannerless
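
A scannerless packrat parser for the `h{e1, ..., en}` form can be sketched as below. Assumptions: the only literals are non-negative integers, symbols are letters and underscores, and whitespace may appear between tokens; none of these details come from the slides. Memoizing each rule per input position is what makes it "packrat".

```python
from functools import lru_cache


def parse(src):
    """Scannerless packrat parser for: expr <- INT / SYMBOL '{' args? '}'."""

    def skip_ws(pos):
        while pos < len(src) and src[pos].isspace():
            pos += 1
        return pos

    # Each rule maps a position to (value, next_pos), or None on failure.
    # lru_cache memoizes the result per position.
    @lru_cache(maxsize=None)
    def integer(pos):
        end = pos
        while end < len(src) and src[end].isdigit():
            end += 1
        return (int(src[pos:end]), end) if end > pos else None

    @lru_cache(maxsize=None)
    def symbol(pos):
        end = pos
        while end < len(src) and (src[end].isalpha() or src[end] == "_"):
            end += 1
        return (src[pos:end], end) if end > pos else None

    @lru_cache(maxsize=None)
    def expr(pos):
        pos = skip_ws(pos)
        lit = integer(pos)
        if lit:
            return lit
        head = symbol(pos)
        if not head:
            return None
        name, pos = head
        pos = skip_ws(pos)
        if pos >= len(src) or src[pos] != "{":
            return None
        pos = skip_ws(pos + 1)
        args = []
        if pos < len(src) and src[pos] != "}":
            while True:
                arg = expr(pos)
                if not arg:
                    return None
                args.append(arg[0])
                pos = skip_ws(arg[1])
                if pos < len(src) and src[pos] == ",":
                    pos += 1
                else:
                    break
        if pos >= len(src) or src[pos] != "}":
            return None
        return ((name, args), pos + 1)

    result = expr(0)
    if not result or skip_ws(result[1]) != len(src):
        raise SyntaxError("not an expression")
    return result[0]
```

With a grammar this small there is almost no backtracking, so the memo tables buy little. They start to matter once ordered alternatives retry the same rule at the same position.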

  59. Encoding/decoding floating-point literals was difficult
    • wrote the routines myself because of the “no libc” limitation (no strtod, sprintf)
    • they require the multi-precision integer arithmetic I had written earlier
    • example: “3.14” ↔ 0x40091eb851eb851f
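
To show why big integers are needed, here is a sketch of the encoding direction (decimal text to IEEE 754 binary64 bits) using integer arithmetic only. It relies on Python's built-in arbitrary-precision integers and handles only positive literals of the form `digits.digits`; the function name and these restrictions are assumptions, and Amber's own routine is not shown in the slides.

```python
def parse_double_bits(text):
    """Encode a positive decimal literal like "3.14" as binary64 bits.

    Sketch only: no sign, exponent part, zero, subnormals, or overflow.
    """
    int_part, _, frac_part = text.partition(".")
    num = int(int_part + frac_part)  # 314 for "3.14"
    den = 10 ** len(frac_part)       # 100 for "3.14"
    if num == 0:
        raise ValueError("zero is not handled in this sketch")

    # Scale by powers of two until 2**52 <= num/den < 2**53.
    # exp2 tracks the scaling: value == (num/den) * 2**exp2 throughout.
    exp2 = 0
    while num < den << 52:
        num <<= 1
        exp2 -= 1
    while num >= den << 53:
        den <<= 1
        exp2 += 1

    mant, rem = divmod(num, den)
    # Round to nearest, ties to even.
    if 2 * rem > den or (2 * rem == den and mant & 1):
        mant += 1
        if mant == 1 << 53:  # rounding carried into a new bit
            mant >>= 1
            exp2 += 1

    biased = exp2 + 52 + 1023  # exponent of the leading mantissa bit, plus bias
    return (biased << 52) | (mant - (1 << 52))
```

`hex(parse_double_bits("3.14"))` gives `0x40091eb851eb851f`, matching the value on the slide. The intermediate `num` and `den` can grow far beyond 64 bits for long literals, which is where multi-precision arithmetic comes in.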

  60. The Amber interpreter is complete!
    • dynamic scripting language
    • runs on rlvm
    • instance-based object-oriented system
    • dynamic pattern-matching engine
    • partial function fusion
    • lexical closures
    • I got a modern programming language!

  61. 7. Created Amber’s standard library

  62. Amber has strong self-extensibility
    • Amber’s simple syntax is extended in a standard library
    • amber/lib/syntax/parse.ab
    • Builds its syntax during boot sequence

  63. Only has a very simple syntax at first
    String literals serve as comments because there is no syntax for comments yet

  64. Defines a syntax for defining syntaxes

  65. Defines Amber’s syntax with the syntax

  66. Builds macro system

  67. Gives meanings to syntaxes by macros

  68. Now Amber has a rich syntax

  69. Extends object system

  70. Now Amber has a rich object system
    • Inheritance, mix-in etc.

  71. Development is currently suspended
    • No plans for further updates
    • Try the following commands to start the Amber shell (Linux only)
    • See the output of the make command
    % git clone https://github.com/nineties/amber.git
    % cd amber
    % make; sudo make install
    % amber

  72. Summary
    [Figure: diagram of the whole chain, with labels: as, rowl0, rlc, rlci, rowl-core, lang. for writing VM, compiler, rlvm, rowl1, compiler, linker, disassembler, Amber interpreter, self-extension; legend: language, tool, impl., run]
    • I managed to reach a relatively high-level language. I feel satisfied.
