A Virtual Brainfuck Machine

This is the "talk version" of this blog post: https://thorstenball.com/blog/2017/01/04/a-virtual-brainfuck-machine-in-go/

Given at the Gophers Meetup Frankfurt

Thorsten Ball

April 06, 2017

Transcript

  1. Hello there!

  2. Who?! Thorsten Ball Software Developer thorstenball.com mrnugget / @thorstenball

  3. I wrote a book! Get it: interpreterbook.com Coupon code for 20% off: aigudegophers
  4. ++++++++[>++++[>++>+++>+++>+<< <<-]>+>+>- >>+[<]<-]>>.>---.+++ ++++..+++.>>. <-.<.+++.------.- -------. >>+.> + + .

  5. >++++++++++>+>+[ [+++++[>++++++++<-]>.<++++++[>--------<-]+<<<]>.>>[ [-]<[>+<-]>>[<<+>+>-]<[>+<-[>+<-[>+<-[>+<-[>+<-[>+<- [>+<-[>+<-[>+<-[>[-]>+>+<<<-[>+<-]]]]]]]]]]]+>>> ]<<< ]

  6. Brainfuck
     - Invented by Urban Müller
     - It's a teaching language, not a joke
     - ... well, okay, it's a joke, too
  7. A Virtual Brainfuck Machine

  8. Understanding Brainfuck Programming languages live in different worlds.

  9. Programming in C

  10. Programming in Go

  11. Programming in Forth 3 4 + .

  12. Programming in Ruby
      - ... or Java
      - ... or PHP
      - ... or JavaScript
      You don't have to worry. Just bounce around.
  13. Brainfuck's View of the World

  14. Brainfuck's View of the World
      - Memory: 30000 cells initialized to 0
      - Data pointer: Points to the current cell
      - Code: The program that's executed by the machine
      - Instruction pointer: Points to the next instruction
      - Input and output streams: STDIN and STDOUT
      - CPU: Executes the code
  15. The Instruction Set
      - '>': Increment the data pointer by 1
      - '<': Decrement the data pointer by 1
      - '+': Increment the value in the current cell
      - '-': Decrement the value in the current cell
      - '.': Print current cell
      - ',': Read a character to current cell
      - '[': If the current cell contains a zero, jump to matching ]
      - ']': If the current cell does not contain a zero, jump to matching [
  16. The Instruction Set - Hello World ++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>--- .+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.

  17. The Instruction Set - Hello World + + + + + + + + [ >
  18. The Instruction Set - Hello World PLUS PLUS PLUS PLUS PLUS PLUS PLUS PLUS JUMP_IF_ZERO RIGHT
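The translation from characters to names on the last two slides can be sketched in a few lines of Go. PLUS, RIGHT and JUMP_IF_ZERO are the names shown on the slide; the other five names and the `describe` helper are assumptions made for this illustration, not part of the talk.

```go
package main

import (
	"fmt"
	"strings"
)

// mnemonics maps each Brainfuck character to a readable name.
// Only PLUS, RIGHT and JUMP_IF_ZERO come from the slide; the rest are
// assumed names in the same style.
var mnemonics = map[byte]string{
	'+': "PLUS",
	'-': "MINUS",
	'>': "RIGHT",
	'<': "LEFT",
	'.': "PUT_CHAR",
	',': "READ_CHAR",
	'[': "JUMP_IF_ZERO",
	']': "JUMP_IF_NOT_ZERO",
}

// describe spells out a program one instruction at a time and skips every
// character that is not a Brainfuck instruction.
func describe(code string) string {
	var names []string
	for i := 0; i < len(code); i++ {
		if name, ok := mnemonics[code[i]]; ok {
			names = append(names, name)
		}
	}
	return strings.Join(names, " ")
}

func main() {
	// The first ten characters of the Hello World program.
	fmt.Println(describe("++++++++[>"))
}
```

Running it prints eight PLUS, then JUMP_IF_ZERO and RIGHT, the same sequence as on the slide.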
  19. Starting our Interpreter
      - Interpreters give meaning to symbols by doing what they are supposed to mean
      - The Brainfuck interpreter manipulates the Brainfuck machine
      - We need to build the Brainfuck machine
  20. type Machine struct {
          code   string
          ip     int
          memory [30000]int
          dp     int
          input  io.Reader
          output io.Writer
      }

      func NewMachine(code string, in io.Reader, out io.Writer) *Machine {
          return &Machine{
              code:   code,
              input:  in,
              output: out,
          }
      }
  21. func (m *Machine) Execute() {
          for m.ip < len(m.code) {
              ins := m.code[m.ip]

              switch ins {
              case '+':
                  m.memory[m.dp]++
              case '-':
                  m.memory[m.dp]--
              case '>':
                  m.dp++
              case '<':
                  m.dp--
              }

              m.ip++
          }
      }
  22. type Machine struct {
          // [...]
          buf []byte
      }

      func NewMachine(code string, in io.Reader, out io.Writer) *Machine {
          return &Machine{
              // [...]
              buf: make([]byte, 1),
          }
      }
  23. func (m *Machine) readChar() {
          n, err := m.input.Read(m.buf)
          if err != nil {
              panic(err)
          }
          if n != 1 {
              panic("wrong num bytes read")
          }

          m.memory[m.dp] = int(m.buf[0])
      }

      func (m *Machine) putChar() {
          m.buf[0] = byte(m.memory[m.dp])

          n, err := m.output.Write(m.buf)
          if err != nil {
              panic(err)
          }
          if n != 1 {
              panic("wrong num bytes written")
          }
      }
  24. func (m *Machine) Execute() {
          for m.ip < len(m.code) {
              // [...]
              case ',':
                  m.readChar()
              case '.':
                  m.putChar()
              // [...]
          }
      }
  25. Pseudo looping in pseudo code
      switch currentInstruction {
      case '[':
          if currentMemoryCellValue() == 0 {
              positionOfMatchingBracket = findMatching("]")
              instructionPointer = positionOfMatchingBracket + 1
          }
      case ']':
          if currentMemoryCellValue() != 0 {
              positionOfMatchingBracket = findMatching("[")
              instructionPointer = positionOfMatchingBracket + 1
          }
      }
  26. Problem: Brackets can be nested.

  27. The simplest and slowest solution
      case '[':
          if m.memory[m.dp] == 0 {
              depth := 1
              for depth != 0 {
                  m.ip++
                  switch m.code[m.ip] {
                  case '[':
                      depth++
                  case ']':
                      depth--
                  }
              }
          }
  28. func (m *Machine) Execute() {
          for m.ip < len(m.code) {
              ins := m.code[m.ip]

              switch ins {
              // [...]
              case '[':
                  if m.memory[m.dp] == 0 {
                      depth := 1
                      for depth != 0 {
                          m.ip++
                          switch m.code[m.ip] {
                          case '[':
                              depth++
                          case ']':
                              depth--
                          }
                      }
                  }
              case ']':
                  if m.memory[m.dp] != 0 {
                      depth := 1
                      for depth != 0 {
                          m.ip--
                          switch m.code[m.ip] {
                          case ']':
                              depth++
                          case '[':
                              depth--
                          }
                      }
                  }
              }

              m.ip++
          }
      }
  29. Done!

  30. Hello World!
      $ cat ./hello_world.b
      ++++++++[>++++[>++>+++>+++>+<<
      <<-]>+>+>->>+[<]<-]>>.>---.+++
      ++++..+++.>>.<-.<.+++.------.-
      -------.>>+.>++.
      $ go build -o machine && ./machine ./hello_world.b
      Hello World!
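For anyone who wants to try this without the slides' file layout, here is a minimal single-file sketch of the naive interpreter built so far. The `Machine` fields and the bracket scanning follow the slides; the `seek` and `run` helpers and the in-memory buffers are assumptions added so the example is self-contained, and the byte-count checks from the slides are simplified.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

type Machine struct {
	code   string
	ip     int
	memory [30000]int
	dp     int
	input  io.Reader
	output io.Writer
	buf    []byte
}

func NewMachine(code string, in io.Reader, out io.Writer) *Machine {
	return &Machine{code: code, input: in, output: out, buf: make([]byte, 1)}
}

func (m *Machine) Execute() {
	for m.ip < len(m.code) {
		switch m.code[m.ip] {
		case '+':
			m.memory[m.dp]++
		case '-':
			m.memory[m.dp]--
		case '>':
			m.dp++
		case '<':
			m.dp--
		case ',':
			if n, err := m.input.Read(m.buf); err != nil || n != 1 {
				panic("could not read exactly one byte")
			}
			m.memory[m.dp] = int(m.buf[0])
		case '.':
			m.buf[0] = byte(m.memory[m.dp])
			if _, err := m.output.Write(m.buf); err != nil {
				panic(err)
			}
		case '[':
			if m.memory[m.dp] == 0 {
				m.seek(+1, '[', ']') // scan forward to the matching ]
			}
		case ']':
			if m.memory[m.dp] != 0 {
				m.seek(-1, ']', '[') // scan backward to the matching [
			}
		}
		m.ip++
	}
}

// seek (hypothetical helper) moves ip one step at a time until the bracket
// matching the current one is found. Brackets of the same kind increase the
// nesting depth, brackets of the other kind decrease it.
func (m *Machine) seek(step int, same, other byte) {
	depth := 1
	for depth != 0 {
		m.ip += step
		switch m.code[m.ip] {
		case same:
			depth++
		case other:
			depth--
		}
	}
}

// run (hypothetical helper) executes code with the given input and returns
// everything the program wrote.
func run(code, input string) string {
	var out bytes.Buffer
	NewMachine(code, strings.NewReader(input), &out).Execute()
	return out.String()
}

func main() {
	const hello = "++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++."
	fmt.Print(run(hello, ""))
}
```

Like the slide version, this sketch assumes balanced brackets and panics with an index error otherwise.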
  31. Slow Brainfuck! Bad Brainfuck!

  34. The Mandelbrot Benchmark
      $ time ./machine ./mandelbrot.b >/dev/null
      68.24s user 0.18s system 99% cpu 1:08.60 total
      mandelbrot.b: A Mandelbrot fractal viewer in Brainfuck, written by Erik Bosman
  35. Slowing us down 1. Repeated instructions 2. Brackets

  36. What if...
      - ... we had instructions that said "increase by 5" instead of "increase" 5 times?
      - ... we had an instruction that said "go to this matching bracket if current memory cell is empty"?
      Spoiler: we'd be faster!
  37. New Instruction Set
      type InsType byte

      const (
          Plus          InsType = '+'
          Minus         InsType = '-'
          Right         InsType = '>'
          Left          InsType = '<'
          PutChar       InsType = '.'
          ReadChar      InsType = ','
          JumpIfZero    InsType = '['
          JumpIfNotZero InsType = ']'
      )

      type Instruction struct {
          Type     InsType
          Argument int
      }
  38. New Machine
      type Machine struct {
          code    []*Instruction // <--- WOOP WOOP!
          ip      int
          memory  [30000]int
          dp      int
          input   io.Reader
          output  io.Writer
          readBuf []byte
      }

      func NewMachine(instructions []*Instruction, in io.Reader, out io.Writer) *Machine {
          return &Machine{
              code:    instructions,
              input:   in,
              output:  out,
              readBuf: make([]byte, 1),
          }
      }
  39. Execute - Part 1
      func (m *Machine) Execute() {
          for m.ip < len(m.code) {
              ins := m.code[m.ip]

              switch ins.Type {
              case Plus:
                  m.memory[m.dp] += ins.Argument
              case Minus:
                  m.memory[m.dp] -= ins.Argument
              case Right:
                  m.dp += ins.Argument
              case Left:
                  m.dp -= ins.Argument
              // ...
              }

              m.ip++
          }
      }
  40. Execute - Part 2
      func (m *Machine) Execute() {
          // ...
          case PutChar:
              for i := 0; i < ins.Argument; i++ {
                  m.putChar()
              }
          case ReadChar:
              for i := 0; i < ins.Argument; i++ {
                  m.readChar()
              }
          // ...
      }
  41. Execute - Part 3
      func (m *Machine) Execute() {
          // ...
              case JumpIfZero:
                  if m.memory[m.dp] == 0 {
                      m.ip = ins.Argument
                      continue
                  }
              case JumpIfNotZero:
                  if m.memory[m.dp] != 0 {
                      m.ip = ins.Argument
                      continue
                  }
              }
              // ...
              m.ip++
          }
      }
  42. But, wait, ... how?

  43. Brainfuck | ? | New Instruction Set

  44. Compiler
      Wikipedia says: a computer program (or a set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language)
  45. Let's do this!
      type Compiler struct {
          code         string // <--- Brainfuck code
          codeLength   int
          position     int
          instructions []*Instruction // <--- New instruction set
      }

      func NewCompiler(code string) *Compiler {
          return &Compiler{
              code:         code,
              codeLength:   len(code),
              instructions: []*Instruction{},
          }
      }
  46. func (c *Compiler) Compile() []*Instruction {
          for c.position < c.codeLength {
              current := c.code[c.position]

              switch current {
              case '+':
                  c.CompileFoldableInstruction('+', Plus)
              case '-':
                  c.CompileFoldableInstruction('-', Minus)
              case '<':
                  c.CompileFoldableInstruction('<', Left)
              case '>':
                  c.CompileFoldableInstruction('>', Right)
              case '.':
                  c.CompileFoldableInstruction('.', PutChar)
              case ',':
                  c.CompileFoldableInstruction(',', ReadChar)
              }

              c.position++
          }

          return c.instructions
      }
  47. func (c *Compiler) CompileFoldableInstruction(char byte, insType InsType) {
          count := 1

          for c.position < c.codeLength-1 && c.code[c.position+1] == char {
              count++
              c.position++
          }

          c.EmitWithArg(insType, count)
      }

      func (c *Compiler) EmitWithArg(insType InsType, arg int) int {
          ins := &Instruction{Type: insType, Argument: arg}
          c.instructions = append(c.instructions, ins)
          return len(c.instructions) - 1
      }
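To see the folding step in isolation, here is a small sketch that applies the same "count the run, emit one instruction" idea to a plain string. The `fold` function is a hypothetical stand-in for `CompileFoldableInstruction`, written without the `Compiler` struct so it runs on its own; brackets are ignored here because they are handled separately later.

```go
package main

import "fmt"

type InsType byte

const (
	Plus     InsType = '+'
	Minus    InsType = '-'
	Right    InsType = '>'
	Left     InsType = '<'
	PutChar  InsType = '.'
	ReadChar InsType = ','
)

type Instruction struct {
	Type     InsType
	Argument int
}

// fold (hypothetical helper) collapses each run of identical foldable
// characters into a single instruction whose Argument is the run length.
// Brackets and non-instruction characters are skipped.
func fold(code string) []Instruction {
	var out []Instruction
	for i := 0; i < len(code); i++ {
		c := code[i]
		switch c {
		case '+', '-', '<', '>', '.', ',':
			// foldable, handled below
		default:
			continue
		}
		count := 1
		// Look ahead while the next character repeats the current one.
		for i+1 < len(code) && code[i+1] == c {
			count++
			i++
		}
		out = append(out, Instruction{Type: InsType(c), Argument: count})
	}
	return out
}

func main() {
	for _, ins := range fold("+++++>>..") {
		fmt.Printf("%c %d\n", ins.Type, ins.Argument)
	}
}
```

For the input `+++++>>..` this yields three instructions: Plus 5, Right 2 and PutChar 2. As in the slide code, any other character between two identical instructions ends the run, so `+ +` folds into two separate Plus 1 instructions.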
  48. Problems when compiling loops
      - NOT foldable. We can't turn [[[]]] into []
      - NOT countable. Instructions in between might change.
      - NOT stateless. We have to remember positions.
  49. Solution
      - "["
          - emit a JumpIfZero instruction
          - Argument will be 0, a placeholder value
      - "]"
          - emit JumpIfNotZero with correct argument
          - change JumpIfZero argument to correct position
  50. Solution
      - "["
          - emit a JumpIfZero instruction
          - Argument will be 0, a placeholder value
      - "]"
          - emit JumpIfNotZero with correct argument
          - change JumpIfZero argument to correct position
      How do we keep track of JumpIfZero instructions?
      Solution to problem in solution: with a stack! Stack, the data structure. First in, last out.
  51. func (c *Compiler) Compile() []*Instruction {
          loopStack := []int{}

          for c.position < c.codeLength {
              current := c.code[c.position]

              switch current {
              case '[':
                  insPos := c.EmitWithArg(JumpIfZero, 0)
                  loopStack = append(loopStack, insPos)
              // [...]
              }

              c.position++
          }

          return c.instructions
      }
  52. func (c *Compiler) Compile() []*Instruction {
          // [...]
              case ']':
                  // Pop position of last JumpIfZero ("[") instruction off stack
                  openInstruction := loopStack[len(loopStack)-1]
                  loopStack = loopStack[:len(loopStack)-1]

                  // Emit the new JumpIfNotZero ("]") instruction,
                  // with correct position as argument
                  closeInstructionPos := c.EmitWithArg(JumpIfNotZero, openInstruction)

                  // Patch the old JumpIfZero ("[") instruction with new position
                  c.instructions[openInstruction].Argument = closeInstructionPos
          // [...]
      }
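The slide code assumes every bracket has a partner: an extra "]" would index into an empty stack, and an extra "[" would keep its placeholder argument. The same stack idea can also be used to check a program up front. The sketch below is not from the talk; `matchBrackets` is a hypothetical helper, and it works on positions in the source string rather than on instruction indexes.

```go
package main

import "fmt"

// matchBrackets (hypothetical helper) pairs up brackets with a stack and
// returns a table mapping each bracket position to its partner's position.
// It reports an error if the brackets are unbalanced.
func matchBrackets(code string) (map[int]int, error) {
	pairs := make(map[int]int)
	var stack []int // positions of "[" that are still open

	for pos := 0; pos < len(code); pos++ {
		switch code[pos] {
		case '[':
			stack = append(stack, pos)
		case ']':
			if len(stack) == 0 {
				return nil, fmt.Errorf("unmatched ']' at position %d", pos)
			}
			// Pop the most recently opened bracket: last in, first out.
			open := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			pairs[open] = pos
			pairs[pos] = open
		}
	}

	if len(stack) > 0 {
		return nil, fmt.Errorf("unmatched '[' at position %d", stack[len(stack)-1])
	}
	return pairs, nil
}

func main() {
	pairs, err := matchBrackets("+[-[+]>]")
	fmt.Println(pairs[1], pairs[3], err)

	_, err = matchBrackets("+]")
	fmt.Println(err)
}
```

In `+[-[+]>]` the outer bracket at position 1 pairs with position 7 and the inner one at position 3 pairs with position 5; the second call returns an error instead of panicking.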
  53. This really works!
      Input: +++[---[+]>>>]<<<
      Output:
      []*Instruction{
          &Instruction{Type: Plus, Argument: 3},
          &Instruction{Type: JumpIfZero, Argument: 7},
          &Instruction{Type: Minus, Argument: 3},
          &Instruction{Type: JumpIfZero, Argument: 5},
          &Instruction{Type: Plus, Argument: 1},
          &Instruction{Type: JumpIfNotZero, Argument: 3},
          &Instruction{Type: Right, Argument: 3},
          &Instruction{Type: JumpIfNotZero, Argument: 1},
          &Instruction{Type: Left, Argument: 3},
      }
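To check the listing above by hand, here is a condensed single-file sketch of compiler plus machine. It follows the talk's approach (folding, a stack of open loops, patched jump targets), but the free functions `compile` and `execute` are assumptions made to keep it short: there is no `Compiler` or `Machine` struct, `,` is left out, brackets are assumed to be balanced, and output goes to a string.

```go
package main

import (
	"bytes"
	"fmt"
)

type InsType byte

const (
	Plus          InsType = '+'
	Minus         InsType = '-'
	Right         InsType = '>'
	Left          InsType = '<'
	PutChar       InsType = '.'
	JumpIfZero    InsType = '['
	JumpIfNotZero InsType = ']'
)

type Instruction struct {
	Type     InsType
	Argument int
}

// compile folds repeated instructions and resolves jump targets with a stack.
func compile(code string) []Instruction {
	var ins []Instruction
	var loops []int // indexes of JumpIfZero instructions still waiting for "]"
	for i := 0; i < len(code); i++ {
		switch c := code[i]; c {
		case '+', '-', '<', '>', '.':
			n := 1
			for i+1 < len(code) && code[i+1] == c {
				n++
				i++
			}
			ins = append(ins, Instruction{InsType(c), n})
		case '[':
			loops = append(loops, len(ins))
			ins = append(ins, Instruction{JumpIfZero, 0}) // 0 is a placeholder
		case ']':
			open := loops[len(loops)-1]
			loops = loops[:len(loops)-1]
			ins = append(ins, Instruction{JumpIfNotZero, open})
			ins[open].Argument = len(ins) - 1 // patch the placeholder
		}
	}
	return ins
}

// execute runs compiled instructions and returns what PutChar wrote.
// Unlike the slides it has no "continue": after a jump the loop's ip++ steps
// past the target bracket, which has the same effect.
func execute(code []Instruction) string {
	var out bytes.Buffer
	var memory [30000]int
	dp := 0
	for ip := 0; ip < len(code); ip++ {
		switch ins := code[ip]; ins.Type {
		case Plus:
			memory[dp] += ins.Argument
		case Minus:
			memory[dp] -= ins.Argument
		case Right:
			dp += ins.Argument
		case Left:
			dp -= ins.Argument
		case PutChar:
			for i := 0; i < ins.Argument; i++ {
				out.WriteByte(byte(memory[dp]))
			}
		case JumpIfZero:
			if memory[dp] == 0 {
				ip = ins.Argument
			}
		case JumpIfNotZero:
			if memory[dp] != 0 {
				ip = ins.Argument
			}
		}
	}
	return out.String()
}

func main() {
	for i, ins := range compile("+++[---[+]>>>]<<<") {
		fmt.Printf("%d: %c %d\n", i, ins.Type, ins.Argument)
	}
	fmt.Print(execute(compile("++++++++[>++++++++<-]>+.")))
}
```

The first loop prints the nine instructions from the slide, one per line, with the jump arguments 7, 5, 3 and 1. The last line runs a tiny program that computes 8 * 8 + 1 = 65 and prints the letter A.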
  54. How much faster does this make my production Brainfuck code?

  55. $ time ./machine ./mandelbrot.b >/dev/null
      13.43s user 0.04s system 99% cpu 13.496 total
  56. $ time ./machine ./mandelbrot.b >/dev/null
      13.43s user 0.04s system 99% cpu 13.496 total
      13.496 total! before: 1:08.60 total!
  57. Why am I here?

  58. - Instruction Set
      - The Switch
      - Virtual Machine
      - Bytecode Compiler
      Not too bad, right?
  59. Make Erik proud!
      - brainfuck optimization strategies: http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck.html
      - Hello, JIT World: The Joy Of Simple JITs: http://blog.reverberate.org/2012/12/hello-jit-world-joy-of-simple-jits.html
      - interpreter, compiler, jit: https://nickdesaulniers.github.io/blog/2015/05/25/interpreter-compiler-jit/
      - an optimized brainfuck compiler written in sed: https://github.com/stedolan/bf.sed
      - the original Brainfuck distribution: https://gist.github.com/rdebath/0ca09ec0fdcf3f82478f
      - there's much, much more