
A Virtual Brainfuck Machine

This is the "talk version" of this blog post: https://thorstenball.com/blog/2017/01/04/a-virtual-brainfuck-machine-in-go/

Given at the Gophers Meetup Frankfurt

Thorsten Ball

April 06, 2017

Transcript

  1. Hello there!

  2. Who?!
    Thorsten Ball
    Software Developer
    thorstenball.com
    mrnugget / @thorstenball

  3. I wrote a book!
    Get it: interpreterbook.com
    Coupon code for 20% off:
    aigudegophers

  4. ++++++++[>++++[>++>+++>+++>+<<
    <<-]>+>+>-
    >>+[<]<-]>>.>---.+++
    ++++..+++.>>.
    <-.<.+++.------.-
    -------.
    >>+.>
    +
    +
    .

  5. >++++++++++>+>+[
    [+++++[>++++++++<-]>.<++++++[>--------<-]+<<<]>.>>[
    [-]<[>+<-]>>[<<+>+>-]<[>+<-[>+<-[>+<-[>+<-[>+<-[>+<-
    [>+<-[>+<-[>+<-[>[-]>+>+<<<-[>+<-]]]]]]]]]]]+>>>
    ]<<<
    ]

  6. Brainfuck
    — Invented by Urban Müller
    — It's a teaching language, not a joke
    — ... well, okay, it's a joke, too

  7. A Virtual Brainfuck Machine

  8. Understanding Brainfuck
    Programming languages live in different worlds.

  9. Programming in
    C

  10. Programming in
    Go

  11. Programming in
    Forth
    3 4 + .

  12. Programming in Ruby
    — ... or Java
    — ... or PHP
    — ... or JavaScript
    You don't have to worry.
    Just bounce around.

  13. Brainfuck's View of the
    World

  14. Brainfuck's View of the World
    — Memory: 30000 cells, all initialized to 0
    — Data pointer: points to the current memory cell
    — Code: the program that's executed by the machine
    — Instruction pointer: points to the next instruction
    — Input and output streams: STDIN and STDOUT
    — CPU: executes the code

  15. The Instruction Set
    — > - Increment the data pointer by 1
    — < - Decrement the data pointer by 1
    — + - Increment the value in the current cell
    — - - Decrement the value in the current cell
    — . - Print the value in the current cell as a character
    — , - Read one character into the current cell
    — [ - If the current cell contains a zero, jump to the matching ]
    — ] - If the current cell does not contain a zero, jump to the matching [

  16. The Instruction Set - Hello World
    ++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---
    .+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.

  17. The Instruction Set - Hello World
    +
    +
    +
    +
    +
    +
    +
    +
    [
    >

  18. The Instruction Set - Hello World
    PLUS
    PLUS
    PLUS
    PLUS
    PLUS
    PLUS
    PLUS
    PLUS
    JUMP_IF_ZERO
    RIGHT
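
The renaming shown on the last two slides can be sketched as a lookup table in Go. This is only an illustration: the names PLUS, JUMP_IF_ZERO and RIGHT come from the slide, while MINUS, LEFT, PUT_CHAR, READ_CHAR, JUMP_IF_NOT_ZERO and the function `Disassemble` are assumed here by analogy.

```go
package main

import "strings"

// mnemonics maps each Brainfuck symbol to a readable name.
// PLUS, JUMP_IF_ZERO and RIGHT appear on the slide; the other
// names are assumptions made for this sketch.
var mnemonics = map[byte]string{
	'+': "PLUS",
	'-': "MINUS",
	'>': "RIGHT",
	'<': "LEFT",
	'.': "PUT_CHAR",
	',': "READ_CHAR",
	'[': "JUMP_IF_ZERO",
	']': "JUMP_IF_NOT_ZERO",
}

// Disassemble returns one mnemonic per instruction, separated by
// newlines. Bytes that are not one of the eight symbols are skipped.
func Disassemble(code string) string {
	var names []string
	for i := 0; i < len(code); i++ {
		if name, ok := mnemonics[code[i]]; ok {
			names = append(names, name)
		}
	}
	return strings.Join(names, "\n")
}
```

Calling `Disassemble("++++++++[>")` yields the ten lines shown on the slide above.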

  19. Starting our Interpreter
    — Interpreters give meaning to symbols by doing
    what they are supposed to mean
    — The Brainfuck interpreter manipulates the
    Brainfuck machine
    — We need to build the Brainfuck machine

  20. type Machine struct {
        code string            // the Brainfuck program
        ip int                 // instruction pointer
        memory [30000]int      // memory cells, all zero at start
        dp int                 // data pointer
        input io.Reader
        output io.Writer
    }
    func NewMachine(code string, in io.Reader, out io.Writer) *Machine {
        return &Machine{
            code: code,
            input: in,
            output: out,
        }
    }

  21. func (m *Machine) Execute() {
        for m.ip < len(m.code) {
            ins := m.code[m.ip]
            switch ins {
            case '+':
                m.memory[m.dp]++
            case '-':
                m.memory[m.dp]--
            case '>':
                m.dp++
            case '<':
                m.dp--
            }
            m.ip++
        }
    }

  22. type Machine struct {
        // [...]
        buf []byte
    }
    func NewMachine(code string, in io.Reader, out io.Writer) *Machine {
        return &Machine{
            // [...]
            buf: make([]byte, 1),
        }
    }

  23. func (m *Machine) readChar() {
        n, err := m.input.Read(m.buf)
        if err != nil {
            panic(err)
        }
        if n != 1 {
            panic("wrong num bytes read")
        }
        m.memory[m.dp] = int(m.buf[0])
    }
    func (m *Machine) putChar() {
        m.buf[0] = byte(m.memory[m.dp])
        n, err := m.output.Write(m.buf)
        if err != nil {
            panic(err)
        }
        if n != 1 {
            panic("wrong num bytes written")
        }
    }

  24. func (m *Machine) Execute() {
        for m.ip < len(m.code) {
            // [...]
            case ',':
                m.readChar()
            case '.':
                m.putChar()
            // [...]
        }
    }

  25. Pseudo looping in pseudo code
    switch currentInstruction {
    case '[':
        if currentMemoryCellValue() == 0 {
            positionOfMatchingBracket = findMatching("]")
            instructionPointer = positionOfMatchingBracket + 1
        }
    case ']':
        if currentMemoryCellValue() != 0 {
            positionOfMatchingBracket = findMatching("[")
            instructionPointer = positionOfMatchingBracket + 1
        }
    }

  26. Problem: Brackets can be
    nested.

  27. The simplest and slowest solution
    case '[':
        if m.memory[m.dp] == 0 {
            depth := 1
            for depth != 0 {
                m.ip++
                switch m.code[m.ip] {
                case '[':
                    depth++
                case ']':
                    depth--
                }
            }
        }

  28. func (m *Machine) Execute() {
        for m.ip < len(m.code) {
            ins := m.code[m.ip]
            switch ins {
            // [...]
            case '[':
                if m.memory[m.dp] == 0 {
                    depth := 1
                    for depth != 0 {
                        m.ip++
                        switch m.code[m.ip] {
                        case '[':
                            depth++
                        case ']':
                            depth--
                        }
                    }
                }
            case ']':
                if m.memory[m.dp] != 0 {
                    depth := 1
                    for depth != 0 {
                        m.ip--
                        switch m.code[m.ip] {
                        case ']':
                            depth++
                        case '[':
                            depth--
                        }
                    }
                }
            }
            m.ip++
        }
    }
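
Pulling the fragments from the previous slides together, a self-contained version of this first interpreter might look like the sketch below. A few things are assumptions of the sketch, not part of the slides: the two bracket scans are merged into one hypothetical helper `seek`, `io.ReadFull` replaces the manual byte count check in `readChar`, and `RunString` is a made-up convenience wrapper. Like the original, it assumes balanced brackets.

```go
package main

import (
	"bytes"
	"io"
	"strings"
)

// Machine mirrors the struct from the slides.
type Machine struct {
	code   string
	ip     int
	memory [30000]int
	dp     int
	input  io.Reader
	output io.Writer
	buf    []byte
}

func NewMachine(code string, in io.Reader, out io.Writer) *Machine {
	return &Machine{code: code, input: in, output: out, buf: make([]byte, 1)}
}

// seek (hypothetical helper) moves ip in direction step (+1 or -1) until
// it reaches the bracket matching the one it started on. Brackets of the
// same kind increase the nesting depth, brackets of the other kind
// decrease it. It assumes the brackets in the program are balanced.
func (m *Machine) seek(step int, same, other byte) {
	depth := 1
	for depth != 0 {
		m.ip += step
		switch m.code[m.ip] {
		case same:
			depth++
		case other:
			depth--
		}
	}
}

func (m *Machine) Execute() {
	for m.ip < len(m.code) {
		switch m.code[m.ip] {
		case '+':
			m.memory[m.dp]++
		case '-':
			m.memory[m.dp]--
		case '>':
			m.dp++
		case '<':
			m.dp--
		case ',':
			if _, err := io.ReadFull(m.input, m.buf); err != nil {
				panic(err)
			}
			m.memory[m.dp] = int(m.buf[0])
		case '.':
			m.buf[0] = byte(m.memory[m.dp])
			if _, err := m.output.Write(m.buf); err != nil {
				panic(err)
			}
		case '[':
			if m.memory[m.dp] == 0 {
				m.seek(+1, '[', ']')
			}
		case ']':
			if m.memory[m.dp] != 0 {
				m.seek(-1, ']', '[')
			}
		}
		m.ip++
	}
}

// RunString (hypothetical) runs code with the given input and returns
// everything the program wrote.
func RunString(code, input string) string {
	var out bytes.Buffer
	NewMachine(code, strings.NewReader(input), &out).Execute()
	return out.String()
}
```

For example, `RunString(",.", "x")` echoes the single input byte back. Because every `[` and `]` triggers a fresh scan for its partner, loops get slower the longer their bodies are.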

  29. Done!

  30. Hello World!
    $ cat ./hello_world.b
    ++++++++[>++++[>++>+++>+++>+<<
    <<-]>+>+>->>+[<]<-]>>.>---.+++
    ++++..+++.>>.<-.<.+++.------.-
    -------.>>+.>++.
    $ go build -o machine && ./machine ./hello_world.b
    Hello World!

  31. Slow Brainfuck!
    Bad Brainfuck!

  34. The Mandelbrot Benchmark
    $ time ./machine ./mandelbrot.b >/dev/null
    68.24s user 0.18s system 99% cpu 1:08.60 total
    mandelbrot.b
    A Mandelbrot fractal viewer in Brainfuck
    Written by Erik Bosman

  35. Slowing us down
    1. Repeated instructions
    2. Brackets

  36. What if...
    — ... we had instructions that said "increase by 5" instead of "increase" 5 times?
    — ... we had an instruction that said "go to this matching bracket if the current memory cell is zero"?
    Spoiler: we'd be faster!

  37. New Instruction Set
    type InsType byte
    const (
        Plus InsType = '+'
        Minus InsType = '-'
        Right InsType = '>'
        Left InsType = '<'
        PutChar InsType = '.'
        ReadChar InsType = ','
        JumpIfZero InsType = '['
        JumpIfNotZero InsType = ']'
    )
    type Instruction struct {
        Type InsType
        Argument int
    }

  38. New Machine
    type Machine struct {
        code []*Instruction // <--- WOOP WOOP!
        ip int
        memory [30000]int
        dp int
        input io.Reader
        output io.Writer
        readBuf []byte
    }
    func NewMachine(instructions []*Instruction, in io.Reader, out io.Writer) *Machine {
        return &Machine{
            code: instructions,
            input: in,
            output: out,
            readBuf: make([]byte, 1),
        }
    }

  39. Execute - Part 1
    func (m *Machine) Execute() {
        for m.ip < len(m.code) {
            ins := m.code[m.ip]
            switch ins.Type {
            case Plus:
                m.memory[m.dp] += ins.Argument
            case Minus:
                m.memory[m.dp] -= ins.Argument
            case Right:
                m.dp += ins.Argument
            case Left:
                m.dp -= ins.Argument
            // ...
            }
            m.ip++
        }
    }

  40. Execute - Part 2
    func (m *Machine) Execute() {
        // ...
            case PutChar:
                for i := 0; i < ins.Argument; i++ {
                    m.putChar()
                }
            case ReadChar:
                for i := 0; i < ins.Argument; i++ {
                    m.readChar()
                }
        // ...
    }

  41. Execute - Part 3
    func (m *Machine) Execute() {
        // ...
            case JumpIfZero:
                if m.memory[m.dp] == 0 {
                    m.ip = ins.Argument
                    continue
                }
            case JumpIfNotZero:
                if m.memory[m.dp] != 0 {
                    m.ip = ins.Argument
                    continue
                }
            }
            // ...
            m.ip++
        }
    }

  42. But, wait, ... how?

  43. Brainfuck
    |
    ?
    |
    New Instruction Set

  44. Compiler
    Wikipedia says:
    a computer program (or a set of programs) that
    transforms source code written in a programming
    language (the source language) into another
    computer language (the target language)

  45. Let's do this!
    type Compiler struct {
        code string // <--- Brainfuck code
        codeLength int
        position int
        instructions []*Instruction // <--- New instruction set
    }
    func NewCompiler(code string) *Compiler {
        return &Compiler{
            code: code,
            codeLength: len(code),
            instructions: []*Instruction{},
        }
    }

  46. func (c *Compiler) Compile() []*Instruction {
        for c.position < c.codeLength {
            current := c.code[c.position]
            switch current {
            case '+':
                c.CompileFoldableInstruction('+', Plus)
            case '-':
                c.CompileFoldableInstruction('-', Minus)
            case '<':
                c.CompileFoldableInstruction('<', Left)
            case '>':
                c.CompileFoldableInstruction('>', Right)
            case '.':
                c.CompileFoldableInstruction('.', PutChar)
            case ',':
                c.CompileFoldableInstruction(',', ReadChar)
            }
            c.position++
        }
        return c.instructions
    }

  47. func (c *Compiler) CompileFoldableInstruction(char byte, insType InsType) {
        count := 1
        for c.position < c.codeLength-1 && c.code[c.position+1] == char {
            count++
            c.position++
        }
        c.EmitWithArg(insType, count)
    }
    func (c *Compiler) EmitWithArg(insType InsType, arg int) int {
        ins := &Instruction{Type: insType, Argument: arg}
        c.instructions = append(c.instructions, ins)
        return len(c.instructions) - 1
    }
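
The folding step is essentially run-length encoding. Stripped of the compiler struct, it can be sketched as a standalone function. The type `Fold` and the function `FoldRuns` are hypothetical names for this illustration. Brackets and non-instruction bytes are skipped here, since the slides handle loops separately.

```go
package main

import "strings"

// Fold (hypothetical) is one run of a repeated foldable instruction.
type Fold struct {
	Char  byte
	Count int
}

// FoldRuns collapses consecutive identical foldable instructions
// (+ - < > . ,) into a single entry with a repeat count, using the same
// look-ahead loop as CompileFoldableInstruction.
func FoldRuns(code string) []Fold {
	var runs []Fold
	for pos := 0; pos < len(code); pos++ {
		char := code[pos]
		if !strings.ContainsRune("+-<>.,", rune(char)) {
			continue // brackets and comments are not folded in this sketch
		}
		count := 1
		for pos < len(code)-1 && code[pos+1] == char {
			count++
			pos++
		}
		runs = append(runs, Fold{Char: char, Count: count})
	}
	return runs
}
```

For the input `+++[>>.]` this gives three runs: `+` three times, `>` twice, and `.` once.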

  48. Problems when compiling loops
    — NOT foldable. We can't turn [[[]]] into []
    — NOT countable. Instructions in between might change.
    — NOT stateless. We have to remember positions.

  49. Solution
    — "["
        — emit a JumpIfZero instruction
        — Argument will be 0 -- a placeholder value
    — "]"
        — emit JumpIfNotZero with correct argument
        — change JumpIfZero argument to correct position

  50. Solution
    — "["
        — emit a JumpIfZero instruction
        — Argument will be 0 -- a placeholder value
    — "]"
        — emit JumpIfNotZero with correct argument
        — change JumpIfZero argument to correct position
    How do we keep track of JumpIfZero instructions?
    Solution to problem in solution: with a stack!
    Stack, the data structure.
    First in, last out.
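
The stack idea can be tried in isolation before wiring it into the compiler. The sketch below pairs up brackets in raw Brainfuck source. The function `MatchBrackets` and its error reporting for unbalanced input are assumptions of this sketch, not something shown in the talk.

```go
package main

import "fmt"

// MatchBrackets (hypothetical) returns a table that maps the position of
// every bracket in code to the position of its partner.
func MatchBrackets(code string) (map[int]int, error) {
	jumps := make(map[int]int)
	stack := []int{} // positions of '[' that are still waiting for a ']'
	for pos := 0; pos < len(code); pos++ {
		switch code[pos] {
		case '[':
			// Push: remember where this loop started.
			stack = append(stack, pos)
		case ']':
			if len(stack) == 0 {
				return nil, fmt.Errorf("unmatched ']' at position %d", pos)
			}
			// Pop: the most recently opened loop is the one being closed.
			open := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			jumps[open] = pos
			jumps[pos] = open
		}
	}
	if len(stack) != 0 {
		return nil, fmt.Errorf("unmatched '[' at position %d", stack[len(stack)-1])
	}
	return jumps, nil
}
```

For `[[]]` the table pairs position 0 with 3 and position 1 with 2. The innermost loop is always on top of the stack, which is why nesting stops being a problem.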

  51. func (c *Compiler) Compile() []*Instruction {
        loopStack := []int{}
        for c.position < c.codeLength {
            current := c.code[c.position]
            switch current {
            case '[':
                insPos := c.EmitWithArg(JumpIfZero, 0)
                loopStack = append(loopStack, insPos)
            // [...]
            }
            c.position++
        }
        return c.instructions
    }

  52. func (c *Compiler) Compile() []*Instruction {
        // [...]
            case ']':
                // Pop position of last JumpIfZero ("[") instruction off stack
                openInstruction := loopStack[len(loopStack)-1]
                loopStack = loopStack[:len(loopStack)-1]
                // Emit the new JumpIfNotZero ("]") instruction,
                // with correct position as argument
                closeInstructionPos := c.EmitWithArg(JumpIfNotZero, openInstruction)
                // Patch the old JumpIfZero ("[") instruction with new position
                c.instructions[openInstruction].Argument = closeInstructionPos
        // [...]
    }

  53. This really works!
    Input:
    +++[---[+]>>>]<<<
    Output:
    []*Instruction{
        &Instruction{Type: Plus, Argument: 3},
        &Instruction{Type: JumpIfZero, Argument: 7},
        &Instruction{Type: Minus, Argument: 3},
        &Instruction{Type: JumpIfZero, Argument: 5},
        &Instruction{Type: Plus, Argument: 1},
        &Instruction{Type: JumpIfNotZero, Argument: 3},
        &Instruction{Type: Right, Argument: 3},
        &Instruction{Type: JumpIfNotZero, Argument: 1},
        &Instruction{Type: Left, Argument: 3},
    }
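
The output above can be reproduced with a condensed version of the compiler from the previous slides. This sketch makes some simplifications that are not in the talk: it is a single hypothetical function `CompileString` instead of the `Compiler` struct, and it returns instruction values, not pointers. Like the original, it assumes balanced brackets and panics on a stray `]`.

```go
package main

type InsType byte

const (
	Plus          InsType = '+'
	Minus         InsType = '-'
	Right         InsType = '>'
	Left          InsType = '<'
	PutChar       InsType = '.'
	ReadChar      InsType = ','
	JumpIfZero    InsType = '['
	JumpIfNotZero InsType = ']'
)

type Instruction struct {
	Type     InsType
	Argument int
}

// CompileString (hypothetical) folds repeated instructions and resolves
// bracket pairs into jump targets, as described on the slides.
func CompileString(code string) []Instruction {
	var out []Instruction
	var loopStack []int // indexes of JumpIfZero instructions not yet patched
	for pos := 0; pos < len(code); pos++ {
		char := code[pos]
		switch char {
		case '+', '-', '<', '>', '.', ',':
			// Foldable: count how often the instruction repeats.
			count := 1
			for pos < len(code)-1 && code[pos+1] == char {
				count++
				pos++
			}
			out = append(out, Instruction{Type: InsType(char), Argument: count})
		case '[':
			// Emit with placeholder argument 0 and remember the index.
			loopStack = append(loopStack, len(out))
			out = append(out, Instruction{Type: JumpIfZero, Argument: 0})
		case ']':
			// Pop the matching "[" and point both instructions at each other.
			open := loopStack[len(loopStack)-1]
			loopStack = loopStack[:len(loopStack)-1]
			out = append(out, Instruction{Type: JumpIfNotZero, Argument: open})
			out[open].Argument = len(out) - 1
		}
	}
	return out
}
```

Calling `CompileString("+++[---[+]>>>]<<<")` produces nine instructions with the same types and arguments as the listing above.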

  54. How much faster does this
    make my production Brainfuck
    code?

  55. $ time ./machine ./mandelbrot.b >/dev/null
    13.43s user 0.04s system 99% cpu 13.496 total

  56. $ time ./machine ./mandelbrot.b >/dev/null
    13.43s user 0.04s system 99% cpu 13.496 total
    13.496 total!
    before: 1:08.60 total!

  57. Why am I here?

  58. — Instruction Set
    — The Switch
    — Virtual Machine
    — Bytecode Compiler
    Not too bad, right?

  59. Make Eric proud!
    — brainfuck optimization strategies: http://calmerthanyouare.org/2015/01/07/optimizing-brainfuck.html
    — Hello, JIT World: The Joy Of Simple JITs: http://blog.reverberate.org/2012/12/hello-jit-world-joy-of-simple-jits.html
    — interpreter, compiler, jit: https://nickdesaulniers.github.io/blog/2015/05/25/interpreter-compiler-jit/
    — an optimized brainfuck compiler written in sed: https://github.com/stedolan/bf.sed
    — the original Brainfuck distribution: https://gist.github.com/rdebath/0ca09ec0fdcf3f82478f
    — there's much, much more
