Writing a Tiny Compiler

Given at try! Swift NYC on September 1, 2016.
http://www.tryswiftnyc.com
Code at https://github.com/segiddins/Sipquick

Samuel E. Giddins

September 01, 2016

Transcript

  1. Writing a Tiny Compiler Samuel Giddins

  2. Follow Along https://github.com/segiddins/Sipquick

  3. Follow Along

    $ cloc Sipquick
    -------------------------------------------------------------------------------
    Language                     files          blank        comment           code
    -------------------------------------------------------------------------------
    Swift                            6             43              0            310
    -------------------------------------------------------------------------------
    SUM:                             6             43              0            310
    -------------------------------------------------------------------------------
  4. Why write a compiler?

  5. Why write a compiler?

    → Better idea of how compilers work
    → It's a well-known problem domain
    → It's doable in any language
    → I've never written one before
  6. What is a compiler?

    let compiler: (String) -> Data /* Executable */
  7. What is a compiler?

    A compiler transforms source into an executable
  8. What is a compiler?

    A compiler transforms a program written in one language into another language
  9. How do I write a compiler?

  10. How do I write a compiler?

    → Parse
    → Lex
    → Semantic Analysis
    → Optimize
    → Optimize
    → Generate Code
  11. Compilers are highly functional
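
One way to read "highly functional" is that every stage is a plain function and the compiler is their composition. The sketch below illustrates that reading and is not the Sipquick source: `Sexp` and `Expression` are placeholder stand-ins, the stage bodies are toy implementations, and `parse` and `optimize` are assumed names. Only the shapes of `sema`, `codeGen`, and `compiler` come from the slides.

```swift
import Foundation

// Placeholder types, assumed for illustration; the real Sipquick types are richer.
struct Sexp { let text: String }
struct Expression { let source: String }

// Each stage is a plain function value.
let parse: (String) -> [Sexp] = { source in
    source.split(separator: "\n").map { Sexp(text: String($0)) }
}
let sema: ([Sexp]) -> [Expression] = { sexps in
    sexps.map { Expression(source: $0.text) }
}
let optimize: ([Expression]) -> [Expression] = { $0 }  // identity: no optimizations
let codeGen: ([Expression]) -> Data = { expressions in
    Data(expressions.map { $0.source }.joined(separator: "\n").utf8)
}

// The whole compiler is just the stages applied in order.
let compiler: (String) -> Data = { source in
    codeGen(optimize(sema(parse(source))))
}
```

Because each stage only depends on its input, any one of them can be replaced or tested on its own.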

  12. Sipquick

    noun: a small, energetic bird in The Wise Man's Fear
  13. Sipquick

    A small language invented for the purpose of this talk.
    → S-expressions
    → Dynamic
    → Functional-ish
  14. Sipquick

    (def hello name (+ "Hello, " name))
    (print (hello "world!"))
  15. Parser

    A parser-combinator, taking several ideas from Yasuhiro Inami's try! Swift Tokyo talk.
  16. Parser

    let schemeParser: Parser<String, [Sexp]> = {
        return ignoreSecond(
            sexp()
                .then(unichar("\n").maybe())
                .maybeMany()
                .fmap { $0.map { $0.0 } }
                .then(dropEndingMetadata().or(empty().andThatsAll())))
    }
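
For readers who have not met parser combinators, here is a minimal sketch of the idea behind `then`, `or`, `maybeMany`, and `fmap`. It is an assumption-laden simplification, not the Sipquick implementation: the slide's `Parser` has two type parameters, while this one has a single `Output` parameter and always reads from a `Substring`, and `char` is a hypothetical stand-in for `unichar`.

```swift
// A parser consumes a prefix of the input and returns a value plus the
// remaining input, or nil when it does not match.
struct Parser<Output> {
    let run: (Substring) -> (Output, Substring)?
}

// Hypothetical primitive: match exactly one expected character.
func char(_ expected: Character) -> Parser<Character> {
    return Parser<Character> { input in
        guard let first = input.first, first == expected else { return nil }
        return (first, input.dropFirst())
    }
}

extension Parser {
    // Transform the parsed value without touching the input.
    func fmap<T>(_ transform: @escaping (Output) -> T) -> Parser<T> {
        return Parser<T> { input in
            guard let (value, rest) = self.run(input) else { return nil }
            return (transform(value), rest)
        }
    }

    // Run this parser, then `next` on whatever input is left.
    func then<Next>(_ next: Parser<Next>) -> Parser<(Output, Next)> {
        return Parser<(Output, Next)> { input in
            guard let (a, afterA) = self.run(input),
                  let (b, afterB) = next.run(afterA) else { return nil }
            return ((a, b), afterB)
        }
    }

    // Try this parser; fall back to `other` on failure.
    func or(_ other: Parser<Output>) -> Parser<Output> {
        return Parser<Output> { input in self.run(input) ?? other.run(input) }
    }

    // Zero or more repetitions. Assumes the parser consumes input on success,
    // otherwise this loop would never end.
    func maybeMany() -> Parser<[Output]> {
        return Parser<[Output]> { input in
            var values: [Output] = []
            var rest = input
            while let (value, next) = self.run(rest) {
                values.append(value)
                rest = next
            }
            return (values, rest)
        }
    }
}
```

With these pieces, `char("a").or(char("b")).maybeMany().fmap { String($0) }` reads any run of `a` and `b` characters and returns it as a `String`.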
  17. Lexer

    We turn this mess of Parser generics into our AST (in the form of sexps) inline with the parsing.
  18. Lexer

    _.fmap { Sexp.single($0) }
    _.fmap { Sexp.many($0) }
    _.fmap { _ in Sexp.none }
  19. Semantic Analysis let sema: ([Sexp]) -> [Expression]

  20. Semantic Analysis

    struct Expression {
        init(sexp: Sexp) {
            switch sexp { ... }
        }
    }
  21. Semantic Analysis

    Figuring out what our expressions actually express. In most compilers, this stage would guarantee the semantics of your program are somehow "well-formed".
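
To make "figuring out what our expressions actually express" concrete, here is a hedged sketch of how an `init(sexp:)` might classify S-expressions. The case names of `Sexp`, the keywords `def` and `condition`, and the `kind`/`args`/`children` properties appear in the talk. Everything else is an assumption: the payload types of `Sexp`, the `atom` helper, and the classification rules themselves. Variable declarations are left out because the talk does not show their syntax.

```swift
// Assumed payloads for the Sexp cases named on the Lexer slide.
enum Sexp {
    case single(String)
    case many([Sexp])
    case none

    // Hypothetical helper: the text of a bare atom, or nil for anything else.
    var atom: String? {
        if case .single(let text) = self { return text }
        return nil
    }
}

struct Expression {
    enum Kind { case bare, call, functionDefinition, empty, conditional }
    let kind: Kind
    let args: [String]
    let children: [Expression]
}

extension Expression {
    init(sexp: Sexp) {
        switch sexp {
        case .none:
            self.init(kind: .empty, args: [], children: [])
        case .single(let text):
            self.init(kind: .bare, args: [text], children: [])
        case .many(let items):
            // A list must start with an atom naming what it is.
            guard let head = items.first?.atom else {
                self.init(kind: .empty, args: [], children: [])
                return
            }
            let rest = Array(items.dropFirst())
            switch head {
            case "def":
                // Leading atoms are the function name and its parameters;
                // whatever follows is the body. (A body that is itself a bare
                // atom would be misread as a parameter in this sketch.)
                let names = rest.prefix(while: { $0.atom != nil }).compactMap { $0.atom }
                let body = rest.dropFirst(names.count).map(Expression.init(sexp:))
                self.init(kind: .functionDefinition, args: names, children: body)
            case "condition":
                self.init(kind: .conditional, args: [], children: rest.map(Expression.init(sexp:)))
            default:
                self.init(kind: .call, args: [head], children: rest.map(Expression.init(sexp:)))
            }
        }
    }
}
```

Applied to `(def hello name (+ "Hello, " name))`, this yields a function definition with `args == ["hello", "name"]` and a single child that is a call to `+`.
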
  22. Optimization let optimizations: Array<([Expression]) -> [Expression]>

  23. Optimization The Sipquick compiler does none.
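
Even with no optimizations, the "array of passes" shape is easy to wire up: fold the array over the expression list. The sketch below assumes a cut-down `Expression` and an invented `dropEmpty` pass, purely to show the mechanism. Neither `Pass`, `dropEmpty`, nor `optimize` is part of Sipquick.

```swift
// Cut-down stand-in for the talk's Expression type (assumption).
struct Expression: Equatable {
    enum Kind { case bare, call, empty }
    let kind: Kind
    let args: [String]
}

typealias Pass = ([Expression]) -> [Expression]

// Hypothetical pass: drop expressions that would generate no code.
let dropEmpty: Pass = { expressions in
    expressions.filter { $0.kind != .empty }
}

// With an empty array here, optimize(_:) is the identity function.
let optimizations: [Pass] = [dropEmpty]

// Apply every pass in order, feeding each one the previous result.
func optimize(_ expressions: [Expression]) -> [Expression] {
    return optimizations.reduce(expressions) { current, pass in pass(current) }
}
```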

  24. Code Generation let codeGen: ([Expression]) -> Data

  25. Code Generation Transforming expressions into something that is loosely "executable".

  26. Code Generation

    In my opinion, the hardest part of a compiler. Especially if you're not compiling something like C.
  27. None
  28. Code Generation

    I chose to target Ruby, since it's dynamic, has a robust standard library, and I know it better than is healthy.
  29. Code Generation

    extension Expression {
        func asRuby() -> String {
            switch self.type { ... }
        }
    }
  30. Code Generation

    The gist of our code gen step is that we map Expressions into Ruby code snippets. After joining them together, adding a shebang, and running chmod, we have a file we can run.
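
The "join, add a shebang, chmod" step can be sketched with Foundation alone. This is an assumption about how such a step could look, not the Sipquick source: `writeExecutable(snippets:to:)` is a hypothetical helper, and `snippets` stands in for the strings returned by `asRuby()` for each expression.

```swift
import Foundation

// Hypothetical helper: write the generated Ruby to `path` and mark it executable.
func writeExecutable(snippets: [String], to path: String) throws {
    // Shebang first, then one snippet per line.
    let script = (["#!/usr/bin/env ruby"] + snippets).joined(separator: "\n") + "\n"
    try script.write(toFile: path, atomically: true, encoding: .utf8)
    // Same effect as running `chmod 755` on the file.
    try FileManager.default.setAttributes([.posixPermissions: 0o755], ofItemAtPath: path)
}
```

After this call the output can be launched directly (for example `./fibonacci`), provided `ruby` is on the `PATH`.
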
  31. Code Generation

    ; fibonacci.spq
    (def fib x
      (condition (<= x 1)
        x
        (+ (fib (- x 1)) (fib (- x 2)))))
    (print (fib 10))

    #!/usr/bin/env ruby
    def fib(x)
      if (x.<=(1))
        x
      else
        (fib((x.-(1))).+(fib((x.-(2)))))
      end
    end
    print(fib(10))
  32. Code Generation

    extension Expression {
        func parenthesize(_ s: String) -> String {
            return "(\(s))"
        }

        var isOperatorCall: Bool {
            switch kind {
            case .call:
                guard let funcName = args.first else { return false }
                switch funcName {
                case "+", "-", "*", "%", "/", "||", "&&", "&", "|", "==", ">", "<", ">=", "<=", "[]", "..":
                    return true
                default:
                    return false
                }
            default:
                return false
            }
        }

        func asRuby(depth: Int = 0) -> String {
            let indent = String(repeating: " ", count: 2 * depth)
            switch kind {
            case .bare:
                return args.joined(separator: " ")
            case .call:
                if isOperatorCall {
                    let op = args.first!
                    let rec = children.first!
                    let opArgs = children.dropFirst()
                    return parenthesize(rec.asRuby(depth: depth + 1)) + ".\(op)" +
                        parenthesize(opArgs.map { $0.asRuby(depth: depth + 1) }.joined(separator: ", "))
                } else {
                    return args.joined(separator: " ") + "(" +
                        children.map { "(\($0.asRuby(depth: depth + 1)))" }.joined(separator: ",\n\(indent)") + ")"
                }
            case .functionDefinition:
                let name = args.first!
                let argNames = args.dropFirst()
                return "def \(name)(\(argNames.joined(separator: ", ")))\n" +
                    children.map { indent + $0.asRuby(depth: depth + 1) }.joined(separator: "\n") + "\nend"
            case .empty:
                return ""
            case .variableDeclaration:
                let varName = args.joined(separator: " ")
                return "\(varName) = (\(children.map {$0.asRuby(depth: depth + 1)}.joined(separator: ", ")))"
            case .conditional:
                guard children.count == 3 else {
                    fatalError("a conditional must have exactly three arguments")
                }
                let conditional = children[0]
                let positive = children[1]
                let negative = children[2]
                return "if \(conditional.asRuby(depth: depth + 1))\n\(indent) \(positive.asRuby(depth: depth + 1))\nelse\n\(indent) \(negative.asRuby(depth: depth + 1))\nend"
            }
        }
    }
  33. Code Generation

    extension Expression {
        func parenthesize(_ s: String) -> String
        var isOperatorCall: Bool { get }
        func asRuby(depth: Int = 0) -> String {
            switch kind {
            case .bare: { }
            case .call:
                if isOperatorCall { } else { }
            case .functionDefinition: { }
            case .empty: { }
            case .variableDeclaration: { }
            case .conditional:
                guard children.count == 3 else { }
                let conditional = children[0]
                let positive = children[1]
                let negative = children[2]
                { }
            }
        }
    }
  34. Using The Compiler

    $ cat fibonacci.spq
    (def fib x
      (condition (<= x 1)
        x
        (+ (fib (- x 1)) (fib (- x 2)))))
    (puts (fib ([] ARGV 0)))
    $ sipquick fibonacci.spq fibonacci
    $ ./fibonacci 10
    55
  35. Testing The Compiler

  36. Testing The Compiler

    → Integration tests
    → Compile && Run && Verify
    → Only testing positive cases
    → 0 unit test coverage
  37. Testing The Compiler

    (def print_even x
      (condition (== (% x 2) 0)
        (print "even")
        (print "odd")))
    (print_even 17)
    (print_even 12)
    (print_even -1)
    (print_even 1)
    (print_even (* 0 1))
    /////
    it allows branching
    0
    oddevenoddoddeven
  38. Testing The Compiler

    import Foundation.NSString

    struct Test {
        let script: String
        let name: String
        let expectedOutput: String
        let expectedExit: Int

        init(script: String) {
            self.script = script
            let contents = try! String.init(contentsOfFile: script)
            let metadata = contents.components(separatedBy: "/////\n")[1].components(separatedBy: "\n")
            self.name = metadata[0]
            self.expectedExit = Int(metadata[1])!
            self.expectedOutput = metadata.dropFirst(2).joined(separator: "\n").trimmingCharacters(in: .whitespacesAndNewlines)
        }

        func run() -> (Bool, String) {
            let (compileOutput, compileStatus) = sipquick_test
                .run(path: sipquick_path, arguments: [script, "/private/var/tmp/sipquick-test \(name).exe"])
            guard compileStatus == 0 else {
                return (false, "failed to compile \(name):\n\(compileOutput)")
            }
            let (output, status) = sipquick_test
                .run(path: "/private/var/tmp/sipquick-test \(name).exe", arguments: [])
            let success = output == expectedOutput && status == expectedExit
            let errorMessage = "failed \(name): got \(output.debugDescription) (\(status)), expected \(expectedOutput.debugDescription) (\(expectedExit))"
            return (success, errorMessage)
        }
    }
  39. Testing The Compiler

    let sipquick_path = String(CommandLine.arguments[0].characters.dropLast(5))
    let specDirectory = "/Users/segiddins/Desktop/Sipquick/sipquick-spec/"
    let specFiles = try! FileManager().contentsOfDirectory(atPath: specDirectory)
        .filter { $0.hasSuffix(".spq") }
        .map { specDirectory + $0 }
    let tests = specFiles.map(Test.init)
    let failures = tests.map { $0.run() }.filter { $0.0 == false }
    if failures.isEmpty { exit(EXIT_SUCCESS) }
    failures.map { $0.1 }.forEach { print($0) }
    exit(EXIT_FAILURE)
  40. Testing the Compiler

    (def fizzbuzz x
      (condition (<= x 0)
        (return "")
        (condition (== 0 (% x 15))
          (+ (fizzbuzz (- x 1)) "fizzbuzz")
          (condition (== 0 (% x 3))
            (+ (fizzbuzz (- x 1)) "fizz")
            (condition (== 0 (% x 5))
              (+ (fizzbuzz (- x 1)) "buzz")
              (+ (fizzbuzz (- x 1)) (String x)))))))
    (def fetch_arg position default
      (condition (!= nil ([] ARGV position))
        ([] ARGV position)
        (return default)))
    (print (fizzbuzz (fetch_arg 0 100)))
    /////
    it computes fizzbuzz
    0
    12fizz4buzzfizz78fizzbuzz11fizz1314fizzbuzz1617fizz19buzzfizz2223fizzbuzz26fizz2829fizzbuzz3132fizz34buzzfizz3738fizzbuzz41fizz4344fizzbuzz4647fizz49buzzfizz5253fizzbuzz56fizz5859fizzbuzz6162fizz64buzzfizz6768fizzbuzz71fizz7374fizzbuzz7677fizz79buzzfizz8283fizzbuzz86fizz8889fizzbuzz9192fizz94buzzfizz9798fizzbuzz
  41. TODO

    → Refactor Expression to be an enum
    → Add optimizations
    → Allow defining new variables in function scope
    → Add parsing error messages
    → Add semantic analysis errors
    → Compile to machine code
  42. TODO

    → Implement a proper standard library
    → Actually implement comment parsing
    → Add lambdas
    → Multiple-expression expressions
  43. Lessons Learned

  44. Lessons Learned

    → Writing a compiler is hard
    → Testing a compiler is really, really, really necessary
    → String parsing needs a better interface
    → Error messages are hard
    → The LLVM API is meant for typed languages
    → Implementing your own language is super rewarding
  45. Lessons Learned

    "Real" programming languages are far superior to anything I can write in a weekend. They take expertise and time and care and discipline. I'll keep that in mind next time I want to complain about swiftc.
  46. Thank You! @segiddins