Parser Combinator in Swift

Yasuhiro Inami

March 04, 2016

Transcript

  1. Parser
    Combinator
    in Swift
    2016/03/02-04 try! Swift (#tryswiftconf)
    Yasuhiro Inami / @inamiy

  3. Parser
    Combinator

  4. Parser
    Takes input data (frequently text) and
    builds a data structure, e.g. abstract
    syntax tree (AST)
    • Lexer
    • Strings/Bytes → Tokens
    • Parser
    • Tokens → Syntax Tree
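
To make the lexer step (Strings → Tokens) concrete, here is a minimal sketch in current Swift syntax. The `Token` type and the `tokenize` function are hypothetical names chosen for illustration; they do not appear in the slides.

```swift
// Hypothetical token type for illustration; not from the slides.
enum Token: Equatable {
    case number(Int)
    case symbol(Character)
}

// Lexer step: turns a string into tokens.
// Runs of ASCII digits become one number token, whitespace is dropped,
// and every other character becomes a symbol token.
func tokenize(_ input: String) -> [Token] {
    var tokens: [Token] = []
    var digits = ""

    // Emits the pending run of digits (if any) as a single number token.
    func flush() {
        if let n = Int(digits) {
            tokens.append(.number(n))
        }
        digits = ""
    }

    for c in input {
        if c.isASCII && c.isNumber {
            digits.append(c)
        } else {
            flush()
            if !c.isWhitespace {
                tokens.append(.symbol(c))
            }
        }
    }
    flush()
    return tokens
}

let tokens = tokenize("12 + 3")
```

A parser would then consume `tokens` instead of raw characters. The parser combinators later in this deck skip the separate lexer and work on the string directly.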

  7. Bottom-Up

  10. Top-Down

  11. Parsing Algorithms
    • Bottom-Up
    • Operator Precedence Parsing
    • LR(k) (SLR / LALR / etc)
    • Top-Down
    • Recursive-descent / LL(k) (Table-driven)
    • Predictive / Backtracking
    • Packrat (Recursive-descent + memoization)

  12. Combinator
    • Lambda expression without free
    variables
    • I = λx.x = { x in x }
    • K = λxy.x = { x, y in x }
    • S = λxyz.xz(yz)
    = { x, y, z in x(z)(y(z))}
    • Y = S(K(SII))(S(S(KS)K)(K(SII)))
    • ι = λx.xSK
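
The first three combinators can be given types in Swift. The following is a sketch in current Swift syntax, using curried functions (the closures on the slide are uncurried); the helper names `kx`, `ky` and `skk` are made up for this example. It also checks the classic identity that S K K behaves like I.

```swift
// I = λx.x
func I<A>(_ x: A) -> A {
    return x
}

// K = λxy.x, curried: takes x, returns a function that ignores y.
func K<A, B>(_ x: A) -> (B) -> A {
    return { _ in x }
}

// S = λxyz.xz(yz), curried.
func S<A, B, C>(_ x: @escaping (A) -> (B) -> C) -> (@escaping (A) -> B) -> (A) -> C {
    return { y in { z in x(z)(y(z)) } }
}

// S K K z = K z (K z) = z, so S K K acts as the identity.
// The two uses of K need different types, hence two bindings.
let kx: (Int) -> ((Int) -> Int) -> Int = { a in K(a) }
let ky: (Int) -> (Int) -> Int = { a in K(a) }
let skk = S(kx)(ky)
```

Calling `skk(7)` evaluates `kx(7)(ky(7))`, which discards its second argument and returns `7`.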

  13. Parser Combinator
    Higher-order function on parsers:
    • Input: Parser(s)
    • Output: New Parser
    Combining simple parsers to construct
    more complex parsers
    → Functional Programming approach

  14. Simple Parser
    in Swift

  15. Parser Monad
    struct Parser<A> {
        let parse: String -> (A, String)?
    }
    • Container of state-transforming function
    • Input: String
    • Output: Optional tuple of "output" & "remaining input"
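
As a rough illustration of the state-transforming function described above, here is a minimal sketch in current Swift syntax (the slides use Swift 2, where function types were written without parentheses). `anyChar` is a hypothetical example parser, not part of the slides.

```swift
// A parser wraps a function from input to an optional
// pair of (parsed value, remaining input).
struct Parser<A> {
    let parse: (String) -> (A, String)?
}

// Hypothetical example: consumes the first character of the input,
// or fails (returns nil) when the input is empty.
let anyChar = Parser<Character> { input in
    guard let head = input.first else { return nil }
    return (head, String(input.dropFirst()))
}

let result = anyChar.parse("abc")
```

Here `result` holds the character `a` together with the remaining input `"bc"`, while `anyChar.parse("")` returns `nil`.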

  16. // Applicative pure
    func pure<A>(a: A) -> Parser<A> {
        return Parser { (a, $0) }
    }
    // Alternative empty
    func empty<A>() -> Parser<A> {
        return Parser { _ in nil }
    }

  17. // Monad bind (>>=)
    func >>- <A, B>(p: Parser<A>, f: A -> Parser<B>) -> Parser<B> {
        return Parser { input in
            if let (result, input2) = p.parse(input) {
                return f(result).parse(input2)
            } else {
                return nil
            }
        }
    }
    // Functor fmap (<$>)
    func <^> <A, B>(f: A -> B, p: Parser<A>) -> Parser<B> {
        return p >>- { a in pure(f(a)) }
    }

  18. // Alternative choice (associative operation)
    func <|> <A>(p: Parser<A>, q: Parser<A>) -> Parser<A> {
        return Parser { input in
            if let (result, input2) = p.parse(input) {
                return (result, input2)
            } else {
                return q.parse(input)
            }
        }
    }
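
To show how bind and choice fit together, here is a self-contained sketch in current Swift syntax. The precedence group names, the `char` helper and the example parser `ab` are assumptions made for this illustration; the slides do not show how the operators are declared.

```swift
struct Parser<A> {
    let parse: (String) -> (A, String)?
}

// Operator declarations (names and relative precedence are assumptions).
precedencegroup BindPrecedence {
    associativity: left
    higherThan: AssignmentPrecedence
}
precedencegroup ChoicePrecedence {
    associativity: right
    higherThan: BindPrecedence
}
infix operator >>- : BindPrecedence
infix operator <|> : ChoicePrecedence

// Succeeds with `a` without consuming any input.
func pure<A>(_ a: A) -> Parser<A> {
    return Parser { (a, $0) }
}

// Monad bind: run p, then feed its result to f and run the parser f returns.
func >>- <A, B>(p: Parser<A>, f: @escaping (A) -> Parser<B>) -> Parser<B> {
    return Parser { input in
        guard let (result, rest) = p.parse(input) else { return nil }
        return f(result).parse(rest)
    }
}

// Choice: try p; if it fails, try q on the same input (backtracking).
func <|> <A>(p: Parser<A>, q: Parser<A>) -> Parser<A> {
    return Parser { input in
        if let result = p.parse(input) { return result }
        return q.parse(input)
    }
}

// Hypothetical helper: matches exactly the character c.
func char(_ c: Character) -> Parser<Character> {
    return Parser { input in
        guard let head = input.first, head == c else { return nil }
        return (head, String(input.dropFirst()))
    }
}

// "a" followed by either "b" or "c", returned as a two-character string.
let ab: Parser<String> = char("a") >>- { x in
    (char("b") <|> char("c")) >>- { y in pure("\(x)\(y)") }
}
```

With these definitions, `ab.parse("acd")` yields `"ac"` with `"d"` left over, and `ab.parse("ad")` fails because neither alternative matches `d`.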

  19. // Applicative sequential application
    func <*> <A, B>(p: Parser<A -> B>, q: Parser<A>) -> Parser<B> {
        return p >>- { f in f <^> q }
    }
    // Sequence actions, discarding the value of the second argument
    func <* <A, B>(p: Parser<A>, q: Parser<B>) -> Parser<A> {
        return const <^> p <*> q
    }
    // Sequence actions, discarding the value of the first argument
    func *> <A, B>(p: Parser<A>, q: Parser<B>) -> Parser<B> {
        return const(id) <^> p <*> q
    }

  20. Parse 1 character
    func satisfy(predicate: Character -> Bool) -> Parser<Character> {
        return Parser { input in
            if let (head, tail) = uncons(input) where predicate(head) {
                // output first, remaining input second
                return (head, tail)
            } else {
                return nil
            }
        }
    }

  21. Parse 1 character
    func any() -> Parser<Character> {
        return satisfy(const(true))
    }
    func digit() -> Parser<Character> {
        return satisfy { "0"..."9" ~= $0 }
    }
    func char(c: Character) -> Parser<Character> {
        return satisfy { $0 == c }
    }

  22. Parse string
    func string(str: String) -> Parser<String> {
        if let (head, tail) = uncons(str) {
            return char(head) *> string(tail) *> pure(str)
        } else {
            return pure("")
        }
    }

  23. Combine parsers
    func many<A>(p: Parser<A>) -> Parser<[A]> {
        return many1(p) <|> pure([])
    }
    func many1<A>(p: Parser<A>) -> Parser<[A]> {
        return cons <^> p <*> many(p)
    }
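
Read literally, with eagerly evaluated arguments, `many` and `many1` call each other while the parser is still being built, so some form of laziness is needed in a strict language. The sketch below, in current Swift syntax, is one possible way to do that and is not the slides' implementation: it builds the recursive parser inside the `parse` closure, so the recursion only happens while parsing.

```swift
struct Parser<A> {
    let parse: (String) -> (A, String)?
}

// Parses one character that satisfies the predicate.
func satisfy(_ predicate: @escaping (Character) -> Bool) -> Parser<Character> {
    return Parser { input in
        guard let head = input.first, predicate(head) else { return nil }
        return (head, String(input.dropFirst()))
    }
}

func digit() -> Parser<Character> {
    return satisfy { ("0"..."9").contains($0) }
}

// One or more repetitions of p.
func many1<A>(_ p: Parser<A>) -> Parser<[A]> {
    return Parser { input in
        guard let (first, rest) = p.parse(input) else { return nil }
        // many(p) is only constructed here, when many1 actually runs.
        guard let (others, rest2) = many(p).parse(rest) else { return nil }
        return ([first] + others, rest2)
    }
}

// Zero or more repetitions of p: falls back to an empty result
// without consuming input when many1 fails.
func many<A>(_ p: Parser<A>) -> Parser<[A]> {
    return Parser { input in
        if let result = many1(p).parse(input) { return result }
        return ([], input)
    }
}

let digits = many(digit()).parse("12a")
```

As with most `many` combinators, `p` is expected to consume input when it succeeds; otherwise the repetition would never terminate.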

  24. Combine parsers
    func skipMany<A>(p: Parser<A>) -> Parser<()> {
        return skipMany1(p) <|> pure(())
    }
    func skipMany1<A>(p: Parser<A>) -> Parser<()> {
        return p *> skipMany(p)
    }
    func skipSpaces() -> Parser<()> {
        return skipMany(space)
    }

  25. Parse tokens
    func symbol(str: String) -> Parser<String> {
        return skipSpaces() *> string(str) <* skipSpaces()
    }
    func natural() -> Parser<Int> {
        return skipSpaces() *>
            ({ Int(String($0))! } <^> many1(digit()))
            <* skipSpaces()
    }

  26. Let's play!

  27. Simple Arithmetics
    Backus–Naur Form (BNF)
    <expr>   ::= <term> + <expr> | <term>
    <term>   ::= <factor> * <term> | <factor>
    <factor> ::= ( <expr> ) | <number>
    <number> ::= '0' | '1' | '2' | ...

  28. Simple Arithmetics
    func expr() -> Parser<Int> {
        return term() >>- { t in // uses right recursion
            (symbol("+") *> expr() >>- { e in pure(t + e) }) <|> pure(t)
        }
    }
    func term() -> Parser<Int> {
        return factor() >>- { f in
            (symbol("*") *> term() >>- { t in pure(f * t) }) <|> pure(f)
        }
    }
    func factor() -> Parser<Int> {
        return (symbol("(") *> expr() <* symbol(")")) <|> natural()
    }

  29. let (ans, _) = expr().parse(" ( 12 + 3 ) * 4+5")!
    expect(ans) == 65
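
For readers who want to run this example with a current Swift toolchain, here is a self-contained sketch of the arithmetic parser. It makes several assumptions that are not in the slides: the precedence group names, `@autoclosure` on the right-hand operands (so that `factor()` does not call `expr()` eagerly while the parser is being built), and simplified `symbol` / `natural` helpers that skip only leading spaces.

```swift
struct Parser<A> {
    let parse: (String) -> (A, String)?
}

// Operator declarations (names and relative precedence are assumptions).
precedencegroup BindPrecedence {
    associativity: left
    higherThan: AssignmentPrecedence
}
precedencegroup ChoicePrecedence {
    associativity: right
    higherThan: BindPrecedence
}
precedencegroup SequencePrecedence {
    associativity: left
    higherThan: ChoicePrecedence
}
infix operator >>- : BindPrecedence
infix operator <|> : ChoicePrecedence
infix operator *> : SequencePrecedence
infix operator <* : SequencePrecedence

func pure<A>(_ a: A) -> Parser<A> {
    return Parser { (a, $0) }
}

func >>- <A, B>(p: Parser<A>, f: @escaping (A) -> Parser<B>) -> Parser<B> {
    return Parser { input in
        guard let (a, rest) = p.parse(input) else { return nil }
        return f(a).parse(rest)
    }
}

// Right-hand parsers are autoclosures so recursive grammars are built lazily.
func <|> <A>(p: Parser<A>, q: @autoclosure @escaping () -> Parser<A>) -> Parser<A> {
    return Parser { input in
        if let result = p.parse(input) { return result }
        return q().parse(input)
    }
}

// Run p then q, keeping q's value.
func *> <A, B>(p: Parser<A>, q: @autoclosure @escaping () -> Parser<B>) -> Parser<B> {
    return p >>- { _ in q() }
}

// Run p then q, keeping p's value.
func <* <A, B>(p: Parser<A>, q: @autoclosure @escaping () -> Parser<B>) -> Parser<A> {
    return p >>- { a in q() >>- { _ in pure(a) } }
}

// Simplified token parsers: skip leading spaces, then match.
func symbol(_ str: String) -> Parser<String> {
    return Parser { input in
        let rest = input.drop(while: { $0 == " " })
        guard rest.starts(with: str) else { return nil }
        return (str, String(rest.dropFirst(str.count)))
    }
}

func natural() -> Parser<Int> {
    return Parser { input in
        let rest = input.drop(while: { $0 == " " })
        let digits = rest.prefix(while: { ("0"..."9").contains($0) })
        guard let n = Int(digits) else { return nil }
        return (n, String(rest.dropFirst(digits.count)))
    }
}

func expr() -> Parser<Int> {
    return term() >>- { (t: Int) -> Parser<Int> in
        let sum: Parser<Int> = symbol("+") *> expr() >>- { (e: Int) in pure(t + e) }
        return sum <|> pure(t)
    }
}

func term() -> Parser<Int> {
    return factor() >>- { (f: Int) -> Parser<Int> in
        let product: Parser<Int> = symbol("*") *> term() >>- { (t: Int) in pure(f * t) }
        return product <|> pure(f)
    }
}

func factor() -> Parser<Int> {
    let parenthesized: Parser<Int> = symbol("(") *> expr() <* symbol(")")
    return parenthesized <|> natural()
}

let answer = expr().parse(" ( 12 + 3 ) * 4+5")
```

The explicit closure types in `expr` and `term` are there only to keep type checking simple. The grammar is the same right-recursive one as on the slide, so the input above evaluates to `(12 + 3) * 4 + 5 = 65`.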

  31. More?

  32. TryParsec
    https://github.com/inamiy/TryParsec

  33. TryParsec
    • Monadic Parser Combinator for ✨✨ try! Swift ✨✨
    • Inspired by Haskell's Attoparsec / Aeson
    • Supports CSV / XML / JSON
    • LL(*) with backtracking by default
    • Doesn't try, but please try! :)

  34. TryParsec
    • Basic Operators: >>-, <^>, <*>, *>, <*, <|>, <?>
    • Combinators: many, many1, manyTill, zeroOrOne,
    skipMany, skipMany1, sepBy, sepBy1, sepEndBy, sepEndBy1,
    count, chainl, chainl1, chainr, chainr1
    • Text (UnicodeScalarView): peek, endOfInput, satisfy,
    skip, skipWhile, take, takeWhile, any, char, not, string,
    asciiCI, oneOf, noneOf, space, skipSpaces, number... (etc)

  35. Parse JSON

  36. enum JSON
    enum JSON {
        case String(Swift.String)
        case Number(Double)
        case Bool(Swift.Bool)
        case Null
        case Array([JSON])
        case Object([Swift.String : JSON])
    }

  37. JSON example
    {
    "string": "hello",
    "num": 123.45,
    "bool": true,
    "null": null,
    "array": ["hello", 9.99, false],
    "dict": { "key": "value" },
    "object": { "enabled": true }
    }

  38. Parse JSON (to AST)
    let json = parseJSON(jsonString).value
    print(json)
    Result:
    JSON.Object(["null": .Null, "num": .Number(123.45),
    "bool": .Bool(true), "string": .String("hello"),
    "array": .Array([.String("hello"), .Number(9.99), .Bool(false)]),
    "dict": .Object(["key": .String("value")]),
    "object": .Object(["enabled": .Bool(true)])
    ])

  39. JSON Decoding

  40. struct Model
    struct Model {
        let string: String
        let num: Double
        let bool: Bool
        let null: Any?
        let array: [Any]
        let dict: [String : Any]
        let subModel: SubModel
        let dummy: Bool? // doesn't exist in raw JSON
    }

  41. FromJSON (Protocol)
    extension Model: FromJSON {
    static func fromJSON(json: JSON) -> Result {
    return fromJSONObject(json) {
    curry(self.init)
    <^> $0 !! "string"
    <*> $0 !! "num"
    <*> $0 !! "bool"
    <*> $0 !! "null"
    <*> $0 !! "array"
    <*> $0 !! "dict"
    <*> $0 !! "object" // mapping to SubModel
    <*> $0 !? "dummy" // doesn't exist in raw JSON
    }
    }
    }

  42. Decode from JSON String
    let model = decode(jsonString).value
    print(model)
    Result:
    Model(string: "hello", num: 123.45, bool: true, null: nil,
    array: ["hello", 9.99, false],
    dict: ["key": "value"],
    subModel: SubModel(enabled: true),
    dummy: nil)

  43. JSON Encoding

  44. ToJSON (Protocol)
    extension Model: ToJSON {
        static func toJSON(model: Model) -> JSON {
            return toJSONObject([
                "string" ~ model.string,
                "num" ~ model.num,
                "bool" ~ model.bool,
                "null" ~ model.null,
                "array" ~ model.array,
                "dict" ~ model.dict,
                "subModel" ~ model.subModel
            ])
        }
    }

  45. Encode to JSON String
    let jsonString = encode(model)
    print(jsonString)
    Result:
    "{ \"bool\" : true, \"null\" : null,
    \"num\" : 123.45, \"string\" : \"hello\",
    \"array\" : [ \"hello\", 9.99, false ],
    \"dict\" : { \"key\" : \"value\" },
    \"subModel\" : { \"enabled\" : true } }"

  46. Summary (TryParsec)
    • Supports CSV / XML / JSON
    • Simple, readable, and easy to create your own parsers
    • Caveats
    • Needs performance improvements
    • FromJSON / ToJSON don't work with some nested structures
    • Swift 3 (with higher-kinded types support) will surely
    solve this problem!

  47. Thanks!
    https://github.com/inamiy/TryParsec
