Parser Combinator in Swift

Yasuhiro Inami

March 04, 2016

Transcript

  1. Parser

    Takes input data (frequently text) and builds a data structure, e.g. an abstract syntax tree (AST).
    • Lexer: Strings/Bytes → Tokens
    • Parser: Tokens → Syntax Tree
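The lexer stage on this slide can be sketched as follows. This is only an illustration in current Swift (5.x) syntax: the `Token` type and the `tokenize` function are hypothetical names that do not appear in the talk, and the token set is assumed to be that of the arithmetic example used later in the deck.

```swift
// Hypothetical token type for a tiny arithmetic language (not from the slides).
enum Token: Equatable {
    case number(Int)
    case plus, star, leftParen, rightParen
}

// Lexer stage: String -> [Token]. Returns nil on an unexpected character.
func tokenize(_ source: String) -> [Token]? {
    var tokens: [Token] = []
    var rest = Substring(source)
    while let c = rest.first {
        switch c {
        case " ":
            rest = rest.dropFirst()                      // skip whitespace
        case "+":
            tokens.append(.plus); rest = rest.dropFirst()
        case "*":
            tokens.append(.star); rest = rest.dropFirst()
        case "(":
            tokens.append(.leftParen); rest = rest.dropFirst()
        case ")":
            tokens.append(.rightParen); rest = rest.dropFirst()
        case "0"..."9":
            // Consume the longest run of digits as a single number token.
            let digits = rest.prefix(while: { ("0"..."9").contains($0) })
            guard let value = Int(digits) else { return nil }
            tokens.append(.number(value))
            rest = rest.dropFirst(digits.count)
        default:
            return nil
        }
    }
    return tokens
}

let tokens = tokenize("(12 + 3) * 4")!
print(tokens.count)    // 7
```

A separate parser stage would then consume `[Token]` and build the syntax tree. The parsers on the following slides skip the token stage and work on the `String` directly.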
  2. Parsing Algorithms

    • Bottom-Up
      • Operator Precedence Parsing
      • LR(k) (SLR / LALR / etc.)
    • Top-Down
      • Recursive-descent / LL(k) (Table-driven)
      • Predictive / Backtracking
      • Packrat (Recursive-descent + memoization)
  3. Combinator

    • Lambda expression without free variables
    • I = λx.x = { x in x }
    • K = λxy.x = { x, y in x }
    • S = λxyz.xz(yz) = { x, y, z in x(z)(y(z)) }
    • Y = S(K(SII))(S(S(KS)K)(K(SII)))
    • ι = λx.xSK
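As a small check of the definitions above, the I, K and S combinators can be written as curried generic functions in current Swift (5.x). This is a sketch, not code from the talk; the curried signatures and the helper names `k1`, `k2` and `skk` are assumptions made so that partial application type-checks.

```swift
// Curried forms of the I, K and S combinators from the slide.
func I<A>(_ x: A) -> A { return x }

func K<A, B>(_ x: A) -> (B) -> A {
    return { _ in x }
}

func S<A, B, C>(_ x: @escaping (A) -> (B) -> C) -> (@escaping (A) -> B) -> (A) -> C {
    return { y in { z in x(z)(y(z)) } }
}

// S K K behaves like I:  S K K z = K z (K z) = z.
// The annotations pin down the generic parameters for the compiler.
let k1: (Int) -> ((Int) -> Int) -> Int = K
let k2: (Int) -> (Int) -> Int = K
let skk: (Int) -> Int = S(k1)(k2)

print(skk(42))    // 42
print(I(42))      // 42
```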
  4. Parser Combinator

    Higher-order function which takes:
    • Input: Parser(s)
    • Output: New Parser
    Combining simple parsers to construct more complex parsers → Functional Programming approach
  5. Parser Monad

        struct Parser<A> {
            let parse: String -> (A, String)?
        }

    • Container of state-transforming function
    • Input: String
    • Output: Optional tuple of "output" & "remaining input"
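To make the "output and remaining input" contract concrete, here is a minimal sketch in current Swift (5.x), where function types need parentheses around their parameters. The `item` parser is a hypothetical helper that is not on the slides; it simply consumes one character.

```swift
// A parser wraps a function from input to an optional (value, remaining input) pair.
struct Parser<A> {
    let parse: (String) -> (A, String)?
}

// Hypothetical helper (not from the slides): consume exactly one character.
func item() -> Parser<Character> {
    return Parser { input in
        guard let head = input.first else { return nil }   // fail on empty input
        return (head, String(input.dropFirst()))
    }
}

if let (value, rest) = item().parse("abc") {
    print(value, rest)    // a bc
}
print(item().parse("") == nil)    // true
```

Success is signalled by a non-nil tuple and failure by `nil`, so this representation carries no error message or position information.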
  6.

        // Applicative pure
        func pure<A>(a: A) -> Parser<A> {
            return Parser { (a, $0) }
        }

        // Alternative empty
        func empty<A>() -> Parser<A> {
            return Parser { _ in nil }
        }
  7.

        // Monad bind (>>=)
        func >>- <A, B>(p: Parser<A>, f: A -> Parser<B>) -> Parser<B> {
            return Parser { input in
                if let (result, input2) = p.parse(input) {
                    return f(result).parse(input2)
                } else {
                    return nil
                }
            }
        }

        // Functor fmap (<$>)
        func <^> <A, B>(f: A -> B, p: Parser<A>) -> Parser<B> {
            return p >>- { a in pure(f(a)) }
        }
  8.

        // Alternative choice (associative operation)
        func <|> <A>(p: Parser<A>, q: Parser<A>) -> Parser<A> {
            return Parser { input in
                if let (result, input2) = p.parse(input) {
                    return (result, input2)
                } else {
                    return q.parse(input)
                }
            }
        }
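The slides omit the operator declarations that Swift requires before `>>-`, `<^>` and `<|>` can be defined. The sketch below shows one way to declare and use them in current Swift (5.x). The precedence groups are an assumption, ordered like Haskell's fixities (`>>-` loosest, then `<|>`, then `<^>`), and `digit`, `digitValue` and `fallback` are illustrative names only.

```swift
struct Parser<A> {
    let parse: (String) -> (A, String)?
}

// Assumed precedence groups (not from the slides).
precedencegroup BindPrecedence {
    associativity: left
    higherThan: AssignmentPrecedence
}
precedencegroup ChoicePrecedence {
    associativity: left
    higherThan: BindPrecedence
}
precedencegroup MapPrecedence {
    associativity: left
    higherThan: ChoicePrecedence
}

infix operator >>- : BindPrecedence
infix operator <|> : ChoicePrecedence
infix operator <^> : MapPrecedence

func pure<A>(_ a: A) -> Parser<A> {
    return Parser { input in (a, input) }
}

// Monad bind: run `p`, then run the parser produced by `f` on the rest.
func >>- <A, B>(p: Parser<A>, f: @escaping (A) -> Parser<B>) -> Parser<B> {
    return Parser { input in
        guard let (result, rest) = p.parse(input) else { return nil }
        return f(result).parse(rest)
    }
}

// Functor map: transform the parsed value, leave the remaining input alone.
func <^> <A, B>(f: @escaping (A) -> B, p: Parser<A>) -> Parser<B> {
    return p >>- { a in pure(f(a)) }
}

// Choice: try `p`; if it fails, try `q` on the same input (backtracking).
func <|> <A>(p: Parser<A>, q: Parser<A>) -> Parser<A> {
    return Parser { input in p.parse(input) ?? q.parse(input) }
}

// Illustrative parsers built from the operators above.
let digit = Parser<Character> { input in
    guard let head = input.first, ("0"..."9").contains(head) else { return nil }
    return (head, String(input.dropFirst()))
}
let digitValue: Parser<Int> = { (c: Character) -> Int in c.wholeNumberValue! } <^> digit
let fallback: Parser<Int> = digitValue <|> pure(-1)

print(digitValue.parse("7x")!)    // (7, "x")
print(fallback.parse("x")!)       // (-1, "x")
```

Because `<|>` passes the original `input` to `q`, a failed first alternative consumes nothing, which is the backtracking behaviour the later slides rely on.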
  9.

        // Applicative sequential application
        func <*> <A, B>(p: Parser<A -> B>, q: Parser<A>) -> Parser<B> {
            return p >>- { f in f <^> q }
        }

        // Sequence actions, discarding the value of the second argument
        func <* <A, B>(p: Parser<A>, q: Parser<B>) -> Parser<A> {
            return const <^> p <*> q
        }

        // Sequence actions, discarding the value of the first argument
        func *> <A, B>(p: Parser<A>, q: Parser<B>) -> Parser<B> {
            return const(id) <^> p <*> q
        }
  10. Parse 1 character

        func satisfy(predicate: Character -> Bool) -> Parser<Character> {
            return Parser { input in
                if let (head, tail) = uncons(input) where predicate(head) {
                    // Result is (parsed value, remaining input)
                    return (head, tail)
                } else {
                    return nil
                }
            }
        }
  11. Parse 1 character

        func any() -> Parser<Character> {
            return satisfy(const(true))
        }

        func digit() -> Parser<Character> {
            return satisfy { "0"..."9" ~= $0 }
        }

        func char(c: Character) -> Parser<Character> {
            return satisfy { $0 == c }
        }
  12. Parse string

        func string(str: String) -> Parser<String> {
            if let (head, tail) = uncons(str) {
                return char(head) *> string(tail) *> pure(str)
            } else {
                return pure("")
            }
        }
  13. Combine parsers

        func many<A>(p: Parser<A>) -> Parser<[A]> {
            return many1(p) <|> pure([])
        }

        func many1<A>(p: Parser<A>) -> Parser<[A]> {
            return cons <^> p <*> many(p)
        }
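One practical detail: Swift evaluates function arguments eagerly, so if `<|>` and `<*>` take plain `Parser` values as declared on the earlier slides, `many` and `many1` would call each other without end while the parser is still being constructed. The sketch below, in current Swift (5.x), sidesteps this by doing the mutual recursion inside the `parse` closure, so it only happens at parse time. It is an assumption-laden rewrite, not the talk's code; `isDigit` and `digits` are illustrative names.

```swift
struct Parser<A> {
    let parse: (String) -> (A, String)?
}

func satisfy(_ predicate: @escaping (Character) -> Bool) -> Parser<Character> {
    return Parser { input in
        guard let head = input.first, predicate(head) else { return nil }
        return (head, String(input.dropFirst()))
    }
}

// Zero or more: fall back to an empty array when `many1` fails.
// The body runs only when `parse` is called, so building the parser terminates.
func many<A>(_ p: Parser<A>) -> Parser<[A]> {
    return Parser { input in many1(p).parse(input) ?? ([], input) }
}

// One or more: run `p` once, then collect the rest with `many`.
func many1<A>(_ p: Parser<A>) -> Parser<[A]> {
    return Parser { input in
        guard let (first, rest) = p.parse(input),
              let (others, rest2) = many(p).parse(rest) else { return nil }
        return ([first] + others, rest2)
    }
}

func isDigit(_ c: Character) -> Bool {
    return ("0"..."9").contains(c)
}

let digits = many(satisfy(isDigit))
let result = digits.parse("123abc")!
print(String(result.0), result.1)                      // 123 abc
print(many1(satisfy(isDigit)).parse("abc") == nil)     // true
```

Note that `many` loops forever if `p` can succeed without consuming input, so it should only be applied to parsers that consume at least one character on success.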
  14. Combine parsers

        func skipMany<A>(p: Parser<A>) -> Parser<()> {
            return skipMany1(p) <|> pure(())
        }

        func skipMany1<A>(p: Parser<A>) -> Parser<()> {
            return p *> skipMany(p)
        }

        func skipSpaces() -> Parser<()> {
            return skipMany(space)
        }
  15. Parse tokens

        func symbol(str: String) -> Parser<String> {
            return skipSpaces() *> string(str) <* skipSpaces()
        }

        func natural() -> Parser<Int> {
            return skipSpaces() *> ({ Int(String($0))! } <^> many1(digit())) <* skipSpaces()
        }
  16. Simple Arithmetics

    Backus–Naur Form (BNF)

        <expr>   ::= <expr> + <term> | <term>
        <term>   ::= <factor> * <term> | <factor>
        <factor> ::= ( <expr> ) | <number>
        <number> ::= '0' | '1' | '2' | ...
  17. Simple Arithmetics

        func expr() -> Parser<Int> {
            return term() >>- { t in
                // uses right recursion
                (symbol("+") *> expr() >>- { e in pure(t + e) }) <|> pure(t)
            }
        }

        func term() -> Parser<Int> {
            return factor() >>- { f in
                (symbol("*") *> term() >>- { t in pure(f * t) }) <|> pure(f)
            }
        }

        func factor() -> Parser<Int> {
            return (symbol("(") *> expr() <* symbol(")")) <|> natural()
        }
  18.

        let (ans, _) = expr().parse(" ( 12 + 3 ) * 4+5")!
        expect(ans) == 65
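The arithmetic example can be reproduced end to end with the self-contained sketch below, written in current Swift (5.x). It departs from the slides in several labelled ways: it uses `flatMap` and `or` methods (assumed names) instead of the custom operators, and it writes `symbol` and `natural` directly rather than composing them from `string`, `many1` and `skipSpaces`. Because every recursive call sits inside a closure passed to `flatMap`, the grammar's mutual recursion is only evaluated while parsing.

```swift
struct Parser<A> {
    let parse: (String) -> (A, String)?

    // Monadic bind: feed the result of this parser into `f`.
    func flatMap<B>(_ f: @escaping (A) -> Parser<B>) -> Parser<B> {
        return Parser<B> { input in
            guard let (a, rest) = self.parse(input) else { return nil }
            return f(a).parse(rest)
        }
    }

    // Choice: try `self`, and on failure try `other` on the same input.
    func or(_ other: Parser<A>) -> Parser<A> {
        return Parser { input in self.parse(input) ?? other.parse(input) }
    }
}

func pure<A>(_ a: A) -> Parser<A> {
    return Parser { input in (a, input) }
}

// Token parser: match `s`, skipping spaces on both sides.
func symbol(_ s: String) -> Parser<String> {
    return Parser { input in
        let trimmed = String(input.drop(while: { $0 == " " }))
        guard trimmed.hasPrefix(s) else { return nil }
        return (s, String(trimmed.dropFirst(s.count).drop(while: { $0 == " " })))
    }
}

// Token parser: one or more ASCII digits, skipping spaces on both sides.
func natural() -> Parser<Int> {
    return Parser { input in
        let trimmed = input.drop(while: { $0 == " " })
        let digits = trimmed.prefix(while: { ("0"..."9").contains($0) })
        guard let value = Int(digits) else { return nil }   // nil when no digits
        return (value, String(trimmed.dropFirst(digits.count).drop(while: { $0 == " " })))
    }
}

// expr ::= term "+" expr | term   (right recursion, as on the slide)
func expr() -> Parser<Int> {
    return term().flatMap { (t: Int) -> Parser<Int> in
        let sum: Parser<Int> = symbol("+")
            .flatMap { _ in expr() }
            .flatMap { e in pure(t + e) }
        return sum.or(pure(t))
    }
}

// term ::= factor "*" term | factor
func term() -> Parser<Int> {
    return factor().flatMap { (f: Int) -> Parser<Int> in
        let product: Parser<Int> = symbol("*")
            .flatMap { _ in term() }
            .flatMap { t in pure(f * t) }
        return product.or(pure(f))
    }
}

// factor ::= "(" expr ")" | number
func factor() -> Parser<Int> {
    let parenthesized: Parser<Int> = symbol("(")
        .flatMap { _ in expr() }
        .flatMap { e in symbol(")").flatMap { _ in pure(e) } }
    return parenthesized.or(natural())
}

let (ans, remaining) = expr().parse(" ( 12 + 3 ) * 4+5")!
print(ans, remaining.isEmpty)    // 65 true
```

The right-recursive form avoids the infinite descent that the left-recursive `<expr> ::= <expr> + <term>` rule would cause in a recursive-descent parser. That is harmless for `+` and `*`, but it would give the wrong associativity for operators such as `-` or `/`.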
  19. TryParsec

    • Monadic Parser Combinator for ✨✨ try! Swift ✨✨
    • Inspired by Haskell's Attoparsec / Aeson
    • Supports CSV / XML / JSON
    • LL(*) with backtracking by default
    • Doesn't try, but please try! :)
  20. TryParsec

    • Basic Operators: >>-, <^>, <*>, *>, <*, <|>, <?>
    • Combinators: many, many1, manyTill, zeroOrOne, skipMany, skipMany1, sepBy, sepBy1, sepEndBy, sepEndBy1, count, chainl, chainl1, chainr, chainr1
    • Text (UnicodeScalarView): peek, endOfInput, satisfy, skip, skipWhile, take, takeWhile, any, char, not, string, asciiCI, oneOf, noneOf, space, skipSpaces, number... (etc)
  21. enum JSON

        enum JSON {
            case String(Swift.String)
            case Number(Double)
            case Bool(Swift.Bool)
            case Null
            case Array([JSON])
            case Object([Swift.String : JSON])
        }
  22. JSON example

        {
          "string": "hello",
          "num": 123.45,
          "bool": true,
          "null": null,
          "array": ["hello", 9.99, false],
          "dict": { "key": "value" },
          "object": { "enabled": true }
        }
  23. Parse JSON (to AST)

        let json = parseJSON(jsonString).value
        print(json)

    Result:

        JSON.Object([
            "null": .Null,
            "num": .Number(123.45),
            "bool": .Bool(true),
            "string": .String("hello"),
            "array": .Array([.String("hello"), .Number(9.99), .Bool(false)]),
            "dict": .Object(["key": .String("value")]),
            "object": .Object(["enabled": .Bool(true)])
        ])
  24. struct Model

        struct Model {
            let string: String
            let num: Double
            let bool: Bool
            let null: Any?
            let array: [Any]
            let dict: [String : Any]
            let subModel: SubModel
            let dummy: Bool?    // doesn't exist in raw JSON
        }
  25. FromJSON (Protocol)

        extension Model: FromJSON {
            static func fromJSON(json: JSON) -> Result<Model, JSON.ParseError> {
                return fromJSONObject(json) {
                    curry(self.init)
                        <^> $0 !! "string"
                        <*> $0 !! "num"
                        <*> $0 !! "bool"
                        <*> $0 !! "null"
                        <*> $0 !! "array"
                        <*> $0 !! "dict"
                        <*> $0 !! "object"    // mapping to SubModel
                        <*> $0 !? "dummy"     // doesn't exist in raw JSON
                }
            }
        }
  26. Decode from JSON String

        let model = decode(jsonString).value
        print(model)

    Result:

        Model(string: "hello", num: 123.45, bool: true, null: nil, array: ["hello", 9.99, false], dict: ["key": "value"], subModel: SubModel(enabled: true), dummy: nil)
  27. ToJSON (Protocol)

        extension Model: ToJSON {
            static func toJSON(model: Model) -> JSON {
                return toJSONObject([
                    "string" ~ model.string,
                    "num" ~ model.num,
                    "bool" ~ model.bool,
                    "null" ~ model.null,
                    "array" ~ model.array,
                    "dict" ~ model.dict,
                    "subModel" ~ model.subModel
                ])
            }
        }
  28. Encode to JSON String

        let jsonString = encode(model)
        print(jsonString)

    Result:

        "{ \"bool\" : true, \"null\" : null, \"num\" : 123.45, \"string\" : \"hello\", \"array\" : [ \"hello\", 9.99, false ], \"dict\" : { \"key\" : \"value\" }, \"subModel\" : { \"enabled\" : true } }"
  29. Summary (TryParsec)

    • Supports CSV / XML / JSON
    • Simple, readable, and easy to create your own parsers
    • Caveats
      • Needs performance improvements
      • FromJSON / ToJSON don't work for some nested structures
      • Swift 3 (with higher-kinded types support) will surely solve this problem!