Manipulating Abstract Syntax Trees (ASTs) to generate safe SQL Queries with Quill

Quill (https://getquill.io/) is a LINQ for Scala: it transforms collection-like code into SQL queries at compile time. In this session you will learn how Quill uses the compiler to generate safe SQL code: generating and parsing ASTs, making inferences, and even performing code optimisations.
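
For a quick sense of the idea, here is a minimal sketch along the lines of the examples later in the deck, using the SQL mirror context (which compiles and renders queries without a live database):

    import io.getquill._

    case class Person(id: Int, name: String, age: Int)

    // Mirror context: compiles the query and exposes the generated SQL as a string
    val ctx = new SqlMirrorContext(PostgresDialect, Literal)
    import ctx._

    // Collection-like code inside quote { } is turned into SQL at compile time
    val adults = quote {
      query[Person].filter(_.age > 25).map(_.name)
    }

    println(ctx.run(adults).string)
    // SELECT x1.name FROM Person x1 WHERE x1.age > 25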

Juliano Alves

April 04, 2020

Transcript

  1. Juliano Alves @vonjuliano juliano-alves.com Manipulating Abstract Syntax Trees (ASTs) to

    generate safe SQL Queries with Quill.
  2. Who am I? • Software Engineer, Searcher of perfect modularization,

    Lover of Functional Languages • The cool ones Scala, Clojure, Elixir • The "vintage" ones Java, C#, Python, Ruby @vonjuliano juliano-alves.com
  3. (image-only slide)
  4. (image-only slide)
  5. (image-only slide)
  6. (image-only slide)
  7. https://getquill.io

  8. Quill Secret Sauce

  9. Quill Secret Sauce

  10. Quill Secret Sauce

  11. What is an AST?

  12. Abstract syntax trees are data structures widely used in compilers

    to represent the structure of program code ... It often serves as an intermediate representation of the program https://en.wikipedia.org/wiki/Abstract_syntax_tree
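
    As a hypothetical illustration (this is not Quill's AST), a tiny Scala AST for arithmetic expressions shows the idea: the tree mirrors the structure of the code rather than its textual syntax.

    // Toy AST: 1 + 2 * 3 becomes Add(Num(1), Mul(Num(2), Num(3)))
    sealed trait Expr
    case class Num(value: Int)              extends Expr
    case class Add(left: Expr, right: Expr) extends Expr
    case class Mul(left: Expr, right: Expr) extends Expr

    // An interpreter is a structural walk over the tree
    def eval(e: Expr): Int = e match {
      case Num(v)    => v
      case Add(l, r) => eval(l) + eval(r)
      case Mul(l, r) => eval(l) * eval(r)
    }

    val seven = eval(Add(Num(1), Mul(Num(2), Num(3)))) // 7
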
  13. Quill AST

  14. Overview of Quill AST

  15. A Query represents database/collection actions.

  16. An Operation combines operands with operators, e.g. (A || B && C)
  17. An Action is basically a statement.

  18. Ordering defines the order.

  19. A Value indicates the different contexts for values.

  20. A Lift represents lifted values/structures coming from outside the quotation

    scope
  21. Iterable Operation handles collection operations

  22. Option Operation knows EVERYTHING about options

  23. Other elements

  24. Complete-ish Quill AST
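
    Before the examples, here is a deliberately simplified sketch of the kinds of nodes printed below (the real Quill AST has many more cases and richer fields; operator and alias types are reduced to strings here):

    sealed trait Ast
    case class Entity(name: String, renames: List[String])   extends Ast // a Query source
    case class Ident(name: String)                            extends Ast
    case class Property(ast: Ast, name: String)               extends Ast
    case class Constant(v: Any)                               extends Ast // a Value
    case class BinaryOperation(a: Ast, op: String, b: Ast)    extends Ast // an Operation
    case class Filter(query: Ast, alias: Ident, body: Ast)    extends Ast // a Query
    case class Map(query: Ast, alias: Ident, body: Ast)       extends Ast // a Query

    // query[Person].filter(_.age > 25) corresponds roughly to:
    val filtered: Ast =
      Filter(Entity("Person", List()), Ident("x1"),
        BinaryOperation(Property(Ident("x1"), "age"), ">", Constant(25)))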

  25. Show me the code

  26. scala> case class Person(id: Int, name: String, age: Int) scala>

    case class Address(fk: Option[Int], street: String, number :Int) scala> val q1 = quote { query[Person] } scala> pprint.pprintln(q1.ast) Entity("Person", List())
  27. scala> val q2 = quote { query[Person].filter(_.age > 25) }

    scala> pprint.pprintln(q2.ast) Filter( Entity("Person", List()), Ident("x1"), BinaryOperation( Property(Ident("x1"), "age"), >, Constant(25) ) )
  28. scala> val q3 = quote { | query[Person].filter(_.age > 25).map(_.name)

    | } scala> pprint.pprintln(q3.ast) Map( Filter( Entity("Person", List()), Ident("x1"), BinaryOperation( Property(Ident("x1"), "age"), >, Constant(25)) ), Ident("x2"), Property(Ident("x2"), "name") )
  29. scala> val q4 = quote { | query[Person] | .leftJoin(query[Address])

    | .on((p, a) => a.fk.exists(_ == p.id)) | .map {case (p, a) => (p.name, a.flatMap(_.fk))} | } scala> pprint.pprintln(q4.ast)
  30. Map( Join( LeftJoin, Entity("Person", List()), Entity("Address", List()), Ident("p"), Ident("a"), OptionExists(Property(Ident("a"),

    "fk"), Ident("x1"), BinaryOperation(Ident("x1"), ==, Property(Ident("p"), "id"))) ), Ident("x01"), Tuple( List( Property(Property(Ident("x01"), "_1"), "name"), OptionTableFlatMap(Property(Ident("x01"), "_2"), Ident("x2"), Property(Ident("x2"), "fk")) ) ) )
  31. Tokens and Tokenizers

  32. "A Token is a sequence of characters that can be

    treated as a unit in the grammar of the programming languages" https://www.geeksforgeeks.org/introduction-of-lexical-analysis/ Keywords, identifiers or operators are tokens.
  33. // Statement.scala sealed trait Token sealed trait TagToken extends Token

    case class StringToken(string: String) extends Token { override def toString = string } case class ScalarTagToken(lift: ScalarTag) extends TagToken { override def toString = s"lift(${lift.name})" } case class QuotationTagToken(tag: QuotationTag) extends TagToken { override def toString = s"quoted(${tag.uid()})" }
  34. // Statement.scala case class ScalarLiftToken(lift: ScalarLift) extends Token { override

    def toString = s"lift(${lift.name})" } case class Statement(tokens: List[Token]) extends Token { override def toString = tokens.mkString } case class SetContainsToken(a: Token, op: Token, b: Token) extends Token { override def toString = s"${a.toString} ${op.toString} (${b.toString})" }
  35. // Ast.scala sealed trait Ast { override def toString =

    { import ... implicit def externalTokenizer: Tokenizer[External] = Tokenizer[External](_ => stmt"?") implicit val namingStrategy: NamingStrategy = io.getquill.Literal this.token.toString } }
  36. // StatementInterpolator.scala trait Tokenizer[T] { def token(v: T): Token }

    implicit def listTokenizer[T](implicit tokenize: Tokenizer[T]): Tokenizer[List[T]] = Tokenizer[List[T]] { case list => list.mkStmt() } implicit def stringTokenizer: Tokenizer[String] = Tokenizer[String] { case string => StringToken(string) } // many more implementations
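
    A self-contained toy version of that pattern (not Quill's actual StatementInterpolator) shows how the Tokenizer type class turns values into Tokens and how a Statement concatenates them:

    object TokenizerToy {
      sealed trait Token
      case class StringToken(string: String) extends Token { override def toString = string }
      case class Statement(tokens: List[Token]) extends Token { override def toString = tokens.mkString(" ") }

      // Type class: how to turn a T into a Token
      trait Tokenizer[T] { def token(v: T): Token }
      object Tokenizer {
        def apply[T](f: T => Token): Tokenizer[T] = new Tokenizer[T] { def token(v: T) = f(v) }
      }

      implicit val stringTokenizer: Tokenizer[String] =
        Tokenizer[String](s => StringToken(s))

      implicit def listTokenizer[T](implicit t: Tokenizer[T]): Tokenizer[List[T]] =
        Tokenizer[List[T]](list => Statement(list.map(t.token)))

      def main(args: Array[String]): Unit = {
        val clauses = List("SELECT x.name", "FROM Person x", "WHERE x.age > 25")
        println(implicitly[Tokenizer[List[String]]].token(clauses))
        // SELECT x.name FROM Person x WHERE x.age > 25
      }
    }
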
  37. Let's tokenize!

  38. // SqlQuery.scala case class FlattenSqlQuery( from: List[FromContext] = List(), where:

    Option[Ast] = None, groupBy: Option[Ast] = None, orderBy: List[OrderByCriteria] = Nil, limit: Option[Ast] = None, offset: Option[Ast] = None, select: List[SelectValue], distinct: Boolean = false ) extends SqlQuery
  39. // SqlIdiom.scala > FlattenSqlQueryTokenizerHelper protected class FlattenSqlQueryTokenizerHelper(q: FlattenSqlQuery) (implicit astTokenizer:

    Tokenizer[Ast], strategy: NamingStrategy) { import q._ def distinctTokenizer = (if (distinct) "DISTINCT " else "").token def withDistinct = select match { case Nil => stmt"$distinctTokenizer*" case _ => stmt"$distinctTokenizer${select.token}" } DISTINCT $t
  40. // SqlIdiom.scala > FlattenSqlQueryTokenizerHelper def withFrom = from match {

    case Nil => withDistinct case head :: tail => val t = tail.foldLeft(stmt"${head.token}") { case (a, b: FlatJoinContext) => stmt"$a ${(b: FromContext).token}" case (a, b) => stmt"$a, ${b.token}" } stmt"$withDistinct FROM $t" } DISTINCT $t FROM $t
  41. // SqlIdiom.scala > FlattenSqlQueryTokenizerHelper def withWhere = where match {

    case None => withFrom case Some(where) => stmt"$withFrom WHERE ${where.token}" } DISTINCT $t FROM $t WHERE $t
  42. // SqlIdiom.scala > FlattenSqlQueryTokenizerHelper def withGroupBy = groupBy match {

    case None => withWhere case Some(groupBy) => stmt"$withWhere GROUP BY ${tokenizeGroupBy(groupBy)}" } DISTINCT $t FROM $t WHERE $t GROUP BY $t
  43. // SqlIdiom.scala > FlattenSqlQueryTokenizerHelper def withOrderBy = orderBy match {

    case Nil => withGroupBy case orderBy => stmt"$withGroupBy ${tokenOrderBy(orderBy)}" } protected def tokenOrderBy(criterias: List[OrderByCriteria]) (implicit astTokenizer: Tokenizer[Ast], strategy: NamingStrategy) = stmt"ORDER BY ${criterias.token}" DISTINCT $t FROM $t WHERE $t GROUP BY $t ORDER BY $t
  44. // SqlIdiom.scala def withLimitOffset = limitOffsetToken(withOrderBy).token((limit, offset)) protected def limitOffsetToken(query:

    Statement) (implicit astTokenizer: Tokenizer[Ast], strategy: NamingStrategy) = Tokenizer[(Option[Ast], Option[Ast])] { case (None, None) => query case (Some(limit), None) => stmt"$query LIMIT ${limit.token}" case (Some(limit), Some(offset)) => stmt"$query LIMIT ${limit.token} OFFSET ${offset.token}" case (None, Some(offset)) => stmt"$query OFFSET ${offset.token}" } DISTINCT $t FROM $t WHERE $t GROUP BY $t ORDER BY $t LIMIT $t
  45. // SqlIdiom.scala > FlattenSqlQueryTokenizerHelper def apply = stmt"SELECT $withLimitOffset" SELECT

    DISTINCT $t FROM $t WHERE $t GROUP BY $t ORDER BY $t LIMIT $t
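
    The chain above can be read as a hypothetical plain-string sketch (names invented here, not Quill's types): each with* step pattern matches on an optional clause and either passes the statement through unchanged or appends to it.

    object ClauseBuildup {
      case class Q(
          select:   List[String]   = Nil,
          from:     List[String]   = Nil,
          where:    Option[String] = None,
          groupBy:  Option[String] = None,
          orderBy:  Option[String] = None,
          limit:    Option[Int]    = None,
          distinct: Boolean        = false)

      def render(q: Q): String = {
        import q._
        val withDistinct = (if (distinct) "DISTINCT " else "") +
          (if (select.isEmpty) "*" else select.mkString(", "))
        val withFrom    = if (from.isEmpty) withDistinct else s"$withDistinct FROM ${from.mkString(", ")}"
        val withWhere   = where.fold(withFrom)(w => s"$withFrom WHERE $w")
        val withGroupBy = groupBy.fold(withWhere)(g => s"$withWhere GROUP BY $g")
        val withOrderBy = orderBy.fold(withGroupBy)(o => s"$withGroupBy ORDER BY $o")
        val withLimit   = limit.fold(withOrderBy)(l => s"$withOrderBy LIMIT $l")
        s"SELECT $withLimit"
      }

      def main(args: Array[String]): Unit =
        println(render(Q(select = List("p.name"), from = List("Person p"),
                         where = Some("p.age > 25"), limit = Some(10))))
        // SELECT p.name FROM Person p WHERE p.age > 25 LIMIT 10
    }
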
  46. // SqlNormalize.scala private val normalize = (identity[Ast] _) .andThen(DemarcateExternalAliases.apply _)

    .andThen(new FlattenOptionOperation(concatBehavior).apply _) .andThen(new SimplifyNullChecks(equalityBehavior).apply _) .andThen(Normalize.apply _) // Need to do RenameProperties before ExpandJoin which normalizes-out all the tuple indexes // on which RenameProperties relies .andThen(RenameProperties.apply _) .andThen(ExpandDistinct.apply _) .andThen(NestImpureMappedInfix.apply _) .andThen(Normalize.apply _) .andThen(ExpandJoin.apply _) .andThen(ExpandMappedInfix.apply _) .andThen(Normalize.apply _)) def apply(ast: Ast) = normalize(ast)
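
    The composition pattern itself is just ordinary function chaining. A toy sketch with invented phases over a toy AST (not Quill's transformations): each phase is Ast => Ast, and the pipeline is their composition, repeated when one phase exposes work for another, as SqlNormalize does with Normalize.

    object NormalizePipeline {
      sealed trait Expr
      case class Num(value: Int)              extends Expr
      case class Add(left: Expr, right: Expr) extends Expr

      // Phase 1: fold additions of two constants
      def foldConstants(e: Expr): Expr = e match {
        case Add(Num(a), Num(b)) => Num(a + b)
        case Add(l, r)           => Add(foldConstants(l), foldConstants(r))
        case other               => other
      }

      // Phase 2: drop additions with zero
      def dropZero(e: Expr): Expr = e match {
        case Add(Num(0), r) => dropZero(r)
        case Add(l, Num(0)) => dropZero(l)
        case Add(l, r)      => Add(dropZero(l), dropZero(r))
        case other          => other
      }

      val normalize: Expr => Expr =
        (identity[Expr] _)
          .andThen(foldConstants _)
          .andThen(dropZero _)
          .andThen(foldConstants _)

      def main(args: Array[String]): Unit =
        println(normalize(Add(Add(Num(1), Num(2)), Num(0)))) // Num(3)
    }
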
  47. scala> case class Person(id: Int, name: String, age: Int) scala>

    case class Address(fk: Option[Int], street: String, number :Int) scala> import io.getquill._ scala> val ctx = new SqlMirrorContext(PostgresDialect, Literal) scala> import ctx._
  48. scala> val q1 = quote { query[Person] } scala> pprint.pprintln(q1.ast)

    Entity("Person", List()) scala> pprint.pprintln(ctx.run(q1).string) "SELECT x.id, x.name, x.age FROM Person x"
  49. scala> val q2 = quote { query[Person].filter(_.age > 25) }

    scala> pprint.pprintln(q2.ast) Filter( Entity("Person", List()), Ident("x1"), BinaryOperation( Property(Ident("x1"), "age"), >, Constant(25) ) ) scala> pprint.pprintln(ctx.run(q2).string) "SELECT x1.id, x1.name, x1.age FROM Person x1 WHERE x1.age > 25"
  50. scala> val q3 = quote { | query[Person].filter(_.age > 25).map(_.name)

    | } scala> pprint.pprintln(q3.ast) Map( Filter( Entity("Person", List()), Ident("x1"), BinaryOperation( Property(Ident("x1"), "age"), >, Constant(25)) ), Ident("x2"), Property(Ident("x2"), "name") )
  51. scala> pprint.pprintln(ctx.run(q3).string) "SELECT x1.name FROM Person x1 WHERE x1.age >

    25"
  52. scala> val q4 = quote { | query[Person] | .leftJoin(query[Address])

    | .on((p, a) => a.fk.exists(_ == p.id)) | .map {case (p, a) => (p.name, a.flatMap(_.fk))} | } scala> pprint.pprintln(q4.ast)
  53. Map( Join( LeftJoin, Entity("Person", List()), Entity("Address", List()), Ident("p"), Ident("a"), OptionExists(Property(Ident("a"),

    "fk"), Ident("x1"), BinaryOperation(Ident("x1"), ==, Property(Ident("p"), "id"))) ), Ident("x01"), Tuple( List( Property(Property(Ident("x01"), "_1"), "name"), OptionTableFlatMap(Property(Ident("x01"), "_2"), Ident("x2"), Property(Ident("x2"), "fk")) ) ) )
  54. scala> pprint.pprintln(ctx.run(q4).string) "SELECT p.name, a.fk FROM Person p LEFT JOIN

    Address a ON a.fk = p.id"
  55. Beta Reductions

  56. scala> val pq = quote { | query[Person].map(_.id).filter(_ > 1)

    | } scala> val aq = quote { | query[Address].map(_.fk).filter(_.exists(_ > 1)) | } scala> pprint.pprintln(pq.ast) scala> pprint.pprintln(aq.ast)
  57. Filter( Map(Entity("Person", List()), Ident("x1"), Property(Ident("x1"), "id")), Ident("x2"), BinaryOperation(Ident("x2"), >, Constant(1))

    ) Filter( Map(Entity("Address", List()), Ident("x1"), Property(Ident("x1"), "fk")), Ident("x2"), OptionExists(Ident("x2"), Ident("x3"), BinaryOperation(Ident("x3"), >, Constant(1))) )
  58. scala> pprint.pprintln(ctx.run(pq).string) "SELECT x1.id FROM Person x1 WHERE x1.id >

    1" scala> pprint.pprintln(ctx.run(aq).string) "SELECT x1.fk FROM Address x1 WHERE x1.fk > 1"
  59. A beta reduction is the process of calculating a result

    from the application of a function to an expression. https://wiki.haskell.org/Beta_reduction
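
    On a toy AST (not Quill's BetaReduction), the reduction is just substitution: replace the bound identifier with the argument wherever it appears in the body.

    object BetaReductionToy {
      sealed trait Expr
      case class Ident(name: String)                    extends Expr
      case class Constant(value: Int)                   extends Expr
      case class Property(e: Expr, name: String)        extends Expr
      case class BinaryOp(a: Expr, op: String, b: Expr) extends Expr

      // Substitute every occurrence of `param` in `body` with `arg`
      def betaReduce(body: Expr, param: Ident, arg: Expr): Expr = body match {
        case id: Ident if id == param => arg
        case Property(e, n)           => Property(betaReduce(e, param, arg), n)
        case BinaryOp(a, op, b)       => BinaryOp(betaReduce(a, param, arg), op, betaReduce(b, param, arg))
        case other                    => other
      }

      def main(args: Array[String]): Unit = {
        // Roughly: (x2 => x2 > 1) applied to x1.id, as in the pq example above
        val body = BinaryOp(Ident("x2"), ">", Constant(1))
        println(betaReduce(body, Ident("x2"), Property(Ident("x1"), "id")))
        // BinaryOp(Property(Ident(x1),id),>,Constant(1))
      }
    }
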
  60. // FlattenOptionOperation.scala override def apply(ast: Ast): Ast = ast match

    { // ... case OptionExists(ast, alias, body) => if (containsNonFallthroughElement(body)) { val reduction = BetaReduction(body, alias -> ast) apply((IsNotNullCheck(ast) +&&+ reduction): Ast) } else { uncheckedReduction(ast, alias, body) } // ... }
  61. // BetaReduction.scala override def apply(o: OptionOperation): OptionOperation = o match

    { // ... case OptionMap(a, b, c) => OptionMap(apply(a), b, BetaReduction(replacements - b)(c)) case OptionForall(a, b, c) => OptionForall(apply(a), b, BetaReduction(replacements - b)(c)) case OptionExists(a, b, c) => OptionExists(apply(a), b, BetaReduction(replacements - b)(c)) case other => super.apply(other) }
  62. Optimizations

  63. scala> case class Person(id: Int, name: String, age: Int) scala>

    val q = quote { | (value: String) => | if (value == "drinks") | query[Person].filter(_.age >= 21) | else | query[Person] | } scala> pprint.pprintln(q.ast)
  64. Function( List(Ident("value")), If( BinaryOperation(Ident("value"), ==, Constant("drinks")), Filter( Entity("Person", List()), Ident("x1"),

    BinaryOperation( Property(Ident("x1"), "age"), >=, Constant(21)) ), Entity("Person", List()) ) )
  65. scala> pprint.pprintln(q("drinks").ast) If( BinaryOperation(Constant("drinks"), ==, Constant("drinks")), Filter( Entity("Person", List()), Ident("x1"),

    BinaryOperation( Property(Ident("x1"), "age"), >=, Constant(21)) ), Entity("Person", List()) )
  66. > sbt -Dquill.transform.reducetrivial=true . // TriviallyCheckable.scala def unapply(ast: Ast): Option[Ast]

    = ast match { // ... case TriviallyCheckable(one) +==+ TriviallyCheckable(two) if (one == two) => Some(Constant(true)) case TriviallyCheckable(one) +==+ TriviallyCheckable(two) if (one != two) => Some(Constant(false)) // ... } https://github.com/getquill/quill/pull/1693
  67. > sbt -Dquill.transform.reducetrivial=true . // BetaReduction.scala override def apply(ast: Ast):

    Ast = ast match { // ... case If(TriviallyTrueCondition(), thenClause, _) if (Messages.reduceTrivials) => apply(thenClause) case If(TriviallyFalseCondition(), _, elseClause) if (Messages.reduceTrivials) => apply(elseClause) // ... } https://github.com/getquill/quill/pull/1693
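
    As a toy sketch of the same idea (not the Quill code above): when both sides of the condition are constants, the comparison can be decided during the transformation and the If collapses to one of its branches.

    object ReduceTrivialIfToy {
      sealed trait Expr
      case class Constant(value: Any)                     extends Expr
      case class Equals(a: Expr, b: Expr)                 extends Expr
      case class If(cond: Expr, thenE: Expr, elseE: Expr) extends Expr
      case class Query(description: String)               extends Expr

      def reduce(e: Expr): Expr = e match {
        // "Trivially checkable": both operands are constants, so decide now
        case If(Equals(Constant(a), Constant(b)), t, f) =>
          if (a == b) reduce(t) else reduce(f)
        case other => other
      }

      def main(args: Array[String]): Unit = {
        val q = If(Equals(Constant("drinks"), Constant("drinks")),
                   Query("Person filtered by age >= 21"),
                   Query("Person"))
        println(reduce(q)) // Query(Person filtered by age >= 21)
      }
    }
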
  68. scala> pprint.pprintln(q("drinks").ast) Filter( Entity("Person", List()), Ident("x1"), BinaryOperation( Property(Ident("x1"), "age"), >=,

    Constant(21)) ) scala> pprint.pprintln(ctx.run(q("drinks")).string) "SELECT x1.id, x1.name, x1.age FROM Person x1 WHERE x1.age >= 21"
  69. scala> pprint.pprintln(q("whatever").ast) Entity("Person", List()) scala> pprint.pprintln(ctx.run(q("whatever")).string) "SELECT x1.id, x1.name, x1.age

    FROM Person x1"
  70. Can we optimize lifted values?

  71. scala> val v = "other" scala> pprint.pprintln(q(lift(v)).ast) If( BinaryOperation( ScalarValueLift("$line16.$read.$iw.$iw.$iw.$iw.$iw.$iw.v","other",

    MirrorEncoder(<function3>)), ==, Constant("drinks") ), Filter( Entity("Person", List()), Ident("x1"), BinaryOperation(Property(Ident("x1"), "age"), >=, Constant(21)) ), Map( Entity("Person", List()), Ident("x2"), Property(Ident("x2"), "name") ) )
  72. Conclusion

  73. Quill is a "Domain Specific Compiler": define and manipulate your own domain (Abstract Syntax Trees), enforce correctness (type checking), transform and build outputs (normalization), improve how code works (optimization).
  74. https://getquill.io "This is the project you are looking for"

  75. Questions?

  76. Juliano Alves @vonjuliano juliano-alves.com Manipulating Abstract Syntax Trees (ASTs) to

    generate safe SQL Queries with Quill. Thank you!