Slide 1

Slide 1 text

Talking SQL to strangers Oleksii Kachaiev, @kachayev Principal Engineer @ Riot Games, League of Legends June, 2020

Slide 2

Slide 2 text

SQL as a Language • old invention, popularized by RDMS in 70s • a few years of being "not cool" • collateral damage of NoSQL movement? • what do we have now? • databases that never talked SQL (Druid, Cassandra[1] etc) • not even databases (Spark, Flink, Kafka etc) [1] limitations applied

Slide 3

Slide 3 text

SQL as an API • GraphQL is a query language made it to be an API • (probably for a wrong reason) • Strata talk comparing GraphQL and SQL • (mostly recognized as a smart trolling) • this talk: how to expose SQL API • (dealing with why as we go)

Slide 4

Slide 4 text

SQL as an API: Demo • camille-sql • running SQL queries over local Maven artifacts

Slide 5

Slide 5 text

Maven Artifacts Tree % tree ~/.m2/repository/ | less /Users/okachaiev/.m2/repository ├── aleph │ └── aleph │ └── 0.4.7-alpha5 │ ├── _remote.repositories │ ├── aleph-0.4.7-alpha5.jar │ ├── aleph-0.4.7-alpha5.jar.sha1 │ ├── aleph-0.4.7-alpha5.pom │ └── aleph-0.4.7-alpha5.pom.sha1 ├── ant │ └── ant │ └── 1.6.5 │ ├── _remote.repositories │ ├── ant-1.6.5.jar │ ├── ant-1.6.5.jar.sha1 │ ├── ant-1.6.5.pom │ └── ant-1.6.5.pom.sha1

Slide 6

Slide 6 text

camille-sql: Running Server % bin/camille-server [INFO] Scanning for projects... [INFO] [INFO] -----------------< org.okachaiev.camille:camille-sql >------------------ [INFO] Building Camille SQL 1.0-SNAPSHOT [INFO] --------------------------------[ jar ]--------------------------------- [INFO] [INFO] >>> exec-maven-plugin:1.2.1:java (default-cli) > validate @ camille-sql >>> [INFO] [INFO] <<< exec-maven-plugin:1.2.1:java (default-cli) < validate @ camille-sql <<< [INFO] [INFO] [INFO] --- exec-maven-plugin:1.2.1:java (default-cli) @ camille-sql --- Artifacts repository path: /Users/okachaiev/.m2/repository/ Running server on localhost:26727

Slide 7

Slide 7 text

camille-sql: PSQL Client % PGPASSWORD=nopass psql "sslmode=disable host=localhost port=26727" psql (12.2, server 9.5.0) Type "help" for help. okachaiev=>

Slide 8

Slide 8 text

camille-sql: Queries okachaiev=> select * from artifacts limit 6; uid | group_id | artifact_id | name | url ------------+--------------------------+--------------------------+---------------------+------------------------------------------ 3227713579 | alandipert | desiderata | desiderata | https://github.com/alandipert/desiderata 3382955103 | aopalliance | aopalliance | AOP alliance | http://aopalliance.sourceforge.net 1507835947 | asm | asm-parent | ASM | http://asm.objectweb.org/ 226341444 | backport-util-concurrent | backport-util-concurrent | Backport of JSR 166 | http://backport-jsr166.sourceforge.net/ 1712481681 | biz.aQute | bndlib | BND Library | http://www.aQute.biz/Code/Bnd 2280883480 | biz.aQute.bnd | biz.aQute.bndlib | biz.aQute.bndlib | https://bnd.bndtools.org/ (6 rows) okachaiev=> select * from versions where filesize > 10000 limit 5; uid | version | filesize | last_modified | sha1 ------------+---------+----------+-------------------------+------------------------------------------ 3345961009 | 1.3.2 | 337129 | 2019-07-04 23:36:26.464 | ff84d15cfeb0825935a170d7908fbfae00498050 1053708643 | 1.0.1 | 26514 | 2019-07-04 23:23:20.322 | 49c100caf72d658aca8e58bd74a4ba90fa2b0d70 2740841946 | 1.6.5 | 1034049 | 2019-07-05 05:37:10.953 | 7d18faf23df1a5c3a43613952e0e8a182664564b 925895164 | 0.4.4 | 42645 | 2020-02-01 06:45:59.599 | 2522f7f1b4bab169a2540406eb3eb71f7d6e3003 136773645 | 1.9 | 263965 | 2019-07-04 23:25:30.09 | 9ce04e34240f674bc72680f8b843b1457383161a (5 rows)

Slide 9

Slide 9 text

camille-sql: Queries okachaiev=> SELECT group_id, COUNT(*) AS n_files okachaiev-> FROM artifacts okachaiev-> LEFT JOIN versions ON artifacts.uid=versions.uid okachaiev-> GROUP BY group_id okachaiev-> ORDER BY n_files DESC; group_id | n_files ----------------------------+--------- org.apache.hadoop | 84 org.apache.maven | 54 org.codehaus.plexus | 50 org.apache.commons | 49 com.fasterxml.jackson.core | 45 com.twitter | 44 com.nimbusds | 38 org.typelevel | 37 org.scala-lang | 31 org.apache.maven.doxia | 24 (10 rows)

Slide 10

Slide 10 text

Building Blocks • server/client protocol • SQL query lifecycle • parser • planner • optimizer • execution • bridge between the two

Slide 11

Slide 11 text

Protocol • the goal: to choose wisely to have clients • e.g. you have a good client for HTTP: browser • not the case for SQL, databases use different protocols • JDBC makes them look similar in Java • (but we're looking a layer deeper)

Slide 12

Slide 12 text

Protocol: PGWIRE • originally PostgreSQL • own low-level binary serialization format • own somewhat documented control flow • why? • clients: psql, SQL IDEs, BI • adoption outside PostgreSQL: Cockroach, Materialize • high performance streaming for large datasets

Slide 13

Slide 13 text

Sequence Diagram (simplified)

Slide 14

Slide 14 text

Protocol: Ignoring Today • SSL handshake (custom) • authentication handshake • prepared statements & bindings • data copy • cursors, suspensions, cancellation • (and a lot more)

Slide 15

Slide 15 text

Protocol: Implementation • Netty to run TCP server • custom Codec to deal with network serialization • sequence of Handlers • mapping between transport and application level logic • Handler to orchestrate main loop: get query, execute, send result set • async i/o is an absolute overkill in this case

Slide 16

Slide 16 text

Protocol: Example Serialization public ByteBuf toByteBuf(ByteBufAllocator allocator) { final ByteBuf buf = allocator.buffer(12 + this.totalLength); buf.writeByte('E'); // not included in length buf.writeInt(11 + this.totalLength); // length, self-included buf.writeByte('S'); // magical constant buf.writeBytes(this.severity.getBytes()); buf.writeZero(1); // null-termination buf.writeByte('C'); // magical constant buf.writeBytes(this.code.getBytes()); buf.writeZero(1); // null-termination buf.writeByte('M'); // magical constant buf.writeBytes(this.message.getBytes()); buf.writeZero(2); // double null-termination return buf; }

Slide 17

Slide 17 text

Life of a Query: Theory • parse • plan (compile) • optimize • execute

Slide 18

Slide 18 text

Life of a Query: Practice • leverage Apache Calcite • not the easiest framework to work with • alternatives: • Catalyst (too much Spark) • ZetaSQL (too young)

Slide 19

Slide 19 text

Life of a Query: Calcite • parse ← lexer, dialect • plan (compile) ← catalog (databases, "tables") • optimize ← statistics (CBO), rules folding • execution ← scan definitions

Slide 20

Slide 20 text

Life of a Query: Lexer • SQL is standard, right? well... it depends • Lex is a set of rules that defines parsing • using MYSQL_ANSI: double quotes, case insensitive • still problematic to deal with • expressions like E"\n" • ; as a query termination (standard but optional)

Slide 21

Slide 21 text

Life of a Query: Parser • Babel-based parser • Default Calcite dialect (uppercase, "`" for quoting) SELECT `group_id`, COUNT(*) AS `n_files` FROM `versions` GROUP BY `group_id` ORDER BY `n_files` DESC

Slide 22

Slide 22 text

Life of a Query: Catalog • schema: tables, functions, types etc • table: tables, views, temp views etc • table is anything that implements Table interface • declare "statically": • 2 tables • each table defines own row type • enough information for compilation

Slide 23

Slide 23 text

Register Tables (Statically) this.resolver = new MavenArtifactsResolver(baseFolder); this.artifactsTable = new MavenArtifactsTable(this.resolver); this.versionsTable = new MavenArtifactVersionsTable(this.resolver); ... SchemaPlus rootSchema = calciteConnection.getRootSchema(); rootSchema.add("artifacts", artifactsTable); rootSchema.add("versions", versionsTable);

Slide 24

Slide 24 text

Table Defines Row Type protected final RelProtoDataType protoRowType = new RelProtoDataType() { public RelDataType apply(RelDataTypeFactory typeFactory) { return new RelDataTypeFactory.Builder(typeFactory) .add("uid", SqlTypeName.BIGINT) .add("group_id", SqlTypeName.VARCHAR, 1023) .add("artifact_id", SqlTypeName.VARCHAR, 255) .add("name", SqlTypeName.VARCHAR) .add("url", SqlTypeName.VARCHAR) .build(); } };

Slide 25

Slide 25 text

Life of a Query: Logical Plan select group_id, name from artifacts ↓ Root { kind: SELECT, rel: rel#5:LogicalProject#5, rowType: RecordType(VARCHAR(1023) group_id, VARCHAR name), fields: [<0, group_id>, <1, name>], } ↓ LogicalProject(group_id=[$1], name=[$3]) LogicalTableScan(table=[[artifacts]])

Slide 26

Slide 26 text

Life of a Query: Logical Plan select group_id, name from artifacts where group_id = 'nrepl' order by artifact_id limit 20 ↓ LogicalProject(group_id=[$0], name=[$1]) LogicalSort(sort0=[$2], dir0=[ASC], fetch=[20]) LogicalProject(group_id=[$1], name=[$3], artifact_id=[$2]) LogicalFilter(condition=[=($1, 'nrepl')]) LogicalTableScan(table=[[artifacts]])

Slide 27

Slide 27 text

Life of a Query: Logical Plan select group_id, count(*) as n_files from artifacts left join versions on artifacts.uid=versions.uid group by group_id order by n_files desc ↓ LogicalSort(sort0=[$1], dir0=[DESC]) LogicalAggregate(group=[{0}], n_files=[COUNT()]) LogicalProject(group_id=[$1]) LogicalJoin(condition=[=($0, $5)], joinType=[left]) LogicalTableScan(table=[[artifacts]]) LogicalTableScan(table=[[versions]])

Slide 28

Slide 28 text

Life of a Query: Execution • minimal requirement: definition of "full scan" @Override public Enumerable scan(DataContext root) { // better replace this with something interesting :) return Linq4j.emptyEnumerable(); }

Slide 29

Slide 29 text

Life of a Query: Execution • result is a Linq4j enumerable (collection or dynamic) • analog of C# LINQ (Language Integrated Queries) • Items.Select(*).Where(Item.Version > 1).Take(5) • a few years later, Spark popularized the idea for Big Data • Linq4j does heavy-lifting (folding relational operators)

Slide 30

Slide 30 text

Life of a Query: Optimizations • generic optimizations provided by the framework • cost-based optimizations • opt out by setting Statistics.UNKNOWN • specifics: • skip calculating file size (projection pruning) • skip reading checksum (projection pruning) • jump to subfolder (predicate push-down)

Slide 31

Slide 31 text

Projection Pruning • extremely common technique for big data analytics • (parquet, hehe) @Override public Enumerable scan( DataContext root, int[] projects) { // now we know what columns are necessary return Linq4j.emptyEnumerable(); }

Slide 32

Slide 32 text

Predicate Push-Down • Calcite supports basic filters push-down • analysis of predicate is... non trivial @Override public Enumerable scan(DataContext root, List filters, int[] projects) { // now we also know predicates return Linq4j.emptyEnumerable(); }

Slide 33

Slide 33 text

Predicate Push-Down Exact match: select * from artifacts where group_id = 'nrepl'; Exact prefix: select * from artifacts where group_id LIKE 'com.apache.%';

Slide 34

Slide 34 text

Predicate Push-Down What if... select * from artifacts where group_id = 'nrepl' OR group_id = 'maven'; select * from artifacts where group_id = 'nrepl' AND group_id = 'maven'; What if... select * from artifacts where LOWER(group_id) + CAST("spark" AS VARCHAR) = 'com.apache.spark';

Slide 35

Slide 35 text

Predicates • do not need to read the code • analysis is too rigit and inflexible • might be easily "broken" by user or even compiler • imperative code quickly turns into a mess • visitor pattern helps, still :( • pattern matching, pls • logical PL, ideally

Slide 36

Slide 36 text

Optimization: toRel • table as a relational expression • opt-out from being a TableScan • e.g. artifacts might be defined as DISTINCT over versions • or, both artifacts and version tables are Project + Filter applied to virtual files table • in both cases, JOIN between them might be folded

Slide 37

Slide 37 text

Optimization: toRel • nice part of relational algebra: it's an algebra • hardest part of relational algebra: it's an algebra

Slide 38

Slide 38 text

Stitch Them Together • leverage Avatica (part of Calcite project) • standard JDBC interface • registers Calcite as a driver • local & remote connections

Slide 39

Slide 39 text

HAVING Q and A Oleksii Kachaiev camille-sql