Upgrade to Pro — share decks privately, control downloads, hide ads and more …

ToroDB Internals: How to Create a NoSQL Database on Top of SQL

8Kdata
November 18, 2015

ToroDB Internals: How to Create a NoSQL Database on Top of SQL

ToroDB. New NoSQL database. Open source. Not reinventing the wheel, runs on top of PostgreSQL. MongoDB-compatible. With transactions. Does not suffer dirty-reads. With replication & sharding. This talk will show to the audience that SQL databases are not dead --definitely not. They shine. Even competing face to face with NoSQL. Now it's possible to build a NoSQL database on top of a SQL database. And probably, a better one. Discover how PostgreSQL can become a NoSQL database: how to map NoSQL documents to relational (no jsonb needed!); how to emulate MongoDB replication with PostgreSQL's awesome logical decoding; how to map MongoDB queries to SQL; how to benchmark and compare ToroDB & MongoDB. And how to turn a MongoDB-compatible database into a big data and analytics animal, in pure SQL.

8Kdata

November 18, 2015
Tweet

More Decks by 8Kdata

Other Decks in Programming

Transcript

  1. ToroDB @NoSQLonSQL About *8Kdata* • Research & Development in databases

    • Consulting, Training and Support in PostgreSQL • Founders of PostgreSQL España, 5th largest PUG in the world (>500 members as of today) • About myself: CTO at 8Kdata: @ahachete http://linkd.in/1jhvzQ3 www.8kdata.com
  2. ToroDB @NoSQLonSQL ToroDB in one slide • Document-oriented, JSON, NoSQL

    db • Open source (AGPL) • MongoDB compatibility (wire protocol level) • Uses PostgreSQL as a storage backend
  3. ToroDB @NoSQLonSQL How to make a NoSQL database • Varying

    schema (aka “schema-less”) • Document-oriented API • Replication • Horizontal scalability
  4. ToroDB @NoSQLonSQL ToroDB's Document 2 Relational transformation • Data is

    stored in tables. No blobs • JSON documents are split by hierarchy levels into “subdocuments”, which contain no nested structures. Each subdocument level is stored separately • Subdocuments are classified by “type”. Each “type” maps to a different table
  5. ToroDB @NoSQLonSQL • A “structure” table keeps the subdocument “schema”

    • Keys in JSON are mapped to attributes, which retain the original name • Tables are created dinamically and transparently to match the exact types of the documents ToroDB's Document 2 Relational transformation
  6. ToroDB @NoSQLonSQL ToroDB storage internals { "name": "ToroDB", "data": {

    "a": 42, "b": "hello world!" }, "nested": { "j": 42, "deeper": { "a": 21, "b": "hello" } } }
  7. ToroDB @NoSQLonSQL ToroDB storage internals The document is split into

    the following subdocuments: { "name": "ToroDB", "data": {}, "nested": {} } { "a": 42, "b": "hello world!"} { "j": 42, "deeper": {}} { "a": 21, "b": "hello"}
  8. ToroDB @NoSQLonSQL ToroDB storage internals select * from demo.t_3 ┌─────┬───────┬────────────────────────────┬────────┐

    │ did │ index │ _id │ name │ ├─────┼───────┼────────────────────────────┼────────┤ │ 0 │ ¤ │ \x5451a07de7032d23a908576d │ ToroDB │ └─────┴───────┴────────────────────────────┴────────┘ select * from demo.t_1 ┌─────┬───────┬────┬──────────────┐ │ did │ index │ a │ b │ ├─────┼───────┼────┼──────────────┤ │ 0 │ ¤ │ 42 │ hello world! │ │ 0 │ 1 │ 21 │ hello │ └─────┴───────┴────┴──────────────┘ select * from demo.t_2 ┌─────┬───────┬────┐ │ did │ index │ j │ ├─────┼───────┼────┤ │ 0 │ ¤ │ 42 │ └─────┴───────┴────┘
  9. ToroDB @NoSQLonSQL ToroDB storage internals select * from demo.structures ┌─────┬────────────────────────────────────────────────────────────────────────────┐

    │ sid │ _structure │ ├─────┼────────────────────────────────────────────────────────────────────────────┤ │ 0 │ {"t": 3, "data": {"t": 1}, "nested": {"t": 2, "deeper": {"i": 1, "t": 1}}} │ └─────┴────────────────────────────────────────────────────────────────────────────┘ select * from demo.root; ┌─────┬─────┐ │ did │ sid │ ├─────┼─────┤ │ 0 │ 0 │ └─────┴─────┘
  10. ToroDB @NoSQLonSQL Why not jsonb? • jsonb is really cool

    • But keys (metadata) are repeated most of the time (storage, I/O savings) • Docuements are not classified (normalized) • ToroDB does all that
  11. ToroDB @NoSQLonSQL Metadata repetition { “a”: 1, “b”: 2 }

    { “a”: 3 } { “a”: 4, “c”: 5 } { “a”: 6, “b”: 7 } { “b”: 8 } { “a”: 9, “b”: 10 } { “a”: 11, “b”: 12, “j”: 13 } { “a”: 14, “c”: 15 } Counting “document types” in collections of millions: at most, 1000s of different types
  12. ToroDB @NoSQLonSQL ToroDB storage and I/O savings 29% - 68%

    storage required, compared to Mongo 2.6
  13. ToroDB @NoSQLonSQL ToroDB: query “by structure” • ToroDB is effectively

    partitioning by type • Structures (schemas, partitioning types) are cached in ToroDB memory • Queries only scan a subset of the data • Negative queries are served directly from memory
  14. ToroDB @NoSQLonSQL MongoDB WP and API • ToroDB speaks MongoDB

    protocol natively, and implements Mongo API • All MongoDB tools, drivers, GUIs, whould work as is on ToroDB • Offer a widely used API on top of your current infrastructure
  15. ToroDB @NoSQLonSQL • NoSQL is trying to get back to

    SQL • ToroDB is SQL native! • Insert with Mongo, query with SQL! • Powerful PostgreSQL SQL: window functions, recursive queries, hypothetical aggregates, lateral joins, CTEs, etc ToroDB: native SQL
  16. ToroDB @NoSQLonSQL db.toroviews.insert({ id:5, person: { name: "Alvaro", surname: "Hernandez",

    contact: { email: "[email protected]", verified: true } } }) db.toroviews.insert({ id:6, person: { name: "Name", surname: "Surname", age: 31, contact: { email: "[email protected]" } } }) ToroDB VIEWs
  17. ToroDB @NoSQLonSQL torodb$ select * from toroviews.person ; ┌─────┬───────────┬────────┬─────┐ │

    did │ surname │ name │ age │ ├─────┼───────────┼────────┼─────┤ │ 0 │ Hernandez │ Alvaro │ ¤ │ │ 1 │ Surname │ Name │ 31 │ └─────┴───────────┴────────┴─────┘ (2 rows) torodb$ select * from toroviews."person.contact"; ┌─────┬──────────┬────────────────────────┐ │ did │ verified │ email │ ├─────┼──────────┼────────────────────────┤ │ 0 │ t │ [email protected] │ │ 1 │ ¤ │ [email protected] │ └─────┴──────────┴────────────────────────┘ (2 rows) ToroDB VIEWs
  18. ToroDB @NoSQLonSQL Replication protocol choice • ToroDB is based on

    PostgreSQL • PostgreSQL has either binary streaming replication (async or sync) or logical replication • MongoDB has logical replication • ToroDB uses MongoDB's protocol
  19. ToroDB @NoSQLonSQL MongoDB's replication protocol • Every change is recorded

    in JSON documents, idempotent format (collection: local.oplog.rs) • Slaves pull these documents from master (or other slaves) asynchronously • Changes are applied and feedback is sent upstream
  20. ToroDB @NoSQLonSQL ToroDB v0.4 • ToroDB works as a secondary

    slave of a MongoDB master (or slave, chained rep) • Implements the full replication protocol (not as an oplog tailable query) • Replicates from Mongo to a PostgreSQL • Open source github.com/torodb/torodb (repl branch, version 0.4-SNAPSHOT)
  21. ToroDB @NoSQLonSQL MongoDB's oplog & tailable cursors Oplog is a

    system, capped collection which contains idempotent DML commands Tailable cursors are “open queries” where data is streamed as it appears on the database Both are used a lot (replication, Meteor...)
  22. ToroDB @NoSQLonSQL How to do oplog & tailable with PG?

    Thanks to 9.4's logical decoding, it's easy Replication (master) is just plain logical decoding, transformed to MongoDB's json format Tailable cursors would be logical decoding plus filtering (to make sure streamed data matches query filters)
  23. ToroDB @NoSQLonSQL Write scalability (sharding) • MongoDB's sharding API not

    implemented yet (roadmap: ToroDB 0.8) • Will use MongoDB's mongos without modification, as well as config servers • That might change in the future (pg_shard?)
  24. ToroDB @NoSQLonSQL Horizontal scalability (storage level) • Another non-exclusive option

    is to have ToroDB store data in a distributed database • Requires a distributed database like GreenPlum, CitusDb or RedShift • Paired with replication as a slave: DW in NoSQL enabler
  25. ToroDB @NoSQLonSQL • GreenPlum was open sourced Oct/15 • ToroDB

    already runs on GP! CitusDB 5.0 port coming soon! • Goal: ToroDB v0.4 replicates from a MongoDB master onto Toro-Greenplum and runs the reporting queries using distributed SQL ToroDB on GP: SQL DW for MongoDB
  26. ToroDB @NoSQLonSQL • Amazon reviews dataset Image-based recommendations on styles

    and substitutes J. McAuley, C. Targett, J. Shi, A. van den Hengel SIGIR, 2015 • AWS c4.xlarge (4vCPU, 8GB RAM) 4KIOPS SSD EBS • 4x shards, 3x config; 4x segments GP • 83M records, 65GB plain json Benchmark
  27. ToroDB @NoSQLonSQL Disk usage Mongo 3.0, WT, Snappy GP columnar,

    zlib level 9 table size index size total size 0 10000000000 20000000000 30000000000 40000000000 50000000000 60000000000 70000000000 80000000000 Storage requirements MongoDB vs ToroDB on Greenplum Mongo ToroDB on GP bytes
  28. ToroDB @NoSQLonSQL SELECT count( distinct( "reviewerID" ) ) FROM reviews;

    Queries: which one is easier? db.reviews.aggregate([ { $group: { _id: "reviewerID"} }, { $group: {_id: 1, count: { $sum: 1}} } ])
  29. ToroDB @NoSQLonSQL SELECT "reviewerName", count(*) as reviews FROM reviews GROUP

    BY "reviewerName" ORDER BY reviews DESC LIMIT 10; Queries: which one is easier? db.reviews.aggregate( [ { $group : { _id : '$reviewerName', r : { $sum : 1 } } }, { $sort : { r : -1 } }, { $limit : 10 } ], {allowDiskUse: true} )
  30. ToroDB @NoSQLonSQL Query times 3 different queries Q3 on MongoDB:

    aggregate fails 27,95 74,87 0 0 200 400 600 800 1000 1200 969 1007 0 35 13 31 Query duration (s) MongoDB vs ToroDB on Greenplum MongoDB ToroDB on GP speedup seconds
  31. ToroDB @NoSQLonSQL Atomic operations • There is no support for

    atomic bulk insert/update/delete operations • Not even with $isolated: “Prevents a write operation that affects multiple documents from yielding to other reads or writes […] You can ensure that no client sees the changes until the operation completes or errors out. The $isolated isolation operator does not provide “all-or-nothing” atomicity for write operations.” http://docs.mongodb.org/manual/reference/operator/update/isolated/
  32. ToroDB @NoSQLonSQL Cheap single-node durability • Without journaling, MongoDB is

    not durable nor crash-safe • MongoDB requires “j: true” for true single-node durability. But who guarantees its consistent usage? Who uses it by default? j:true creates I/O storms equivalent to SQL CHECKPOINTs
  33. ToroDB @NoSQLonSQL “Clean” reads http://docs.mongodb.org/manual/reference/write-concern/#read-isolation-behavior “MongoDB will allow clients to

    read the results of a write operation before the write operation returns.” “If the mongod terminates before the journal commits, even if a write returns successfully, queries may have read data that will not exist after the mongod restarts.” Thus, MongoDB suffers from dirty reads. But let's call them just “tainted reads”.
  34. ToroDB @NoSQLonSQL “Clean” reads What about $snapshot? Nope: “The snapshot()

    does not guarantee that the data returned by the query will reflect a single moment in time nor does it provide isolation from insert or delete operations.” http://docs.mongodb.org/manual/faq/developers/#faq-developers-isolate-cursors Cursors in ToroDB run in repeatable read, read-only mode: globalCursorDataSource.setTransactionIsolation("TRANSACTIO N_REPEATABLE_READ"); globalCursorDataSource.setReadOnly(true);
  35. ToroDB @NoSQLonSQL MongoDB's dirty and stale reads (repl) Dirty reads

    A primary in minority accepts a write that other clients see, but it later steps down, write is rolled back (fixed in 3.2?) Stale reads A primary in minority serves a value that ought to be current, but a newer value was written to the other primary in minority
  36. ToroDB @NoSQLonSQL Data discoverability, SQL connectors • They are two

    of the major announcements for MongoDB 3.2 • To discover data, MongoDB samples data. ToroDB: just look at table structures! (and join with root if you want a count) • SQL connectors: native, no emulation