Everything You Need to Know About NewSQL in 2020

The database is usually the heart of a software system, and there are many database technologies to choose from. In this talk, we’ll explore where RDBMS and NoSQL fall short and how NewSQL fills the gap. We’ll see what types of NewSQL databases exist and how they work, and go over the NewSQL solutions we can pick for our projects. By the end of the talk, we’ll have a good understanding of when and how to apply a NewSQL database in large-scale applications.

Nikolay Stoitsev

October 29, 2020

Transcript

  1. Everything You Need to Know About NewSQL in 2020
    Nikolay Stoitsev, Engineering Manager @ Halo DX

  2. Relational DB
    NoSQL

  3. database size
    fetched per second
    updated per second

  4. ACID
    transactions

  5. Scaling
    How to scale it???

  6. Horizontal
    vs
    Vertical

  7. “Bigger servers
    don’t exist”
    The SRE team

  8. Horizontal Scaling
    10TB

  9. Horizontal Scaling
    2.5TB 2.5TB 2.5TB 2.5TB

  10. Partitioning
    & Routing

  11. Replication
    ID 1 - 100k ID 1 - 100k
    ID 100k+ ID 100k+
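
The partitioning, routing, and replication slides can be pictured with a small sketch. It assumes range partitioning on a numeric ID, as the ID ranges on the slide suggest; the node names and the two-replica layout are hypothetical.

```python
from bisect import bisect_left
from typing import List

# Hypothetical layout mirroring the slide: IDs up to 100k live on one shard,
# larger IDs on another, and every shard is stored on two replica nodes.
SHARD_UPPER_BOUNDS = [100_000]      # shard 0 holds IDs <= 100_000
SHARD_REPLICAS = [
    ["node-a1", "node-a2"],         # replicas for IDs 1 - 100k
    ["node-b1", "node-b2"],         # replicas for IDs 100k+
]


def shard_for(record_id: int) -> int:
    """Return the index of the shard whose ID range contains record_id."""
    return bisect_left(SHARD_UPPER_BOUNDS, record_id)


def replicas_for(record_id: int) -> List[str]:
    """Route a request to the nodes that hold a copy of the record."""
    return SHARD_REPLICAS[shard_for(record_id)]
```

A router built this way only needs the sorted list of range boundaries, which is why adding or moving a range means updating that small piece of metadata.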

  12. Failure handling
    ID 1 - 100k ID 1 - 100k
    ID 100k+ ID 100k+
    copy

  13. Rebalancing
    2.5TB 2.5TB 2.5TB 2.5TB
    500GB
    500GB
    500GB
    500GB
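
The numbers on the rebalancing slide can be checked with a few lines of arithmetic. This sketch assumes the slide depicts one empty node joining four nodes of 2.5TB each; the function name is hypothetical.

```python
def rebalance(node_sizes_gb, new_nodes=1):
    """Spread data evenly after empty nodes join.

    Returns the target size per node and how much each existing node
    has to hand over, both in GB.
    """
    total = sum(node_sizes_gb)
    target = total / (len(node_sizes_gb) + new_nodes)
    moves = [size - target for size in node_sizes_gb]
    return target, moves


target, moves = rebalance([2500, 2500, 2500, 2500])
print(target)  # 2000.0
print(moves)   # [500.0, 500.0, 500.0, 500.0]
```

Each of the four original nodes ships 500GB, and all five nodes end up with 2TB.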

  14. Horizontal scaling out of the box
    Meet NoSQL

  15. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T.
    Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for
    structured data. ACM Trans. Comput. Syst., 26:4:1–4:26, June 2008.

  16. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin,
    S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon’s highly
    available key-value store. SIGOPS Oper. Syst. Rev., 41:205–220, October 2007.

  17. Our choice
    Apache
    Cassandra

  18. Why Apache Cassandra?
    Sharding, Replication, Fault Tolerance
    Decentralized
    Multiple Data Centers
    Performance
    Proven in Production
    Gives Control

  19. But no
    transactions :(

  20. Consistent Hashing
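
As a rough illustration of consistent hashing, here is a minimal ring with virtual nodes. It is a sketch, not how Cassandra implements it; the class name, the use of MD5, and the 64 virtual nodes per server are assumptions chosen for brevity.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns several positions so load spreads more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        """Return the node owning the first position clockwise from the key."""
        if not self._ring:
            raise LookupError("ring is empty")
        positions = [pos for pos, _ in self._ring]
        idx = bisect.bisect_right(positions, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The useful property is that adding a node only moves keys onto the new node; keys never shuffle between the nodes that were already there.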

  21. Consistent Hashing
    Two Phase Commit

  22. Consistent Hashing
    Two Phase Commit
    Three Phase Commit

  23. Consistent Hashing
    Two Phase Commit
    Three Phase Commit
    Quorum
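
The quorum idea can be sketched in a few lines: with N replicas, W write acknowledgements, and R replicas consulted per read, choosing R + W > N makes every read set overlap the latest write set. The function names and the (version, value) tuples are hypothetical.

```python
from itertools import combinations


def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True when any read set must share a replica with any write set."""
    return r + w > n


def read_latest(responses):
    """Pick the value with the highest version among (version, value) pairs."""
    return max(responses, key=lambda resp: resp[0])[1]


# N = 3, W = 2: the newest write reached two replicas, one is still stale.
replicas = [(2, "new"), (2, "new"), (1, "old")]

# R = 2: every possible pair of replicas contains at least one fresh copy.
results = {read_latest(pair) for pair in combinations(replicas, 2)}
```

With W = 1 and R = 1 on the same three replicas the check fails, and a read could land on the stale copy alone.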

  24. Consistent Hashing
    Two Phase Commit
    Three Phase Commit
    Quorum
    Eventual Consistency

  25. Transactions + Scalability
    Meet NewSQL

  26. https://ai.google/research/pubs/pub27898

  27. Partitioning tables into tablets
    Tablet #1
    Tablet #2
    Tablet #3

  28. Tablets are replicated
    Node Node
    Node
    Node
    Tablet #1
    Tablet #1
    Tablet #1

  29. One leader for each tablet
    Leader Follower
    Follower
    Leader for tablet #1

  30. Paxos to pick the leader
    Leader Follower
    Follower

  31. Paxos to pick the leader
    Leader Follower
    Follower
    New
    Leader

  32. Read queries

  33. Read queries
    Leader Follower
    Follower

  34. Read queries
    Leader Follower
    Follower
    Can I read?
    Here is the data I have. Is it the latest?

  35. Read queries
    Leader Follower
    Follower
    Can I read?
    Yes

  36. Read queries
    Leader Follower
    Follower
    Can I read?
    Yes
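
The read path on these slides, where a follower checks with the leader before answering, might look roughly like this. The classes and the version counter are hypothetical, and the slides do not show what happens when the answer is "no"; catching up from the leader is an assumption made here.

```python
class Leader:
    def __init__(self):
        self.version = 0
        self.data = {}

    def write(self, key, value):
        self.data[key] = value
        self.version += 1

    def is_latest(self, follower_version: int) -> bool:
        """Answer the follower's question: is my copy up to date?"""
        return follower_version == self.version


class Follower:
    def __init__(self, leader: Leader):
        self.leader = leader
        self.version = 0
        self.data = {}

    def sync(self):
        """Copy the leader's state (assumed catch-up step, not on the slides)."""
        self.data = dict(self.leader.data)
        self.version = self.leader.version

    def read(self, key):
        # Ask the leader whether the local copy is the latest before serving.
        if not self.leader.is_latest(self.version):
            self.sync()
        return self.data.get(key)
```

The point of the check is that reads can be served by any replica without returning stale data, at the cost of one round trip to the leader.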

  37. Write queries

  38. Write query
    Leader Follower
    Follower

  39. Write query
    Leader Follower
    Follower

  40. Write query
    Leader Follower
    Follower
    Write
    Write

  41. Write query
    Leader Follower
    Follower
    Write
    Write
    Done
    Done

  42. Write query
    Leader Follower
    Follower
    Write
    Write
    Done
    Done
    Done
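
The write sequence above (leader forwards the write, followers reply "Done", leader replies "Done") can be sketched as follows. The slides show both followers answering; acknowledging the client once a majority has the entry is an assumption here, in line with Paxos-style replication. All class names are hypothetical.

```python
class Replica:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.log = []

    def append(self, entry) -> bool:
        """Store the entry and reply 'Done' (True); an unhealthy node stays silent."""
        if not self.healthy:
            return False
        self.log.append(entry)
        return True


class TabletLeader:
    def __init__(self, followers):
        self.followers = followers
        self.log = []

    def write(self, entry) -> bool:
        """Replicate an entry and report whether a majority stored it."""
        self.log.append(entry)
        acks = 1  # the leader counts as one copy
        for follower in self.followers:
            if follower.append(entry):
                acks += 1
        group_size = len(self.followers) + 1
        # A real system would also have to clean up entries that miss the majority.
        return acks > group_size // 2
```

With three copies, one slow or failed follower does not block the write, which is where the availability of the tablet group comes from.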

  43. Distributed transactions

  44. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower

  45. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Paxos Magic

  46. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Transaction
    Leader

  47. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Transaction
    Leader

  48. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Transaction
    Leader
    Write

  49. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Transaction
    Leader
    Write
    Done
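
One way to picture the transaction leader coordinating the other group leaders is a two-phase commit, which is what the Spanner paper describes running across Paxos groups. The sketch below is a simplified in-memory version under that assumption; the class and function names are hypothetical, and failure recovery is left out.

```python
class Participant:
    """Stands in for the leader of one tablet group."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.staged = None
        self.data = {}

    def prepare(self, writes) -> bool:
        """Phase 1: stage the writes and vote yes or no."""
        if not self.can_commit:
            return False
        self.staged = dict(writes)
        return True

    def commit(self):
        """Phase 2: make the staged writes visible."""
        self.data.update(self.staged or {})
        self.staged = None

    def abort(self):
        """Phase 2: throw the staged writes away."""
        self.staged = None


def run_transaction(plan) -> bool:
    """plan is a list of (participant, writes); the caller acts as transaction leader."""
    votes = [participant.prepare(writes) for participant, writes in plan]
    if all(votes):
        for participant, _ in plan:
            participant.commit()
        return True
    for participant, _ in plan:
        participant.abort()
    return False
```

Either every group applies its part of the transaction or none does, which is the atomicity half of ACID across partitions.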

  50. Are transactions ACID?

  51. Isolation :(

  52. TrueTime API
    Atomic Clock GPS Clock

  53. TrueTime API
    time
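
In the Spanner paper, TrueTime returns an interval that is guaranteed to contain the true time, and a transaction waits out that uncertainty before its commit becomes visible ("commit wait"). The sketch below simulates the idea with an ordinary clock; the `epsilon` default, the class names, and the injectable `clock` and `sleep` parameters are assumptions for illustration, not Google's API.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class TrueTime:
    def __init__(self, epsilon=0.004, clock=time.time):
        self.epsilon = epsilon  # assumed clock uncertainty, in seconds
        self.clock = clock

    def now(self) -> TTInterval:
        """Return an interval assumed to contain the true time."""
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        """True once t has definitely passed."""
        return self.now().earliest > t


def commit_wait(tt: TrueTime, sleep=time.sleep) -> float:
    """Pick a commit timestamp, then wait until it is definitely in the past."""
    commit_ts = tt.now().latest
    while not tt.after(commit_ts):
        sleep(tt.epsilon / 2)
    return commit_ts
```

The wait lasts roughly twice the uncertainty, so tighter clocks (hence the atomic and GPS clocks) translate directly into lower commit latency.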

  54. Types of NewSQL databases



  55. Databases with novel architecture
    CockroachDB
    ● Open Source - https://github.com/cockroachdb/cockroach
    ● Easy to set up
    ● Horizontal scalability and high availability
    ● Geo-partitioning and distribution of data
    ● ACID transactions

  56. How does it work?

  57. Calvin and FaunaDB
    ● Scalable
    ● ACID Transactions
    ● One global consensus protocol
    ● GraphQL
    ● User defined functions
    ● Commercial

  58. Databases with novel architecture
    NuoDB
    ● Horizontal scalability
    ● High availability
    ● ACID transactions
    ● Separate Transaction Management and Storage Management
    ● Used in banking
    ● https://nuodb.com
    VoltDB
    ● Horizontal scalability with Geo Replication
    ● High availability
    ● ACID transactions (Serialized)
    ● In-memory database
    ● Low latency
    ● Used in many domains
    ● https://www.voltdb.com/

  59. Middleware
    Apache Ignite
    ● Peer-to-peer mesh network
    ● Distributed queries
    ● Distributed caching
    ● Storage and processing framework
    ● Horizontally scalable
    ● Atomic types
    ● Replicated data structures
    ● https://ignite.apache.org
    Apache Trafodion
    ● SQL query language on Apache HBase
    ● Big data workloads
    ● ACID transactions
    ● https://trafodion.apache.org

  60. Managed Cloud Databases
    Cloud Spanner
    ● Horizontal scalability
    ● High availability
    ● ACID transactions
    ● Planet scale
    ● Widely used
    ● https://cloud.google.com/spanner
    Azure Cosmos DB
    ● Horizontal scalability
    ● High availability
    ● ACID transactions
    ● Planet scale
    ● Widely used
    ● Multi-Model
    ● https://azure.microsoft.com/en-us/services/cosmos-db

  61. When to use NewSQL



  62. Summary
    RDBMS
    ● Easy to run
    ● Has transactions
    ● Can’t scale
    NoSQL
    ● Can scale
    ● Provides availability
    ● No transactions
    NewSQL
    ● Can scale
    ● Has transactions
    ● Lacks maturity

  63. Nikolay Stoitsev
    Engineering Manager at Halo DX
