Everything You Need to Know About NewSQL in 2020

The database is usually the heart of a software system, and there are many database technologies to pick from. In this talk, we’ll explore where RDBMS and NoSQL fall short and how NewSQL fills the gap. We’ll see what types of NewSQL databases exist and how they work, and go over the NewSQL solutions we can pick for our projects. By the end of the talk, we’ll have a good understanding of when and how to apply a NewSQL database in large-scale applications.

Nikolay Stoitsev

October 29, 2020

Transcript

  1. Everything You Need to Know About NewSQL in 2020
    Nikolay Stoitsev, Engineering Manager @ Halo DX

  2. Journey

  3. Relational DB
    NoSQL

  4. database size
    fetched per second
    updated per second

  5. ACID
    transactions

  6. Atomicity

  7. Consistency

  8. Isolation

  9. Durability

  10. Scaling
    How do we scale it?

  11. Horizontal
    vs
    Vertical

  15. “Bigger servers don’t exist”
    The SRE team

  16. Horizontal Scaling
    10TB

  17. Horizontal Scaling
    2.5TB 2.5TB 2.5TB 2.5TB

  18. Partitioning
    & Routing
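
Routing over range partitions can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the shard names are made up, and the 100k boundary simply mirrors the ID ranges shown on the replication slide.

```python
import bisect

# Hypothetical layout: each entry is the first ID owned by a shard.
# Shard names are invented for illustration.
SHARD_STARTS = [1, 100_001]
SHARD_NAMES = ["shard-a", "shard-b"]


def route(row_id: int) -> str:
    """Return the shard that owns row_id under range partitioning."""
    if row_id < SHARD_STARTS[0]:
        raise ValueError(f"no shard owns id {row_id}")
    # bisect_right finds the first start greater than row_id;
    # the owner is the shard just before that position.
    index = bisect.bisect_right(SHARD_STARTS, row_id) - 1
    return SHARD_NAMES[index]
```

Calling `route(42)` picks the first shard and `route(250_000)` the second. A real router would also need to know about replicas and in-flight rebalancing.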

  19. Replication
    ID 1 - 100k ID 1 - 100k
    ID 100k+ ID 100k+

  20. Failure handling
    ID 1 - 100k ID 1 - 100k
    ID 100k+ ID 100k+
    copy

  21. Rebalancing
    2.5TB 2.5TB 2.5TB 2.5TB
    500GB
    500GB
    500GB
    500GB
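
One reading of the numbers on this slide, stated as an assumption: a fifth node joins four nodes holding 2.5 TB each, so every existing node hands over 500 GB and all five end up with 2 TB. The helper below (a hypothetical name) just does that arithmetic.

```python
def rebalance_plan(node_sizes_gb: list[float], new_nodes: int = 1) -> tuple[float, list[float]]:
    """Return (target size per node, amount each existing node gives away)."""
    total = sum(node_sizes_gb)
    # After rebalancing, data is spread evenly over old and new nodes.
    target = total / (len(node_sizes_gb) + new_nodes)
    # Nodes already at or below the target keep everything they have.
    moves = [max(size - target, 0) for size in node_sizes_gb]
    return target, moves
```

With `rebalance_plan([2500, 2500, 2500, 2500])` the target is 2000 GB and each node moves 500 GB.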

  22. Horizontal scaling out of the box
    Meet NoSQL

  23. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T.
    Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for
    structured data. ACM Trans. Comput. Syst., 26:4:1–4:26, June 2008.

  24. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin,
    S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon’s highly
    available key-value store. SIGOPS Oper. Syst. Rev., 41:205–220, October 2007.

  25. Our choice
    Apache
    Cassandra

  26. Why Apache Cassandra?
    ● Sharding, Replication, Fault Tolerance
    ● Decentralized
    ● Multiple Data Centers
    ● Performance
    ● Proven in Production
    ● Gives Control

  28. But no
    transactions :(

  30. Consistent Hashing
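
A toy consistent-hash ring, to make the idea concrete. This is a sketch under assumptions: the class name, the use of SHA-256, and the 64 virtual nodes per server are illustrative choices, not something taken from the talk or from any particular database.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _position(key: str) -> int:
        # Any stable hash works; SHA-256 is used only for its even spread.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Each node appears at several positions to smooth out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._position(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def owner(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node at or after the key's position, wrapping around.
        index = bisect.bisect(self._ring, (self._position(key),))
        return self._ring[index % len(self._ring)][1]
```

The property that matters is that removing a node only reassigns the keys that node owned; every other key stays where it was.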

  31. Consistent Hashing
    Two Phase Commit
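
The two phases can be sketched with in-memory participants. The class and function names are hypothetical, and the sketch leaves out what makes the protocol hard in practice: timeouts, durable logging, and coordinator failure.

```python
class Participant:
    """In-memory participant; `can_commit` simulates its vote."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants) -> bool:
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # A single "no" vote aborts the whole transaction.
    decision = all(votes)
    # Phase 2: broadcast the decision to everyone.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return decision
```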

  32. Consistent Hashing
    Two Phase Commit
    Three Phase Commit

  33. Consistent Hashing
    Two Phase Commit
    Three Phase Commit
    Quorum
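
The usual quorum rule is that with N replicas, a write quorum of W and a read quorum of R, every read sees the latest write when R + W > N. The helpers below are illustrative names; the brute-force function only exists to check the rule on small clusters.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest group size that is more than half of n replicas."""
    return n // 2 + 1


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when every read quorum shares a replica with every write quorum."""
    return r + w > n


def always_intersect(n: int, w: int, r: int) -> bool:
    """Brute-force check of the same property by enumerating all quorums."""
    replicas = range(n)
    return all(
        set(writers) & set(readers)
        for writers in combinations(replicas, w)
        for readers in combinations(replicas, r)
    )
```

For three replicas, `W = R = 2` overlaps while `W = R = 1` does not, which is the trade-off behind tunable consistency levels.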

  34. Consistent Hashing
    Two Phase Commit
    Three Phase Commit
    Quorum
    Eventual Consistency

  35. Transactions + Scalability
    Meet NewSQL

  36. https://ai.google/research/pubs/pub27898

  37. Partitioning tables into tablets
    Tablet #1
    Tablet #2
    Tablet #3

  38. Tablets are replicated
    Node Node
    Node
    Node
    Tablet #1
    Tablet #1
    Tablet #1

  39. One leader for each tablet
    Leader Follower
    Follower
    Leader for tablet #1

  40. Paxos to pick the leader
    Leader Follower
    Follower

  41. Paxos to pick the leader
    Leader Follower
    Follower
    New Leader

  42. Read queries

  43. Read queries
    Leader Follower
    Follower

  44. Read queries
    Leader Follower
    Follower
    Can I read?
    Here is the data I have. Is it the latest?

  45. Read queries
    Leader Follower
    Follower
    Can I read?
    Yes

  46. Read queries
    Leader Follower
    Follower
    Can I read?
    Yes
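
The exchange on these slides can be sketched as a follower checking its version with the leader before serving a read. The version counter and the catch-up step taken when the answer is "no" are assumptions added for the sketch; the slides only show the "yes" case.

```python
class Leader:
    def __init__(self):
        self.version = 0
        self.value = None

    def write(self, value) -> None:
        self.version += 1
        self.value = value

    def is_latest(self, version: int) -> bool:
        """Answer a follower's question: is this version current?"""
        return version == self.version


class Follower:
    def __init__(self, leader: Leader):
        self.leader = leader
        self.version = 0
        self.value = None

    def sync(self) -> None:
        # Assumed catch-up path: copy the leader's current state.
        self.version, self.value = self.leader.version, self.leader.value

    def read(self):
        # Ask the leader whether the local copy is current before serving it.
        if not self.leader.is_latest(self.version):
            self.sync()
        return self.value
```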

  47. Write queries

  48. Write query
    Leader Follower
    Follower

  49. Write query
    Leader Follower
    Follower

  50. Write query
    Leader Follower
    Follower
    Write
    Write

  51. Write query
    Leader Follower
    Follower
    Write
    Write
    Done
    Done

  52. Write query
    Leader Follower
    Follower
    Write
    Write
    Done
    Done
    Done
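
The write path above can be sketched as a leader that forwards the entry to its followers and reports "Done" once enough replicas have acknowledged. Requiring a majority is an assumption here (it is what Paxos-style replication typically needs); the slides show both followers answering. The names are hypothetical.

```python
class Replica:
    def __init__(self, up: bool = True):
        self.up = up
        self.log = []

    def append(self, entry) -> bool:
        """Store the entry and acknowledge it; a down replica does neither."""
        if self.up:
            self.log.append(entry)
        return self.up


def leader_write(leader: Replica, followers: list, entry) -> bool:
    # The leader applies the write locally, then forwards it.
    acks = 1 if leader.append(entry) else 0
    acks += sum(f.append(entry) for f in followers)
    group_size = 1 + len(followers)
    # Report "Done" only when more than half of the group has the entry.
    return acks > group_size // 2
```

With one leader and two followers, the write still succeeds when a single follower is down, and fails when both are.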

  53. Distributed transactions

  54. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower

  55. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Paxos Magic

  56. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Transaction Leader

  57. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Transaction Leader

  58. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Transaction Leader
    Write

  59. Distributed transaction
    Leader
    Follower
    Follower
    Leader
    Follower
    Follower
    Transaction Leader
    Write
    Done

  60. Are transactions ACID?

  61. Atomicity

  62. Durability

  63. Consistency

  64. Isolation :(

  65. TrueTime API
    Atomic Clock GPS Clock

  66. TrueTime API
    time
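
As described in the Spanner paper, TrueTime reports the current time as an interval rather than a single instant, and a transaction waits until its commit timestamp has definitely passed ("commit wait"). The simulation below is only a sketch: the class names, the fixed `epsilon`, and the injectable clock are assumptions, and it does not talk to real atomic or GPS clocks.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class TrueTimeSim:
    """Simulated TrueTime: real time is assumed to lie within
    +/- epsilon seconds of the local clock reading."""

    def __init__(self, epsilon: float = 0.005, clock=time.monotonic, sleep=time.sleep):
        self.epsilon = epsilon
        self._clock = clock
        self._sleep = sleep

    def now(self) -> TTInterval:
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        """True once t has definitely passed."""
        return self.now().earliest > t

    def commit_wait(self) -> float:
        """Pick a commit timestamp, then wait until it is definitely in the past."""
        timestamp = self.now().latest
        while not self.after(timestamp):
            self._sleep(self.epsilon / 10)
        return timestamp
```

The wait lasts roughly twice `epsilon`, which is why keeping clock uncertainty small matters for write latency.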

  67. Types of NewSQL databases

  68. Databases with novel architecture
    CockroachDB
    ● Open Source - https://github.com/cockroachdb/cockroach
    ● Easy to set up
    ● Horizontal scalability and high availability
    ● Geo-partitioning and distribution of data
    ● ACID transactions

  69. How does it work?

  70. Calvin and FaunaDB
    ● Scalable
    ● ACID Transactions
    ● One global consensus protocol
    ● GraphQL
    ● User defined functions
    ● Commercial

  71. Databases with novel architecture
    NuoDB
    ● Horizontal scalability
    ● High availability
    ● ACID transactions
    ● Separate Transaction Management and Storage Management
    ● Used in banking
    ● https://nuodb.com
    VoltDB
    ● Horizontal scalability with Geo Replication
    ● High availability
    ● ACID transactions (Serialized)
    ● In-memory database
    ● Low latency
    ● Used in many domains
    ● https://www.voltdb.com/

  72. Middleware
    Apache Ignite
    ● Peer-to-peer mesh network
    ● Distributed queries
    ● Distributed caching
    ● Storage and processing framework
    ● Horizontally scalable
    ● Atomic types
    ● Replicated data structures
    ● https://ignite.apache.org
    Apache Trafodion
    ● SQL query language on Apache HBase
    ● Big data workloads
    ● ACID transactions
    ● https://trafodion.apache.org

  73. Managed Cloud Databases
    Cloud Spanner
    ● Horizontal scalability
    ● High availability
    ● ACID transactions
    ● Planet scale
    ● Widely used
    ● https://cloud.google.com/spanner
    Azure Cosmos DB
    ● Horizontal scalability
    ● High availability
    ● ACID transactions
    ● Planet scale
    ● Widely used
    ● Multi-Model
    ● https://azure.microsoft.com/en-us/services/cosmos-db

  74. When to use NewSQL

  75. Summary
    RDBMS
    ● Easy to run
    ● Has transactions
    ● Can’t scale
    NoSQL
    ● Can scale
    ● Provides availability
    ● No transactions
    NewSQL
    ● Can scale
    ● Has transactions
    ● Lacks maturity

  76. Nikolay Stoitsev
    Engineering Manager at Halo DX
