NewSQL: what, when and how

The database has always been one of the key components of every architecture. There is a wide range of tradeoffs to consider and many implementations to choose from. If we need consistency and correctness at the cost of availability and performance, we should pick a relational database. If we need scale and increased availability and can sacrifice transactional and consistency guarantees, we should use a NoSQL database. And if we need both horizontal scalability and transactions, we need a NewSQL database. During this talk we'll explore what guarantees a NewSQL system provides. We'll go over the different approaches to building such a system, and we'll look at some open source projects that implement each approach. By the end of the talk we'll have a good understanding of when and how to apply a NewSQL database in our large-scale applications.

Nikolay Stoitsev

November 03, 2018

Transcript

  1. None
  2. NewSQL: What, When and How. Nikolay Stoitsev @ Uber Engineering Sofia

  3. RDBMS → NoSQL → NewSQL

  4. RDBMS

  5. ACID Transactions

  6. ACID Transactions

  7. ACID Transactions

  8. ACID Transactions

  9. ACID Transactions

  10. None
  11. DC1 DC2

  12. DC1 DC2

  13. Availability

  14. Consistency

  15. PSQL can really scale

  16. 2K Database size 10+TB Rows fetched per second 500 Rows updated per second Our cluster

  17. Vertical scaling

  18. “Bigger servers don’t exist”

  19. (1) Sharding Data (2) Migrating to NoSQL

  20. None
  21. None
  22. None
  23. NoSQL

  24. Vertical scaling

  25. Sharding and Replication

  26. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst., 26:4:1–4:26, June 2008.

  27. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's highly available key-value store. SIGOPS Oper. Syst. Rev., 41:205–220, October 2007.

  28. Apache HBase Apache Cassandra

  29. We picked Apache Cassandra

  30. None
  31. Configurable Consistency

  32. No ACID Transactions

  33. NewSQL

  34. Scalability of NoSQL + ACID Transactions

  35. Data is partitioned and replicated

  36. https://ai.google/research/pubs/pub27898

  37. True Time API

  38. Decetralized transaction management

  39. Tablet 1 Tablet 2 Tablet 3

  40. Leader Follower Follower Paxos

  41. Leader Follower Follower Read

  42. Leader Follower Follower Read Can I read?

  43. Leader Follower Follower Write

  44. Leader 1 Leader 2 Distributed Write Transaction Manager

  45. https://github.com/cockroachdb/cockroach

  46. https://www.cockroachlabs.com/blog/living-without-atomic-clocks/

  47. Apache Ignite

  48. Used at Uber for in-house framework for massive distributed apps

  49. In-memory distributed database

  50. Distributed data structures

  51. SQL Support

  52. Brings the computation to the data

  53. https://dzone.com/articles/apache-ignite-transactions-architecture-ignite-per

  54. https://ignite.apache.org/features/sql.html

  55. Which database to pick?

  56. No one size fits all database

  57. RDBMS: easy to run, has transactions. NoSQL: can scale, provides availability. NewSQL: can scale, has transactions.

  58. Thanks! Nikolay Stoitsev Uber Engineering Sofia
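
Slides 37 and 38 name the True Time API and decentralized transaction management without showing how the two fit together. The Python sketch below is one rough illustration of the commit-wait idea from the Spanner paper linked on slide 36: the clock reports an interval that is assumed to contain the true time, and a write is only made visible once its commit timestamp has definitely passed. The names `TrueTimeSketch`, `TTInterval`, `commit`, and the `epsilon` parameter are hypothetical and chosen for this example; this is not code from Spanner or from any project mentioned in the talk.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TTInterval:
    """Bounds within which the true current time is assumed to lie."""
    earliest: float
    latest: float


class TrueTimeSketch:
    """Hypothetical stand-in for a clock with bounded uncertainty."""

    def __init__(self, epsilon: float,
                 clock: Callable[[], float] = time.monotonic):
        self.epsilon = epsilon  # assumed maximum clock error, in seconds
        self.clock = clock

    def now(self) -> TTInterval:
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, timestamp: float) -> bool:
        # True only once `timestamp` has definitely passed.
        return self.now().earliest > timestamp


def commit(tt: TrueTimeSketch,
           apply_write: Callable[[float], None],
           sleep: Callable[[float], None] = time.sleep) -> float:
    """Pick a commit timestamp, then wait out the uncertainty
    before making the write visible (commit wait)."""
    commit_ts = tt.now().latest
    while not tt.after(commit_ts):
        sleep(tt.epsilon / 10)
    apply_write(commit_ts)
    return commit_ts
```

In this sketch the wait lasts roughly twice `epsilon`, which is why a small clock uncertainty matters for write latency. The `clock` and `sleep` parameters are injectable only so the behavior can be exercised without real waiting.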