End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-Peris at Big Data Spain 2017

The talk will focus on explaining why operational databases do not scale due to limitations in legacy transactional management.

https://www.bigdataspain.org/2017/talk/end-of-the-myth-ultra-scalable-transactional-management

Big Data Spain 2017
November 16th - 17th Kinépolis Madrid

Big Data Spain

December 01, 2017

Transcript

  1. The End of a Myth: Ultra-Scalable Transactional Management

    Presented by: Ricardo Jiménez-Peris, CEO & Co-founder @ LeanXcale
  2. About the Speaker

    • Top researcher on scalable transactional management and distributed data management, with 100+ publications in top conferences and journals
    • Co-author of a book on database replication
    • Professor of distributed systems and data management for over 25 years
    • Co-inventor of two granted patents and 8 new patent applications
    • Invited speaker at top tech companies in Silicon Valley, such as Facebook, Twitter, Salesforce, Heroku, EMC-Pivotal (when it was EMC-Greenplum), HP, Microsoft
  3. About LeanXcale

    • Vendor of a NewSQL ultra-scalable database: full ACID, full SQL
    • LeanXcale HTAP database: blends operational and analytical capabilities, delivering real-time data
    • LeanXcale leverages an ultra-efficient storage engine, which is a relational key-value data store
    • Product team (slide graphic; the pairing of figures and labels is not recoverable from the transcript): 45 %, 30%, 15; awards (total number), PhD holders, 10-25 years of industry expertise, engineers from top technical universities
  4. The Myth

    “Operational databases cannot scale.”
    Why? Nobody managed to scale them in three decades.
    Some say that is due to the CAP theorem (vendors that do not provide ACID properties).
  5. The CAP Theorem

    C: Consistency, A: Availability, P: Partitions

    The CAP theorem states something very well known in distributed systems: if you want to tolerate partitions, choose one of:
    • Availability at all nodes and no consistency
    • Consistency and no availability at all nodes

    Q: Where is the S of Scalability? A: Nowhere
  6. The End of the Myth: Ultra-Scalable Transactions

    • Solved how to scale transactions to large scale (i.e., 100 million update transactions per second) in a fully seamless way
    • Breakthrough result of 15+ years of research by a tenacious team
  7. Scalability

    Evaluation without data manager/logging, to see how much throughput the transactional processing can attain: 2.35 million transactions per second.
  8. Ultra-Scalable Transactions

    Traditional transactional DB vs. LeanXcale (both diagrams plotted over time):
    • Traditional systems have a single-node bottleneck
    • LeanXcale processes and commits transactions in parallel, and provides a consistent view
  9. Traditional Approach: Single-Node Bottleneck

    Centralized Transaction Manager (Central TM) covering: Atomicity, Isolation Writes, Durability, Isolation Reads
  10. Scaling ACID Properties

    • Snapshot Server, Commit Sequencer: Isolation Reads
    • Conflict Managers: Isolation Writes
    • Loggers: Durability
    • Local TMs: Atomicity
  11. Main Principles

    Separation of commit from the visibility of committed data.
    Proactive pre-assignment of commit timestamps to committing transactions.
    Detection and resolution of conflicts before commit.
    Transactions can commit in parallel because:
    • They do not conflict
    • They already have their commit timestamp assigned, which determines their serialization order
    • Visibility is regulated separately to guarantee the reading of fully consistent states
  12. Transactional Life Cycle: Start

    The local transaction manager gets the “start TS” (the current consistent snapshot) from the snapshot server.
    Diagram: Local Txn Manager, Get start TS, Snapshot Server
  13. Transactional Life Cycle: Execution

    The transaction will read the state as of “start TS”.
    Write-write conflicts are detected by conflict managers on the fly.
    Diagram: Local Transaction Manager, Get start TS, Run on start TS snapshot, Conflict Manager
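The on-the-fly write-write conflict detection of slide 13 can be illustrated with a minimal sketch. It assumes a first-writer-wins rule under snapshot isolation: a write is rejected if another uncommitted transaction already holds the key, or if the key was committed after the writer's start TS. `ConflictManager`, `try_write`, and `release` are hypothetical names for illustration, not LeanXcale's API.

```python
class ConflictManager:
    """Hypothetical sketch of write-write conflict detection (not LeanXcale's API)."""

    def __init__(self):
        self._last_commit_ts = {}  # key -> commit TS of the last committed writer
        self._active_writer = {}   # key -> id of the uncommitted transaction holding it

    def try_write(self, txn_id, start_ts, key):
        """Return True if the write is allowed, False on a write-write conflict."""
        holder = self._active_writer.get(key)
        if holder is not None and holder != txn_id:
            return False  # a concurrent, uncommitted transaction wrote the key first
        if self._last_commit_ts.get(key, 0) > start_ts:
            return False  # the key was committed after this transaction's snapshot
        self._active_writer[key] = txn_id
        return True

    def release(self, txn_id, keys, commit_ts=None):
        """Release keys on commit (commit_ts given) or on abort (commit_ts=None)."""
        for key in keys:
            if self._active_writer.get(key) == txn_id:
                del self._active_writer[key]
                if commit_ts is not None:
                    self._last_commit_ts[key] = commit_ts


cm = ConflictManager()
first = cm.try_write("t1", start_ts=10, key="x")    # allowed: nobody holds "x"
second = cm.try_write("t2", start_ts=10, key="x")   # rejected: "t1" holds "x"
cm.release("t1", ["x"], commit_ts=11)
stale = cm.try_write("t3", start_ts=10, key="x")    # rejected: "x" committed at 11 > 10
fresh = cm.try_write("t4", start_ts=11, key="x")    # allowed: snapshot 11 includes the commit
```

Because the check happens at write time, a conflicting transaction can be aborted before it reaches commit, which is consistent with the "detection and resolution of conflicts before commit" principle stated in the deck.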
  14. Transactional Life Cycle: Commit

    The local transaction manager orchestrates the commit.
    Diagram: Local Txn Manager, Get start TS, Run on start TS snapshot, Commit
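  15. Transactional Life Cycle: Commit

    Diagram: the Local Transaction Manager interacts with the Commit Sequencer, the Logger, the Data Store, and the Snapshot Server:
    • Get Commit TS (Commit Sequencer)
    • Log (writeset, to the Logger)
    • Public Updates (writeset, to the Data Store)
    • Report to the Snapshot Server (Commit TS)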
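The commit steps labeled on slide 15 can be sketched as a single orchestration function. This is a minimal single-process illustration under stated assumptions: the step order follows the order of the labels on the slide, all class and method names are hypothetical, and the snapshot server here only records reports.

```python
import itertools


class CommitSequencer:
    """Hands out increasing commit timestamps (hypothetical sketch)."""
    def __init__(self, first_ts=1):
        self._counter = itertools.count(first_ts)

    def next_commit_ts(self):
        return next(self._counter)


class Logger:
    """Stands in for durable logging of writesets."""
    def __init__(self):
        self.records = []

    def log(self, commit_ts, writeset):
        self.records.append((commit_ts, dict(writeset)))


class DataStore:
    """Multi-version store: each key keeps (commit_ts, value) versions."""
    def __init__(self):
        self.versions = {}

    def apply(self, commit_ts, writeset):
        for key, value in writeset.items():
            self.versions.setdefault(key, []).append((commit_ts, value))

    def read(self, key, snapshot_ts):
        # Only versions committed at or before the snapshot are visible.
        visible = [v for v in self.versions.get(key, []) if v[0] <= snapshot_ts]
        return max(visible, key=lambda v: v[0])[1] if visible else None


class SnapshotServer:
    """Only records reported commit timestamps in this sketch."""
    def __init__(self):
        self.reported = []

    def report(self, commit_ts):
        self.reported.append(commit_ts)


def commit(writeset, sequencer, logger, store, snapshot_server):
    """Local transaction manager's commit path, in the order shown on the slide."""
    commit_ts = sequencer.next_commit_ts()  # Get Commit TS
    logger.log(commit_ts, writeset)         # Log the writeset
    store.apply(commit_ts, writeset)        # Make the updates public in the data store
    snapshot_server.report(commit_ts)       # Report to the snapshot server
    return commit_ts


seq, log, store, snap = CommitSequencer(first_ts=11), Logger(), DataStore(), SnapshotServer()
ts = commit({"x": 1}, seq, log, store, snap)
```

Reading `store.read("x", 10)` returns `None` while `store.read("x", 11)` returns `1`: the version exists in the store once committed, but a reader sees it only when its snapshot has reached the commit timestamp.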
  16. Transactional Life Cycle: Commit

    Sequence of timestamps received by the Snapshot Server (over time): 11, 15, 12, 14, 13
    Evolution of the current snapshot at the Snapshot Server: 11, 11, 12, 12, 15
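The example on slide 16 (timestamps received in the order 11, 15, 12, 14, 13) is consistent with a snapshot that only advances over a gap-free prefix of reported commit timestamps. The sketch below replays it under that assumption; the class name, the `report` method, and the initial snapshot value of 10 are hypothetical.

```python
class SnapshotServer:
    """Hypothetical sketch: the snapshot is the largest TS with no gaps below it."""

    def __init__(self, initial_snapshot):
        self.snapshot = initial_snapshot
        self._pending = set()  # reported timestamps that are not yet visible

    def report(self, commit_ts):
        """Record a reported commit TS and advance the snapshot as far as possible."""
        self._pending.add(commit_ts)
        while self.snapshot + 1 in self._pending:
            self.snapshot += 1
            self._pending.remove(self.snapshot)
        return self.snapshot


server = SnapshotServer(initial_snapshot=10)
evolution = [server.report(ts) for ts in [11, 15, 12, 14, 13]]
print(evolution)  # [11, 11, 12, 12, 15]
```

Timestamp 15 arrives early but stays invisible until 12, 13, and 14 have also been reported, so a transaction starting on the current snapshot never observes a state with a missing earlier commit.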
  17. Increasing Efficiency

    The approach described so far is the original reactive approach. It results in multiple messages per update transaction.
    The adopted approach is proactive:
    • The local transaction managers periodically report the number of committed update transactions per second
    • The commit sequencer distributes batches of commit timestamps to the local transaction managers
    • The snapshot server periodically gets batches of timestamps (both used and discarded) from the local transaction managers
    • The snapshot server periodically reports the most current consistent snapshot to the local transaction managers
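The proactive scheme of slide 17 can be sketched as follows. This is a simplified illustration, not the actual protocol: the batch size is fixed here, whereas the slide describes the local transaction managers reporting their commit rate (which could be used to size batches), and all names are hypothetical.

```python
class CommitSequencer:
    """Hands out contiguous batches of commit timestamps (hypothetical sketch)."""

    def __init__(self, first_ts=1):
        self._next = first_ts

    def allocate_batch(self, size):
        batch = range(self._next, self._next + size)
        self._next += size
        return batch


class LocalTxnManager:
    """Commits using pre-assigned timestamps, without one message per transaction."""

    def __init__(self, sequencer, batch_size):
        self._sequencer = sequencer
        self._batch_size = batch_size
        self._available = []  # pre-assigned timestamps not yet used
        self._used = []       # timestamps consumed since the last report

    def commit(self):
        if not self._available:
            self._available = list(self._sequencer.allocate_batch(self._batch_size))
        ts = self._available.pop(0)
        self._used.append(ts)
        return ts

    def periodic_report(self):
        """Return used and discarded timestamps for the snapshot server."""
        # Unused timestamps are discarded so they do not leave gaps in the sequence.
        report = sorted(self._used + self._available)
        self._used, self._available = [], []
        return report


seq = CommitSequencer(first_ts=1)
tm_a, tm_b = LocalTxnManager(seq, batch_size=3), LocalTxnManager(seq, batch_size=3)
commits = [tm_a.commit(), tm_b.commit(), tm_a.commit()]
report_a, report_b = tm_a.periodic_report(), tm_b.periodic_report()
```

Reporting discarded timestamps alongside used ones matters: if an unused timestamp were never reported, a snapshot that requires a gap-free sequence of commit timestamps could not advance past it.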
  18. Transactional Processing

    The transactional management provides ultra-scalability.
    Fully transparent:
    • No sharding.
    • No a priori knowledge required about the rows to be accessed.
    • Syntactically: no changes required in the application.
    • Semantically: equivalent behavior to a centralized system.
    Provides Snapshot Isolation (the isolation level provided by Oracle when set to “Serializable” isolation).
  19. Architecture

    • SQL Engine: OLTP & OLAP Query Engine
    • Transaction Manager: Ultra-Scalable Transactions
    • Storage: KiVi Key-Value Data Store
  20. Blending OLTP & OLAP: Making Decisions at the Right Time

    • Cutting costs of business analytics by 80%
    • Real-time analytical queries
    • No more ETLs
    • Analytical queries on operational data

    Diagram: Operational Database (OLTP) and Data Warehouse (OLAP) blended into OLTP + OLAP
  21. Offloading/Substituting Mainframe

    LeanXcale is the first database technology that can substitute the mainframe. It can bear the operational workloads of a mainframe while also providing real-time analytics over the operational data.
    It can be deployed alongside the mainframe and loaded/updated in real time, so applications can be offloaded from the mainframe one by one.
    LeanXcale is partnering with Bull Atos to provide a database appliance that will serve as a substitute for the mainframe.
  22. Reducing Cost of Ownership at Telcos

    • Enables implementing Customer Experience Management (CEM) with half the number of nodes.
    • Leverages the computation of aggregates in real time as raw KPIs are inserted. Analytical aggregation queries become simple single-row queries.
    • Elasticity substantially reduces the cost of operations personnel during non-working hours with low loads.
  23. Large IoT Applications

    • Using the key-value interface for large data ingestion in IoT applications, while the data remains accessible through SQL, and reducing the infrastructure needed by several times.
    • Real-time analytics.
    • Computation of aggregates in real time to reduce the cost of analytical aggregation queries, e.g., for the smart grid.
    • Elasticity makes it possible to adjust the consumption of resources to the load received.
  24. Disrupting Travel Tech

    • Using the key-value interface to reduce the footprint needed to get clicks
    • Real-time analytics for implementing availability checking
    • Elasticity makes it possible to adjust the consumption of resources to the load received
    • Full ACIDity to guarantee the consistency of the truth of sales and actual availability