NoSQL Matters keynote, Dublin - June 4th 2015

What matters when choosing a data processing platform, and when getting the most out of that platform. And what will matter most as we take things to the next level...

Adrian Colyer

June 04, 2015

Transcript

  1. NoSQL Matters @adriancolyer

  2. What really matters...

    1. when choosing a data store / processing platform
    2. when it comes to getting the most out of that platform
    3. when we take things to the next level
  3. None
  4. The 13 horsemen of the apocalypse... Your application(s)

    Columns: Anomaly (Prevented By) | Tolerable? | Mitigation (M,G,A…)
    • Dirty Writes (Read Uncommitted)
    • Dirty Reads (Read Committed)
    • Fuzzy Reads, non-repeatable (Item-Cut Isolation)
    • Phantoms (Predicate-Cut Isolation)
    ...
  5. Your application(s)

    Columns: Anomaly (Prevented By) | Tolerable? | Mitigation
    • Read Skew (MAV Isolation + item-cut)
    • Lost Update (Repeatable Read)
    • Cursor Lost Update (Cursor Stability)
    • Write Skew (Repeatable Read)
    • Stale Reads (Partition-intolerance)
  6. Your application(s)

    Columns: Anomaly (Prevented By) | Tolerable? | Mitigation
    • Non-monotonic read (Monotonic reads)
    • Non-monotonic write (Monotonic writes)
    • Invisible cause (Writes-follow-reads)
    • Disappearing writes (Read-your-writes, for sessions)
  7. Your Developers

    “we believe there is considerable work to be done to improve the programmability of highly-available systems” - Bailis et al. 2014 (HAT)
  8. Your Developers

    “...an unacceptable burden to place on developers” - Google 2012 (F1)
  9. Consistency and all that...

    If you accept a weaker consistency model, make sure it’s a genuine trade-off and you’re getting something (you need) in return. You can have causal consistency with (C)AC
  10. PACELC (pass-elk)

  11. Operations & all the other use cases

    “…it is important to consider the data accesses that don’t use the API. These include back-ups, bulk import and deletion of data, bulk migrations from one data format to another, replica creation, asynchronous replication, consistency monitoring tools, and operational debugging. An alternate store would also have to provide atomic write transactions, efficient granular writes, and few latency outliers.” - Facebook 2013 (TAO)
  12. “it tears you apart with suspense!”

  13. None
  14. Why is it so hard?

    “We have found that the standard verification techniques in industry are necessary but not sufficient. We use deep design reviews, code reviews, static code analysis, stress testing, fault-injection testing, and many other techniques, but we still find that subtle bugs can hide in complex concurrent fault-tolerant systems.” - Amazon 2014
  15. In the ALPS...

  16. … or a walk in the park?

  17. (Web)Scale: The USL. Source: McSherry et al. 2015. Credit: Neil Gunther

  18. (Web)Scale. Source: McSherry et al. 2015
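
The USL named on slide 17 is the Universal Scalability Law credited to Neil Gunther. As a rough sketch only, the snippet below evaluates its usual closed form, C(N) = N / (1 + α(N − 1) + βN(N − 1)), where α models contention and β models coherency (crosstalk) cost. The function names and the coefficient values are illustrative assumptions, not figures from the talk.

```python
import math


def usl_capacity(n, alpha, beta):
    """Relative capacity at n workers under the Universal Scalability Law.

    alpha: contention (serialisation) coefficient, assumed 0 <= alpha < 1
    beta:  coherency (crosstalk) coefficient, assumed beta >= 0
    """
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def peak_concurrency(alpha, beta):
    """Worker count at which capacity peaks: sqrt((1 - alpha) / beta)."""
    return math.sqrt((1 - alpha) / beta)


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to show the shape of the curve.
    alpha, beta = 0.05, 0.001
    for n in (1, 8, 32, 64, 128):
        print(n, round(usl_capacity(n, alpha, beta), 2))
    print("capacity peaks near n =", round(peak_concurrency(alpha, beta), 1))
```

With any β greater than zero the curve eventually turns downwards, so adding workers past the peak reduces throughput rather than merely flattening it.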

  19. Big?!

  20. How Big?

    “Working sets are Zipf-distributed. We can therefore store in memory all but the very largest datasets, which we avoid storing in memory altogether. For example, the distribution of input sizes of MapReduce jobs at Facebook is heavy-tailed. Furthermore, 96% of active jobs can have their entire data simultaneously fit in the corresponding clusters’ memory” - Tachyon, Li et al. 2014
  21. Musketeer

  22. Performance

    40-80% of all MR jobs would perform better on a single machine! (and cost less, and be easier to operate, and have many fewer failures…)
  23. COST: The Configuration that Outperforms a Single Thread

    “You can have a second computer once you’ve shown you know how to use the first one.” - Paul Barham
  24. vs a single thread...
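
A minimal sketch of how a COST figure could be read off benchmark results, assuming COST is taken as the smallest configuration (here, a core count) at which the scalable system beats a competent single-threaded implementation. The function name and all timings are hypothetical and are not measurements from McSherry et al.

```python
def cost(single_thread_seconds, system_seconds_by_cores):
    """Return the smallest core count at which the system outperforms
    the single-threaded baseline, or None if no measured configuration does
    (an unbounded COST, as far as the measurements go)."""
    for cores in sorted(system_seconds_by_cores):
        if system_seconds_by_cores[cores] < single_thread_seconds:
            return cores
    return None


if __name__ == "__main__":
    # Hypothetical runtimes (seconds) for one workload.
    baseline = 300.0
    measured = {16: 900.0, 64: 400.0, 128: 250.0}
    print("COST:", cost(baseline, measured), "cores")
```

The baseline has to be measured first; without it there is nothing to compare the distributed numbers against.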

  25. FlashGraph vs Pregel

    • Pregel: 1B vertices, 127B edges, 300 machines
    • FlashGraph: 3.4B vertices, 129B edges, 1 machine
  26. ApproxHadoop

  27. BlinkDB

  28. Sometimes it pays to wait (a little bit)

  29. What’s the bottleneck?

    • Network I/O?
    • Disk I/O?
    • CPU?
    Measure before optimising… and avoid excessive serialization and deserialization!
  30. X (multi-core) / Distributed X / In-memory X / Flash Optimised X / NVMM X / NVMM & RDMA X / X (establish baseline COST)
  31. None
  32. ALPS, ACID 2.0, CRDTs, CAC, COPS, CRON, CALM, CAP, & CRAP!
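
Of the acronyms on slide 32, CRDTs are the easiest to show in a few lines. The sketch below is a grow-only counter (G-Counter), one of the simplest state-based CRDTs described in the Shapiro et al. study listed in the references. The class and method names are illustrative assumptions rather than any particular library's API.

```python
class GCounter:
    """Grow-only counter: each replica increments only its own slot, and
    merge takes the per-replica maximum, so merges are commutative,
    associative and idempotent."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments seen from that replica

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise maximum over every replica slot known to either side.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)


if __name__ == "__main__":
    a, b = GCounter("a"), GCounter("b")
    a.increment(2)
    b.increment(3)
    a.merge(b)
    b.merge(a)
    print(a.value(), b.value())
```

Because merge never needs to know the order in which updates arrived, replicas can exchange state at any time without coordinating.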
  33. Coordination Avoidance

    Invariant-Confluence for application level constraints
    • NOT NULL
    • PRIMARY KEY (read & delete, but not insert)
    • UNIQUE (read & delete, insert?)
    • FOREIGN KEY (insert, cascade delete, but not delete)
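
To make the constraint list on slide 33 concrete, here is a small sketch assuming two replicas that accept inserts without coordinating and later merge by set union. It shows a single scenario, not a proof: NOT NULL survives the merge, while uniqueness of an id holds on each replica yet fails on the merged state. The row layout and function names are hypothetical.

```python
def not_null(rows):
    """Invariant: no row has a missing name."""
    return all(name is not None for _, name in rows)


def unique_id(rows):
    """Invariant: no two rows share the same id."""
    ids = [row_id for row_id, _ in rows]
    return len(ids) == len(set(ids))


def merge(rows_a, rows_b):
    """Merge divergent replica states by set union."""
    return rows_a | rows_b


if __name__ == "__main__":
    base = frozenset({(1, "ann")})
    replica_a = base | {(2, "bob")}  # insert accepted locally on A
    replica_b = base | {(2, "cat")}  # concurrent insert with the same id on B
    merged = merge(replica_a, replica_b)

    for name, check in (("NOT NULL", not_null), ("UNIQUE id", unique_id)):
        print(name, check(replica_a), check(replica_b), check(merged))
```

An invariant that holds locally but can break after a merge is the case that needs coordination. Invariants that survive every such merge can be enforced without it.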
  34. None
  35. Life Beyond...

    “In recent years, many ‘NoSQL’ designs have avoided cross-partition transactions entirely, effectively providing Read Uncommitted isolation…” - Bailis et al. 2014
    From: “Life Beyond Distributed Transactions”, To: “Read-Atomic Multiple Partition” Transactions (RAMP)
  36. None
  37. Your application(s): From anomalies to invariants...

    Columns: Invariant | Type | Affected Txns | I-Confluent?
  38. Some closing thoughts

    • Do you need eventual?
    • Have you planned for anomalies?
    • Does it actually work?
    • Are you distributing for the right reasons? (AL…)
    • Do you need exact?
    • Do you need it ASAP?
    • Can you keep CALM?
    • Do you understand your application’s invariants?
  39. http://blog.acolyer.org @adriancolyer

  40. References

    • Highly Available Transactions, Virtues & Limitations - Bailis et al. 2014 http://blog.acolyer.org/2014/11/07/highly-available-transactions-virtues-and-limitations/
    • Building on Quicksand - Helland 2009 http://blog.acolyer.org/2015/03/23/building-on-quicksand/
    • F1: A Distributed SQL Database that Scales - Google 2012 http://blog.acolyer.org/2015/01/06/f1-a-distributed-sql-database-that-scales/
    • Scalability! But at what COST? - McSherry et al. 2015 http://blog.acolyer.org/?p=941 (to appear, June 5th 2015)
    • Applying the Universal Scalability Law to Organisations - Colyer 2015 http://blog.acolyer.org/2015/04/29/applying-the-universal-scalability-law-to-organisations/
  41. References

    • Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS - Lloyd et al. 2011 http://blog.acolyer.org/2015/03/17/consistency-availability-and-convergence-cops/
    • Consistency, Availability, and Convergence - Mahajan et al. 2014 http://blog.acolyer.org/2015/03/17/consistency-availability-and-convergence-cops/
    • Tachyon: Reliable, Memory-Speed Storage for Cluster Computing - Li et al. 2014 http://blog.acolyer.org/2014/12/04/tachyon-reliable-memory-speed-storage-for-cluster-computing/
  42. References

    • Musketeer: all for one, one for all in data processing systems - Gog et al. 2015 http://blog.acolyer.org/2015/04/27/musketeer-part-i-whats-the-best-data-processing-system/ and http://blog.acolyer.org/2015/04/28/musketeer-part-ii-one-for-all-and-all-for-one/
    • Pregel: A System for Large-Scale Graph Processing - Google 2010 http://blog.acolyer.org/2015/05/26/pregel-a-system-for-large-scale-graph-processing/
    • FlashGraph: Processing Billion Node Graphs on an array of commodity SSDs - Zheng et al. 2015 http://blog.acolyer.org/?p=935
  43. References

    • ApproxHadoop: Bringing Approximations to MapReduce Frameworks - Goiri et al. 2015 http://blog.acolyer.org/2015/04/16/approxhadoop-bringing-approximations-to-mapreduce-frameworks/
    • BlinkDB: http://blinkdb.org/
    • Making Sense of Performance in Data Analytics Frameworks - Ousterhout et al. 2015 http://blog.acolyer.org/2015/04/20/making-sense-of-performance-in-data-analytics-frameworks/
    • A Comprehensive Study of Convergent and Commutative Replicated Data Types - Shapiro et al. 2011 http://blog.acolyer.org/2015/03/18/a-comprehensive-study-of-convergent-and-commutative-replicated-data-types/
  44. References

    • The Declarative Imperative: Experiences and Conjectures in Distributed Logic - Hellerstein 2010 http://blog.acolyer.org/2014/11/13/the-declarative-imperative-experiences-and-conjectures-in-distributed-logic/
    • FaRM: Fast Remote Memory - Dragojevic et al. 2014 http://blog.acolyer.org/2015/05/20/farm-fast-remote-memory/
    • Mojim: A Reliable and Highly-Available Non-Volatile Memory System - Zhang et al. 2015 http://blog.acolyer.org/2015/04/14/mojim-a-reliable-and-highly-available-non-volatile-memory-system/
  45. References

    • Consistency Analysis in Bloom: A Calm and Collected Approach - Alvaro et al. 2011 http://blog.acolyer.org/2015/03/16/consistency-analysis-in-bloom-a-calm-and-collected-approach/
    • Edelweiss: Automatic Storage Reclamation for Distributed Programming - Conway et al. 2014 http://blog.acolyer.org/2015/02/20/edelweiss-automatic-storage-reclamation-for-distributed-programming/
    • Scalable Atomic Visibility with RAMP Transactions - Bailis et al. 2014 http://blog.acolyer.org/2015/03/27/scalable-atomic-visibility-with-ramp-transactions/
  46. References

    • Coordination Avoidance in Database Systems - Bailis et al. 2014 http://blog.acolyer.org/2015/03/19/coordination-avoidance-in-database-systems/
    • Putting Consistency Back into Eventual Consistency - Balegas et al. 2015 http://blog.acolyer.org/2015/05/04/putting-consistency-back-into-eventual-consistency/
    • Use of Formal Methods at Amazon Web Services - Newcombe et al. 2014 http://blog.acolyer.org/2014/11/24/use-of-formal-methods-at-amazon-web-services/
    • Consistency Trade-offs in Modern Distributed Database Systems Design - Abadi 2012 http://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf
  47. References

    • Life Beyond Distributed Transactions - Helland 2007 http://blog.acolyer.org/2014/11/20/life-beyond-distributed-transactions/
  48. Image Credits

    • ALPS + Dublin Park: Wikimedia Commons
    • Movies: IMDB
    • Monotone Commuters: http://www.yenko.net/ubbthreads/ubbthreads.php/topics/312207/re-old-street-scenes
    • Elk picture by Jim Richmond: http://commons.wikimedia.org/wiki/File:Rm-elk-locking-antlers.jpg