Managing state in realtime distributed systems

mloughran
November 08, 2011

Transcript

  1. Pusher:
    • Is a web service which helps developers add real-time functionality to their web applications
    • Scales the last mile delivery to the browser, adding higher level concepts
    • Is a real time distributed system
  2. “A distributed system is a collection of independent computers that appears to its users as a single coherent system”
    Distributed Systems: Principles and Paradigms, Tanenbaum and van Steen, 2006
  3. How to build a distributed system:
    • Decouple the application so that each function is handled by a separate component
    • Scale components horizontally, and independently
    • Make components tolerant to failure
  4. Easier said than done
    • Components need to share state, which constantly changes
    • Components need to communicate
    • Handling failure is hard
  5. SQL

  6. CAP theorem
    It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
    - Consistency (all nodes see the same data at the same time)
    - Availability (a guarantee that every request receives a response about whether it was successful or failed)
    - Partition tolerance (the system continues to operate despite arbitrary message loss)
    http://en.wikipedia.org/wiki/CAP_theorem
  7. “Do not communicate by sharing memory; instead, share memory by communicating.”
    Effective Go, Google
    State Messaging
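The quoted principle can be illustrated with a small sketch. It is written in Python rather than Go, and it is not taken from the talk: the names `counter_owner`, `run_demo`, and the `channel-N` strings are hypothetical. One thread is the sole owner of the state, and every other thread changes that state only by sending it messages over a queue, so no locks are needed around the dictionary itself.

```python
import queue
import threading


def counter_owner(inbox, result):
    """Sole owner of `counts`; nobody else ever touches the dict."""
    counts = {}
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel: publishers are done
            break
        counts[msg] = counts.get(msg, 0) + 1
    result.put(counts)  # hand the final state back as a message


def run_demo(num_publishers=4, events_each=100):
    inbox = queue.Queue()
    result = queue.Queue()
    owner = threading.Thread(target=counter_owner, args=(inbox, result))
    owner.start()

    def publish(channel):
        # Publishers share nothing; they only send messages.
        for _ in range(events_each):
            inbox.put(channel)

    publishers = [
        threading.Thread(target=publish, args=("channel-%d" % (i % 2),))
        for i in range(num_publishers)
    ]
    for p in publishers:
        p.start()
    for p in publishers:
        p.join()

    inbox.put(None)  # all publishers finished, stop the owner
    owner.join()
    return result.get()
```

Calling `run_demo()` returns the per-channel totals. The same shape carries over to separate processes or machines if the in-memory queue is swapped for a network transport, which is the direction the following slides take.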
  8. AMQP
    • Centralised message broker
    • Complex
    • Hard to scale
    • Hard to get high availability
  9. ZeroMQ: What is it?
    • Socket abstraction designed for messaging (not bytes)
    • Sockets include queuing
    • Abstracts the underlying sockets
    • Connect the sockets to form topologies
    • Messaging patterns
    • Devices
  10. Problems with this approach
    • Scaling - how to shard your state
    • Event publishers and consumers easily become coupled
    • Handling failure is hard
    • Testing the stack is hard
    • ZeroMQ is still too low level most of the time
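To make the sharding problem concrete, here is a minimal sketch of hash-based sharding, assuming state is keyed by a string such as a channel name. `shard_for` and `ShardedCounters` are hypothetical names for illustration, not part of Pusher or ZeroMQ. A stable digest is used instead of Python's built-in `hash()`, which is randomised per process for strings and so would route the same key differently on different machines.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index that is stable across processes."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedCounters:
    """Counters split across shards; each dict stands in for one node."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def incr(self, key):
        shard = self.shards[shard_for(key, len(self.shards))]
        shard[key] = shard.get(key, 0) + 1
        return shard[key]

    def get(self, key):
        shard = self.shards[shard_for(key, len(self.shards))]
        return shard.get(key, 0)
```

Modulo hashing like this remaps most keys whenever the number of shards changes, which is one reason sharding is hard in practice; consistent hashing is a common way to reduce that movement.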
  11. The wish list
    • Make sharding state easy
    • Define the messaging problem, rather than configuring ZMQ
    • Handle failure automatically
    • Make testing easy
  12. Storm
    • Stream processing
    • Topologies - describe computation as a graph
    • Graph built from sensible primitives:
      • Spouts & Bolts
      • Stream groupings: shuffle, fields, all, global
    • Automatic management of workers
    • Handles failure
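The four stream groupings named above can be sketched as routing functions. This is an in-process illustration in Python, not Storm's API: tuples are modelled as plain dicts, each function returns the indices of the downstream tasks that should receive a tuple, and all function names are hypothetical. The shuffle grouping here picks a task at random, which only approximates an even distribution.

```python
import random
import zlib


def shuffle_grouping(tup, num_tasks):
    """Spread tuples across tasks with no regard to their content."""
    return [random.randrange(num_tasks)]


def fields_grouping(tup, num_tasks, fields):
    """Equal values of `fields` always go to the same task."""
    key = "|".join(str(tup[f]) for f in fields)
    return [zlib.crc32(key.encode("utf-8")) % num_tasks]


def all_grouping(tup, num_tasks):
    """Replicate the tuple to every task."""
    return list(range(num_tasks))


def global_grouping(tup, num_tasks):
    """Send the whole stream to a single task."""
    return [0]


def word_count(sentences, num_tasks=3):
    """Toy topology: sentence spout -> split bolt -> counting bolt tasks."""
    task_counts = [dict() for _ in range(num_tasks)]
    for sentence in sentences:          # the "spout" emits sentences
        for word in sentence.split():   # the "split bolt" emits words
            tup = {"word": word}
            for task in fields_grouping(tup, num_tasks, ["word"]):
                counts = task_counts[task]
                counts[word] = counts.get(word, 0) + 1
    return task_counts
```

Because the counting step is fed by a fields grouping on `word`, each word's count lives in exactly one task. That is the sense in which the grouping does the sharding of state for you: the partitioning is declared on the edge of the graph rather than coded into publishers and consumers.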
  13. In Conclusion
    • Storing state is a mess of compromises
    • Share state by communicating
    • Package up state in bundles to solve problems
    • Real-time map reduce is a world of new possibilities