ZooKeeper. Taming your server farm.

The presentation was given at ULCAMP::DEV (http://ulcamp.ru/dev), a local gathering of Ulyanovsk software engineers, on April 5th, 2013.

Alexander Zhuravlev

April 05, 2013

Transcript

  1. ZOOKEEPER: Taming your server farm. Alexander Zhuravlev, aboutecho.com. ULCAMP::Dev, 2013-04-05
  2. @zaa

  3. aboutecho.com

  4. ECHO PLATFORM IN NUMBERS

  5. ~25 programmers, ~450 servers, ~50 database servers, ~45000 requests per second
  6. DISTRIBUTED SYSTEMS

  7. “A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” - Leslie Lamport
  8. FALLACIES OF DISTRIBUTED COMPUTING

  9. 1. The network is reliable. 2. Latency is zero. 3. Bandwidth is infinite. 4. The network is secure. 5. Topology doesn't change. 6. There is one administrator. 7. Transport cost is zero. 8. The network is homogeneous.
  10. DR. ERIC BREWER’S CAP THEOREM

  11. Consistency, Availability, Partition tolerance

  12. “You can’t sacrifice partition tolerance. In the event of failures, which will this system sacrifice? Consistency or availability?” - Coda Hale
  13. BASE: Basically Available, Soft-state services with Eventual consistency

  14. ACID: Atomicity, Consistency, Isolation, Durability

  15. APACHE ZOOKEEPER

  16. “ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services”
  17. TIMELINE June 2007: First version, Yahoo. Oct 2007: SourceForge. June 2008: Move to Apache. Nov 2010: Top-level project.
  18. WIDELY USED Yahoo Web Crawler LinkedIn Apache Kafka Twitter Storm Facebook Haystack Cloudera Apache Hadoop
  19. THROUGHPUT

  20. LATENCY

  21. TAO OF ZOOKEEPER ZooKeepers keep order. ZooKeepers are reliable. ZooKeepers are efficient. ZooKeepers are timely. ZooKeepers avoid contention. ZooKeepers are ambition free.
  22. ZOOKEEPER ARCHITECTURE

  23. METADATA FILESYSTEM

  24. / /a /a/2 /a/1 /b

  25. ZOOKEEPER ATOMIC BROADCAST PROTOCOL

  26. 2F+1 NODES?
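
The question on this slide comes down to majority arithmetic: ZooKeeper needs a majority of the ensemble to agree, so 2F+1 servers can survive F failures. A minimal sketch of that arithmetic follows; the function names are illustrative and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must have at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Columns: ensemble size, quorum size, tolerated failures.
for n in (3, 4, 5, 6, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

Note that going from 3 to 4 servers (or from 5 to 6) raises the quorum size without raising the number of tolerated failures, which is why odd ensemble sizes are the usual choice.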

  27. SESSIONS

  28. [Diagram: an ENSEMBLE of one LEADER and two FOLLOWERS, with three CLIENTS attached. Label: SESSIONS]

  29. SIMPLE API

  30. API: create, delete, exists, getData, setData, getChildren
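
To make the six calls concrete, here is a toy in-memory model of the znode namespace. It is only a sketch: `ToyZNodeTree` is a hypothetical class, not a ZooKeeper client, and it leaves out versions, ACLs, sessions and watches. It does mirror two rules of the real service: a node cannot be created without its parent, and a node with children cannot be deleted.

```python
class ToyZNodeTree:
    """In-memory stand-in for the znode tree (illustration only)."""

    def __init__(self):
        self._data = {"/": b""}  # path -> payload bytes

    @staticmethod
    def _parent(path):
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, data=b""):
        if path in self._data:
            raise KeyError(f"node already exists: {path}")
        if self._parent(path) not in self._data:
            raise KeyError(f"parent does not exist: {path}")
        self._data[path] = data

    def delete(self, path):
        if path == "/":
            raise ValueError("cannot delete the root")
        if self.get_children(path):
            raise ValueError(f"node has children: {path}")
        del self._data[path]

    def exists(self, path):
        return path in self._data

    def get_data(self, path):
        return self._data[path]

    def set_data(self, path, data):
        if path not in self._data:
            raise KeyError(f"no such node: {path}")
        self._data[path] = data

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        names = []
        for candidate in self._data:
            if candidate != path and candidate.startswith(prefix):
                rest = candidate[len(prefix):]
                if "/" not in rest:  # direct children only
                    names.append(rest)
        return sorted(names)


# Rebuild the tree from the METADATA FILESYSTEM slide.
tree = ToyZNodeTree()
for path in ("/a", "/a/2", "/a/1", "/b"):
    tree.create(path)
tree.set_data("/a/1", b"hello")
print(tree.get_children("/"))   # ['a', 'b']
print(tree.get_children("/a"))  # ['1', '2']
print(tree.get_data("/a/1"))    # b'hello'
```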

  31. WATCHES

  32. WATCHES: getData, getChildren, exists

  33. READ PATH

  34. [Diagram: ensemble of one LEADER and two FOLLOWERS, each holding X=9, with three CLIENTS. Labels: getData “/X”; return 9; READ]
  35. WRITE PATH

  36. [Diagram: ensemble of one LEADER and two FOLLOWERS, each holding X=10, with three CLIENTS. Labels: setData “/X”, 10; ACK; WRITE]
  37. WATCH PATH

  38. [Diagram: ensemble of one LEADER and two FOLLOWERS, each holding X=9, with three CLIENTS. Labels: getData “/X”, true; SET WATCH; return 9]
  39. [Diagram: ensemble of one LEADER and two FOLLOWERS, each holding X=10, with three CLIENTS. Labels: setData “/X”, 10; ACK; WRITE HAPPENS]
  40. [Diagram: ensemble of one LEADER and two FOLLOWERS, each holding X=10, with three CLIENTS. Labels: WATCH NOTIFICATION; NOTIFICATION]
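
The three watch slides (set watch, write happens, notification) can be replayed with a toy store. This is a sketch under stated assumptions: `ToyWatchedStore` is a hypothetical in-memory class, not a ZooKeeper client. It models two properties of classic ZooKeeper watches: a watch fires once and is then gone, and the notification names the path but does not carry the new value, so the client has to read again.

```python
class ToyWatchedStore:
    """In-memory key/value store with one-shot watches (illustration only)."""

    def __init__(self):
        self._data = {}
        self._watches = {}  # path -> callbacks waiting for the next change

    def set_data(self, path, value):
        self._data[path] = value
        # One-shot semantics: detach the watches first, then fire them.
        for callback in self._watches.pop(path, []):
            callback(path)

    def get_data(self, path, watch=None):
        if watch is not None:
            self._watches.setdefault(path, []).append(watch)
        return self._data[path]


store = ToyWatchedStore()
store.set_data("/X", 9)

events = []
value = store.get_data("/X", watch=events.append)  # SET WATCH, returns 9
store.set_data("/X", 10)  # WRITE HAPPENS, the watch fires
store.set_data("/X", 11)  # no watch is left, nothing fires

print(value, events)  # 9 ['/X']
```

A client that wants to keep following `/X` would pass a watch again on the read it performs after each notification.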
  41. DEMO TIME

  42. USE CASES

  43. USE CASE LOCK
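
The slide gives only the title, so what follows is a sketch of the well-known lock recipe from the ZooKeeper documentation rather than the speaker's demo. Each contender creates an ephemeral sequential child under a lock node; the lowest sequence number holds the lock, and every other contender watches only its immediate predecessor. The decision step is a pure function; the `lock-` name prefix and the function names are assumptions made for illustration.

```python
def sequence_of(name: str) -> int:
    """Extract the numeric suffix that a sequential znode name ends with."""
    return int(name.rsplit("-", 1)[1])


def lock_status(children, mine):
    """Return (holds_lock, node_to_watch) for the contender owning `mine`.

    `children` is the list of child names under the lock node, in any order.
    """
    ordered = sorted(children, key=sequence_of)
    index = ordered.index(mine)
    if index == 0:
        return True, None
    # Watching only the predecessor avoids waking every waiter at once.
    return False, ordered[index - 1]


children = ["lock-0000000002", "lock-0000000000", "lock-0000000001"]
print(lock_status(children, "lock-0000000000"))  # (True, None)
print(lock_status(children, "lock-0000000002"))  # (False, 'lock-0000000001')
```

Because the nodes are ephemeral, a crashed holder's node disappears with its session and the next contender in line is notified.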

  44. USE CASE LEADER ELECTION

  45. USE CASE GROUP MEMBERSHIP
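
Group membership is commonly built on ephemeral znodes: each member creates one child under a group node, and the node vanishes when the member's session ends. The toy model below sketches only that lifecycle. `ToyGroup`, the session ids and the member names are hypothetical; this is not a ZooKeeper client and not necessarily how the speaker's demo worked.

```python
class ToyGroup:
    """Tracks members the way ephemeral child nodes would (illustration only)."""

    def __init__(self):
        self._owner = {}  # member name -> id of the session that created it

    def join(self, session_id, name):
        if name in self._owner:
            raise KeyError(f"member already registered: {name}")
        self._owner[name] = session_id

    def session_closed(self, session_id):
        # Ephemeral nodes are removed together with their owning session.
        self._owner = {
            name: owner
            for name, owner in self._owner.items()
            if owner != session_id
        }

    def members(self):
        return sorted(self._owner)


group = ToyGroup()
group.join(session_id=1, name="worker-a")
group.join(session_id=2, name="worker-b")
print(group.members())  # ['worker-a', 'worker-b']

group.session_closed(1)  # worker-a crashed or disconnected for too long
print(group.members())  # ['worker-b']
```

In the real service, the other members would learn about the change through a watch set with getChildren on the group node.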

  46. USE CASE CONFIGURATION MANAGEMENT

  47. HOW DOES ECHO USE ZOOKEEPER?

  48. / /postgresql /postgresql/client1 /postgresql/client2 /postgresql/clientN ...

  49. SHAMELESS PLUG If you are interested in learning about topics like this, get in touch with us. We are looking for interns: jobs@aboutecho.com
  50. THANKS

  51. QUESTIONS? Alexander Zhuravlev http://twitter.com/zaa zaa@aboutecho.com

  52. ATTRIBUTION Photos from "The Field Museum Library": http://www.flickr.com/photos/field_museum_library/3405476048/ http://www.flickr.com/photos/field_museum_library/3405475952/ http://www.flickr.com/photos/field_museum_library/4586895529/ http://www.flickr.com/photos/field_museum_library/3404663989/ http://www.flickr.com/photos/field_museum_library/4986435461/ http://www.flickr.com/photos/field_museum_library/3795473195/ http://www.flickr.com/photos/field_museum_library/4986457001/ Photos from "Powerhouse Museum Collection": http://www.flickr.com/photos/powerhouse_museum/2759437054/ Throughput and latency graphs from Flavio Junqueira's presentation "Distributed Coordination via ZooKeeper": https://cwiki.apache.org/confluence/download/attachments/24193445/keynote-hic-2011-web.pdf