Slide 1

Slide 1 text

Jones
Configuration with ZooKeeper
@mwhooker
Saturday, November 10, 12

Slide 2

Slide 2 text

about me
• Worked at Digg, where we had a system very similar to Jones (built by Rich Schumacher).
• Then at Disqus, where I worked on Jones

Slide 3

Slide 3 text

problem
• I want my app to be configurable

Slide 4

Slide 4 text

problem
• I want to be able to change config values without redeploying

Slide 5

Slide 5 text

problem
• I want my app to see these new values as soon as they change

Slide 6

Slide 6 text

always ship trunk
by Paul Hammond[1]
• Argues that web apps are not like shipped software
• You only have one user (you), so usually only a single copy of your code is in use at a time.
    • except when you’re deploying
• branch management doesn’t apply

Slide 7

Slide 7 text

always ship trunk
• “You can deploy the code for a feature long before you launch it and nobody will know”
• “You can completely rewrite your infrastructure and keep the UI the same and nobody will know”
“Idea one: separate feature launches from infrastructure launches”

Slide 8

Slide 8 text

always ship trunk
• “You can repeatedly switch between two backend systems and keep the UI the same and nobody will know”
• “You can deploy a non-user facing change to only a small percentage of servers and nobody will know”
“Idea two: run multiple versions of your code at once”

Slide 9

Slide 9 text

Flickr’s Flipper
• Flickr has implemented this idea with Flipper
• unfortunately it’s Flickr only

Slide 10

Slide 10 text

Jones enables “always ship trunk”
https://github.com/mwhooker/jones

Slide 11

Slide 11 text

• Jones gives you a way to make configuration changes to your app in real time
• It manages different types of environments – staging, production, development
• It can also manage config on a host-by-host basis

Slide 12

Slide 12 text

ZooKeeper
In order to really understand how Jones works, we need to understand ZK.

Slide 13

Slide 13 text

ZooKeeper
“ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.” [3]

Slide 14

Slide 14 text

ZooKeeper
• Hierarchical namespaces (like filesystems)
• Data stored in “znodes” – vertices in a data graph

Slide 15

Slide 15 text

Reading from ZK
• You address znodes with a string representing the path to the znode you wish to access

>>> zookeeper.get('/services/pycon/conf')[0]
'{ "locale": "Canada", "times_talk_given": 0, "is_awesome": true }'

Slide 16

Slide 16 text

Reading from ZK
• You can also list immediate children of a node

>>> zookeeper.get_children('/friends')
[u'Matt', u'Marry', u'Mark']

Slide 17

Slide 17 text

Reading from ZK
• When accessing data, you can optionally be notified if the znode ever changes

>>> def cb(data, stat):
...     print "I changed. New value: ", data
...
>>> kc.create('/test', 'foobar')
u'/test'
>>> kc.get('/test', watch=cb)
('foobar', ...)
>>> kc.set('/test', 'baz')
I changed. New value: baz

Slide 18

Slide 18 text

Writing to ZK
• Each znode is versioned
• ZooKeeper supports MVCC
    • Multiversion concurrency control
    • a way of enforcing consistency
    • ensures multiple writers don’t clobber each other

Slide 19

Slide 19 text

Suddenly, code

>>> # zk is our handle to ZooKeeper
>>> # stat holds metadata about the znode
>>> config, stat = zk.get('/test')
>>> # let's look at the current version
>>> stat.version
1
>>> # try updating the znode with the current version
>>> zk.set('/test', 'foobar', version=stat.version)
>>> # success
>>> zk.get('/test')[0]
'foobar'
>>> # we can also choose to overwrite any value
>>> zk.set('/test', 'baz', version=-1)
>>> zk.get('/test')[0]
'baz'
>>> # let's see what happens if we pass a wrong version
>>> zk.set('/test', 'foobaz', version=9000)
>>> # we get an exception because version must be the
>>> # current version of the znode you're trying to change
kazoo.exceptions.BadVersionError: ((), {})
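The version check is a compare-and-set: a write that names a version succeeds only if that version is still current. Here is a minimal in-memory sketch of that rule, with no ZooKeeper involved. `VersionedStore` and `BadVersion` are hypothetical names made up for illustration, not Kazoo APIs, and the code is Python 3 rather than the Python 2 used on the slides.

```python
class BadVersion(Exception):
    """Raised when the caller's expected version is stale (hypothetical name)."""


class VersionedStore:
    """Toy stand-in for a znode store; it illustrates the version check only."""

    def __init__(self):
        self._data = {}  # path -> (value, version)

    def create(self, path, value):
        self._data[path] = (value, 0)

    def get(self, path):
        return self._data[path]

    def set(self, path, value, version=-1):
        _, current = self._data[path]
        # -1 means "overwrite whatever is there", as on the slide.
        if version != -1 and version != current:
            raise BadVersion(f"expected {version}, znode is at {current}")
        self._data[path] = (value, current + 1)
        return current + 1


store = VersionedStore()
store.create("/test", "foobar")
_, seen = store.get("/test")
store.set("/test", "baz", version=seen)        # succeeds, version becomes 1
try:
    store.set("/test", "stale", version=seen)  # seen is now out of date
    clobbered = True
except BadVersion:
    clobbered = False
```

A writer that hits the stale-version error would typically re-read the znode and retry with the fresh version.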

Slide 20

Slide 20 text

Now that we all know a little about ZooKeeper (hopefully!), how does Jones work?

Slide 21

Slide 21 text


Slide 22

Slide 22 text


Slide 23

Slide 23 text

Jones
• Config is stored as a JSON object
• Enter values here and they’ll immediately be reflected in the client
• Uses Jos de Jong’s JSON editor[2]

Slide 24

Slide 24 text


Slide 25

Slide 25 text

service
• highest level of granularity

Slide 26

Slide 26 text


Slide 27

Slide 27 text

environment tree
• environments inherit from their parents
• The actual config for that environment is shown in the “inherited view” box

Slide 28

Slide 28 text

root

Slide 29

Slide 29 text

queue_enabled inherited from root

Slide 30

Slide 30 text

value of queue overrides parent value

Slide 31

Slide 31 text

associations
• Connect an environment to a physical host using associations
• An association can be any string you want, but defaults to the host’s FQDN
• All hosts are associated with the root node by default

Slide 32

Slide 32 text


Slide 33

Slide 33 text

Jones client

>>> from jones.client import JonesClient
>>> # 'pycon' is the service name
>>> jones = JonesClient(zookeeper_client, 'pycon')
>>> jones['locale']
u'Canada'

Slide 34

Slide 34 text

use-cases
• configuration
• define database slave membership
• service endpoints
• tune algorithms
• switches

Slide 35

Slide 35 text

switches

from random import random

# toggle features
if jones.get('enable_flux_capacitor'):
    flux_capacitate()

# enable features for a percentage of users
if jones.get('pct_new_queue', 0) > random():
    queue = new_queue

# enable features by user bucket
buckets = jones['macguffin_buckets']
if (user.id % buckets) in jones['macguffin_enabled_for']:
    user.wants_macguffin = True
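The percentage and bucket switches can be tried without a ZooKeeper connection. This is a rough sketch in which a plain dict stands in for the Jones client; the key names come from the slide, but the values, the helper names `use_new_queue` and `wants_macguffin`, and the `draw` parameter are assumptions added for illustration.

```python
import random

# A plain dict stands in for the Jones client here (an assumption).
config = {
    "pct_new_queue": 0.25,            # fraction of requests, 0.0 to 1.0
    "macguffin_buckets": 10,          # number of user buckets
    "macguffin_enabled_for": [0, 3],  # buckets with the feature on
}


def use_new_queue(config, draw=None):
    """Percentage switch: true for roughly pct_new_queue of calls."""
    if draw is None:
        draw = random.random()  # uniform in [0.0, 1.0)
    return config.get("pct_new_queue", 0) > draw


def wants_macguffin(user_id, config):
    """Bucket switch: a given user always lands in the same bucket."""
    buckets = config["macguffin_buckets"]
    return (user_id % buckets) in config["macguffin_enabled_for"]
```

The percentage switch is re-rolled on every call, so one user may see both code paths. The bucket switch is stable per user, which suits betas and A/B tests.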

Slide 36

Slide 36 text

switches
• Commit features early. Hide them behind a switch until they’re ready
• Public betas
• Turn off buggy or expensive features under heavy load
• A/B testing

Slide 37

Slide 37 text

Design

Slide 38

Slide 38 text

• Jones was designed with 3 goals in mind
    • Clients MUST only talk to ZooKeeper
    • Accessing configuration MUST be simple (i.e. no computation)
    • Unique views of the config MUST be available on a host-by-host basis

Slide 39

Slide 39 text

• I wanted clients to be as simple as possible, to make porting them to other languages easy
• So the server has to do all the work

Slide 40

Slide 40 text

• environments map directly to the znode graph.
• each service has a root containing
    • environment config
    • associations
    • materialized views of environments

Slide 41

Slide 41 text

data model

/services
    /{service name}    # root
        /conf
        /nodemaps      {host} -> {path to view}
        /views

Slide 42

Slide 42 text

/services/{service name}/conf    {"queue_enabled": true}    (root environment)
    /production                  {"queue": "rabbit"}
        /new_queue               {"queue": "0mq"}

Slide 43

Slide 43 text

/services/{service name}/nodemaps
    web2.example.com -> /services/example/views/production/new_queue
    web3.example.com -> /services/example/views/production/new_queue
    web1.example.com -> /services/example/views/production/new_queue

Slide 44

Slide 44 text

/services/{service name}/views   {"queue_enabled": true}    (root view)
    /production                  {"queue": "rabbit", "queue_enabled": true}
        /new_queue               {"queue": "0mq", "queue_enabled": true}

Slide 45

Slide 45 text

Views
• The final config data is materialized so only a single read is required.
• This dramatically simplifies any client.
• However the server becomes more complex.
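Materializing a view amounts to merging each environment's config over everything it inherits from its parents. The following sketch reproduces the views from the slides under an assumed nested-dict layout. `materialize` and the `conf`/`children` keys are hypothetical, and this is not the Jones server's actual code.

```python
def materialize(env_tree, inherited=None, path=""):
    """Return {env path: merged config}; children override their parents."""
    inherited = inherited or {}
    merged = {**inherited, **env_tree.get("conf", {})}
    views = {path or "/": merged}
    for name, child in env_tree.get("children", {}).items():
        views.update(materialize(child, merged, f"{path}/{name}"))
    return views


# The environment tree from the slides, as nested dicts (layout is an assumption).
envs = {
    "conf": {"queue_enabled": True},
    "children": {
        "production": {
            "conf": {"queue": "rabbit"},
            "children": {"new_queue": {"conf": {"queue": "0mq"}}},
        }
    },
}

views = materialize(envs)
```

Because the merge happens on the write side, a client needs only a single read of its view znode.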

Slide 46

Slide 46 text

Jones server
• Simple Flask app
• Sentry support
• optional ZK ACLs to ensure consistency
• Jones class deals with complexities of materializing views and managing associations.

Slide 47

Slide 47 text

Jones client
• Initialize with service name
• Sets watches on nodemaps and environment view
    • nodemaps watch makes sure we always know what environment is ours
    • view watch keeps config dict up to date. Can optionally invoke callback
• Simple!
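Leaving the watches aside, the client's read path is two lookups: host to view path, then view path to config. A small sketch with plain dicts standing in for the znodes follows. `resolve_config` is a hypothetical helper, not part of the Jones client, and the fallback to the root view follows the "associated with the root node by default" rule stated earlier.

```python
def resolve_config(host, nodemaps, views, root_view):
    """Follow host -> view path -> config; unassociated hosts use the root view."""
    view_path = nodemaps.get(host, root_view)
    return views[view_path]


ROOT = "/services/example/views"  # example paths, shaped like the nodemaps slide
views = {
    ROOT: {"queue_enabled": True},
    ROOT + "/production": {"queue": "rabbit", "queue_enabled": True},
    ROOT + "/production/new_queue": {"queue": "0mq", "queue_enabled": True},
}
nodemaps = {"web1.example.com": ROOT + "/production/new_queue"}

web1 = resolve_config("web1.example.com", nodemaps, views, ROOT)
other = resolve_config("db1.example.com", nodemaps, views, ROOT)
```

In the real client both lookups would be backed by watches, so a change to either the association or the view replaces the cached config.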

Slide 48

Slide 48 text

DEMO TIME
let’s hope this works

Slide 49

Slide 49 text

What you should have seen
• create new service
• set some root config
• show that we can get value from client
• change config. show that the change is reflected in the client
• add child env. associate it with my laptop
• show that config changes & inheritance work

Slide 50

Slide 50 text

in summary...
• Jones doesn’t really do all that much
    • provides a hierarchy of configurations, with children inheriting from parents
    • a web UI for managing config as a JSON object
    • a way to peg certain configurations to specific hosts/processes/clusters/etc.

Slide 51

Slide 51 text

Roadmap
• UI needs help: error messages, stress test
• Web App auth/ACLs for compartmentalization
• Audit log
• Ability to peg to versions
    • i.e. this service always needs version N
• see GitHub issues

Slide 52

Slide 52 text

It’s a golden age for ZooKeeper in Python
• Ben Bangert & co. are diligently working on Kazoo, a pure Python ZK client. Full featured and well written [4].

Slide 53

Slide 53 text

Kazoo Patterns
• Lock
• Party
• Partitioner
• Election
• Counter
• Barrier

Slide 54

Slide 54 text

Lock
• Serialize access to a shared resource

from kazoo.client import KazooClient

zk = KazooClient()
zk.start()
lock = zk.Lock("/macguffin", "mwhooker")
with lock:  # blocks waiting for lock acquisition
    use_macguffin()

Slide 55

Slide 55 text

Party
• Determine members of a party
• Who’s currently providing service X?

zk = KazooClient()
party = zk.Party('/birthday', 'matt')
party.join()
list(party)
['matt'] # =(

Slide 56

Slide 56 text

Partitioner
• Divide up resources amongst participants

zk = KazooClient()
qp = zk.SetPartitioner(
    path='/birthday_cake',
    set=('piece-1', 'piece-2', 'piece-3')
)
while True:
    if qp.failed:
        raise Exception("no more cake left")
    elif qp.acquired:
        for cake_piece in qp:
            nomnom(cake_piece)
    elif qp.allocating:
        qp.wait_for_acquire()

Slide 57

Slide 57 text

Election
• Elect a leader from a party
• Who’s going to perform this bit of work?

zk = KazooClient()
election = zk.Election(
    "/election2012", "obama-biden"
)
# blocks until the election is won, then calls
# swear_in()
election.run(swear_in)

Slide 58

Slide 58 text

Counter
• Store a count in ZK
• Relies on MVCC and retry, so may time out

zk = KazooClient()
counter = zk.Counter("/int")
counter += 2
counter -= 1
counter.value == 1

Slide 59

Slide 59 text

Barrier
• Block clients until a condition is met

# coffee master
zk = KazooClient()
barrier = zk.Barrier('/coffee-barrier')
barrier.create()
brew_coffee()
barrier.remove()

# coffee slave
zk = KazooClient()
barrier = zk.Barrier('/coffee-barrier')
barrier.wait()
drink_coffee()

Slide 60

Slide 60 text

More on ZooKeeper
if we have time

Slide 61

Slide 61 text

znode types
• So far we’ve only talked about data nodes, but there are 2 other types
    • ephemeral
    • sequence
• they can be mixed

Slide 62

Slide 62 text

Ephemeral Nodes
• Only exist as long as the creator maintains a connection to ZK
• How the Party, Lock, and Barrier recipes are achieved

Slide 63

Slide 63 text

Sequence Nodes
• When creating a sequence znode, ZK appends a monotonically increasing counter to the end of the path.
• e.g. 2 calls to create sequence znodes at /lock- will result in
    • /lock-0000000000
    • /lock-0000000001
• the counter is zero-padded to 10 digits
• sequences are unique to the parent
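A toy model of the naming rule, with no ZooKeeper involved. `SequenceCounter` is a hypothetical class, and it simplifies the real behaviour: here the counter advances only on sequential creates, and one counter is shared by every sequence znode under the same parent.

```python
from collections import defaultdict
import posixpath


class SequenceCounter:
    """Toy model of a per-parent sequence counter (not a ZooKeeper API)."""

    def __init__(self):
        self._next = defaultdict(int)  # parent path -> next counter value

    def create_sequential(self, path):
        parent = posixpath.dirname(path)
        n = self._next[parent]
        self._next[parent] += 1
        # ZooKeeper formats the counter as ten zero-padded digits.
        return f"{path}{n:010d}"


seq = SequenceCounter()
first = seq.create_sequential("/lock-")
second = seq.create_sequential("/lock-")
```

Recipes such as Lock rely on this ordering: each contender creates a sequence znode, and the lowest number wins.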

Slide 64

Slide 64 text

ZooKeeper is highly available*
• ZK is distributed
• a ZK cluster is known as an ensemble
• * unless there is a network partition

Slide 65

Slide 65 text

• Writes to ZK are committed to a majority (aka quorum) of nodes before success is communicated
    • some nodes may have old data
• Reads happen from any node
• Writes are forwarded through master
• As ensemble size grows, read performance increases while write performance decreases.

Slide 66

Slide 66 text

• ZooKeeper can only work if a majority of servers are correct (i.e., with 2f + 1 servers we can tolerate f failures).[5]
• Means we need to run an odd number of servers
• 3 is the minimum, 5 is recommended
    • with 5 we can tolerate 2 failures
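The 2f + 1 rule is easy to check with a few lines of arithmetic. The helper names `quorum_size` and `tolerated_failures` are made up for this sketch.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Servers that can fail while a majority is still reachable."""
    return (ensemble_size - 1) // 2


# ensemble size -> (quorum, tolerated failures)
table = {n: (quorum_size(n), tolerated_failures(n)) for n in (3, 4, 5, 6)}
```

Going from 3 to 4 servers raises the quorum from 2 to 3 but still tolerates only 1 failure, which is why even-sized ensembles buy nothing.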

Slide 67

Slide 67 text

References
1. http://www.paulhammond.org/2010/06/trunk/
2. http://jsoneditoronline.org
3. http://zookeeper.apache.org/
4. https://github.com/python-zk/kazoo
5. http://static.usenix.org/event/usenix10/tech/full_papers/Hunt.pdf

Slide 68

Slide 68 text

Thank you!
Matthew Hooker
I’m looking for a job
mwhooker@gmail.com
twitter @mwhooker
github https://github.com/mwhooker
https://speakerdeck.com/mwhooker/jones
https://github.com/mwhooker/jones