Building to Scale (PyCon TW 2013)

Slide 1

Slide 1 text

BUILDING TO SCALE David Cramer twitter.com/zeeg Sunday, May 26, 13

Slide 2

Slide 2 text

Who am I?

Slide 3

Slide 3 text


Slide 4

Slide 4 text


Slide 5

Slide 5 text


Slide 6

Slide 6 text

What do we mean by scale?

Slide 7

Slide 7 text

DISQUS: Massive traffic with a long tail
Sentry: Counters and event aggregation
tenXer: More stats than we can count

Slide 8

Slide 8 text

Does one size fit all?

Slide 9

Slide 9 text

NoSQL?

Slide 10

Slide 10 text

Postgres is the foundation of DISQUS

Slide 11

Slide 11 text

MySQL powers the tenXer graph store

Slide 12

Slide 12 text

Sentry is built on SQL

Slide 13

Slide 13 text

SQL is extremely powerful and versatile

Slide 14

Slide 14 text

Compromise

Slide 15

Slide 15 text

Scaling is about Predictability

Slide 16

Slide 16 text

Augment SQL with [technology]

Slide 17

Slide 17 text


Slide 18

Slide 18 text

Simple solutions using Redis (I like Redis)

Slide 19

Slide 19 text

Counters

Slide 20

Slide 20 text

Counters are everywhere

Slide 21

Slide 21 text

Counters in SQL

    UPDATE table SET counter = counter + 1;

Slide 22

Slide 22 text

Counters in Redis

    INCR counter

    >>> redis.incr('counter')

The key doesn't have to exist!

Slide 23

Slide 23 text

Counters in Sentry

Buffers!

[Diagram: event ID 1, event ID 2, and event ID 3 each go to a Redis INCR, which feeds a SQL Update]

Slide 24

Slide 24 text

Counters in Sentry
‣ INCR event_id in Redis
‣ Queue buffer incr task
‣ 5 - 10s explicit delay
‣ Task does atomic GET event_id and DEL event_id (Redis pipeline)
‣ No-op if GET is not > 0
‣ One SQL UPDATE per unique event per delay
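The buffering steps above could be sketched roughly as follows. This is a minimal, self-contained illustration and not Sentry's actual code: `FakeRedis` is a hypothetical in-memory stand-in for a Redis client, `get_and_delete` stands in for the pipelined GET and DEL, and `sql_update` stands in for the real UPDATE statement.

```python
class FakeRedis(object):
    """Hypothetical in-memory stand-in for the few Redis calls used here."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        # Like Redis INCR: a missing key starts at 0
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def get_and_delete(self, key):
        # Stands in for GET + DEL sent in one atomic pipeline
        return self.data.pop(key, 0)


class BufferedCounter(object):
    def __init__(self, redis, sql_update):
        self.redis = redis
        self.sql_update = sql_update  # callable(event_id, amount)
        self.pending_tasks = []       # stands in for the delayed task queue

    def incr(self, event_id):
        self.redis.incr(event_id)
        # In production this would be a task queued with a 5-10s delay
        self.pending_tasks.append(event_id)

    def run_pending_tasks(self):
        tasks, self.pending_tasks = self.pending_tasks, []
        for event_id in tasks:
            amount = self.redis.get_and_delete(event_id)
            if amount > 0:  # otherwise an earlier task already flushed it
                self.sql_update(event_id, amount)


updates = []
counter = BufferedCounter(FakeRedis(), lambda event_id, n: updates.append((event_id, n)))
for event_id in ['a', 'a', 'a', 'b']:
    counter.incr(event_id)
counter.run_pending_tasks()
print(updates)
```

Four increments queue four tasks, but only two of them reach SQL; the other two are the no-op tasks listed as a con on the next slide.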

Slide 25

Slide 25 text

Counters in Sentry (cont.)
Pros
‣ Solves database row lock contention
‣ Redis nodes are horizontally scalable
‣ Easy to implement
Cons
‣ Too many dummy (no-op) tasks

Slide 26

Slide 26 text

Alternative Counters

[Diagram: event ID 1, event ID 2, and event ID 3 each go to a Redis ZINCRBY, which feeds a SQL Update]

Slide 27

Slide 27 text

Sorted Sets in Redis

    > ZINCRBY events 1 ad93a
    {ad93a: 1}
    > ZINCRBY events 1 ad93a
    {ad93a: 2}
    > ZINCRBY events 1 d2ow3
    {ad93a: 2, d2ow3: 1}

Slide 28

Slide 28 text

Alternative Counters
‣ ZINCRBY events event_id in Redis
‣ Cron buffer flush
‣ ZRANGE events to get pending updates
‣ Fire individual task per update
‣ Atomic ZSCORE events event_id and ZREM events event_id to get and flush count
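A rough sketch of the sorted-set variant, under the same caveats: `FakeSortedSets` is a hypothetical in-memory stand-in for Redis sorted sets, `zscore_and_zrem` stands in for the atomic ZSCORE and ZREM pair, and `flush` plays the role of the cron job (each loop iteration would be its own task in practice).

```python
class FakeSortedSets(object):
    """Hypothetical in-memory stand-in for Redis sorted sets."""

    def __init__(self):
        self.sets = {}

    def zincrby(self, key, amount, member):
        zset = self.sets.setdefault(key, {})
        zset[member] = zset.get(member, 0) + amount
        return zset[member]

    def zrange(self, key, start, end):
        # Members ordered by score, then by name
        zset = self.sets.get(key, {})
        members = sorted(zset, key=lambda m: (zset[m], m))
        return members[start:] if end == -1 else members[start:end + 1]

    def zscore_and_zrem(self, key, member):
        # Stands in for ZSCORE + ZREM sent in one atomic pipeline
        return self.sets.get(key, {}).pop(member, None)


def flush(redis, sql_update):
    """What the cron job would do with the pending updates."""
    for event_id in redis.zrange('events', 0, -1):
        amount = redis.zscore_and_zrem('events', event_id)
        if amount:
            sql_update(event_id, amount)


redis = FakeSortedSets()
for event_id in ['ad93a', 'ad93a', 'd2ow3']:
    redis.zincrby('events', 1, event_id)

updates = {}
flush(redis, updates.__setitem__)
print(updates)
```

Work is only scheduled for members that are actually in the set, which is why most no-op tasks disappear.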

Slide 29

Slide 29 text

Alternative Counters (cont.)
Pros
‣ Removes (most) no-op tasks
‣ Works without a complex queue due to no required delay on jobs
Cons
‣ Single Redis key stores all pending updates

Slide 30

Slide 30 text

Activity Streams

Slide 31

Slide 31 text

Streams are everywhere

Slide 32

Slide 32 text

Streams in SQL

    class Activity:
        SET_RESOLVED = 1
        SET_REGRESSION = 6
        TYPE = (
            (SET_RESOLVED, 'set_resolved'),
            (SET_REGRESSION, 'set_regression'),
        )

        event = ForeignKey(Event)
        type = IntegerField(choices=TYPE)
        user = ForeignKey(User, null=True)
        datetime = DateTimeField()
        data = JSONField(null=True)

Slide 33

Slide 33 text

Streams in SQL (cont.)

    >>> Activity(event, SET_RESOLVED, user, now)
    "David marked this event as resolved."

    >>> Activity(event, SET_REGRESSION, datetime=now)
    "The system marked this event as a regression."

    >>> Activity(type=DEPLOY_START, datetime=now)
    "A deploy started."

    >>> Activity(type=SET_RESOLVED, datetime=now)
    "All events were marked as resolved"

Slide 34

Slide 34 text

Stream == View == Cache

Slide 35

Slide 35 text

Views as a Cache

    TIMELINE = []
    MAX = 500

    def on_event_creation(event):
        global TIMELINE
        TIMELINE.insert(0, event)
        TIMELINE = TIMELINE[:MAX]

    def get_latest_events(num=100):
        return TIMELINE[:num]

Slide 36

Slide 36 text

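Views in Redis

    from redis import Redis

    class Timeline(object):
        def __init__(self):
            self.db = Redis()

        def add(self, event):
            # '%f' is microseconds ('%m' would be the month); note that
            # '%s' (epoch seconds) is a platform-specific strftime extension
            score = float(event.date.strftime('%s.%f'))
            # redis-py 3.0+ takes a {member: score} mapping
            self.db.zadd('timeline', {event.id: score})

        def list(self, offset=0, limit=-1):
            # ZREVRANGE takes inclusive start/stop indexes, not offset/limit
            end = -1 if limit < 0 else offset + limit - 1
            return self.db.zrevrange('timeline', offset, end)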

Slide 37

Slide 37 text

Views in Redis (cont.)

    MAX_SIZE = 10000

    def add(self, event):
        score = float(event.date.strftime('%s.%f'))
        # add the event and trim the data to avoid
        # data bloat in a single key
        with self.db.pipeline() as pipe:
            pipe.zadd(self.key, {event.id: score})
            # keep only the MAX_SIZE highest-scored (newest) members
            pipe.zremrangebyrank(self.key, 0, -(MAX_SIZE + 1))
            pipe.execute()
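To show what the add-then-trim pattern does without needing a Redis server, here is a small in-memory sketch. It is only an illustration: a sorted Python list plays the role of the sorted set, and the sketch assumes each event is added once and that `max_size` is at least 1.

```python
import bisect


class Timeline(object):
    def __init__(self, max_size):
        self.max_size = max_size  # assumed to be >= 1
        self.entries = []         # (score, event_id), kept sorted by score

    def add(self, event_id, score):
        bisect.insort(self.entries, (score, event_id))
        # Same effect as ZREMRANGEBYRANK key 0 -(max_size + 1):
        # drop everything except the max_size highest scores
        del self.entries[:-self.max_size]

    def list(self, offset=0, limit=100):
        newest_first = [event_id for _, event_id in reversed(self.entries)]
        return newest_first[offset:offset + limit]


timeline = Timeline(max_size=3)
for event_id, score in [('a', 1.0), ('b', 2.0), ('c', 3.0), ('d', 4.0), ('e', 5.0)]:
    timeline.add(event_id, score)
print(timeline.list())
```

Trimming on every write keeps the key bounded, at the cost of losing anything older than the newest `max_size` entries.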

Slide 38

Slide 38 text

Queuing

Slide 39

Slide 39 text

Celery

Slide 40

Slide 40 text

RabbitMQ or Redis

Slide 41

Slide 41 text

Asynchronous Tasks

    # Register the task
    @task(queue="event_creation")
    def on_event_creation(event_id):
        counter.incr('events', event_id)

    # Delay execution
    on_event_creation.delay(event.id)

Slide 42

Slide 42 text

Fanout

    @task(queue="counters")
    def incr_counter(key, id=None):
        counter.incr(key, id)

    @task(queue="event_creation")
    def on_event_creation(event_id):
        incr_counter.delay('events', event_id)
        incr_counter.delay('global')

    # Delay execution
    on_event_creation.delay(event.id)

Slide 43

Slide 43 text

Keep jobs small!

Slide 44

Slide 44 text

    @task
    def add_everything_poorly():
        results = Event.objects.all()
        for event in results:
            add_event(event.id)

Slide 45

Slide 45 text

    @task
    def add_everything(offset=0, limit=1000):
        results = chunked(Event, offset, limit)
        for event in results:
            add_event.delay(event.id)
        # a short page means there is nothing left to schedule
        if len(results) < limit:
            return  # finished!
        add_everything.delay(
            offset=offset + limit,
            limit=limit,
        )
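The self-rescheduling pattern above can be tried without Celery. In this sketch a `deque` stands in for the broker and a hypothetical `delay()` helper stands in for `task.delay()`; `EVENTS` is a made-up stand-in for the Event table.

```python
from collections import deque

EVENTS = list(range(2500))   # stand-in for the Event table
queue = deque()              # stand-in for the Celery broker
processed = []


def delay(func, *args, **kwargs):
    # Stand-in for task.delay(): enqueue instead of running inline
    queue.append((func, args, kwargs))


def add_event(event_id):
    processed.append(event_id)


def add_everything(offset=0, limit=1000):
    results = EVENTS[offset:offset + limit]
    for event_id in results:
        delay(add_event, event_id)
    if len(results) < limit:
        return  # short page: nothing left to schedule
    delay(add_everything, offset=offset + limit, limit=limit)


def run_worker():
    jobs = 0
    while queue:
        func, args, kwargs = queue.popleft()
        func(*args, **kwargs)
        jobs += 1
    return jobs


delay(add_everything)
jobs = run_worker()
print(jobs, len(processed))
```

Each job touches at most `limit` rows, so no single task holds a worker for long and a failure only loses one page of work.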

Slide 46

Slide 46 text

Object Caching

Slide 47

Slide 47 text

Object Cache Prerequisites
‣ Your database can't handle the read-load
‣ Your data changes infrequently
‣ You can handle slightly worse performance

Slide 48

Slide 48 text

Distributing Load with Memcache

Memcache 1: Event ID 01, Event ID 04, Event ID 07, Event ID 10, Event ID 13
Memcache 2: Event ID 02, Event ID 05, Event ID 08, Event ID 11, Event ID 14
Memcache 3: Event ID 03, Event ID 06, Event ID 09, Event ID 12, Event ID 15
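The layout on this slide matches a simple modulo distribution over three nodes. A minimal sketch, assuming made-up node names and integer event IDs (real memcache clients usually hash the key string rather than using the ID directly):

```python
SERVERS = ['memcache-1', 'memcache-2', 'memcache-3']  # hypothetical node names


def server_for(event_id):
    # IDs start at 1, so shift by one to land ID 1 on the first node
    return SERVERS[(event_id - 1) % len(SERVERS)]


layout = {}
for event_id in range(1, 16):
    layout.setdefault(server_for(event_id), []).append(event_id)

for server in SERVERS:
    print(server, layout[server])
```

Note that with plain modulo, adding or removing a node remaps most keys, which temporarily empties much of the cache.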

Slide 49

Slide 49 text

Querying the Object Cache

    def make_key(model, id):
        return '{}:{}'.format(model.__name__, id)

    def get_by_ids(model, id_list):
        # map each cache key back to the id it was built from
        key_to_id = {make_key(model, id): id for id in id_list}
        cached = cache.get_multi(list(key_to_id))
        res = {key_to_id[key]: value
               for key, value in cached.items() if value is not None}
        # ids the cache did not return must come from the database
        pending = set(id_list) - set(res)
        if pending:
            mres = model.objects.in_bulk(pending)
            cache.set_multi({make_key(model, o.id): o for o in mres.values()})
            res.update(mres)
        return res

Slide 50

Slide 50 text

Pushing State

    def save(self):
        cache.set(make_key(type(self), self.id), self)

    def delete(self):
        cache.delete(make_key(type(self), self.id))
        # or use a tombstone
        cache.set(make_key(type(self), self.id), DELETED)
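The slide only shows the write side of a tombstone. One way the read side might use it is sketched below; this is an assumption for illustration, with plain dicts standing in for memcache and the database, and a made-up string as the `DELETED` marker.

```python
DELETED = '__deleted__'  # hypothetical tombstone marker

cache = {}                    # stand-in for memcache
database = {1: 'event one'}   # stand-in for the SQL table
db_reads = []                 # records which ids hit the database


def get(id):
    value = cache.get(id)
    if value == DELETED:
        return None           # known to be gone: skip the database
    if value is None:
        db_reads.append(id)
        value = database.get(id)
        if value is not None:
            cache[id] = value
    return value


def delete(id):
    database.pop(id, None)
    cache[id] = DELETED       # tombstone instead of a plain cache delete


first = get(1)
delete(1)
after = get(1)
print(first, after, db_reads)
```

With a plain cache delete, every later read of a removed object would miss the cache and query the database; the tombstone answers those reads from the cache.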

Slide 51

Slide 51 text

Planning for the Future

Slide 52

Slide 52 text

One of the largest problems for Disqus is network-wide moderation

Slide 53

Slide 53 text

The DISQUS API has more than 90 complex endpoints

Slide 54

Slide 54 text

Be Mindful of Features

Slide 55

Slide 55 text

Sentry's Team Dashboard
‣ Data limited to a single team
‣ Simple views which could be materialized
‣ Only entry point for "data for team"

Slide 56

Slide 56 text

Sentry's Stream View
‣ Data limited to a single project
‣ Each project could map to a different DB

Slide 57

Slide 57 text

Preallocate Resources

Slide 58

Slide 58 text

Redis data must fit in memory

Slide 59

Slide 59 text

redis-1: DB0 DB1 DB2 DB3 DB4 DB5 DB6 DB7 DB8 DB9

Slide 60

Slide 60 text

redis-1: DB0 DB1 DB2 DB3 DB4
redis-2: DB5 DB6 DB7 DB8 DB9

When a physical machine becomes overloaded, migrate a chunk of shards to another machine.
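The preallocation idea can be sketched as a fixed set of logical shards whose host assignment changes over time. Routing by `project_id` is an assumption made for this example (based on the earlier point that each project could map to a different DB), and the actual data copy during migration is left out.

```python
NUM_SHARDS = 10  # fixed up front, matching DB0..DB9

# logical shard -> physical host; everything starts on redis-1
shard_hosts = {shard: 'redis-1' for shard in range(NUM_SHARDS)}


def shard_for(project_id):
    # The shard count never changes, so this mapping stays stable
    return project_id % NUM_SHARDS


def host_for(project_id):
    return shard_hosts[shard_for(project_id)]


def migrate(shards, new_host):
    # Copying the data itself is out of scope for this sketch
    for shard in shards:
        shard_hosts[shard] = new_host


before = host_for(42), host_for(47)
migrate(range(5, 10), 'redis-2')
after = host_for(42), host_for(47)
print(before, after)
```

Because keys map to a logical shard rather than to a machine, moving DB5 through DB9 to redis-2 changes only the lookup table, not which shard any project belongs to.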