Building to Scale (PyCon TW 2013)

BUILDING TO SCALE David Cramer twitter.com/zeeg Sunday, May 26, 13

Who am I?

What do we mean by scale?

DISQUS: Massive traffic with a long tail
Sentry: Counters and event aggregation
tenXer: More stats than we can count

Does one size fit all?

NoSQL?

Postgres is the foundation of DISQUS

MySQL powers the tenXer graph store

Sentry is built on SQL

SQL is extremely powerful and versatile

Compromise

Scaling is about Predictability

Augment SQL with [technology]

Simple solutions using Redis (I like Redis)

Counters

Counters are everywhere

Counters in SQL
    UPDATE table SET counter = counter + 1;

Counters in Redis
    INCR counter
    1
    >>> redis.incr('counter')
The key doesn't have to exist!

Counters in Sentry: Buffers!
[Diagram: event ID 1, event ID 2, event ID 3 -> Redis INCR (one per event) -> SQL Update]

Counters in Sentry
‣ INCR event_id in Redis
‣ Queue buffer incr task
‣ 5 - 10s explicit delay
‣ Task does atomic GET event_id and DEL event_id (Redis pipeline)
‣ No-op if GET is not > 0
‣ One SQL UPDATE per unique event per delay
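
The buffering steps above can be sketched in plain Python. This is only an illustration, not Sentry's code: CounterBuffer is a hypothetical in-memory stand-in for Redis, flush_event stands in for the delayed task, and the sql_counts dict stands in for the SQL table.

```python
import threading
from collections import defaultdict


class CounterBuffer:
    """In-memory stand-in for the Redis buffer (hypothetical helper)."""

    def __init__(self):
        self._counts = defaultdict(int)
        self._lock = threading.Lock()

    def incr(self, event_id):
        # Plays the role of `INCR event_id`; the key need not exist yet.
        with self._lock:
            self._counts[event_id] += 1

    def get_and_delete(self, event_id):
        # Plays the role of the pipelined `GET event_id` + `DEL event_id`.
        with self._lock:
            return self._counts.pop(event_id, 0)


def flush_event(buffer, event_id, sql_counts):
    """Body of the delayed task: apply the buffered count in one update."""
    count = buffer.get_and_delete(event_id)
    if count <= 0:
        return False  # no-op: an earlier task already flushed this event
    # Stands in for: UPDATE table SET counter = counter + count WHERE id = event_id
    sql_counts[event_id] = sql_counts.get(event_id, 0) + count
    return True


buffer = CounterBuffer()
sql_counts = {}
for _ in range(3):
    buffer.incr("event-1")  # three hits, so three flush tasks get queued

# Only the first task finds a count; the other two are the "dummy" tasks.
results = [flush_event(buffer, "event-1", sql_counts) for _ in range(3)]
```

Three increments collapse into a single update, and the two remaining tasks do nothing, which is the no-op overhead listed under Cons on the next slide.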

Counters in Sentry (cont.)
Pros
‣ Solves database row lock contention
‣ Redis nodes are horizontally scalable
‣ Easy to implement
Cons
‣ Too many dummy (no-op) tasks

Alternative Counters
[Diagram: event ID 1, event ID 2, event ID 3 -> Redis ZINCRBY (one per event) -> SQL Update]

Sorted Sets in Redis
    > ZINCRBY events 1 ad93a
    {ad93a: 1}
    > ZINCRBY events 1 ad93a
    {ad93a: 2}
    > ZINCRBY events 1 d2ow3
    {ad93a: 2, d2ow3: 1}

Alternative Counters
‣ ZINCRBY events 1 event_id in Redis
‣ Cron buffer flush
‣ ZRANGE events 0 -1 to get pending updates
‣ Fire individual task per update
‣ Atomic ZSCORE events event_id and ZREM events event_id to get and flush count
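
A minimal sketch of the cron-driven flush, assuming an in-memory stand-in for the sorted set. PendingCounts, flush_pending, and apply_update are hypothetical names, and the update is applied inline here rather than fired as a separate task.

```python
class PendingCounts:
    """In-memory stand-in for one Redis sorted set (hypothetical helper)."""

    def __init__(self):
        self._scores = {}

    def zincrby(self, member, amount=1):
        # Plays the role of `ZINCRBY events <amount> <member>`.
        self._scores[member] = self._scores.get(member, 0) + amount

    def members(self):
        # Plays the role of `ZRANGE events 0 -1` (ordered by score).
        return sorted(self._scores, key=lambda m: (self._scores[m], m))

    def pop_score(self, member):
        # Plays the role of the atomic `ZSCORE` + `ZREM` pair.
        return self._scores.pop(member, 0)


def flush_pending(pending, apply_update):
    """Cron job body: one update per member that has a pending count."""
    flushed = 0
    for event_id in pending.members():
        count = pending.pop_score(event_id)
        if count > 0:
            apply_update(event_id, count)
            flushed += 1
    return flushed


pending = PendingCounts()
for event_id in ["ad93a", "ad93a", "d2ow3"]:
    pending.zincrby(event_id)

sql_counts = {}


def apply_update(event_id, count):
    # Stands in for the SQL UPDATE issued by each task.
    sql_counts[event_id] = sql_counts.get(event_id, 0) + count


first = flush_pending(pending, apply_update)   # flushes both events
second = flush_pending(pending, apply_update)  # nothing pending, no work
```

Because the flush only visits members that are actually in the set, a run with nothing pending does no per-event work at all.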

Alternative Counters (cont.)
Pros
‣ Removes (most) no-op tasks
‣ Works without a complex queue due to no required delay on jobs
Cons
‣ Single Redis key stores all pending updates

Activity Streams

Streams are everywhere

Streams in SQL
    class Activity:
        SET_RESOLVED = 1
        SET_REGRESSION = 6
        TYPE = (
            (SET_RESOLVED, 'set_resolved'),
            (SET_REGRESSION, 'set_regression'),
        )
        event = ForeignKey(Event)
        type = IntegerField(choices=TYPE)
        user = ForeignKey(User, null=True)
        datetime = DateTimeField()
        data = JSONField(null=True)

Streams in SQL (cont.)
    >>> Activity(event, SET_RESOLVED, user, now)
    "David marked this event as resolved."
    >>> Activity(event, SET_REGRESSION, datetime=now)
    "The system marked this event as a regression."
    >>> Activity(type=DEPLOY_START, datetime=now)
    "A deploy started."
    >>> Activity(type=SET_RESOLVED, datetime=now)
    "All events were marked as resolved."

Stream == View == Cache

Views as a Cache
    TIMELINE = []
    MAX = 500
    def on_event_creation(event):
        global TIMELINE
        TIMELINE.insert(0, event)
        TIMELINE = TIMELINE[:MAX]
    def get_latest_events(num=100):
        return TIMELINE[:num]

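Views in Redis
    class Timeline(object):
        def __init__(self):
            self.db = Redis()
        def add(self, event):
            # seconds.microseconds since the epoch ('%s' is a platform-specific strftime extension)
            score = float(event.date.strftime('%s.%f'))
            # redis-py 3.0+ takes a {member: score} mapping
            self.db.zadd('timeline', {event.id: score})
        def list(self, offset=0, limit=-1):
            # ZREVRANGE takes an inclusive end index, not a count
            end = -1 if limit < 0 else offset + limit - 1
            return self.db.zrevrange('timeline', offset, end)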

Views in Redis (cont.)
    MAX_SIZE = 10000
    def add(self, event):
        score = float(event.date.strftime('%s.%f'))
        # add the event and trim the oldest entries to avoid
        # data bloat in a single key
        with self.db.pipeline() as pipe:
            pipe.zadd(self.key, {event.id: score})
            # keep only the MAX_SIZE highest-scored (newest) members
            pipe.zremrangebyrank(self.key, 0, -(MAX_SIZE + 1))
            pipe.execute()

Queuing

Celery

RabbitMQ or Redis

Asynchronous Tasks
    # Register the task
    @task(queue="event_creation")
    def on_event_creation(event_id):
        counter.incr('events', event_id)
    # Delay execution
    on_event_creation.delay(event.id)

Fanout
    @task(queue="counters")
    def incr_counter(key, id=None):
        counter.incr(key, id)
    @task(queue="event_creation")
    def on_event_creation(event_id):
        incr_counter.delay('events', event_id)
        incr_counter.delay('global')
    # Delay execution
    on_event_creation.delay(event.id)

Keep jobs small!

    @task
    def add_everything_poorly():
        results = Event.objects.all()
        for event in results:
            add_event(event.id)

    @task
    def add_everything(offset=0, limit=1000):
        results = chunked(Event, offset, limit)
        for event in results:
            add_event.delay(event.id)
        # a short page means there is nothing left to process
        if len(results) < limit:
            return  # finished!
        add_everything.delay(
            offset=offset + limit,
            limit=limit,
        )
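
To see how a self-rescheduling task pages through a table, here is a small simulation without Celery. The deque stands in for the task queue, EVENT_IDS for the Event table, and list slicing for the slide's chunked helper. All of these are assumptions for illustration only.

```python
from collections import deque

EVENT_IDS = list(range(1, 2501))  # stand-in for the Event table
queue = deque()                   # stand-in for the task queue
processed = []
batches = []


def add_event(event_id):
    processed.append(event_id)


def add_everything(offset=0, limit=1000):
    results = EVENT_IDS[offset:offset + limit]
    batches.append(len(results))
    for event_id in results:
        queue.append((add_event, (event_id,)))
    if len(results) < limit:
        return  # a short page means there is nothing left
    # Re-enqueue ourselves for the next page instead of looping here.
    queue.append((add_everything, (offset + limit, limit)))


queue.append((add_everything, ()))
while queue:  # a single worker draining the queue
    func, args = queue.popleft()
    func(*args)
```

Each invocation touches at most one page, so no single job holds a worker for the full table scan.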

Object Caching

Object Cache Prerequisites
‣ Your database can't handle the read-load
‣ Your data changes infrequently
‣ You can handle slightly worse performance

Distributing Load with Memcache
Memcache 1: Event ID 01, 04, 07, 10, 13
Memcache 2: Event ID 02, 05, 08, 11, 14
Memcache 3: Event ID 03, 06, 09, 12, 15
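
One way a client might spread keys across nodes like this is a stable hash taken modulo the server count. The sketch below assumes that scheme. The server names and both helper functions are hypothetical, and real memcache clients may use a different hash or consistent hashing.

```python
import zlib

SERVERS = ["memcache-1", "memcache-2", "memcache-3"]  # hypothetical names


def server_for(key, servers=SERVERS):
    """Pick a server from a stable hash of the key."""
    digest = zlib.crc32(key.encode("utf-8"))
    return servers[digest % len(servers)]


def group_by_server(keys, servers=SERVERS):
    """Split a multi-get into one batch of keys per server."""
    batches = {}
    for key in keys:
        batches.setdefault(server_for(key, servers), []).append(key)
    return batches


keys = ["Event:{}".format(i) for i in range(1, 16)]
batches = group_by_server(keys)
```

With plain modulo hashing, changing the number of servers remaps most keys, so the cache is largely cold after a resize.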

Querying the Object Cache
    def make_key(model, id):
        return '{}:{}'.format(model.__name__, id)
    def get_by_ids(model, id_list):
        # map each cache key back to the id it was built from
        keys = {make_key(model, id): id for id in id_list}
        cached = cache.get_multi(list(keys))
        res = {keys[key]: value for key, value in cached.items()}
        # ids missing from the cache must be loaded from the database
        pending = set(id_list) - set(res)
        if pending:
            mres = model.objects.in_bulk(pending)
            cache.set_multi({make_key(model, o.id): o for o in mres.values()})
            res.update(mres)
        return res

Pushing State
    def save(self):
        cache.set(make_key(type(self), self.id), self)
    def delete(self):
        cache.delete(make_key(type(self), self.id))
        # or use a tombstone
        cache.set(make_key(type(self), self.id), DELETED)

Planning for the Future

One of the largest problems for Disqus is network-wide moderation

The DISQUS API has more than 90 complex endpoints

Be Mindful of Features

Sentry's Team Dashboard
‣ Data limited to a single team
‣ Simple views which could be materialized
‣ Only entry point for "data for team"

Sentry's Stream View
‣ Data limited to a single project
‣ Each project could map to a different DB

Preallocate Resources

Redis data must fit in memory

[Diagram: redis-1 hosts DB0 through DB9]

[Diagram: redis-1 hosts DB0 through DB4; redis-2 hosts DB5 through DB9]
When a physical machine becomes overloaded, migrate a chunk of shards to another machine.
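
A minimal sketch of the preallocation idea, assuming keys are hashed onto a fixed number of logical shards and a separate table maps each shard to a host. The function names, the crc32 hash, and the example key are assumptions, not taken from the talk.

```python
import zlib

NUM_SHARDS = 10  # fixed up front; changing it later would remap existing keys

# Logical shard -> physical host. Initially every shard lives on redis-1.
shard_hosts = {shard: "redis-1" for shard in range(NUM_SHARDS)}


def shard_for(key):
    """Logical shard for a key; this never changes once data is written."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


def host_for(key):
    """Physical host currently serving the key's shard."""
    return shard_hosts[shard_for(key)]


def migrate(shards, new_host):
    """Point a chunk of shards at another machine (after copying their data)."""
    for shard in shards:
        shard_hosts[shard] = new_host


before = shard_for("timeline:42")
migrate(range(5, 10), "redis-2")  # DB5 through DB9 move to redis-2
after = shard_for("timeline:42")
```

Only the shard-to-host table changes during a migration, so keys never need to be rehashed.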

Takeaways

Enhance your database. Don't replace it.

Queue Everything

Learn to say no (to features)

Complex problems do not require complex solutions

Thank You!
