Inside Lanyrd's Architecture
Andrew Godwin
Web Engineer, Lanyrd
@andrewgodwin
Slide 2
WHO AM I?
Andrew Godwin
Web developer
Systems administrator
Technical architect
Django core developer
Slide 4
LANYRD: THE EARLY YEARS
The Origin Story
Slide 5
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
June 2010
Slide 6
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
August 2010
“Good music on, an orange juice and some CSS fun in front of me, we have an apartment in Casablanca! (for a week or two anyway :)”
@natbat
7:19 pm, 18 August 2010
Slide 7
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
August 2010
“We launched lanyrd.com/ ! Go easy on it, the log files are going a bit nuts, who knew Twitter was viral?”
@simonw
10:52 am, 31 August 2010
Slide 8
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
August 2010
“Right... this clearly isn't sustainable. Going to have to switch the site in to read only mode for a few hours, sorry everyone!”
@simonw
11:35 am, 31 August 2010
Slide 9
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
January 2011
Natalie and Simon start three months of Y Combinator, in California.
Slide 10
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
September 2011
Lanyrd closes a $1.4 million seed funding round, moves back to London.
LANYRD TODAY
2010 2011 2012 2013
March 2013
Key dynamic parts:
Users tracking/attending events
Users tracking each other
Users tracking topics and guides
Slide 18
THE STACK TODAY
What we run on
Slide 19
THE STACK TODAY
Browser
Nginx: SSL Termination
HAProxy: Load balancer
Varnish: Web Cache
Gunicorn: Main site runtime
Amazon S3: Static files & uploads
Celery: Task workers
Redis: Tasks, Set calcs
PostgreSQL: Main data store
Solr: Search and faceting
Memcached: Fragment caching
Slide 20
THE STACK TODAY
Lanyrd is almost entirely Django (Python)
Background tasks use Celery, a Django task queue
Management tasks/cron jobs also run inside the framework
The Django application is served by Gunicorn containers
Slide 21
THE STACK TODAY
PostgreSQL
Main data store for everything except uploads
We run a master and a replicated slave
Around 80GB of data in five databases
Each server runs on a RAID 1 disk array
Slide 22
THE STACK TODAY
Redis
Task queue transport for Celery and tweet listeners
Contains user sets for every conference, user and topic
Used for efficient narrowing of queries before Solr is hit
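The set narrowing on this slide can be sketched roughly as below. This is a minimal illustration, not Lanyrd's code: plain Python sets stand in for Redis sets (in Redis the intersection would be an SINTER call), and the key names, IDs and both helper functions are hypothetical.

```python
# Hypothetical stand-in for Redis: key -> set of user IDs.
# In Redis itself the intersection below would be done with SINTER.
SETS = {
    "conference:42:attendees": {1, 2, 3, 5, 8},
    "user:7:following": {2, 3, 9},
    "topic:python:trackers": {3, 5, 9},
}


def narrow_candidates(*keys):
    """Intersect the named sets to get a small list of candidate IDs."""
    sets = [SETS.get(key, set()) for key in keys]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


def solr_id_filter(ids):
    """Build a Lucene-style filter string restricting a search to these IDs."""
    if not ids:
        return None  # nothing matched, so the search can be skipped entirely
    return "id:(" + " OR ".join(str(i) for i in ids) + ")"


# People that user 7 follows who are also attending conference 42.
ids = narrow_candidates("conference:42:attendees", "user:7:following")
print(ids)
print(solr_id_filter(ids))
```

The point of the design is that the cheap set operation runs first, so the search engine only ever sees a short list of IDs.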
Slide 23
THE STACK TODAY
Solr
Stores conferences, users, sessions and more
Very rich metadata on each item
Heavy use of sharding throughout the site
We run a master and a replicated slave
Slide 24
THE STACK TODAY
Varnish
First port of call for all requests
Caches most anonymous requests
Enforces read-only mode if enabled
One used and one hot spare at all times
Slide 25
THE STACK TODAY
HAProxy
Sits behind Varnish
Distributes load amongst frontend servers
Re-routes requests during deploys
Two in use at all times, identically configured
Slide 26
THE STACK TODAY
S3
Stores all uploaded files from users
Upload forms post directly to S3
Serves all static assets for the site (images, CSS, JS)
Static assets are versioned with a hash to help with cache busting
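One common way to version assets by hash is to put a short content digest in the filename, so a changed file gets a new URL and old cached copies are never served for it. The sketch below assumes that approach; the `hashed_name` helper, the naming scheme and the sample paths are illustrative, not taken from the talk.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path, content, length=8):
    """Return the path with a short content hash inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


css_v1 = b"body { color: black; }"
css_v2 = b"body { color: navy; }"

# Same path, different content: the two names differ, so caches see a new URL.
print(hashed_name("css/site.css", css_v1))
print(hashed_name("css/site.css", css_v2))
```

Because the name depends only on the content, an unchanged file keeps its URL across deploys and can be cached for a very long time.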
Slide 27
THE STACK TODAY
Browser
Nginx: SSL Termination
HAProxy: Load balancer
Varnish: Web Cache
Gunicorn: Main site runtime
Amazon S3: Static files & uploads
Celery: Task workers
Redis: Tasks, Set calcs
PostgreSQL: Main data store
Solr: Search and faceting
Memcached: Fragment caching
Slide 28
THE STACK BEFORE
What we've eliminated
Slide 29
THE STACK BEFORE
MongoDB
Stored analytics, logs and some other data
Lack of schema meant some bad data persisted
Poor complex query performance
Useful for quick prototyping
Slide 30
THE STACK BEFORE
MySQL
Primary data store for things not in MongoDB
Very poor complex query performance
No advanced field types
Full database locks during schema changes
Slide 31
A TALE OF TWO DBS
The Great Move of 2012
Slide 32
A TALE OF TWO DBS
Amazon EC2 → Softlayer
MySQL → PostgreSQL
Slide 33
A TALE OF TWO DBS
Why?
Predictable load means EC2 is unnecessary
Better I/O throughput
Both moves required database downtime
Slide 34
A TALE OF TWO DBS
How?
Replicate Solr and Redis across to new servers
Enter read-only mode
Dump MySQL data
Convert MySQL dump into PostgreSQL dump
Load PostgreSQL dump
Re-point DNS, proxy requests from old servers
Exit read-only mode
Slide 35
A TALE OF TWO DBS
Time in read-only mode: 1½ hours
Downtime: 0 hours
Slide 36
CONTENT IS KING
The Advantages of Content
Slide 37
CONTENT IS KING
Read-only mode is entirely viable
An hour or two at most
Everyone logged out
Varnish blocks POSTs, caches everything aggressively
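On the slide, blocking writes is done in Varnish. The same idea can be sketched at the application layer as a small WSGI middleware; this is an analogue for illustration only, and the class name, the `is_read_only` callable and the 503 response are assumptions rather than Lanyrd's actual setup.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


class ReadOnlyMiddleware:
    """WSGI middleware that rejects write requests while read-only mode is on."""

    def __init__(self, app, is_read_only):
        self.app = app
        self.is_read_only = is_read_only  # callable returning True or False

    def __call__(self, environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        if self.is_read_only() and method not in SAFE_METHODS:
            body = b"The site is temporarily in read-only mode.\n"
            start_response("503 Service Unavailable", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
                ("Retry-After", "3600"),
            ])
            return [body]
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in for the real site: always answers 200."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


def call(app, method):
    """Run one fake request through the app and return the status line."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    app({"REQUEST_METHOD": method}, start_response)
    return seen["status"]


app = ReadOnlyMiddleware(demo_app, is_read_only=lambda: True)
print(call(app, "GET"))
print(call(app, "POST"))
```

Reads pass through untouched, which is why a content-heavy site stays useful to most visitors while writes are switched off.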
Slide 38
CONTENT IS KING
Indexing delay is acceptable
Most site views are driven by Solr
1 or 2 minute indexing delay
Some views add in recent changes directly
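A view that adds recent changes on top of slightly stale search results could look something like the sketch below. The `merge_recent` helper, the dictionary shape and the sample data are hypothetical; the talk does not say how Lanyrd performs this merge.

```python
def merge_recent(indexed, recent, key="id"):
    """Overlay recent changes on indexed results; recent items win and come first."""
    recent_ids = {item[key] for item in recent}
    merged = list(recent)
    merged.extend(item for item in indexed if item[key] not in recent_ids)
    return merged


# Results from the search index, possibly a minute or two stale.
indexed = [
    {"id": 1, "title": "Keynote"},
    {"id": 2, "title": "Scaling Django"},
]
# Rows changed since the last index run, read straight from the database.
recent = [
    {"id": 2, "title": "Scaling Django (updated)"},
    {"id": 3, "title": "Lightning talks"},
]

for item in merge_recent(indexed, recent):
    print(item["id"], item["title"])
```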
Slide 39
FEATURE FLAGS
Always be deploying
Slide 40
FEATURE FLAGS
Continuous Deployment
We deploy at least 5 times a day, if not 20
Nearly all code goes into master or short-lived branches
Anything unreleased is feature flagged
Slide 41
FEATURE FLAGS
Feature flags
Simple named boolean toggles
Settable by user, user tag, or conference
Can change templates, view code, URLs, etc.
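A named boolean toggle that can be set per user, per user tag or per conference might be modelled as below. This is a minimal sketch under stated assumptions: the `Flag` and `User` classes, the `is_active` method and the sample names are invented for illustration and are not Lanyrd's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    tags: set = field(default_factory=set)


@dataclass
class Flag:
    """A named boolean toggle with per-user, per-tag and per-conference rules."""
    name: str
    users: set = field(default_factory=set)
    user_tags: set = field(default_factory=set)
    conferences: set = field(default_factory=set)

    def is_active(self, user=None, conference=None):
        if user is not None:
            if user.name in self.users:
                return True
            if user.tags & self.user_tags:  # any shared tag switches it on
                return True
        if conference is not None and conference in self.conferences:
            return True
        return False


new_schedule = Flag("new-schedule", user_tags={"staff"}, conferences={"example-conf"})

alice = User("alice", tags={"staff"})
bob = User("bob")

print(new_schedule.is_active(user=alice))
print(new_schedule.is_active(user=bob))
print(new_schedule.is_active(user=bob, conference="example-conf"))
```

View code or templates would then branch on `is_active(...)`, which is what lets unreleased work sit in master without being visible to everyone.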
Slide 42
FEATURE FLAGS
Flag management
User tag management
Slide 43
WHO WROTE THAT? OH, ME
Legacy code & decisions
Slide 44
WHO WROTE THAT? OH, ME
Technical Debt
It's fine to have some - it can speed things up
A good chunk of ours is gone, some remains
Big schema changes get harder and harder
Slide 45
SMALL AND NIMBLE
The power of small teams
Slide 46
SMALL AND NIMBLE
Six people
Slide 47
SMALL AND NIMBLE
Six people
2.5 Back-end developers
1.75 Front-end developers
1.5 Designers
0.75 System administrators
0.75 Business operations
0.5 Mobile developers
Slide 48
SMALL AND NIMBLE
Awareness
Everyone knows everything that's happening
Daily stand-ups
Weekly show-and-tell sessions
Slide 49
SMALL AND NIMBLE
Always deployable
Master branch always shippable
Large development behind feature flags
Code review for nastier changes
Slide 50
LESSONS LEARNED
What's important here?
Slide 51
LESSONS LEARNED
Small and nimble
Continuous deployment and development style allows easy project changing
No long approval processes
Less than ½ hour from report to shipped fix
Slide 52
LESSONS LEARNED
Content is great
Read-only mode allows less painful downtimes
Heavy caching smooths out our load
Learnable load patterns
Slide 53
LESSONS LEARNED
Fix it while you can
The bigger you get, the harder a fix becomes
We moved to PostgreSQL just in time
Big schema changes now take days of coding
Slide 54
LESSONS LEARNED
Six amazing people
You don't need a big team to write a complex product
Communication is absolutely key
Using Open Source well is also crucial
Slide 55
Thank you.
Andrew Godwin
Sponsor or promote your company using events?
Get in touch:
@andrewgodwin
http://aeracode.org