Slide 1

Slide 1 text

Inside Lanyrd's Architecture Andrew Godwin Web Engineer, Lanyrd @andrewgodwin

Slide 2

Slide 2 text

WHO AM I? Andrew Godwin Web developer Systems administrator Technical architect Django core developer

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

LANYRD: THE EARLY YEARS The Origin Story

Slide 5

Slide 5 text

LANYRD: THE EARLY YEARS 2010 2011 2012 2013 June 2010

Slide 6

Slide 6 text

LANYRD: THE EARLY YEARS 2010 2011 2012 2013 August 2010 “Good music on, an orange juice and some CSS fun in front of me, we have an apartment in Casablanca! (for a week or two anyway :)” @natbat 7:19 pm, 18 August 2010

Slide 7

Slide 7 text

LANYRD: THE EARLY YEARS 2010 2011 2012 2013 August 2010 “We launched lanyrd.com/ ! Go easy on it, the log files are going a bit nuts, who knew Twitter was viral?” @simonw 10:52 am, 31 August 2010

Slide 8

Slide 8 text

LANYRD: THE EARLY YEARS 2010 2011 2012 2013 August 2010 “Right... this clearly isn't sustainable. Going to have to switch the site in to read only mode for a few hours, sorry everyone!” @simonw 11:35 am, 31 August 2010

Slide 9

Slide 9 text

LANYRD: THE EARLY YEARS 2010 2011 2012 2013 January 2011 Natalie and Simon start three months of YCombinator, in California.

Slide 10

Slide 10 text

LANYRD: THE EARLY YEARS 2010 2011 2012 2013 September 2011 Lanyrd closes a $1.4 million seed funding round, moves back to London.

Slide 11

Slide 11 text

LANYRD TODAY 2010 2011 2012 2013 March 2013 ∙ Conferences ∙ Profile pages ∙ Emails ∙ Coverage ∙ Topics ∙ Guides ∙ Mobile app ∙ Dashboard

Slide 12

Slide 12 text

LANYRD TODAY 2010 2011 2012 2013 March 2013

Slide 13

Slide 13 text

LANYRD TODAY 2010 2011 2012 2013 March 2013

Slide 14

Slide 14 text

LANYRD TODAY 2010 2011 2012 2013 March 2013

Slide 15

Slide 15 text

LANYRD TODAY 2010 2011 2012 2013 March 2013

Slide 16

Slide 16 text

LANYRD TODAY 2010 2011 2012 2013 March 2013

Slide 17

Slide 17 text

LANYRD TODAY 2010 2011 2012 2013 March 2013 Key dynamic parts: ∙ Users tracking/attending events ∙ Users tracking each other ∙ Users tracking topics and guides

Slide 18

Slide 18 text

THE STACK TODAY What we run on

Slide 19

Slide 19 text

THE STACK TODAY ∙ Browser ∙ Nginx (SSL termination) ∙ Varnish (web cache) ∙ HAProxy (load balancer) ∙ Gunicorn (main site runtime) ∙ Celery (task workers) ∙ Redis (tasks, set calcs) ∙ PostgreSQL (main data store) ∙ Solr (search and faceting) ∙ Memcached (fragment caching) ∙ Amazon S3 (static files & uploads)

Slide 20

Slide 20 text

THE STACK TODAY ∙ Lanyrd is almost entirely Django (Python) ∙ Background tasks use Celery, a Python task queue with Django integration ∙ Management tasks/cron jobs also run inside the framework ∙ The Django application is served by Gunicorn containers

Slide 21

Slide 21 text

THE STACK TODAY PostgreSQL ∙ Main data store for everything except uploads ∙ We run a master and a replicated slave ∙ Around 80 GB of data in five databases ∙ Each server runs on a RAID 1 disk array

Slide 22

Slide 22 text

THE STACK TODAY Redis ∙ Task queue transport for Celery and tweet listeners ∙ Contains user sets for every conference, user and topic ∙ Used for efficient narrowing of queries before Solr is hit
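The set-narrowing idea on this slide can be sketched roughly as follows. This is not Lanyrd's code: plain Python sets stand in for Redis sets (where the equivalent operation would be an intersection such as SINTER), and the key names, IDs and the shape of the Solr filter string are all assumptions made for illustration.

```python
# Plain Python sets stand in for Redis sets; keys and IDs are hypothetical.
user_sets = {
    "conference:1:attendees": {101, 102, 103, 104},
    "user:7:following": {102, 104, 999},
    "topic:python:trackers": {104, 555},
}


def narrow(*keys):
    """Intersect the named sets, smallest first, to get candidate user IDs."""
    sets = sorted((user_sets.get(k, set()) for k in keys), key=len)
    if not sets:
        return set()
    result = set(sets[0])
    for s in sets[1:]:
        result &= s
    return result


def solr_id_filter(ids):
    """Build an illustrative ID filter clause from the narrowed IDs."""
    if not ids:
        return None
    return "id:(" + " OR ".join(str(i) for i in sorted(ids)) + ")"


# "Which of the people user 7 follows are attending conference 1?"
contacts_attending = narrow("conference:1:attendees", "user:7:following")
```

The point of narrowing first is that the search engine only has to rank and facet a small, already-filtered candidate list rather than the whole index.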

Slide 23

Slide 23 text

THE STACK TODAY Solr ∙ Stores conferences, users, sessions and more ∙ Very rich metadata on each item ∙ Heavy use of sharding throughout the site ∙ We run a master and a replicated slave

Slide 24

Slide 24 text

THE STACK TODAY Varnish ∙ First point of call for all requests ∙ Caches most anonymous requests ∙ Enforces read-only mode if enabled ∙ One used and one hot spare at all times

Slide 25

Slide 25 text

THE STACK TODAY HAProxy ∙ Sits behind Varnish ∙ Distributes load amongst frontend servers ∙ Re-routes requests during deploys ∙ Two in use at all times, identically configured

Slide 26

Slide 26 text

THE STACK TODAY S3 ∙ Stores all uploaded files from users ∙ Upload forms post directly to S3 ∙ Serves all static assets for the site (images, CSS, JS) ∙ Static assets are versioned with hash to help cache break
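Hash-based asset versioning can be sketched as below. The slide does not say which hash or naming scheme Lanyrd uses, so the SHA-1 digest, the 8-character prefix and the `name.<hash>.ext` layout are assumptions chosen for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content, digest_len=8):
    """Return the asset path with a short content hash inserted before the suffix.

    The file name changes whenever the content changes, so browsers and
    caches fetch the new file instead of reusing a stale copy.
    """
    digest = hashlib.sha1(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = versioned_name("css/site.css", b"body { color: black; }")
new = versioned_name("css/site.css", b"body { color: navy; }")
```

Because every new version gets a new URL, the files themselves can be served with very long cache lifetimes.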

Slide 27

Slide 27 text

THE STACK TODAY ∙ Browser ∙ Nginx (SSL termination) ∙ Varnish (web cache) ∙ HAProxy (load balancer) ∙ Gunicorn (main site runtime) ∙ Celery (task workers) ∙ Redis (tasks, set calcs) ∙ PostgreSQL (main data store) ∙ Solr (search and faceting) ∙ Memcached (fragment caching) ∙ Amazon S3 (static files & uploads)

Slide 28

Slide 28 text

THE STACK BEFORE What we've eliminated

Slide 29

Slide 29 text

THE STACK BEFORE MongoDB ∙ Stored analytics, logs and some other data ∙ Lack of schema meant some bad data persisted ∙ Poor complex query performance ∙ Useful for quick prototyping

Slide 30

Slide 30 text

THE STACK BEFORE MySQL ∙ Primary data store for things not in MongoDB ∙ Very poor complex query performance ∙ No advanced field types ∙ Full database locks during schema changes

Slide 31

Slide 31 text

A TALE OF TWO DBS The Great Move of 2012

Slide 32

Slide 32 text

A TALE OF TWO DBS Amazon EC2 → Softlayer ∙ MySQL → PostgreSQL

Slide 33

Slide 33 text

A TALE OF TWO DBS Why? ∙ Predictable loading means EC2 unnecessary ∙ Better I/O throughput ∙ Both moves required database downtime

Slide 34

Slide 34 text

A TALE OF TWO DBS How? 1. Replicate Solr and Redis across to new servers 2. Enter read-only mode 3. Dump MySQL data 4. Convert MySQL dump into PostgreSQL dump 5. Load PostgreSQL dump 6. Re-point DNS, proxy requests from old servers 7. Exit read-only mode

Slide 35

Slide 35 text

A TALE OF TWO DBS Time in read-only mode: 1 ½ hours Downtime: 0 hours

Slide 36

Slide 36 text

CONTENT IS KING The Advantages of Content

Slide 37

Slide 37 text

CONTENT IS KING Read-only mode is entirely viable ∙ An hour or two at most ∙ Everyone logged out ∙ Varnish blocks POSTs, caches everything aggressively
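The read-only rule can be sketched in Python as below. Lanyrd enforces it in Varnish, so this is only an illustration of the decision logic and not their configuration; treating every non-safe method (not just POST) as a write and answering with a 503 status are both assumptions.

```python
# Methods that never modify data; everything else is treated as a write (assumption).
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def handle(method, read_only):
    """Return (status, note) for a request under the read-only rule."""
    if read_only and method.upper() not in SAFE_METHODS:
        # Writes are refused at the edge and never reach the application.
        return 503, "read-only mode: writes are temporarily disabled"
    # Reads carry on as normal and can be answered from cache.
    return 200, "passed to backend or served from cache"


blocked = handle("POST", read_only=True)
allowed = handle("GET", read_only=True)
```

With everyone logged out, every page view is anonymous, which is what lets the cache answer almost all read traffic while the database is unavailable.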

Slide 38

Slide 38 text

CONTENT IS KING Indexing delay is acceptable ∙ Most site views are driven by Solr ∙ 1 or 2 minute indexing delay ∙ Some views add in recent changes directly
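One way a view might add recent changes on top of slightly stale index results is sketched below. The slide does not describe the mechanism, so the record shape (dicts with an `id` key) and the `merge_recent` helper are hypothetical.

```python
def merge_recent(indexed, recent):
    """Overlay recent changes (keyed by id) on possibly stale index results.

    Items already in the index keep their position but take the newer data;
    items the index has not seen yet are appended at the end.
    """
    merged = {item["id"]: item for item in indexed}
    for item in recent:
        merged[item["id"]] = item
    return list(merged.values())


# Results from the search index, up to a minute or two out of date.
indexed = [{"id": 1, "title": "Old title"}, {"id": 2, "title": "Talk B"}]
# Changes made since the last indexing run, read straight from the database.
recent = [{"id": 1, "title": "New title"}, {"id": 3, "title": "Talk C"}]

view_items = merge_recent(indexed, recent)
```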

Slide 39

Slide 39 text

FEATURE FLAGS Always be deploying

Slide 40

Slide 40 text

FEATURE FLAGS Continuous Deployment ∙ We deploy at least 5 times a day, if not 20 ∙ Nearly all code goes into master or short-lived branches ∙ Anything unreleased is feature flagged

Slide 41

Slide 41 text

FEATURE FLAGS Feature flags ∙ Simple named boolean toggles ∙ Settable by user, user tag, or conference ∙ Can change templates, view code, URLs, etc.
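A named boolean toggle that can be switched on per user, per user tag or per conference might look roughly like this. It is a minimal sketch, not Lanyrd's implementation: the `Flag` class, its field names and the example flag are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Flag:
    """A named boolean toggle with optional per-user, per-tag and per-conference overrides."""
    name: str
    everyone: bool = False
    users: set = field(default_factory=set)
    user_tags: set = field(default_factory=set)
    conferences: set = field(default_factory=set)

    def is_active(self, user=None, tags=(), conference=None):
        """True if the flag is on globally or matches the user, a tag, or the conference."""
        if self.everyone:
            return True
        if user is not None and user in self.users:
            return True
        if self.user_tags.intersection(tags):
            return True
        return conference is not None and conference in self.conferences


# Hypothetical flag: visible to staff-tagged users and on one test conference.
new_dashboard = Flag("new_dashboard", user_tags={"staff"}, conferences={"testconf-2013"})

for_staff = new_dashboard.is_active(user="alice", tags=["staff"])
for_public = new_dashboard.is_active(user="bob", tags=["beta"])
```

Views and templates can then branch on `is_active(...)`, which lets unfinished work ship in master without being visible to everyone.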

Slide 42

Slide 42 text

FEATURE FLAGS Flag management User tag management

Slide 43

Slide 43 text

WHO WROTE THAT? OH, ME Legacy code & decisions

Slide 44

Slide 44 text

WHO WROTE THAT? OH, ME Technical Debt ∙ It's fine to have some - it can speed things up ∙ A good chunk of ours is gone, some remains ∙ Big schema changes get harder and harder

Slide 45

Slide 45 text

SMALL AND NIMBLE The power of small teams

Slide 46

Slide 46 text

SMALL AND NIMBLE Six people

Slide 47

Slide 47 text

SMALL AND NIMBLE Six people ∙ 2.5 Back-end developers ∙ 1.75 Front-end developers ∙ 1.5 Designers ∙ 0.75 System administrators ∙ 0.75 Business operations ∙ 0.5 Mobile developers

Slide 48

Slide 48 text

SMALL AND NIMBLE Awareness ∙ Everyone knows everything that's happening ∙ Daily stand-ups ∙ Weekly show-and-tell sessions

Slide 49

Slide 49 text

SMALL AND NIMBLE Always deployable ∙ Master branch always shippable ∙ Large development behind feature flags ∙ Code review for nastier changes

Slide 50

Slide 50 text

LESSONS LEARNED What's important here?

Slide 51

Slide 51 text

LESSONS LEARNED Small and nimble ∙ Continuous deployment and our development style make it easy to switch between projects ∙ No long approval processes ∙ Less than ½ hour from report to shipped fix

Slide 52

Slide 52 text

LESSONS LEARNED Content is great ∙ Read-only mode allows less painful downtimes ∙ Heavy caching smooths out our load ∙ Learnable load patterns

Slide 53

Slide 53 text

LESSONS LEARNED Fix it while you can ∙ The bigger you get, the harder a fix ∙ We moved to PostgreSQL just in time ∙ Big schema changes now take days of coding

Slide 54

Slide 54 text

LESSONS LEARNED Six amazing people ∙ You don't need a big team to write a complex product ∙ Communication is absolutely key ∙ Using Open Source well is also crucial

Slide 55

Slide 55 text

Thank you. Andrew Godwin Sponsor or promote your company using events? Get in touch: @andrewgodwin http://aeracode.org [email protected]