Slide 1

Slide 1 text

Release Workflow Diego Muñoz [email protected] http://twitter.com/Kartones V1.1

Slide 2

Slide 2 text

Agenda • Numbers • Release workflow • Tools

Slide 3

Slide 3 text

Numbers • +13M users (~10.4M active users, Dec 2011) • +100 usage minutes / day (avg) • +400M chat messages / day • +4M photos uploaded / day (peaks) • +41,000M page views / month • +35K requests / sec (peaks) • +1.3K servers • +250 employees (~60% techies) • +15K files in the repositories • +10K tests
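To put the daily and monthly totals next to the requests-per-second peak, they can be converted to per-second averages. This is only back-of-the-envelope arithmetic on the figures above; the 30-day month is an assumption, and since every figure is quoted as "more than", the results are lower bounds.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(total, days=1):
    """Average rate per second for a total spread evenly over `days` days."""
    return total / (days * SECONDS_PER_DAY)


# 400M chat messages per day, averaged over the day.
chat_per_sec = per_second(400e6)

# 41,000M page views per month; a 30-day month is an assumption.
views_per_sec = per_second(41_000e6, days=30)

# ~10.4M active users out of 13M registered.
active_share = 10.4 / 13

print(f"chat messages/sec (avg): {chat_per_sec:,.0f}")
print(f"page views/sec (avg):    {views_per_sec:,.0f}")
print(f"active share:            {active_share:.0%}")
```

The averages sit well below the 35K requests/sec peak, which fits a peak figure covering every request type, not just page views.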

Slide 4

Slide 4 text

Release Workflow Branch Code Test Integrate Release Stabilize

Slide 5

Slide 5 text

Release Workflow: Branch • Avg. 15 branches per release • Current record: 29 branches • Repository per functional area (be, fe, stats, …) • Avg. lines modified per release: 63K

Slide 6

Slide 6 text

Release Workflow: Code + Test • Scrum (or at least Agile) • As TDD as possible • Labs • A/B Testing • PoCs • Dark launch
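A/B tests and dark launches both need a stable way to decide which users see a change. The slides do not say how users were assigned, so this is a minimal sketch assuming hash-based bucketing; `bucket`, `is_enabled`, and the experiment name are hypothetical.

```python
import hashlib


def bucket(user_id, experiment, buckets=100):
    """Map a user to a stable bucket in [0, buckets) for one experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % buckets


def is_enabled(user_id, experiment, rollout_percent):
    """True if the user falls inside the rollout percentage (0-100)."""
    return bucket(user_id, experiment) < rollout_percent


# Hypothetical usage: expose a new feature to 10% of users.
if is_enabled(user_id=42, experiment="new-photo-uploader", rollout_percent=10):
    print("serve new code path")
else:
    print("serve old code path")
```

Hashing the experiment name together with the user id keeps assignments independent across experiments, and raising the percentage only ever adds users to the enabled group.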

Slide 7

Slide 7 text

Release Workflow: Integrate • Repo always available • Specific release date given by devops – Merge & wait for target • Only merge if 100% of tests pass or with specific approval • QA regression & manual tests • Fix possible integration problems ASAP
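The merge rule can be written as a small gate. This is a sketch of the rule as stated on the slide, not the actual tooling; `BranchStatus`, its fields, and the branch names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BranchStatus:
    name: str
    tests_run: int
    tests_passed: int
    approved_by: Optional[str] = None  # who granted a specific approval, if anyone


def can_merge(status):
    """Allow the merge only with a fully green test run or an explicit approval."""
    all_green = status.tests_run > 0 and status.tests_passed == status.tests_run
    return all_green or status.approved_by is not None


print(can_merge(BranchStatus("feature-a", tests_run=10_000, tests_passed=10_000)))
print(can_merge(BranchStatus("feature-b", tests_run=10_000, tests_passed=9_990)))
print(can_merge(BranchStatus("hotfix-c", 10_000, 9_990, approved_by="release manager")))
```

Treating a run with zero tests as not green is a design choice of the sketch, so that a branch whose tests never ran cannot slip through.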

Slide 8

Slide 8 text

Release Workflow: Release • 3 releases per week – DevOps goal: all weekdays • Latest stable changeset from Integration, taken the morning of the previous working day • Release doc, pre-release meetings • Staging servers to test with live data
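Picking the changeset cut-off day is a small date calculation. A minimal sketch, assuming that "working day" means Monday to Friday and ignoring public holidays; `previous_working_day` is a hypothetical helper.

```python
from datetime import date, timedelta


def previous_working_day(release_day):
    """Return the last Monday-to-Friday date strictly before `release_day`."""
    day = release_day - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


# A Monday release takes its changeset from the preceding Friday.
print(previous_working_day(date(2012, 1, 9)))
```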

Slide 9

Slide 9 text

Release Workflow: Stabilize • First code push: 8 AM – DevOps goal: single push + release closed • Release window: 1-2 h – DevOps goal: < 30 minutes • Error stabilization or release rollback • Representatives from all involved teams

Slide 10

Slide 10 text

Tools

Slide 11

Slide 11 text

DVCS: Mercurial • http://mercurial.selenic.com/ • Syntax similar to SVN (our old system) • Easy API to plug our plugins and hooks • Cross-platform • Tuenti Addons: – Commit hooks to check syntax, push ticket #... • Problems: – Push/pulls through VPN are slow – Handling multiple repos still slow – Only one level of rollback!
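A commit hook that insists on a ticket reference comes down to a check on the commit message. The slide does not give the ticket format, so the `#1234` pattern below is an assumption, and `has_ticket_reference` is a hypothetical helper; wiring it into Mercurial's hook mechanism is left out.

```python
import re

# Assumed format: a hash sign followed by digits, e.g. "#1234".
TICKET_PATTERN = re.compile(r"#\d+")


def has_ticket_reference(message):
    """True if the commit message mentions at least one ticket number."""
    return TICKET_PATTERN.search(message) is not None


for message in ("Fix avatar cropping, refs #4521", "quick fix"):
    verdict = "accept" if has_ticket_reference(message) else "reject"
    print(f"{verdict}: {message}")
```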

Slide 12

Slide 12 text

Issue Tracking: Trac • http://trac.edgewall.org/ • User Stories tasks + Bugs • Wiki (now also internal Google Sites) • Extensible through plugins • Tuenti Addons: – Master/Slave architecture – Tons of tweaks and source code integration hooks • Problems: – Slow, limited, code viewing sucks • Migration to JIRA planned

Slide 13

Slide 13 text

Testing: PHPUnit • http://www.phpunit.de • Some caveats – Mocking just ‘works’ – PHP process spawning PHP tests • Tuenti Addons: – Vastly improved mocking framework – Shell scripts that isolate test batteries – Better integration with Selenium • Problems: – Our current FEFW does not cope perfectly with PHPUnit/Selenium

Slide 14

Slide 14 text

Testing: Selenium • http://seleniumhq.org/ • Running browser tests in Firefox and IE • Tuenti Addons: – Custom build with some fixes • Problems: – JavaScript handling/detection not perfect – AJAX far from optimal – IE runner is an iframe • Planned migration to WebDriver

Slide 15

Slide 15 text

CI: Jenkins • http://jenkins-ci.org/ • Previously Hudson too • Specialized farm (master + 22 nodes) • Tuenti Addons: – Parallelization (up to 6 nodes) – Special reports – “Smart” runs (try first last failed tests, etc.) • Problems: – Browser tests slow (due to Selenium) – Unstable (mainly due to Selenium)
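The "smart" run idea and the parallelization across nodes can be sketched in a few lines. This illustrates the two ideas only and is not the actual Jenkins addon; `order_tests`, `shard`, and the test names are hypothetical, and the round-robin split is an assumption.

```python
def order_tests(all_tests, last_failed):
    """Run the tests that failed last time first, keeping relative order otherwise."""
    failed_first = [t for t in all_tests if t in last_failed]
    rest = [t for t in all_tests if t not in last_failed]
    return failed_first + rest


def shard(tests, nodes=6):
    """Split an ordered test list round-robin across `nodes` workers."""
    return [tests[i::nodes] for i in range(nodes)]


tests = ["test_login", "test_chat", "test_photos", "test_search"]
ordered = order_tests(tests, last_failed={"test_photos"})
for node, batch in enumerate(shard(ordered, nodes=2)):
    print(node, batch)
```

Round-robin sharding after the reordering spreads the previously failing tests across nodes, so each node reports its riskiest tests early.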

Slide 16

Slide 16 text

Storage: MySQL • http://www.mysql.com/ | http://www.percona.com • Live site storage • Dev. env. storage – 1 DB per user (to run tests) – 1 shared DB (common faked data) • Clusters of master/slave DBs • Problems: – Slow when running tests – Shared dev DB has old-time inconsistencies

Slide 17

Slide 17 text

Storage: Hadoop • http://hadoop.apache.org/ • Dedicated cluster • Pig scripts: Stats, other non-realtime data • HBase: Async. data storage • Hive: SQL-like querying • Problems: –Complex configuration for newcomers

Slide 18

Slide 18 text

Caching: Memcached • http://memcached.org/ • Avg. DB queries/pageview: 0.3 • Dev. behaviour == live behaviour • Tuenti Addons (https://github.com/tuenti): – UDP + multi-ports • Problems: – 32GB RAM / machine practical limit – Remember to warm-up data or MC will kill the DB!
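The warm-up warning is easier to see with a cache-aside read path: every miss turns into a DB query, so a cold cache sends the full read load to the database. A minimal sketch that uses a plain dict as a stand-in for Memcached (no real Memcached client is involved); `CountingDB`, `get`, and `warm_up` are hypothetical names.

```python
class CountingDB:
    """Fake database that counts how many queries it receives."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.queries = 0

    def fetch(self, key):
        self.queries += 1
        return self.rows[key]


def get(key, cache, db):
    """Cache-aside read: serve from the cache, fall back to the DB on a miss."""
    if key in cache:
        return cache[key]
    value = db.fetch(key)
    cache[key] = value
    return value


def warm_up(keys, cache, db):
    """Preload hot keys so live traffic does not hit a cold cache."""
    for key in keys:
        get(key, cache, db)


db = CountingDB({"user:1": "Ana", "user:2": "Luis"})
cache = {}
warm_up(["user:1", "user:2"], cache, db)
queries_after_warm_up = db.queries
get("user:1", cache, db)  # served from the cache, no extra DB query
print(queries_after_warm_up, db.queries)
```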

Slide 19

Slide 19 text

Configuration: Puppet • http://puppetlabs.com/ • Production machines • Jenkins nodes • VM management / Dev web servers config • Problems: – Wipes user config if not puppetized

Slide 20

Slide 20 text

Search: Sphinx • http://sphinxsearch.com/ • Non-realtime (index based) • Very fast • Problems – Index re-generation on dev & test env. – Could be more friendly to add new data

Slide 21

Slide 21 text

Build: Our build script • http://ant.apache.org/ • Localization • Minification + Bundling + Versioning • Statics deployment to CDNs • Fast: 2-3 minutes full build – Multithreading + parallelization • Allows partial, component based builds • Problems: – Under heavy CPU load, build time goes up :(

Slide 22

Slide 22 text

Build: RSync • http://rsync.samba.org/ • Deployment of code (live & dev) • Sends deltas/diffs • Really fast
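Sending deltas means transferring only the blocks that changed. The sketch below is a deliberately simplified illustration that compares fixed, aligned blocks by hash. It is not rsync's real algorithm, which uses a rolling checksum to find matching blocks at any offset. `delta`, `apply_delta`, and the 4-byte block size are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose, to keep the example readable


def block_hash(chunk):
    return hashlib.sha256(chunk).hexdigest()


def delta(old, new, block_size=BLOCK_SIZE):
    """Describe `new` as 'keep block i of old' or 'send these bytes' operations."""
    old_hashes = [
        block_hash(old[i:i + block_size]) for i in range(0, len(old), block_size)
    ]
    ops = []
    for index, start in enumerate(range(0, len(new), block_size)):
        chunk = new[start:start + block_size]
        if index < len(old_hashes) and block_hash(chunk) == old_hashes[index]:
            ops.append(("keep", index))
        else:
            ops.append(("send", chunk))
    return ops


def apply_delta(old, ops, block_size=BLOCK_SIZE):
    """Rebuild the new content on the receiving side from old data plus the delta."""
    out = bytearray()
    for kind, value in ops:
        if kind == "keep":
            out += old[value * block_size:(value + 1) * block_size]
        else:
            out += value
    return bytes(out)


old = b"aaaabbbbcccc"
new = b"aaaaXXXXcccc"
ops = delta(old, new)
print(ops)
print(apply_delta(old, ops) == new)
```

Only the middle block travels over the wire here; the other two are rebuilt from what the receiver already has.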

Slide 23

Slide 23 text

Statics bundling: YUI • Fewer files == faster download & deploy • Big text file == better HTTP Gzip • HTML/JS/CSS Minification • Ultra fast build: ~4 seconds in Dev. • On demand JS loading! • Nice typical framework features • Wonderful & simple events system
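The Gzip point can be checked with the standard library: one bundle lets the compressor reuse patterns across files and pays the per-file overhead once. A small sketch with made-up JS snippets; `bundle`, `gzipped_size`, and the sample files are hypothetical.

```python
import gzip


def gzipped_size(text):
    """Size in bytes of the text after gzip compression."""
    return len(gzip.compress(text.encode("utf-8")))


def bundle(files):
    """Concatenate several static files into one."""
    return "\n".join(files)


# Made-up, similar-looking JS files, as often found in a real codebase.
files = [
    f"function handler{i}(event) {{ console.log('clicked', event.target); return false; }}"
    for i in range(5)
]

separate = sum(gzipped_size(f) for f in files)
bundled = gzipped_size(bundle(files))
print(f"separate: {separate} bytes, bundled: {bundled} bytes")
```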

Slide 24

Slide 24 text

Statics bundling: YUI (II) • Tuenti Addons: – Caching builds – Line breaks each # characters (easier IE debugging) – CDNs handling • Problems: – Change in JS requires rebuilding even in dev. – Requires small migrations/changes in existing JS

Slide 25

Slide 25 text

Chat Server: Ejabberd • http://www.ejabberd.im/ • Erlang XMPP (Jabber) server • 400M msgs/day, 1M concurrent users peak,… • 20 machines, ~5 instances per machine • Tuenti Addons: – Ejabberd codebase tweaked (3.5x faster) – Protocol tweaks to optimize for our architecture • Problems: – Same behaviour dev/live is critical

Slide 26

Slide 26 text

Compilation: HipHop • Migrating old code to fully support HipHop – With PHP 5.3 • Obvious speed improvements • Also nice for static code analysis

Slide 27

Slide 27 text

Sounds interesting? http://jobs.tuenti.com exit(0);