Slide 1

Slide 1 text

SCALING A HIGHLY TRANSACTIONAL BUSINESS

Slide 2

Slide 2 text

@maraujop @ticketeaeng

Slide 3

Slide 3 text

Who are we?

@JavierHdez3 @maraujop @igalarzab @sullymorland @patoroco @imanolcg @RafaRM20 @iamcarlosedo @andrea_mgr @Mc_Arena_pr @javitxudedios @Maquert @gnufede @willyfrog_ @ShideShugo

Slide 4

Slide 4 text

Some numbers, peak hour stats

• 2014: 6582 tickets
• 2015: 21287 tickets

Slide 5

Slide 5 text

FRONTEND HISTORY

Slide 6

Slide 6 text

“Microservices are about scaling the number of engineers not the number of requests” Jay Kreps (co-creator of Kafka)

Slide 7

Slide 7 text

BALANCE YOUR SYSTEM

Slide 8

Slide 8 text

A little bit of history

Slide 9

Slide 9 text

Frontend: Django

Slide 10

Slide 10 text

Pros

• 5 coders worked at the company
• Fast to build, easy to test and deploy
• Simple stack

Slide 11

Slide 11 text

Cons

• Workers blocked on API connections
• Errors in the backoffice could crash the site
• A slow API means nothing loads

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

Afrodita sync system

Slide 14

Slide 14 text

Afrodita connection system

Slide 15

Slide 15 text

Use JS

Slide 16

Slide 16 text

2014 (350 frontends) 100x Django Frontend

Slide 17

Slide 17 text

2015 (20 frontends) 3.5x Afrodita

Slide 18

Slide 18 text

SYSTEMS

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

CLOUD PROVIDER TICKETEA

Slide 21

Slide 21 text

Why the cloud?

• High availability despite load spikes
• Scale from 3 servers to 400 in minutes
• Awesome managed services

Slide 22

Slide 22 text

Why the cloud?

• Easier to be fault tolerant
• They have great uptime
• We build AMIs on deploys and use autoscaling
• Preheating

Slide 23

Slide 23 text

BACKEND

Slide 24

Slide 24 text

MY LEGACY CODE IS SLOW

Slide 25

Slide 25 text

Scaling

MY LEGACY CODE IS SLOW

• Scale up: a bigger machine
• Scale out: more machines behind a load balancer

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

Things you can do

• Optimize where your real bottleneck is
• Cache more things
• Speed up your database usage
• Denormalize your database
• Execute less code
• Speed up your code
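As an illustration of "cache more things", here is a minimal sketch of a time-based cache, assuming an in-process dictionary is good enough for the example. The class name `TTLCache`, the `get_or_compute` method and the injectable `clock` are hypothetical and do not come from the talk; a shared cache server would normally play this role when many frontends are involved.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache whose entries expire after `ttl` seconds.

    Hypothetical helper for illustration only.
    """

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock  # injectable so tests can control time
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            # Entry is still fresh: skip the expensive call entirely.
            return hit[1]
        # Missing or expired: recompute and store with a new expiry time.
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value
```

A short TTL keeps the data reasonably fresh while still absorbing most of the repeated reads during a load spike.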

Slide 28

Slide 28 text

“The central enemy of reliability is complexity” Andy Bechtolsheim (Co-founder Sun Microsystems)

Slide 29

Slide 29 text

Execute less code

• While your workers are executing your code, they cannot serve new requests
• Move everything non-essential to background tasks, especially I/O-bound jobs
• Watch your timeouts
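The idea of moving non-essential work out of the request path can be sketched as follows. This is only an illustration using a standard-library queue and a single worker thread; the function names (`handle_purchase`, `send_confirmation_email`, `start_worker`) are hypothetical, and in a real deployment a dedicated task queue would likely take the place of the in-process thread.

```python
import queue
import threading

tasks: queue.Queue = queue.Queue()
sent = []  # records what the background worker processed


def send_confirmation_email(order_id: int) -> None:
    # Stand-in for a slow, I/O-bound job (hypothetical).
    sent.append(order_id)


def worker() -> None:
    # Runs outside the request path and drains the queue forever.
    while True:
        func, args = tasks.get()
        try:
            func(*args)
        finally:
            tasks.task_done()


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def handle_purchase(order_id: int) -> dict:
    # Only the essential work happens here; the email is deferred,
    # so the web worker is free to serve the next request.
    tasks.put((send_confirmation_email, (order_id,)))
    return {"order_id": order_id, "status": "confirmed"}
```

The handler returns as soon as the job is enqueued, so a slow mail server no longer holds a web worker hostage.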

Slide 30

Slide 30 text

Database

• Relational databases are “almost” never the problem
• Check your slow query log (create indexes)
• Optimize slow or recurrent queries
• Use read replicas (careful with replication lag)
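One common way to be "careful with replication lag" is to send a user's reads to the primary for a short window after that user writes. The sketch below shows only that routing decision; `ReplicaRouter`, its method names and the one-second default are assumptions made for the example, not something described in the talk.

```python
import time
from typing import Callable, Dict


class ReplicaRouter:
    """Chooses 'primary' or 'replica' for reads (hypothetical helper)."""

    def __init__(self, max_lag_seconds: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_lag = max_lag_seconds  # assumed upper bound on replica lag
        self.clock = clock
        self._last_write: Dict[int, float] = {}  # user_id -> time of last write

    def record_write(self, user_id: int) -> None:
        self._last_write[user_id] = self.clock()

    def db_for_read(self, user_id: int) -> str:
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.max_lag:
            # The replica may not have this user's write yet.
            return "primary"
        return "replica"
```

Users who have not written recently keep reading from replicas, so the primary only absorbs the reads that actually need fresh data.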

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

Database contingency

Slide 33

Slide 33 text

Database contingency

Slide 34

Slide 34 text

Database contingency

• Wrapping everything in a transaction is not cost-effective
• Deadlocks are tricky to spot
• Caching data in this situation can be counter-productive

Slide 35

Slide 35 text

Database contingency

• Shard counters
• Use another technology, like Redis (consistency)
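A sharded counter can be sketched as below. In a database each shard would typically be its own row, so concurrent increments contend for different row locks instead of a single hot one; here plain list slots stand in for those rows. The class name `ShardedCounter` and the shard count are assumptions for illustration.

```python
import random
from typing import List, Optional


class ShardedCounter:
    """Counter split across several slots (hypothetical, in-memory stand-in)."""

    def __init__(self, num_shards: int = 8, rng: Optional[random.Random] = None) -> None:
        self.shards: List[int] = [0] * num_shards
        self.rng = rng or random.Random()

    def increment(self, amount: int = 1) -> None:
        # Pick a random shard so writers spread across slots.
        index = self.rng.randrange(len(self.shards))
        self.shards[index] += amount

    def total(self) -> int:
        # Reads pay the cost: the value is the sum of all shards.
        return sum(self.shards)
```

The trade-off is cheaper, less contended writes in exchange for a slightly more expensive read.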

Slide 36

Slide 36 text

Stop unnecessary background jobs

Slide 37

Slide 37 text

Find more about us

http://www.infoq.com/articles/monolith-to-multilith
https://speakerdeck.com/ticketeaeng
https://twitter.com/ticketeaeng
http://engineering.ticketea.com

Slide 38

Slide 38 text

THANKS