Slide 1

Slide 1 text

YesSQL, Process and Tooling at Scale Rocio Delgado Universe 2016

Slide 2

Slide 2 text

whoami? @rokkzy @rocio !

Slide 3

Slide 3 text

FIN

Slide 4

Slide 4 text

FIN

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

One system fits all? !

Slide 7

Slide 7 text

One system fits all !

Slide 8

Slide 8 text

" # $ 16+ M Users 125+ M Issues 38+ M Repos % 78+ M Pull Requests* * Since 2010

Slide 9

Slide 9 text

Team Growth

Slide 10

Slide 10 text

Team Growth

Slide 11

Slide 11 text

! Performance as a Feature

Slide 12

Slide 12 text

Why care about performance?

Slide 13

Slide 13 text

" & ' Everybody’s responsibility Process & Tooling Metrics Performance Culture

Slide 14

Slide 14 text

https://www.flickr.com/photos/wocintechchat/

Slide 15

Slide 15 text

" & ' Everybody’s responsibility Process & Tooling Metrics Performance Culture " & ' Everybody’s responsibility Process & Tooling Metrics

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Process & Tooling

Slide 19

Slide 19 text

Process & Tooling

Slide 20

Slide 20 text

Deploying code

Slide 21

Slide 21 text

+ ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

GitHub’s Online Schema Transmogrifier Transfiguration Transformer Thingy

Slide 24

Slide 24 text

✴ Triggerless ✴ Lightweight ✴ Pauseable ✴ Dynamically controllable ✴ Auditable ✴ Testable ✴ Trustable

Slide 25

Slide 25 text

Process & Tooling

Slide 26

Slide 26 text

Process & Tooling

Slide 27

Slide 27 text

Process & Tooling

Slide 28

Slide 28 text

Process & Tooling

Slide 29

Slide 29 text

Process & Tooling

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

A process alone won’t solve all the problems.

Slide 32

Slide 32 text

https://www.flickr.com/photos/wocintechchat/ Communication

Slide 33

Slide 33 text

Collaboration

Slide 34

Slide 34 text

Tooling

Slide 35

Slide 35 text

Speaking of tools…

Slide 36

Slide 36 text

Process & Tooling - Peek

Slide 37

Slide 37 text

Process & Tooling - Peek

Slide 38

Slide 38 text

Finding N+1’s

Slide 39

Slide 39 text

Process & Tooling - Haystack

Slide 40

Slide 40 text

Performance Culture " & ' Everybody’s responsibility Process & Tooling Metrics

Slide 41

Slide 41 text

Measure all the things

Slide 42

Slide 42 text

Measure all the things

Slide 43

Slide 43 text

Measure all the things

Slide 44

Slide 44 text

Putting it all together…

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

VividCortex

Slide 47

Slide 47 text

Cloning the table to staging

Slide 48

Slide 48 text

Testing a new index and open PR

Slide 49

Slide 49 text

Migration is approved

Slide 50

Slide 50 text

Schedule migration

Slide 51

Slide 51 text

Run migration

Slide 52

Slide 52 text

Ship it

Slide 53

Slide 53 text

.explain

Slide 54

Slide 54 text

Profit

Slide 55

Slide 55 text

Verify

Slide 56

Slide 56 text

No more needles

Slide 57

Slide 57 text

! Using Science

Slide 58

Slide 58 text

Scientist https://github.com/github/scientist

Slide 59

Slide 59 text

Performance test between 2 indexes

Slide 60

Slide 60 text

Performance Culture " & ' Everyone’s responsibility Process & Tooling Metrics

Slide 61

Slide 61 text

! MySQL as primary data store

Slide 62

Slide 62 text

MySQL ✴ Predictability ✴ Scalability ✴ High performance ✴ High availability ✴ Operational experience ✴ Optimized for fast reads

Slide 63

Slide 63 text

K,V table in MySQL

Slide 64

Slide 64 text

Storing users dismissal notices in K,V

Slide 65

Slide 65 text

Storing users dismissal notices in K,V

Slide 66

Slide 66 text

Checking for presence

Slide 67

Slide 67 text

SQL queries on KV table per second

Slide 68

Slide 68 text

! What IS a good index strategy?

Slide 69

Slide 69 text

✴Build indexes for performance critical queries ✴Build index order that benefits more queries ✴SELECT * FROM issues WHERE user_id = 2 AND repository_id = 2; ✴SELECT * FROM issues WHERE user_id > 2 AND repository_id= 2; ✴INDEX ON (repository_id, user_id) is best ✴Prefer to extend an existing index rather than create a new one ✴Favor multi column indexes

Slide 70

Slide 70 text

! What’s NOT a good index strategy?

Slide 71

Slide 71 text

Lack of indexes

Slide 72

Slide 72 text

Unnecessary indexes ✴ Indexes require space, the more you have the bigger the table. ✴ Write operations will be slower.

Slide 73

Slide 73 text

Finding unused indexes

Slide 74

Slide 74 text

! What’s coming next in MySQL?

Slide 75

Slide 75 text

You had me at emoji

Slide 76

Slide 76 text

! It’s work in progress

Slide 77

Slide 77 text

Hubot https://hubot.github.com gh-ost https://github.com/github/gh-ost Haystack http://githubengineering.com/exception-monitoring-and-response Peek https://github.com/peek/peek Scientist https://github.com/github/scientist Graphite http://graphite.wikidot.com

Slide 78

Slide 78 text

FIN @rokkzy @rocio ! Gracias!