Slide 1

Slide 1 text

Co-workers as Customers: Lessons from Airbnb & Etsy
Harrison Shoff (@hshoff) and Daniel Schauenberg (@mrtazz)

Slide 2

Slide 2 text

Daniel Schauenberg, Senior Software Engineer at Etsy’s Infrastructure Team
• StatsD
• Deployinator
• Morgue
• chef-whitelist

Slide 3

Slide 3 text

New York City

Slide 4

Slide 4 text

Portal Earrings by SVCharms

Slide 5

Slide 5 text

Adventure Time Crochet Dolls by chichirevolver

Slide 6

Slide 6 text

Etsy Stats
• 25M members
• 18M items listed
• 60M monthly visitors
• 1.5B page views per month

Slide 7

Slide 7 text

Etsy Stack
• PHP/Ruby/Node.js/Python/Perl/Go
• MySQL/Postgres/Vertica/Hadoop/Solr
• Memcached/Redis/Gearman
• StatsD/Logster/Graphite/Ganglia
• Chef/Nagios/IRC

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

Harrison Shoff, Design Engineer at Airbnb
• Chronos
• Photography Tool
• JavaScript Style Guide

Slide 10

Slide 10 text

San Francisco

Slide 11

Slide 11 text

Krakow, Poland

Slide 12

Slide 12 text

Treehouse - Vermont

Slide 13

Slide 13 text

1 sq m House - Berlin, Germany

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

Airbnb Stats (as of Dec 19, 2013)
• 10M nights booked (6 million in the past year)
• 550k properties (250,000 in the past year)
• 175 countries
• 25 languages

Slide 16

Slide 16 text

Airbnb Stack
• Ruby/Java/Node
• Hadoop/Hive/Redshift/Mesos/Chronos
• StatsD & Graphite (thanks Daniel)
• MySQL/Redis/Dyson/S3/Memcached
• Chef/Synapse/Nerve/Stemcell

Slide 17

Slide 17 text

Why build tools?

Slide 18

Slide 18 text

Enable people

Slide 19

Slide 19 text

Enable people 1x 10x

Slide 20

Slide 20 text

Reproducibility

Slide 21

Slide 21 text

No dependency on providers

Slide 22

Slide 22 text

Nothing says undocumented like, “run this command on this box”

Slide 23

Slide 23 text

Not Invented Here Syndrome: be careful about this

Slide 24

Slide 24 text

Getting Started

Slide 25

Slide 25 text

3 warning signs that you’re missing a tool

Slide 26

Slide 26 text

Outage or Incident

Slide 27

Slide 27 text

It sucks because we have to do X, then Y, then Z

Slide 28

Slide 28 text

Simplify the process X Y Z

Slide 29

Slide 29 text

“There should be a tool for this”

Slide 30

Slide 30 text

Team Structure
• 5 Engineers + 1 Manager
• 1 Engineer per project
• Consulting with developers, ops engineers
• Most projects have at least 2 people who occasionally work on them
• Each team is also building tools in their field

• 12 Engineers + 2 Managers
• Data Infrastructure: 4 Engineers + 1 Manager
• Ops Tools: 8 Engineers + 1 Manager
• All building tools to support employees

Slide 31

Slide 31 text

Open Source

Slide 32

Slide 32 text

Why Open Source? Should we give away tools for free?

Slide 33

Slide 33 text

You probably owe most of your stack to Open Source

Slide 34

Slide 34 text

Don't reinvent the wheel

Slide 35

Slide 35 text

Give back to the community (item by WillowAndPoppy)

Slide 36

Slide 36 text

Contributions and improvements by the community (item by XStitchMyHeart)

Slide 37

Slide 37 text

Open Source: The good parts

Slide 38

Slide 38 text

Good Examples: Where it worked

Slide 39

Slide 39 text

StatsD: The Poster Child
• Exists in almost any stack in some form
• > 680 commits from > 120 contributors
• Clients in almost any language
• Only custom part is the config file
• Pluggable backends
• Flexible
• Easy to get started
• Switched over to the public version early
• There is no internal repo
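
The "clients in almost any language" point follows from how small the wire format is. Below is a minimal Python sketch, assuming the plain-text `name:value|type` line format sent over UDP. The metric names, the host, and the commonly used port 8125 are illustrative assumptions and do not come from the talk.

```python
import socket


def format_metric(name, value, metric_type, sample_rate=None):
    """Build one StatsD line, e.g. 'deploys:1|c' for a counter."""
    line = f"{name}:{value}|{metric_type}"
    # A sample rate below 1 is appended as '|@<rate>'.
    if sample_rate is not None and sample_rate < 1:
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; host and port are assumptions."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
```

A caller would run something like `send_metric(format_metric("deploys", 1, "c"))`. Because the send is UDP, the instrumented application does not block on the metrics server.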

Slide 40

Slide 40 text

Morgue

Slide 41

Slide 41 text

Morgue
• Post Mortem Keeper
• Timelines
• IRC logs
• Graphs
• Images
• Jira Tickets
• Added feature API for Open Sourcing
• Internal repo with Etsy features
• Main version deployed from github.com

Slide 42

Slide 42 text

Kale (Skyline/Oculus) (photo: http://www.flickr.com/photos/tasselflower/95151097)

Slide 43

Slide 43 text

Kale (Skyline/Oculus)
• Anomaly detection engine
• Based on Graphite input data
• Clearly defined interface/API
• Started as a hack week project
• Running the Open Source version from the start
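
To make "anomaly detection on Graphite input data" concrete, here is a minimal Python sketch of one generic technique: flag the newest datapoint when it sits more than three standard deviations from the mean of the preceding ones. It illustrates the idea only and is not Kale's implementation. The `(timestamp, value)` tuple layout and the function name `is_anomalous` are assumptions.

```python
from statistics import mean, pstdev


def is_anomalous(datapoints, sigmas=3.0):
    """Return True if the latest value deviates strongly from its history.

    datapoints: list of (timestamp, value) pairs; None values are skipped.
    """
    values = [value for _, value in datapoints if value is not None]
    if len(values) < 3:
        return False  # not enough history to judge
    history, latest = values[:-1], values[-1]
    spread = pstdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is unusual.
        return latest != history[0]
    return abs(latest - mean(history)) > sigmas * spread
```

A single fixed threshold like this is noisy on seasonal metrics, which is one reason a real engine would combine several checks.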

Slide 44

Slide 44 text

Chronos
JavaScript Style Guide: How we write JavaScript

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

JavaScript Style Guide: How we write JavaScript
• Mostly opinions
• Helps with code reviews, onboarding
• Big team, one codebase => consistency matters
• Template for other developer teams to use
• +1,400 forks
• Only gets better over time

Slide 47

Slide 47 text

Chronos

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

Broken Data Pipeline: Before Chronos
• 10,000+ lines of Bash
• 25 sleep statements
• Single point of failure

Slide 50

Slide 50 text

A Better Way: Chronos
• Mesos as a common platform for data infra (Spark, Shark, Hadoop)
• Express and visualize dependencies
• Retries
• < 2,000 lines of Scala
• REST API
• Web UI for analysts
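
The contrast between "25 sleep statements" and "express dependencies, retries" can be sketched in a few lines. This Python example is a sketch of the general idea and does not show how Chronos works internally (Chronos is written in Scala and runs on Mesos). It runs jobs in dependency order instead of sleeping, and retries a failing job a bounded number of times. The function name `run_jobs`, its parameters, and the job names in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_jobs(jobs, dependencies, max_retries=2):
    """Run each job after its parents, retrying failures.

    jobs: dict of job name -> zero-argument callable.
    dependencies: dict of job name -> set of parent job names.
    Returns the job names in the order they completed.
    """
    # Include jobs that have no declared parents.
    graph = {name: dependencies.get(name, set()) for name in jobs}
    finished = []
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(max_retries + 1):
            try:
                jobs[name]()
                finished.append(name)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # out of retries: stop the pipeline
    return finished
```

With `dependencies={"transform": {"extract"}, "load": {"transform"}}`, `load` never starts before `transform` has succeeded, however long `transform` takes.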

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

No content

Slide 53

Slide 53 text

Open Source: The bad parts

Slide 54

Slide 54 text

Bad Examples: Where we wish we did things differently

Slide 55

Slide 55 text

Deployinator

Slide 56

Slide 56 text

Deployinator: The Rushed One
• Does all the deploys
• One button deploys
• Makes it easy to add new stacks
• Open Sourced during a conference in 2010
• We wanted to get it out there
• Had to rip too much stuff out to make it work internally
• Never took the time to get back in sync

Slide 57

Slide 57 text

Dashboards

Slide 58

Slide 58 text

Dashboards: The Deep Integration
• Powers all Etsy dashboards
• Makes it extremely easy to add new ones
• Supports Graphite and Ganglia
• Heavily tied to Etsy web code
• No clear abstraction
• Open sourced a snapshot
• Periodic updates

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

Logster: The Missed Opportunity
• Parses log lines and emits metrics
• Support for Graphite and Ganglia
• Individual parsers are just Python classes
• Open sourced without switching over to the public version
• Popular project so it diverged quickly
• Internal deploy not set up to sync easily
• Ongoing effort to run the Open Source version
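
"Individual parsers are just Python classes" suggests a shape like the one below: one method consumes a log line, another reports the aggregated metrics. This is a self-contained sketch that does not import Logster. The class name, method names, metric names, and log format are assumptions chosen for illustration, so check the project itself for its actual interface.

```python
import re


class HttpStatusParser:
    """Counts HTTP responses by status class from access-log lines."""

    # Matches the status code that follows the quoted request string.
    LINE = re.compile(r'" (?P<status>\d{3}) ')

    def __init__(self):
        self.counts = {"2xx": 0, "4xx": 0, "5xx": 0}

    def parse_line(self, line):
        """Update counters from one log line; unmatched lines are ignored."""
        match = self.LINE.search(line)
        if not match:
            return
        bucket = match.group("status")[0] + "xx"
        if bucket in self.counts:
            self.counts[bucket] += 1

    def get_state(self, duration_seconds):
        """Return per-second rates for the elapsed interval."""
        return {
            f"http_{bucket}_per_sec": count / duration_seconds
            for bucket, count in self.counts.items()
        }
```

A driver would feed each new log line to `parse_line`, then forward the result of `get_state` to Graphite or Ganglia.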

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

+750 jobs

Slide 63

Slide 63 text

+750 jobs. We only tested the UI with 300 jobs…

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

Internal Fork

Slide 66

Slide 66 text

Internal Fork
• Supporting 2 projects
• Maintenance priority

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

Back on track

Slide 69

Slide 69 text

Difficulties: Why open source?

Slide 70

Slide 70 text

Extra Work (photo: http://www.flickr.com/photos/allthosedetails/7665289260)

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

The fork is always the easier route

Slide 73

Slide 73 text

You might be solving a very specific problem

Slide 74

Slide 74 text

People, deal with it!

Slide 75

Slide 75 text

Experiences: Why Open Source?
• Enforces some good constraints around testability and separation of concerns
• Great for recruiting
• New hires are familiar with your tools
• Freenode/GitHub does a lot of work for you

Slide 76

Slide 76 text

Tips: Why Open Source?
• Have a Freenode IRC channel (#codeascraft)
• Have an active GitHub presence
• Think about Open Source right from the start
• Do not ever run an internal fork
• Seriously, don’t

Slide 77

Slide 77 text

Tooling Culture: Why is a tooling culture important?
• All about enablement
• Decrease the time and frustration spent waiting
• Engineer happiness++
• Fast feedback (continuous integration, continuous deployment)

Slide 78

Slide 78 text

Tips: How to build a tooling culture?
• Bootcamp with Tools teams
• Tools are for the people, not the individual
• Let others contribute and extend them
• Have an open ear/IRC channel
• Make it easy to create new tools

Slide 79

Slide 79 text

Build Internal Tools (photo: http://www.flickr.com/photos/zzpza/3269784239)

Slide 80

Slide 80 text

Enable a tooling culture (photo: http://www.flickr.com/photos/usacehq/4920584963)

Slide 81

Slide 81 text

Open source needs maintenance (item by LovelyLittleTrove)

Slide 82

Slide 82 text

Open source is great (item by SteelPetalPress)

Slide 83

Slide 83 text

Thanks! Questions?
Harrison Shoff (@hshoff) and Daniel Schauenberg (@mrtazz)