Mesos at Yelp: Building a production ready PaaS.
Rob Johnson
October 08, 2015
Transcript
Mesos at Yelp: Building a production ready PaaS
Rob Johnson
[email protected]
/@rob_johnson_
Who Am I:
- Rob Johnson
- Operations Team at Yelp
- Spend most of my time working on PaaSTA
Yelp’s Mission: Connecting people with great local businesses.
Yelp Stats: As of Q2 2015 83M 32 68% 83M
PaaSTA
Yelp’s homegrown Platform-as-a-Service
What’s the problem we’re trying to solve here?
- Yelp’s monolith is ~3 million LoC (that’s just the Python).*
- Increasing number of developers.
*as of 28/09/2015
- Code deployments become increasingly difficult to coordinate.
- Surface area for impact of a bug greatly increases.
What’s the solution?
SOA
Solves everything, right?
SOA: Round 1
- Statically defined list of hosts to deploy a service on.
- Operations handle deciding which hosts to deploy to.
- Manually configure Nagios for each service.
- Manual deployment system. Lots of rsync wrappers to push code around.
This doesn’t scale well.
PaaSTA
- Built on the shoulders of established tools.
- ‘Glue Code’ that coordinates these tools.
Components
Mesos
Marathon
Chronos (almost)
My work here is done, right?
Not Quite.
Services != Production
What makes a service production ready?
- easy deployment for developers
- discovery
- monitoring
- highly available
- operational support
Services at Yelp tend to be:
- HTTP API
- Python
- uWSGI
We want to be stack agnostic; developers shouldn’t be constrained
by dependencies on a server.
- PaaSTA only runs Docker containers.
- Developers own the creation of the image.
PaaSTA currently has Java, Golang and Python apps in production.
PaaSTA provides tooling to automate the build and deployment of
images via Jenkins.
PaaSTA uses Git as its control plane.
git push -> make itest -> push to registry -> performance check -> deploy to dev (repeat for each dev env) -> manual intervention -> prod
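The pipeline on this slide can be read as an ordered list of gates. The following is a minimal sketch of that idea, not PaaSTA's actual Jenkins tooling: the stage names come from the slide, while `run_pipeline`, the lambda stand-ins, and the environment names `dev-a` and `dev-b` are hypothetical.

```python
from typing import Callable, List, Tuple

# Stage names follow the slide; the callables are hypothetical stand-ins
# for the real build jobs.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage], dev_envs: List[str], approved: bool) -> List[str]:
    """Run stages in order; stop at the first failure or missing approval."""
    completed: List[str] = []
    for name, step in stages:
        if not step():
            return completed  # a failed stage blocks everything after it
        completed.append(name)
    for env in dev_envs:
        completed.append(f"deploy to {env}")  # repeated for each dev env
    if approved:  # manual intervention gates production
        completed.append("deploy to prod")
    return completed


stages = [
    ("make itest", lambda: True),
    ("push to registry", lambda: True),
    ("performance check", lambda: True),
]
result = run_pipeline(stages, ["dev-a", "dev-b"], approved=False)
```

Without approval, `result` ends at the last dev environment, so production is reached only after the manual step.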
Once a given image is marked for deployment in production,
PaaSTA ‘bounces’ the app, gracefully upgrading the version.
- Reduces operational overhead of deploying service.
- Removes bottleneck of going through operations to deploy.
Discovery
Smartstack
- Originally written by Airbnb
- Yelp now has maintainers working on it.
[Diagram, repeated across four build slides: services s1-s4 running on each of three hosts, each host with H, S and N boxes (presumably HAProxy, Synapse and Nerve), all connected to ZK (ZooKeeper).]
There’s no place like 127.0.0.1 169.254.255.254
Why Smartstack?
- ZK/synapse/nerve dying doesn’t wipe us out.
- HAProxy has its own health checking system we can fall back to.
- HAProxy is a proven load balancer and http proxy.
- We can use Smartstack with non-PaaSTA services.
Zero-downtime HAProxy reloads: http://bit.ly/1RsctGi
Monitoring
- API allows us to send event data.
- Flexibility to assign alerts to service authors, rather than forcing it on operations team.
$ cat monitoring.yaml
---
team: search_infra
notification_email: [email protected]
page: true
runbook: 'y/rb-myservice'
alert_after: 5m
realert_every: 10m
tip: 'The federator service is in the critical path for search, you should be fixing this'
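To make the two timing fields concrete, here is a minimal sketch of how `alert_after` and `realert_every` could interact. It assumes both are plain durations, with the first notification sent once a check has been failing for `alert_after` and repeats sent every `realert_every` after that. The slide does not spell out these semantics, and `parse_duration` and `notification_times` are hypothetical helpers, not part of PaaSTA.

```python
import re
from typing import List

_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a value such as '5m' into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def notification_times(failing_for: int, alert_after: str, realert_every: str) -> List[int]:
    """Seconds since the first failure at which a notification would be sent.

    Assumed semantics: first alert at `alert_after`, then one every
    `realert_every` for as long as the check keeps failing.
    """
    first = parse_duration(alert_after)
    step = parse_duration(realert_every)
    return list(range(first, failing_for + 1, step))


# A check that has been failing for 30 minutes, using the values from the slide.
times = notification_times(30 * 60, alert_after="5m", realert_every="10m")
```

Under these assumptions a 30-minute outage produces notifications at 5, 15 and 25 minutes.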
./check_marathon_services_replication
./check_hung_setup_marathon_jobs
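The slides do not show what these checks do internally. As an illustration of what a replication check might look like, the sketch below compares available instances against the expected count and returns a conventional Nagios/Sensu-style status code. The function name, the 50% threshold, and the service name `myservice.main` are assumptions for the example.

```python
from typing import Tuple

OK, CRITICAL = 0, 2  # conventional Nagios/Sensu check exit codes


def check_replication(service: str, expected: int, available: int,
                      crit_threshold: float = 0.5) -> Tuple[int, str]:
    """Flag a service whose available instances fall below a fraction of expected."""
    if expected <= 0:
        return OK, f"{service}: no instances expected"
    ratio = available / expected
    # Hypothetical policy: critical when fewer than half the instances are up.
    status = CRITICAL if ratio < crit_threshold else OK
    return status, f"{service}: {available}/{expected} instances available"


status, message = check_replication("myservice.main", expected=4, available=1)
```

The returned message is the kind of event data that could then be routed to the team named in `monitoring.yaml`.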
Highly available
Yelp organises machines into latency zones.
Superregion Region Habitat
$ cat smartstack.yaml
---
main:
  advertise: [superregion]
  discover: superregion
  proxy_port: 20603
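From a client's point of view, a `proxy_port` like this means the service is reached through the host-local HAProxy rather than by looking up backends directly. A minimal sketch, assuming the proxy listens on the 169.254.255.254 address shown earlier in the deck; `service_url` and the `/status` path are hypothetical.

```python
from urllib.parse import urljoin

# Assumption: the host-local HAProxy is reachable on this link-local address.
PROXY_HOST = "169.254.255.254"


def service_url(proxy_port: int, path: str = "/") -> str:
    """Build a URL that reaches a service through the local proxy port."""
    return urljoin(f"http://{PROXY_HOST}:{proxy_port}", path)


url = service_url(20603, "/status")
```

The caller only needs to know the port; which backend actually answers is decided by the proxy.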
By choosing a more specific latency zone, service owners optimize
for RTT over availability.
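The RTT versus availability trade-off can be seen in a small sketch: the narrower the `discover` level, the fewer backends a client can see. The host names and zone names below are made up, and `discoverable` is a hypothetical helper rather than Smartstack's implementation.

```python
from typing import Dict, List

Location = Dict[str, str]  # keys: "superregion", "region", "habitat"


def discoverable(client: Location, backends: Dict[str, Location], discover: str) -> List[str]:
    """Return the backends that share the client's value at the `discover` level."""
    return sorted(
        host for host, loc in backends.items()
        if loc[discover] == client[discover]
    )


client = {"superregion": "sr1", "region": "r1", "habitat": "h1"}
backends = {
    "host-a": {"superregion": "sr1", "region": "r1", "habitat": "h1"},
    "host-b": {"superregion": "sr1", "region": "r1", "habitat": "h2"},
    "host-c": {"superregion": "sr1", "region": "r2", "habitat": "h3"},
}
wide = discoverable(client, backends, "superregion")   # all three hosts
narrow = discoverable(client, backends, "habitat")     # only the nearest host
```

With `discover: habitat` the client talks only to `host-a`, which keeps round trips short but leaves no fallback if that host goes away.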
- By being aware of these latency zones, PaaSTA can
make smarter decisions on how to constrain applications.
Without this coupling, Marathon wouldn’t balance apps evenly amongst the
latency zones.
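One way to express this kind of balancing is Marathon's `GROUP_BY` constraint, which spreads tasks evenly across the values of an agent attribute. The sketch below builds such an app definition fragment. It assumes the Mesos agents expose an attribute named after the latency zone level (here `habitat`); the app id and zone names are hypothetical, and the slides do not confirm that PaaSTA generates exactly this.

```python
import json
from typing import Dict, List


def zone_constraints(level: str, zone_count: int) -> List[List[str]]:
    """Build a Marathon GROUP_BY constraint over an agent attribute."""
    return [[level, "GROUP_BY", str(zone_count)]]


def app_definition(app_id: str, instances: int, level: str, zones: List[str]) -> Dict:
    """Return a partial Marathon app definition balanced across the given zones."""
    return {
        "id": app_id,
        "instances": instances,
        "constraints": zone_constraints(level, len(set(zones))),
    }


app = app_definition("/myservice.main", 6, "habitat", ["h1", "h2", "h3"])
print(json.dumps(app, indent=2))
```

The glue code knows how many zones exist at the chosen level, which is the information Marathon lacks on its own.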
Operational support
PaaSTA comes with a CLI for managing PaaSTA services.
Questions?
@YelpEngineering YelpEngineers engineeringblog.yelp.com github.com/yelp