Building systems with etcd @ SCALE x14
Brandon Philips
January 24, 2016
Transcript
Fault Tolerant Infrastructure Building Systems with etcd @coreoslinux @brandonphilips
Brandon Philips CTO, CoreOS github.com/philips
What is CoreOS?
The smartest way to run your container infrastructure. tectonic.com @tectonic
QUAY Secure hosting for private Docker repositories quay.io @quayio
Why build CoreOS?
you
you as a sw engineer
your
with Ada.Text_IO;
procedure Hello_World is
   use Ada.Text_IO;
begin
   Put_Line("Hello, world!");
end;

#include <stdio.h>
int main() { printf("Hello, world!\n"); }

package main
import "fmt"
func main() { fmt.Println("Hello, world!") }
your container image
your /bin/java /opt/app.jar /lib/libc
your /bin/python /opt/app.py /lib/libc
your com.example.app d474e8c57737625c
your d474e8c57737625c Signed By: Alice
you as an ops engineer
your com.example.webapp x3
your ??? com.example.webapp x3
How do we do it?
architecture in practice cluster operations
worker kubelet worker kubelet worker kubelet scheduler & API worker kubelet worker kubelet worker kubelet
machine configuration OS operations
distributed configuration cluster operations
github.com/philips/hacks/tree/master/etcd-demos
etcd
/etc distributed
open source software, failure tolerant, durable, watchable, exposed via HTTP, runtime reconfigurable
Data Store API
-X GET: Get, Wait
-X PUT: Put, Create, CAS
-X DELETE: Delete, CAD
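The verb table above can be sketched as a small request builder. This is a minimal sketch, assuming the etcd v2 keys API of that era (`/v2/keys/<key>` with the `wait`, `prevExist`, and `prevValue` query parameters) and a local member on port 2379. The `BASE` address and the helper names are illustrative assumptions, not part of the talk. The helpers only build requests and never contact a server.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: a local etcd member serving the v2 keys API on port 2379.
BASE = "http://127.0.0.1:2379/v2/keys"


def build(method, key, value=None, **params):
    """Build (but do not send) a request against the v2 keys API."""
    url = f"{BASE}/{quote(key.lstrip('/'))}"
    if params:
        url += "?" + urlencode(params)
    # Values travel as a form-encoded body, as with `curl -d value=...`.
    data = urlencode({"value": value}).encode() if value is not None else None
    return Request(url, data=data, method=method)


def get(key):
    return build("GET", key)


def wait(key):
    # Long-poll until the key changes.
    return build("GET", key, wait="true")


def put(key, value):
    return build("PUT", key, value)


def create(key, value):
    # Succeeds only if the key does not exist yet.
    return build("PUT", key, value, prevExist="false")


def cas(key, old, new):
    # Compare-and-swap: write `new` only if the current value is `old`.
    return build("PUT", key, new, prevValue=old)


def delete(key):
    return build("DELETE", key)


def cad(key, old):
    # Compare-and-delete: delete only if the current value is `old`.
    return build("DELETE", key, prevValue=old)
```

To actually send one of these, pass it to `urllib.request.urlopen`, for example `urlopen(put("message", "hello"))`, against a running etcd member.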
clusters etcd basics
Typical Cluster Leader Follower
API etcd basics
fault tolerance etcd basics
Available Leader Follower
Unavailable Leader Follower
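The available and unavailable states on these slides follow from majority quorum. A minimal sketch of the arithmetic, assuming the usual Raft rule that a cluster can make progress only while more than half of its members are reachable (the function names are illustrative):

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can fail while a majority remains."""
    return members - quorum(members)


def is_available(members: int, failed: int) -> bool:
    """True while the surviving members still form a majority."""
    return members - failed >= quorum(members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(n, "members: quorum", quorum(n), "tolerates", failures_tolerated(n))
```

Note that a four-member cluster tolerates no more failures than a three-member one, which is why odd cluster sizes are the usual choice.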
leader fault tolerance etcd basics
Available Leader Follower
Temporarily Unavailable Leader Follower
Available Leader Follower
Unavailable Leader Follower
wal, snapshots, backups etcd durability
discovery, static etcd bootstrap
$ curl discovery.etcd.io/new?size=5
discovery.etcd.io/6eadeac2
discovery
Leader Follower discovery
live addition and removal etcd reconfig
Leader Follower
etcd apps
reboot locksmith etcd apps
Cluster Wide Reboot Lock
• Need to reboot? Decrement the semaphore key atomically with etcd.
• manager.Reboot() and wait...
• After reboot, increment the semaphore key in etcd atomically.
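The reboot lock steps above can be sketched with compare-and-swap. This is a minimal sketch, not locksmith's implementation: `FakeStore` is an in-memory stand-in for etcd, the key name `reboot-semaphore` is hypothetical, and the semaphore is modeled as a plain integer counting free reboot slots.

```python
import threading


class FakeStore:
    """In-memory stand-in for etcd offering put, get, and compare-and-swap."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data[key]

    def cas(self, key, old, new):
        """Set key to new only if it currently equals old; report success."""
        with self._lock:
            if self._data.get(key) != old:
                return False
            self._data[key] = new
            return True


SEMAPHORE = "reboot-semaphore"  # hypothetical key name


def acquire(store):
    """Take one reboot slot; return False when none are free."""
    while True:
        free = store.get(SEMAPHORE)
        if free == 0:
            return False
        if store.cas(SEMAPHORE, free, free - 1):
            return True
        # Another machine changed the key first: re-read and retry.


def release(store):
    """Give the slot back after the reboot has finished."""
    while True:
        free = store.get(SEMAPHORE)
        if store.cas(SEMAPHORE, free, free + 1):
            return
```

The retry loop is what makes the decrement atomic: a write lands only if nobody else changed the key between the read and the write.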
skydns etcd apps
vulcand etcd apps
confd etcd apps
pulling it together kubernetes
k8s/mesos/etc scheduler scheduling
getting work to servers scheduling
$ scp app host:/opt
$ ssh host systemd-run /opt/app
$ fab deploy:app
$ fab deploy:collector-app
$ fab deploy deploy:collector-app
$ fab lowest-loadaverage
host1
$ fab -H host1 deploy:job
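The manual step shown here, asking for the least loaded host and then deploying to it, is already a tiny scheduler. A minimal sketch of that decision, assuming load averages have been collected into a dict beforehand (the host names and numbers in the usage note are made up; only the `fab -H <host> deploy:job` command form comes from the slide):

```python
def lowest_loadaverage(loads):
    """Return the host with the smallest load average."""
    if not loads:
        raise ValueError("no hosts to choose from")
    return min(loads, key=loads.get)


def deploy_command(host, task="deploy:job"):
    """Build the fab invocation from the slide for the chosen host."""
    return ["fab", "-H", host, task]
```

For example, `deploy_command(lowest_loadaverage({"host1": 0.2, "host2": 1.5}))` targets `host1`. A real scheduler replaces this one-off choice with a continuous loop over desired and current state.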
You → Scheduler API → Scheduler → Machine(s)
while true {
  todo = diff(desState, curState)
  schedule(todo)
}
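The loop above can be made concrete under some simplifying assumptions. In this sketch, state is a dict mapping an app name to a copy count, `schedule` just mutates the current state instead of starting real work, and the loop is bounded by a hypothetical `max_rounds` so that it terminates. None of these details come from the talk.

```python
def diff(desired, current):
    """Return {app: delta}; positive means start copies, negative means stop."""
    todo = {}
    for app in set(desired) | set(current):
        delta = desired.get(app, 0) - current.get(app, 0)
        if delta != 0:
            todo[app] = delta
    return todo


def schedule(todo, current):
    """Apply the deltas to the current state (stand-in for real scheduling)."""
    for app, delta in todo.items():
        count = current.get(app, 0) + delta
        if count == 0:
            current.pop(app, None)
        else:
            current[app] = count


def reconcile(desired, current, max_rounds=10):
    """Drive the current state toward the desired state."""
    for _ in range(max_rounds):
        todo = diff(desired, current)
        if not todo:
            break  # Nothing left to do: the states match.
        schedule(todo, current)
    return current
```

The design point is that the scheduler never executes a script of steps. It only compares two states, so it recovers from failures the same way it handles new requests.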
dns, LBs, k8s labels services
flexible service discovery k8s labels
pod env=dev app=web
pod env=test app=web
pod env=prod app=web
service test.example.com select(env=dev,app=web)
service beta.example.com select(env=test,app=web) OR select(env=prod,app=web)
service example.com select(env=prod,app=web)
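The selectors on this slide can be sketched as plain label matching. This is a minimal sketch, not the Kubernetes API: pods are dicts, the pod names are hypothetical, and `select` mirrors the `select(...)` notation from the slide.

```python
# Hypothetical pod names; the labels are the ones on the slide.
PODS = [
    {"name": "web-dev", "labels": {"env": "dev", "app": "web"}},
    {"name": "web-test", "labels": {"env": "test", "app": "web"}},
    {"name": "web-prod", "labels": {"env": "prod", "app": "web"}},
]


def select(pods, **labels):
    """Return the pods that carry every requested label with the same value."""
    return [
        pod
        for pod in pods
        if all(pod["labels"].get(k) == v for k, v in labels.items())
    ]


def names(pods):
    return [pod["name"] for pod in pods]


# example.com serves only production pods.
example_com = select(PODS, env="prod", app="web")

# beta.example.com is the OR of two selectors.
beta_example_com = select(PODS, env="test", app="web") + select(
    PODS, env="prod", app="web"
)
```

Because a service is defined by a query rather than a list of machines, pods can come and go without the service definition changing.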
github.com/coreos/coreos-kubernetes
etcd.ngrok.io
worker kubelet worker kubelet scheduler & API
worker & API works on 1 node too
coreos.com/careers work with us
@coreoslinux @tectonicstack @brandonphilips thank you