Simply Distributed
Nugroho Herucahyono
October 22, 2015
Transcript
Simply Distributed KNIF 2015, Bandung
Who? Nugroho Herucahyono @xinuc Programmer @Bukalapak
System Reliability in Supporting Service Delivery
“Andal” (Indonesian for “dependable”) => reliable & scalable
reliable: fault tolerant; scalable: able to grow
How is a reliable & scalable system built?
Most systems start small
Typical web application (diagram): Client, Webserver, Database
Typical web application
• Need more features
• Serve more users
• Need to be more reliable
More features: add more code, split the system
More users: need to scale, hit machine limitations, add more machines
More business value: the system needs to be more reliable, so it should be fault tolerant and self-healing, with backup and redundancy
How do we do it? Current “best practice”:
• Split the system into smaller services
• Communicate over HTTP
• Scale independently
• Gracefully handle failure
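To make “gracefully handle failure” concrete, here is a minimal Python sketch. It is not from the talk: the helper `call_with_fallback` and the stand-in `flaky_search` are hypothetical names, and a real service call would be an HTTP request rather than a local function.

```python
from typing import Callable


def call_with_fallback(call: Callable[[], str], fallback: str, retries: int = 2) -> str:
    """Try a remote call a few times, then degrade to a fallback value."""
    for _ in range(retries + 1):
        try:
            return call()
        except ConnectionError:
            continue  # the service may recover, so try again
    return fallback


def flaky_search() -> str:
    # Hypothetical stand-in for an HTTP request to a search service that is down.
    raise ConnectionError("search service unreachable")


print(call_with_fallback(flaky_search, fallback="search is temporarily unavailable"))
```

The caller keeps working with a degraded answer instead of crashing when one service is unreachable. Every service that talks to another service needs some version of this logic, which is part of the cost the following slides complain about.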
How do we do it? (architecture diagram: Client, Load Balancer, Authentication, Content Management, Search, Search Engine, Scheduler, Transaction, Job Queue, Database 1, Database 2)
How do we do it? Current “best practice” apparently is not the best:
• Requires massive changes to our system
• Manual load balancing, replication
• Manual resource management
• Inefficient communication (HTTP? really?)
How do we do it? (architecture diagram: Client, Load Balancer, Authentication, Content Management, Search, Search Engine, Scheduler, Transaction, Job Queue, Database 1, Database 2) Too complicated!!
What would a good computer scientist do?
Introduce a new layer of abstraction!
A new layer of abstraction
• Handle resource management
• Handle load balancing
• Handle service communication
• Handle service failure
• Handle replication
A new layer of abstraction: we need an “operating system” for a cluster
A new layer of abstraction (diagram): a Cluster Operating System spanning three machines, each with its own Hardware, Operating System, Pod and Application
Cluster Operating System
• Built-in interprocess communication
• Built-in monitoring & supervision
• Automatic load balancing
• Automatic resource management
• Scale with little / no system modification
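As a small illustration of what “automatic load balancing” would have to do for us, here is a round-robin sketch in Python. The instance names are made up, and round-robin is only one possible policy; the talk does not say which policy a cluster operating system should use.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(instances: List[str]) -> Iterator[str]:
    """Hand out instances in turn, forever."""
    return cycle(instances)


# Hypothetical application instances running somewhere in the cluster.
picker = round_robin(["app-1", "app-2", "app-3"])
assigned = [next(picker) for _ in range(5)]
print(assigned)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```

The point of the wish list is that the application should never contain code like this; the layer underneath should.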
What do we have now?
• Erlang VM & OTP
• Docker, Kubernetes
Erlang VM & OTP (diagram): node 1 and node 2, each running an Erlang VM that hosts Erlang processes
Erlang VM & OTP: OTP Supervision Tree (diagram): two Supervisors and five Workers
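The idea behind a supervision tree can be sketched in a few lines of Python. This is not OTP and not Erlang; the `Supervisor` class and `make_flaky_worker` helper are hypothetical, and the only thing borrowed from OTP is the idea that a supervisor restarts a crashed worker and gives up after too many restarts.

```python
from typing import Callable, Dict


class Supervisor:
    """Restarts a child whenever it crashes, up to max_restarts times."""

    def __init__(self, max_restarts: int = 3) -> None:
        self.max_restarts = max_restarts
        self.restarts: Dict[str, int] = {}

    def run(self, name: str, worker: Callable[[], str]) -> str:
        self.restarts[name] = 0
        while True:
            try:
                return worker()
            except Exception:
                if self.restarts[name] >= self.max_restarts:
                    raise  # give up and let a parent supervisor decide
                self.restarts[name] += 1


def make_flaky_worker(failures: int) -> Callable[[], str]:
    """Hypothetical worker that crashes `failures` times before succeeding."""
    state = {"left": failures}

    def worker() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("worker crashed")
        return "done"

    return worker


sup = Supervisor(max_restarts=3)
result = sup.run("worker-1", make_flaky_worker(2))
print(result, sup.restarts["worker-1"])  # done 2
```

When the restart limit is exceeded the error propagates upward, which is where a parent supervisor in a tree would take over.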
Erlang VM & OTP
• Built-in interprocess communication √
• Built-in monitoring & supervision √
• Automatic load balancing X
• Automatic resource management X
• Scale with little / no system modification √
Erlang VM & OTP
• The building block is too low level? (Erlang processes)
• Your application needs to be written in Erlang (or another Erlang VM language)
Docker
• Like a virtual machine, but much lighter
• Encapsulates our application into a single “executable”
• Removes dependency and development-vs-production headaches
Docker (diagram): a Server running a Host OS and Docker, with three Containers
Kubernetes
• Manages & monitors containers
• Resource allocation between containers
Kubernetes (diagram): Kubernetes spans Node 1 and Node 2; each node runs a Host OS and Docker, with a Pod holding two Containers
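A toy version of “resource allocation between containers” might look like the Python sketch below. It uses a first-fit rule, which is an assumption made for illustration and not how the real Kubernetes scheduler decides; the `Node` class, the `place` function and all the numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu: int
    free_mem_gb: int
    containers: List[str] = field(default_factory=list)


def place(nodes: List[Node], container: str, cpu: int, mem_gb: int) -> Optional[str]:
    """Put the container on the first node with enough free CPU and memory."""
    for node in nodes:
        if node.free_cpu >= cpu and node.free_mem_gb >= mem_gb:
            node.free_cpu -= cpu
            node.free_mem_gb -= mem_gb
            node.containers.append(container)
            return node.name
    return None  # no node has room


nodes = [Node("node-1", free_cpu=4, free_mem_gb=8), Node("node-2", free_cpu=4, free_mem_gb=8)]
print(place(nodes, "web", cpu=3, mem_gb=4))     # node-1
print(place(nodes, "search", cpu=2, mem_gb=4))  # node-2
print(place(nodes, "batch", cpu=8, mem_gb=4))   # None
```

Note the last call: a container that needs 8 CPUs cannot be placed even though the cluster has spare capacity in total, because no single node is big enough. That limit is what the later “single node” slides want to remove.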
Docker & Kubernetes
• Built-in interprocess communication X
• Built-in monitoring & supervision √
• Automatic load balancing √
• Automatic resource management √
• Scale with little / no system modification X
Docker & Kubernetes
• No built-in interprocess communication
• Still have to modify the system (split into smaller services)
• Too complicated
Can we do better?
Let’s zoom out a bit
• Service vs Process
• Node vs Core
They’re conceptually the same
Maybe we can push down the abstraction layer?
What if our “cluster operating system” were a real operating system?
We need a real “Distributed Operating System”
Distributed Operating System (diagram): one Application on one Operating System, running on top of three Hardware boxes
Distributed Operating System
• Encapsulates multiple machines as a single node
• Transparent from the user / application point of view
• Handles load balancing, replication & distribution automatically
• Better yet, if we can add more machines on the fly
Is it possible? I have no idea.
We’ve done something similar
• RAID
• Multiple disks, single volume
• Transparent to applications
• Automatic failure handling & replication
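The RAID analogy can be sketched as a toy mirrored volume in Python. The slide does not name a RAID level, so mirroring (RAID 1 style) is an assumption here, and the `MirroredVolume` class is hypothetical; a real RAID layer works on disk blocks, not dictionary keys.

```python
from typing import Dict, List, Optional


class MirroredVolume:
    """Toy mirror: every write goes to all live disks, reads skip failed disks."""

    def __init__(self, disk_count: int) -> None:
        self.disks: List[Optional[Dict[str, bytes]]] = [{} for _ in range(disk_count)]

    def write(self, key: str, data: bytes) -> None:
        for disk in self.disks:
            if disk is not None:
                disk[key] = data

    def fail(self, index: int) -> None:
        self.disks[index] = None  # simulate a dead disk

    def read(self, key: str) -> bytes:
        for disk in self.disks:
            if disk is not None and key in disk:
                return disk[key]
        raise IOError("all replicas lost")


volume = MirroredVolume(disk_count=2)
volume.write("report", b"hello")
volume.fail(0)                # one disk dies
print(volume.read("report"))  # b'hello'
```

The caller only ever sees `write` and `read` on one volume. Replication and failure handling happen underneath, which is the property the talk wants for CPU and memory.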
We need RAID for CPU & Memory
Or maybe we can push it down further, to the hardware level?
We need a real “Distributed Motherboard” :D
Distributed Operating System (diagram): Application, Operating System, Hardware
Distributed Motherboard
• Node 1, 32 Cores, 32 GB RAM
• Node 2, 32 Cores, 32 GB RAM
• Detected by the operating system as 1 Node, 64 Cores, 64 GB RAM
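The arithmetic on this slide, written out as a small Python sketch. The `Machine` type and `as_single_node` function are hypothetical names; the sketch only shows the view the operating system would get, not how hardware could actually provide it.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Machine:
    cores: int
    ram_gb: int


def as_single_node(machines: Iterable[Machine]) -> Machine:
    """Present several machines as one node by summing their resources."""
    machines = list(machines)
    return Machine(
        cores=sum(m.cores for m in machines),
        ram_gb=sum(m.ram_gb for m in machines),
    )


combined = as_single_node([Machine(32, 32), Machine(32, 32)])
print(combined)  # Machine(cores=64, ram_gb=64)
```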
Distributed Motherboard
• We can add more nodes on the fly
• Motherboards communicate with each other
• They abstract their resources as a SINGLE NODE
Again, is it possible? I have no idea.
We’ve done that too
• Hardware RAID controller
• Multiple disks, detected as a single piece of hardware
• Transparent to the operating system & applications
Too much wishful thinking?
Why does it matter?
Why does it matter? Building a scalable & reliable system is a SOLVED problem. We already have Google, Facebook, etc. as proof.
Why does it matter?
• A scalable & reliable system is neither easy nor cheap
• It needs a group of highly skilled experts to build
Case Study: WhatsApp
Case Study: WhatsApp
• WhatsApp uses Erlang VM & OTP
• They can scale it without adding too much complexity
Case Study: WhatsApp
Case Study: WhatsApp We need more companies like WhatsApp
Small startups? Can a startup of four fresh graduates create a product used by a billion users?
Nonprofits? Can we create a nonprofit system that serves billions of users?
./bukalapak
more research on this, please :)
Thank you