How Patroni solved Database Reliability at Gojek
Kumar Abhijeet
March 28, 2024
Transcript
How Patroni solved Database Reliability at Gojek. Kumar Abhijeet, Cloud Platforms.
Fabricating DBaaS@Gojek; DevOps/Platforms; home gym owner; budding musician.
Agenda: Gojek scale and microservices; databases and reliability; Patroni and five 9s of availability; deep dive into Patroni; managing Patroni in production (lessons and experiences).
~600 microservices running in production; ~400 of them have databases.
600k RPM; 12,000 WALs/hour.
18Bn record inserts/month; 85Bn records fetched/month.
Will a conventional master-slave PostgreSQL system be able to support
country-level scale?
[Diagram: API Traffic, LB, App Server Workloads, PostgreSQL VMs]
Cloud provider's compute uptime: >= 99.9% (< 8h 41m of downtime/year). Across multiple zones: >= 99.99% (< 52m of downtime/year).
Database Uptime ≅ App Uptime
Target: >= 99.999% uptime (less than 5m of downtime/year).
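The downtime budgets quoted on these slides follow directly from the availability percentages. A minimal sketch of the arithmetic, assuming a 365-day year (the function names are illustrative, not from the talk); under that assumption the results land close to the slide's rounded figures:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the yearly downtime (in minutes) allowed at a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


def format_budget(minutes: float) -> str:
    """Render a number of minutes as hours and minutes."""
    hours, rem = divmod(minutes, 60)
    return f"{int(hours)}h {rem:.1f}m"


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% -> {format_budget(budget)} of downtime/year")
```

Each additional 9 divides the budget by ten, which is why moving from 99.99% to 99.999% leaves only about five minutes per year for every failover and maintenance event combined.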
[Diagram: API Traffic, LB, App Server Workloads, PostgreSQL VMs (New Master, Old Master, Replica)]
[Diagram: the same topology, with nodes annotated shared_buffers=16MB and shared_buffers=2GB]
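The differing shared_buffers values on this slide read as configuration drift between nodes, which only becomes visible once a differently configured node is promoted. A minimal sketch of detecting such drift before a failover, assuming each node's settings have already been collected into a dictionary (the node names, the extra max_connections parameter, and which node holds which value are hypothetical):

```python
def find_config_drift(primary: dict, replica: dict) -> dict:
    """Return {parameter: (primary_value, replica_value)} for every mismatch."""
    drift = {}
    for name in sorted(set(primary) | set(replica)):
        primary_value = primary.get(name)
        replica_value = replica.get(name)
        if primary_value != replica_value:
            drift[name] = (primary_value, replica_value)
    return drift


# Hypothetical settings; the slide does not say which node has which value.
old_master = {"shared_buffers": "2GB", "max_connections": "500"}
replica = {"shared_buffers": "16MB", "max_connections": "500"}

if __name__ == "__main__":
    for name, (p, r) in find_config_drift(old_master, replica).items():
        print(f"{name}: primary={p} replica={r}")
```

A promoted replica that still runs with a much smaller shared_buffers would serve the same traffic with far less cache, so a check like this is most useful when run continuously rather than after the failover has happened.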
Enter Patroni!
Patroni: open source and actively maintained by Zalando. Converts PostgreSQL systems into highly available, fault-tolerant, disaster-ready systems.
Patroni: almost instantaneous failovers (~1-2s); way cheaper than running managed DB solutions; cluster management made easy; multi-region HA deployments.
HA Loop Flow
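The slide's flow diagram is not preserved in the transcript. As a rough illustration of the general idea behind an HA loop, the sketch below has each node periodically try to acquire or renew a leader key with a TTL in a distributed configuration store (DCS). It is a deliberately simplified model, not Patroni's actual implementation: `FakeDCS`, `Node`, and the timings are all hypothetical, and the in-memory store stands in for a real DCS such as Consul.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeDCS:
    """In-memory stand-in for a DCS: a single leader key with a TTL."""
    ttl: float
    leader: Optional[str] = None
    expires_at: float = 0.0

    def current_leader(self, now: float) -> Optional[str]:
        # The key counts as absent once its TTL has elapsed.
        return self.leader if now < self.expires_at else None

    def acquire_or_renew(self, node: str, now: float) -> bool:
        # Succeeds only if the key is free or already held by this node.
        if self.current_leader(now) in (None, node):
            self.leader, self.expires_at = node, now + self.ttl
            return True
        return False


@dataclass
class Node:
    name: str
    role: str = "replica"
    healthy: bool = True

    def run_cycle(self, dcs: FakeDCS, now: float) -> None:
        if not self.healthy:
            return  # a crashed node stops renewing, so its key expires
        self.role = "primary" if dcs.acquire_or_renew(self.name, now) else "replica"


def demo() -> list:
    """Run four loop iterations and record node-b's role after each one."""
    dcs = FakeDCS(ttl=30.0)
    a, b = Node("node-a"), Node("node-b")
    history = []
    for now in (0.0, 10.0, 20.0, 45.0):
        if now >= 20.0:
            a.healthy = False  # simulate node-a crashing after t=10
        a.run_cycle(dcs, now)
        b.run_cycle(dcs, now)
        history.append(b.role)
    return history


if __name__ == "__main__":
    print(demo())
```

In this model node-b takes over only after node-a's key has expired, so the TTL and the loop interval together bound how long the cluster can go without a primary.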
Downtime in Seconds ≈ 0.0000315576
Patroni at Gojek: 200+ clusters running in production; ~60 TB of data flows in/out every day; guarantees less than 10 MB of data loss; Consul as DCS and service discovery; IaC everywhere!
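The talk does not say how the 10 MB bound on data loss is enforced. One plausible mechanism is a cap on how far a replica may lag and still be promoted; Patroni exposes a `maximum_lag_on_failover` setting (in bytes) for this purpose, though whether Gojek relies on it is an assumption. A minimal sketch of the selection logic, with hypothetical replica names and lag values:

```python
MAX_LAG_BYTES = 10 * 1024 * 1024  # assumption: the 10 MB bound from the slide


def eligible_for_promotion(replica_lag_bytes: dict,
                           max_lag_bytes: int = MAX_LAG_BYTES) -> list:
    """Return replicas whose lag is within the bound, least-lagged first."""
    within_bound = [
        name for name, lag in replica_lag_bytes.items() if lag <= max_lag_bytes
    ]
    return sorted(within_bound, key=lambda name: replica_lag_bytes[name])


# Hypothetical lag readings, in bytes behind the primary's WAL position.
lags = {
    "replica-1": 512 * 1024,
    "replica-2": 64 * 1024 * 1024,
    "replica-3": 0,
}

if __name__ == "__main__":
    print(eligible_for_promotion(lags))
```

With asynchronous replication a cap like this trades availability for durability: if every replica is beyond the bound, no candidate is eligible and automatic failover does not proceed.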
Patroni at Gojek: TF modules for provisioning, Chef for configuration; sync/async replication choices; all-round observability; secure and granular role-based access; PR-based workflow for infra provisioning.
Thank you!