Achieving repeatable, extensible and self serve infrastructure
Tasdik Rahman
November 16, 2019
Transcript
Achieving repeatable, extensible and self serve infrastructure
tasdikrahman.me @tasdikrahman • Product Engineer @ Gojek • Contributor to oVirt • Backpacker • Weekend chef • Chelsea FC!!
What does Gojek do?
Ref: gojek.io
What am I gonna talk about?
Ref: shutterstock.com
Evolution of Infrastructure @ Gojek (Ref: shutterstock.com)
Travelling back in time
Rapid Demand
How to deal with it?
Central Infrastructure Team
Intent?
Abstract out Infrastructure For Product Teams
Outcome?
Adhoc requests
“Measure what is measurable, and make measurable what is not so” - Galileo (Credits: biography.com)
Service request tickets
Example service request in our ticket system by a team (names redacted)
Example service request to increase disk size (names redacted)
Number of service requests kept increasing with scale and more product groups coming in
Ref: gunshowcomic.com/648
How does one keep up with service requests?
Scale your team vertically and keep doing so
Sustainable?
Very hard to do, but mostly No
Eventually, we noticed we were becoming the bottleneck
Give access to someone from the product team?
Chances of Security loopholes
Ref: https://blog.codinghorror.com/the-broken-window-theory/
What do we do then?
Quick detour
Where did systems administration start?
Evolution of Automation at Gojek
• Scripts
• Chef-cookbooks
• Rundeck
• Deployment scripts
Problems with the earlier solutions
• Multiple ways around building and using automation
• Managing dependencies for the automation. Eg: people using gcloud/AWS
• Lack of convention leading to meagre contributions to automation from devs.
• Adhoc way of managing access to tools like terraform, knife leading to stray accidents.
• No central platform for automation.
Number of tickets getting created still not decreasing
Clearing infrastructure debts
Moving from maintenance to innovation mode
Making infrastructure boring for product teams
Proctor: Our automation orchestrator (Ref: github.com/gojek/proctor)
Installation
Helm all the way. Reference value: stable/proctor-service/values.yaml
Automation using proctor
Sample proc to increase disk
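The proc shown on these slides is not captured in the transcript. As a rough illustration only, here is a minimal Python sketch of the kind of logic a disk-resize script might wrap, assuming the disk lives on Google Compute Engine and is resized with `gcloud compute disks resize`. The function name, its arguments and the example values are hypothetical and are not taken from Proctor or from Gojek's procs.

```python
import shlex
from typing import List


def build_resize_command(disk: str, zone: str, current_gb: int, new_gb: int) -> List[str]:
    """Return the gcloud argv that grows `disk` to `new_gb` GB (hypothetical helper)."""
    if not disk or not zone:
        raise ValueError("disk and zone are required")
    if new_gb <= current_gb:
        # Persistent disks can only grow, so reject shrink or no-op requests early.
        raise ValueError(f"new size {new_gb}GB must be larger than current size {current_gb}GB")
    return [
        "gcloud", "compute", "disks", "resize", disk,
        f"--size={new_gb}GB",
        f"--zone={zone}",
        "--quiet",  # skip the interactive confirmation prompt
    ]


if __name__ == "__main__":
    # Example values are made up for illustration.
    argv = build_resize_command("orders-db-data", "asia-southeast1-a", 100, 200)
    print(shlex.join(argv))
```

Validating the request before any command is built is what lets a script like this be handed to product teams: a bad input fails fast instead of reaching the cloud API.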
Scripts can be added by developers and they get added to proctor after our review
Sample procs in our ecosystem
Demo
Profit?
Outcome of having proctor?
Decrease in number of tickets which were mechanical in nature
Having terraform inside CI
But before that
Creating the gcloud project
Sample directory structure
.gitlab-ci.yml for the gcloud project in GitLab
Plan and apply
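The pipeline definition from these slides is not in the transcript. The sketch below is an assumption about what a two-stage "plan, then apply" job could run, following the non-interactive flow described in HashiCorp's "Running Terraform in Automation" guide listed in the references. The function names, the `tfplan` file name and the example directory are hypothetical.

```python
import subprocess
from typing import List

PLAN_FILE = "tfplan"  # hypothetical name for the saved plan artifact


def plan_commands() -> List[List[str]]:
    """Commands for the plan stage: initialise, then save a plan to a file."""
    return [
        ["terraform", "init", "-input=false"],
        ["terraform", "plan", "-input=false", f"-out={PLAN_FILE}"],
    ]


def apply_commands() -> List[List[str]]:
    """Command for the apply stage: apply exactly the plan that was reviewed.

    Assumes the working directory (including .terraform and the plan file)
    is carried over from the plan stage, for example as a CI artifact.
    """
    return [["terraform", "apply", "-input=false", PLAN_FILE]]


def run_stage(commands: List[List[str]], workdir: str) -> None:
    """Run each command in `workdir`, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=workdir, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; the directory is made up.
    for argv in plan_commands() + apply_commands():
        print("projects/example-project $", " ".join(argv))
```

Applying a saved plan file, rather than re-planning at apply time, means the change that gets applied is the same one that was reviewed in the plan stage.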
Private terraform registry consisting of 90+ modules
Outcome?
Teams managing and provisioning their own infra with our best practices baked in terraform modules
OSS alternatives?
Reference: runatlantis.io/
Ideal state?
Ref: Google SRE book: Eliminating toil
Known caveats?
Deletion of infra
Teams forget what they are using
Lessons learnt?
Avoid premature automation
A high volume of service requests for product teams is a smell
No big bang changes
Documentation should go hand in hand; it affects productivity directly
Reduce the steps for onboarding to your tooling; the fewer, the better
Invisible infrastructure
Product managers in Infrastructure teams
Prioritizing innovation
Links and References
• https://github.com/gojek/proctor
• https://blog.gojekengineering.com/olympus-terraforming-repeatable-and-extensible-infrastructure-at-go-jek-42ad5b0a4f9a
• https://learn.hashicorp.com/terraform/development/running-terraform-in-automation
• https://lethain.com/product-management-infra-engineering/
@tasdikrahman tasdikrahman.me