Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Peris of Writing a PaaS
Search
Andrew Godwin
May 10, 2011
Programming
0
110
The Peris of Writing a PaaS
A talk I gave at London Devops in May of 2011.
Andrew Godwin
May 10, 2011
Tweet
Share
More Decks by Andrew Godwin
See All by Andrew Godwin
Reconciling Everything
andrewgodwin
1
310
Django Through The Years
andrewgodwin
0
200
Writing Maintainable Software At Scale
andrewgodwin
0
440
A Newcomer's Guide To Airflow's Architecture
andrewgodwin
0
350
Async, Python, and the Future
andrewgodwin
2
660
How To Break Django: With Async
andrewgodwin
1
720
Taking Django's ORM Async
andrewgodwin
0
720
The Long Road To Asynchrony
andrewgodwin
0
650
The Scientist & The Engineer
andrewgodwin
1
760
Other Decks in Programming
See All in Programming
生成AIコーディングとの向き合い方、AIと共創するという考え方 / How to deal with generative AI coding and the concept of co-creating with AI
seike460
PRO
1
340
第9回 情シス転職ミートアップ 株式会社IVRy(アイブリー)の紹介
ivry_presentationmaterials
1
250
AIエージェントはこう育てる - GitHub Copilot Agentとチームの共進化サイクル
koboriakira
0
460
なんとなくわかった気になるブロックテーマ入門/contents.nagoya 2025 6.28
chiilog
1
230
地方に住むエンジニアの残酷な現実とキャリア論
ichimichi
5
1.4k
『自分のデータだけ見せたい!』を叶える──Laravel × Casbin で複雑権限をスッキリ解きほぐす 25 分
akitotsukahara
1
580
エラーって何種類あるの?
kajitack
5
320
datadog dash 2025 LLM observability for reliability and stability
ivry_presentationmaterials
0
180
なぜ「共通化」を考え、失敗を繰り返すのか
rinchoku
1
570
AWS CDKの推しポイント 〜CloudFormationと比較してみた〜
akihisaikeda
3
320
なぜ適用するか、移行して理解するClean Architecture 〜構造を超えて設計を継承する〜 / Why Apply, Migrate and Understand Clean Architecture - Inherit Design Beyond Structure
seike460
PRO
1
700
ruby.wasmで多人数リアルタイム通信ゲームを作ろう
lnit
2
290
Featured
See All Featured
The Invisible Side of Design
smashingmag
300
51k
Designing for Performance
lara
609
69k
How to Think Like a Performance Engineer
csswizardry
24
1.7k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.5k
Rebuilding a faster, lazier Slack
samanthasiow
82
9.1k
Building Applications with DynamoDB
mza
95
6.5k
Building Adaptive Systems
keathley
43
2.6k
The Power of CSS Pseudo Elements
geoffreycrofte
77
5.8k
BBQ
matthewcrist
89
9.7k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
48
5.4k
Measuring & Analyzing Core Web Vitals
bluesmoon
7
490
Practical Orchestrator
shlominoach
188
11k
Transcript
The Perils of Writing a PaaS Andrew Godwin http://www.flickr.com/photos/jannem/2719976702/
Hi, I'm Andrew. Serial Python developer Django core committer Sysadmin
by night
We're ep.io Python Platform-as-a-Service Utility billing PostgreSQL, Redis, Celery, and
more
We built a… prototype. Me and Ben Firshman Three or
four days' hacking at DjangoCon Ran code, had simple deployment
The last 10%... A month or two of hibernation Went
part-time in December Private beta since February Public launch later this year
Why? Why not?
Why? Why not? Lack of good solutions Strong, technical team
Writing backend code is fun
It's a challenge We're still a closed beta 300+ apps,
on 4 servers Some people just have crazy code Security, security, security
Our Architecture
ep.io Cloud Request Sugar XML Response Code Magic
Balancer Runner Runner Runner App 1 App 2 App 3
App 2 App 4 App 1 Databases File Storage
Load Balancer Started with HaProxy Moved to custom Python loadbalancer
Still needs refinement
Runners Daemon on each machine Nginx + gunicorn for each
app instance Output captured, CPU time measured
Coordinator Analyses whole system Juggles apps between servers Detects dead
servers
PostgreSQL Normal PostgreSQL 9 install Daemon to read query logs,
make users
Redis Custom Redis loadbalancer/manager Starts processes on demand Handles multi-user
security
Upload Receiver SSH endpoint for git, hg, commands Wraps VCSs,
extracts uploaded files Creates filesystem images
Other Services Log aggregation UID assignment Calculate costs
Statistics Queued in Redis Consumed asynchronously Currently stored in Redis,
changing soon Graphed and profiled
Configuration Management Puppet for the simpler stuff Daemons handle complex
stuff Don't try to reinvent the wheel
Monitoring Nagios SaaS monitoring Nagios Emails, texts, pager Several custom
checks
Backups Currently just rdiff-backup Moving to btrfs snapshots + DRBD
HA is not a backup solution
Perils
Initial bad design (To be fair, it was a prototype)
Networks really aren't reliable (Well, EC2's, at least.)
Memory pressure is bad (Prepare to have a fallback. And
another.)
Raw file handles are… fun. (As is the PTY subsystem.
Be very careful.)
Write just enough automation (If a server dies, I now
just go and get a drink)
HaProxy doesn't like 500+ backends (it's not exactly common)
Single redundancy is only so good (and remember, HA is
not backups!)
Future Perils
Payment (Already underway, still hard)
Oversized Sites (we need to get a lot bigger first)
European Servers (people really do want them)
More Databases (how on earth do you measure MongoDB use?)
More Languages (easy to get it working, hard to polish)
The Potential Big Outage (quite useful as a motivational tool)
Thank you. Andrew Godwin @andrewgodwin
[email protected]