Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Peris of Writing a PaaS
Search
Andrew Godwin
May 10, 2011
Programming
0
110
The Peris of Writing a PaaS
A talk I gave at London Devops in May of 2011.
Andrew Godwin
May 10, 2011
Tweet
Share
More Decks by Andrew Godwin
See All by Andrew Godwin
Reconciling Everything
andrewgodwin
1
310
Django Through The Years
andrewgodwin
0
200
Writing Maintainable Software At Scale
andrewgodwin
0
440
A Newcomer's Guide To Airflow's Architecture
andrewgodwin
0
350
Async, Python, and the Future
andrewgodwin
2
660
How To Break Django: With Async
andrewgodwin
1
730
Taking Django's ORM Async
andrewgodwin
0
720
The Long Road To Asynchrony
andrewgodwin
0
660
The Scientist & The Engineer
andrewgodwin
1
760
Other Decks in Programming
See All in Programming
なぜ適用するか、移行して理解するClean Architecture 〜構造を超えて設計を継承する〜 / Why Apply, Migrate and Understand Clean Architecture - Inherit Design Beyond Structure
seike460
PRO
3
790
PHP 8.4の新機能「プロパティフック」から学ぶオブジェクト指向設計とリスコフの置換原則
kentaroutakeda
2
1k
Modern Angular with Signals and Signal Store:New Rules for Your Architecture @enterJS Advanced Angular Day 2025
manfredsteyer
PRO
0
250
AIエージェントはこう育てる - GitHub Copilot Agentとチームの共進化サイクル
koboriakira
0
690
Rubyでやりたい駆動開発 / Ruby driven development
chobishiba
1
750
PHPで始める振る舞い駆動開発(Behaviour-Driven Development)
ohmori_yusuke
2
430
Startups on Rails in Past, Present and Future–Irina Nazarova, RailsConf 2025
irinanazarova
0
210
High-Level Programming Languages in AI Era -Human Thought and Mind-
hayat01sh1da
PRO
0
840
イベントストーミング図からコードへの変換手順 / Procedure for Converting Event Storming Diagrams to Code
nrslib
2
1k
顧客の画像データをテラバイト単位で配信する 画像サーバを WebP にした際に起こった課題と その対応策 ~継続的な取り組みを添えて~
takutakahashi
3
790
TypeScriptでDXを上げろ! Hono編
yusukebe
3
700
「テストは愚直&&網羅的に書くほどよい」という誤解 / Test Smarter, Not Harder
munetoshi
0
190
Featured
See All Featured
XXLCSS - How to scale CSS and keep your sanity
sugarenia
248
1.3M
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
8
700
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
47
9.6k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
750
[RailsConf 2023] Rails as a piece of cake
palkan
55
5.7k
GitHub's CSS Performance
jonrohan
1031
460k
Fantastic passwords and where to find them - at NoRuKo
philnash
51
3.3k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
357
30k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
8
830
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
50
5.5k
Raft: Consensus for Rubyists
vanstee
140
7k
4 Signs Your Business is Dying
shpigford
184
22k
Transcript
The Perils of Writing a PaaS Andrew Godwin http://www.flickr.com/photos/jannem/2719976702/
Hi, I'm Andrew. Serial Python developer Django core committer Sysadmin
by night
We're ep.io Python Platform-as-a-Service Utility billing PostgreSQL, Redis, Celery, and
more
We built a… prototype. Me and Ben Firshman Three or
four days' hacking at DjangoCon Ran code, had simple deployment
The last 10%... A month or two of hibernation Went
part-time in December Private beta since February Public launch later this year
Why? Why not?
Why? Why not? Lack of good solutions Strong, technical team
Writing backend code is fun
It's a challenge We're still a closed beta 300+ apps,
on 4 servers Some people just have crazy code Security, security, security
Our Architecture
ep.io Cloud Request Sugar XML Response Code Magic
Balancer Runner Runner Runner App 1 App 2 App 3
App 2 App 4 App 1 Databases File Storage
Load Balancer Started with HaProxy Moved to custom Python loadbalancer
Still needs refinement
Runners Daemon on each machine Nginx + gunicorn for each
app instance Output captured, CPU time measured
Coordinator Analyses whole system Juggles apps between servers Detects dead
servers
PostgreSQL Normal PostgreSQL 9 install Daemon to read query logs,
make users
Redis Custom Redis loadbalancer/manager Starts processes on demand Handles multi-user
security
Upload Receiver SSH endpoint for git, hg, commands Wraps VCSs,
extracts uploaded files Creates filesystem images
Other Services Log aggregation UID assignment Calculate costs
Statistics Queued in Redis Consumed asynchronously Currently stored in Redis,
changing soon Graphed and profiled
Configuration Management Puppet for the simpler stuff Daemons handle complex
stuff Don't try to reinvent the wheel
Monitoring Nagios SaaS monitoring Nagios Emails, texts, pager Several custom
checks
Backups Currently just rdiff-backup Moving to btrfs snapshots + DRBD
HA is not a backup solution
Perils
Initial bad design (To be fair, it was a prototype)
Networks really aren't reliable (Well, EC2's, at least.)
Memory pressure is bad (Prepare to have a fallback. And
another.)
Raw file handles are… fun. (As is the PTY subsystem.
Be very careful.)
Write just enough automation (If a server dies, I now
just go and get a drink)
HaProxy doesn't like 500+ backends (it's not exactly common)
Single redundancy is only so good (and remember, HA is
not backups!)
Future Perils
Payment (Already underway, still hard)
Oversized Sites (we need to get a lot bigger first)
European Servers (people really do want them)
More Databases (how on earth do you measure MongoDB use?)
More Languages (easy to get it working, hard to polish)
The Potential Big Outage (quite useful as a motivational tool)
Thank you. Andrew Godwin @andrewgodwin
[email protected]