The Perils of Writing a PaaS
Andrew Godwin
May 10, 2011
A talk I gave at London Devops in May of 2011.
Transcript
The Perils of Writing a PaaS Andrew Godwin http://www.flickr.com/photos/jannem/2719976702/
Hi, I'm Andrew. Serial Python developer. Django core committer. Sysadmin by night.
We're ep.io: Python Platform-as-a-Service. Utility billing. PostgreSQL, Redis, Celery, and more.
We built a… prototype. Me and Ben Firshman. Three or four days' hacking at DjangoCon. Ran code, had simple deployment.
The last 10%... A month or two of hibernation. Went part-time in December. Private beta since February. Public launch later this year.
Why? Why not?
Why? Why not? Lack of good solutions. Strong, technical team. Writing backend code is fun.
It's a challenge. We're still a closed beta. 300+ apps, on 4 servers. Some people just have crazy code. Security, security, security.
Our Architecture
[Diagram: Request → ep.io Cloud (Sugar, XML, Code, Magic) → Response]
[Diagram: Balancer in front of three Runners hosting instances of App 1 to App 4, alongside Databases and File Storage]
Load Balancer: Started with HAProxy. Moved to a custom Python load balancer. Still needs refinement.
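The slides do not show how the custom Python load balancer picks a backend. As a minimal sketch of one common approach, assuming plain round-robin selection per app, the following keeps a list of backends for each app and rotates through them. The class name, method names, and addresses are hypothetical and are not taken from ep.io's code.

```python
from collections import defaultdict


class RoundRobinBalancer:
    """Hypothetical sketch: rotate requests across each app's backends."""

    def __init__(self):
        self._backends = defaultdict(list)  # app name -> list of (host, port)
        self._next = defaultdict(int)       # app name -> index of next backend

    def add_backend(self, app, address):
        """Register one more running instance of an app."""
        self._backends[app].append(address)

    def pick(self, app):
        """Return the next backend for the app, wrapping around at the end."""
        backends = self._backends.get(app)
        if not backends:
            raise LookupError(f"no backends registered for {app!r}")
        index = self._next[app] % len(backends)
        self._next[app] = index + 1
        return backends[index]


if __name__ == "__main__":
    lb = RoundRobinBalancer()
    lb.add_backend("app1", ("10.0.0.1", 8000))
    lb.add_backend("app1", ("10.0.0.2", 8000))
    for _ in range(3):
        print(lb.pick("app1"))
```

A real balancer would also need health checks and connection handling, which this sketch leaves out.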
Runners: Daemon on each machine. Nginx + gunicorn for each app instance. Output captured, CPU time measured.
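How the runner daemon captures output and measures CPU time is not described in the slides. One way to do both on a Unix system, shown here as a sketch under that assumption, is to run the child with `subprocess` and compare `resource.getrusage(RUSAGE_CHILDREN)` before and after it exits. The function name `run_and_measure` is made up for illustration, and a long-running app server would need continuous accounting rather than this run-to-completion approach.

```python
import resource
import subprocess
import sys


def run_and_measure(argv):
    """Run a command to completion; return (exit code, stdout, stderr, CPU seconds).

    CPU time is the user + system time charged to waited-for children,
    so the figure is only meaningful if no other child exits meanwhile.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    proc = subprocess.run(argv, capture_output=True, text=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu_seconds = (after.ru_utime - before.ru_utime) + (
        after.ru_stime - before.ru_stime
    )
    return proc.returncode, proc.stdout, proc.stderr, cpu_seconds


if __name__ == "__main__":
    code, out, err, cpu = run_and_measure(
        [sys.executable, "-c", "print(sum(range(10**6)))"]
    )
    print(code, out.strip(), f"{cpu:.3f}s CPU")
```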
Coordinator: Analyses whole system. Juggles apps between servers. Detects dead servers.
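The coordinator's duties listed above can be sketched with heartbeats and a timeout. This is an illustrative assumption rather than ep.io's actual design: a server counts as dead once it has not reported within `timeout` seconds, and its apps are moved to the live server with the fewest apps. All names here are hypothetical.

```python
import time


class Coordinator:
    """Hypothetical sketch: heartbeat-based dead-server detection and placement."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # server -> time of last heartbeat
        self.placement = {}     # app -> server currently running it

    def heartbeat(self, server):
        self.last_seen[server] = self.clock()

    def live_servers(self):
        now = self.clock()
        return [s for s, t in self.last_seen.items() if now - t <= self.timeout]

    def place(self, app):
        """Put the app on the live server that currently runs the fewest apps."""
        live = self.live_servers()
        if not live:
            raise RuntimeError("no live servers available")
        load = {s: 0 for s in live}
        for other, server in self.placement.items():
            if other != app and server in load:
                load[server] += 1
        target = min(live, key=lambda s: (load[s], s))
        self.placement[app] = target
        return target

    def rebalance(self):
        """Move every app whose server has stopped sending heartbeats."""
        live = set(self.live_servers())
        moved = []
        for app, server in list(self.placement.items()):
            if server not in live:
                self.place(app)
                moved.append(app)
        return moved
```

Injecting the clock keeps the timeout logic deterministic in tests; in production the default `time.monotonic` avoids jumps caused by wall-clock adjustments.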
PostgreSQL: Normal PostgreSQL 9 install. Daemon to read query logs, make users.
Redis: Custom Redis loadbalancer/manager. Starts processes on demand. Handles multi-user security.
Upload Receiver: SSH endpoint for git, hg, commands. Wraps VCSs, extracts uploaded files. Creates filesystem images.
Other Services: Log aggregation. UID assignment. Calculate costs.
Statistics: Queued in Redis. Consumed asynchronously. Currently stored in Redis, changing soon. Graphed and profiled.
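The queue-then-consume pattern for statistics can be sketched as follows. To stay runnable without a Redis server, the example uses an in-memory stand-in whose `lpush` and `rpop` methods mirror the Redis list commands of the same names. The key name `stats`, the JSON record shape, and the function names are assumptions made for illustration only.

```python
import json
from collections import defaultdict, deque


class InMemoryQueue:
    """Stand-in for a Redis connection, offering only lpush and rpop."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def lpush(self, key, value):
        self._lists[key].appendleft(value)
        return len(self._lists[key])

    def rpop(self, key):
        items = self._lists[key]
        return items.pop() if items else None


def record(conn, app, metric, value):
    """Producer side: enqueue one measurement and return immediately."""
    conn.lpush("stats", json.dumps({"app": app, "metric": metric, "value": value}))


def drain(conn, totals):
    """Consumer side: pop queued measurements in order and add them to totals."""
    processed = 0
    while True:
        raw = conn.rpop("stats")
        if raw is None:
            return processed
        item = json.loads(raw)
        key = (item["app"], item["metric"])
        totals[key] = totals.get(key, 0) + item["value"]
        processed += 1


if __name__ == "__main__":
    conn = InMemoryQueue()
    record(conn, "app1", "cpu_seconds", 1.5)
    record(conn, "app1", "cpu_seconds", 0.5)
    totals = {}
    drain(conn, totals)
    print(totals)
```

Pushing on the left and popping on the right gives first-in, first-out order, so the consumer sees measurements in the order they were recorded.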
Configuration Management: Puppet for the simpler stuff. Daemons handle complex stuff. Don't try to reinvent the wheel.
Monitoring: Nagios. SaaS monitoring. Emails, texts, pager. Several custom checks.
Backups: Currently just rdiff-backup. Moving to btrfs snapshots + DRBD. HA is not a backup solution.
Perils
Initial bad design (To be fair, it was a prototype)
Networks really aren't reliable (Well, EC2's, at least.)
Memory pressure is bad (Prepare to have a fallback. And another.)
Raw file handles are… fun. (As is the PTY subsystem. Be very careful.)
Write just enough automation (If a server dies, I now just go and get a drink)
HAProxy doesn't like 500+ backends (it's not exactly common)
Single redundancy is only so good (and remember, HA is not backups!)
Future Perils
Payment (Already underway, still hard)
Oversized Sites (we need to get a lot bigger first)
European Servers (people really do want them)
More Databases (how on earth do you measure MongoDB use?)
More Languages (easy to get it working, hard to polish)
The Potential Big Outage (quite useful as a motivational tool)
Thank you. Andrew Godwin @andrewgodwin
[email protected]