The Perils of Writing a PaaS
Andrew Godwin
May 10, 2011
A talk I gave at London Devops in May of 2011.
Transcript
The Perils of Writing a PaaS Andrew Godwin http://www.flickr.com/photos/jannem/2719976702/
Hi, I'm Andrew. Serial Python developer. Django core committer. Sysadmin by night.
We're ep.io. Python Platform-as-a-Service. Utility billing. PostgreSQL, Redis, Celery, and more.
We built a… prototype. Me and Ben Firshman. Three or four days' hacking at DjangoCon. Ran code, had simple deployment.
The last 10%... A month or two of hibernation. Went part-time in December. Private beta since February. Public launch later this year.
Why? Why not?
Why? Why not? Lack of good solutions. Strong, technical team. Writing backend code is fun.
It's a challenge. We're still a closed beta. 300+ apps, on 4 servers. Some people just have crazy code. Security, security, security.
Our Architecture
[Diagram: Request, ep.io Cloud (Sugar, XML, Code, Magic), Response]
[Diagram: Balancer; three Runners hosting instances of App 1, App 2, App 3 and App 4; Databases; File Storage]
Load Balancer: Started with HAProxy. Moved to a custom Python load balancer. Still needs refinement.
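The talk does not say how the custom load balancer chooses a backend. As a rough sketch only, assuming plain round-robin selection per app, the core bookkeeping could look like this. `BackendPool`, its method names, and the addresses are hypothetical and not taken from ep.io.

```python
import itertools


class BackendPool:
    """Maps each app name to its backends and hands them out round-robin."""

    def __init__(self):
        self._backends = {}   # app name -> list of (host, port)
        self._cycles = {}     # app name -> endless iterator over that list

    def set_backends(self, app, backends):
        # Replace the backend list for an app, e.g. after it has been
        # moved to a different server.
        self._backends[app] = list(backends)
        self._cycles[app] = itertools.cycle(self._backends[app])

    def pick(self, app):
        # Return the next backend for this app, or None if it has none.
        if not self._backends.get(app):
            return None
        return next(self._cycles[app])


# Hypothetical addresses, for illustration only.
pool = BackendPool()
pool.set_backends("app1", [("10.0.0.1", 8001), ("10.0.0.2", 8001)])
first = pool.pick("app1")
second = pool.pick("app1")
third = pool.pick("app1")   # wraps around to the first backend again
```

A real balancer would also need health checks and connection handling, which this sketch leaves out.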
Runners: Daemon on each machine. Nginx + gunicorn for each app instance. Output captured, CPU time measured.
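One way to capture output and measure CPU time for a child process, sketched with the Python standard library. This is an assumption about the general technique, not ep.io's runner code: it handles a short-lived command rather than a long-running gunicorn instance, `run_and_measure` is a hypothetical name, and the `resource` module is Unix-only.

```python
import resource
import subprocess
import sys


def run_and_measure(argv):
    """Run a command, capture its output, and report the CPU time it used."""
    # RUSAGE_CHILDREN totals CPU time for children that have exited and
    # been waited for, so the difference around the call is this child's.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    result = subprocess.run(argv, capture_output=True, text=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu_seconds = (after.ru_utime - before.ru_utime) + (
        after.ru_stime - before.ru_stime
    )
    return result.stdout, result.stderr, cpu_seconds


# Run a tiny Python child process as a stand-in for an app instance.
out, err, cpu = run_and_measure(
    [sys.executable, "-c", "print(sum(range(100000)))"]
)
```

The measured CPU seconds could then feed utility billing, although the talk does not describe how that is wired together.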
Coordinator: Analyses whole system. Juggles apps between servers. Detects dead servers.
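To make the coordinator's job concrete, here is a minimal sketch that assumes dead servers are detected by a heartbeat timeout and that their apps go to whichever live server currently runs the fewest apps. Both policies, the 30-second timeout, and all names are assumptions for illustration; the talk does not describe the real logic.

```python
def find_dead(last_seen, now, timeout=30.0):
    """Return servers whose last heartbeat is older than `timeout` seconds."""
    return sorted(s for s, t in last_seen.items() if now - t > timeout)


def reassign(placement, dead):
    """Move apps off dead servers onto the live server with the fewest apps.

    `placement` maps server name -> list of app names. A new dict is
    returned and the input is left untouched.
    """
    live = {s: list(apps) for s, apps in placement.items() if s not in dead}
    if not live:
        raise RuntimeError("no live servers left")
    for server in dead:
        for app in placement.get(server, []):
            # Ties are broken by server name so the result is deterministic.
            target = min(live, key=lambda s: (len(live[s]), s))
            live[target].append(app)
    return live


# Hypothetical heartbeat timestamps (seconds) and app placement.
last_seen = {"srv1": 100.0, "srv2": 95.0, "srv3": 40.0}
placement = {
    "srv1": ["app1", "app2"],
    "srv2": ["app3"],
    "srv3": ["app4", "app5"],
}
dead = find_dead(last_seen, now=101.0)
new_placement = reassign(placement, dead)
```

Here `srv3` has been silent for 61 seconds, so its two apps are spread across `srv2` and `srv1`.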
PostgreSQL: Normal PostgreSQL 9 install. Daemon to read query logs, make users.
Redis: Custom Redis load balancer/manager. Starts processes on demand. Handles multi-user security.
Upload Receiver: SSH endpoint for git, hg, commands. Wraps VCSs, extracts uploaded files. Creates filesystem images.
Other Services: Log aggregation. UID assignment. Calculate costs.
Statistics: Queued in Redis. Consumed asynchronously. Currently stored in Redis, changing soon. Graphed and profiled.
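The queue-then-consume pattern can be sketched without a Redis server by letting an in-process `queue.Queue` stand in for the Redis list; in a Redis-backed version the producer would push with a command such as `LPUSH` and the consumer would block on `BRPOP`. The sample format and function names below are hypothetical.

```python
import json
import queue
import threading
from collections import defaultdict

stats_queue = queue.Queue()    # stand-in for a Redis list (assumption)
totals = defaultdict(float)    # app name -> accumulated CPU seconds


def record(app, cpu_seconds):
    # Producer side: a cheap enqueue, so the caller is never held up
    # by aggregation work.
    stats_queue.put(json.dumps({"app": app, "cpu": cpu_seconds}))


def consume():
    # Consumer side: runs in the background and aggregates samples.
    while True:
        raw = stats_queue.get()
        if raw is None:        # sentinel value: stop consuming
            break
        sample = json.loads(raw)
        totals[sample["app"]] += sample["cpu"]


worker = threading.Thread(target=consume)
worker.start()
record("app1", 0.25)
record("app2", 0.5)
record("app1", 0.75)
stats_queue.put(None)          # tell the consumer to finish
worker.join()
```

Serialising samples as JSON keeps the sketch close to what would travel through Redis, where list entries are plain strings.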
Configuration Management: Puppet for the simpler stuff. Daemons handle complex stuff. Don't try to reinvent the wheel.
Monitoring: Nagios. SaaS monitoring Nagios. Emails, texts, pager. Several custom checks.
Backups: Currently just rdiff-backup. Moving to btrfs snapshots + DRBD. HA is not a backup solution.
Perils
Initial bad design (To be fair, it was a prototype)
Networks really aren't reliable (Well, EC2's, at least.)
Memory pressure is bad (Prepare to have a fallback. And another.)
Raw file handles are… fun. (As is the PTY subsystem. Be very careful.)
Write just enough automation (If a server dies, I now just go and get a drink)
HAProxy doesn't like 500+ backends (it's not exactly common)
Single redundancy is only so good (and remember, HA is not backups!)
Future Perils
Payment (Already underway, still hard)
Oversized Sites (we need to get a lot bigger first)
European Servers (people really do want them)
More Databases (how on earth do you measure MongoDB use?)
More Languages (easy to get it working, hard to polish)
The Potential Big Outage (quite useful as a motivational tool)
Thank you. Andrew Godwin @andrewgodwin
[email protected]