Deploying, at an Unusual Scale

A talk I gave at DjangoCon Europe 2011, about Epio's internal architecture at that point.

Andrew Godwin

October 22, 2011

Transcript

  1. Deploying, At An Unusual Scale
    Andrew Godwin
    http://www.flickr.com/photos/whiskeytango/1431343034/
    @andrewgodwin

  2. Hi, I'm Andrew.
    Serial Python developer
    Django core committer
    Co-founder of ep.io

  3. Hi, I'm Andrew.
    Serial Python developer
    Django core committer
    Co-founder of ep.io
    Occasional fast talker

  4. "Andrew speaks English
    like a machine gun
    speaks bullets."
    Reinout van Rees

  5. We're ep.io
    Python Platform-as-a-Service
    Easy deployment, easy upgrades
    PostgreSQL, Redis, Celery, and more

  6. Why am I here?
    Our Architecture
    How we deploy Django
    How varied Django deployments are

  7. Our Architecture

  8. [Architecture diagram]
    Balancer, Balancer
    Runner, Runner, Runner
    Apps spread across the runners: App 1, App 2, App 3, App 2, App 4, App 1
    Databases, File Storage

  9. Oh My God, It's Full of Pairs
    Everything is redundant
    Distributed programming is Hard

  10. Hardware
    Real colo'd machines (pretty reliable)
    Linode (pretty reliable)
    EC2 (pretty unreliable)
    IPv6 (as much as we can)

  11. ØMQ
    We used to use Redis
    Everything now on ZeroMQ
    Eliminates SPOF*
    * Single Point Of Failure. What a pointless acronym.

  12. ØMQ Usage
    Redundant location-resolvers (Nexus)
    REQ/XREP for control messages
    PUSH/PULL for stats, logs
    PUB/SUB for heartbeats, locking

  13. Runners
    Unsurprisingly, these run the code
    SquashFS filesystem images
    Virtualenvs per app
    UID & permission isolation, more coming

  14. Logging/Stats
    All done asynchronously using ØMQ
    Logs to filesystem (chunked files)
    Stats to PostgreSQL database, for now

  15. Loadbalancers
    Intercept all incoming HTTP requests
    Look up hostname (or suffix)
    HTTP 1.1 compliant
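
The "look up hostname (or suffix)" step can be sketched roughly as below. This is a minimal illustration, not ep.io's balancer: the in-memory tables, the `resolve_backend` name, and the backend addresses are all hypothetical, and the talk does not say how routes are actually stored or queried.

```python
from typing import Optional

# Hypothetical routing tables (not from the talk): exact hostnames and
# wildcard suffixes each map to a backend address.
EXACT_ROUTES = {"www.example.com": "10.0.0.1:8000"}
SUFFIX_ROUTES = {".apps.example.com": "10.0.0.2:8000"}


def resolve_backend(host_header: str) -> Optional[str]:
    """Return the backend for a Host header, or None if nothing matches."""
    # Drop any port and normalise case; hostnames are case-insensitive.
    # Bracketed IPv6 literals are not handled in this sketch.
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    if host in EXACT_ROUTES:
        return EXACT_ROUTES[host]
    # No exact match: fall back to the longest matching suffix.
    matches = [suffix for suffix in SUFFIX_ROUTES if host.endswith(suffix)]
    if matches:
        return SUFFIX_ROUTES[max(matches, key=len)]
    return None
```

Trying the exact hostname first and the longest suffix second means a custom domain always wins over a platform-wide wildcard.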

  16. Databases
    Shared (only for PostgreSQL)
    Dedicated (uses Runner framework)
    PostgreSQL 9, damnit

  17. Django in the backend
    We use the ORM extensively
    Annoying settings fiddling in __init__

  18. www.ep.io
    Runs on ep.io, just like any other app*
    Provides JSON API, web UI
    * Well not quite - App ID 0 is special - but we're working on it

  19. WSGI
    It's a standard, right?

  20. WSGI
    It's a standard, right?
    Well, yes, and it works fine, but it's not
    enough for serving a Python app
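
To show how little the standard covers, here is a complete WSGI application plus a small helper that calls it once. The `application` and `call_once` names are illustrative. Everything a deployment also needs (static files, dependencies, settings) sits outside this interface.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A complete WSGI application: a single callable."""
    body = b"Hello from WSGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_once(app):
    """Invoke a WSGI app with a synthetic request; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```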

  21. Static Files
    CSS, images, JavaScript, etc.
    Needs a URL and a directory path
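
The "URL and a directory path" pairing amounts to a small mapping function. The sketch below is an assumption about how such a mapping could look, not how ep.io serves static files. `resolve_static` and the example paths are hypothetical, and URL decoding and symlink checks are left out.

```python
import posixpath
from pathlib import PurePosixPath
from typing import Optional


def resolve_static(url_path: str, url_prefix: str,
                   root: str) -> Optional[PurePosixPath]:
    """Map a request path under url_prefix to a file path under root."""
    if not url_path.startswith(url_prefix):
        return None  # not a static URL; the app should handle it
    # Normalising against "/" collapses any ".." segments, so the result
    # can never climb above the static root.
    remainder = url_path[len(url_prefix):]
    relative = posixpath.normpath("/" + remainder).lstrip("/")
    return PurePosixPath(root) / relative
```

For example, `resolve_static("/static/css/site.css", "/static/", "/srv/app/static")` yields `/srv/app/static/css/site.css`.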

  22. Python & Dependencies
    Mostly filled by pip/buildout/etc
    packaging apparently allows version spec

  23. Deploying Django
    It makes things consistent, right?

  24. Settings Layouts
    Vanilla settings.py
    local_settings.py
    configs/HOSTNAME.py
    Many others...
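
One way a deployment tool might cope with the layouts above is to list candidate settings modules and try them in order. This is a hypothetical sketch: `settings_candidates` and the probing order are assumptions, and many projects import `local_settings` from inside `settings.py` instead of using it as the settings module itself.

```python
import socket
from typing import List, Optional


def settings_candidates(project: str,
                        hostname: Optional[str] = None) -> List[str]:
    """List dotted settings module names to try, most specific first."""
    if hostname is None:
        hostname = socket.gethostname()
    # Module names cannot contain dots or dashes, so keep only the short
    # hostname and swap dashes for underscores.
    short = hostname.split(".", 1)[0].replace("-", "_")
    return [
        f"{project}.configs.{short}",  # configs/HOSTNAME.py
        f"{project}.local_settings",   # local_settings.py
        f"{project}.settings",         # vanilla settings.py
    ]
```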

  25. Python Paths
    Project-level imports
    App-level imports
    apps/ directories
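
Each of these import styles needs a different directory on `sys.path`. The sketch below lists the directories that would satisfy all three. `import_roots` and the example project path are hypothetical, and it assumes the apps/ directory sits directly inside the project.

```python
from pathlib import PurePosixPath
from typing import List


def import_roots(project_dir: str) -> List[str]:
    """Return the sys.path entries that satisfy all three import styles."""
    project = PurePosixPath(project_dir)
    return [
        str(project.parent),    # project-level: import mysite.blog.models
        str(project),           # app-level: import blog.models
        str(project / "apps"),  # apps/ directory: import blog.models
    ]
```

A deployer would typically add these entries to `sys.path` before importing the settings module.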

  26. Databases
    If it's SQL, it's PostgreSQL
    Redis for key-value, MongoDB soon
    Some things assume a safe network

  27. HA (High Availability)
    Not terribly easy with shared DBs
    PostgreSQL 9's sensible warm standby
    Redis has SLAVEOF
    Possibly use DRBD for general solution

  28. Backups
    High Availability is NOT a backup
    btrfs for consistent snapshotting
    Archived remote syncs
    No access to backups from servers

  29. Migrations
    No solution yet for migration/code sync
    We're working on it...

  30. Web serving
    It's not like it's important or anything

  31. gunicorn
    Small and lightweight
    Supports long-running requests
    Pretty stable

  32. nginx
    Even more lightweight
    Extremely fast
    Really, really stable

  33. The Load Balancer
    Used to be HAProxy
    Rewritten to custom Python daemon
    eventlet used for high throughput
    Can't use nginx, no HTTP 1.1 for backends

  34. Celery
    See: Yesterday's Talk
    Slightly tricky to run many
    We use Redis as the backend

  35. Management Commands
    First off, run as subprocess
    Then, a custom PTY module
    Now, run as pty-wrapping subprocesses
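
A pty-wrapping subprocess can be sketched with the standard library alone. This is a Unix-only illustration of the general technique, not ep.io's implementation, and `run_in_pty` is a hypothetical name. The point is that the child sees a real terminal, so interactive commands behave as they would in a shell.

```python
import os
import pty
import subprocess
import sys


def run_in_pty(argv):
    """Run argv attached to a pseudo-terminal; return (exit_code, output)."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd,
            close_fds=True,
        )
    except Exception:
        os.close(master_fd)
        raise
    finally:
        # The parent only needs the master side of the pair.
        os.close(slave_fd)

    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                break  # Linux reports EIO once the child closes its side
            if not data:
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
    return proc.wait(), b"".join(chunks)
```

Calling `run_in_pty([sys.executable, "-c", "import sys; print(sys.stdout.isatty())"])` should show the child reporting a TTY, which a plain pipe-based subprocess would not.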

  36. Some General Advice
    If you're crazy enough to do this

  37. Messaging's Not Enough
    Having a state to check is handy

  38. Why run one, when you can
    run two for twice the price?
    Redundancy is good. Double redundancy is better.

  39. Always expect the worst
    Hope you never have to deal with it.

  40. The more backups, the better.
    Make sure you have historical ones, too.

  41. Django is very flexible
    Sometimes a little too flexible...

  42. Your real problems will emerge later
    Don't over-optimise up front for everything

  43. Questions?
    Andrew Godwin
    [email protected]
    @andrewgodwin
