You don't have a production environment

"We have a problem in production! Does it reproduce in staging?" is a sentence uttered in many companies. Underlying this simple sentence are implicit assumptions about environments that often turn out to be false for modern services and applications. Clusters and micro-services already make production environments hard to model and reproduce; once A/B tests, feature toggles, ephemeral containers, serverless functions and multi-tenancy are introduced, one must conclude that we no longer have a "production environment" but rather "production environments", plural. In a sense, every transaction experiences a unique, unreproducible universe. In this world, what is the value of staging environments? How does this understanding change our development processes?

Avishai Ish-Shalom

June 17, 2019

Transcript

  1. You don’t have a production environment
    Avishai Ish-Shalom (@nukemberg)

  2. You don’t have a staging environment either

  3. What is it good for?
    ● Testing new subsystems
    ● Reproducing issues
    ● Testing artificial conditions
    Basically, a laboratory
    Realistic feedback, safely

  5. ● Conceptually: a system, hosted within an environment
    ● Replicate/move the system between environments

  6. Assumptions
    1. Clear boundary between system and environment
    2. System can be separated from environment
    3. System is deterministic
    4. Boundary can be replicated
    5. System state can be replicated

  7. ● O/S configurations
    ● Topology configurations
    ● API dependencies
    ● Data dependencies
    Complete boundary spec? Easy!

  8. ● The environment always leaks into the system
    ● The more scale, the more leakage, data, state

  9. Clusters
    ● Not homogeneous
    ● Every host a little different
    ● Hosts have state
    ● Transactions can be serviced by different hosts

  10. Clusters (you do deploy sometimes, right?)
    ● Hosts can have different versions
    ● Even with blue/green deployment
    ● V1→V2→V4 != V2→V3→V4 != V4
    ● Golden images to the rescue? (nope)
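
The path-dependence claim above (V1→V2→V4 != V2→V3→V4 != V4) can be sketched with a toy model. This is a minimal illustration, not a description of any real deployment tool: the `fresh_install` and `upgrade` functions and the "leftover" entries are hypothetical, and assume only that each upgrade leaves some residue of the version it replaced.

```python
from typing import Dict, List

State = Dict[str, str]


def fresh_install(version: str) -> State:
    """A clean host that has only ever run `version` (hypothetical model)."""
    return {"binary": version}


def upgrade(state: State, to_version: str) -> State:
    """Upgrade in place. Assumption: the old version leaves residue behind
    (stale config, cache files, migrated data) that nothing cleans up."""
    new_state = dict(state)
    old_version = state["binary"]
    new_state[f"leftover_from_{old_version}"] = "present"
    new_state["binary"] = to_version
    return new_state


def follow_path(path: List[str]) -> State:
    """Install the first version, then upgrade through the rest in order."""
    state = fresh_install(path[0])
    for version in path[1:]:
        state = upgrade(state, version)
    return state


host_a = follow_path(["V1", "V2", "V4"])
host_b = follow_path(["V2", "V3", "V4"])
host_c = follow_path(["V4"])

# All three hosts report the same binary version, yet their states differ.
for name, host in [("a", host_a), ("b", host_b), ("c", host_c)]:
    print(name, sorted(host.items()))
```

All three hosts "run V4", but only the freshly installed one matches what a golden image would produce.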

  11. Micro-services (AKA The Opsocalypse)
    ● Different transactions touch different services
    ● On different hosts
    ● Possibly async
    ● And everyone is deploying all the time

  12. A/B tests
    ● Different code paths
    ● Possibly different services
    ● Multiple concurrent experiments
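
To make the "multiple concurrent experiments" point concrete, here is a minimal sketch of hash-based experiment bucketing, one common way to assign users to variants. The experiment names, variants and the `assign_variant` helper are assumptions made up for illustration; the talk does not specify any particular A/B framework.

```python
import hashlib
from typing import Dict, List, Tuple

# Hypothetical experiments and variants, for illustration only.
EXPERIMENTS: Dict[str, List[str]] = {
    "checkout_flow": ["control", "treatment"],
    "search_ranker": ["a", "b", "c"],
    "recommendations": ["off", "on"],
}


def assign_variant(user_id: str, experiment: str, variants: List[str]) -> str:
    """Deterministically bucket a user by hashing experiment name and user ID."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def code_path(user_id: str) -> Tuple[str, ...]:
    """The combination of variants, and so the code path, one user experiences."""
    return tuple(
        assign_variant(user_id, name, variants)
        for name, variants in sorted(EXPERIMENTS.items())
    )


# Count how many distinct combinations a population of users actually hits.
distinct_paths = {code_path(f"user-{i}") for i in range(1000)}

max_paths = 1
for variants in EXPERIMENTS.values():
    max_paths *= len(variants)

print(f"{len(distinct_paths)} of {max_paths} possible code paths observed")
```

Even three small experiments split users across up to 12 different code paths, and each added experiment multiplies that number.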

  13. Personalization
    Caches
    Monitoring
    Client versions
    DNS
    CDN
    That stupid Cron job
    SSL certificates
    Traffic

  14. You don’t have a production environment
    You have production environments

  15. Production environments (plural)
    ● Transaction coordinate space practically infinite
    ● Every transaction experiences a unique, transient universe
    ● Impossible to replicate exactly
    ● Impossible to isolate
    Is your Facebook experience the same as mine?
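
A back-of-the-envelope calculation shows why the coordinate space is "practically infinite". Every dimension and number below is an assumption picked for illustration, not a measurement from any real system; the point is only that the sizes multiply.

```python
import math

# Hypothetical dimension sizes, chosen for illustration only.
dimensions = {
    "host_serving_the_request": 200,
    "service_version_mix": 3 ** 20,   # 20 services, each on one of 3 versions mid-rollout
    "experiment_combinations": 2 ** 15,  # 15 concurrent two-way A/B tests
    "client_versions": 40,
    "cdn_edge_locations": 100,
}

# The number of distinct "environments" is the product of all dimension sizes.
total_environments = math.prod(dimensions.values())

# Assumed traffic: one billion transactions per day, each hitting a new combination.
transactions_per_day = 1_000_000_000
years_to_cover = total_environments / (transactions_per_day * 365)

print(f"distinct environments: {total_environments:.2e}")
print(f"years to visit each once: {years_to_cover:.2e}")
```

Under these made-up numbers the space is already far larger than the number of transactions a system will ever serve, and the model still ignores data state, caches and timing.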

  16. So now what?

  17. What is our goal?
    To get (most) realistic feedback, safely

  18. Nothing is more realistic than reality
    Test in production

  19. Why isn’t your staging environment hosted on AWS staging cloud™?

  20. Multitenancy
    Logical isolation of a group of users, including their data, configuration, user management, functional experience and non-functional aspects

  21. Multitenancy: single system, logical separation
    Multi-instances: duplicate the entire system

  22. But how do I…
    ● Test new service ABC version 0.23?
    ● Test existing service XYZ under load?
    ● Reproduce a bug?
    ● Run automated tests?
    Open a tenant, connect your stuff, run the test, remove the tenant.
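
The open/test/remove cycle described above could look roughly like the following. This is a sketch under stated assumptions: `TenantRegistry` is a hypothetical in-memory stand-in for whatever tenant management API a real system exposes, and `ephemeral_tenant` is an illustrative helper, not something from the talk.

```python
import uuid
from contextlib import contextmanager
from typing import Dict, Iterator


class TenantRegistry:
    """Hypothetical in-memory stand-in for a real tenant management API."""

    def __init__(self) -> None:
        self.tenants: Dict[str, dict] = {}

    def open_tenant(self, purpose: str) -> str:
        tenant_id = f"test-{uuid.uuid4().hex[:8]}"
        self.tenants[tenant_id] = {"purpose": purpose, "data": {}}
        return tenant_id

    def remove_tenant(self, tenant_id: str) -> None:
        del self.tenants[tenant_id]


@contextmanager
def ephemeral_tenant(registry: TenantRegistry, purpose: str) -> Iterator[str]:
    """Open a tenant for one test and always remove it afterwards."""
    tenant_id = registry.open_tenant(purpose)
    try:
        yield tenant_id
    finally:
        # Runs even if the test raises, so test tenants do not pile up.
        registry.remove_tenant(tenant_id)


registry = TenantRegistry()

with ephemeral_tenant(registry, "reproduce a bug") as tenant_id:
    # "Connect your stuff, run the test": here we only write some tenant data.
    registry.tenants[tenant_id]["data"]["order-1"] = "pending"
    seen_during_test = tenant_id in registry.tenants

tenants_after_test = len(registry.tenants)
print(seen_during_test, tenants_after_test)
```

Tying removal to a context manager means a failing test still cleans up its tenant, which matters when the tenant lives inside production.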

  23. OK, I’m sold. How do I get there?
    ● Pass the tenant ID everywhere
    ● Mark everything with the tenant ID
    ● Partition all data by the tenant ID
    ● Never allow cross tenant operations
    ● Integrate observability tools into everything
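
A minimal sketch of what the rules above might look like in code. Everything here is an assumption for illustration: `RequestContext`, `TenantStore`, `assert_same_tenant` and `log_line` are hypothetical names, and a real system would apply the same ideas at the database, queue and telemetry layers rather than in an in-memory dict.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class CrossTenantError(Exception):
    """Raised when an operation would touch another tenant's resources."""


@dataclass(frozen=True)
class RequestContext:
    """Carried through every call so the tenant ID is passed everywhere."""
    tenant_id: str
    user: str


class TenantStore:
    """Key-value store with all data partitioned by tenant ID."""

    def __init__(self) -> None:
        self._partitions: Dict[str, Dict[str, str]] = {}

    def put(self, ctx: RequestContext, key: str, value: str) -> None:
        self._partitions.setdefault(ctx.tenant_id, {})[key] = value

    def get(self, ctx: RequestContext, key: str) -> Optional[str]:
        # Only the caller's own partition is ever consulted.
        return self._partitions.get(ctx.tenant_id, {}).get(key)


def assert_same_tenant(ctx: RequestContext, resource_tenant_id: str) -> None:
    """Refuse any operation on a resource owned by a different tenant."""
    if ctx.tenant_id != resource_tenant_id:
        raise CrossTenantError(f"{ctx.tenant_id} may not access {resource_tenant_id}")


def log_line(ctx: RequestContext, message: str) -> str:
    """Mark every log line with the tenant ID so telemetry can be filtered."""
    return f"tenant={ctx.tenant_id} user={ctx.user} {message}"


store = TenantStore()
prod = RequestContext(tenant_id="acme-prod", user="alice")
test = RequestContext(tenant_id="test-7f3a", user="ci-bot")

# Same key, same store, different tenants: the data never mixes.
store.put(prod, "order-1", "paid")
store.put(test, "order-1", "pending")
print(store.get(prod, "order-1"), store.get(test, "order-1"))
print(log_line(test, "order created"))
```

Because the test tenant shares the store, the code and the hosts with real tenants, it exercises the real system while its data and logs stay separable.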

  24. Multitenancy is a core feature for SaaS/Cloud
