Wind of Change

How we perform changes at Auth0 to move fast while reducing risk

Damian Schenkelman

August 23, 2018

Transcript

  1. The wind of change
    by @dschenkelman

  2. Logins

  3. Embrace risk

  4. SLOs
    https://landing.google.com/sre/book/chapters/service-level-objectives.html
    • Latency: 99th percentile at edge (measured every 5 minutes) < 500 ms
    • Reliability: 99.99% of requests succeed (200-499 status code)
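
One way to read the two SLOs on this slide as code. This is a minimal sketch, not Auth0's monitoring: the function names are hypothetical, the percentile uses the nearest-rank method as an assumption, and the inputs are assumed to be the status codes and edge latencies collected in one 5-minute window.

```javascript
// Hypothetical helpers; names and the nearest-rank percentile are assumptions.

// A request counts as successful when its status code is in 200-499.
function isSuccess(statusCode) {
  return statusCode >= 200 && statusCode <= 499;
}

// Percentage of successful requests in a list of status codes.
function reliabilityPercent(statusCodes) {
  const ok = statusCodes.filter(isSuccess).length;
  return (ok / statusCodes.length) * 100;
}

// 99th percentile of a list of latencies, nearest-rank method.
function p99(latenciesMs) {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const rank = Math.ceil((99 * sorted.length) / 100);
  return sorted[rank - 1];
}

// True when one measurement window satisfies both SLOs from the slide.
function meetsSlo(statusCodes, latenciesMs) {
  return reliabilityPercent(statusCodes) >= 99.99 && p99(latenciesMs) < 500;
}
```

Note that a 404 counts as a success under this definition, because only 5xx responses fall outside the 200-499 range.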

  5. Error Budgets

  6. Consumption
    Uniform/simplified model https://goo.gl/7kWoX4
    (duration[s] * requests impacted[%] / 100) / ((60*60*24*30)[s] * (error budget[%]/100[%])) * 100
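
The uniform model can be written as a small function. This is a sketch under the slide's own assumptions (a 30-day month and traffic spread evenly over time); the function and parameter names are illustrative, not taken from the talk.

```javascript
const SECONDS_PER_MONTH = 60 * 60 * 24 * 30;

// Percentage of the monthly error budget consumed by one incident.
//   durationSeconds:         how long the incident lasted
//   requestsImpactedPercent: share of requests that failed during it (0-100)
//   errorBudgetPercent:      allowed failure share, e.g. 0.01 for a 99.99% SLO
function budgetConsumedPercent(durationSeconds, requestsImpactedPercent, errorBudgetPercent) {
  const badSeconds = durationSeconds * (requestsImpactedPercent / 100);
  const budgetSeconds = SECONDS_PER_MONTH * (errorBudgetPercent / 100);
  return (badSeconds / budgetSeconds) * 100;
}

// A 60-second full outage against a 99.99% SLO (0.01% budget, about 259 s per month).
console.log(budgetConsumedPercent(60, 100, 0.01).toFixed(1) + '% of the monthly budget');
```

Shortening the incident or shrinking the share of requests it touches lowers the result in the same proportion, which is why the formula keeps reappearing between techniques in the deck.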

  7. Take me to the magic of the moment
    On a glory night
    Where the children of tomorrow dream away (dream away)
    In the wind of change

  8. Rolling updates

  9. (duration[s] * requests impacted[%] / 100) / ((60*60*24*30)[s] * (error budget[%]/100[%])) * 100

  10. Instances

  11. Environments

  12. Canary

  13. Blue/Green

  14. (duration[s] * requests impacted[%] / 100) / ((60*60*24*30)[s] * (error budget[%]/100[%])) * 100

  15. Usage patterns

  16. Consumption
    Real model
    (duration[s] * requests impacted[%] * period weight / 100) / ((60*60*24*30)[s] * (error budget[%]/100[%])) * 100
    period weight = (requests during the period) / (average requests per period of that length during the month)
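
The weighted model on this slide can be sketched the same way as the uniform one. The names below are hypothetical, and the reading of "period weight" (traffic in the affected period divided by the monthly average for a period of the same length) is an interpretation of the slide, not a confirmed definition.

```javascript
// Ratio of traffic in the affected period to the monthly average
// for a period of the same length. Above 1 means a busier-than-average period.
function periodWeight(requestsDuringPeriod, avgRequestsPerPeriod) {
  return requestsDuringPeriod / avgRequestsPerPeriod;
}

// Error budget consumption (in percent) with the period weight applied.
// Assumes a 30-day month, as the slide does.
function weightedBudgetConsumedPercent(durationSeconds, requestsImpactedPercent, errorBudgetPercent, weight) {
  const secondsPerMonth = 60 * 60 * 24 * 30;
  const badSeconds = durationSeconds * (requestsImpactedPercent / 100) * weight;
  const budgetSeconds = secondsPerMonth * (errorBudgetPercent / 100);
  return (badSeconds / budgetSeconds) * 100;
}

// A 60-second full outage in a quiet period carrying half the average traffic.
const quietWeight = periodWeight(500, 1000);
console.log(weightedBudgetConsumedPercent(60, 100, 0.01, quietWeight).toFixed(1) + '%');
```

Under this model the same outage costs half as much budget in a period with half the usual traffic, which is the argument for scheduling risky changes in quiet hours.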

  17. Schedule
    Region/Time    Start        End
    US             12 AM UTC    10 AM UTC
    EU             9 PM UTC     9 AM UTC
    AU             11 AM UTC    9 PM UTC
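
A small helper can check whether a given UTC hour falls inside one of these windows. The slide does not say what the windows are used for, so the sketch only encodes the table. The `WINDOWS` object and `inWindow` function are hypothetical names, and treating the end hour as exclusive is an assumption.

```javascript
// Windows copied from the slide, as hours in UTC (0-23).
const WINDOWS = {
  US: { start: 0, end: 10 },  // 12 AM - 10 AM UTC
  EU: { start: 21, end: 9 },  // 9 PM - 9 AM UTC, wraps past midnight
  AU: { start: 11, end: 21 }, // 11 AM - 9 PM UTC
};

// True when hourUtc is inside the region's window (start inclusive, end exclusive).
function inWindow(region, hourUtc) {
  const { start, end } = WINDOWS[region];
  if (start <= end) {
    return hourUtc >= start && hourUtc < end;
  }
  // The window crosses midnight: it covers [start, 24) and [0, end).
  return hourUtc >= start || hourUtc < end;
}

console.log(inWindow('EU', new Date().getUTCHours()));
```

The EU row is the one that needs the wrap-around branch, since its start hour is later in the day than its end hour.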

  18. Feature flags

  19. Belgrano

  20. Usage
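    const belgrano = require('belgrano');
    belgrano.init({ /* config */ });
    const getFlag = belgrano.entities.getFlag;

    // await is only valid inside an async function in a CommonJS module
    async function handle(entityId, flag) {
      if (await getFlag(entityId, flag)) {
        // enabled case
      } else {
        // disabled case
      }
    }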

  21. (duration[s] * requests impacted[%] / 100) / ((60*60*24*30)[s] * (error budget[%]/100[%])) * 100

  22. Topologies

  23. DB only
    [Diagram: Belgrano SDK -> MongoDB]

  24. + Abstraction
    [Diagram: Belgrano SDK -> Belgrano Server -> MongoDB]

  25. Faster

  26. + Cache
    [Diagram: Belgrano SDK, Belgrano Server, MongoDB, Redis]

  27. Circuit Breakers

  28. Manage

  29. Go fast

  30. Hope is not a strategy

  31. Experiments

  32. Shadowing
    Fire and forget
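
A minimal sketch of what "fire and forget" shadowing could look like in Node.js. It assumes `primary` and `shadow` are async request handlers and that shadow failures are only reported, never surfaced to the caller. All names here are hypothetical and not from the talk.

```javascript
// Sends the request to the shadow implementation without waiting for it,
// and returns only the primary implementation's result.
async function handleWithShadow(request, primary, shadow, onShadowError = () => {}) {
  // Fire and forget: start the shadow call but do not await it.
  // Starting from Promise.resolve() also captures synchronous throws.
  Promise.resolve()
    .then(() => shadow(request))
    .catch(onShadowError);

  // The caller's latency and result depend on the primary path only.
  return primary(request);
}
```

Because the shadow promise is never awaited, a slow or failing shadow path does not delay or break the response. The `catch` handler keeps its rejection from becoming an unhandled rejection.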

  33. Real

  34. (duration[s] * requests impacted[%] / 100) / ((60*60*24*30)[s] * (error budget[%]/100[%])) * 100

  35. Rate limiting

  36. Rate limiting
    limitd-leveldb
    [Diagram: three Client boxes; three limitd instances, each with a level store; key spaces 1, 2 and 3]

  37. Rate limiting
    limitd-redis
    [Diagram: three Client boxes; a Cluster holding key spaces 1, 2 and 3]

  38. Tap Compare

  39. feature-change
    const feature_change = require('feature-change');
    var options = {
      expected: function(cb){
        search_v2(query, cb);
      },
      actual: function(cb){
        search_v3(query, cb);
      },
      logAction: function(current_result, new_result){
        // invoked when there is a difference in the results
        // (useful for logging)
      }
    };
    feature_change(options, function(err, result){
      // this is the original callback you were using for search v2
      // err and result always come from search_v2
    });
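
To make the tap-compare idea concrete, here is a hypothetical helper with the same option names as the slide. It is a sketch of the technique, not the source of the feature-change package; the function name `tapCompare`, the use of `util.isDeepStrictEqual`, and the choice to run the new path after the old one are all assumptions.

```javascript
const { isDeepStrictEqual } = require('util');

// Hypothetical sketch of tap compare, not the feature-change implementation.
function tapCompare({ expected, actual, logAction }, callback) {
  expected((expectedErr, expectedResult) => {
    // The caller always gets the outcome of the existing code path.
    callback(expectedErr, expectedResult);

    // Run the candidate afterwards; its outcome never reaches the caller.
    actual((actualErr, actualResult) => {
      if (actualErr || !isDeepStrictEqual(expectedResult, actualResult)) {
        logAction(expectedResult, actualResult);
      }
    });
  });
}
```

A production version would also need to guard against exceptions thrown by the candidate path so that they cannot affect the caller.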

  40. (duration[s] * requests impacted[%] / 100) / ((60*60*24*30)[s] * (error budget[%]/100[%])) * 100

  41. Iron out differences

  42. Future

  43. Platform
    • Pick resources (CPU, memory, etc.)
    • Pick deployment method
    • Control routed traffic
    • Spinnaker and automatic anomaly detection

  44. Feature flags
    • % based flags
    • User interface for managing flags
    • New stores (not just MongoDB)
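
The slide lists percentage-based flags only as future work and gives no design. One common approach, shown here purely as an assumption, is to hash the flag name and entity ID into a stable bucket from 0 to 99 and enable the flag for buckets below the rollout percentage. The function names are hypothetical and unrelated to the Belgrano API.

```javascript
const crypto = require('crypto');

// Maps (flag, entityId) to a stable bucket in the range 0-99.
function bucketFor(flag, entityId) {
  const digest = crypto.createHash('sha256').update(`${flag}:${entityId}`).digest();
  return digest.readUInt32BE(0) % 100;
}

// True when the entity falls inside the rollout percentage (0-100).
function isEnabled(flag, entityId, rolloutPercent) {
  return bucketFor(flag, entityId) < rolloutPercent;
}
```

Hashing keeps the decision stable for a given entity, so raising the percentage only adds entities and never flips one that was already enabled.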

  45. Rate limiting
    • Dynamic configuration
    • Support for “concurrent requests”

  46. We’re hiring
    https://auth0.com/careers

  47. Links
    • https://medium.com/@copyconstruct/testing-in-production-the-safe-way-18ca102d0ef1
    • https://landing.google.com/sre/book/index.html
    • https://martinfowler.com/articles/feature-toggles.html
    • https://redis.io/topics/cluster-tutorial
    • https://stripe.com/blog/rate-limiters

  48. Links
    • http://zachholman.com/talk/move-fast-break-nothing/
    • https://auth0.com/blog/2015/10/27/feature-changes-at-auth0/
    • https://github.com/dschenkelman/feature-change

  49. Gracias!
    https://github.com/dschenkelman/wind-of-change-talk
    @dschenkelman
