The Fallacy of Fast

Ines Sombra

April 28, 2016

Transcript

  1. Obligatory Disclaimer

    Things you will see in this talk: Fast Talking & Opinions TM, un-tweetable moments, Rantifestos TM, questionable language, a therapy llama, and ZERO Kevin Bacons. @Randommood
  2. De-prioritizing Testing

    Cutting corners on testing carries a hidden cost. Test the full system: client, code, & provisioning code. Code reviews != tests; have both. Continuous Integration (CI) is critical to velocity, quality, & transparency.
  3. De-prioritizing Releases

    Release stability is tied to system stability. Iron out your deploy process! Dependencies on other systems make this even more important. Canary testing, dark launches, feature flags, etc. are good.
  4. De-prioritizing Ops

    Automation shortcuts taken while in a rush will come back to haunt you. Playbooks are a must have. Localhost is the devil. Sloppy operational work is the mother of all evils.
  5. De-prioritizing Insight

    “Future you monitoring” is bad, make it part of MVP. Alert fatigue has a high cost, don’t let it get that far. Link alerts to playbooks. Routinely test your escalation paths.
  6. De-prioritizing Knowledge

    The inner workings of data components matter. Learn about them. System boundaries ought to be made explicit. Deprecate your go-to person.
  7. De-prioritizing Security

    The internet is an awful place. Expect DoS/DDoS. Think about your system, its connections, and their dependencies. Having the ability to turn off features/clients helps.
  8. What we learned

    Service ownership implies leveling-up operationally. Architectural choices made in a rush can have a long shelf life. Don’t sacrifice tests. Test the FULL system.
  9. Building shrines of Agile

    Assuming a given methodology will solve everything is naive at best. Magical thinking leads to misaligned expectations. All tools are terrible, avoid religious wars.
  10. “When to Scrum?”

    [Figure: “When to Scrum?” chart, stolen from Angela Druckman. Axes: Requirements and Technology. Scale labels: close to agreement, far from agreement, close to certainty, far from certainty. Regions: Simple, Complicated, Complex, Anarchy. Label: Scrum.]
  11. “In truth a range of approaches, a hybrid mix, of management methods is required to succeed in today's enterprise IT environment. That customer enterprise environment never was like the simplified product development environment where Agile software development was conceived…”
  12. “In truth a range of approaches, a hybrid mix, of management methods is required to succeed in today's enterprise IT environment. That customer enterprise environment never was like the simplified product development environment where Agile software development was conceived…”

    DUH
  13. Agile Gotchas

    Uncertainty in problem domain (and company size) will challenge your ability to adhere to it. Has a cost but it’s different. Nihilism FTW?
  14. Mind system Design

    Simple & utilitarian design takes you a long way. Use well understood components. NIH is a double edged sword. Use feature flags & on/off switches (test them!).
  15. Alice’s Testing Areas

    Areas and goals. Correctness: good output from good inputs. Error: reasonable reaction to incorrect input. Performance: Time to Task (TTT) for single node, multi node, clustered, cache enabled. Robustness: behavior after a given # of input/outputs, a given uptime.
  16. A Testing Harness

    Is a fantastic thing to have. Invest in QA automation engineers. Adding support for regressions & domain-specific testing pays off.
  17. Mind system Configs

    System assumptions are dangerous, make them explicit. Standardize system configuration (data bags, config file, etc.). Hardcoding is the devil.
  18. Mind system Limits

    Rate limit your API calls, especially if they are public or expensive to run. Instrument / add metrics to track them. Rank your services & data (what can you drop?). Capacity analysis is not dead.
  19. Mind system Growth

    Watch out for initial over-architecting. “The application that takes you to 100k users is not the same one that takes you to 1M, and so on…” @netik. Expect changes & refactors.
  20. Mind Process

    Architectural reviews FTW. Request flow, API shape, failure conditions, reliability, data model, threat modeling, testing strategy, operations, monitoring, logging & alerting, pricing/billing, supported clients, etc.
  21. Mind Resources

    Redundancies of resources, execution paths, checks, replication of data, replay of messages, anti-entropy build resilience. Mechanisms to guard system resources are good to have. Your system is also tied to the resources of its dependencies.
  22. Distrust is healthy

    Distrust client behavior, even if they are internal. Decisions have an expiration date. Periodically re-evaluate them as past you was much dumber. A revisionist culture produces more resilient systems.
  23. About Resilience

    [Figure, stolen from Paul Borrill: probability of failure plotted against rank, with regions labeled traditional engineering, reactive ops, and unk-unk.] Annotation: cascading or catastrophic failures & you don’t know where they will come from! Same area as other 2 combined.
  24. Building Resilience

    Across classical engineering, reactive operations, and unk-unk: code standards, programming patterns, testing (full system!), metrics & monitoring, convergence to good state, hazard inventories, redundancies, feature flags, dark deploys, runbooks & docs, canaries, system verification, formal methods, fault injection. The goal is to build failure domain independence.
  25. What we learned

    Keep track of your technical debt & repay it regularly. It’s about lowering the risk of change with tools & culture. Mind assumptions.
  26. TL;DR

    Easy to sacrifice things may be harder to correct later. Think in terms of tradeoffs. TESTING MATTERS! Not all process is evil. Keep in Mind: make system boundaries & dependencies explicit. Playbooks are your friends, have them. Use kill switches & limits. Prioritize your services. Distributed systems
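
The advice to use kill switches & limits could be sketched roughly as follows. This is a minimal, in-memory illustration and not anything shown in the talk: the names `TokenBucket`, `KillSwitches`, and `handle_request` are hypothetical, and a token bucket is only one of several ways to rate limit API calls.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Hypothetical rate limiter: bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = capacity        # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class KillSwitches:
    """Hypothetical on/off switches; unknown features default to off."""

    def __init__(self) -> None:
        self._enabled: Dict[str, bool] = {}

    def set(self, feature: str, enabled: bool) -> None:
        self._enabled[feature] = enabled

    def is_enabled(self, feature: str) -> bool:
        return self._enabled.get(feature, False)


def handle_request(feature: str, switches: KillSwitches,
                   bucket: TokenBucket) -> str:
    # Check the kill switch first so a disabled feature costs no tokens.
    if not switches.is_enabled(feature):
        return "disabled"
    if not bucket.allow():
        return "rate_limited"
    return "ok"
```

Defaulting unknown features to off and injecting the clock are design choices made for this sketch: the first keeps a misspelled feature name from silently enabling anything, and the second makes the switch and the limit easy to test, in line with the talk's "test them!" advice.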