Slide 1

Slide 1 text

The Fallacy of Fast

Slide 2

Slide 2 text

Ines Sombra @Randommood

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Obligatory Disclaimer
Things you will see in this talk:
Fast Talking & Opinions™
Un-tweetable moments
Rantifestos™
Questionable language
A therapy llama
and ZERO Kevin Bacons

Slide 5

Slide 5 text

Common Mistakes
What Matters

Slide 6

Slide 6 text

SPOILER alert

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

Reason about design choices in terms of trade-offs

Slide 9

Slide 9 text

Chosen trade-offs form the foundation of your system

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

Common Mistakes

Slide 15

Slide 15 text

Accidentally de-emphasizing long-term quality & system stability

Slide 16

Slide 16 text

De-prioritizing Testing
Cutting corners on testing carries a hidden cost
Test the full system: client, code, & provisioning code
Code reviews != tests. Have both
Continuous Integration (CI) is critical to velocity, quality, & transparency

Slide 17

Slide 17 text

De-prioritizing Releases
Release stability is tied to system stability
Iron out your deploy process! Dependencies on other systems make this even more important
Canary testing, dark launches, feature flags, etc. are good

Slide 18

Slide 18 text

De-prioritizing Ops
Automation shortcuts taken while in a rush will come back to haunt you
Playbooks are a must-have
Localhost is the devil
Sloppy operational work is the mother of all evils

Slide 19

Slide 19 text

De-prioritizing Insight ✨
“Future you” monitoring is bad; make monitoring part of the MVP
Alert fatigue has a high cost, don’t let it get that far
Link alerts to playbooks
Routinely test your escalation paths
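
One way to make "link alerts to playbook" hard to skip is to refuse to create an alert rule that has no playbook reference. The following is a minimal sketch under assumptions: `AlertRule`, `evaluate`, and the playbook path are hypothetical names for illustration, not part of the talk or of any monitoring product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical alert definition that must point at a playbook."""
    name: str
    threshold: float
    playbook: str  # where the on-call engineer finds the response steps

    def __post_init__(self):
        # Reject rules without a playbook so no alert ships undocumented.
        if not self.playbook.strip():
            raise ValueError(f"alert {self.name!r} has no playbook")


def evaluate(rule: AlertRule, value: float) -> Optional[str]:
    """Return an alert message that includes the playbook, or None if healthy."""
    if value > rule.threshold:
        return f"{rule.name}: {value} > {rule.threshold}. Playbook: {rule.playbook}"
    return None


# Example usage with a made-up metric and a made-up playbook path.
rule = AlertRule("queue_depth", 100, "playbooks/queue-depth.md")
message = evaluate(rule, 150)
```

Because the check lives in the rule's constructor, a missing playbook fails at definition time rather than during an incident.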

Slide 20

Slide 20 text

De-prioritizing Knowledge
The inner workings of data components matter. Learn about them
System boundaries ought to be made explicit
Deprecate your go-to person

Slide 21

Slide 21 text

De-prioritizing Security
The internet is an awful place. Expect DoS/DDoS
Think about your system, its connections, and their dependencies
Having the ability to turn off features/clients helps

Slide 22

Slide 22 text

What we learned ✨
Service ownership implies leveling up operationally
Architectural choices made in a rush can have a long shelf life
Don’t sacrifice tests. Test the FULL system

Slide 23

Slide 23 text

RANTIFESTO!

Slide 24

Slide 24 text

Building shrines of Agile
Assuming a given methodology will solve everything is naive at best
Magical thinking leads to misaligned expectations
All tools are terrible, avoid religious wars

Slide 25

Slide 25 text

[Figure: “When to Scrum?” chart, stolen from Angela Druckman. Axes: Requirements (close to agreement to far from agreement) and Technology (close to certainty to far from certainty). Regions: Simple, Complicated, Complex, Anarchy. Scrum is labeled on the chart.]

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

“In truth a range of approaches, a hybrid mix, of management methods is required to succeed in today's enterprise IT environment. That customer enterprise environment never was like the simplified product development environment where Agile software development was conceived…”

Slide 32

Slide 32 text

DUH

Slide 33

Slide 33 text

Agile Gotchas
Uncertainty in problem domain (and company size) will challenge your ability to adhere to it
Has a cost but it’s different
Nihilism FTW?

Slide 34

Slide 34 text

?

Slide 35

Slide 35 text

What Matters

Slide 36

Slide 36 text

Mind System Design
Simple & utilitarian design takes you a long way
Use well-understood components
NIH is a double-edged sword
Use feature flags & on/off switches (test them!)
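
The advice on feature flags and on/off switches, including testing them, can be sketched as follows. This is an illustrative in-memory version only: `FeatureFlags`, `render_homepage`, and the `recommendations` flag are hypothetical names, and a real system would likely back the flags with shared configuration.

```python
class FeatureFlags:
    """In-memory feature flags with a default-off policy (hypothetical helper)."""

    def __init__(self, initial=None):
        self._flags = dict(initial or {})

    def is_enabled(self, name):
        # Unknown flags count as off, so a typo cannot enable a feature.
        return self._flags.get(name, False)

    def set(self, name, enabled):
        self._flags[name] = bool(enabled)


def render_homepage(flags):
    """Build the page sections; the optional feature sits behind a switch."""
    sections = ["feed"]
    if flags.is_enabled("recommendations"):
        sections.append("recommendations")
    return sections


def test_switch_both_ways():
    # "Test them!": exercise the code path with the flag on AND off.
    flags = FeatureFlags({"recommendations": True})
    assert render_homepage(flags) == ["feed", "recommendations"]
    flags.set("recommendations", False)
    assert render_homepage(flags) == ["feed"]


test_switch_both_ways()
```

Testing the off path matters as much as the on path, since the off path is the one exercised when the switch is flipped during an incident.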

Slide 37

Slide 37 text

Meet Alice
“I’m way too cool for this outfit”

Slide 38

Slide 38 text

Alice’s Testing Areas
Area: goal
Correctness: good output from good inputs
Error: reasonable reaction to incorrect input
Performance: Time to Task (TTT) for single node, multi node, clustered, cache enabled
Robustness: behavior after a given # of inputs/outputs, a given uptime
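
The four testing areas can be illustrated with one small check per area. This is only a sketch under assumptions: `parse_port` is a hypothetical function under test, and the check functions are made-up names rather than anything from the talk.

```python
import time


def parse_port(text):
    """Hypothetical function under test: parse a TCP port number."""
    port = int(text)  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def check_correctness():
    # Correctness: good output from good inputs.
    return parse_port("8080") == 8080


def check_error_handling():
    # Error: a reasonable reaction (ValueError) to incorrect input.
    for bad in ("", "http", "70000", "-1"):
        try:
            parse_port(bad)
        except ValueError:
            continue
        return False
    return True


def time_to_task(calls=1000):
    # Performance: elapsed seconds to finish a fixed amount of work.
    start = time.perf_counter()
    for _ in range(calls):
        parse_port("8080")
    return time.perf_counter() - start


def check_robustness(calls=10_000):
    # Robustness: behavior is still correct after a given number of inputs.
    return all(parse_port(str(1 + i % 65535)) == 1 + i % 65535
               for i in range(calls))
```

A test harness would run checks like these against each deployment shape (single node, multi node, clustered, cache enabled) rather than against one function.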

Slide 39

Slide 39 text

A Testing Harness
Is a fantastic thing to have
Invest in QA automation engineers
Adding support for regressions & domain-specific testing pays off

Slide 40

Slide 40 text

Mind System Configs
System assumptions are dangerous, make them explicit
Standardize system configuration (data bags, config file, etc.)
Hardcoding is the devil
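
One possible way to make assumptions explicit instead of hardcoding them is to load every setting from a single standardized source and validate it at startup. The sketch below assumes a JSON config; `ServiceConfig`, `load_config`, and all field names are hypothetical.

```python
import json
from dataclasses import dataclass

REQUIRED_KEYS = ("db_host", "db_port", "request_timeout_s")


@dataclass(frozen=True)
class ServiceConfig:
    """Every assumption the service makes, spelled out in one place."""
    db_host: str
    db_port: int
    request_timeout_s: float


def load_config(text):
    """Parse and validate a JSON config; fail loudly instead of guessing."""
    raw = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in raw]
    if missing:
        # No silent defaults: a missing value is an error, not an assumption.
        raise ValueError(f"missing config keys: {missing}")
    cfg = ServiceConfig(
        db_host=str(raw["db_host"]),
        db_port=int(raw["db_port"]),
        request_timeout_s=float(raw["request_timeout_s"]),
    )
    if not 0 < cfg.db_port < 65536:
        raise ValueError(f"db_port out of range: {cfg.db_port}")
    if cfg.request_timeout_s <= 0:
        raise ValueError("request_timeout_s must be positive")
    return cfg


# Example usage with made-up values.
cfg = load_config('{"db_host": "db.internal", "db_port": 5432, "request_timeout_s": 2.5}')
```

Failing at startup on a missing key is a design choice: it trades a little convenience for never running on an unstated default.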

Slide 41

Slide 41 text

Mind System Limits
Rate limit your API calls, especially if they are public or expensive to run
Instrument / add metrics to track them
Rank your services & data (what can you drop?)
Capacity analysis is not dead ✨
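
Rate limiting plus instrumentation can be sketched with a token bucket that also counts its decisions. The talk does not name an algorithm, so the token bucket is a swapped-in choice; `TokenBucket` and its counters are hypothetical names, and the clock is injectable so the limiter can be tested without sleeping.

```python
import time


class TokenBucket:
    """Token-bucket rate limiter with simple allow/reject counters."""

    def __init__(self, rate_per_s, capacity, clock=time.monotonic):
        self.rate = rate_per_s      # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable for deterministic tests
        self.tokens = float(capacity)
        self.last = clock()
        self.allowed = 0            # metric: calls let through
        self.rejected = 0           # metric: calls turned away

    def allow(self, cost=1.0):
        """Return True if a call costing `cost` tokens may proceed."""
        now = self.clock()
        # Refill according to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            self.allowed += 1
            return True
        self.rejected += 1
        return False


# Example usage with a fake clock: 1 token per second, burst of 2.
fake_now = [0.0]
bucket = TokenBucket(rate_per_s=1.0, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0  # one second later, one token has been refilled
after_refill = bucket.allow()
```

Expensive endpoints can be charged a higher `cost` per call, and the `allowed` and `rejected` counters give a starting point for the metrics the slide asks for.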

Slide 42

Slide 42 text

Mind System Growth
Watch out for initial over-architecting
“The application that takes you to 100k users is not the same one that takes you to 1M, and so on…” @netik
Expect changes & refactors

Slide 43

Slide 43 text

Mind Process
Architectural reviews FTW
Request flow, API shape, failure conditions, reliability, data model, threat modeling, testing strategy, operations, monitoring, logging & alerting, pricing/billing, supported clients, etc.

Slide 44

Slide 44 text

Mind Resources
Redundancies of resources, execution paths, and checks, replication of data, replay of messages, and anti-entropy build resilience
Mechanisms to guard system resources are good to have
Your system is also tied to the resources of its dependencies

Slide 45

Slide 45 text

Distrust is healthy
Distrust client behavior, even if they are internal
Decisions have an expiration date. Periodically re-evaluate them, as past you was much dumber
A revisionist culture produces more resilient systems ✨

Slide 46

Slide 46 text

[Figure: “About Resilience” chart, stolen from Paul Borrill. Axes: probability of failure vs. rank. Regions: traditional engineering, reactive ops, and unk-unk (cascading or catastrophic failures & you don’t know where they will come from! Same area as the other 2 combined).]

Slide 47

Slide 47 text

Building Resilience
Categories: classical engineering, reactive operations, unk-unk
Code standards, programming patterns, testing (full system!), metrics & monitoring, convergence to good state, hazard inventories, redundancies, feature flags, dark deploys, runbooks & docs, canaries, system verification, formal methods, fault injection
The goal is to build failure domain independence

Slide 48

Slide 48 text

What we learned ✨
Keep track of your technical debt & repay it regularly
It’s about lowering the risk of change with tools & culture
Mind assumptions

Slide 49

Slide 49 text

TL;DR
Distributed systems
Easy-to-sacrifice things may be harder to correct later
Think in terms of trade-offs
TESTING MATTERS!
Not all process is evil
Keep in Mind
Make system boundaries & dependencies explicit
Playbooks are your friends, have them
Use kill switches & limits
Prioritize your services

Slide 50

Slide 50 text

github.com/Randommood/FallacyOfFast @Randommood