Slide 1

Slide 1 text

Release It! Design and Deploy Production-Ready Software
by Michael T. Nygard
Jake Trent
18 Aug 2009

Slide 2

Slide 2 text

The Book
Is thoughtful reading
● Prompts evaluation of:
  ○ Self
  ○ Project
  ○ Company
● Pragprog.com

Slide 3

Slide 3 text

Motivation
For quality, production-ready software
● Go home at night without calls to your cell phone
● Avoid unneeded costs:
  ○ Downtime costs
  ○ Opportunity costs
  ○ Operational costs
  ○ Legal costs
● Respectable work

Slide 4

Slide 4 text

The Need
For a periodic reminder of these issues
● Code that passes QA:
  ○ Can still fail miserably
  ○ Can still give users bad impressions
  ○ Can still inflict avoidable costs
● Problems will happen

Slide 5

Slide 5 text

Stability Antipatterns

Slide 6

Slide 6 text

Integration Points Antipattern
● Modern display of coupling: systems talking to systems
● Every contact point is a possible failure point
● Things not under your control:
  ○ Network reliability
  ○ Data availability
  ○ External system correctness

Slide 7

Slide 7 text

Chain Reactions Antipattern
● One failure often triggers another
● Resource availability is often the catalyst
● Can turn into the system attacking itself

Slide 8

Slide 8 text

Cascading Failures Antipattern
● Failure in one layer causes problems in callers
● Insufficiently paranoid integration points

Slide 9

Slide 9 text

Users Antipattern
● "Users of a system have this knack for creative destruction."
● Each user consumes more memory
● Some are a burden, others plain malicious

Slide 10

Slide 10 text

Blocked Threads Antipattern
● "Adding complexity to solve one problem creates the risk of entirely new failure modes."
● Resource pool contention
● Beware third-party APIs
● Timeouts

Slide 11

Slide 11 text

Attacks of Self-Denial Antipattern
● Plan for your own success
● "Good marketing can kill you at any time."

Slide 12

Slide 12 text

Scaling Effects Antipattern
● Communication overhead under horizontal scaling
● Shared resource bottleneck

Slide 13

Slide 13 text

Unbalanced Capacities Antipattern
● Performance will depend on your most constrained resource
● Not often discovered by QA
● Consider proportions of types of transactions

Slide 14

Slide 14 text

Slow Responses Antipattern
● Better to fail fast than to hog resources only to eventually fail

Slide 15

Slide 15 text

SLA Inversion Antipattern
● "When calling third parties, service levels only decrease."
● Consider real need and real cost
● Service level can only be as high as the lowest subsystem
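The "service levels only decrease" point can be checked with a little arithmetic. This is a rough sketch that assumes every dependency is required and that failures are independent; the function name and the availability figures are made up for illustration.

```python
from math import prod

def composite_availability(own, dependencies):
    """Best-case availability when every dependency must be up.

    Assumes independent failures, which is a simplification.
    """
    return own * prod(dependencies)

# Hypothetical figures: our own service plus two third parties.
own_sla = 0.999
third_parties = [0.995, 0.99]
result = composite_availability(own_sla, third_parties)

# The combined figure falls below even the weakest single component.
print(f"{result:.4%}")
```

Under these assumptions the weakest subsystem is an upper bound, and each extra required dependency pushes the real figure lower.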

Slide 16

Slide 16 text

Unbounded Result Sets Antipattern
● Tests use unrealistically small data sets
● Use limits on all queries
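One way to put a limit on every query is to cap the row count at the data-access layer. The sketch below uses Python's standard `sqlite3` module; the `orders` table, the `fetch_orders` helper, and the cap of 100 rows are assumptions for illustration, not something from the book.

```python
import sqlite3

MAX_ROWS = 100  # hypothetical hard cap for any single query

def fetch_orders(conn, limit=MAX_ROWS):
    """Return at most MAX_ROWS rows, whatever the caller asks for."""
    # Clamp to [0, MAX_ROWS]; a negative LIMIT in SQLite means "no limit".
    limit = max(0, min(limit, MAX_ROWS))
    cur = conn.execute(
        "SELECT id, total FROM orders ORDER BY id LIMIT ?", (limit,)
    )
    return cur.fetchall()

# Demo with a production-sized table rather than a tiny test fixture.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO orders (total) VALUES (?)",
    [(float(i),) for i in range(1000)],
)
rows = fetch_orders(conn)
```

A caller that needs more than one page would ask for the next page explicitly instead of pulling the whole table into memory.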

Slide 17

Slide 17 text

Stability Patterns

Slide 18

Slide 18 text

Use Timeouts Pattern
● Prevent integration points from becoming blocked threads
● Retry for potentially transient timeouts
● Lets the caller move on without a response (fail fast)
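A minimal sketch of a timeout with a bounded retry, using the standard `concurrent.futures` module. The helper name `call_with_timeout`, the pool size, and the timings are assumptions. Note that the worker thread is not killed on timeout; the caller simply stops waiting and moves on.

```python
import concurrent.futures
import time

# Shared worker pool for outbound calls (size is an arbitrary choice).
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def call_with_timeout(func, timeout_s, retries=1):
    """Run func, waiting at most timeout_s per attempt.

    Retries cover transient slowness; after the last attempt the
    caller gets a TimeoutError instead of blocking forever.
    """
    last_error = None
    for _ in range(retries + 1):
        future = _POOL.submit(func)
        try:
            return future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError as exc:
            future.cancel()  # only effective if the task has not started
            last_error = exc
    raise TimeoutError(f"gave up after {retries + 1} attempts") from last_error

# Hypothetical integration point: slow on the first call only.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] == 1:
        time.sleep(0.3)
    return "ok"

answer = call_with_timeout(flaky, timeout_s=0.1, retries=1)
```

For real network calls, a timeout set on the socket or HTTP client itself is usually preferable, since it also frees the blocked worker.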

Slide 19

Slide 19 text

Circuit Breaker Pattern
● Prevent operations rather than re-execute them
● Note each failure until the switch is flipped
● Use with timeouts: try again eventually
● Visible to operations
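The bullets above map onto a small state machine. This is one possible sketch, not the book's implementation; the class name, the thresholds, and the injectable `clock` parameter (there to make testing easy) are all assumptions.

```python
import time

class CircuitOpenError(RuntimeError):
    """Raised when a call is refused because the breaker is open."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    @property
    def state(self):
        """Expose the state so operations can monitor it."""
        if self.opened_at is None:
            return "closed"
        if self._clock() - self.opened_at >= self.reset_after_s:
            return "half-open"  # allow one trial call
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            # Prevent the operation instead of re-executing it.
            raise CircuitOpenError("circuit open; call refused")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1  # note each failure
            if (self.state == "half-open"
                    or self.failures >= self.failure_threshold):
                self.opened_at = self._clock()  # flip the switch
            raise
        # Success closes the breaker and clears the count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each integration point, for example `breaker.call(fetch_inventory, sku)`, where `fetch_inventory` is a hypothetical remote call that already has its own timeout.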

Slide 20

Slide 20 text

Bulkheads Pattern
● Find natural partitions:
  ○ Thread groups
  ○ Resource pools
  ○ Hardware
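As a sketch of the "thread groups" partition, each kind of work below gets its own bounded pool, so a flood of stuck tasks in one partition cannot starve the other. The partition names and pool sizes are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One bounded pool per partition (names and sizes are assumptions).
BULKHEADS = {
    "checkout": ThreadPoolExecutor(max_workers=4, thread_name_prefix="checkout"),
    "reports": ThreadPoolExecutor(max_workers=2, thread_name_prefix="reports"),
}

def submit(partition, func, *args):
    """Run func on the pool that belongs to the given partition."""
    return BULKHEADS[partition].submit(func, *args)

# Demo: saturate "reports" with tasks that block until the gate opens.
gate = threading.Event()
stuck = [submit("reports", gate.wait) for _ in range(5)]

# "checkout" has its own threads, so it still completes promptly.
checkout_result = submit("checkout", lambda: "paid").result(timeout=2)

gate.set()  # release the blocked report tasks
```

With a single shared pool of the same total size, the blocked report tasks could have occupied every worker and stalled checkout as well.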

Slide 21

Slide 21 text

Steady State Pattern
● System should run without manual intervention
● Human fiddling leads to error
● Purge data
● Roll logs
● At the least, move them out of the production environment
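"Roll logs" can be handled by the logging library rather than by a person. The sketch below uses `RotatingFileHandler` from Python's standard library; the logger name, file name, and size limits are arbitrary choices for the demo.

```python
import logging
import logging.handlers
import os
import tempfile

def make_rolling_logger(path, max_bytes=1024, backups=3):
    """Logger whose file rolls over by size and keeps a fixed number of backups."""
    logger = logging.getLogger("steady_state_demo")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    logger.addHandler(handler)
    return logger

# Demo: write far more than the size limit and see what is left on disk.
log_dir = tempfile.mkdtemp()
logger = make_rolling_logger(os.path.join(log_dir, "app.log"))
for i in range(1000):
    logger.info("request %d handled", i)

log_files = sorted(os.listdir(log_dir))
```

Disk usage stays bounded at roughly `max_bytes * (backups + 1)` no matter how long the process runs, which is the steady-state property the slide asks for.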

Slide 22

Slide 22 text

Fail Fast Pattern
● Check availability before attempted use
● Basic parameter checking before loading expensive objects
● "Don't do useless work"
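A small sketch of the ordering implied above: cheap checks first, the expensive work last. The function, its parameters, and the page-size bounds are hypothetical; `db_is_up` and `load_report` stand in for a real health check and a real expensive call.

```python
class Unavailable(RuntimeError):
    """Raised when a required resource is known to be down."""

def render_report(user_id, page_size, db_is_up, load_report):
    # 1. Basic parameter checks cost almost nothing, so do them first.
    if not isinstance(user_id, int) or user_id <= 0:
        raise ValueError("user_id must be a positive integer")
    if not 1 <= page_size <= 500:
        raise ValueError("page_size must be between 1 and 500")

    # 2. Check availability before attempting to use the resource.
    if not db_is_up():
        raise Unavailable("database is down; refusing to start the report")

    # 3. Only now do the expensive work.
    return load_report(user_id, page_size)

# Demo: count how often the expensive loader actually runs.
expensive_calls = []

def fake_loader(user_id, page_size):
    expensive_calls.append((user_id, page_size))
    return f"report for {user_id}"

ok = render_report(7, 50, db_is_up=lambda: True, load_report=fake_loader)
```

A bad request or a known outage is rejected before any expensive object is loaded, so the failure is both quick and cheap.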

Slide 23

Slide 23 text

Handshaking Pattern
● Allow integration points to throttle themselves

Slide 24

Slide 24 text

Test Harness Pattern
● Box independent of the "norms" of the environment
● As devious as possible, especially at the network level
● Out-of-spec
● Stress

Slide 25

Slide 25 text

Decoupling Middleware Pattern
● Decide on the plumbing at the "last responsible moment"
● Hardest to change later

Slide 26

Slide 26 text

"Paranoia is just good thinking."
"It's unlikely that anyone will notice your system's lack of downtime."
Michael T. Nygard

Slide 27

Slide 27 text

The End
aprilandjake.com/content/release-it-stability-review/