@GanbaroDigital
Lessons I’ve learned
from nearly a quarter of a century
of making things robust.
Slide 8
I am still learning.
Slide 9
What Is Robustness?
Slide 10
Let’s make sure
we’re all talking
about the same thing.
Slide 11
What do you think
robustness is?
Slide 12
Have you heard of
“happy path” programming?
Slide 16
The “happy path”
only deals with correctness
under perfect conditions.
Slide 17
This is a world without
robustness.
Slide 18
Correctness bugs
are logic flaws that
produce unacceptable results.
Slide 19
The happy path works
when correctness bugs
have been found
and resolved.
Slide 20
Robustness code
deals with all the things
that shouldn’t happen,
but inevitably do.
Slide 26
Robustness bugs
stop perfectly-good logic
from working.
Slide 27
Without robustness,
your code runs on
“There But For The Grace of God”
Slide 30
Add robustness up-front
or deal with the consequences
in production.
Slide 31
How do you
make your applications
more robust?
Slide 32
Here are five patterns
that have been
very successful for me.
Slide 33
These patterns work best
when used together.
Slide 34
I’m going to illustrate each pattern
using very simple examples.
Slide 35
Robustness patterns are
micro-patterns.
Slide 36
Pattern 1
Slide 37
What do you use
returned values for?
Slide 40
Nobody checks
returned data
100% reliably.
Slide 41
Nobody does
anything
100% reliably.
Slide 42
Sometimes, they don’t
check it safely.
Slide 43
if (!$position)
vs
if (false === $position)
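The difference matters for any function that can legitimately return a falsy value. Here is a minimal sketch using PHP's built-in strpos(), which returns 0 for a match at the start of the string and false for no match. The helper names containsLoose() and containsStrict() are made up for this illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical helpers, named here only for illustration.

// Unsafe: a match at offset 0 is treated as "not found", because !0 is true.
function containsLoose(string $haystack, string $needle): bool
{
    $position = strpos($haystack, $needle);
    if (!$position) {
        return false;
    }
    return true;
}

// Safe: only the boolean false means "not found".
function containsStrict(string $haystack, string $needle): bool
{
    $position = strpos($haystack, $needle);
    if (false === $position) {
        return false;
    }
    return true;
}
```

With these definitions, containsLoose('robust', 'rob') wrongly reports false, while containsStrict('robust', 'rob') reports true.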
Slide 44
Sometimes, they don’t
check it at all.
Slide 45
Eliminate the need
to check return values.
Slide 49
What do you do
when a robustness problem
has been detected?
Slide 52
Use exceptions
when normal flow
cannot continue.
Slide 53
What about returning NULL?
Slide 54
NULL
is not an error report
Slide 55
Stop using true, false and NULL as
error responses
Slide 56
Return NULL
only when “no data”
is a valid response.
Slide 57
Throw an exception
when “no data”
is not a valid response.
Slide 58
Succeed,
Except When You Fail
Slide 59
1. Build APIs that succeed
2. Throw exceptions when they can’t
3. Use NULL only when it’s a valid data value
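The three rules above could look something like this in practice. This is only a sketch, assuming PHP 7.4 or later. UserDirectory, UserNotFound and the record layout are hypothetical names invented for the example, not code from the talk.

```php
<?php
declare(strict_types=1);

// Hypothetical exception type for this sketch.
class UserNotFound extends RuntimeException
{
}

// Hypothetical in-memory directory, standing in for a real data store.
class UserDirectory
{
    /** @var array<int, array{name: string, nickname: ?string}> */
    private array $users;

    public function __construct(array $users)
    {
        $this->users = $users;
    }

    // Either returns a user record or throws.
    // The caller has no error value to remember to check.
    public function getUser(int $id): array
    {
        if (!array_key_exists($id, $this->users)) {
            throw new UserNotFound("no user with id {$id}");
        }
        return $this->users[$id];
    }

    // NULL is a valid data value here: a user may simply have no nickname.
    public function getNickname(int $id): ?string
    {
        return $this->getUser($id)['nickname'];
    }
}
```

A NULL from getNickname() always means "this user has no nickname". It never means "something went wrong", because a missing user surfaces as a UserNotFound exception instead.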
Slide 60
Pattern 2
Slide 61
What breaks
perfectly-good logic?
Slide 65
Type mismatches
break perfectly good logic.
Slide 66
Standard solution:
check incoming data
before using it.
Slide 67
PHP 7 extended
type-hinting to help.
Slide 69
Type-hinting isn’t a silver bullet.
Slide 70
Type-hinting only tells you
if something is
the right shape.
Type-hinting does not
validate your data.
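To make the shape-versus-validity distinction concrete, here is a small sketch. The function percentOff() and its 0 to 100 rule are assumptions made up for the example. The int declaration rejects a string or an array, but it happily accepts -20 or 250.

```php
<?php
declare(strict_types=1);

// Hypothetical function: the type declarations guarantee the shape only.
function percentOff(int $percent, float $price): float
{
    // The type is right, but the value may still be nonsense (e.g. -20 or 250),
    // so the condition has to be checked separately.
    if ($percent < 0 || $percent > 100) {
        throw new InvalidArgumentException(
            "percent must be between 0 and 100, got {$percent}"
        );
    }
    return $price * (100 - $percent) / 100;
}
```

Without the range check, percentOff(250, 10.0) would pass the type check and quietly return a negative price.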
Slide 72
What would you do
about this?
Slide 73
Checks and Requirements
can help.
Slide 76
Unsupported outputs
can break logic.
Slide 79
Unsupported state
can break logic.
Slide 81
Unhandled errors
and uncaught exceptions
ultimately crash code.
Slide 83
Your code can crash and burn,
or it can fail gracefully.
Pick one.
Slide 84
Step 1:
Know that you need to fail.
Slide 85
Step 2:
Fail deliberately.
Slide 86
Fail Deliberately
Slide 87
1. Check everything before you use it
2. Check both type and condition
3. Don’t propagate bad data
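One possible reading of these three rules, as a sketch. The function names requireQuantities() and totalQuantity() are hypothetical, and this is plain PHP rather than any particular checks-and-requirements library.

```php
<?php
declare(strict_types=1);

// Hypothetical requirement: throws unless $value is a non-empty array
// of positive integers. Returns the value unchanged when it is acceptable.
function requireQuantities($value): array
{
    // Check the type first ...
    if (!is_array($value)) {
        throw new InvalidArgumentException(
            'quantities must be an array, got ' . gettype($value)
        );
    }
    // ... then the condition.
    if (count($value) === 0) {
        throw new InvalidArgumentException('quantities must not be empty');
    }
    foreach ($value as $key => $quantity) {
        if (!is_int($quantity) || $quantity < 1) {
            throw new InvalidArgumentException(
                "quantity at key {$key} must be a positive integer"
            );
        }
    }
    return $value;
}

function totalQuantity($quantities): int
{
    // Fail here, deliberately, rather than letting bad data flow onwards.
    return array_sum(requireQuantities($quantities));
}
```

The point of the design is that bad data stops at the first function that sees it, with a message that names the problem, instead of producing a confusing error several calls later.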
Slide 88
Pattern 3
Slide 89
When a method cannot continue,
what do you do?
Slide 91
How often do you catch
exceptions in your code?
Slide 92
Nobody does
anything
100% reliably.
Slide 93
How often do you catch
exceptions thrown
by someone else’s code?
Slide 94
No-one catches exceptions
they didn’t throw themselves.
Slide 95
When code must fail,
can a library know
what must happen next?
Slide 96
A library, by its nature,
cannot know the context
it is being used in.
Slide 97
A library can decide
that it must fail.
Failure is a technical situation.
Slide 98
The app must decide
what to do on failure.
What to do is a matter of policy.
Slide 99
Policy belongs in
application code,
not libraries or packages.
Slide 101
What if we injected policy
from the app
into the library?
Slide 103
Inversion of Control:
library decides whether to fail,
app decides how to fail.
Slide 104
For example:
throw an exception
containing the HTTP response
Slide 106
$onFailure can provide
a default RETURN value
instead.
Slide 107
Tell Me How To Fail
Slide 108
1. Support a $onFailure callback
2. Let it throw any exceptions
3. Let it produce a return value
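A sketch of what an $onFailure callback might look like. Only the $onFailure name comes from the slides. parsePort(), the two example policies and the use of 400 as an exception code are assumptions for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical library function: it decides WHETHER it must fail.
// The caller-supplied $onFailure decides HOW.
function parsePort(string $input, callable $onFailure)
{
    if (ctype_digit($input)) {
        $port = (int) $input;
        if ($port >= 1 && $port <= 65535) {
            return $port;
        }
    }
    // $onFailure may throw any exception, or return a fallback value.
    return $onFailure($input);
}

// App policy A: fail with an exception chosen by the app.
$throwBadRequest = function (string $input) {
    throw new DomainException("invalid port: {$input}", 400);
};

// App policy B: carry on with a default return value.
$useDefaultPort = function (string $input) {
    return 8080;
};
```

The library code stays the same in both cases. parsePort('http', $throwBadRequest) throws the app's own exception, while parsePort('http', $useDefaultPort) returns 8080.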
Slide 109
Pattern 4
Slide 110
What state are things left in
when your code fails?
Slide 111
Your code can crash and burn,
or it can fail gracefully.
Pick one.
Slide 112
Step 1:
Know that you need to fail.
Slide 113
Step 2:
Fail deliberately.
Slide 114
Step 3:
Don’t leave a mess behind
when you fail.
Slide 115
If you don’t do this,
your application stops working
until someone manually
repairs the damage done.
Slide 116
What needs to be considered?
Slide 117
Basic principle:
reset as if the operation
never happened.
Slide 118
Post-Failure Reset
1. PHP session contents
2. Changes to your databases / data stores
3. Changes on remote systems
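For in-memory state such as session contents, "reset as if the operation never happened" can be as simple as snapshot and restore. The sketch below uses a plain array as a stand-in for the session, and runOrReset() is a hypothetical helper. Databases would normally rely on transactions instead, and remote systems need their own compensating calls, which this example does not attempt to show.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: run $operation against $state, and put $state back
// exactly as it was if the operation throws.
function runOrReset(array &$state, callable $operation): void
{
    // PHP arrays are copied by value, so this is an independent snapshot.
    // (Objects stored inside the array would need cloning as well.)
    $snapshot = $state;
    try {
        $operation($state);
    } catch (Throwable $e) {
        $state = $snapshot; // reset as if the operation never happened
        throw $e;           // still report the failure to the caller
    }
}
```

The operation must accept the state by reference, for example function (array &$s): void { ... }, so that its changes are visible when it succeeds.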
Slide 119
How do you avoid a mess
at the caller’s end?
Slide 120
Tell them that
something went wrong,
and it’s not their fault.
HTTP 5xx!
Slide 121
Make it safe
to retry actions that
produce HTTP 5xx errors.
Slide 122
Idempotent actions
can be safely attempted
multiple times,
but only take effect once.
Slide 123
Idempotent is HARD.
Slide 124
Don’t Crash Land
Slide 125
1. Reset state after a robustness failure
2. Return an error indicator to your API caller
3. Make operations idempotent
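One common way to make an operation idempotent is to have the caller send a unique request id and to remember which ids have already been processed. The sketch below assumes that approach and PHP 7.4 or later. PaymentLedger and its methods are hypothetical, and a real system would persist the ids rather than hold them in memory.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory ledger; a real system would persist the request ids.
class PaymentLedger
{
    /** @var array<string, int> amounts already taken, keyed by request id */
    private array $processed = [];

    private int $total = 0;

    // Safe to retry: a repeated $requestId returns the original result
    // without charging again.
    public function charge(string $requestId, int $amountInPence): int
    {
        if (array_key_exists($requestId, $this->processed)) {
            return $this->processed[$requestId];
        }
        $this->total += $amountInPence;
        $this->processed[$requestId] = $amountInPence;
        return $amountInPence;
    }

    public function total(): int
    {
        return $this->total;
    }
}
```

If a caller gets an HTTP 5xx and retries with the same request id, the second attempt takes no further money.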
Slide 126
Pattern 5
Slide 127
How do you know
that you have
a robustness problem
in production?
Slide 128
Write Errors and Exceptions
to your log files.
Slide 129
How do you know
what the problem is?
Slide 130
I’ve lost count
of the number of problems
that had to be
debugged in Production.
Slide 131
Make sure your logs
tell you what you need
to fix the bug.
Slide 132
If you can’t solve a Production issue,
ship extra checks and logging
to gather better evidence.
Slide 133
Logs don’t write themselves.
Install a default error handler.
Install a default exception handler.
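A minimal sketch of installing both handlers with PHP's built-in set_error_handler() and set_exception_handler(). The log line format and the formatLogLine() helper are assumptions for this example. A production handler would usually add request context and respect the configured error_reporting level.

```php
<?php
declare(strict_types=1);

// Hypothetical log line format, chosen for this sketch.
function formatLogLine(string $kind, string $message, string $file, int $line): string
{
    return sprintf('[%s] %s in %s:%d', $kind, $message, $file, $line);
}

// Default error handler: log warnings and notices that nothing else handled.
set_error_handler(function (int $errno, string $errstr, string $errfile, int $errline): bool {
    error_log(formatLogLine("PHP error {$errno}", $errstr, $errfile, $errline));
    return true; // tell PHP not to run its standard error handler as well
});

// Default exception handler: log whatever reaches the top uncaught.
set_exception_handler(function (Throwable $e): void {
    error_log(formatLogLine(get_class($e), $e->getMessage(), $e->getFile(), $e->getLine()));
});
```

Recording the file and line along with the message is what later makes the log entry useful for fixing the bug, not just for noticing it.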
Slide 134
Fix the bugs
that appear in your logs.
Wipe them out. All of them.
Slide 135
You never know
what’s hiding behind
the bugs that you can see.
Slide 136
Know That
Code Has Failed
Slide 137
1. Log errors and exceptions
2. Logging must be production-safe
3. Logs must be useful!
4. Fix every bug that appears in your logs
Slide 138
In Summary
Slide 139
5 Patterns To Improve
Robustness
1. Succeed, Except When You Fail
2. Fail Deliberately
3. Tell Me How To Fail
4. Don’t Crash Land
5. Know That Code Has Failed
@GanbaroDigital
It must have been created
for a genuine reason.
Slide 148
The tips in the original video
reduce immediate cognitive load.
That must be attractive
to programmers at the beginning
of their experience.
Slide 149
In my experience:
they also reduce or eliminate
proven robustness and correctness
practices.
Slide 150
Those problems may not appear
in small, disposable apps.
Slide 151
Those problems will appear
in larger or longer-lived projects.
Slide 152
Larger projects require code
from different people
to work together.
That never happens perfectly.
Slide 153
Longer-lived projects
have to be resilient
to changes to code
and changes to the team.
Slide 154
Resilience against change and time
doesn’t happen accidentally.
Slide 155
On larger projects,
both robustness and correctness
save time, money, and reputation.
Slide 156
No-one can afford
to produce zero-defect code.
Slide 157
I’ll be looking at that
at @GlasgowPHP
in July 2017.
Slide 158
Thank You
Any Questions?
A presentation by @stuherbert
for @GanbaroDigital