Spike: Drop Connection
• Incoming connection accepted
• Attempting outgoing connection
• Connection established
Slide 124
Slide 124 text
Spike: Drop Connection
• Incoming connection accepted
• Attempting outgoing connection
• Connection established
• Data sent
Slide 125
Slide 125 text
Spike: Drop Connection
• Incoming connection accepted
• Attempting outgoing connection
• Connection established
• Data sent
• Data received
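The five connection events above are the points at which a fault can be injected. A minimal sketch of the idea, not Spike's actual implementation: the class name, the seed and probability parameters, and the log format are all assumptions made for illustration. The drop decision comes from a seeded generator, so the same seed gives the same run.

```python
import random

# The five connection events named on the slide.
EVENTS = [
    "incoming connection accepted",
    "attempting outgoing connection",
    "connection established",
    "data sent",
    "data received",
]


class DropConnectionSpike:
    """Hypothetical fault injector: may drop the connection at any event."""

    def __init__(self, seed, probability):
        # A seeded generator makes every drop decision repeatable.
        self._rng = random.Random(seed)
        self._probability = probability
        self.dropped = False
        self.log = []

    def on_event(self, event):
        if self.dropped:
            return  # nothing more happens on a dropped connection
        if self._rng.random() < self._probability:
            self.dropped = True
            self.log.append("DROPPED at: " + event)
        else:
            self.log.append(event)


def run(seed, probability=0.3):
    spike = DropConnectionSpike(seed, probability)
    for event in EVENTS:
        spike.on_event(event)
    return spike.log
```

Calling run(7) twice returns the same log, which is the property that makes a failing run reproducible.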
Slide 126
Slide 126 text
Integrating Spike
"Double and Halve" app
Slide 127
Slide 127 text
Integrating Spike
"Double and Halve" app
Slide 128
Slide 128 text
Integrating Spike
"Double and Halve" app
Slide 129
Slide 129 text
Integrating Spike
"Double and Halve" app
Slide 130
Slide 130 text
Integrating Spike
"Double and Halve" app
Slide 131
Slide 131 text
Integrating Spike
"Double and Halve" app
Slide 132
Slide 132 text
Integrating Spike
"Double and Halve" app
Slide 133
Slide 133 text
Integrating Spike
"Double and Halve" app
Slide 134
Slide 134 text
Integrating Spike
"Double and Halve" app
Slide 135
Slide 135 text
Integrating Spike
"Double and Halve" app
Slide 136
Slide 136 text
Integrating Spike
"Double and Halve" app
• Easy to verify
Slide 137
Slide 137 text
Integrating Spike
"Double and Halve" app
• Easy to verify
• Messages cross process boundary
Slide 138
Slide 138 text
Integrating Spike
"Double and Halve" app
• Easy to verify
• Messages cross process boundary
• Messages cross network boundary
Slide 139
Slide 139 text
Integrating Spike
• Double and Halve App
Slide 140
Slide 140 text
Integrating Spike
• Double and Halve App
• No Spiking
Slide 141
Slide 141 text
Integrating Spike
• Double and Halve App
• No Spiking
• Test, Test, Test
Slide 142
Slide 142 text
Integrating Spike
• Double and Halve App
• No Spiking
• Test, Test, Test
• Wesley: It passes! It passes! It passes!
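Why the app is easy to verify can be sketched as follows, assuming (the slides do not spell this out) that each input is doubled in one step and halved in the next, so a correct run returns its input unchanged. The byte encoding stands in for a message crossing a process or network boundary; every function name here is hypothetical.

```python
import struct


def encode(n):
    # Serialize as a signed 64-bit big-endian integer, as if going over the wire.
    return struct.pack(">q", n)


def decode(payload):
    return struct.unpack(">q", payload)[0]


def double_step(payload):
    return encode(decode(payload) * 2)


def halve_step(payload):
    return encode(decode(payload) // 2)


def run_pipeline(inputs):
    outputs = []
    for n in inputs:
        wire = encode(n)          # boundary: source to doubling step
        wire = double_step(wire)  # boundary: doubling step to halving step
        wire = halve_step(wire)   # boundary: halving step to sink
        outputs.append(decode(wire))
    return outputs


def verify(inputs, outputs):
    # Any lost, duplicated, reordered, or corrupted message breaks equality.
    return inputs == outputs
```

A test harness only has to compare the output list with the input list, which is why a passing run needs no application-specific oracle.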
Slide 143
Slide 143 text
Integrating Spike
• Double and Halve App
Slide 144
Slide 144 text
Integrating Spike
• Double and Halve App
• Spike with “drop connection”
Slide 145
Slide 145 text
Integrating Spike
• Double and Halve App
• Spike with “drop connection”
• Test, Test, Test
Slide 146
Slide 146 text
Integrating Spike
• Double and Halve App
• Spike with “drop connection”
• Test, Test, Test
• Wesley: It fails! It fails! It fails!
Slide 147
Slide 147 text
Integrating Spike
Slide 148
Slide 148 text
Integrating Spike
== Session Recovery!
Slide 149
Slide 149 text
Integrating Spike
• Double and Halve App
Slide 150
Slide 150 text
Integrating Spike
• Double and Halve App
• Spike with “drop connection”
Slide 151
Slide 151 text
Integrating Spike
• Double and Halve App
• Spike with “drop connection”
• Test, Test, Test
Slide 152
Slide 152 text
Integrating Spike
• Double and Halve App
• Spike with “drop connection”
• Test, Test, Test
• Wesley: It passes! It passes! It passes!
Slide 153
Slide 153 text
Repeated runs with different results == Mostly Useless Spike
Slide 154
Slide 154 text
Determinism & Spike
Slide 155
Slide 155 text
It's easy to get wrong
Determinism & Spike
Slide 156
Slide 156 text
Determinism & Spike
TCP delivery is not deterministic
Slide 157
Slide 157 text
Determinism & Spike
TCP guarantees:
Per-connection in-order delivery
Slide 158
Slide 158 text
Determinism & Spike
TCP guarantees:
Per-connection in-order delivery
Per-connection duplicate detection
Slide 159
Slide 159 text
Determinism & Spike
TCP guarantees:
Per-connection in-order delivery
Per-connection duplicate detection
Per-connection retransmission of lost data
Slide 160
Slide 160 text
Determinism & Spike
TCP guarantees:
Per-connection in-order delivery
Per-connection duplicate detection
Per-connection retransmission of lost data
but it doesn't guarantee determinism
Slide 161
Slide 161 text
Determinism & Spike
TCP delivery is not deterministic
Slide 162
Slide 162 text
Determinism & Spike
TCP delivery is not deterministic
Slide 163
Slide 163 text
Determinism & Spike
TCP delivery is not deterministic
Slide 164
Slide 164 text
Determinism & Spike
TCP delivery is not deterministic
Per-method-call Spiking won't work
Slide 165
Slide 165 text
Determinism & Spike
TCP delivery is not deterministic
Per-method-call Spiking won't work
unless we make it work…
Slide 166
Slide 166 text
Determinism & Spike
TCP message framing
Slide 167
Slide 167 text
Determinism & Spike
TCP message framing
Slide 168
Slide 168 text
Determinism & Spike
TCP message framing
Slide 169
Slide 169 text
Determinism & Spike
TCP message framing
Slide 170
Slide 170 text
Determinism & Spike
TCP message framing
Slide 171
Slide 171 text
Determinism & Spike
TCP message framing
Slide 172
Slide 172 text
Determinism & Spike
TCP message framing
Slide 173
Slide 173 text
Determinism & Spike
TCP message framing
Slide 174
Slide 174 text
Determinism & Spike
TCP message framing
Slide 175
Slide 175 text
Determinism & Spike
Expect in action
Slide 176
Slide 176 text
Determinism & Spike
Expect in action
Slide 177
Slide 177 text
Determinism & Spike
Expect in action
Slide 178
Slide 178 text
Determinism & Spike
Expect in action
Slide 179
Slide 179 text
Determinism & Spike
Expect in action
Slide 180
Slide 180 text
Determinism & Spike
Expect in action
Slide 181
Slide 181 text
Determinism & Spike
Expect in action
Slide 182
Slide 182 text
Determinism & Spike
Expect in action
Slide 183
Slide 183 text
Determinism & Spike
Expect in action
Slide 184
Slide 184 text
Determinism & Spike
Expect in action
Slide 185
Slide 185 text
Determinism & Spike
Expect makes received deterministic
Slide 186
Slide 186 text
Determinism & Spike
Expect makes received deterministic
Slide 187
Slide 187 text
Determinism & Spike
Expect makes received deterministic
Slide 188
Slide 188 text
Determinism & Spike
Expect makes received deterministic
Slide 189
Slide 189 text
Determinism & Spike
Expect makes received deterministic
Slide 190
Slide 190 text
Determinism & Spike
Expect makes received deterministic
Slide 191
Slide 191 text
Determinism & Spike
Expect makes received deterministic
Slide 192
Slide 192 text
Determinism & Spike
Received gets called with
Slide 193
Slide 193 text
Determinism & Spike
then…
Slide 194
Slide 194 text
Determinism & Spike
and then another…
Slide 195
Slide 195 text
Determinism & Spike
and finally…
Slide 196
Slide 196 text
Determinism & Spike
Same number of notifier method calls
no matter how the data arrives
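The effect of expect on a framed stream can be sketched as below. This is an illustration under assumptions, not the Pony standard library's API: the class names are hypothetical, and the framing (a 4-byte big-endian length header followed by the payload) is assumed because the slides do not show the wire format. The buffer calls received only when exactly the expected number of bytes is available, so the callback sequence is the same however the bytes are split on arrival.

```python
import struct


class ExpectBuffer:
    """Hypothetical connection that honours expect(n) before calling received."""

    def __init__(self, on_received, initial_expect):
        self._on_received = on_received
        self._pending = bytearray()
        self.expect(initial_expect)

    def expect(self, n):
        if n < 1:
            raise ValueError("this sketch does not handle zero-length chunks")
        self._expect = n

    def feed(self, data):
        # Bytes may arrive in arbitrary pieces; hold them until a full chunk exists.
        self._pending.extend(data)
        while len(self._pending) >= self._expect:
            chunk = bytes(self._pending[:self._expect])
            del self._pending[:self._expect]
            self._on_received(self, chunk)


class FramedReceiver:
    HEADER_SIZE = 4  # assumed header: 4-byte big-endian payload length

    def __init__(self):
        self.calls = []
        self._reading_header = True

    def received(self, conn, chunk):
        self.calls.append(chunk)
        if self._reading_header:
            (length,) = struct.unpack(">I", chunk)
            conn.expect(length)            # next call carries the whole payload
        else:
            conn.expect(self.HEADER_SIZE)  # then go back to reading a header
        self._reading_header = not self._reading_header


def frame(payload):
    return struct.pack(">I", len(payload)) + payload


def deliver(stream, piece_sizes):
    """Feed `stream` in pieces of the given sizes; return the received calls."""
    receiver = FramedReceiver()
    conn = ExpectBuffer(receiver.received, FramedReceiver.HEADER_SIZE)
    offset = 0
    for size in piece_sizes:
        conn.feed(stream[offset:offset + size])
        offset += size
    conn.feed(stream[offset:])  # whatever is left, possibly nothing
    return receiver.calls
```

Feeding the same two framed messages in one piece, byte by byte, or in uneven pieces yields the same four calls each time: header, payload, header, payload.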
Slide 197
Slide 197 text
Determinism & Spike
Drop Connection & Expect
fast deterministic friends
Slide 198
Slide 198 text
Nemesis #2:
Slow Connections
Slide 199
Slide 199 text
Spike: Delay
Slide 200
Slide 200 text
Spike: Delay
Slide 201
Slide 201 text
Spike: Delay
Slide 202
Slide 202 text
Spike: Delay
Slide 203
Slide 203 text
Spike: Delay
Delay overrides expect
Slide 204
Slide 204 text
Spike: Delay
Delay overrides expect
and controls the flow of bytes
Slide 205
Slide 205 text
Spike: Delay
Delay overrides expect
and controls the flow of bytes
to maintain determinism
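One way to read "Delay overrides expect" is sketched below. This is an interpretation, not Spike's actual design: the class names, the hold_bytes parameter, and the batching rule are all assumptions. The delay layer consumes the stream in fixed-size blocks of its own choosing, then hands the wrapped notifier chunks of the size that notifier asked for. Because release points depend on byte counts rather than on how TCP split the data, the wrapped notifier sees the same batches on every run.

```python
class FixedSizeReceiver:
    """Hypothetical wrapped notifier that always expects 3-byte records."""

    expect_size = 3

    def __init__(self):
        self.chunks = []

    def received(self, chunk):
        self.chunks.append(chunk)
        return self.expect_size  # size of the next chunk it wants


class Delay:
    """Hypothetical delay layer sitting between TCP and the wrapped notifier."""

    def __init__(self, wrapped, hold_bytes):
        self._wrapped = wrapped
        self._hold_bytes = hold_bytes       # Delay's own expect; must be >= 1
        self._wanted = wrapped.expect_size  # the wrapped notifier's expect
        self._incoming = bytearray()        # bytes not yet forming a full block
        self._ready = bytearray()           # released bytes awaiting a full chunk
        self.batches = []                   # chunks handed over at each release

    def feed(self, data):
        self._incoming.extend(data)
        while len(self._incoming) >= self._hold_bytes:
            block = bytes(self._incoming[:self._hold_bytes])
            del self._incoming[:self._hold_bytes]
            self._release(block)

    def close(self):
        # On close, release whatever is still being held.
        if self._incoming:
            block = bytes(self._incoming)
            self._incoming.clear()
            self._release(block)

    def _release(self, block):
        self._ready.extend(block)
        batch = []
        while len(self._ready) >= self._wanted:
            chunk = bytes(self._ready[:self._wanted])
            del self._ready[:self._wanted]
            batch.append(chunk)
            self._wanted = self._wrapped.received(chunk)
        self.batches.append(batch)


def run(stream, piece_sizes, hold_bytes=5):
    delay = Delay(FixedSizeReceiver(), hold_bytes)
    offset = 0
    for size in piece_sizes:
        delay.feed(stream[offset:offset + size])
        offset += size
    delay.feed(stream[offset:])
    delay.close()
    return delay.batches
```

A real delay would also hold each block for some amount of time before releasing it; the sketch leaves timing out and shows only that the release pattern is independent of how the bytes arrived.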
Slide 206
Slide 206 text
Spike: Delay
Slide 207
Slide 207 text
Spike: Delay
Slide 208
Slide 208 text
Spike: Delay
Slide 209
Slide 209 text
Spike: Delay
Slide 210
Slide 210 text
Spike: Delay
[Diagram: TCP, Spike]
Slide 211
Slide 211 text
Spike: Delay
[Diagram: TCP, Spike]
Slide 212
Slide 212 text
Spike: Delay
[Diagram: TCP, Spike]
Slide 213
Slide 213 text
Spike: Delay
TCP
Slide 214
Slide 214 text
Spike: Delay
[Diagram: TCP, Spike]
Slide 215
Slide 215 text
Spike: Delay
[Diagram: TCP, Spike]
Slide 216
Slide 216 text
Results
Slide 217
Slide 217 text
Results
Found…
• Bugs in Session Recovery
Slide 218
Slide 218 text
Results
Found…
• Bugs in Session Recovery
• Bug in Pony standard library
Slide 219
Slide 219 text
Results
Found…
• Bugs in Session Recovery
• Bug in Pony standard library
• Bugs in Spike
Slide 220
Slide 220 text
Results
Found…
• Bugs in Session Recovery
• Bug in Pony standard library
• Bugs in Spike
• And more bugs…
Slide 221
Slide 221 text
Results
Found…
Determinism is key
Slide 222
Slide 222 text
Results
Found…
Determinism is key
but hard to achieve
Slide 223
Slide 223 text
Data Lineage
Slide 224
Slide 224 text
WARNING!!!
Vaporware ahead
Slide 225
Slide 225 text
Data Lineage
Output
How did I get here?
Slide 226
Slide 226 text
Data Lineage
Output
Slide 227
Slide 227 text
Data Lineage
Input: 1,2,3
Slide 228
Slide 228 text
Data Lineage
Input: 1,2,3
Expect: 2,4,6
Slide 229
Slide 229 text
Data Lineage
Input: 1,2,3
Expect: 2,4,6
Get: 4,6
Slide 230
Slide 230 text
Data Lineage
Input: 1,2,3
Expect: 2,4,6
Get: 4,6
How did we get here?
these are not our beautiful results
Slide 231
Slide 231 text
Data Lineage
Input: 1,2,3
Slide 232
Slide 232 text
Data Lineage
Input: 1,2,3
Expect: 2,4,6
Slide 233
Slide 233 text
Data Lineage
Input: 1,2,3
Expect: 2,4,6
Get: 2,6,12
Slide 234
Slide 234 text
Data Lineage
Input: 1,2,3
Expect: 2,4,6
Get: 2,6,12
¯\_(ツ)_/¯
Slide 235
Slide 235 text
Data Lineage to the Rescue!
Slide 236
Slide 236 text
Data Lineage
Externally verify determinism
Slide 237
Slide 237 text
Data Lineage
Externally verify determinism
is it REALLY deterministic?
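Since the deck flags data lineage as vaporware, the following is only a sketch of what an external determinism check could look like. The lineage record format (a list of message id and step-name pairs) and both function names are assumptions. Each run's lineage is reduced to a fingerprint, and the runs count as deterministic only if every fingerprint matches.

```python
import hashlib
import json


def fingerprint(lineage):
    """Hash a lineage log, here a list of [message_id, [step names]] entries."""
    canonical = json.dumps(lineage, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_deterministic(runs):
    """True when every run produced exactly the same lineage log."""
    return len({fingerprint(run) for run in runs}) <= 1
```

Comparing fingerprints rather than outputs catches runs that reach the right answer by a different route.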
Slide 238
Slide 238 text
Data Lineage
Find incorrect executions
Slide 239
Slide 239 text
Data Lineage
Find incorrect executions
bugs in Wallaroo
Slide 240
Slide 240 text
Data Lineage
Input: 1
Expected: 2
Got: 4
¯\_(ツ)_/¯
Slide 241
Slide 241 text
Data Lineage
Execution path was…
when it should have been
Slide 242
Slide 242 text
Data Lineage
Execution path was…
when it should have been
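The "input 1, expected 2, got 4" case can be sketched with a message that carries its own execution path. The actual paths on the slide are not recoverable from this transcript, so the cause shown here (the doubling step applied twice) is a hypothetical example, as are all names in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traced:
    """A value together with the names of the steps that produced it."""
    value: int
    path: tuple = ()


def double(msg):
    # Each step records its name in the path as it transforms the value.
    return Traced(msg.value * 2, msg.path + ("double",))


def run(value, steps):
    msg = Traced(value)
    for step in steps:
        msg = step(msg)
    return msg


def first_divergence(actual, expected):
    """Index where two paths first differ, or None if they are identical."""
    for i, (a, e) in enumerate(zip(actual, expected)):
        if a != e:
            return i
    if len(actual) != len(expected):
        return min(len(actual), len(expected))
    return None


expected = run(1, [double])        # value 2, path ("double",)
actual = run(1, [double, double])  # value 4, path ("double", "double")
```

With the path attached, the wrong output stops being a mystery: first_divergence(actual.path, expected.path) points at the extra step.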
Slide 243
Slide 243 text
Data Lineage
Useful outside of development
Slide 244
Slide 244 text
Data Lineage
Production Debugging
Slide 245
Slide 245 text
Data Lineage
Production Debugging
how did I get here?
Slide 246
Slide 246 text
Data Lineage
Audit Log
Slide 247
Slide 247 text
Data Lineage
Audit Log
why did you do that?
Slide 248
Slide 248 text
Data Lineage
Hindsight Machine
Slide 249
Slide 249 text
Building Confidence
is difficult
Slide 250
Slide 250 text
and frustrating
Slide 251
Slide 251 text
Slide 252
Slide 252 text
Peter Alvaro
@palvaro
Lineage-driven Fault Injection:
http://www.cs.berkeley.edu/~palvaro/molly.pdf
Outwards from the Middle of the Maze:
https://www.youtube.com/watch?v=ggCffvKEJmQ
Will Wilson
Testing Distributed Systems w/ Deterministic Simulation:
https://www.youtube.com/watch?v=4fFDFbi3toc
Slide 255
Slide 255 text
Caitie McCaffrey
@caitie
The Verification of a Distributed System:
A practitioner's guide to increasing confidence in system correctness
http://queue.acm.org/detail.cfm?ref=rss&id=2889274
The Verification of a Distributed System
https://www.infoq.com/presentations/distributed-systems-verification
Slide 256
Slide 256 text
Inés Sombra
@randommood
Testing in a Distributed World:
https://www.youtube.com/watch?v=KSdNYi55kjg
Slide 257
Slide 257 text
Chaos Engineering
Principles of Chaos Engineering:
http://principlesofchaos.org
Slide 258
Slide 258 text
Slide 259
Slide 259 text
Thanks
Peter Alvaro
Sylvan Clebsch
Zeeshan Lakhani
John Mumm
Rob Roland
Andrew Turley