Slide 1

Slide 1 text

A PRACTICAL TAXONOMY OF BUGS AND HOW TO SQUASH THEM

Slide 2

Slide 2 text

Table of Contents: Instinctual Indications, Research Methods, Practical Taxonomy, Bohrbug, Schrödinbug, Fractalbug, Heisenbug, Mandelbug, Resources

Slide 3

Slide 3 text

Debugging Skills

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

“As you familiarize yourself with the application, you’ll build up some debugging instincts.”

Slide 6

Slide 6 text

“Whenever I see something like this happening, the first thing I do is scan the logs to see if this process is completing or is sending a weird message.”

Slide 7

Slide 7 text

“Debugging Instincts”

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

“Whenever I see #{x}, I always check #{y}”

Slide 10

Slide 10 text

Research Methods
• containment sometimes takes priority over squashing
• we can only work with facts
• we can’t squash every bug in this talk

Slide 11

Slide 11 text

Observable Attributes

Slide 12

Slide 12 text

Phenetics

Slide 13

Slide 13 text

Warning: Contrived Scenarios Ahead

Slide 14

Slide 14 text

A Practical Taxonomy of Bugs
• Upsettingly Observable
• Wildly Chaotic

Slide 15

Slide 15 text

Upsettingly Observable

Slide 16

Slide 16 text

Wildly Chaotic

Slide 17

Slide 17 text

How to Squash Them

Slide 18

Slide 18 text

upsettingly observable bug #1 UPSETTINGLY OBSERVABLE

Slide 19

Slide 19 text

Observable Attributes
• is the bug observable in production?
• can it be reproduced locally?
• does it seem to be restricted to one area?

Slide 20

Slide 20 text

Bohrbug deterministic, highly reproducible UPSETTINGLY OBSERVABLE

Slide 21

Slide 21 text

Bohrbug Commonly found in code, sometimes on server UPSETTINGLY OBSERVABLE

Slide 22

Slide 22 text

Bohrbug likes to hide in complex branching in functions, classes or config UPSETTINGLY OBSERVABLE

Slide 23

Slide 23 text

Bohrbug In the wild: validation UPSETTINGLY OBSERVABLE

Slide 24

Slide 24 text

Reproduction & Resolution UPSETTINGLY OBSERVABLE
• replicate locally and in test
• write the simple solution
• rewrite to be highly readable and extendable
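
The three steps above can be sketched on a small validation bug. This is only an illustration in Python: the age rule, the constants `MIN_AGE` and `MAX_AGE`, and both validator functions are hypothetical and not taken from the talk. The buggy version hides an off-by-one in nested branching, the first test pins the failure down so it reproduces on every run, and the rewrite collapses the branches into one readable range check.

```python
MIN_AGE = 0
MAX_AGE = 120  # assumed business rule: the upper bound is inclusive


def is_valid_age_buggy(age: int) -> bool:
    """Hypothetical original validator: the upper boundary is wrongly excluded."""
    if age >= MIN_AGE:
        if age < MAX_AGE:  # bug: should be <=
            return True
    return False


def is_valid_age(age: int) -> bool:
    """Readable rewrite: a single range check with named bounds."""
    return MIN_AGE <= age <= MAX_AGE


def test_bug_reproduces_every_time():
    # Step 1: replicate in test. The same input fails on every run.
    for _ in range(3):
        assert is_valid_age_buggy(MAX_AGE) is False


def test_fixed_validator_accepts_boundaries():
    # Steps 2 and 3: the fix accepts both boundary values.
    assert is_valid_age(MIN_AGE) is True
    assert is_valid_age(MAX_AGE) is True


def test_fixed_validator_rejects_out_of_range():
    assert is_valid_age(MIN_AGE - 1) is False
    assert is_valid_age(MAX_AGE + 1) is False
```

The test functions follow the pytest naming convention, so a runner such as pytest would pick them up; they can also be called directly.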

Slide 25

Slide 25 text

Bohrbug UPSETTINGLY OBSERVABLE

Slide 26

Slide 26 text

Bohrbug UPSETTINGLY OBSERVABLE

Slide 27

Slide 27 text

upsettingly observable bug #2 UPSETTINGLY OBSERVABLE

Slide 28

Slide 28 text

Observable Attributes
• how does this work?
• does this work?
• wait, what is this even testing?
• did this ever work?

Slide 29

Slide 29 text

Schrödinbug stick-like body appendages look like twigs UPSETTINGLY OBSERVABLE

Slide 30

Slide 30 text

Schrödinbug Likes to pretend to be working code. On close inspection, reveals itself to be a bug. UPSETTINGLY OBSERVABLE

Slide 31

Slide 31 text

Schrödinbug In the wild: code that never worked UPSETTINGLY OBSERVABLE

Slide 32

Slide 32 text

Schrödinbug In the wild: it didn’t work how you thought it did UPSETTINGLY OBSERVABLE

Slide 33

Slide 33 text

Tool: Logging as Verification
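
One way to use logging as a verification tool, sketched in Python with the standard `logging` module. The `apply_discount` function, the `checkout` logger name, and the coupon fields are all hypothetical; the point is only that each branch logs what it saw, so the logs show which path really ran instead of which path everyone assumed ran.

```python
import logging
from typing import Optional

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(name)s: %(message)s")
logger = logging.getLogger("checkout")  # hypothetical logger name


def apply_discount(total_cents: int, coupon: Optional[dict]) -> int:
    """Hypothetical function under suspicion; the logs record which branch ran."""
    logger.debug("apply_discount called: total_cents=%r coupon=%r", total_cents, coupon)
    if coupon and coupon.get("active"):
        discounted = total_cents - total_cents * coupon["percent"] // 100
        logger.debug("coupon applied: %r -> %r", total_cents, discounted)
        return discounted
    logger.debug("coupon skipped: missing or not marked active")
    return total_cents


# A coupon without the "active" key silently takes the skip branch,
# which is exactly the kind of surprise the log lines make visible.
apply_discount(10000, {"percent": 10})
```

Keeping the log statements at debug level means they can stay in place after the investigation without adding noise at the default production level.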

Slide 34

Slide 34 text

Tool: Git Bisect
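
The idea behind `git bisect` is a binary search through history between a known-good and a known-bad commit. The Python sketch below is not Git's implementation; it illustrates the search on a plain list, and the function name, commit labels, and `is_bad` check are hypothetical. It assumes the commits are ordered oldest to newest, the first one is good, the last one is bad, and the code stays broken once it breaks.

```python
from typing import Callable, Sequence


def find_first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit, testing only about log2(n) commits."""
    good = 0                 # index of a commit known to work
    bad = len(commits) - 1   # index of a commit known to be broken
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid        # the break happened at mid or earlier
        else:
            good = mid       # the break happened after mid
    return commits[bad]


# Hypothetical history in which the bug was introduced in "d4".
history = ["a1", "b2", "c3", "d4", "e5", "f6"]
first_bad = find_first_bad(history, lambda commit: history.index(commit) >= 3)
print(first_bad)
```

With Git itself, the same search is driven by `git bisect start`, `git bisect bad`, and `git bisect good <rev>`, and `git bisect run <command>` can automate the good/bad check.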

Slide 35

Slide 35 text

Reproduction & Resolution
• reproduce the “broken” state locally and in test
• add log statements until you can verify what causes the broken state
• if the code did work at some point, find the last point at which it worked
• write tests to represent the configuration and flow of the fixed state

Slide 36

Slide 36 text

Schrödinbug UPSETTINGLY OBSERVABLE

Slide 37

Slide 37 text

Schrödinbug UPSETTINGLY OBSERVABLE

Slide 38

Slide 38 text

wildly chaotic bug #1 WILDLY CHAOTIC

Slide 39

Slide 39 text

Observable Attributes
• Does it appear non-deterministic?
• Does it seem to disappear once you observe or debug it?

Slide 40

Slide 40 text

Heisenbug “now you see it, now you don’t” WILDLY CHAOTIC

Slide 41

Slide 41 text

Heisenbug In the wild: a heisenbug that lives in code WILDLY CHAOTIC

Slide 42

Slide 42 text

Heisenbug In the wild: a heisenbug that lives in data WILDLY CHAOTIC

Slide 43

Slide 43 text

Tool: Profiling for Verification https://kcachegrind.github.io/html/CallgrindFormat.html
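
The slide points at the Callgrind format; as a stand-in, here is a minimal Python sketch using the standard library's `cProfile` and `pstats` to observe where time goes without editing the code under suspicion. The functions `slow_lookup` and `handle_request` are hypothetical workloads invented for the example.

```python
import cProfile
import io
import pstats


def slow_lookup(items, targets):
    """Hypothetical hot spot: membership tests on a list scan it linearly."""
    return [t for t in targets if t in items]


def handle_request():
    """Hypothetical entry point that we want to observe, not modify."""
    items = list(range(2000))
    targets = list(range(0, 4000, 2))
    return slow_lookup(items, targets)


def profile_call(func):
    """Run func under the profiler and return its result plus a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func()
    profiler.disable()
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(5)  # top entries by cumulative time
    return result, report.getvalue()


result, report = profile_call(handle_request)
print(report)
```

The profiler wraps the call from the outside, so the suspect code runs unmodified and the report shows which functions dominate the run.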

Slide 44

Slide 44 text

Tool: Flame Graphs http://www.brendangregg.com/FlameGraphs/cpu-mysql-updated.svg

Slide 45

Slide 45 text

Reproduction & Resolution
• use profiling to find the trigger state
• use the app (not fixtures or DB manipulation) to get the data into this state
• recreate that state in test
• follow the Bohrbug instructions

Slide 46

Slide 46 text

Heisenbug WILDLY CHAOTIC

Slide 47

Slide 47 text

Heisenbug WILDLY CHAOTIC

Slide 48

Slide 48 text

wildly chaotic bug #2 WILDLY CHAOTIC

Slide 49

Slide 49 text

Observable Attributes
• is everything broken?
• all of it?
• send help??

Slide 50

Slide 50 text

Mandelbug WILDLY CHAOTIC

Slide 51

Slide 51 text

Mandelbug seems like everything is broken at once WILDLY CHAOTIC

Slide 52

Slide 52 text

Mandelbug people are very upset with you WILDLY CHAOTIC

Slide 53

Slide 53 text

Mandelbug likely an issue with your system, not code WILDLY CHAOTIC

Slide 54

Slide 54 text

“The bug is huge and everywhere at once. ‘SQL: could not connect to server: Connection refused’ was bubbling up all over the place. Jobs won’t run, emails won’t send, every submit button on the site fatal errored.”
on-call log, 24 June 2014 WILDLY CHAOTIC

Slide 55

Slide 55 text

Tool: Disk Usage (df -h)

Slide 56

Slide 56 text

Reproduction & Resolution
• attempt to connect to server & view logs
• use df -h to check whether all the storage is in use
• can that be restarted, rotated or killed at this time?
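
The `df -h` check above can also be scripted. The following is a minimal Python sketch using the standard library's `shutil.disk_usage`; the function names and the 90 percent threshold are assumptions made for illustration, not values from the talk.

```python
import shutil


def usage_percent(path: str = "/") -> float:
    """Return how full the filesystem containing path is, as a percentage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if usage.total == 0:
        return 0.0  # guard for pseudo-filesystems that report no capacity
    return usage.used / usage.total * 100


def is_disk_nearly_full(path: str = "/", threshold_percent: float = 90.0) -> bool:
    """True when usage meets or exceeds the (assumed) alert threshold."""
    return usage_percent(path) >= threshold_percent


percent = usage_percent(".")
print(f"disk usage for current directory's filesystem: {percent:.1f}%")
if is_disk_nearly_full("."):
    print("storage is nearly exhausted; check logs and temp files first")
```

A check like this could run on a schedule so that a filling disk is noticed before connections start being refused.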

Slide 57

Slide 57 text

Mandelbug WILDLY CHAOTIC

Slide 58

Slide 58 text

Mandelbug WILDLY CHAOTIC

Slide 59

Slide 59 text

A Practical Taxonomy of Bugs
• Upsettingly Observable: Bohrbug, Schrödinbug
• Wildly Chaotic: Heisenbug, Mandelbug

Slide 60

Slide 60 text

“Debugging Instincts”

Slide 61

Slide 61 text

“Debugging Instincts”

Slide 62

Slide 62 text

Debugging Skills

Slide 63

Slide 63 text

Observe & Classify

Slide 64

Slide 64 text

Verify with logging and time travel

Slide 65

Slide 65 text

Verify without changing state by profiling

Slide 66

Slide 66 text

Use Linux server tools to observe the entire process

Slide 67

Slide 67 text

• Observe & Classify
• Verify with logging and time travel
• Verify without changing state by profiling
• Use Linux server tools to observe the entire process

Slide 68

Slide 68 text

Build Up Your Own Toolkit and Share it

Slide 69

Slide 69 text

Resources & Further Study
• “Linux Debugging Tools I Love”, Julia Evans
• Systems Performance, Brendan Gregg
• Site Reliability Engineering, Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy
• “Why Do Computers Stop and What Can Be Done About It?”, Jim Gray
• “Debug Patterns for Efficient High-level SystemC Debugging”, Frank Rogin, Erhard Fehlauer, Christian Haufe, Sebastian Ohnewald

Slide 70

Slide 70 text

No content