A Practical Taxonomy of Bugs and How to Squash Them - Keep Ruby Weird 2016

Kylie
November 04, 2016

Catching software bugs is a mysterious magic, unknowable by science and untouchable by process.

False! Programming bugs, like real bugs, can be organized into a taxonomy. Come with me and I’ll show you how classification can help you build “programmer’s instinct” into a logical debugging process.

Transcript

  1. “Whenever I see something like this happening, the first thing I do is scan the logs to see if this process is completing or is sending a weird message.”
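
The log scan described in this quote can be sketched roughly as follows. The log format, the "started"/"completed" markers, and the list of suspicious keywords are all assumptions made for illustration; a real application will have its own messages.

```python
import re

# Assumed (hypothetical) log markers; replace with your application's real messages.
STARTED = re.compile(r"job (\S+) started")
COMPLETED = re.compile(r"job (\S+) completed")
SUSPICIOUS = re.compile(r"\b(ERROR|FATAL|Traceback)\b")


def scan_log(lines):
    """Return (jobs that started but never completed, suspicious lines)."""
    started, completed, suspicious = set(), set(), []
    for line in lines:
        started_match = STARTED.search(line)
        completed_match = COMPLETED.search(line)
        if started_match:
            started.add(started_match.group(1))
        elif completed_match:
            completed.add(completed_match.group(1))
        if SUSPICIOUS.search(line):
            suspicious.append(line.rstrip("\n"))
    return sorted(started - completed), suspicious
```

Passing an open log file (which iterates line by line) gives a quick answer to both questions in the quote: is the process completing, and is it sending anything odd?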
  2. Research Methods • containment sometimes takes priority over squashing • we can only work with facts • we can’t squash every bug in this talk
  3. Observable Attributes • is the bug observable in production? • can it be reproduced locally? • does it seem to be restricted to one area?
  4. Reproduction & Resolution • replicate locally and in test • write the simple solution • rewrite to be highly readable and extendable • UPSETTINGLY OBSERVABLE
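
As a minimal sketch of "replicate in test, then write the simple solution": the function `average` and its empty-list bug below are hypothetical, chosen only to show the shape of a regression test that pins the fix in place.

```python
import unittest


def average(values):
    """Hypothetical function under repair.

    The assumed original version raised ZeroDivisionError on an empty list;
    the simple fix is an explicit guard.
    """
    if not values:
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_no_longer_raises(self):
        # Written first, against the broken code, to reproduce the bug in test.
        self.assertEqual(average([]), 0.0)

    def test_typical_input_unchanged(self):
        # Guards existing behaviour while the fix is rewritten for readability.
        self.assertEqual(average([2, 4, 6]), 4.0)
```

Saved as a module, this can be run with `python -m unittest`. Writing the failing test before the fix confirms that the test really exercises the bug.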
  5. Observable Attributes • how does this work? • does this work? • wait, what is this even testing? • did this ever work?
  6. Schrödinbug • Likes to pretend to be working code. On close inspection, reveals itself to be a bug. • UPSETTINGLY OBSERVABLE
  7. Reproduction & Resolution • reproduce the “broken” state locally and in test • add log statements until you can verify what causes the broken state • if the code did work at some point, find the last point at which it worked • write tests to represent the configuration and flow of the fixed state
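
Finding the point at which the code last worked is a binary search over history. The sketch below assumes an ordered list of revisions in which the first is known good, the last is known bad, and the code stays broken once it breaks; `is_good` stands in for whatever check you run (for example, a test suite). `git bisect` automates the same search against a real repository.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_good is False.

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]
```

The revision just before the returned one is the last point at which the code worked, which narrows the search for the cause to a single change.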
  8. Reproduction & Resolution • use profiling to find the trigger state • use the app (not fixtures or DB manipulation) to get the data in this state • recreate that state in test • follow the Bohrbug instructions
  9. “The bug is huge and everywhere at once. ‘SQL: could not connect to server: Connection refused’ was bubbling up all over the place. Jobs won’t run, emails won’t send, every submit button on the site fatal errored.” • on-call log, 24 June 2014 • WILDLY CHAOTIC
  10. Reproduction & Resolution • attempt to connect to server & view logs • use df -h to find if all the storage is being used • can that be restarted, rotated or killed at this time?
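
The `df -h` check can also be scripted so it runs before anyone has to log in. This is a rough sketch using Python's standard `shutil.disk_usage`; the 95% threshold is an arbitrary assumption, and the percentage may differ slightly from the figure `df` prints because `df` accounts for reserved blocks differently.

```python
import shutil


def usage_percent(path: str = "/") -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def is_nearly_full(path: str = "/", threshold: float = 95.0) -> bool:
    """True when usage is at or above the (assumed) alert threshold."""
    return usage_percent(path) >= threshold
```

A call such as `is_nearly_full("/var/log")` answers the storage question for one mount point; deciding what can be restarted, rotated or killed remains a human judgment.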
  11. Resources • “Linux Debugging Tools I Love”, Julia Evans • Systems Performance, Brendan Gregg • “Why Do Computers Stop and What Can Be Done About It?”, Jim Gray • “Debug Patterns for Efficient High-level SystemC Debugging”, Frank Rogin, Erhard Fehlauer, Christian Haufe, Sebastian Ohnewald