A walk to remember: Debugging a distributed system failure

flaper87
August 22, 2016

Debugging distributed systems comes with a different set of complications than other fields in our industry. Each system may behave differently depending on the environment it's running in, and this nondeterministic behavior makes the process more challenging. If the debugging happens in a production environment, the risk increases and nerves get to us.

The debugging process for a distributed system is rarely the same twice. Therefore, we need to have a tool belt ready to attack the issue from different fronts, but we also need to be ready to back off when we've gathered enough information to do a proper analysis.

This talk will walk you through the debugging process for an issue on an OpenStack deployment and the strategy used, from both technical and non-technical perspectives.
