Attaining Resiliency - Culture, Tools and Practices

Ranjib Dey
November 17, 2013

What we have learned while building resilient apps: how cultural and technical aspects matter, and how they are reflected in the solutions you develop.

Transcript

  1. None
  2. Some context

  3. What is resiliency?

  4. How are failures introduced?

  5. • Human error

  6. • Human error • Application error

  7. • Human error • Application error • External sources

  8. Changing the mindset: accept failures instead of avoiding them

  9. Designing for resiliency

  10. Ephemeral everything

  11. Automation is an asymptotic phenomenon

  12. • See if you can do it manually

  13. • See if you can do it manually • Build tools to adopt semi-automatic workflows

  14. • See if you can do it manually • Build tools to adopt semi-automatic workflows • Remember not to introduce a dead end

  15. Resiliency in different application tiers

  16. • Application tier

  17. • Application tier • Persistence tier

  19. • Application tier • Persistence tier • Distributed systems (2PC, Raft, Paxos, etc.)

  20. Patterns for distributed systems

  21. Avoid cascading failures

  22. • Automation also increases the risk of failures

  23. • Automation also increases the risk of failures • Components: allow independent failures

  24. • Automation also increases the risk of failures • Components: allow independent failures • Contain failures

  25. Supporting degraded modes

  26. Metrics for everything

  27. • Metrics from app, tools and integrations

  28. • Metrics from app, tools and integrations • Logging and metrics

  29. The cultural aspects

  30. Everyone can be on call

  31. Build resiliency on the human side

  32. Build safety nets

  33. • Test apps

  34. • Test apps • Test infrastructure

  35. • Test apps • Test infrastructure • Test integrations

  36. Inject failures

  37. Concluding thoughts

  38. • Failures are inevitable, accept them

  39. • Failures are inevitable, accept them • It’s a mindset that needs to be reflected in tools and culture

  40. Thank You @RanjibDey
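
Illustration (not from the slides): the deck names "Avoid cascading failures", "Contain failures" and "Supporting degraded modes" without showing an implementation. One common way to combine the three is a circuit breaker, sketched below in Python. The class name `CircuitBreaker`, its parameters (`max_failures`, `reset_after`, `clock`) and the fallback convention are all hypothetical choices made for this sketch, not something the talk prescribes.

```python
import time


class CircuitBreaker:
    """Stops calling a failing dependency for a while and serves a fallback.

    Hypothetical sketch: names and thresholds are illustrative assumptions.
    """

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds while open
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            # Open: contain the failure by not touching the dependency at all.
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            # Cool-down over: allow one trial call; a single failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()  # degraded mode instead of propagating the error
        self.failures = 0
        return result
```

A caller would wrap each remote dependency in its own breaker, for example `breaker.call(fetch_live_prices, lambda: cached_prices)` (both names hypothetical), so that one failing component degrades a single feature rather than the whole application.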