Chaos Engineering @ Spring I/O 2019 Barcelona

Why we need Chaos Engineering
Complexity in modern distributed architectures keeps growing. We have successfully broken our applications down into small, maintainable components. Each component can be deployed to production automatically at any time. A lot of development effort went into keeping test coverage as high as possible. Every release has to pass our pipeline and countless unit, integration and acceptance tests.

But why do we have this unpleasant feeling shortly before our arrival at the most beautiful place in the world (production)?

Benjamin Wilms

May 16, 2019

Transcript

  1. WITHSTANDING TURBULENT CONDITIONS IN PRODUCTION

  2. BENJAMIN WILMS CODECENTRIC @MRBWILMS

  8. 80% - 90% ...

  11. HOW DOES OUR BABY BEHAVE IN PRODUCTION?

  28. TOP 10 ECOMMERCE GERMANY 2017

  34. CHAOS ENGINEERING IS NOT... ...TO CAUSE CHAOS!

  35. CHAOS ENGINEERING IS NOT... ...BREAKING THINGS JUST TO BREAK THEM!

  36. CHAOS ENGINEERING IS NOT... ...TO RUN A CHAOS MONKEY!

  37. CHAOS ENGINEERING IS NOT... ...TO LET THE WHOLE SIMIAN ARMY OUT OF THE CAGE!

  38. CHAOS ENGINEERING IS NOT... ...TO USE IT IN PRODUCTION!

  39. CHAOS ENGINEERING IS NOT... ...TO REPLACE OTHER KINDS OF TESTS!

  40. CHAOS ENGINEERING IS NOT... ...TO DO IT ALONE AND WITHOUT ANY ARRANGEMENT!

  43. THIS IS BOB

  44. HE IS RESPONSIBLE FOR SERVICE A

  45. THESE ARE BOB'S TEAMMATES

  46. THEY DEPEND ON BOB'S SERVICE

  47. THEY BLAME BOB FOR THE BAD PERFORMANCE OF SERVICE A

  48. BE SOCIAL AND COMMUNICATIVE. SHARE YOUR EXPERIENCES AND THOUGHTS. STOP BLAMING EACH OTHER. WORK TOGETHER.

  51. CHAOS ENGINEERING IS THE DISCIPLINE OF EXPERIMENTING ON A DISTRIBUTED SYSTEM IN ORDER TO BUILD CONFIDENCE IN THE SYSTEM’S CAPABILITY TO WITHSTAND TURBULENT CONDITIONS IN PRODUCTION.

  54. WHAT SHOULD HAPPEN WHEN...

  57. IF YOU KNOW YOUR CHAOS EXPERIMENT WILL FAIL... ...DON'T DO IT!!!

  61. STEADY STATE ORDERS PER MINUTE ON A TYPICAL MONDAY MORNING

  62. STEADY STATE ORDERS PER MINUTE

  63. STEADY STATE EXPERIMENT WAS CANCELED

  65. CPU BURNING - INSPIRED BY TAMMY BUTOW

    # burn.zsh: keep one core busy with OpenSSL benchmarks
    while true; do openssl speed; done

    # cpu_burning.zsh: start 32 background copies of burn.zsh
    for i in {1..32}
    do
      nohup /bin/zsh burn.zsh &
    done

  66. STRESS CPU

    stress --cpu 2 --io 1 --vm 1 --vm-bytes 128M --timeout 10s --verbose

  67. TRAFFIC CONTROL (TC) PACKAGE IPROUTE2

  69. PLATFORMS

  70. HOW DOES IT WORK

    <dependency>
      <groupId>de.codecentric</groupId>
      <artifactId>chaos-monkey-spring-boot</artifactId>
      <version>2.0.2</version>
    </dependency>

  71. (SIDECAR PATTERN)

    java -cp your-app.jar \
      -Dloader.path=chaos-monkey-spring-boot-2.0.2-jar-with-dependencies.jar \
      org.springframework.boot.loader.PropertiesLauncher \
      --spring.profiles.active=chaos-monkey \
      --spring.config.location=file:./chaos-monkey.properties

  72. WATCHER

  73. ASSAULTS

  74. SPRING BOOT ACTUATOR ENDPOINT CONTROL VIA REST ENDPOINT AT RUNTIME

  97. BENJAMIN WILMS @MRBWILMS BENJAMIN.WILMS@CODECENTRIC.DE SHOPPING DEMO
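
Slides 61 to 63 describe a steady state (orders per minute) and an experiment that was canceled. The transcript does not show how that decision was made, so the following is only a minimal sketch of one possible abort rule: cancel the experiment as soon as the observed orders per minute deviate from a known baseline by more than a tolerance. The class name SteadyStateGuard, the baseline of 120 orders per minute and the 20 % tolerance are hypothetical values chosen for illustration, not figures from the talk.

```java
import java.util.List;

/**
 * Hypothetical steady-state guard for a chaos experiment.
 * All names and numbers are illustrative assumptions, not taken from the talk.
 */
final class SteadyStateGuard {
    private final double baselineOrdersPerMinute;
    private final double tolerance; // allowed relative deviation, e.g. 0.2 = 20 %

    SteadyStateGuard(double baselineOrdersPerMinute, double tolerance) {
        if (baselineOrdersPerMinute <= 0 || tolerance < 0) {
            throw new IllegalArgumentException("baseline must be > 0 and tolerance >= 0");
        }
        this.baselineOrdersPerMinute = baselineOrdersPerMinute;
        this.tolerance = tolerance;
    }

    /** True when the observed value deviates from the baseline by more than the tolerance. */
    boolean shouldAbort(double observedOrdersPerMinute) {
        double deviation =
                Math.abs(observedOrdersPerMinute - baselineOrdersPerMinute) / baselineOrdersPerMinute;
        return deviation > tolerance;
    }

    /** Index of the first sample that breaks the steady state, or -1 if none does. */
    int firstViolation(List<Double> samples) {
        for (int i = 0; i < samples.size(); i++) {
            if (shouldAbort(samples.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed baseline: 120 orders per minute, 20 % tolerance.
        SteadyStateGuard guard = new SteadyStateGuard(120.0, 0.2);
        // Assumed measurements taken while the experiment is running.
        List<Double> samples = List.of(118.0, 112.0, 90.0, 60.0);
        int index = guard.firstViolation(samples);
        System.out.println(index >= 0
                ? "Cancel experiment at sample " + index
                : "Steady state held");
    }
}
```

In a real setup the samples would come from the monitoring system, and the cancel step would stop the running attack, for example by disabling the assaults at runtime.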