Plan for Success

What happens to your system when you get lucky and become successful?

Monica Giambitto

May 13, 2020

Transcript

  1. RIGHT TIME
     • 27th January: first COVID-19 case in Germany
     • 24th February: pandemic plan activated
     • 9th March: first COVID-19 death in Germany
     • 21st March: official lockdown called in Bavaria, more Länder to follow
  2. DNS HELL
     PROBLEM #1
     • KubeDNS doesn’t cache name resolution on internal calls -> we moved to CoreDNS, which does.
     • Short DNS names for internal calls: we used bodyweight.api instead of bodyweight.api.svc.cluster.local, which required 2 DNS resolution requests for each internal call.
     • Sidekiqs and internal calls: KubeDNS asks AWS for internal calls as well, so we used up our quota for external DNS requests very fast. A high error rate on any service tends to grow the pile of delayed Sidekiq jobs, and those jobs make DNS calls too, so every error spike increased the pressure on DNS further.
  3. • Task Force
     • Calendar & Meeting
     • Confluence
     • Miro Board
     • Tools
     • NR
     • Grafana
     • AWS Dashboard
     • Statuspage
     • Slack
     • Hangout
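
The short-name point on the DNS HELL slide can be illustrated with a small sketch of how a stub resolver expands a name through its search list. This is not the speaker's code. The function name candidate_queries, the search list, the "default" namespace and the ndots value of 5 are all assumptions, chosen to resemble a typical Kubernetes pod resolver configuration. Under those assumptions the short name only matches on the second query, while a name written with a trailing dot is treated as absolute and sent once.

```python
def candidate_queries(name, search_domains, ndots):
    """Return the names a stub resolver would try, in order.

    Follows the usual resolv.conf rules: a trailing dot makes the name
    absolute; otherwise names with fewer than `ndots` dots are tried
    against the search list first and as-is last.
    """
    if name.endswith("."):
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        return [name] + expanded
    return expanded + [name]


# Assumed resolver settings (hypothetical, not taken from the talk).
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
NDOTS = 5

# The name that actually exists in the cluster, per the slide.
TARGET = "bodyweight.api.svc.cluster.local"

short = candidate_queries("bodyweight.api", SEARCH, NDOTS)
absolute = candidate_queries(TARGET + ".", SEARCH, NDOTS)

# Number of queries sent before the resolver reaches the existing name.
queries_for_short = short.index(TARGET) + 1
queries_for_absolute = absolute.index(TARGET) + 1

print(queries_for_short, queries_for_absolute)
```

With these assumed settings the short name costs two queries per internal call and the absolute name costs one, which is the doubling the slide describes. The exact count in a real cluster depends on the pod's namespace and resolver options.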