DNS HELL • KubeDNS doesn’t cache name resolution on internal calls -> moved to CoreDNS does it
• Short DNS Names for internal calls: we used bodyweight.api instead of bodyweight.api.svc.cluster.local, requiring 2 DNS resolution requests for each internal call PROBLEM #1 • Sidekiqs and internal calls
• KubeDNS asks AWS for internal calls as well, so we used up our quota for external DNS requests very fast. As there is a high chance that a high error rate on something leads to an increased pile on sidekiq delayed jobs that call dns we increased the pressure here.