coredns-nodecache

Karthik
November 19, 2019

coredns-nodecache is a CoreDNS plugin that works along the same lines as the node-cache proposal for Kubernetes.


Transcript

  1. Agenda
     • DNS in Kubernetes 101
     • … and why it doesn't work well
     • Node-local, and its shortcomings
     • Presenting coredns-nodecache
     • Upcoming work
  2. First sightings
     • We noticed high latency and 5xx errors on some requests
     • Logs pointed to DNS timeouts
     • We implemented application-side DNS retries and timeouts to alleviate the problem
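The application-side retries and timeouts mentioned on this slide could look roughly like the sketch below. It is not the code from the talk: the function name `resolve_with_retry`, its parameters, and the backoff policy are assumptions made for illustration. `socket.getaddrinfo` has no timeout argument of its own, so the sketch runs it in a worker thread and bounds the wait on the result.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as LookupTimeout

# Shared worker pool. A lookup that times out keeps running in its thread,
# so the pool is deliberately not torn down after each call.
_POOL = ThreadPoolExecutor(max_workers=4)


def resolve_with_retry(host, attempts=3, timeout=1.0, backoff=0.1,
                       resolver=socket.getaddrinfo):
    """Resolve `host`, retrying on timeouts and resolver errors.

    `resolver` is injectable so the retry logic can be exercised
    without touching the network.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        future = _POOL.submit(resolver, host, None)
        try:
            return future.result(timeout=timeout)
        except (LookupTimeout, OSError) as exc:  # socket.gaierror is an OSError
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff between attempts (hypothetical policy).
                time.sleep(backoff * 2 ** attempt)
    raise last_error
```

As the slide says, this only alleviates the symptom: each retry is another DNS query, so it does nothing about the underlying cause.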
  3. DNS in K8S Primer

         # cat /etc/resolv.conf
         nameserver 172.168.0.1
         search default.svc.cluster.local svc.cluster.local cluster.local
         options ndots:5
  4. UDP Connection tracking
     • Kernel module for tracking incoming and outgoing connections
     • Table containing src + dest IP, src + dest port, connection state
     • Can be controlled via iptables rules
     • Unregistered entries can lead to missing responses
     https://twitter.com/b0rk/status/1059109780059504641
  5. Conntrack bug: race condition
     • Occurs when two packets are sent via the same socket at the same time
     • Packets are dropped
     • Lookups are left waiting
     • Only affects UDP
     https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts
  6. AWS VPC DNS Limits
     • Max of 1024 packets per second per interface for the AWS DNS server
     • But… each DNS resolution is expanded to 8 lookups
     https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html
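One plausible way to arrive at the 8 lookups is to apply the `search` list and `ndots:5` from the resolv.conf in the primer slide: a name with fewer than five dots is tried against each of the three search domains and then as-is, and each candidate is queried for both A and AAAA. The sketch below models that worst case. The function names and the example hostname `api.example.com` are illustrative assumptions, and a real resolver stops at the first candidate that answers.

```python
# Search list and ndots value taken from the resolv.conf shown in the primer slide.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
NDOTS = 5


def candidate_names(name, search=SEARCH, ndots=NDOTS):
    """Names a stub resolver would try, in order, for `name`."""
    if name.endswith("."):
        # Already absolute: no search path expansion.
        return [name]
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then the search list.
        return [absolute] + expanded
    return expanded + [absolute]


def worst_case_lookups(name, record_types=("A", "AAAA")):
    """Every (name, type) query sent if no candidate resolves until the last."""
    return [(candidate, rtype)
            for candidate in candidate_names(name)
            for rtype in record_types]


queries = worst_case_lookups("api.example.com")  # hypothetical external name
print(len(queries))          # 4 candidate names x 2 record types
print(1024 // len(queries))  # worst-case resolutions per second within the limit
```

Under these assumptions, the 1024 packets per second budget is used up by 128 such resolutions per second on a given interface.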
  7. Workaround #1: FQDN
     lookup('bar', 'A') -> lookup('bar.default.svc.cluster.local.', 'A')
     • Trailing dot `.` avoids search path expansion
     Downsides
     • Manually find and expand all lookups
     • Some will be hardcoded in SDKs
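A minimal sketch of this workaround: build the fully qualified name once, trailing dot included, and pass that to the resolver instead of the short service name. The helper names `service_fqdn` and `absolute` are hypothetical, and the `default` namespace and `cluster.local` domain are taken from the resolv.conf in the primer slide.

```python
def service_fqdn(service, namespace="default", cluster_domain="cluster.local"):
    """Fully qualified name of an in-cluster service, with a trailing dot."""
    return f"{service}.{namespace}.svc.{cluster_domain}."


def absolute(name):
    """Mark an external name as absolute so the search list is skipped."""
    return name if name.endswith(".") else name + "."


print(service_fqdn("bar"))      # bar.default.svc.cluster.local.
print(absolute("example.com"))  # example.com.
```

The downside from the slide still applies: every call site has to be found and changed, and names hardcoded inside SDKs are out of reach.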
  8. Workaround #2: TCP, AAAA and musl
     • The bug is UDP specific
     • DNS works on TCP as well
     • Unless you use IPv6, AAAA lookups aren't really useful
     Downsides
     • Majority of the containers are on Alpine, which uses musl instead of glibc
     • Doesn't solve VPC DNS limits
     https://wiki.musl-libc.org/functional-differences-from-glibc.html#Name-Resolver/DNS
  9. Workaround #3: Dnsmasq sidecar
     • Add a dnsmasq sidecar
     • Acts as a caching resolver in front of the cluster DNS
     • Pod queries are cached by dnsmasq
     Downsides
     • Need to change local resolvers per pod
     • Takes its own set of resources and memory
     https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config
  10. Workaround #4: Adding a random delay
      • Use tc to play around with packet scheduling
      • Useful to simulate packet loss or delays
      • Useful here to introduce a delay that avoids the race
      Downsides
      • Introduces delay
      https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/
  11. To summarise
      • The implementation might be different for different pods
      • The workarounds only reduce the probability of the issue happening
      • We have a lot of pods
  12. Nodecache
      Going from 5s to 5ms: Benefits of a Node-Local DNSCache - Pavithra Ramesh & Blake Barnett
  13. … But what is actually node-cache?
      A very thin wrapper around CoreDNS that:
      • Creates a dummy interface and iptables rules on startup
      • Removes them on shutdown
      … that's it.
      ... or maybe it should be?
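To make "creates a dummy interface and iptables rules on startup, removes them on shutdown" more concrete, the sketch below builds the kind of `ip` and `iptables` commands such a wrapper might run. Nothing here is taken from the node-cache or coredns-nodecache source: the interface name `nodecache0`, the link-local address `169.254.20.10`, and the choice of NOTRACK rules in the `raw` table (so node-local DNS traffic bypasses conntrack) are all assumptions. The functions only return argument lists and execute nothing.

```python
LISTEN_IP = "169.254.20.10"  # assumed link-local address for the cache
INTERFACE = "nodecache0"     # hypothetical interface name

# (table, chain, match) triples for the assumed NOTRACK rules.
RULES = [
    ("raw", chain, [flag, LISTEN_IP, "-p", proto, port_flag, "53"])
    for proto in ("udp", "tcp")
    for chain, flag, port_flag in (("PREROUTING", "-d", "--dport"),
                                   ("OUTPUT", "-s", "--sport"))
]


def _iptables(action, table, chain, match):
    """Render one iptables invocation; action is '-A' (append) or '-D' (delete)."""
    return ["iptables", "-t", table, action, chain, *match, "-j", "NOTRACK"]


def setup_commands():
    """Commands run on startup: dummy interface, address, then rules."""
    commands = [
        ["ip", "link", "add", INTERFACE, "type", "dummy"],
        ["ip", "addr", "add", f"{LISTEN_IP}/32", "dev", INTERFACE],
    ]
    commands += [_iptables("-A", *rule) for rule in RULES]
    return commands


def teardown_commands():
    """Commands run on shutdown: delete the rules, then the interface."""
    commands = [_iptables("-D", *rule) for rule in reversed(RULES)]
    commands.append(["ip", "link", "del", INTERFACE])
    return commands


for argv in setup_commands() + teardown_commands():
    print(" ".join(argv))
```

Keeping setup and teardown derived from the same rule list means the two cannot drift apart, which matters when teardown is later made optional.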
  14. Advantages of Coredns-nodecache
      Like nodecache, but:
      • Less is more
      • Unit tests for all logic
      • Configuration is done in the Corefile - just add "nodecache" to your config block
      • Easy upgrades of CoreDNS
      • Highly-available setup possible
  15. A highly-available node-local DNS cache
      • CoreDNS uses SO_REUSEPORT out of the box when binding to an interface
      • This allows multiple instances of CoreDNS to bind to the same interface
      So:
      • Run two DaemonSets
      • Do not tear down the interface when shutting down
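The effect of SO_REUSEPORT can be seen with two plain UDP sockets: when both set the option before binding, the second bind to the same address and port succeeds instead of failing with "address already in use", and the kernel distributes incoming datagrams between them. The sketch below is a stand-in for two CoreDNS instances, not CoreDNS code. It assumes a Linux host and uses the loopback address with an ephemeral port rather than port 53.

```python
import socket


def bind_shared_udp(address, port):
    """Bind a UDP socket with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((address, port))
    return sock


def demo():
    """Bind two sockets to the same address and port; return both bound ports."""
    first = bind_shared_udp("127.0.0.1", 0)        # kernel picks a free port
    port = first.getsockname()[1]
    second = bind_shared_udp("127.0.0.1", port)    # same port, no EADDRINUSE
    try:
        return first.getsockname()[1], second.getsockname()[1]
    finally:
        first.close()
        second.close()


ports = demo()
print(ports[0] == ports[1])
```

This is why two DaemonSets can serve the same node-local address: while one instance restarts, the other still holds a socket on it, provided the shared interface is not torn down on shutdown.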
  16. But there are open issues...
      • Coredns-nodecache needs to run as root (same as regular nodecache)
      • iptables needs to be present in the container
      • Need to use a full Linux image vs "FROM scratch"
  17. DNS is hard
      • Likely to get DNS timeouts by default
      • Use a node-local DNS caching server
      • Progress is being made
      • Check out coredns-nodecache!
  18. Thank you!
      It's not DNS
      There's no way it's DNS
      It was DNS
      Yann [email protected] GitHub @yannh
      Karthik [email protected] Twitter @argvk
      github.com/contentful-labs/coredns-nodecache