coredns-nodecache

Karthik
November 19, 2019

coredns-nodecache is a CoreDNS plugin that works along the same lines as the node-cache proposal for Kubernetes.

Transcript

  1. 3.

    Agenda
    • DNS in Kubernetes 101
    • … and why it doesn't work well
    • Node-local, and its shortcomings
    • Presenting coredns-nodecache
    • Upcoming work
  2. 4.

    First sightings
    • We noticed high latency & 5xx errors on some requests
    • Logs pointing to DNS timeouts
    • Implemented application-side DNS retries & timeouts to alleviate
  3. 8.

    DNS in K8S Primer
    # cat /etc/resolv.conf
    nameserver 172.168.0.1
    search default.svc.cluster.local svc.cluster.local cluster.local
    options ndots:5
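The effect of the `search` list and `ndots:5` on a single lookup can be sketched as follows. This is a simplified model of a glibc-style stub resolver, not an exact reimplementation; the function names are hypothetical, and a real resolver stops at the first name that resolves, so the full list is the worst case (for example, an external name such as `example.com`).

```python
# Search list taken from the resolv.conf shown on the slide.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]


def candidate_names(name, search=SEARCH, ndots=5):
    """Return the names a stub resolver would try, in order (simplified)."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no expansion.
        return [name]
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then the search list.
        return [absolute] + expanded
    # Too few dots: walk the search list first, then try the name as-is.
    return expanded + [absolute]


def queries(name, search=SEARCH, ndots=5, qtypes=("A", "AAAA")):
    """Worst-case list of (name, record type) queries for one resolution."""
    return [
        (candidate, qtype)
        for candidate in candidate_names(name, search, ndots)
        for qtype in qtypes
    ]
```

Under these assumptions, `queries("example.com")` yields four candidate names, each asked for both `A` and `AAAA`.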
  4. 9.

    UDP Connection tracking
    • Kernel module for tracking incoming and outgoing connections
    • Table containing src + dest IP, src + dest port, connection state
    • Can be controlled via iptables rules
    • Unregistered entries can lead to missing responses
    https://twitter.com/b0rk/status/1059109780059504641
  5. 10.

    Conntrack bug
    Conntrack race condition
    • When two packets are sent via the same socket at the same time
    • Packets dropped
    • Lookups are in waiting state
    • Only for UDP
    Race condition: https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts
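The race can be illustrated with a toy model of the connection tracking table. This is only a sketch of the idea described on the slide, not kernel code: the class, the two-phase split, and the example flow tuple are all assumptions made for illustration.

```python
class ConntrackTable:
    """Toy table keyed by (src IP, src port, dest IP, dest port, protocol)."""

    def __init__(self):
        self.entries = {}

    def lookup(self, flow):
        return flow in self.entries

    def confirm(self, flow):
        """Insert the flow; return False if an entry already exists."""
        if flow in self.entries:
            return False
        self.entries[flow] = "UNREPLIED"
        return True


def send_in_parallel(table, flow, packets):
    """Model packets that leave the same socket at the same time."""
    # Phase 1: all packets are checked before any entry is confirmed,
    # so each of them sees "no entry yet".
    saw_entry = [table.lookup(flow) for _ in packets]
    delivered, dropped = [], []
    # Phase 2: each packet that saw no entry tries to confirm its own.
    for packet, seen in zip(packets, saw_entry):
        if seen or table.confirm(flow):
            delivered.append(packet)
        else:
            # The losing insert is discarded together with its packet.
            dropped.append(packet)
    return delivered, dropped
```

In this model, sending an `A` and an `AAAA` query in one parallel batch loses the second packet, while sending them one after the other delivers both, since the second packet finds the existing entry.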
  6. 11.

    AWS VPC DNS Limits
    • Max of 1024 packets per second per interface for AWS DNS Server
    • But… each DNS resolution is expanded to 8 lookups
    https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html
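Combining the two figures on the slide gives a rough budget of resolutions per interface. The sketch below assumes every resolution hits the worst case of 8 lookups and that each lookup counts as one packet against the limit; the function name is hypothetical.

```python
PACKETS_PER_SECOND_LIMIT = 1024  # per network interface, from the slide
LOOKUPS_PER_RESOLUTION = 8       # worst-case expansion, from the slide


def max_resolutions_per_second(limit=PACKETS_PER_SECOND_LIMIT,
                               lookups=LOOKUPS_PER_RESOLUTION):
    """Whole resolutions that fit into the per-interface packet budget."""
    return limit // lookups
```

With the slide's numbers this comes to 1024 / 8 = 128 uncached resolutions per second, shared by every pod behind that interface.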
  7. 13.

    Workaround #1 FQDN
    lookup(‘bar’, ‘A’) -> lookup(‘bar.default.svc.cluster.local.’, ‘A’)
    • Trailing dot `.` avoids search path expansion
    Downsides
    • Manually find and expand all lookups
    • Some will be hardcoded in SDKs
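Applied in application code, the workaround amounts to qualifying service names before they reach the resolver. A minimal sketch, assuming the default `cluster.local` cluster domain and a hypothetical helper name:

```python
def to_fqdn(service, namespace="default", cluster_domain="cluster.local"):
    """Return the fully qualified, dot-terminated name of a Service."""
    if service.endswith("."):
        # Already fully qualified: leave it untouched.
        return service
    return f"{service}.{namespace}.svc.{cluster_domain}."
```

The helper only covers names the application builds itself, which is the first downside listed on the slide: every call site has to be found and changed.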
  8. 14.

    Workaround #2 TCP, AAAA and musl
    • The bug is UDP specific
    • DNS works on TCP as well
    • Unless you use IPv6, AAAA lookups aren’t really useful
    Downsides
    • Majority of the containers are on Alpine, which uses musl instead of glibc
    • Doesn’t solve VPC DNS Limits
    https://wiki.musl-libc.org/functional-differences-from-glibc.html#Name-Resolver/DNS
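To make "DNS works on TCP as well" concrete, here is a minimal sketch that sends a single `A` query (no `AAAA`) over TCP using only the standard library. The function names and the fixed query ID are assumptions for illustration, and the response is returned unparsed.

```python
import socket
import struct

QTYPE_A = 1


def build_query(name, qtype=QTYPE_A, query_id=0x1234):
    """Encode a one-question DNS query in wire format."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Root label, then query type and class IN.
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def frame_for_tcp(message):
    """DNS over TCP prefixes each message with a two-byte length."""
    return struct.pack("!H", len(message)) + message


def _read_exact(conn, n):
    data = b""
    while len(data) < n:
        chunk = conn.recv(n - len(data))
        if not chunk:
            raise ConnectionError("server closed the connection early")
        data += chunk
    return data


def resolve_a_over_tcp(name, server, timeout=2.0):
    """Send one A query over TCP and return the raw response bytes."""
    with socket.create_connection((server, 53), timeout=timeout) as conn:
        conn.sendall(frame_for_tcp(build_query(name)))
        (length,) = struct.unpack("!H", _read_exact(conn, 2))
        return _read_exact(conn, length)
```

Using TCP for one query avoids the parallel UDP packets that trigger the race, but as the slide notes, it does nothing about the VPC packet limit.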
  9. 15.

    Workaround #3 Dnsmasq sidecar
    • Add a dnsmasq sidecar
    • Acts like a caching resolver to the cluster DNS
    • Pod queries are cached by dnsmasq
    Downsides
    • Need to change local resolvers per pod
    • Takes its own set of resources and memory
    https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config
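What "change local resolvers per pod" involves can be sketched by patching a pod spec, represented here as a plain Python dict. The helper name, the container image, the cache size, and the assumption that the image's entrypoint is `dnsmasq` are all illustrative; `dnsPolicy` and `dnsConfig` are the pod fields described at the link on the slide.

```python
def with_dnsmasq_sidecar(pod_spec, cluster_dns_ip, image):
    """Return a copy of pod_spec with a caching dnsmasq sidecar added."""
    spec = dict(pod_spec)
    sidecar = {
        "name": "dnsmasq",
        "image": image,  # hypothetical image whose entrypoint is dnsmasq
        "args": [
            "--keep-in-foreground",
            "--no-resolv",                 # ignore /etc/resolv.conf
            f"--server={cluster_dns_ip}",  # forward misses to cluster DNS
            "--cache-size=1000",
        ],
    }
    spec["containers"] = list(pod_spec.get("containers", [])) + [sidecar]
    # Point the pod's resolver at the sidecar instead of the cluster DNS.
    spec["dnsPolicy"] = "None"
    spec["dnsConfig"] = {
        "nameservers": ["127.0.0.1"],
        "searches": [
            "default.svc.cluster.local",
            "svc.cluster.local",
            "cluster.local",
        ],
        "options": [{"name": "ndots", "value": "5"}],
    }
    return spec
```

Every workload needs this patch, and each pod then runs its own cache, which matches the two downsides listed on the slide.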
  10. 16.

    Workaround #4 Adding a random delay
    • tc to play around with packet scheduling
    • Useful to simulate packet loss or delays
    • Useful here to introduce a delay to avoid race
    Downsides
    • Introduces delay
    https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/
  11. 17.

    To summarise
    • The implementation might be different for different pods
    • They only reduce the probability of the issue happening
    • We have a lot of pods
  12. 18.

    Nodecache
    Going from 5s to 5ms: Benefits of a Node-Local DNSCache - Pavithra Ramesh & Blake Barnett
  13. 23.

    … But what is actually node-cache?
    A very thin wrapper around CoreDNS that:
    • Creates a dummy interface and IPTables rules on startup
    • Removes them on shutdown
    … that's it.
    ... or maybe it should be?
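The startup and shutdown steps can be sketched as two lists of commands. The interface name, the link-local address, and the choice of rules are assumptions for illustration and only a subset of what a real node-local cache would install; the functions build the commands without running them, since executing them needs root.

```python
def _notrack_rules(ip, action):
    """iptables rules that exempt DNS traffic to `ip` from conntrack."""
    return [
        ["iptables", "-t", "raw", action, chain,
         "-d", f"{ip}/32", "-p", proto, "--dport", "53", "-j", "NOTRACK"]
        for chain in ("PREROUTING", "OUTPUT")
        for proto in ("udp", "tcp")
    ]


def setup_commands(ip="169.254.20.10", interface="nodelocaldns"):
    """Commands to run on startup: dummy interface, address, rules."""
    return [
        ["ip", "link", "add", interface, "type", "dummy"],
        ["ip", "addr", "add", f"{ip}/32", "dev", interface],
    ] + _notrack_rules(ip, "-A")


def teardown_commands(ip="169.254.20.10", interface="nodelocaldns"):
    """Commands to run on shutdown: delete the rules, then the interface."""
    rules = list(reversed(_notrack_rules(ip, "-D")))
    # Deleting the link also removes the address assigned to it.
    return rules + [["ip", "link", "del", interface]]
```

Each command list could be passed to `subprocess.run(cmd, check=True)` by a process running as root.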
  14. 26.

    Advantages of Coredns-nodecache
    Like nodecache, but:
    • Less is more
    • Unit tests for all logic
    • Configuration is done in the Corefile - just add "nodecache" to your config block
    • Easy upgrades of CoreDNS
    • Highly-available setup possible
  15. 27.

    A highly-available node-local DNS cache
    • CoreDNS uses SO_REUSEPORT out of the box when binding to an interface
    • This allows multiple instances of CoreDNS to bind to the same interface
    So:
    • Run two DaemonSets
    • Do not tear down the interface when shutting down
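The socket option behind this setup can be tried in isolation. The sketch below binds two UDP sockets to the same address and port with `SO_REUSEPORT`, standing in for two CoreDNS instances; it assumes a platform that exposes the option (such as Linux), and it uses the loopback address and an ephemeral port instead of port 53 so that it runs without root.

```python
import socket


def bind_udp_reuseport(host, port):
    """Bind a UDP socket with SO_REUSEPORT so others can share the address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The option must be set on every socket before it is bound.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock


def bind_two_instances(host="127.0.0.1"):
    """Bind two sockets to one address, as two cache instances would."""
    first = bind_udp_reuseport(host, 0)  # let the kernel pick a free port
    port = first.getsockname()[1]
    second = bind_udp_reuseport(host, port)  # would fail without the option
    return first, second
```

Because both sockets stay bound, one instance can be restarted or upgraded while the other keeps answering, provided the shared interface is left in place on shutdown.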
  16. 29.

    But there are open issues...
    • Coredns-nodecache needs to run as root (same as regular nodecache)
    • IPTables need to be present in the container
    • Need to use a full linux image vs "From scratch"
  17. 32.

    DNS is hard
    • Likely to get DNS timeouts by default
    • Use a node-local DNS caching server
    • Progress is being made
    • Check out coredns-nodecache!
  18. 33.

    Thank you!
    It's not DNS
    There's no way it's DNS
    It was DNS
    Yann: yann.hamon@contentful.com, GitHub @yannh
    Karthik: karthik.viswanathan@contentful.com, Twitter @argvk
    github.com/contentful-labs/coredns-nodecache