A Hidden Pitfall of K8s DNS with Spring Webflux

Kotaro INOUE (X (formerly Twitter)/GitHub: @musaprg)
Kubernetes Meetup Tokyo #69
© LY Corporation

Summary
1. Spring Webflux’s default DNS resolver behaves differently from the JVM built-in implementation.
2. Make use of known workarounds when working with Kubernetes DNS.
   1. Use FQDN (with trailing dot) for query
      • example.com. (FQDN)
      • example.com (PQDN)
   2. Set ndots to 1 or 2 in the Pod spec field .spec.dnsConfig.options[*]

HELP! DNS request timed out!
Application pods cannot resolve cluster-local domain.

Overview of name resolution in our cluster
/etc/resolv.conf
• Primary nameserver
  • Node-local DNS (DaemonSet)
• Secondary nameserver
  • Upstream DNS

What happened?
• The target cluster was in the middle of an in-place migration process.
• The Cluster IP of Upstream DNS differed before and after the migration.
[Figure: in-place migration, Rancher (Internal Fork) and Cluster API (kubeadm)]
https://youtu.be/BDjhGEVJ0Gs

What happened?
/etc/resolv.conf
• Primary nameserver
  • Node-local DNS (DaemonSet) having correct Upstream DNS IP
• Secondary nameserver
  • Upstream DNS with wrong IP = unreachable due to our bug
The domain should still be resolvable through the primary nameserver.

Possible cause #1: ndots=5 + search domain
• Common pitfall of K8s DNS
• Non-FQDN (PQDN) queries can take a long time
• Query “example.org” will look like:
https://speakerdeck.com/toversus/reliable-and-performant-dns-resolution-with-high-available-nodelocal-dnscache
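To illustrate why a non-FQDN query takes longer under ndots=5, here is a minimal sketch of the search-list expansion rule described in resolv.conf(5). It is not any resolver's actual code. The class name, method name, and the three search domains (typical for a Pod in the default namespace) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the names a stub resolver would try, in order.
class SearchListExpansion {
    static List<String> candidates(String name, int ndots, List<String> searchDomains) {
        List<String> out = new ArrayList<>();
        // A trailing dot marks an FQDN: the search list is skipped entirely.
        if (name.endsWith(".")) {
            out.add(name);
            return out;
        }
        long dots = name.chars().filter(c -> c == '.').count();
        String asIs = name + ".";
        // Enough dots: the name is tried as-is before the search list.
        if (dots >= ndots) {
            out.add(asIs);
        }
        for (String domain : searchDomains) {
            out.add(name + "." + domain + ".");
        }
        // Too few dots: the name is tried as-is only after the search list.
        if (dots < ndots) {
            out.add(asIs);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed search domains for a Pod in the "default" namespace.
        List<String> domains = List.of(
                "default.svc.cluster.local", "svc.cluster.local", "cluster.local");
        for (String candidate : candidates("example.org", 5, domains)) {
            System.out.println(candidate);
        }
    }
}
```

Under these assumptions, "example.org" with ndots=5 produces three search-domain queries before the name itself is tried. With a trailing dot, or with ndots set to 1, the name itself comes first.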

Possible cause #2: Netty DNS Resolver
• Spring Webflux, as shipped with Spring Boot 2.4.0, started using reactor-netty v1.x
• reactor-netty v1.0.0 switched its default DNS resolver from the JVM one to Netty’s own DNS resolver.
https://github.com/reactor/reactor-netty/pull/1252

Possible cause #2: Netty DNS Resolver
• When we query nginx.default, Netty DNS Resolver behaves like:
  1. (1st searchdomain) Try primary nameserver
  2. (1st searchdomain) If the response is NXDomain, try secondary nameserver
  3. Proceed with the next searchdomain
  4. …
[Figure: tcpdump logs (modified)]

Reference: JVM’s built-in resolver
1. Try primary nameserver for all search domains
2. Try secondary nameserver for all search domains
[Figure: tcpdump logs (modified)]

Reference: cURL (glibc)
1. First, try primary nameserver for all search domains
2. Next, try secondary nameserver for all search domains
[Figure: tcpdump logs (modified)]
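The two query orderings described on these slides can be compared with a small simulation. This is only a sketch of the orderings as the slides state them, not the resolvers' actual logic. The class name, the nameserver labels, and the search domains are assumptions, and every answer is assumed to be NXDomain so that all attempts are listed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: lists (nameserver, query name) attempts in order.
class QueryOrder {
    // Ordering described for the Netty DNS resolver:
    // for each search domain, try primary, then secondary.
    static List<String> perSearchDomain(String name, List<String> searchDomains,
                                        List<String> nameservers) {
        List<String> attempts = new ArrayList<>();
        for (String domain : searchDomains) {
            for (String ns : nameservers) {
                attempts.add(ns + " <- " + name + "." + domain + ".");
            }
        }
        return attempts;
    }

    // Ordering described for the JVM built-in resolver and cURL (glibc):
    // try every search domain on primary, then every search domain on secondary.
    static List<String> perNameserver(String name, List<String> searchDomains,
                                      List<String> nameservers) {
        List<String> attempts = new ArrayList<>();
        for (String ns : nameservers) {
            for (String domain : searchDomains) {
                attempts.add(ns + " <- " + name + "." + domain + ".");
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Assumed values for a Pod in the "default" namespace.
        List<String> domains = List.of(
                "default.svc.cluster.local", "svc.cluster.local", "cluster.local");
        List<String> nameservers = List.of("primary", "secondary");

        System.out.println("Per search domain (Netty DNS resolver, as described):");
        perSearchDomain("nginx.default", domains, nameservers).forEach(System.out::println);

        System.out.println("Per nameserver (JVM built-in / glibc, as described):");
        perNameserver("nginx.default", domains, nameservers).forEach(System.out::println);
    }
}
```

In the per-search-domain ordering, the second attempt already goes to the secondary nameserver. If that server is unreachable, the query can stall there before the second search domain is ever tried on the primary.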

Workaround
1. Explicitly use JVM’s built-in resolver in Spring Webflux
2. Use FQDN (with trailing dot) for query
   • example.com. (FQDN)
   • example.com (PQDN)
3. Set ndots to 1 or 2 in the Pod spec field .spec.dnsConfig.options[*]
Make use of these workarounds to avoid DNS issues
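For workaround 1, one possible way to wire the JVM’s built-in resolver into a WebClient is sketched below. It assumes reactor-netty 1.x and Spring Webflux on the classpath. The class and method names are hypothetical, and this is not necessarily how the speaker applied the workaround.

```java
import io.netty.resolver.DefaultAddressResolverGroup;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.netty.http.client.HttpClient;

// Hypothetical factory: builds a WebClient that resolves host names through the JVM.
class JvmResolverWebClient {
    static WebClient create() {
        // DefaultAddressResolverGroup delegates to the JVM's built-in
        // name resolution instead of Netty's own DNS resolver.
        HttpClient httpClient = HttpClient.create()
                .resolver(DefaultAddressResolverGroup.INSTANCE);

        return WebClient.builder()
                .clientConnector(new ReactorClientHttpConnector(httpClient))
                .build();
    }
}
```

The JVM’s built-in resolution is blocking, so this choice trades Netty’s non-blocking lookups for the query ordering described on the reference slides.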
