A Hidden Pitfall of K8s DNS with Spring Webflux

Kotaro Inoue
February 04, 2025

Presented at Kubernetes Meetup Tokyo #69


Transcript

  1. A Hidden Pitfall of K8s DNS with Spring Webflux

    Kubernetes Meetup Tokyo #69
    Kotaro INOUE, X (formerly Twitter)/GitHub: @musaprg
    © LY Corporation
  2. Summary

    1. Spring Webflux’s default DNS resolver behaves differently from the JVM’s built-in implementation.
    2. Make use of known workarounds when working with Kubernetes DNS:
      1. Use an FQDN (with trailing dot) for queries: "example.com." rather than "example.com".
      2. Set ndots to 1 or 2 in the Pod spec field .spec.dnsConfig.options[*].
  3. HELP! DNS request timed out!

    Application pods cannot resolve cluster-local domains.
  4. Overview of name resolution in our cluster

    /etc/resolv.conf:
    • Primary nameserver: Node-local DNS (DaemonSet)
    • Secondary nameserver: Upstream DNS
  5. What happened?

    • The target cluster was in the middle of an in-place migration.
    • The ClusterIP of the Upstream DNS changed during the migration.
    In-place migration: Rancher (Internal Fork) → Cluster API (kubeadm)
    https://youtu.be/BDjhGEVJ0Gs
  6. What happened?

    /etc/resolv.conf:
    • Primary nameserver: Node-local DNS (DaemonSet), which has the correct Upstream DNS IP
    • Secondary nameserver: Upstream DNS with the wrong IP, so unreachable due to our bug
  7. What happened?

    /etc/resolv.conf:
    • Primary nameserver: Node-local DNS (DaemonSet), which has the correct Upstream DNS IP
    • Secondary nameserver: Upstream DNS with the wrong IP, so unreachable due to our bug
    The domain should still be resolvable.
  8. Possible cause #1: ndots=5 + search domain

    • A common pitfall of K8s DNS
    • Non-FQDN (PQDN) queries can take a long time.
    • A query for “example.org” is tried against each search domain first (illustrated on the slide).
    Source: https://speakerdeck.com/toversus/reliable-and-performant-dns-resolution-with-high-available-nodelocal-dnscache
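The search-domain expansion behind possible cause #1 can be sketched in a few lines. This is a simplified model of the glibc-style rule: names with fewer dots than ndots go through the search list first, and names ending in a dot skip it. The function name `candidate_queries` and the search list are assumptions for illustration and do not come from the slides; the list mirrors what a Pod in the `default` namespace typically gets.

```python
from typing import List

# Assumed search list, typical for a Pod in the "default" namespace.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]


def candidate_queries(name: str, search: List[str], ndots: int = 5) -> List[str]:
    """Return the absolute names a stub resolver would try, in order."""
    if name.endswith("."):
        # Already fully qualified: the search list is skipped entirely.
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then the search list.
        return [absolute] + expanded
    # Too few dots: walk the search list first, then try the name as-is.
    return expanded + [absolute]


for query in candidate_queries("example.org", SEARCH):
    print(query)
```

In this model, with ndots=5 the name example.org is tried with three cluster-local suffixes before it is tried as-is, while the trailing-dot form "example.org." produces a single query.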
  9. Possible cause #2: Netty DNS Resolver

    • Spring Webflux 2.4.0 started using reactor-netty v1.x.
    • reactor-netty v1.0.0 switched its default DNS resolver from the JVM one to its own Netty DNS Resolver.
    https://github.com/reactor/reactor-netty/pull/1252
  10. Possible cause #2: Netty DNS Resolver

    When we query nginx.default, the Netty DNS Resolver behaves like this:
    1. (1st search domain) Try the primary nameserver.
    2. (1st search domain) If the response is NXDOMAIN, try the secondary nameserver.
    3. Proceed with the next search domain.
    4. …
    (Slide shows modified tcpdump logs.)
  11. Reference: JVM’s built-in resolver

    1. Try the primary nameserver for all search domains.
    2. Try the secondary nameserver for all search domains.
    (Slide shows modified tcpdump logs.)
  12. Reference: cURL (glibc)

    1. First, try the primary nameserver for all search domains.
    2. Next, try the secondary nameserver for all search domains.
    (Slide shows modified tcpdump logs.)
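The difference between the two query orders described on slides 10 to 12 can be illustrated with a toy simulation. It is a sketch of the ordering only, not the actual Netty, JVM, or glibc implementation. The nameserver labels, the record table, and the IP address are hypothetical, and the secondary nameserver is modeled as always timing out, as in the incident.

```python
from typing import Callable, List, Optional, Tuple

Attempt = Tuple[str, str]  # (nameserver, query name)


def netty_style(candidates: List[str], nameservers: List[str]) -> List[Attempt]:
    """Per candidate name, go through every nameserver before moving on."""
    return [(ns, name) for name in candidates for ns in nameservers]


def glibc_style(candidates: List[str], nameservers: List[str]) -> List[Attempt]:
    """Per nameserver, go through every candidate name before moving on."""
    return [(ns, name) for ns in nameservers for name in candidates]


def simulate(
    attempts: List[Attempt], respond: Callable[[str, str], str]
) -> Tuple[Optional[str], int]:
    """Replay attempts until one yields an address; count timeouts on the way."""
    timeouts = 0
    for ns, name in attempts:
        result = respond(ns, name)
        if result == "TIMEOUT":
            timeouts += 1
        elif result != "NXDOMAIN":
            return result, timeouts
    return None, timeouts


# Hypothetical record: only the Service name under svc.cluster.local exists.
RECORDS = {"nginx.default.svc.cluster.local.": "10.96.12.34"}


def respond(ns: str, name: str) -> str:
    if ns == "secondary":
        return "TIMEOUT"  # unreachable upstream DNS
    return RECORDS.get(name, "NXDOMAIN")


CANDIDATES = [
    "nginx.default.default.svc.cluster.local.",
    "nginx.default.svc.cluster.local.",
    "nginx.default.cluster.local.",
    "nginx.default.",
]
NAMESERVERS = ["primary", "secondary"]

print(simulate(netty_style(CANDIDATES, NAMESERVERS), respond))
print(simulate(glibc_style(CANDIDATES, NAMESERVERS), respond))
```

In this model the Netty-style order waits on the unreachable secondary nameserver once for every search domain that returns NXDOMAIN before the right one is reached, whereas the other order gets its answer from the primary nameserver without ever contacting the secondary.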
  13. Workaround

    1. Explicitly use the JVM’s built-in resolver in Spring Webflux.
    2. Use an FQDN (with trailing dot) for queries: "example.com." rather than "example.com".
    3. Set ndots to 1 or 2 in the Pod spec field .spec.dnsConfig.options[*].
  14. Workaround

    1. Explicitly use the JVM’s built-in resolver in Spring Webflux.
    2. Use an FQDN (with trailing dot) for queries: "example.com." rather than "example.com".
    3. Set ndots to 1 or 2 in the Pod spec field .spec.dnsConfig.options[*].
    Make use of these workarounds to avoid DNS issues.
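As a rough illustration of the trailing-dot and ndots workarounds, the sketch below builds a Pod manifest that sets ndots through .spec.dnsConfig.options and normalizes a hostname to its trailing-dot form. The Pod name, container name, image, and the helper `to_fqdn` are hypothetical; only the dnsConfig field path and the ndots option come from the slides. The manifest is written as a Python dict and printed as JSON, which kubectl accepts alongside YAML.

```python
import json

# Hypothetical Pod; only the dnsConfig section reflects the workaround.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "webflux-app"},
    "spec": {
        "containers": [
            {"name": "app", "image": "example.invalid/webflux-app:latest"}
        ],
        # Option values are strings in the Pod API, hence "2" rather than 2.
        "dnsConfig": {"options": [{"name": "ndots", "value": "2"}]},
    },
}


def to_fqdn(name: str) -> str:
    """Append the trailing dot so resolvers skip the search list."""
    return name if name.endswith(".") else name + "."


print(json.dumps(pod_manifest, indent=2))
print(to_fqdn("example.com"))
```

Lowering ndots affects every lookup in the Pod, while the trailing dot only changes the hostnames it is applied to, so the two can be combined or used separately depending on how much of the application's configuration is under your control.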