Bringing Traffic Into Your Kubernetes Cluster

A look at various models for receiving traffic from outside of your cluster

Tim Hockin

July 11, 2020

Transcript

  1. Bringing traffic into your Kubernetes cluster. It seems like this should be easy. Tim Hockin (@thockin), v1

  2. Start with a “normal” cluster

  3. [Diagram: Cluster: 10.0.0.0/16]

  4. [Diagram: Cluster 10.0.0.0/16 with Node1 (IP 10.240.0.1) and Node2 (IP 10.240.0.2)]

  5. [Diagram: as above; Node1 gets pod range 10.0.1.0/24 and Node2 gets pod range 10.0.2.0/24]

  6. [Diagram: as above; Pod-a (10.0.1.1) and Pod-b (10.0.1.2) run on Node1, Pod-c (10.0.2.1) and Pod-d (10.0.2.2) on Node2]

  7. Kubernetes demands that pods can reach each other

  8. [Diagram: the same cluster, illustrating pod-to-pod reachability]

  9. Kubernetes says very little about how traffic gets INTO the cluster

  10. [Diagram: the same cluster, plus a Client (1.2.3.4) outside it and a question mark: how does its traffic get in?]

  11. That client might be from the internet or from elsewhere on your internal network

  12. Kubernetes offers 4 main APIs to bring traffic into your cluster

  13. 1) Pod IP

  14. 2) Service NodePort

  15. 3) Service LoadBalancer

  16. 4) Ingress

  17. Let’s look at these a bit more

  18. 1) Pod IP

  19. [Diagram: the client sends a packet directly to a pod (Src: client, Dst: pod:pod-port)]

  20. Requires a fully integrated network (flat IP space)

  21. Doesn’t work well for internet traffic

  22. Requires smart clients and service discovery (pod IPs change when pods move); a headless-Service sketch appears in the examples after the transcript

  23. Included for completeness, but not what most people are here to read about

  24. 2) Service NodePort

  25. A port on each node will forward traffic to your service. We know which service by which port. (A NodePort manifest appears in the examples after the transcript.)

  26. [Diagram: node ports :30093 and :30076 are open on every node; the client sends to Node1 (Src: client, Dst: node1:node-port)]

  27. [Diagram: Node1 forwards the traffic on to a pod (Src: node1, Dst: pod:pod-port)]

  28. Hold up, why did the source IP change?

  29. By default, a NodePort can forward to any pod, so this is possible:

  30. [Diagram: Node1 has no backing pod for this service, so it forwards to a pod on Node2 (Src: node1, Dst: pod:pod-port)]

  31. In that case, the traffic MUST return through node1, so we have to SNAT

  32. Pro:
      - No external infrastructure needed
      Con:
      - Can't use arbitrary ports
      - Clients have to pick a node (nodes can be added and removed over time)
      - SNAT loses client IP
      - Two hops

  33. Option: externalTrafficPolicy = Local

  34. If you set this on your service, nodes will only choose “local” pods

  35. Eliminates the need for SNAT

  36. Client must choose nodes which actually have pods, or else:

  37. [Diagram: the client picked Node1, which has no backing pod; with Local the traffic has nowhere to go (Dst: ???, failure)]

  38. Also risk imbalance if clients assume equal weight on nodes:

  39. [Diagram: Node1 has one pod and Node2 has six, but the client splits traffic 50%/50% between the two nodes]

  40. Pro:
      - No external infrastructure needed
      - Client IP is available
      Con:
      - Can't use arbitrary ports
      - Clients have to pick a node with pods
      - Two hops (but less impactful)

  41. 3) Service LoadBalancer

  42. Someone (e.g. a cloud provider) allocates a load-balancer for your service (a LoadBalancer manifest appears in the examples after the transcript)

  43. This is an API with very loose requirements

  44. There are a few ways this has been implemented (non-exhaustive)

  45. 3a) VIP-like, 2-hops (e.g. GCP NetworkLB)

  46. The node knows which service by which destination IP (VIP)

  47. How VIPs are propagated and managed is a broad topic, and not considered here

  48. [Diagram: a VIP fronts the cluster; the client's packet is addressed to it (Src: client, Dst: VIP:service-port)]

  49. [Diagram: the same packet, still addressed to VIP:service-port, reaches Node1]

  50. [Diagram: Node1 forwards the traffic on to a pod (Src: node1, Dst: pod:pod-port)]

  51. Why did the source IP change, again?

  52. Like a NodePort, a VIP can forward to any pod, so this is possible:

  53. [Diagram: Node1 has no backing pod for this service, so it forwards to a pod on Node2 (Src: node1, Dst: pod:pod-port)]

  54. Again, the traffic MUST return through node1, so we have to SNAT

  55. Pro:
      - Stable VIP
      - Can use any port you want
      Con:
      - Requires programmable infrastructure
      - SNAT loses client IP
      - Two hops

  56. Option: externalTrafficPolicy = Local

  57. If you set this on your service, nodes will only choose “local” pods

  58. Eliminates the need for SNAT

  59. LBs must choose nodes which actually have pods

  60. Pro:
      - Stable VIP
      - Can use any port you want
      - Client IP is available
      Con:
      - Requires programmable infrastructure
      - Two hops (but less impactful)

  61. 3b) VIP-like, 1-hop (no known examples)

  62. As far as I know, nobody has implemented this

  63. 3c) Proxy-like, 2-hops (e.g. AWS ElasticLB)

  64. [Diagram: a Proxy sits in front of the node ports :30093 and :30076; the client sends to it (Src: client, Dst: proxy:service-port)]

  65. [Diagram: the proxy forwards to a node port (Src: proxy, Dst: node1:node-port)]

  66. [Diagram: Node1 forwards the traffic on to a pod (Src: node1, Dst: pod:pod-port)]

  67. Again with the SNAT?

  68. Yes, this is basically the same as NodePort, but with a nicer front door

  69. Note that the node which receives the traffic has no idea what the original client IP was

  70. Pro:
      - Stable IP
      - Can use any port you want
      - Proxy can prevent some classes of attacks
      - Proxy can add value (e.g. TLS)
      Con:
      - Requires programmable infrastructure
      - Two hops
      - Loss of client IP (has to move in-band)

  71. Option: externalTrafficPolicy = Local

  72. If you set this on your service, nodes will only choose “local” pods

  73. Eliminates the need for SNAT

  74. LBs must choose nodes which actually have pods

  75. Pro:
      - Stable IP
      - Can use any port you want
      - Proxy can prevent some classes of attacks
      - Proxy can add value (e.g. TLS)
      Con:
      - Requires programmable infrastructure
      - Two hops
      - Loss of client IP (has to move in-band)

  76. 3d) Proxy-like, 1-hop (e.g. GCP HTTP LB)

  77. [Diagram: a Proxy fronts the cluster; the client sends to it (Src: client, Dst: proxy:service-port)]

  78. [Diagram: the proxy forwards directly to a pod (Src: proxy, Dst: pod:pod-port)]

  79. No need for the node to do anything

  80. LB needs to know the pod IPs and be kept in sync (an Endpoints sketch appears in the examples after the transcript)

  81. Pro:
      - Stable IP
      - Can use any port you want
      - Proxy can prevent some classes of attacks
      - Proxy can add value (e.g. TLS)
      - One hop
      Con:
      - Requires programmable infrastructure
      - Loss of client IP (has to move in-band)

  82. 4) Ingress (HTTP only)

  83. Someone (e.g. a cloud provider) allocates an HTTP load-balancer for your service (an Ingress manifest appears in the examples after the transcript)

  84. This is an API with very loose requirements

  85. There are a couple ways this has been implemented (non-exhaustive)

  86. 4a) External, 2-hops (e.g. GCP without VPC Native)

  87. [Diagram: a Proxy sits in front of the node ports :30093 and :30076; the client sends to it (Src: client, Dst: proxy:service-port)]

  88. [Diagram: the proxy forwards to a node port (Src: proxy, Dst: node1:node-port)]

  89. [Diagram: Node1 forwards the traffic on to a pod (Src: node1, Dst: pod:pod-port)]

  90. Same as 3c

  91. HTTP Proxy can save client IP in X-Forwarded-For header

  92. Pro:
      - Stable IP
      - Proxy can prevent some classes of attacks
      - Proxy can add value (e.g. TLS)
      - Can offer HTTP semantics (e.g. URL maps)
      Con:
      - Requires programmable infrastructure
      - Two hops

  93. Option: externalTrafficPolicy = Local

  94. Same as before

  95. 4b) External, 1-hop (e.g. GCP with VPC Native)

  96. [Diagram: a Proxy fronts the cluster; the client sends to it (Src: client, Dst: proxy:service-port)]

  97. [Diagram: the proxy forwards directly to a pod (Src: proxy, Dst: pod:pod-port)]

  98. Same as 3d

  99. HTTP Proxy can save client IP in X-Forwarded-For header

  100. Pro:
      - Stable IP
      - Proxy can prevent some classes of attacks
      - Proxy can add value (e.g. TLS)
      - Can offer HTTP semantics (e.g. URL maps)
      - One hop
      Con:
      - Requires programmable infrastructure

  101. 4c) Internal, shared (e.g. nginx)

  102. Use a service LoadBalancer (see 3a-d) to bring traffic into pods which are HTTP proxies; those in-cluster proxies route to the final pods

  103. Pro:
      - Stable IP
      - Proxy can add value (e.g. TLS)
      - Flexible
      - Low-cost
      Con:
      - You manage and scale the proxies
      - Multiple hops
      - Conflicts can arise between Ingress resources (e.g. claiming the same hostname)

  104. 4d) Internal, dedicated (no known examples)

  105. The idea is that you would spin up the equivalent of 4c for each Ingress instance, or maybe per-namespace

  106. As far as I know, nobody has implemented this
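
Example manifests

To make the four models concrete, here are a few sketches. First, the pod-IP model (slides 18-23) needs client-side service discovery; one common in-cluster form of it is a headless Service, which publishes the pod IPs themselves as DNS A records instead of a virtual IP. All names, labels, and ports below are illustrative, not from the talk:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-app             # hypothetical name
    spec:
      clusterIP: None          # "headless": DNS returns the pod IPs directly
      selector:
        app: my-app            # assumes the pods carry the label app=my-app
      ports:
      - port: 8080             # the pod port smart clients should dial

A client on the flat network resolves my-app.<namespace>.svc.cluster.local, gets the current pod IPs back, and connects directly; it must re-resolve when pods move.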
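
For NodePort (slides 24-32), a minimal Service sketch; kube-proxy on every node then forwards the node port to the Service's backing pods. The port 30093 echoes the diagrams:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
    spec:
      type: NodePort
      selector:
        app: my-app
      ports:
      - port: 80               # the Service's own (cluster-internal) port
        targetPort: 8080       # the pod port behind it
        nodePort: 30093        # must fall in the node-port range (default 30000-32767)

Omitting nodePort lets Kubernetes pick a free port from the range, which is usually what you want.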
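
For LoadBalancer (slides 41-60), the same Service with the type switched, plus the externalTrafficPolicy: Local option from slides 33 and 56:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
    spec:
      type: LoadBalancer
      externalTrafficPolicy: Local   # only deliver to pods on the receiving node; avoids SNAT
      selector:
        app: my-app
      ports:
      - port: 443
        targetPort: 8443

Once the cloud controller provisions the balancer, its address appears in the Service's status.loadBalancer.ingress. With Local, Kubernetes also allocates a health-check node port so the balancer can steer around nodes that have no local pods (slide 59).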
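
Slide 80 says a 1-hop balancer has to know the pod IPs and be kept in sync. Kubernetes already maintains that bookkeeping: the endpoints controller publishes the ready pod IPs behind each Service as an Endpoints object (newer clusters also get EndpointSlices), which a load-balancer controller can watch. For the example cluster it would look roughly like this:

    apiVersion: v1
    kind: Endpoints
    metadata:
      name: my-service         # always matches the Service name
    subsets:
    - addresses:
      - ip: 10.0.1.1           # Pod-a
      - ip: 10.0.2.1           # Pod-c
      ports:
      - port: 8080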
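
Finally, for Ingress (slides 82-103), a minimal sketch; the host, path, and backend are illustrative. Whether it is realized by an external proxy (4a/4b) or by in-cluster proxies (4c) depends on which ingress controller claims it:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: my-ingress
    spec:
      ingressClassName: nginx          # assumes an nginx ingress controller, as in 4c
      rules:
      - host: app.example.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service       # the Service from the sketches above
                port:
                  number: 80

Clusters from the era of this talk would write the same object as networking.k8s.io/v1beta1, with a slightly different backend syntax.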