Slide 1

101 - eBPF & Dataplane v2 with GKE
Lê Anh Tuấn
[email protected]
@tuanbo91

Slide 2

Agenda
● eBPF - 101
● Cilium - 101
● GKE Dataplane v2
● What’s next?

Slide 3

eBPF - 101

Slide 4

What is eBPF?
● eBPF is a mechanism that lets Linux applications execute code in kernel space.
● eBPF runs sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules.
● Several components cooperate to compile, verify, load, and execute eBPF programs.

Slide 5

What is eBPF?
● Extended BPF (eBPF) is an evolution of the original ("classic") BPF (cBPF) used to filter network packets.
● The Linux kernel runs only eBPF; loaded cBPF bytecode is transparently translated into an eBPF representation in the kernel before execution.
● eBPF programs can be attached to many kinds of events:
○ kprobes, tracepoints, uprobes, sockets, cgroup filters, etc.

Slide 6

Why eBPF?
● Kernel space has always been an ideal place to implement observability, security, and networking functionality, thanks to the kernel’s privileged ability to oversee and control the entire system.
● At the same time, the kernel is hard to evolve because of its central role and strict requirements for stability and security. The rate of innovation at the operating-system level has therefore traditionally been lower than for functionality implemented outside the OS.

Slide 7

eBPF use cases
● Security
● Networking
● Profiling
● Observability (logs / metrics / tracing)

Slide 8

The pitfalls of eBPF
● Restricted to Linux and a recent kernel
● Sandboxed programs are limited

Slide 9

The pitfalls of eBPF
● Restricted to Linux and a recent kernel
NOT ANYMORE!

Slide 10

eBPF Foundation

Slide 11

BPF Tech Adoption

Slide 12

Cilium - 101

Slide 13

What is Cilium?
● Open-source project using eBPF as its foundation
● Networking & load-balancing
● Network security
● Observability

Slide 14

So, why Cilium? Kubernetes uses iptables for...
● kube-proxy, the component that implements Services and load balancing via DNAT iptables rules
● Most CNI plugins, which use iptables to enforce Network Policies
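For context, the kube-proxy DNAT rules mentioned above look roughly like the sketch below. The KUBE-SERVICES, KUBE-SVC, and KUBE-SEP chain names are real kube-proxy conventions; the specific suffixes, IPs, and ports here are invented for illustration.

```
# Illustrative shape only; chain suffixes, IPs, and ports are hypothetical.
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp --dport 80 -j KUBE-SVC-EXAMPLE
-A KUBE-SVC-EXAMPLE -m statistic --mode random --probability 0.5 -j KUBE-SEP-POD1
-A KUBE-SVC-EXAMPLE -j KUBE-SEP-POD2
-A KUBE-SEP-POD1 -p tcp -j DNAT --to-destination 10.0.1.5:8080
-A KUBE-SEP-POD2 -p tcp -j DNAT --to-destination 10.0.2.7:8080
```

Every Service adds more rules to these chains, which is where the scaling problems on the next slide come from.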

Slide 15

So, what’s wrong with iptables?
● Updates must be made by recreating and reapplying the entire rule set in a single transaction.
● Chains of rules are implemented as a linked list, so all operations are O(n).
● The standard practice for implementing access control lists (ACLs) with iptables is a sequential list of rules.
● It matches on IPs and ports and is unaware of L7 protocols.
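To make the O(n) point concrete, here is a minimal sketch (not iptables itself, and a deliberately simplified model): matching a packet against a sequential rule list inspects every rule until one hits, while a hash-map lookup, the kind of structure eBPF maps enable, costs the same regardless of how many rules exist.

```python
# Simplified model: compare sequential rule matching (iptables-style
# chains) with a single hash-map lookup (eBPF-map-style datapath).

def match_sequential(rules, packet):
    """Walk the rule list in order; cost grows linearly with len(rules)."""
    for i, (dst, port, verdict) in enumerate(rules):
        if packet == (dst, port):
            return verdict, i + 1  # verdict plus number of rules inspected
    return "DROP", len(rules)

def match_map(rule_map, packet):
    """Single hash lookup; cost is independent of the number of rules."""
    return rule_map.get(packet, "DROP"), 1

# 250 hypothetical ACCEPT rules, one per destination IP.
rules = [(f"10.0.0.{i}", 80, "ACCEPT") for i in range(1, 251)]
rule_map = {(dst, port): verdict for dst, port, verdict in rules}

packet = ("10.0.0.250", 80)
print(match_sequential(rules, packet))  # inspects all 250 rules
print(match_map(rule_map, packet))      # exactly 1 lookup
```

The real datapaths are far more involved, but the asymptotic difference is exactly this: per-packet cost that grows with cluster size versus per-packet cost that stays flat.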

Slide 16

Cilium performance

Slide 17

Cilium use cases
● Networking
○ Highly efficient and flexible networking
○ Routing, overlay, cloud-provider native
○ IPv4, IPv6, NAT46
● Network security
○ Identity-based network security
○ API-aware security (HTTP, gRPC, Kafka, Cassandra, memcached, ...)
○ DNS-aware
● Load-balancing
○ Highly scalable L3-L4 load balancing
○ Kubernetes Services (replaces kube-proxy)
● Observability
○ Metrics (network, DNS, security, latencies, HTTP, ...)
○ Flow logs (w/ datapath aggregation)

Slide 18

Cilium: Load-balancing

Slide 19

Cilium: Network Policies
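Cilium enforces standard Kubernetes NetworkPolicy objects. A minimal sketch (the names and labels are hypothetical): allow only frontend Pods to reach backend Pods on TCP 8080.

```yaml
# Hypothetical example; names, labels, and port are illustrative.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```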

Slide 20

Cilium: Identity-based Security
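With upstream Cilium, the CiliumNetworkPolicy CRD extends this to identity-based and API-aware rules; enforcement keys on the workload's label-derived identity rather than on Pod IPs, and can match L7 attributes. A hedged sketch with hypothetical names and labels (note that support for Cilium CRDs varies across managed distributions):

```yaml
# Hypothetical example; names, labels, port, and path are illustrative.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-http
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:          # identity: any Pod with these labels,
        - matchLabels:        # regardless of its IP
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:             # API-aware: only GETs on /api/*
              - method: GET
                path: "/api/.*"
```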

Slide 21

Cilium: DNS-aware Policy
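DNS-aware policy lets egress rules reference hostnames instead of IPs: Cilium observes DNS responses and maps the allowed names to addresses at runtime. A sketch with hypothetical names, following the upstream toFQDNs pattern:

```yaml
# Hypothetical example; workload labels and FQDN are illustrative.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-github-only
spec:
  endpointSelector:
    matchLabels:
      app: crawler
  egress:
    # Allow DNS lookups so Cilium can observe name resolution.
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
          rules:
            dns:
              - matchPattern: "*"
    # Allow traffic only to IPs that resolved from this name.
    - toFQDNs:
        - matchName: "github.com"
```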

Slide 22

GKE Dataplane V2

Slide 23

GKE Dataplane V2 - Advantages
● Security
○ Kubernetes Network Policy is always on in clusters with GKE Dataplane V2. You don't have to install and manage third-party add-ons such as Calico to enforce network policy.
● Scalability
○ GKE Dataplane V2 is implemented without kube-proxy and does not rely on iptables for service routing. This removes a major bottleneck for scaling Kubernetes Services in very large clusters.
● Operations
○ When you create a cluster with GKE Dataplane V2, network policy logging is built in. Configure the logging CRD on your cluster to see when connections are allowed and denied by your Pods.
● Consistency
○ GKE Dataplane V2 is available and provides the same features on GKE and in other Anthos cluster environments.
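The logging CRD mentioned under Operations is configured with a cluster-scoped NetworkLogging object. The sketch below follows the schema as I understand it from the GKE documentation; verify the current API version and fields against the docs before applying.

```yaml
# Sketch of GKE network policy logging config; check current GKE docs
# for the exact apiVersion and field names.
kind: NetworkLogging
apiVersion: networking.gke.io/v1alpha1
metadata:
  name: default
spec:
  cluster:
    allow:
      log: true      # log allowed connections
      delegate: false
    deny:
      log: true      # log denied connections
      delegate: false
```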

Slide 24

GKE Dataplane V2 - Technical specifications

Slide 25

GKE Dataplane V2 - Technical specifications

Slide 26

GKE Dataplane V2

Slide 27

GKE Dataplane V2 - Limitations
● GKE Dataplane V2 can only be enabled when creating a new cluster. Existing clusters cannot (at the moment) be upgraded to GKE Dataplane V2.
● Not all Cilium configurations are supported by GKE Dataplane V2.
● https://cloud.google.com/kubernetes-engine/docs/concepts/dataplane-v2#limitations

Slide 28

What’s next

Slide 29

What’s next

Slide 30

What’s next

gcloud container clusters create CLUSTER_NAME \
    --enable-dataplane-v2 \
    --enable-ip-alias \
    --release-channel CHANNEL_NAME \
    --region COMPUTE_REGION

https://cloud.google.com/kubernetes-engine/docs/how-to/dataplane-v2

Slide 31

What’s next https://console.cloud.google.com/freetrial

Slide 32

Thank you!