A Deep Dive into Service Mesh and Istio
Speaker: Shukun SONG
Company: FUJITSU LIMITED
CLOUDNATIVE DAYS TOKYO 2019 / OPENSTACK DAYS TOKYO 2019 [2C1] 13:20 – 14:00, July 23rd, 2019
Copyright 2019 FUJITSU LIMITED

Outline
• Service mesh and Istio
  • Architecture
  • Packet flow
• Performance measurement of Istio
  • Where the problem is
  • Our trial
• Istio’s next feature

Who am I?
Shukun SONG
Software engineer on the OSS development team at Fujitsu. He focuses on improving the performance of Istio and on rolling upgrades of Kubernetes components.

Service Mesh
What is a service mesh?
The network of microservices that make up a cloud-based, containerized application, together with the service-to-service communication between them. Key features:
• Traffic management: service discovery, routing
• Observability: telemetry
• Policies and security: policy checks, authentication, credential management

Overview of Istio
What is Istio?
• A service mesh started by Lyft, Google and IBM.
Characteristics
• Extensible and pluggable
• Not a CNCF project; runs on Kubernetes (K8s)

Istio’s Architecture: proxy between applications
[Diagram: data plane with Service A and Service B, each behind a proxy; Proxy = Envoy + (Mixer client)]
Istio’s proxy includes Envoy, an OSS service proxy for cloud-native applications.

Istio’s Architecture: observability and control
[Diagram: Mixer in the control plane, above the proxies of Service A and Service B in the data plane]
Mixer enforces policy controls and collects telemetry.

Istio’s Architecture: traffic management
[Diagram: Pilot in the control plane, above the proxies of Service A and Service B in the data plane]
Pilot configures all proxies to perform load balancing and routing for requests, and provides an API for service discovery.

Istio’s Architecture: certificate management
[Diagram: Citadel (CA) in the control plane, above the proxies of Service A and Service B in the data plane]
Citadel provides key pairs and manages certificates for mTLS.

Istio’s Architecture: config management
[Diagram: Galley, Mixer and Pilot in the control plane, above the proxies of Service A and Service B in the data plane]
Galley validates Istio configuration with a K8s ValidatingWebhook.

Packet flows (except telemetry)
[Diagram: packet flows among the LB, istio-ingressgateway, the proxies of Service A and Service B, and Mixer (istio-policy)]

Performance measurement of Istio
Purpose
• Measure the response time of a single request
Test environment
• Istio version: 1.0.5
• A K8s HA cluster with 3 masters and 5 workers
• VM spec: 8 × 2 GHz cores + 8 GB memory
• The bookinfo sample app with 3 replicas
• 100 requests sent at 1 RPS (request per second)

Result of measurement
Our result

                 Min [ms]   Max [ms]   Average [ms]
With Istio       33.56      73.07      53
Without Istio    22.9       50.82      36

Community result
• Official report: https://istio.io/docs/concepts/performance-and-scalability/
• Recent performance assessment: https://drive.google.com/drive/folders/1Npndz_yUVXktocg5MUNgwA0lP1tNzrZX

Mixer adds a lot of overhead, in both check and report.

Where the problem is
Data path
① Request
② Check send
③ Check response
④ Request
⑤ Report send
⑥ Report response
[Diagram: Service A and its proxy in an app pod on Node 1; Service B and its proxy in an app pod on Node 2; istio-policy and istio-telemetry on Node 3; arrows ① to ⑥]

1. Check
[Diagram: data path ① to ⑥ across Node 1 (Service A, proxy), Node 2 (Service B, proxy) and Node 3 (istio-policy, istio-telemetry)]
• The request is blocked until the check is done.
• Ways to reduce the traffic:
  • Cache: does not really help. There are too many attributes with too many values, so the cache only makes the local proxy bigger and bigger.
  • Turn off: Check does not really matter, since most of its purposes can be achieved with RBAC. Off by default from v1.1.
[Diagram: each proxy keeps a cache of attributes; Mixer dispatches them to Adapter A, Adapter B and Adapter C]
Example attributes:
  request.path: xyz/abc
  request.size: 234
  request.time: 12:34:56.789 04/17/2017
  source.ip: [192 168 0 1]
  destination.service.name: example

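The point about caching can be illustrated with a small Go sketch. It is a deliberately simplified model, not Istio's Mixer client: the CheckCache type, the key function and the rule that every attribute takes part in the cache key are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Attributes is a simplified stand-in for the attribute bag a proxy sends to Mixer.
type Attributes map[string]string

// key builds a deterministic cache key from every attribute name and value.
func key(a Attributes) string {
	names := make([]string, 0, len(a))
	for n := range a {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		b.WriteString(n)
		b.WriteByte('=')
		b.WriteString(a[n])
		b.WriteByte(';')
	}
	return b.String()
}

// CheckCache remembers one check result per attribute combination.
type CheckCache struct {
	entries map[string]bool
	Hits    int
	Misses  int
}

func NewCheckCache() *CheckCache {
	return &CheckCache{entries: map[string]bool{}}
}

// Check returns the cached result, or calls remote on a miss and stores its answer.
func (c *CheckCache) Check(a Attributes, remote func(Attributes) bool) bool {
	k := key(a)
	if ok, found := c.entries[k]; found {
		c.Hits++
		return ok
	}
	c.Misses++
	ok := remote(a)
	c.entries[k] = ok
	return ok
}

// Size reports how many entries the cache holds.
func (c *CheckCache) Size() int { return len(c.entries) }

// simulate sends n requests through a fresh cache. When withTime is true,
// each request carries a distinct request.time value.
func simulate(n int, withTime bool) *CheckCache {
	c := NewCheckCache()
	allow := func(Attributes) bool { return true } // stand-in for the remote check
	for i := 0; i < n; i++ {
		a := Attributes{
			"request.path":             "xyz/abc",
			"destination.service.name": "example",
		}
		if withTime {
			a["request.time"] = fmt.Sprintf("12:34:56.%03d", i)
		}
		c.Check(a, allow)
	}
	return c
}

func main() {
	for _, withTime := range []bool{false, true} {
		c := simulate(1000, withTime)
		fmt.Printf("request.time in key=%v hits=%d misses=%d entries=%d\n",
			withTime, c.Hits, c.Misses, c.Size())
	}
}
```

With stable attributes the cache is hit on every request after the first. As soon as a high-cardinality attribute such as request.time is part of the key, every request misses and the cache grows by one entry per request, which is the growth described above.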
2. Report
[Diagram: data path ① to ⑥ across Node 1 (Service A, proxy), Node 2 (Service B, proxy) and Node 3 (Mixer: istio-policy, istio-telemetry)]
• Happens after the request is done.
• Batched: it does not happen on every request.
• Does not always affect performance, but does hurt it when a report is sent.
• The default batch size has since been reduced from 1000 to 100.

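Size-based batching of this kind can be sketched in Go as below. The Batcher type and countSends helper are hypothetical names for illustration; the sketch does not reproduce the proxy's actual report code, and any time-based flushing is not modeled.

```go
package main

import "fmt"

// Record is a simplified stand-in for one request's telemetry attributes.
type Record struct{ Path string }

// Batcher buffers records and hands them to send once size records have accumulated.
type Batcher struct {
	size int
	buf  []Record
	send func([]Record)
}

// NewBatcher creates a Batcher; a size below 1 is treated as 1.
func NewBatcher(size int, send func([]Record)) *Batcher {
	if size < 1 {
		size = 1
	}
	return &Batcher{size: size, buf: make([]Record, 0, size), send: send}
}

// Add buffers r and flushes when the batch is full, so only the request
// that fills the batch triggers a send.
func (b *Batcher) Add(r Record) {
	b.buf = append(b.buf, r)
	if len(b.buf) >= b.size {
		b.Flush()
	}
}

// Flush sends any buffered records and empties the buffer.
func (b *Batcher) Flush() {
	if len(b.buf) == 0 {
		return
	}
	batch := make([]Record, len(b.buf))
	copy(batch, b.buf)
	b.buf = b.buf[:0]
	b.send(batch)
}

// countSends reports how many sends n requests cause with the given
// batch size, including one final flush of a partial batch.
func countSends(n, size int) int {
	sends := 0
	b := NewBatcher(size, func([]Record) { sends++ })
	for i := 0; i < n; i++ {
		b.Add(Record{Path: "/productpage"})
	}
	b.Flush()
	return sends
}

func main() {
	for _, size := range []int{1000, 100} {
		fmt.Printf("batch size %d: %d sends for 1000 requests\n", size, countSends(1000, size))
	}
}
```

A smaller batch means more frequent but smaller sends, which spreads the cost of reporting over more requests instead of concentrating it on a few.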
Current structure
[Diagram: data path ① to ⑥ across Node 1 (Service A, proxy), Node 2 (Service B, proxy) and Node 3 (Mixer: istio-policy, istio-telemetry), annotated "Reduce this traffic"]

Our idea
Reduce traffic between nodes.
[Diagram: a Mixer (istio-policy, istio-telemetry) on Node 1 next to Service A and its proxy, and another on Node 2 next to Service B and its proxy; arrows ① to ⑥]

Reference (IBM)
Measurement
• Does not use mixc; instead, a client was written that calls the report function directly and serially.
Result
• Compared with a result of 3.9 ms, reporting to a local Mixer reduces latency by 1.1 to 2.4 ms.
[Diagram: Node A, Node B and Node C, each with istio-telemetry and a client]

Our trial: Distributed Mixer
Points
① Create a Mixer on each node
② Ensure proxies communicate with the Mixer on the same node
[Diagram: each node runs one Mixer (①) and several pods, each with an app container and a proxy; every proxy connects to the Mixer on its own node (②)]

Ensure proxies communicate with the Mixer on the same node
[Diagram: left, access to Mixer via a Service (load balanced across Mixers); right, access to a Mixer directly]
Sidecar deployment
• Envoy has to be deployed with a configuration that specifies Mixer directly.
• pilot-agent generates the JSON configuration from a template:

  …,
  "dynamic_resources": {
    // Ask Pilot
  },
  "static_resources": {
    // Pilot information,
    // Add mixer’s information
  },
  …,

Our trial: Configure Envoy
Add static resource
• The configuration is created from the template by the function WriteBootstrap in istio.io/istio/pkg/bootstrap/bootstrap_config.go:

  // WriteBootstrap generates an envoy config based on config and epoch, and returns the filename.
  // TODO: in v2 some of the LDS ports (port, http_port) should be configured in the bootstrap.
  func WriteBootstrap(config *meshconfig.ProxyConfig, node string, epoch int, pilotSAN []string,
      opts map[string]interface{}, localEnv []string, nodeIPs []string, dnsRefreshRate string) (string, error) {

• Make a small change to this function so that pilot-agent recognizes HOST_IP.

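The kind of change meant here might look like the following Go sketch. It is not the patch from the trial: hostIPFromEnv, addLocalMixer and the option key local_mixer_address are hypothetical, port 9091 is an assumed Mixer gRPC port, and the sketch assumes HOST_IP is injected into the sidecar (for example from the pod's status.hostIP) and that the node-local Mixer is reachable on the node's IP.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// mixerPort is an assumed gRPC port for the node-local Mixer.
const mixerPort = "9091"

// hostIPFromEnv looks for HOST_IP=<ip> in an environment list shaped like
// os.Environ(), the same shape as the localEnv argument of WriteBootstrap.
func hostIPFromEnv(env []string) (string, bool) {
	const prefix = "HOST_IP="
	for _, kv := range env {
		if !strings.HasPrefix(kv, prefix) {
			continue
		}
		v := strings.TrimPrefix(kv, prefix)
		if net.ParseIP(v) != nil {
			return v, true
		}
	}
	return "", false
}

// addLocalMixer stores the node-local Mixer address in the template options
// under a hypothetical key, so that a bootstrap template could render it
// into static_resources. It reports whether a valid HOST_IP was found.
func addLocalMixer(opts map[string]interface{}, env []string) bool {
	ip, ok := hostIPFromEnv(env)
	if !ok {
		return false
	}
	opts["local_mixer_address"] = net.JoinHostPort(ip, mixerPort)
	return true
}

func main() {
	env := []string{"PATH=/usr/bin", "HOST_IP=10.0.0.5"} // example values
	opts := map[string]interface{}{}
	if addLocalMixer(opts, env) {
		fmt.Println("local mixer:", opts["local_mixer_address"])
	} else {
		fmt.Println("HOST_IP not set; keep the default Mixer address")
	}
}
```

Falling back to the default when HOST_IP is absent or invalid keeps sidecars working on nodes where no local Mixer has been deployed.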
Our trial: Configure Envoy
Figure out the process of check (in progress)
• Dump the config of a running Envoy sidecar.
• Envoy config is hard! Cluster, listener, router, filter, …: so many new concepts, in over 7000 lines.
• Istio creates a custom filter named mixer.
  • This filter is a Mixer client responsible for interacting with the Mixer server through gRPC.
  • We do not yet know how to write the configuration for this filter correctly.

Istio’s next feature
Extensibility v2 (formerly known as Mixer v2)
• Envoy communicates with the backends directly, to reduce the overhead between Envoy and Mixer.
• No change to Envoy
• No Envoy restart
[Diagram: backends reached through Mixer, and backends reached directly]
https://www.youtube.com/watch?v=XdWmm_mtVXI