Slide 1

Slide 1 text

A Deep Dive into Service Mesh and Istio
Speaker: Shukun SONG
Company: FUJITSU LIMITED
CLOUDNATIVE DAYS TOKYO 2019 / OPENSTACK DAYS TOKYO 2019 [2C1] 13:20 – 14:00, July 23rd, 2019
Copyright 2019 FUJITSU LIMITED

Slide 2

Slide 2 text

Outline Service mesh and Istio  Architecture  Packet flow Performance measurement of Istio Where the problem is Our trial Istio’s next feature Copyright 2019 FUJITSU LIMITED 1

Slide 3

Slide 3 text

Who am I?
Shukun SONG
Software engineer on the OSS development team at Fujitsu. He focuses on improving the performance of Istio and on rolling upgrades of Kubernetes components.

Slide 4

Slide 4 text

Outline Service mesh and Istio  Architecture  Packet flow Performance measurement of Istio Where the problem is Our trial Istio’s next feature Copyright 2019 FUJITSU LIMITED 3

Slide 5

Slide 5 text

Service Mesh What is service mesh?  Network of microservices that make up service-to-service communication on cloud-based applications, containers and microservices with the following key features: Copyright 2019 FUJITSU LIMITED  Traffic management: Service discovery, routing  Observability: Telemetry  Policies and Security: Policy check, Authentication, credential management 4

Slide 6

Slide 6 text

Overview of Istio
What is Istio?
- A service mesh started by Lyft, Google and IBM.
Characteristics
- Extensible and pluggable
- Not a CNCF project (as of 2019); runs on K8s

Slide 7

Slide 7 text

Istio's Architecture
[Figure: the data plane contains Service A and Service B, each with a Proxy; the control plane contains Pilot, Mixer, Citadel and Galley.]

Slide 8

Slide 8 text

Istio's Architecture
Proxy between applications
- Proxy = Envoy + (Mixer client)
- Istio's proxy is built on Envoy, an OSS service proxy for cloud native applications.
[Figure: data plane with Service A and Service B, each with a Proxy.]

Slide 9

Slide 9 text

Istio's Architecture
Observability and control
- Mixer enforces policy controls and collects telemetry.
[Figure: Mixer in the control plane, connected to the Proxies of Service A and Service B in the data plane.]

Slide 10

Slide 10 text

Istio's Architecture
Traffic management
- Pilot configures all proxies to perform request routing and load balancing, and provides an API for service discovery.
[Figure: Pilot in the control plane, connected to the Proxies of Service A and Service B in the data plane.]

Slide 11

Slide 11 text

Istio's Architecture
Certificate management
- Citadel provides key pairs and manages certificates for mTLS.
[Figure: Citadel (CA) in the control plane, connected to the Proxies of Service A and Service B in the data plane.]

Slide 12

Slide 12 text

Istio's Architecture
Config management
- Galley can validate Istio configuration with a K8s ValidatingWebhook.
[Figure: Galley in the control plane alongside Mixer and Pilot; Service A and Service B with their Proxies in the data plane.]

Slide 13

Slide 13 text

Packet flows (except telemetry)
[Figure: two packet flows, one reaching Service A from outside and one for Service A ⇔ Service B. Components shown: LB, istio-ingressgateway, the Proxies of Service A and Service B, and Mixer (istio-policy).]

Slide 14

Slide 14 text

Outline Service mesh and Istio  Architecture  Packet flow Performance measurement of Istio Where the problem is Our trial Istio’s next feature Copyright 2019 FUJITSU LIMITED 13

Slide 15

Slide 15 text

Performance measurement of Istio
Purpose
- Measure the response time of a single request
Test environment
- Istio version: 1.0.5
- A k8s HA cluster with 3 masters and 5 workers
- VM spec: 8 × 2 GHz cores + 8 GB memory
- The bookinfo sample app with 3 replicas
- 100 requests sent at 1 RPS (request per second)

Slide 16

Slide 16 text

Result of measurement
Our result

                 Min[ms]   Max[ms]   Average[ms]
  With Istio     33.56     73.07     53
  Without Istio  22.9      50.82     36

On average, Istio adds about 17 ms per request in this setup (53 ms vs. 36 ms, roughly 47% more).

Community result
- Official report: https://istio.io/docs/concepts/performance-and-scalability/
- Recent performance assessment: https://drive.google.com/drive/folders/1Npndz_yUVXktocg5MUNgwA0lP1tNzrZX

Mixer adds a lot of overhead, in both check and report.

Slide 17

Slide 17 text

Where the problem is
Data path
① Request
② Check send
③ Check response
④ Request
⑤ Report send
⑥ Report response
[Figure: Node 1 runs the app pod of Service A with its Proxy; Node 2 runs the app pod of Service B with its Proxy; Node 3 runs Istio-policy and Istio-telemetry. Arrows ① to ⑥ mark the steps above.]

Slide 18

Slide 18 text

Mixer architecture
- Uniform abstraction of backends
- Envoy produces common attributes based on requests
- Mixer consumes attributes and talks to backends using adapters
- Backends provide functionality such as logging, monitoring and checking

Example attributes sent by the Proxy:
  request.path: xyz/abc
  request.size: 234
  request.time: 12:34:56.789 04/17/2017
  source.ip: [192 168 0 1]
  destination.service.name: example

[Figure: Proxy sends attributes to Mixer; Mixer reaches each Backend through its own Adapter.]
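The adapter model described above can be illustrated with a small Go sketch. All types here (`Attributes`, `Adapter`, `Mixer`, `denyPath`, `accessLog`) are hypothetical simplifications written for this example; they are not Istio's actual adapter API.

```go
package main

import (
	"errors"
	"fmt"
)

// Attributes is a simplified attribute bag: attribute names mapped to values.
type Attributes map[string]string

// Adapter is a hypothetical stand-in for a Mixer adapter that fronts one backend.
type Adapter interface {
	Name() string
	Check(attrs Attributes) error // nil means the request is allowed
	Report(attrs Attributes)
}

// Mixer gives the proxy one uniform entry point and fans out to the adapters.
type Mixer struct{ adapters []Adapter }

// Check asks every adapter in turn and stops at the first denial.
func (m *Mixer) Check(attrs Attributes) error {
	for _, a := range m.adapters {
		if err := a.Check(attrs); err != nil {
			return fmt.Errorf("%s: %w", a.Name(), err)
		}
	}
	return nil
}

// Report forwards the attributes to every adapter.
func (m *Mixer) Report(attrs Attributes) {
	for _, a := range m.adapters {
		a.Report(attrs)
	}
}

// denyPath is a toy checking backend that rejects a single request path.
type denyPath struct{ path string }

func (d denyPath) Name() string { return "denyPath" }
func (d denyPath) Check(attrs Attributes) error {
	if attrs["request.path"] == d.path {
		return errors.New("path denied")
	}
	return nil
}
func (d denyPath) Report(Attributes) {}

// accessLog is a toy logging backend that keeps one line per report.
type accessLog struct{ lines []string }

func (l *accessLog) Name() string           { return "accessLog" }
func (l *accessLog) Check(Attributes) error { return nil }
func (l *accessLog) Report(attrs Attributes) {
	l.lines = append(l.lines, attrs["source.ip"]+" -> "+attrs["destination.service.name"]+" "+attrs["request.path"])
}

func main() {
	logger := &accessLog{}
	mixer := &Mixer{adapters: []Adapter{denyPath{path: "/admin"}, logger}}

	attrs := Attributes{
		"request.path":             "xyz/abc",
		"source.ip":                "192.168.0.1",
		"destination.service.name": "example",
	}
	fmt.Println("check error:", mixer.Check(attrs))
	mixer.Report(attrs)
	fmt.Println("log lines:", logger.lines)
}
```

The point of the design is that the proxy only ever produces attributes; which backends see them is decided entirely on the Mixer side.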

Slide 19

Slide 19 text

1. Check
Data path
① Request
② Check send
③ Check response
④ Request
⑤ Report send
⑥ Report response
[Figure: Node 1 runs the app pod of Service A with its Proxy; Node 2 runs the app pod of Service B with its Proxy; Node 3 runs Istio-policy and Istio-telemetry. Arrows ① to ⑥ mark the steps above.]

Slide 20

Slide 20 text

1. Check
The request is blocked until the check is done.
To reduce traffic:
- Cache: does not really help. There are too many attributes with too many values, so caching just makes the local proxy bigger and bigger.
- Turn off: check does not really matter, because most of its purposes can be achieved with RBAC. Check is off by default from v1.1.
[Figure: the Proxy holds a cache of attribute sets (request.path: xyz/abc, request.size: 234, request.time: 12:34:56.789 04/17/2017, source.ip: [192 168 0 1], destination.service.name: example); Mixer and Adapters A, B and C each hold a cache as well.]

Slide 21

Slide 21 text

2. Report
Data path
① Request
② Check send
③ Check response
④ Request
⑤ Report send
⑥ Report response
[Figure: Node 1 runs the app pod of Service A with its Proxy; Node 2 runs the app pod of Service B with its Proxy; Node 3 runs Mixer (Istio-policy and Istio-telemetry). Arrows ① to ⑥ mark the steps above.]

Slide 22

Slide 22 text

2. Report Happens after request done Batched, not happens every request. Not always affect performance but do harm it when report Currently, the default batch number has been reduced from 1000 to 100. Copyright 2019 FUJITSU LIMITED 21

Slide 23

Slide 23 text

Our trial

Slide 24

Slide 24 text

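Current structure
[Figure: Node 1 runs the app pod of Service A with its Proxy; Node 2 runs the app pod of Service B with its Proxy; Node 3 runs Mixer (Istio-policy and Istio-telemetry). Steps ① to ⑥ cross node boundaries; the traffic between the proxies and Mixer is labeled "Reduce this traffic".]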

Slide 25

Slide 25 text

Our idea
- Reduce traffic between nodes
[Figure: Node 1 runs the app pod of Service A with its Proxy plus its own Mixer (Istio-policy and Istio-telemetry); Node 2 runs the app pod of Service B with its Proxy plus its own Mixer. Check and report traffic (steps ① to ⑥) stays on the local node.]

Slide 26

Slide 26 text

Reference (IBM)
Measurement
- Environment
  - Node A: 1 CPU (2.2 GHz) / 4 GB RAM
  - Node B: 2 CPUs (2.2 GHz) / 4 GB RAM
  - Node C: 4 CPUs (2.2 GHz) / 8 GB RAM
- Run mixc on a node, access istio-telemetry through the service (istio-telemetry.istio-system.svc.cluster.local:15004) and send 100k requests
Result
- Average report time: 3.9 ms
[Figure: Istio-telemetry runs on each of Node A, Node B and Node C; mixc runs on one node and reaches them through the service address.]

Slide 27

Slide 27 text

Reference (IBM)
Measurement
- Instead of mixc, use a custom client that calls the report function directly and serially
Result
- Compared with the 3.9 ms result above, reporting to the local Mixer reduces latency by 1.1 to 2.4 ms
[Figure: Istio-telemetry and a Client run on each of Node A, Node B and Node C.]

Slide 28

Slide 28 text

Distributed Mixer Points ① Create a mixer on each node ② Ensure proxies communicate with the mixer on the same node Our trial Copyright 2019 FUJITSU LIMITED Node Mixer Proxy App Container Proxy App Container Proxy App Container … Node Mixer Proxy App Container Proxy App Container Proxy App Container ① ① ② ② ② ② ② ② 27

Slide 29

Slide 29 text

Ensure proxies communicate with the mixer on the same node
[Figure: two options side by side. Left: access to Mixer via the service (load balanced across all Mixers). Right: access to a Mixer directly.]

Slide 30

Slide 30 text

Ensure proxies communicate with the mixer on the same node
Access to Mixer directly
- Sidecar deployment: Envoy has to be deployed with a configuration that specifies Mixer directly.
[Figure: Pilot-agent renders a Template into the JSON configuration that points Envoy at Mixer.]

Slide 31

Slide 31 text

Ensure proxies communicate with the mixer on the same node
Access to Mixer directly
- Sidecar deployment: Envoy has to be deployed with a configuration that specifies Mixer directly.
[Figure: Pilot-agent renders a Template into the JSON configuration that points Envoy at Mixer.]

    …,
    "dynamic_resources": {
      // Ask Pilot
    },
    "static_resources": {
      // Pilot information,
      // Add mixer's information
    },…,

Slide 32

Slide 32 text

Our trial Configure Envoy  Add static resource The procedure about creating configuration from template is written in the file istio.io/istio/pkg/bootstrap/bootstrap_config.go whose name is WriteBootstrap Copyright 2019 FUJITSU LIMITED // WriteBootstrap generates an envoy config based on config and epoch, and returns the filename. // TODO: in v2 some of the LDS ports (port, http_port) should be configured in the bootstrap. func WriteBootstrap(config *meshconfig.ProxyConfig, node string, epoch int, pilotSAN []string, opts map[string]interface{}, localEnv []string, nodeIPs []string, dnsRefreshRate string) (string, error){ Add a little change to this function to enable pilot-agent recognize HOST_IP 31

Slide 33

Slide 33 text

Our trial Configure Envoy  Figure out the process of check (In progress..) • Dump the config of a running Envoy sidecar • Envoy config is hard!  Cluster, listener, router, filter, … So many new concepts over 7000 lines. • Istio creates a custom filter named mixer  This filter is a mixer client responsible to interact with mixer server through gRPC  Have no idea about how to write the config about this filter correctly Copyright 2019 FUJITSU LIMITED 32

Slide 34

Slide 34 text

Outline Service mesh and Istio  Architecture  Packet flow Performance measurement of Istio Where the problem is Our trial Istio’s next feature Copyright 2019 FUJITSU LIMITED 33

Slide 35

Slide 35 text

Istio's next feature
Extensibility v2 (formerly known as Mixer v2)
- Envoy communicates with backends directly, to remove the overhead between Envoy and Mixer.
- No change to Envoy
- No Envoy restart
[Figure: before, backends are reached through Mixer; after, backends are reached directly.]
https://www.youtube.com/watch?v=XdWmm_mtVXI
