Slide 1

Slide 1 text

OpenShift Service Mesh
Ortwin Schneider, Principal Technical Marketing Manager, Red Hat

Slide 2

Slide 2 text

OpenShift Service Mesh: Contents
● What is a Service Mesh?
● Why do I need a Service Mesh at all?
● Key Capabilities and Usage Scenarios
● OpenShift Service Mesh and how can I get it?
● What type of apps is it for?
● Should I care about a Service Mesh? Personas
● What is the Overhead?
● Service Mesh across clusters - Mesh Federation
● Roadmap & FAQ

Slide 3

Slide 3 text

What is a Service Mesh? “Proxies and a Control Plane”

Slide 4

Slide 4 text

What is a Service Mesh? A programmable network?!
● ... a bunch of userspace proxies deployed as sidecars next to your services
● ... a control plane with management components managing the proxies and providing an API
● ... proxies intercept calls and “do” something with them
● ... the proxies are Layer 7-aware and act as proxies and reverse proxies

Slide 5

Slide 5 text

Istio Service Mesh - Architecture

Slide 6

Slide 6 text

Connecting Services within the Mesh
● All service pods are given an Envoy proxy as a sidecar container. Together, these form the Data Plane.
● All communication occurs through these proxies.
● This creates a mesh of communication with full visibility and control of all traffic.
● The proxies - and thus the mesh - are configured and managed by a central Control Plane.

Slide 7

Slide 7 text

What is a Service Mesh? An Abstraction of Microservice Connectivity
● ... a strict control mechanism over the communication of a set of microservices
● ... a ‘firewall’ and a ‘router’ for incoming requests
● ... completely abstracted and invisible to the microservices themselves
● ... it helps transition monolithic applications to a distributed microservice architecture

Slide 8

Slide 8 text

Key Capabilities of a Service Mesh

Slide 9

Slide 9 text

Service Mesh - Key capabilities
● Traffic Management
○ Control the flow of traffic and API calls between services
○ Make calls more reliable
○ Make the network more robust in the face of adverse conditions
○ Give applications greater flexibility for deployment
● Observability
○ Understand the dependencies between services
○ Identify the nature and flow of traffic between services
○ Quickly identify issues
○ Observe and demonstrate traffic flow and communication timing

Slide 10

Slide 10 text

Service Mesh - Key capabilities
● Policy Enforcement
○ Apply organizational policy to the interaction between services
○ Ensure access policies are enforced and resources are fairly distributed among consumers
○ Policy changes are made by configuring the mesh, not by changing application code
● Service Identity and Security
○ Provide services in the mesh with a verifiable identity
○ Protect service traffic as it flows over networks of varying degrees of trust

Slide 11

Slide 11 text

Why do I need Service Mesh?

Slide 12

Slide 12 text

Developing Microservices: A Common Pattern
● In Development:
○ New services are written.
○ They are tested locally - looks good!
○ They are tested in a staging cluster - looks good!
● Ship it!

Slide 13

Slide 13 text

Microservices in Production: A Common Pattern
● In production, things become less predictable:
○ Sporadic delays and failures are seen.
○ Performance is not as expected.
○ Security holes may be discovered.
○ Fixes are made, but upgrades cause further issues.
● Microservices are distributed systems, and troubleshooting distributed systems is hard.

Slide 14

Slide 14 text

The Fallacies of Distributed Computing: Microservices are Distributed Systems
● These challenges are a result of the fallacies of distributed computing:
○ The network is reliable.
○ Latency is zero.
○ Bandwidth is infinite.
○ The network is secure.
○ Topology doesn't change.
○ There is one administrator.
○ Transport cost is zero.
○ The network is homogeneous.

Slide 15

Slide 15 text

Why Service Mesh? Solving Microservices Challenges with Code
● These challenges are often mitigated with:
○ Code to handle failures between services.
○ Logs, metrics and traces in source code.
○ 3rd-party libraries for managing deployments, security and more.
● A wide range of open source libraries exists to manage these challenges (Netflix's are among the best known).
● This results in:
○ Different solutions in different services.
○ Boilerplate code.
○ New dependencies to keep up to date.

Slide 16

Slide 16 text

Why Service Mesh? An Abstraction for Microservice Challenges
● A Service Mesh solves distributed systems challenges at a common infrastructure layer.
● This reduces boilerplate code and copy/paste errors across services.
● It enforces common policies across all services.
● It removes the obligation to implement cross-cutting concerns from developers.

Slide 17

Slide 17 text

What type of app is it for? “Microservices, Monolithic, Serverless …”

Slide 18

Slide 18 text

What type of apps?
● Microservices? Yes! That is what it’s “made” for
● Monolithic? Yes, but …
● Serverless? Yes
● Jobs? Yes, but …
● Event-based, Kafka, Message Brokers? Yes, but …
● Non-containerized? VMs? Bare Metal? Yes

Slide 19

Slide 19 text

OpenShift Service Mesh and how can I get it?

Slide 20

Slide 20 text

OpenShift Service Mesh
● Based on the upstream Istio.io project and maintained as the downstream Maistra.io project.
● Built on upstream Istio, though not the bleeding edge:
○ Red Hat performs validation and QA on upstream Istio releases to ensure they are ready for production support.
○ Fixes and enhancements are contributed to upstream Istio.
○ Maistra.io maintains a unique set of features for OpenShift Service Mesh customers.
○ OpenShift Service Mesh 2.1 is based on Istio 1.9.

Slide 21

Slide 21 text

OpenShift Service Mesh: Next generation service management through open source software
● The service mesh - traffic management and control
● User interface for service communication visualisation
● Analytics and timing information for service communication

Slide 22

Slide 22 text

Management, Monitoring & Observability
● OpenShift Service Mesh includes a baked-in stack for management, monitoring and observability:
○ Kiali, with its topology view, can be used to observe, manage and troubleshoot the mesh.
○ Grafana and Prometheus provide out-of-the-box metrics and monitoring for all services.
○ Jaeger and Elasticsearch capture distributed traces, providing a “per request” view for isolating bottlenecks between services.

Slide 23

Slide 23 text

What's New in OpenShift 4.10
[Platform overview diagram: the Red Hat open hybrid cloud platform stack. Linux (container host operating system) and Kubernetes (orchestration) at the base, deployable on physical, virtual, private cloud, public cloud and edge; Kubernetes cluster services (install, over-the-air updates, networking, ingress, storage, monitoring, log forwarding, registry, authorization, containers, VMs, Operators, Helm); platform services, application services, data services and developer services layered above. Service Mesh appears alongside Serverless under developer services.]

Slide 24

Slide 24 text

Connect, Secure, Control and Observe Services on OpenShift
● A software infrastructure layer between Kubernetes and your services for managing communications.
● Handles common “microservice” challenges, so that developers don’t have to:
○ Security
○ Monitoring & Observability
○ Application Resilience
○ Upgrades, Rollouts & A/B Testing
○ And more...
Product Managers: Jamie Longmuir and Mauricio "Maltron" Leal

Slide 25

Slide 25 text

Installation & Management
● OpenShift Service Mesh is Operator-driven, installed and upgraded via OpenShift’s OperatorHub.
● A custom resource, ServiceMeshControlPlane (defined by a CRD), is used for configuring control plane components, including:
○ Number of replicas (for a highly available Control Plane)
○ Resource requests
○ Node affinity
○ and more...
● ServiceMeshMemberRoll and ServiceMeshMember resources configure which projects are part of the mesh.
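As a rough sketch (version, addons and project names here are illustrative; consult the Service Mesh documentation for the full schema), a basic control plane plus member roll might look like:

```yaml
apiVersion: maistra.io/v2
kind: ServiceMeshControlPlane
metadata:
  name: basic
  namespace: istio-system        # the control plane's own project
spec:
  version: v2.1
  tracing:
    type: Jaeger
  addons:
    kiali:
      enabled: true
---
apiVersion: maistra.io/v1
kind: ServiceMeshMemberRoll
metadata:
  name: default                  # the member roll must be named "default"
  namespace: istio-system        # lives alongside the control plane
spec:
  members:
  - foo                          # projects (namespaces) joining this mesh
  - bar
```

Applying these two resources in the control plane project is enough for the Operator to stand up Istiod, the gateways and the add-ons, and to enrol the listed projects in the mesh.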

Slide 26

Slide 26 text

Multi-Tenant Service Mesh
● OpenShift Service Mesh provides a multi-tenant topology where multiple service meshes are deployed within a single OpenShift cluster.
● A mesh consists of one or more projects (namespaces).
● Each mesh is isolated and managed independently.
● Communication between meshes involves configuring one or more Gateways, as you would for accessing external services.

Slide 27

Slide 27 text

Service Mesh with OpenShift Routes
● In Service Mesh, an Ingress Gateway is used for accessing services within the Mesh.
● The Ingress Gateway is a standalone Envoy proxy that acts as an entry point into the mesh.
● In OpenShift, a route* acts as an entry point into the cluster, backed by HAProxy.
● OpenShift Service Mesh automatically creates and configures routes when Ingress Gateways are created.
* OpenShift also supports Kubernetes Ingress (which was inspired by routes), and Red Hat is an active contributor to the next generation of Ingress - the Service APIs.

Slide 28

Slide 28 text

Security & Compliance
● Reduced permissions for Service Mesh administration:
○ Upstream Istio requires users to have elevated privileges to manage a Service Mesh.
○ In OpenShift, the Service Mesh Operator performs privileged operations on behalf of individual mesh installations.
○ Control Plane and Data Plane components require no elevated permissions to be granted to users.
○ Service Mesh components only have visibility within their mesh namespaces.
○ This reduces the level of permissions required to manage a service mesh, controlled using Kubernetes RBAC.

Slide 29

Slide 29 text

Security & Compliance
● OpenSSL Encryption:
○ OpenShift Service Mesh uses RHEL’s OpenSSL library in place of the BoringSSL library used by upstream Istio.
○ OpenSSL is the standard cryptographic library within Red Hat, supported by the RHEL team at Red Hat.
○ Facilitates FIPS compliance, taking advantage of the OpenSSL FIPS Object Module.

Slide 30

Slide 30 text

Service Mesh with API Management
● 3scale is Red Hat’s API Management solution that makes it easy to share, secure, distribute, control and monetize your APIs.
● Available both as a hosted SaaS offering and on premises.
● 3scale integrates directly with OpenShift Service Mesh:
○ As of Service Mesh 2.0, this integration uses Istio’s Mixer component (deprecated).
○ As of Service Mesh 2.1, this will use a WebAssembly extension plugin.

Slide 31

Slide 31 text

Difference to Upstream Istio?

Slide 32

Slide 32 text

OpenShift Service Mesh vs Istio: Additions for Red Hat’s Enterprise & Public Sector Customers
● OpenShift Integrations: integrations with OpenShift components such as OperatorHub, OpenShift Routes and 3scale API Management.
● Multi-Tenant Architecture: multiple meshes securely deployed within the same cluster, with each mesh isolated and managed independently.
● Management, Monitoring & Observability: pre-configured Kiali, Jaeger and Grafana for simplified management, monitoring and observability.
● Security & Compliance Focus: Control Plane and Data Plane components execute with standard privileges; OpenSSL for FIPS compliance.

Slide 33

Slide 33 text

What is the difference between Istio and OSSM?
○ For automatic injection, we use an annotation on the deployment instead of the namespace label that upstream uses. All services to be included in the mesh must have this annotation.
○ We add network policies which change the network behavior - restricting traffic from outside of the mesh, and opening traffic inside the mesh (to be managed by mesh policies). This feature can optionally be disabled.
○ We replaced MeshPolicy with ServiceMeshPolicy and ClusterRbacConfig with ServiceMeshRbacConfig.
○ We are multi-tenant by default. We have a ServiceMeshMemberRoll / ServiceMeshControlPlane that must be configured with the projects that are to be included in the mesh. We are exploring a cluster-wide installation option similar to upstream for 2.3/2.4.
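The injection difference above can be sketched with a Deployment fragment (workload name, namespace and image are hypothetical placeholders):

```yaml
# OpenShift Service Mesh: opt in per workload via a pod template annotation.
# Upstream Istio would instead label the namespace:
#   kubectl label namespace foo istio-injection=enabled
apiVersion: apps/v1
kind: Deployment
metadata:
  name: layer2-a              # hypothetical workload
  namespace: foo              # must also be listed in the ServiceMeshMemberRoll
spec:
  replicas: 1
  selector:
    matchLabels:
      app: layer2-a
  template:
    metadata:
      labels:
        app: layer2-a
      annotations:
        sidecar.istio.io/inject: "true"   # OSSM: annotation on the pod template
    spec:
      containers:
      - name: app
        image: quay.io/example/layer2-a:latest   # placeholder image
        ports:
        - containerPort: 8080
```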

Slide 34

Slide 34 text

Service Mesh Scenarios “Connect, secure, observe and control traffic”

Slide 35

Slide 35 text

Service Mesh in operation
[Diagram: Without service mesh, an external user reaches the application container via an OpenShift Route and the service FQDN. With service mesh, the external user enters through a Gateway; a VirtualService (VS) and DestinationRule (DR), distributed by Istiod, configure the Envoy sidecar container running next to the application container.]

Slide 36

Slide 36 text

Gateway and virtual service
● Gateway
○ Ingress controller provided as part of the Istio control plane
○ Standalone Envoy proxy at the edge of the mesh
○ Load balancing for incoming traffic (layers 4-6)
● Virtual Service
○ Configuration of routing requirements
○ Operates alongside destination rules
○ Distributed to the sidecar proxy containers by istiod (the control plane)
○ Fine-grained control of traffic management
OSI stack for reference: Layer 1 - Physical; Layer 2 - Data link (frames); Layer 3 - Network (IP packets); Layer 4 - Transport (TCP/UDP); Layer 5 - Session; Layer 6 - Presentation (SSL); Layer 7 - Application (HTTP). The Gateway operates at layers 4-6, the virtual service at layer 7.
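A minimal Gateway definition might look like this (the name and hostname are placeholders; the selector binds the configuration to the default ingress gateway proxy):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: layers-gateway            # hypothetical name
spec:
  selector:
    istio: ingressgateway         # apply to the default ingress gateway Envoy
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "layers.example.com"        # placeholder external hostname
```

A VirtualService then attaches to this entry point by listing the gateway's name in its `gateways:` field; without that, a VirtualService only applies to mesh-internal traffic.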

Slide 37

Slide 37 text

Virtual service example - Traffic control

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: layer2-a
spec:
  hosts:
  - layer2-a
  http:
  - match:
    - uri:
        prefix: /call-layers
    - uri:
        exact: /get-info
    route:
    - destination:
        host: layer2-a
        port:
          number: 8080
        subset: inst-1
      weight: 80
    - destination:
        host: layer2-a
        port:
          number: 8080
        subset: inst-2
      weight: 20
    timeout: 1.500s

● HTTP URI matching using ‘prefix’ and ‘exact’; prefix is required for e.g. /call-layers?key=value
● Route destination rules: 80% to layer2-a inst-1, 20% to layer2-a inst-2
● Timeout of 1.5 seconds, after which the communication is abandoned

Slide 38

Slide 38 text

Releasing Services
● Controlling all communications allows for fine-grained traffic control between services without source code changes or restarting services.
● Create multiple “subsets” of a service (e.g. different versions) to enable:
○ Canary Deployments - send a small amount of traffic to a new subset using weights.
○ A/B Testing - apply a fraction of traffic to a different service using weights.
○ Mirrored Launches - duplicate live traffic across services to see how a new service handles the real world.
○ Header-based routing.
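The subsets mentioned above are defined in a DestinationRule keyed on pod labels; a minimal sketch (service name and label values are hypothetical):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: service-a
spec:
  host: service-a          # the Kubernetes service being subdivided
  subsets:
  - name: v1
    labels:
      version: v1          # matches pods labeled version=v1
  - name: v2
    labels:
      version: v2          # the canary version
```

A VirtualService can then reference subsets v1 and v2 as route destinations with, say, weights 90 and 10 to start a canary rollout, shifting the weights as confidence grows.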

Slide 39

Slide 39 text

Securing Services
● As all communication is via the proxies, we can enforce and manage security policies across services without source code changes:
○ Enforce the use of mTLS encryption across all services.
○ Authenticate requests using JSON Web Token (JWT) validation.
○ Define service-to-service and user-to-service authorization policies.
■ Facilitates zero-trust networking.
○ Secure the Service Mesh Control Plane with RBAC policies.
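A service-to-service authorization policy can be expressed with Istio's AuthorizationPolicy resource. A sketch allowing only Service A's workload identity to call Service B (namespace, labels and service account names are hypothetical):

```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-service-a         # hypothetical policy name
  namespace: foo
spec:
  selector:
    matchLabels:
      app: service-b            # the policy applies to Service B's pods
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        - "cluster.local/ns/foo/sa/service-a"   # Service A's mTLS identity
    to:
    - operation:
        methods: ["GET"]        # only read requests are permitted
```

Because an ALLOW policy exists on the workload, any request not matching a rule (e.g. from another service account) is denied, which is the building block for the zero-trust posture described above.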

Slide 40

Slide 40 text

Monitoring & Observing Services
● As communication is via proxies, we have full visibility of all traffic without source code changes*:
○ Metrics and dashboards - request volumes, duration, success/failure rates, etc.
○ Distributed tracing - identify bottlenecks in slow request paths.
○ OpenShift Service Mesh includes Kiali, Jaeger, Grafana, Prometheus and Elasticsearch.
* Trace context propagation within services requires minor code changes.

Slide 41

Slide 41 text

Securing Services - mTLS encryption
● Enforce and manage security policies across services without source code changes.
[Diagram: Services A, B and C, each pod with an Envoy sidecar; all traffic between the sidecars is TLS-encrypted.]
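Mesh-wide mTLS as pictured above is typically enforced with a PeerAuthentication resource in the control plane namespace (assumed here to be istio-system):

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # control plane namespace: applies mesh-wide
spec:
  mtls:
    mode: STRICT            # sidecars reject any plaintext traffic
```

`STRICT` mode forces all sidecar-to-sidecar traffic onto mutual TLS; `PERMISSIVE` (the default) accepts both plaintext and mTLS and is useful while migrating services into the mesh.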

Slide 42

Slide 42 text

Building Resilient Services
● Timeouts and retries
○ Allow more time for the application to respond to the Envoy proxy, and retry failed calls.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: layer2
spec:
  hosts:
  - layer2
  http:
  - match:
    - uri:
        exact: /call-layers
    route:
    - destination:
        host: layer2
        port:
          number: 8080
        subset: v1
    retries:
      attempts: 5
      perTryTimeout: 10s

Slide 43

Slide 43 text

Testing Service Resilience
● Circuit Breaker configuration:
○ Set threshold limits beyond which the circuit breaker "trips"
○ Further traffic is prevented by the service mesh
○ Builds resilience into applications
○ Gives a microservice the time required to recover

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: userprofile
spec:
  host: userprofile
  subsets:
  - name: v3
    labels:
      version: '3.0'
    trafficPolicy:
      connectionPool:
        http:
          http1MaxPendingRequests: 1
          maxRequestsPerConnection: 1
      outlierDetection:
        consecutiveErrors: 1
        interval: 1s
        baseEjectionTime: 10m
        maxEjectionPercent: 100

Slide 44

Slide 44 text

Testing Service Resilience
● Fault injection provides the ability to validate how services will perform when failures inevitably occur.
○ Example: 50% of requests to a service fail with error code 503.
○ Enables chaos engineering.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: userprofile
spec:
  hosts:
  - userprofile
  http:
  - fault:
      abort:
        httpStatus: 503
        percentage:
          value: 50
    route:
    - destination:
        host: userprofile
        subset: v3

Slide 45

Slide 45 text

Testing Service Resilience
● Fault injection provides the ability to validate how services will perform when failures inevitably occur.
○ Example: 20% of requests to a service are delayed by 5 seconds.
○ Enables chaos engineering.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: userprofile
spec:
  hosts:
  - userprofile
  http:
  - fault:
      delay:
        fixedDelay: 5s
        percentage:
          value: 20
    route:
    - destination:
        host: userprofile
        subset: v3

Slide 46

Slide 46 text

Extending Service Mesh
● The power of Envoy is that it is highly extensible with WebAssembly extensions.
● WebAssembly is a format that allows extensions to be written in more than 15 programming languages.
● This allows mesh operators to incorporate custom cross-cutting functionality at the proxy level.
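In Istio 1.12 and later, such extensions are deployed with the WasmPlugin API; a sketch (the plugin name and OCI image URL are placeholders):

```yaml
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: custom-header-filter        # hypothetical plugin name
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway         # attach only to the ingress gateway proxies
  url: oci://quay.io/example/header-filter:v1   # placeholder OCI image holding the Wasm module
  phase: AUTHN                      # run before Istio's authentication filters
```

The control plane pulls the module and injects it into the matching Envoy proxies, so custom filtering logic is rolled out without rebuilding or restarting application containers.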

Slide 47

Slide 47 text

Service Mesh Across Clusters OpenShift Service Mesh 2.1

Slide 48

Slide 48 text

Istio Multicluster Topologies
● Upstream Istio offers multiple deployment models for multicluster service meshes:
○ Multi-Primary, Primary-Remote, External Control Plane, etc.
● These assume that the mesh and cluster admins are part of the same administrative boundary - no multi-tenancy.
● These topologies connect the Istiod control planes and the Kubernetes API servers of all involved meshes and clusters - a security risk.

Slide 49

Slide 49 text

OpenShift Service Mesh Multi-Cluster
● OpenShift Service Mesh’s multi-cluster strategy aims to put security first and support multi-tenant environments.
● Multi-tenant support has been a pillar of OpenShift Service Mesh from day 1.
● We have divided our multi-cluster approach into two categories:
○ Service Mesh Federation - securely connecting distinct service meshes across multiple clusters to enable sharing, load balancing and failover scenarios.
○ Multi-Cluster Service Mesh - a single service mesh stretched across multiple clusters, managed by a single control plane.

Slide 50

Slide 50 text

Service Mesh Federation - Topology
● Each mesh remains distinct with its own control plane.
● Federated meshes may be in the same or different OpenShift clusters.
● All traffic between meshes goes via configurable Ingress/Egress Gateways.
○ Connectivity between Gateways is a prerequisite.
● For multi-cluster, there is no need to connect to the Kubernetes API server.

Slide 51

Slide 51 text

Service Mesh Federation - Administration
● Provides a “need to know” model for multi-cluster service mesh.
● Decisions around exposing services between meshes are delegated to mesh administrators.
● Services must explicitly be configured to be exported and visible to other meshes.
○ Including configuring trust domains between meshes.

Slide 52

Slide 52 text

Federation: New Configuration
● ServiceMeshPeer - meshes are federated in pairs, and a ServiceMeshPeer is configured for each side of the pair:
○ Gateway configuration
○ Root trust configuration
● ExportedServiceSet - configures which services to export for a given federation.
● ImportedServiceSet - configures which services to import from a given federation.
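Based on the Maistra federation API, one side of a pairing might look roughly like this (addresses, mesh/service names and the CA ConfigMap name are placeholders; check the Service Mesh federation documentation for the exact schema):

```yaml
apiVersion: federation.maistra.io/v1
kind: ServiceMeshPeer
metadata:
  name: bar-mesh                        # the remote mesh being federated with
  namespace: foo-istio-system           # this mesh's control plane project
spec:
  remote:
    addresses:
    - ingress.bar.example.com           # placeholder address of the remote ingress gateway
  gateways:
    ingress:
      name: bar-ingress                 # local gateways dedicated to this federation
    egress:
      name: bar-egress
  security:
    trustDomain: bar.com                # the peer's trust domain
    clientID: bar.com/ns/bar-istio-system/sa/foo-egress-service-account
    certificateChain:
      kind: ConfigMap
      name: bar-ca-root-cert            # root cert used to validate the peer
---
apiVersion: federation.maistra.io/v1
kind: ExportedServiceSet
metadata:
  name: bar-mesh                        # matches the ServiceMeshPeer name
  namespace: foo-istio-system
spec:
  exportRules:
  - type: NameSelector
    nameSelector:
      namespace: foo
      name: service-b                   # export only this service to the peer
```

The remote mesh configures the mirror image: its own ServiceMeshPeer pointing back at this mesh, plus an ImportedServiceSet selecting which exported services it wants to consume.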

Slide 53

Slide 53 text

Federation: Imported Services
● Once a service has been imported, it can be managed as if it were a local service.
● By default, services are identified by the remote mesh trust zone and namespace.
● Policies can then be created using the remote service identity, for example:
○ Authorization policies
○ mTLS encryption
○ Routing rules for canary deployments, A/B testing, etc.
○ Observability (to the egress only)
○ Resilience & testing - timeouts, retries, circuit breakers, fault injection, etc.

Slide 54

Slide 54 text

Federation: Multi-Mesh Services
● Services can be configured to be imported as if they were local services using importAsLocal.
● If the service already exists, the endpoints of both services are aggregated together as a single service.
● This can be used to load balance traffic between meshes and to mitigate local failures with available remote endpoints.

Slide 55

Slide 55 text

Service Mesh Federation in Kiali
● Kiali displays the local mesh as well as services imported from other meshes.
● Federated meshes may be in the same or different OpenShift clusters.
● Services in different namespaces and clusters are given different boxes.

Slide 56

Slide 56 text

Service Mesh Personas? “Who is using the Mesh?”

Slide 57

Slide 57 text

Service Mesh Personas
● “Need to know” permissions for administration:
○ Cluster Admins:
■ Manage clusters and infrastructure.
■ Often a central ops/infra group.
○ Mesh Admin(s):
■ Manage one or more service meshes - application connectivity and security within the mesh.
■ Does not require cluster admin.
○ Service Admin(s):
■ Responsible for one or more services, though may not manage service mesh resources.

Slide 58

Slide 58 text

Personas
● Software engineer, focused on business logic
● Software architect / platform architect
● Software architect / platform architect using microservices but not Kubernetes
● Platform administrator
● Cluster administrator

Slide 59

Slide 59 text

What is the Overhead? “Proxies and a Control Plane”

Slide 60

Slide 60 text

Load test (1000 services, 2000 sidecars)
● The Envoy proxy uses 0.5 vCPU and 50 MB memory per 1000 requests per second going through the proxy.
● Istiod uses 1 vCPU and 1.5 GB of memory.
● The Envoy proxy adds 3.12 ms to the 90th percentile latency.
https://istio.io/latest/docs/ops/deployment/performance-and-scalability/

Slide 61

Slide 61 text

Control plane performance
CPU consumption scales with the following factors:
● The rate of deployment changes.
● The rate of configuration changes.
● The number of proxies connecting to Istiod.

Slide 62

Slide 62 text

Data plane performance
Proxy resource consumption scales with:
● Number of client connections
● Target request rate
● Request size and response size
● Number of proxy worker threads
● Protocol
● CPU cores (the proxy uses about 0.6 vCPU per 1000 requests per second)
● Number and types of proxy filters, specifically telemetry v2 related filters (a large number of listeners, clusters and routes can increase memory usage)
Latency:
● Inside the mesh, a request traverses the client-side proxy and then the server-side proxy. In the default configuration of Istio 1.6.8 (i.e. Istio with telemetry v2), the two proxies add about 3.12 ms and 3.13 ms to the 90th and 99th percentile latency, respectively, over the baseline data plane latency.

Slide 63

Slide 63 text

P90 latency vs client connections (1.13)

Slide 64

Slide 64 text

P99 latency vs client connections (1.13)

Slide 65

Slide 65 text

Roadmap & FAQ

Slide 66

Slide 66 text

What's New in OpenShift 4.10: OpenShift Service Mesh
▸ OpenShift Service Mesh 2.2 (ETA: April 2022) will be based on Istio 1.12 and Kiali 1.47+.
▸ Istio 1.12 introduces the WasmPlugin API, which will deprecate the ServiceMeshExtensions API introduced in 2.0.
▸ Service Mesh 2.1.1+ and 2.2 allow users to override and customize Kubernetes NetworkPolicy creation.
▸ Kiali updates in Service Mesh 2.2:
▸ Enhancements to improve viewing and navigating large service meshes
▸ View internal certificate information
▸ Set Envoy proxy log levels
▸ New Service Mesh Federation demo

Slide 67

Slide 67 text

Is OpenShift Service Mesh FIPS compliant?
OpenShift Service Mesh is FIPS compliant and supported on FIPS-enabled OpenShift clusters. It achieves FIPS compliance by ensuring that all encryption is performed (via dynamic linking) using the FIPS-validated OpenSSL module: https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/3781
Note that as newer versions of RHEL are released, newer OpenSSL modules will need to go through NIST's extensive validation process, which can take up to 16 months. Thus, there may occasionally be a lag between the latest version of OpenSSL being used with Service Mesh and full FIPS validation of the module.

Slide 68

Slide 68 text

When will OSSM support IPv6 dual stack?
○ Q1 2022 status: We understand that IPv6 and dual stack are particularly desirable for large meshes and, in particular, telco use cases.
○ Upstream Istio provides “alpha” support for IPv6 on Kubernetes, but not dual stack. To support Service Mesh with IPv6, we will need to be able to validate and document Service Mesh on an OpenShift cluster with IPv6 configured.
○ As of OCP 4.10, IPv6/dual stack is supported on installer-provisioned bare metal clusters with OVN-Kubernetes. We will explore this feature in the second half of 2022.

Slide 69

Slide 69 text

Federation on ARO, ROSA, OSD, etc.
● When will Service Mesh federation be supported on managed OpenShift environments such as ARO, ROSA and OSD?
○ Q1 2022 status: To date, the Service Mesh team has not been able to test federation across these environments, nor do we have a time frame for doing so. The main area of concern is the configuration of the load balancers attached to the federation ingress gateways, which need to be able to support raw TLS traffic. Our aim is, of course, to support federation between meshes in any OpenShift environment. If a customer wants to federate meshes across managed OpenShift clusters, they should attempt it and document the steps they took to get there. These notes could then help us build out our own docs and sufficient QE testing to declare support across these environments. This work is not currently scheduled, and customer requests and field assistance will be needed to drive it forward.
● Product Issue: https://issues.redhat.com/browse/OSSM-693

Slide 70

Slide 70 text

Single Control Plane Multi-Cluster Topology
● When will Service Mesh support a single control plane managing a multi-cluster data plane?
○ Q1 2022 status: Service Mesh 2.1 introduced federation of meshes across clusters, but does not provide a central control plane for managing data planes across clusters. Upstream Istio provides this functionality by opening communication between clusters’ Kubernetes API servers - creating a significant security opening unless all of the involved clusters are trusted as part of the same admin domain. This is not something we are able to facilitate in the context of a multi-tenant service mesh. Red Hat’s Advanced Cluster Management (ACM) has begun work on a solution for managing multiple federated meshes from a single control plane, and this may in the future evolve into support for a multi-cluster service mesh between trusted clusters (i.e. single-tenant). ACM will be the channel for OpenShift’s multi-cluster service mesh management. For a single control plane, multi-tenant, multi-cluster mesh solution, customers may need to consider a partner solution, such as Solo.io’s Gloo Mesh or Kong’s service mesh.
● Product Issue:

Slide 71

Slide 71 text

Roadmap (Product Manager: Jamie Longmuir)
Near Term - 2022 Q1 (OSSM 2.2):
● Update to Istio 1.12
● Internal improvements to increase release cadence - keeping closer to upstream Istio
● Kiali enhancements for large meshes and federation
● Support (unmanaged) Service Mesh on Red Hat OpenShift Service on AWS (ROSA)
● Support for Service Mesh with OpenShift Virtualization (OCP 4.10)
● More flexible integration with Network Policies
Mid Term - 2022 Q2 (OSSM 2.3):
● Update to Istio 1.14+
● Continue to evolve federation for multi-cluster service mesh use cases
● Cluster-wide installation option
● Service Mesh troubleshooting guide with Kiali
● Kiali enhancements for managing and validating federated service meshes
● Service Mesh on external VMs
Long Term - 2022 Q3+ (OSSM 2.4+):
● IPv6 support
● Multi-cluster service mesh with ACM
● Continue to optimize performance and scalability of Istio and Envoy
● Kiali support for centralized multi-cluster service mesh
● OpenShift Console multi-cluster mesh admin
● Enhanced CLI support for Service Mesh
● Support Service Mesh and OpenShift multi-cluster management
● Gateway API (Ingress v2)
● Keep within 1 release of the latest Istio

Slide 72

Slide 72 text

Service Mesh 2.2
● Upgrade Istio to 1.12
● Internal enhancements to stay closer to upstream Istio over time
○ Release OSSM at most 2 releases behind Istio
● Minor customer-driven feature enhancements
● Target: Late Q1 2022

Slide 73

Slide 73 text

Service Mesh 2.3
● Upgrade Istio to 1.14+
● Candidate features:
○ Service Mesh on external VMs
○ Additional multi-cluster use cases
● Target: Late Q2 2022

Slide 74

Slide 74 text

Service Mesh with External VMs
● Sometimes there are services outside of Kubernetes that you want to include in the Service Mesh.
● Examples:
○ Legacy services running on VMs or bare metal.
○ External datastores.
● Including these services in the mesh provides the same security, observability and traffic management features as services within the cluster.
(Diagram: Service A and Service B in the mesh foo.com on the OpenShift cluster; Service C runs on a legacy virtual machine outside the cluster but is joined to the mesh and managed by the Control Plane)
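Upstream Istio's mechanism for this is to register the external workload with the mesh's service registry. As a hedged sketch - the IP address, labels, namespace, and service account below are illustrative assumptions - a VM-hosted service might be represented with a WorkloadEntry:

```yaml
# Sketch: registers a VM-hosted workload with the mesh so that
# in-mesh clients can route to it. All values are illustrative.
apiVersion: networking.istio.io/v1beta1
kind: WorkloadEntry
metadata:
  name: legacy-vm            # assumed name
  namespace: legacy          # assumed namespace
spec:
  address: 10.0.0.12         # assumed VM IP, reachable from the cluster
  labels:
    app: legacy-service      # matched by a service's selector
  serviceAccount: legacy-sa  # assumed identity used by the VM's sidecar
```

A Kubernetes Service (or Istio ServiceEntry) whose selector matches these labels then lets in-mesh clients call the VM exactly as they would a pod-backed service, with the same mTLS, telemetry, and routing policy.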

Slide 75

Slide 75 text

Late 2022: Additional Multi-Cluster Use Cases
● Additional use cases - such as a central logical control plane to manage a service mesh data plane across multiple clusters.
● In conjunction with OpenShift's Advanced Cluster Management (ACM).
(Diagram: a single logical Service Mesh control plane for foo.com spanning Service A and Service B on Cluster 1 and Service C and Service D on Cluster 2)

Slide 76

Slide 76 text

linkedin.com/company/red-hat youtube.com/user/RedHatVideos facebook.com/redhatinc twitter.com/RedHat Thank You 76

Slide 77

Slide 77 text

Backup Slides

Slide 78

Slide 78 text

Connecting Services Outside the Mesh
● External communication occurs via Gateway proxies, which are also part of the mesh.
● Ingress Gateways manage traffic entering the mesh.
○ An alternative to Kubernetes Ingress, with additional mesh features.
● Egress Gateways manage traffic exiting the mesh.
○ Can require all external services to be registered.
● On OpenShift, Service Mesh Ingress Gateways can be used in conjunction with an OpenShift Route or on their own.
(Diagram: Service A, Service B and Service C with Envoy sidecar proxies behind Envoy ingress and egress gateways, all managed by the Control Plane)
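The registration of an external service is done with an Istio ServiceEntry, which adds the external host to the mesh's service registry so egress traffic to it can be observed and policy-controlled. A hedged sketch - the hostname is an illustrative assumption:

```yaml
# Sketch: adds an external HTTPS endpoint to the mesh's service
# registry. The hostname is an illustrative assumption.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-datastore    # assumed name
spec:
  hosts:
  - datastore.example.com     # assumed external hostname
  location: MESH_EXTERNAL     # endpoints live outside the mesh
  resolution: DNS             # resolve the host via DNS
  ports:
  - number: 443
    name: tls
    protocol: TLS
```

When the mesh is configured to block unregistered destinations, only hosts declared this way remain reachable, which is what gives the egress gateway its "all external services must be registered" behavior.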

Slide 79

Slide 79 text

Gateway

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: layer1-gateway
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"

● Deployed to the ingress Envoy proxy pod within the Istio control plane:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    openshift.io/scc: restricted
  labels:
    app: istio-ingressgateway
    istio: ingressgateway
  name: istio-ingressgateway-6f8cf6c85f-5989d
  namespace: istio-system

● Listening on port 80 for HTTP traffic
● Accepting connections from any host
● Access can be restricted to specific hosts with a configuration such as:

  hosts:
  - myserver1.com
  - anotherserver.co.uk
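On OpenShift, external DNS and edge entry for this gateway is commonly provided by an OpenShift Route that targets the ingress gateway service. As a hedged sketch - the route name, hostname, and target port name are illustrative assumptions, not values from this deck:

```yaml
# Sketch: an OpenShift Route forwarding edge traffic to the Istio
# ingress gateway service. Hostname and names are illustrative.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: layer1-route          # assumed name
  namespace: istio-system
spec:
  host: myserver1.com         # should match a host the Gateway accepts
  to:
    kind: Service
    name: istio-ingressgateway
  port:
    targetPort: http2         # assumed port name on the gateway service
```

Traffic then flows Route → istio-ingressgateway pod → Gateway/VirtualService rules → the destination service inside the mesh.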

Slide 80

Slide 80 text

Virtual Service

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: layers
spec:
  hosts:
  - "*"
  gateways:
  - layer1-gateway
  http:
  - route:
    - destination:
        host: layer1
        port:
          number: 8080

● Defines a set of routing rules applied to a specific host
● Associated with a gateway through which the rules are applied
● spec.hosts - the destination host to which traffic is being sent
○ Matches the hosts specification of the gateway when a gateway is referenced
● Directs traffic to the identified Kubernetes destination service
○ host: layer1
● The http section may contain matching rules for the desired conditions:
○ URI match
○ Timeout / retry / redirect / fault injection
○ URI rewrites
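As a hedged sketch of those http conditions - the /api prefix and the timeout and retry values are illustrative assumptions - a rule combining a URI match, a rewrite, a timeout and retries might look like:

```yaml
# Sketch: route only /api requests, rewrite the path, and apply
# timeout/retry behavior. All values are illustrative.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: layers-api            # assumed name
spec:
  hosts:
  - "*"
  gateways:
  - layer1-gateway
  http:
  - match:
    - uri:
        prefix: /api          # only requests under /api hit this rule
    rewrite:
      uri: /                  # strip the /api prefix before forwarding
    timeout: 5s               # overall per-request budget
    retries:
      attempts: 3
      perTryTimeout: 2s       # each retry gets its own timeout
    route:
    - destination:
        host: layer1
        port:
          number: 8080
```

Rules are evaluated in order, so a catch-all route without a match clause is typically listed last.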

Slide 81

Slide 81 text

Virtual service and Destination rule

apiVersion: .../v1alpha3
kind: VirtualService
metadata:
  name: layer2-a
spec:
  hosts:
  - layer2-a
  http:
  - route:
    - destination:
        host: layer2-a
        port:
          number: 8080
        subset: inst-1
      weight: 80
    - destination:
        host: layer2-a
        port:
          number: 8080
        subset: inst-2
      weight: 20

apiVersion: .../v1alpha3
kind: DestinationRule
metadata:
  name: layer2-a
spec:
  host: layer2-a
  subsets:
  - name: inst-1
    labels:
      instance: instance1
  - name: inst-2
    labels:
      instance: instance2

apiVersion: v1
kind: Service
metadata:
  name: layer2-a
  labels:
    app: layer2-a
spec:
  ports:
  - name: http
    port: 8080
  selector:
    app: layer2-a

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-1-inst1
  labels:
    app: layer2-a
    instance: instance1

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-1-inst2
  labels:
    app: layer2-a
    instance: instance2

(Diagram: the VirtualService (VS) splits traffic 80% / 20% across the two subsets of the DestinationRule (DR), which map via labels to Deployment instance1 and instance2 behind the layer2-a Service)

Slide 82

Slide 82 text

Kiali service visualisation
● Graphical representation of service connectivity
● Display options
○ Applications, services, versioned applications, workloads (no application grouping)
○ Label connectors by percentage of traffic, requests per second, or response time
○ Annotation options
○ Find / hide content
○ Wide range of query logic

Slide 83

Slide 83 text

Kiali service visualisation

Slide 84

Slide 84 text

Kiali visualisation of mesh resources
● View resources
○ Virtual services
○ Gateways
○ Destination rules
● Analysis of errors and helpful annotations

Slide 85

Slide 85 text

Jaeger analytics