
Ask an OpenShift Admin Episode 61: Service Mesh


These slides were used during episode 61 of the Ask an OpenShift Admin livestream. For more information, see the stream here: https://www.youtube.com/watch?v=Mxcp1F4bJNE.

Do you need better insight into how your application services are communicating with each other? Do you wish you had the ability to fully implement blue/green deployment scenarios with little administrative overhead?

Red Hat Service Mesh gives OpenShift Administrators insights to their clusters with built-in observability and traceability features. They can control traffic distribution to their applications and operate in a true blue/green deployment scenario. Red Hat Service Mesh provides all of these capabilities without the need to change the application code!

Red Hat Livestreaming

March 17, 2022



  1. OpenShift Service Mesh

     Ortwin Schneider, Principal Technical Marketing Manager, Red Hat
  2. OpenShift Service Mesh - Contents

     • What is a Service Mesh?
     • Why do I need a Service Mesh at all?
     • Key Capabilities and Usage Scenarios
     • OpenShift Service Mesh and how can I get it?
     • What type of apps is it for?
     • Should I care about a Service Mesh? Personas
     • What is the Overhead?
     • Service Mesh across clusters - Mesh Federation
     • Roadmap & FAQ
  3. What is a Service Mesh? “Proxies and a Control Plane”

  4. What is a Service Mesh? A programmable network?!

     • ... a bunch of userspace proxies deployed as sidecars next to your services
     • ... a control plane with management components managing the proxies and providing an API
     • ... proxies intercept calls and “do” something with them
     • ... the proxies are Layer 7-aware and act as proxies and reverse proxies
  5. Istio Service Mesh - Architecture

  6. Connecting Services within the Mesh

     • All service pods are given an Envoy proxy as a sidecar container. Together, these form the Data Plane.
     • All communications occur through these proxies.
     • This creates a mesh of communication that has full visibility and control of all traffic.
     • The proxies - and thus the mesh - are configured and managed by a central Control Plane.
     (Diagram: Services A, B and C, each with an Envoy proxy sidecar, managed by the Control Plane.)
  7. What is a Service Mesh? An Abstraction of Microservice Connectivity

     • ... a strict control mechanism over the communication of a set of microservices
     • ... a ‘firewall’ and a ‘router’ for incoming requests
     • ... it is completely abstracted and invisible to the microservices themselves
     • ... it helps transition monolithic applications to a distributed microservice architecture
  8. Key Capabilities of a Service Mesh

  9. Service Mesh - Key capabilities

     • Traffic Management
       ◦ Control the flow of traffic and API calls between services
       ◦ Make calls more reliable
       ◦ Make the network more robust in the face of adverse conditions
       ◦ Give applications greater flexibility for deployment
     • Observability
       ◦ Understand the dependencies between services
       ◦ Identify the nature and flow of traffic between services
       ◦ Quickly identify issues
       ◦ Observe and demonstrate traffic flow and communication timing
  10. Service Mesh - Key capabilities

     • Policy Enforcement
       ◦ Apply organizational policy to the interaction between services
       ◦ Ensure access policies are enforced and resources are fairly distributed among consumers
       ◦ Policy changes are made by configuring the mesh, not by changing application code
     • Service Identity and Security
       ◦ Provide services in the mesh with a verifiable identity
       ◦ Protect service traffic as it flows over networks of varying degrees of trust
  11. Why do I need Service Mesh?

  12. Developing Microservices - A Common Pattern

     • A common pattern when developing microservices.
     • In Development:
       ◦ New services are written.
       ◦ They are tested locally - looks good!
       ◦ They are tested in a staging cluster - looks good!
     • Ship it!
     (Diagram: Gateway and Services A, B and C, with all checks passing.)
  13. Microservices in Production - A Common Pattern

     • In production, things become less predictable:
       ◦ Sporadic delays and failures are seen.
       ◦ Performance is not as expected.
       ◦ Security holes may be discovered.
       ◦ Fixes are made, but upgrades cause further issues.
     • Microservices are distributed systems, and troubleshooting distributed systems is hard.
  14. The Fallacies of Distributed Computing - Microservices are Distributed Systems

     • These challenges are a result of the fallacies of distributed computing:
       ◦ The network is reliable.
       ◦ Latency is zero.
       ◦ Bandwidth is infinite.
       ◦ The network is secure.
       ◦ Topology doesn't change.
       ◦ There is one administrator.
       ◦ Transport cost is zero.
       ◦ The network is homogeneous.
  15. Why Service Mesh? Solving Microservices Challenges with Code

     • These challenges are often mitigated with:
       ◦ Code to handle failures between services.
       ◦ Logs, metrics and traces in source code.
       ◦ 3rd party libraries for managing deployments, security and more.
     • A wide range of open source libraries exist for managing these challenges (Netflix's are the best known).
     • This results in:
       ◦ Different solutions in different services.
       ◦ Boilerplate code.
       ◦ New dependencies to keep up to date.
     (Diagram: every service carries its own traffic management, failure handling, metrics & tracing and security code - and more boilerplate - on top of the container platform.)
  16. Why Service Mesh? An Abstraction for Microservice Challenges

     • Service Mesh solves distributed systems challenges at a common infrastructure layer.
     • This reduces boilerplate code and copy/paste errors across services.
     • Enforces common policies across all services.
     • Removes the obligation to implement cross-cutting concerns from developers.
     (Diagram: without a service mesh, each service embeds traffic management, failure handling, metrics & tracing and security code; with a service mesh, services sit on a shared mesh layer on the container platform.)
  17. What type of app is it for? “Microservices, Monolithic, Serverless”

  18. What type of apps?

     • Microservices? Yes! That is what it’s “made” for
     • Monolithic? Yes, but …
     • Serverless? Yes
     • Jobs? Yes, but …
     • Event based, Kafka, Message Brokers? Yes, but …
     • Non containerized? VMs? Bare Metal? Yes
  19. OpenShift Service Mesh and how can I get it?

  20. OpenShift Service Mesh

     • Based on the upstream Istio.io project and maintained as the downstream Maistra.io project - though not the bleeding edge:
       ◦ Red Hat performs validation and QA on upstream Istio releases to ensure they are ready for production support.
       ◦ Fixes and enhancements are contributed to upstream Istio.
       ◦ Maistra.io maintains a unique set of features for OpenShift Service Mesh customers.
       ◦ OpenShift Service Mesh 2.1 is based on Istio 1.9.
  21. OpenShift Service Mesh - Next generation service management through open source software

     • The service mesh - traffic management and control
     • User interface for service communication visualisation
     • Analytics and timing information for service communication
  22. Management, Monitoring & Observability • OpenShift Service Mesh includes a

    baked in stack for management, monitoring and observability: ◦ Kiali, with its topology view can be used to observe, manage and troubleshoot the mesh. ◦ Grafana and Prometheus provide out of the box metrics and monitoring for all services. ◦ Jaeger and ElasticSearch capture distributed traces providing a “per request” view for isolating bottlenecks between services.
  23. What's New in OpenShift 4.10

     (Platform overview diagram: the Red Hat open hybrid cloud platform - developer services, platform services, application and data services, and Kubernetes cluster services, running on Linux and Kubernetes across physical, virtual, private cloud, public cloud and edge footprints. Service mesh sits alongside Serverless, Builds, CI/CD pipelines, GitOps and Distributed Tracing among the platform services.)
  24. Connect, Secure, Control and Observe Services on OpenShift

     • A software infrastructure layer between Kubernetes and your services for managing communications.
     • Handles common “microservice” challenges, so that developers don’t have to:
       ◦ Security
       ◦ Monitoring & Observability
       ◦ Application Resilience
       ◦ Upgrades, Rollouts & A/B Testing
       ◦ And more...
     Product Managers: Jamie Longmuir and Mauricio "Maltron" Leal
     (Diagram: OpenShift Service Mesh - Istio, Envoy, Kiali and Jaeger - running on OpenShift and Red Hat Enterprise Linux CoreOS, across physical, virtual, private cloud and public cloud.)
  25. Installation & Management

     • OpenShift Service Mesh is Operator driven, installed and upgraded via OpenShift’s OperatorHub.
     • A custom resource called ServiceMeshControlPlane (defined by a CRD) is used for configuring control plane components, including:
       ◦ Number of replicas (for a highly available Control Plane)
       ◦ Resource requests
       ◦ Node affinity
       ◦ and more...
     • ServiceMeshMemberRoll and ServiceMeshMember resources configure which projects are part of the mesh.
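As an illustration, a minimal ServiceMeshControlPlane might look like the sketch below. The exact field paths (such as the pilot replica count) are an assumption based on the maistra.io/v2 schema, not taken from the slides - verify against the installed Operator version.

```yaml
# Hypothetical minimal ServiceMeshControlPlane (maistra.io/v2 layout assumed).
apiVersion: maistra.io/v2
kind: ServiceMeshControlPlane
metadata:
  name: basic
  namespace: istio-system
spec:
  version: v2.1
  runtime:
    components:
      pilot:
        deployment:
          replicas: 2   # highly available control plane (assumed field path)
```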
  26. • OpenShift Service Mesh provides a multi-tenant topology where multiple

    service meshes are deployed within a single OpenShift cluster. • A mesh consists of one or more projects (namespaces). • Each mesh is isolated and managed independently. • Communication between meshes involves configuring one or more Gateways, as you would for accessing external services. Service A Service B Service Mesh: foo.com Service C Service D Service Mesh: bar.com Control Plane Control Plane Project: foo-istio-system Project: bar-istio-system Project: foo Project: bar Multi-Tenant Service Mesh
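The member projects shown in the diagram are typically enrolled with a ServiceMeshMemberRoll. A minimal sketch, with the project and control plane names taken from this slide's diagram:

```yaml
# ServiceMeshMemberRoll enrolling the "foo" project into the mesh whose
# control plane runs in foo-istio-system. The resource must be named
# "default"; project names here follow the slide's example.
apiVersion: maistra.io/v1
kind: ServiceMeshMemberRoll
metadata:
  name: default
  namespace: foo-istio-system
spec:
  members:
  - foo
```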
  27. Service Mesh with OpenShift Routes

     • In Service Mesh, an Ingress Gateway is used for accessing services within the Mesh.
     • The Ingress Gateway is a standalone Envoy proxy that acts as an entry point into the mesh.
     • In OpenShift, a route* acts as an entry point into the cluster, backed by HAProxy.
     • OpenShift Service Mesh automatically creates and configures routes when Ingress Gateways are created.
     * OpenShift also supports Kubernetes Ingress (which was inspired by routes), and Red Hat is an active contributor to the next generation of Ingress - the Service APIs (Gateway API).
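For reference, the Ingress Gateway entry point described above is declared with an Istio Gateway resource; a sketch with an illustrative hostname:

```yaml
# Istio Gateway bound to the default ingress gateway Envoy deployment.
# OpenShift Service Mesh creates a matching OpenShift Route for the
# listed host automatically. The hostname is illustrative.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: foo-gateway
spec:
  selector:
    istio: ingressgateway   # targets the standalone Envoy ingress proxy
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "foo.example.com"
```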
  28. Security & Compliance - OpenShift Service Mesh

     • Reduced permissions for Service Mesh administration:
       ◦ Upstream Istio requires users to have elevated privileges to manage a Service Mesh.
       ◦ In OpenShift, the Service Mesh Operator performs privileged operations on behalf of individual mesh installations.
       ◦ Control Plane and Data Plane components require no elevated permissions to be granted to users.
       ◦ Service Mesh components only have visibility within their mesh namespaces.
       ◦ This reduces the level of permissions required to manage a service mesh, controlled using Kubernetes RBAC.
     (Diagram: a Cluster Admin manages the Service Mesh Operator; Mesh Users manage their individual meshes within the OpenShift cluster.)
  29. Security & Compliance OpenShift Service Mesh • OpenSSL Encryption: ◦

    OpenShift Service Mesh uses RHEL’s OpenSSL library in place of the BoringSSL library used by upstream Istio. ◦ OpenSSL is the standard cryptographic library within Red Hat, supported by the RHEL team at Red Hat. ◦ Facilitates FIPS compliance, taking advantage of the OpenSSL FIPS Object Module
  30. Service Mesh with API Management - OpenShift Service Mesh

     • 3scale is Red Hat’s API Management solution that makes it easy to share, secure, distribute, control and monetize your APIs.
     • Available as both a hosted SaaS offering and on premises.
     • 3scale integrates directly with OpenShift Service Mesh:
       ◦ As of 2.0, this integration uses Istio’s Mixer component (deprecated).
       ◦ As of Service Mesh 2.1, it uses a WebAssembly extension plugin.
  31. Difference to Upstream Istio?

  32. OpenShift Service Mesh vs Istio - Additions for Red Hat’s Enterprise & Public Sector Customers

     • OpenShift Integrations - integrations with OpenShift components such as OperatorHub, OpenShift Routes and 3scale API Management.
     • Multi-Tenant Architecture - multiple meshes securely deployed within the same cluster, with each mesh isolated and managed independently.
     • Management, Monitoring & Observability - pre-configured Kiali, Jaeger and Grafana for simplified management, monitoring and observability.
     • Security & Compliance Focus - Control Plane and Data Plane components execute with standard privileges; OpenSSL for FIPS compliance.
  33. What is the difference between Istio and OSSM?

     ◦ For automatic injection, we use an annotation on the deployment instead of the namespace label that upstream uses. All services to be included in the mesh must have these annotations.
     ◦ We add network policies which change the network behavior - restricting traffic from outside of the mesh, and opening traffic inside the mesh (to be managed by Mesh policies). This feature can optionally be disabled.
     ◦ We replaced MeshPolicy with ServiceMeshPolicy and ClusterRbacConfig with ServiceMeshRbacConfig.
     ◦ We are multi-tenant by default. We have a ServiceMeshMemberRoll/ServiceMeshControlPlane that would need to be configured with the projects that are to be included in the mesh. We are exploring a cluster-wide installation option similar to upstream for 2.3/2.4.
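The annotation-based injection mentioned above looks like this in practice (the deployment name and image are placeholders):

```yaml
# Deployment opting into sidecar injection via the pod-template
# annotation used by OpenShift Service Mesh (instead of upstream
# Istio's namespace label). Name and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: service-a
  template:
    metadata:
      labels:
        app: service-a
      annotations:
        sidecar.istio.io/inject: "true"   # enables Envoy sidecar injection
    spec:
      containers:
      - name: service-a
        image: quay.io/example/service-a:latest
```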
  34. Service Mesh Scenarios “Connect, secure, observe and control traffic”

  35. Service Mesh in operation

     (Diagram: without service mesh, an external user reaches the application container through a Route and the service FQDN. With service mesh, traffic from the external user flows through a Gateway (G) and a Virtual Service (VS) to a sidecar-injected pod, with Destination Rules (DR) applied and Istiod distributing the configuration.)
  36. Gateway and virtual service

     • Gateway
       ◦ Ingress controller provided as part of the Istio control plane
       ◦ Standalone Envoy proxy at the edge of the mesh
       ◦ Load balancing for incoming traffic (layers 4-6)
     • Virtual Service
       ◦ Configuration of routing requirements
       ◦ Operates alongside destination rules
       ◦ Distributed to the sidecar proxy containers by istiod (control plane)
       ◦ Fine-grained control of traffic management
     (OSI stack reference: Layer 1 physical, Layer 2 data link (frames), Layer 3 IP packets, Layer 4 transport (TCP/UDP), Layer 5 session (APIs), Layer 6 presentation (SSL), Layer 7 application (HTTP). The Gateway operates at layers 4-6, the virtual service at layer 7.)
  37. Virtual service example - Traffic control

     apiVersion: networking.istio.io/v1alpha3
     kind: VirtualService
     metadata:
       name: layer2-a
     spec:
       hosts:
       - layer2-a
       http:
       - match:
         - uri:
             prefix: /call-layers
         - uri:
             exact: /get-info
         route:
         - destination:
             host: layer2-a
             port:
               number: 8080
             subset: inst-1
           weight: 80
         - destination:
             host: layer2-a
             port:
               number: 8080
             subset: inst-2
           weight: 20
         timeout: 1.500s

     • HTTP URI matching using ‘prefix’ and ‘exact’; prefix is required for e.g. /call-layers?key=value
     • Route destination rules: 80% to layer2-a inst-1, 20% to layer2-a inst-2
     • Timeout of 1.5 seconds, after which communication is abandoned
  38. Releasing Services

     • Controlling all communications allows for fine-grained traffic control between services without source code changes or restarting services.
     • Create multiple “subsets” of a service (e.g. different versions) to enable:
       ◦ Canary Deployments - apply a small amount of traffic to a new subset using weights.
       ◦ A/B Testing - apply a fraction of traffic to a different service using weights.
       ◦ Mirrored Launches - duplicate live traffic loads across services to see how a new service handles the real world.
       ◦ Header based routing.
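The “subsets” mentioned above are defined in a DestinationRule; a minimal sketch mapping two versions of a service (names and labels are illustrative):

```yaml
# DestinationRule splitting "service-a" into two subsets by version
# label. A VirtualService can then weight traffic between them,
# e.g. 90% v1 / 10% v2 for a canary. Names are illustrative.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: service-a
spec:
  host: service-a
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
```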
  39. Securing Services • As all communication is via the proxies,

    we can enforce and manage security policies across services without source code changes. ◦ Enforce the use of mTLS encryption across all services. ◦ Authenticate requests using JSON Web Token (JWT) validation. ◦ Define service-to-service and user-to-service authorization policies. ▪ Facilitate Zero-trust networking. ◦ Secure the Service Mesh Control plane with RBAC policies. Service A Service B Service C Ingress & Egress Control Plane
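A service-to-service authorization policy of the kind described above can be sketched as follows (the namespace and service-account names are illustrative):

```yaml
# AuthorizationPolicy allowing only workloads running as the
# "service-a" service account to call Service B. Namespace and
# service-account names are illustrative.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: service-b-allow
  namespace: foo
spec:
  selector:
    matchLabels:
      app: service-b
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/foo/sa/service-a"]
```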
  40. Monitoring & Observing Services • As communication is via proxies,

    we have full visibility of all traffic without source code changes*. ◦ Metrics and dashboards - request volumes, duration, success/failure rates, etc. ◦ Distributed Tracing - identify bottlenecks in slow request paths. ◦ OpenShift Service Mesh includes Kiali, Jaeger, Grafana, Prometheus, ElasticSearch. *Trace context propagation within services will require minor code changes. Service A Service B Service C Ingress & Egress Control Plane
  41. Securing Services - mTLS encryption

     • Enforce and manage security policies across services without source code changes.
     (Diagram: pods for Services A, B and C, each with an Envoy sidecar, communicating over TLS.)
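Mesh-wide mTLS enforcement is typically a one-resource change; a sketch, assuming the control plane runs in istio-system:

```yaml
# PeerAuthentication requiring mTLS for all workloads in the mesh.
# Placing it in the control plane namespace makes it mesh-wide;
# "istio-system" is an assumption about your control plane project.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
```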
  42. Building Resilient Services

     • Timeout
       ◦ Allow more time for the application to respond to the Envoy proxy

     apiVersion: networking.istio.io/v1alpha3
     kind: VirtualService
     metadata:
       name: layer2
     spec:
       hosts:
       - layer2
       http:
       - match:
         - uri:
             exact: /call-layers
         route:
         - destination:
             host: layer2
             port:
               number: 8080
             subset: v1
         retries:
           attempts: 5
           perTryTimeout: 10s

     (Diagram: Service A calls Service B with timeout 10 sec / retry 5; Service B calls Service C with timeout 15 sec / retry 5.)
  43. Testing Service Resilience

     • Circuit Breaker configuration:
       ◦ Set threshold limits beyond which the circuit breaker "trips"
       ◦ Further traffic is prevented by the service mesh
       ◦ Build resilience into applications
       ◦ Provide a microservice with the time required to recover

     apiVersion: networking.istio.io/v1alpha3
     kind: DestinationRule
     metadata:
       name: userprofile
     spec:
       host: userprofile
       subsets:
       - name: v3
         labels:
           version: '3.0'
         trafficPolicy:
           connectionPool:
             http:
               http1MaxPendingRequests: 1
               maxRequestsPerConnection: 1
           outlierDetection:
             consecutiveErrors: 1
             interval: 1s
             baseEjectionTime: 10m
             maxEjectionPercent: 100
  44. Testing Service Resilience

     • Fault injection provides the ability to validate how services will perform when failures inevitably occur.
       ◦ Example: 50% of requests to a service will fail with error code 503.
       ◦ Enables chaos engineering.

     apiVersion: networking.istio.io/v1alpha3
     kind: VirtualService
     metadata:
       name: userprofile
     spec:
       hosts:
       - userprofile
       http:
       - fault:
           abort:
             httpStatus: 503
             percentage:
               value: 50
         route:
         - destination:
             host: userprofile
             subset: v3

     (Diagram: 503 status returned in 50% of responses.)
  45. Testing Service Resilience

     • Fault injection provides the ability to validate how services will perform when failures inevitably occur.
       ◦ Example: 20% of requests to a service will be delayed by 5 seconds.
       ◦ Enables chaos engineering.

     apiVersion: networking.istio.io/v1alpha3
     kind: VirtualService
     metadata:
       name: userprofile
     spec:
       hosts:
       - userprofile
       http:
       - fault:
           delay:
             fixedDelay: 5s
             percentage:
               value: 20
         route:
         - destination:
             host: userprofile
             subset: v3

     (Diagram: 5 second delay in 20% of requests.)
  46. Extending Service Mesh • The power of Envoy is that

    it is highly extensible with WebAssembly Extensions. • WebAssembly is a format that allows extensions to be written in more than 15 programming languages. • This will allow mesh operators to incorporate custom cross-cutting functionality at the proxy level. Service Service Service Ingress & Egress Control Plane
  47. Service Mesh Across Clusters OpenShift Service Mesh 2.1

  48. Upstream Istio offers multiple deployment models for multicluster service meshes.

    • Multi-Primary, Primary-Remote, External Control Plane, etc. • These assume that the mesh and cluster admins are part of the same administrative boundary - no multi-tenancy. • These topologies connect the IstioD control planes and the Kubernetes API servers of all involved meshes and clusters - a security risk. Istio Multicluster Topologies Istiod B Kubernetes API Server B Istiod A Kubernetes API Server A Istiod C Kubernetes API Server C
  49. • OpenShift Service Mesh’s Multi-Cluster Strategy aims to put security

    first and support multi-tenant environments. • Multi-Tenant support has been a pillar of OpenShift Service Mesh from day 1. • We have divided our multi-cluster approach into two categories: ◦ Service Mesh Federation - securely connecting distinct service meshes across multiple clusters to enable sharing, load balancing and failover scenarios. ◦ Multi-cluster Service Mesh - a single service mesh stretched across multiple clusters managed by a single control plane. OpenShift Service Mesh Multi-Cluster
  50. Service Mesh Federation - Topology 50 • Each mesh remains

    distinct with its own control plane. • Federated meshes may be in the same or different OpenShift clusters. • All traffic between meshes is via configurable Ingress/Egress Gateways ◦ Connectivity between Gateways is a prerequisite • For multi-cluster, there is no need to connect with the Kubernetes API server. Service Service Service Service Control Plane (Istiod) Control Plane (Istiod) Gateway Gateway
  51. Service Mesh Federation - Administration 51 • Provides a “need

    to know” model for multi-cluster service mesh • Decisions around exposing services between meshes are delegated to mesh administrators. • Services must explicitly be configured to be exported and visible to other meshes. ◦ Including configuring trust domains between meshes. Service A Service B Service Mesh: foo.com Service C Service D Service Mesh: bar.com Control Plane (Istiod) Control Plane (Istiod) Gateway Gateway
  52. Federation: New Configuration

     • ServiceMeshPeer - meshes are federated in pairs, and a ServiceMeshPeer is configured for each side of the pair.
       ◦ Gateway configuration
       ◦ Root trust configuration
     • ExportedServiceSet - configures which services to export for a given federation.
     • ImportedServiceSet - configures which services to import from a given federation.
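A hedged sketch of the export side of a federation - the API group and field names follow the Maistra federation API as of OSSM 2.1, and all names are illustrative; verify against current documentation:

```yaml
# ExportedServiceSet making one service visible to the peer mesh
# "bar-mesh". The resource name must match a ServiceMeshPeer;
# namespaces and service names here are illustrative.
apiVersion: federation.maistra.io/v1
kind: ExportedServiceSet
metadata:
  name: bar-mesh
  namespace: foo-istio-system
spec:
  exportRules:
  - type: NameSelector
    nameSelector:
      namespace: foo
      name: service-a
```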
  53. Federation: Imported Services

     • Once a service has been imported, it can be managed as if it were a local service.
     • By default, services will be identified by the remote mesh trust zone and namespace.
     • Policies can then be created using the remote service identity, for example:
       ◦ Authorization Policies
       ◦ mTLS encryption
       ◦ Routing rules for canary deployments, A/B testing, etc.
       ◦ Observability (to the egress only)
       ◦ Resilience & testing - timeouts, retries, circuit breakers, fault injection, etc.
  54. Federation: Multi-Mesh Services 54 • Services can be configured to

    be imported as if they were actually local services using importAsLocal. • If the service already exists, the endpoints of both services will be aggregated together as a single service. • This can be used to load balance traffic between meshes and mitigate local failures with available remote endpoints. Service A Service B Service Mesh: foo.com Service A’ Control Plane (Istiod)
  55. Service Mesh Federation in Kiali • Kiali will display the

    local meshes as well as services imported from other meshes. • Federated meshes may be in the same or different OpenShift clusters. • Services in different namespaces and clusters are given different boxes.
  56. Service Mesh Personas? “Who is using the Mesh?”

  57. Service Mesh Personas OpenShift Service Mesh • “Need to know”

    permissions for administration: ◦ Cluster Admins: ▪ Manage clusters and infrastructure. ▪ Often central ops/infra group. ◦ Mesh Admin(s): ▪ Manage one or more Service Mesh(es) - application connectivity and security within the mesh. ▪ Does not require cluster admin. ◦ Service Admin(s): ▪ Responsible for one or more services, though may not manage service mesh resources. OpenShift Cluster Service Mesh Service Mesh Service Mesh Cluster Admin(s) Service Mesh Operator Mesh Admin(s) Service Admin(s)
  58. • Software engineer, focus on business logic • Software Architect

    / Platform architect • Software Architect / Platform architect not using Kubernetes but Microservices • Platform Administrator • Cluster Administrator Personas
  59. What is the Overhead? “Proxies and a Control Plane”

  60. • The Envoy proxy uses 0.5 vCPU and 50 MB

    memory per 1000 requests per second going through the proxy. • Istiod uses 1 vCPU and 1.5 GB of memory. • The Envoy proxy adds 3.12 ms to the 90th percentile latency. https://istio.io/latest/docs/ops/deployment/performance-and-scalability/ Load test (1000 services 2000 sidecars)
  61. Control plane performance

     CPU consumption scales with the following factors:
     • The rate of deployment changes.
     • The rate of configuration changes.
     • The number of proxies connecting to Istiod.
  62. Data plane performance

     Performance scales with the following factors:
     • Number of client connections
     • Target request rate
     • Request size and response size
     • Number of proxy worker threads
     • Protocol
     • CPU cores (proxy: 0.6 vCPU per 1000 requests per second)
     • Number and types of proxy filters, specifically telemetry v2 related filters. (A large number of listeners, clusters, and routes can increase memory usage.)
     Inside the mesh, a request traverses the client-side proxy and then the server-side proxy. In the default configuration of Istio 1.6.8 (that is, Istio with telemetry v2), the two proxies add about 3.12 ms and 3.13 ms to the 90th and 99th percentile latency, respectively, over the baseline data plane latency.
  63. P90 latency vs client connections (1.13)

  64. P99 latency vs client connections (1.13)

  65. Roadmap & FAQ

  66. What's New in OpenShift 4.10 - OpenShift Service Mesh

     ▸ OpenShift Service Mesh 2.2 (ETA: April 2022) will be based on Istio 1.12 and Kiali 1.47+.
     ▸ Istio 1.12 introduces the WasmPlugin API, which will deprecate the ServiceMeshExtensions API introduced in 2.0.
     ▸ Service Mesh 2.1.1+ and 2.2 allow users to override and customize Kubernetes NetworkPolicy creation.
     ▸ Kiali updates in Service Mesh 2.2:
       ▸ Enhancements to improve viewing and navigating large service meshes
       ▸ View internal certificate information
       ▸ Set Envoy proxy log levels
       ▸ New Service Mesh Federation demo
  67. Is OpenShift Service Mesh FIPS compliant?

     OpenShift Service Mesh is FIPS compliant and supported on FIPS-enabled OpenShift clusters. It achieves FIPS compliance by ensuring that all encryption is performed (via dynamic linking) using the FIPS-validated OpenSSL module: https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/3781. Note that as newer versions of RHEL are released, newer OpenSSL modules will need to go through NIST's extensive validation process, which can take up to 16 months. Thus, there may occasionally be a lag between the latest version of OpenSSL being used with Service Mesh and full FIPS validation of the module.
  68. When will OSSM support IPv6 Dual Stack?

     ◦ Q1 2022 status: We understand that IPv6 and dual stack are particularly desirable for large meshes, and in particular for telco use cases.
     ◦ Upstream Istio provides “alpha” support for IPv6 on Kubernetes, but not dual stack. To support Service Mesh with IPv6, we will need to be able to validate and document Service Mesh on an OpenShift cluster with IPv6 configured.
     ◦ As of OCP 4.10, IPv6/dual stack is supported on installer-provisioned bare metal clusters with OVN-Kubernetes. We will explore this feature in the second half of 2022.
  69. Federation on ARO, ROSA, OSD, etc.

     • When will Service Mesh federation be supported on managed OpenShift environments such as ARO, ROSA, OSD, etc.?
       ◦ Q1 2022 status: To date, the Service Mesh team has not been able to test federation across these environments, nor do we have a time frame for doing so. The main area of concern is the configuration of the load balancers attached to the federation ingress gateways; these need to be able to support raw TLS traffic. Our aim is, of course, to support federation between meshes in any OpenShift environment. If customers want to federate meshes across managed OpenShift clusters, they should attempt it and document the steps they took to get there. These notes could then help us build out our own docs and sufficient QE testing to declare support across these environments. This work is not currently scheduled, and customer requests and field assistance will be needed to drive it forward.
     • Product Issue: https://issues.redhat.com/browse/OSSM-693
  70. Single Control Plane Multi-cluster Topology

     • When will Service Mesh support a single control plane managing a multi-cluster data plane?
       ◦ Q1 2022 status: Service Mesh 2.1 introduced federation of meshes across clusters, but does not provide a central control plane for managing data planes across clusters. Upstream Istio provides this functionality by opening communication between the clusters' Kubernetes API servers - creating a significant security opening unless all of the involved clusters are trusted as part of the same admin domain. This is not something we are able to facilitate in the context of a multi-tenant service mesh. Red Hat’s Advanced Cluster Management (ACM) has begun work on a solution for managing multiple federated meshes from a single control plane, and this may in the future evolve into support for a multi-cluster service mesh between trusted clusters (i.e. single tenant). ACM will be the channel for OpenShift’s multi-cluster service mesh management. For a single control plane, multi-tenant, multi-cluster mesh solution, customers may need to consider a partner solution - such as Solo.io’s Gloo Mesh or Kong’s service mesh.
     • Product Issue:
  71. Roadmap (Product Manager: Jamie Longmuir)

     Near Term - 2022 Q1, OSSM 2.2:
     • Update to Istio 1.12
     • Internal improvements to increase release cadence - keeping closer to upstream Istio
     • Continue to evolve federation for multi-cluster service mesh use cases
     • More flexible integration with Network Policies
     • Kiali enhancements for large meshes and federation
     • Support (unmanaged) Service Mesh on Red Hat OpenShift on AWS (ROSA)
     • Support for Service Mesh with OpenShift Virtualization (OCP 4.10)
     Mid Term - 2022 Q2, OSSM 2.3:
     • Update to Istio 1.14+
     • Continue to evolve Federation for multi-cluster service mesh use cases
     • Cluster-wide installation option
     • Service Mesh troubleshooting guide with Kiali
     • Kiali enhancements for managing and validating federated Service Meshes
     • Service Mesh on external VMs
     Long Term - 2022 Q3+, OSSM 2.4+:
     • IPv6 support
     • Multi-cluster service mesh with ACM
     • Continue to optimize performance and scalability of Istio and Envoy
     • Kiali support for centralized multi-cluster service mesh
     • Enhanced CLI support for Service Mesh
     • OpenShift Console multi-cluster mesh admin
     • Support Service Mesh and OpenShift Multi-Cluster management
     • Gateway API (Ingress v2)
     • Keep within 1 release of latest Istio
  72. Service Mesh 2.2

    • Upgrade to Istio 1.12 • Internal enhancements to stay closer to upstream Istio over time ◦ Release OSSM at most 2 releases behind Istio • Minor customer-driven feature enhancements • Target: Late Q1 2022
  73. Service Mesh 2.3

    • Upgrade to Istio 1.14+ • Candidate features: ◦ Service Mesh on external VMs ◦ Additional multi-cluster use cases • Target: Late Q2 2022
  74. Service Mesh with External VMs

    • Sometimes there are services outside of Kubernetes that you want to include in the Service Mesh. • Examples: ◦ Legacy services running on VMs or bare metal. ◦ External datastores. • Including these services in the mesh can provide the same security, observability, and traffic management features as services within the cluster. (Diagram: Services A-C in the mesh “foo.com” on an OpenShift cluster, plus a legacy virtual machine joined to the mesh, all managed by the Control Plane.)
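Upstream Istio models a VM-hosted workload with a WorkloadEntry resource, which is the likely shape this capability would take. A minimal sketch, where the name, namespace, address, and service account are assumptions for illustration (OSSM support for this is still roadmap at the time of these slides):

```yaml
# Hypothetical registration of a legacy VM with the mesh
# (all names and the IP address are illustrative).
apiVersion: networking.istio.io/v1beta1
kind: WorkloadEntry
metadata:
  name: legacy-vm
  namespace: legacy
spec:
  address: 10.0.0.12          # IP of the legacy VM running an Envoy proxy
  labels:
    app: legacy-service       # lets Services/DestinationRules select this VM
  serviceAccount: legacy-sa   # identity the VM workload runs under
```

With matching labels, the VM endpoint can then be selected alongside in-cluster pods by ordinary Services and DestinationRule subsets.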
  75. Late 2022: Additional Multi-Cluster Use Cases

    • Additional use cases - such as a central logical control plane to manage a service mesh data plane across multiple clusters. • In conjunction with OpenShift’s Advanced Cluster Management (ACM). (Diagram: a single logical Service Mesh control plane “foo.com” managing Services A-D across Cluster 1 and Cluster 2.)
  76. Thank You linkedin.com/company/red-hat youtube.com/user/RedHatVideos facebook.com/redhatinc twitter.com/RedHat

  77. Backup Slides

  78. Connecting Services Outside the Mesh

    • External communication occurs via Gateway proxies, which are also part of the mesh. • Ingress Gateways manage traffic entering the mesh. ◦ An alternative to Kubernetes Ingress, with additional mesh features. • Egress Gateways manage traffic exiting the mesh. ◦ Can require all external services to be registered. • On OpenShift, Service Mesh Ingress Gateways can be used in conjunction with an OpenShift Route or on their own. (Diagram: Envoy Ingress & Egress Gateways in front of Services A-C, each with an Envoy proxy sidecar, managed by the Control Plane.)
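As a sketch of the last point, an OpenShift Route can forward edge traffic to the mesh’s ingress gateway Service. The Route name, hostname, and target port name below are assumptions for illustration:

```yaml
# Hypothetical Route fronting the mesh's ingress gateway
# (name, host, and targetPort are illustrative).
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: layer1-route
  namespace: istio-system
spec:
  host: myserver1.com           # external hostname terminated at the router
  to:
    kind: Service
    name: istio-ingressgateway  # the mesh's ingress gateway Service
  port:
    targetPort: http2           # named port on the gateway Service
```

The OpenShift router then handles DNS-visible edge traffic while the Istio Gateway and VirtualService resources control routing once traffic is inside the mesh.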
  79. Gateway

    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: layer1-gateway
    spec:
      selector:
        istio: ingressgateway # use istio default controller
      servers:
      - port:
          number: 80
          name: http
          protocol: HTTP
        hosts:
        - "*"

    • Deployed to the Ingress Envoy proxy pod within the Istio control plane:

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        openshift.io/scc: restricted
      labels:
        app: istio-ingressgateway
        istio: ingressgateway
      name: istio-ingressgateway-6f8cf6c85f-5989d
      namespace: istio-system

    • Listening on port 80 for HTTP traffic • Accepting connections from any host • Can restrict access to specific hosts with a configuration such as:

        hosts:
        - myserver1.com
        - anotherserver.co.uk
  80. Virtual Service

    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: layers
    spec:
      hosts:
      - "*"
      gateways:
      - layer1-gateway
      http:
      - route:
        - destination:
            host: layer1
            port:
              number: 8080

    • Defines a set of routing rules applied to a specific host • Associated with a gateway through which the rules are applied • spec.hosts - the destination service to which traffic is being sent ◦ Matches the hosts specification for the gateway when a gateway is referenced • Directs traffic to the identified Kubernetes destination service ◦ host: layer1 • The http section may contain matching rules for the desired conditions: ◦ URI match ◦ Timeout / retry / redirect / fault injection ◦ Rewrite URIs
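The matching-rule bullets above can be combined in a single VirtualService. A hedged sketch, where the /api prefix, rewrite target, timeout, and retry values are invented for illustration:

```yaml
# Hypothetical VirtualService combining match, rewrite, timeout, and retries
# (the resource name, URI prefixes, and values are illustrative).
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: layers-advanced
spec:
  hosts:
  - "*"
  gateways:
  - layer1-gateway
  http:
  - match:
    - uri:
        prefix: /api          # only apply this rule to /api requests
    rewrite:
      uri: /v1/api            # rewrite the path before forwarding
    timeout: 5s               # overall per-request timeout
    retries:
      attempts: 3
      perTryTimeout: 2s
    route:
    - destination:
        host: layer1
        port:
          number: 8080
```

Because these policies live in mesh configuration rather than application code, they can be changed without redeploying the service.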
  81. Virtual Service and Destination Rule

    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: layer2-a
    spec:
      hosts:
      - layer2-a
      http:
      - route:
        - destination:
            host: layer2-a
            port:
              number: 8080
            subset: inst-1
          weight: 80
        - destination:
            host: layer2-a
            port:
              number: 8080
            subset: inst-2
          weight: 20

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: layer2-a
    spec:
      host: layer2-a
      subsets:
      - name: inst-1
        labels:
          instance: instance1
      - name: inst-2
        labels:
          instance: instance2

    apiVersion: v1
    kind: Service
    metadata:
      name: layer2-a
      labels:
        app: layer2-a
    spec:
      ports:
      - name: http
        port: 8080
      selector:
        app: layer2-a

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: deployment-1-inst1
      labels:
        app: layer2-a
        instance: instance1

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: deployment-1-inst2
      labels:
        app: layer2-a
        instance: instance2

    (Diagram: the VirtualService splits traffic 80% to instance 1 and 20% to instance 2 of the layer2-a Deployments, via the DestinationRule subsets and the layer2-a Service.)
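The Deployment manifests on this slide are truncated to their metadata. A fuller sketch of the first one, showing the pod-template labels that the layer2-a Service selector and the DestinationRule subset inst-1 match on; the image, replica count, and injection annotation are assumptions for illustration:

```yaml
# Hypothetical expansion of the inst-1 Deployment (image/replicas illustrative).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-1-inst1
  labels:
    app: layer2-a
    instance: instance1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: layer2-a
      instance: instance1
  template:
    metadata:
      labels:
        app: layer2-a          # matched by the layer2-a Service selector
        instance: instance1    # matched by DestinationRule subset inst-1
      annotations:
        sidecar.istio.io/inject: "true"   # request Envoy sidecar injection
    spec:
      containers:
      - name: layer2-a
        image: quay.io/example/layer2-a:v1   # illustrative image
        ports:
        - containerPort: 8080
```

Shifting the VirtualService weights (e.g. 80/20 to 0/100) then moves traffic between the two instances without touching either Deployment - the basis of the blue/green scenario described earlier.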
  82. Kiali service visualisation

    • Graphical representation of service connectivity • Display options ◦ Applications, services, versioned applications, workload (no application grouping) ◦ Label connectors by percentage of traffic, requests per second, or response time ◦ Annotation options ◦ Find / hide content ◦ Wide range of query logic
  83. Kiali service visualisation

  84. Kiali visualisation of mesh resources

    • View resources ◦ Virtual services ◦ Gateways ◦ Destination rules • Analysis of errors and helpful annotations
  85. Jaeger analytics