What is a SERVICE MESH and what it is good for?

Nane Kratzke
February 04, 2019

Over the past years, service meshes have emerged as a critical but hidden component of the cloud-native stack. High-traffic companies like PayPal, Netflix, and others have added a service mesh to their production applications. Linkerd is an open source service mesh for cloud-native applications that became an official project of the Cloud Native Computing Foundation. However, there are also Istio, Conduit, Kong, Consul, Aspen Mesh, and many more. But what is a service mesh, exactly? Do we need more than one? And why is it suddenly relevant? This presentation explains the service mesh concept and provides a broad overview of the field.

TL;DR: A service mesh merely creates a network abstraction that is transparent to the service and application instances. It simplifies the management of containerized applications and makes it easier to dynamically route, monitor, and secure microservice-based applications. It enables resilient and fault-tolerant communication and traffic management between the service instances of a microservice architecture.

Transcript

  1. What is a SERVICE MESH and what it is good for?

    Nane Kratzke, © 2019
  2. Outline: A Sidecar Roadtrip through Microservica

    • SOME FACTS ABOUT MICROSERVICES AND THEIR PAINS AND GAINS
    • Good Old Times of the Enterprise Service Bus and the Rise of the Container
    • Review of Popular Service Meshes
    • Definition of a Service Mesh
    • Comparison and Selection of a Service Mesh
  3. According to a Dimensional Research Inc. survey study …

    • About 50% of survey participants have already deployed microservices in production.
    • Nearly all see microservices becoming the default application architecture.
    • 63% said the technology is meeting expectations.
    • However, that means 37% see problems as well:
    • It is harder to troubleshoot issues.
    • They struggle to analyze monitoring and tracing data.
  4. Microservices Pains and Gains: Results from industrial case studies

    Taken from: Soldani J, Tamburri DA, Van Den Heuvel WJ. "The pains and gains of microservices: A Systematic grey literature review", Journal of Systems and Software, Volume 146, 2018, Pages 215-232. DOI: 10.1016/j.jss.2018.09.082

    What is the Software Engineering Research Community saying?
  5. Drivers and Barriers for Microservice Adoption

    Taken from: Knoche H, Hasselbring W. "Drivers and Barriers for Microservice Adoption – A Survey among Professionals in Germany", Journal of Enterprise Modelling and Information Systems Architectures, Vol. 14, No. 1 (2019). DOI: 10.18417/emisa.14.1

    What is the Software Engineering Research Community saying?
  6. Reported Pains and Gains: Lessons learned from industrial case studies

    Taken from: Soldani J, Tamburri DA, Van Den Heuvel WJ. "The pains and gains of microservices: A Systematic grey literature review", Journal of Systems and Software, Volume 146, 2018, Pages 215-232. DOI: 10.1016/j.jss.2018.09.082

    Pains
    • Intrinsic complexity
    • Business logic distributed over independent evolving microservices
    • Retro-compatibility of versioned APIs
    • Security (access control and endpoint proliferation)
    • Increased attack surface
    • Distributed logging

    Gains
    • Ease of (single) microservice development
    • Bounded contexts and self-contained
    • Cloud-native by design
    • Exploitation of design patterns (database per service, API gateway, circuit breaker, service discovery)
    • Ease of use for DevOps
    • Independent deployability
    • Straightforward integration into CI/CD chains
  7. Reported Barriers and Drivers: Lessons learned from software professionals

    Taken from: Knoche H, Hasselbring W. "Drivers and Barriers for Microservice Adoption – A Survey among Professionals in Germany", Journal of Enterprise Modelling and Information Systems Architectures, Vol. 14, No. 1 (2019). DOI: 10.18417/emisa.14.1

    Barriers
    • Insufficient Ops skills
    • Resistance by Ops
    • Insufficient Dev skills
    • Deployment complexity
    • Compatibility issues
    • Maturity of technologies
    • Resistance by Devs

    Drivers
    • Better scalability and elasticity
    • Better maintainability
    • Shorter time to market
    • Enabler for CD and DevOps
    • Cloud-native by implication
    • Organizational improvement
    • Polyglot programming
    • Polyglot persistence
    • Attractiveness as employer
  8. Outline: A Sidecar Roadtrip through Microservica

    • some facts about microservices and their pains and gains
    • GOOD OLD TIMES OF THE ENTERPRISE SERVICE BUS AND THE RISE OF THE CONTAINER
    • Review of Popular Service Meshes
    • Definition of a Service Mesh
    • Comparison and Selection of a Service Mesh
  9. In the good old times of SOA ...

    ... we used enterprise service buses (ESBs) to handle the pains. So the concept of a "service mesh" is not entirely new.
  10. Maybe you have recently heard about the invention of the container ...

    If a container is shipped like that, it is closely related to microservices.
  11. Outline: A Sidecar Roadtrip through Microservica

    • some facts about microservices and their pains and gains
    • good old times of the enterprise service bus and the rise of the container
    • REVIEW OF POPULAR SERVICE MESHES
    • Definition of a Service Mesh
    • Comparison and Selection of a Service Mesh
  12. A SERVICE MESH is something like the NERVOUS SYSTEM of a modern cloud-native microservice architecture to reduce pains and barriers.
  13. What is a Service Mesh?

    The term service mesh is used to describe the network of microservices that make up such applications and the interactions between them. As a service mesh grows, it can become harder to understand and manage. Its requirements can include discovery, load balancing, failure recovery, metrics, and monitoring. A service mesh also often has more complex operational requirements, like A/B testing, canary releases, rate limiting, access control, and end-to-end authentication.

    Istio makes it easy to create a network of deployed services with load balancing, service-to-service authentication, monitoring, and more, without any changes in service code. You can configure and manage Istio using its control plane functionality, which includes:
    • Automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic.
    • Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection.
    • A pluggable policy layer and configuration API supporting access controls, rate limits and quotas.
    • Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress.
    • Secure service-to-service communication in a cluster with strong identity-based authentication and authorization.

    What is NGINX saying?
  14. Features of nginMesh

    • Sidecar proxy. A sidecar proxy is a proxy instance that’s dedicated to a specific service instance. It communicates with other sidecar proxies and is managed by the orchestration framework.
    • Service discovery. When an instance needs to interact with a different service, it needs to find – discover – a healthy, available instance of the other service. The container management framework keeps a list of instances that are ready to receive requests.
    • Load balancing. In a service mesh, load balancing works from the bottom up. The list of available instances maintained by the service mesh is stack-ranked to put the least busy instances – that’s the load balancing part – at the top.
    • Encryption. The service mesh can encrypt and decrypt requests and responses, removing that burden from each of the services. The service mesh can also improve performance by prioritizing the reuse of existing, persistent connections, reducing the need for the computationally expensive creation of new ones.
    • Authentication and authorization. The service mesh can authorize and authenticate requests made from both outside and within the app, sending only validated requests to service instances.
    • Support for the circuit breaker pattern. The service mesh can isolate unhealthy instances, then gradually bring them back into the healthy instance pool if warranted.
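The service discovery and load balancing bullets above can be illustrated with a minimal sketch. It assumes a hypothetical `Instance` record and uses the number of in-flight requests as the measure of "least busy"; neither is taken from nginMesh itself.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical registry entry for one service instance."""
    address: str
    healthy: bool
    active_requests: int  # assumed measure of how busy the instance is


def rank_instances(instances):
    """Return only healthy instances, least busy first (the stack ranking)."""
    healthy = [inst for inst in instances if inst.healthy]
    return sorted(healthy, key=lambda inst: inst.active_requests)


def pick_instance(instances):
    """Pick the instance at the top of the ranking, or fail if none is healthy."""
    ranked = rank_instances(instances)
    if not ranked:
        raise LookupError("no healthy instance available")
    return ranked[0]
```

A sidecar proxy would run such a selection for every outgoing request, so the calling service never has to know which instances exist.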
  15. What is a Service Mesh?

    • A service mesh is a dedicated infrastructure layer for handling service-to-service communication. It’s responsible for the reliable delivery of requests through the complex topology of services.
    • In practice, the service mesh is typically implemented as an array of lightweight network proxies that are deployed alongside application code, without the application needing to be aware.
    • Reliably delivering requests in a cloud native application can be incredibly complex. A service mesh manages this complexity with a wide array of powerful techniques: circuit-breaking, latency-aware load balancing, eventually consistent (“advisory”) service discovery, retries, and deadlines.

    What is Linkerd saying?
  16. Features of Linkerd

    • Linkerd applies dynamic routing rules to determine which service the requester intended. Should the request be routed to a service in production or in staging? To a service in a local datacenter or one in the cloud? To the most recent version of a service that’s being tested or to an older one that’s been vetted in production?
    • Having found the correct destination, Linkerd retrieves the corresponding pool of instances from the relevant service discovery endpoint.
    • Linkerd chooses the instance most likely to return a fast response based on a variety of factors, including its observed latency for recent requests.
    • If the instance is down, unresponsive, or fails to process the request, Linkerd retries the request on another instance.
    • If an instance is consistently returning errors, Linkerd evicts it from the load balancing pool, to be periodically retried later.
    • If the deadline for the request has elapsed, Linkerd proactively fails the request (circuit-breaking) rather than adding load with further retries.
    • Linkerd captures every aspect of the above behavior in the form of metrics and distributed tracing, which are emitted to a centralized metrics system.
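Part of the request flow described above (latency-aware selection, retry on another instance, and failing fast once the deadline has elapsed) can be sketched as follows. This is not Linkerd code: the function name, the `send` callback, and the latency table are assumptions made for illustration, and eviction and metrics are left out.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when the request deadline elapsed before a reply arrived."""


class NoInstanceAvailable(Exception):
    """Raised when every attempted instance failed."""


def call_with_retries(latencies, send, deadline_s, max_attempts=3,
                      clock=time.monotonic):
    """Send a request to the fastest instance, retrying on other instances.

    latencies: dict mapping instance address -> observed latency in seconds.
    send: hypothetical callable that takes an address and returns a response,
          or raises ConnectionError if the instance cannot serve the request.
    """
    start = clock()
    # Latency-aware selection: try the historically fastest instances first.
    candidates = sorted(latencies, key=latencies.get)
    last_error = None
    for address in candidates[:max_attempts]:
        # Fail proactively instead of adding load once the deadline is over.
        if clock() - start >= deadline_s:
            raise DeadlineExceeded(f"deadline of {deadline_s}s elapsed")
        try:
            return send(address)
        except ConnectionError as exc:
            last_error = exc  # retry on the next-best instance
    raise NoInstanceAvailable("all attempted instances failed") from last_error
```

Passing the clock as a parameter is only a convenience so that the deadline behavior can be exercised without real waiting.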
  17. What is a Service Mesh?

    The term service mesh is used to describe the network of microservices that make up such applications and the interactions between them. As a service mesh grows, it can become harder to understand and manage. Its requirements can include discovery, load balancing, failure recovery, metrics, and monitoring. A service mesh also often has more complex operational requirements, like A/B testing, canary releases, rate limiting, access control, and end-to-end authentication.

    Istio makes it easy to create a network of deployed services with load balancing, service-to-service authentication, monitoring, and more, without any changes in service code. You can configure and manage Istio using its control plane functionality, which includes:
    • Automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic.
    • Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection.
    • A pluggable policy layer and configuration API supporting access controls, rate limits and quotas.
    • Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress.
    • Secure service-to-service communication in a cluster with strong identity-based authentication and authorization.

    What is Istio saying?
  18. Features of Istio

    Traffic management:
    • Istio’s easy rules configuration and traffic routing lets you control the flow of traffic and API calls between services.
    • Istio simplifies configuration of service-level properties like circuit breakers, timeouts, and retries.
    • It is possible to set up important tasks like A/B testing, canary rollouts, and staged rollouts with percentage-based traffic splits.

    Security:
    • Istio provides the underlying secure communication channel, and manages authentication, authorization, and encryption of service communication at scale.
    • Service communications are secured by default, letting you enforce policies consistently across diverse protocols and runtimes.

    Observability:
    • Istio’s robust tracing, monitoring, and logging give you deep insights into your service mesh deployment.
    • Istio’s Mixer component is responsible for policy controls and telemetry collection.
    • It provides backend abstraction and intermediation, insulating the rest of Istio from the implementation details of individual infrastructure backends, and giving operators fine-grained control over all interactions between the mesh and infrastructure backends.
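The idea behind a percentage-based traffic split, as used for canary rollouts, can be sketched in a few lines. This is an illustration only and not Istio's configuration API; the function name and the version labels are hypothetical.

```python
import random


def choose_version(weights, rng=random):
    """Pick a service version according to percentage weights.

    weights: dict mapping version label -> integer percentage (must sum to 100).
    rng: the random module or a random.Random instance (injectable for tests).
    """
    if sum(weights.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    versions = list(weights)
    # random.choices draws one version with probability proportional to its weight.
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]
```

For a canary rollout, a call such as `choose_version({"v1": 90, "v2": 10})` would send roughly one request in ten to the new version, and the share can be raised step by step as confidence grows.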
  19. Outline: A Sidecar Roadtrip through Microservica

    • some facts about microservices and their pains and gains
    • good old times of the enterprise service bus and the rise of the container
    • review of popular service meshes
    • DEFINITION OF A SERVICE MESH
    • Comparison and Selection of a Service Mesh
  20. Service Mesh: A Definition

    • Building microservices is easy.
    • Operating a microservice architecture is hard.

    Definition proposal: A Service Mesh creates a network abstraction that simplifies the management of containerized applications and makes it easier to dynamically route, monitor, and secure microservice-based applications.

    OK, we found noteworthy similarities. Let us summarize them ...
  21. Common Features of a Service Mesh

    Traffic management
    • Dynamic rule-based traffic routing
    • Rule- and percentage-based traffic split
    • Rate limit
    • Quotas

    Resiliency
    • Circuit-breaker
    • Timeouts
    • Retries
    • Recovery
    • Latency-aware load balancing
    • Health-aware service discovery

    Security
    • Authentication
    • Authorization
    • Encryption of service communication
    • TLS endpoint termination
    • Certificate handling

    Observability
    • Tracing
    • Monitoring
    • Logging
    • Telemetry collection
  22. Components of a Service Mesh

    Data plane
    • Service-to-service communication
    • Transparently via the sidecar proxy
    • Services must not be aware of the service mesh

    Control plane
    • Managing routes and rules
    • Configuration of service-level properties (timeouts, retries, …)
    • Service discovery
    • Access control
    • Policy control
    • Telemetry collection

    Proxy
    • Sidecar pattern
    • Each service has a dedicated proxy
    • Communicates with other sidecar proxies
    • Collects tracing data
    • Collects monitoring data
    • Managed by the orchestration framework
  23. Architecture of a Service Mesh

    • Control plane
    • Service Discovery
    • Telemetry
    • Security
    • Data plane
    • Proxy (Sidecar pattern)
    • Automatic proxy configuration
    • Service health checking
    • Auto recovery

    Taking it all together ...
  24. Service Discovery

    • Service discovery for the proxy sidecars
    • Traffic management capabilities for intelligent routing
    • A/B tests, canary deployments, etc.
    • Resiliency (timeouts, retries, circuit breakers, etc.)
  25. Telemetry (Observability)

    • Enforces access control and usage policies
    • Collects telemetry data from the sidecar proxies
    • The sidecar proxies extract request-level attributes and send them to the telemetry component for evaluation
  26. Security

    • Secure by default
    • Service-to-service and end-user authentication
    • Identity and credential management
    • Upgrade unencrypted traffic in the service mesh
    • Enforce policies based on service identity rather than on network controls
  27. Resiliency

    • Circuit Breaker
    • Timeouts
    • Retries

    All done transparently by the proxies of the service mesh. Latency and fault data are even used for fault-tolerant load balancing.

    This enables ...

    State model: https://martinfowler.com/bliki/CircuitBreaker.html
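The circuit breaker state model referenced above (closed, open, half-open) can be sketched as a small class. This is a simplified illustration under stated assumptions, not the implementation of any particular mesh proxy: the class name, the threshold of consecutive failures, and the reset timeout are all hypothetical choices.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed trip threshold
        self.reset_timeout_s = reset_timeout_s      # assumed cool-down period
        self.clock = clock
        self.failures = 0        # consecutive failures seen so far
        self.opened_at = None    # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout_s:
            return "half-open"   # cool-down is over: allow one trial call
        return "open"

    def call(self, func, *args, **kwargs):
        state = self.state
        if state == "open":
            raise CircuitOpenError("circuit is open; failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures while closed,
            # (re)opens the circuit and restarts the cool-down.
            if state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In a service mesh this logic would live in the sidecar proxy, one breaker per upstream instance or service, so the application code itself stays unchanged.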
  28. Traffic management

    • Canary Releases
    • Fault-tolerant load balancing (circuit breaker data)
    • Traffic splitting
    • Content-based traffic steering

    This enables ...

    https://jaxenter.de/istio-einfuehrung-microservices-cloud-teil-1-71261
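Content-based traffic steering, the last bullet above, can be sketched as a first-match rule table over request headers. The function, header names, and destination labels below are hypothetical and only illustrate the idea; real meshes express such rules in their own configuration formats.

```python
def route(headers, rules, default):
    """Return the destination of the first rule whose headers all match.

    headers: dict of request headers.
    rules: ordered list of (match, destination) pairs, where match is a dict
           of header name -> required value.
    default: destination used when no rule matches.
    """
    for match, destination in rules:
        if all(headers.get(name) == value for name, value in match.items()):
            return destination
    return default
```

With a rule such as `({"x-user-group": "beta"}, "reviews-v2")`, only requests carrying that header value reach the new version, while all other traffic continues to the default destination.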
  29. Outline: A Sidecar Roadtrip through Microservica

    • some facts about microservices and their pains and gains
    • good old times of the enterprise service bus and the rise of the container
    • review of popular service meshes
    • definition of a service mesh
    • COMPARISON AND SELECTION OF A SERVICE MESH
  30. Acknowledgement

    Picture reference
    • Neurons (CC0, pixabay.com)
    • Cross Sidecar (orangehat.nl, CC BY 2.5, Wikipedia)
    • Train in station (CC0, pixabay.com)
    • Golden Gate (CC0, pixabay.com)
    • Container train (CC0, pixabay.com)
    • Container art (CC0, pixabay.com)
    • Container terminal (CC0, pixabay.com)
    • Container slug (CC0, pixabay.com)
    • Light bulbs (CC0, pixabay.com)
    • Sidecar (www.sidecarsliderbar.com)
  31. About Nane Kratzke

    CoSA: http://cosa.fh-luebeck.de/en/contact/people/n-kratzke
    Blog: http://www.nkode.io
    Twitter: @NaneKratzke
    LinkedIn: https://de.linkedin.com/in/nanekratzke
    GitHub: https://github.com/nkratzke
    ResearchGate: https://www.researchgate.net/profile/Nane_Kratzke
    SlideShare: http://de.slideshare.net/i21aneka