
What is a SERVICE MESH and what it is good for?

Nane Kratzke
February 04, 2019


Over the past years, service meshes have emerged as a critical but often hidden component of the cloud-native stack. High-traffic companies like PayPal, Netflix, and others have added a service mesh to their production applications. Linkerd is an open-source service mesh for cloud-native applications that became an official project of the Cloud Native Computing Foundation. However, there are also Istio, Conduit, Kong, Consul, Aspen Mesh, and many more. But what is a service mesh, exactly? Do we need more than one? And why is it suddenly relevant? This presentation explains the service mesh concept and provides a broad overview of the field.

TL;DR: A service mesh creates a network abstraction that is transparent to the service/application instances. It simplifies the management of containerized applications and makes it easier to dynamically route, monitor, and secure microservice-based applications, enabling resilient and fault-tolerant communication and traffic management between the service instances of a microservice architecture.


Transcript

  1. What is a
    SERVICE
    MESH
    and what it is good for?
    Nane Kratzke, © 2019


  2. Outline
    A Sidecar Roadtrip through
    Microservica
    • SOME FACTS ABOUT MICROSERVICES AND
    THEIR PAINS AND GAINS
    • Good Old Times of the Enterprise Service Bus
    and the Rise of the Container
    • Review of Popular Service Meshes
    • Definition of a Service Mesh
    • Comparison and Selection of a Service Mesh


  3. (image-only slide)

  4. According to a Dimensional
    Research Inc. survey study …
    • About 50% of survey participants have
    already deployed microservices in production.
    • Nearly all see microservices becoming the
    default application architecture.
    • 63% said the technology is meeting expectations.
    • However, that means:
    37% see problems as well
    • harder to troubleshoot issues
    • struggle to analyze monitoring and tracing data


  5. Microservices Pains and Gains
    Results from industrial case studies
    Taken from: Soldani J, Tamburri DA, Van Den Heuvel WJ. "The pains and gains of microservices: A Systematic grey literature review", Journal of Systems and Software, Volume 146, 2018, Pages 215-232. DOI: 10.1016/j.jss.2018.09.082
    What is the Software
    Engineering Research
    Community saying?


  6. Drivers and Barriers
    for Microservice
    Adoption
    Taken from: Knoche H, Hasselbring W. "Drivers and Barriers for Microservice Adoption – A Survey
    among Professionals in Germany", Journal of Enterprise Modelling and Information Systems
    Architectures, Vol. 14, No. 1 (2019). DOI: 10.18417/emisa.14.1
    What is the Software
    Engineering Research
    Community saying?


  7. Reported Pains and Gains
    Lessons learned from industrial case studies
    Taken from: Soldani J, Tamburri DA, Van
    Den Heuvel WJ. "The pains and gains of
    microservices: A Systematic grey literature
    review", Journal of Systems and Software,
    Volume 146, 2018, Pages 215-232. DOI:
    10.1016/j.jss.2018.09.082
    Pains
    • Intrinsic complexity
    • Business logic distributed over
    independent evolving microservices
    • Retro-compatibility of versioned APIs
    • Security (access control and endpoint
    proliferation)
    • Increased attack surface
    • Distributed logging
    Gains
    • Ease of (single) microservice development
    • Bounded contexts and self-contained services
    • Cloud-native by design
    • Exploitation of design patterns (database
    per service, API gateway, circuit breaker,
    service discovery)
    • Ease of use for DevOps
    • Independent deployability
    • Straightforward integration into CI/CD
    chains


  8. Reported Barriers and Drivers
    Lessons learned from software professionals
    Taken from: Knoche H, Hasselbring W.
    "Drivers and Barriers for Microservice
    Adoption – A Survey among Professionals
    in Germany", Journal of Enterprise
    Modelling and Information Systems
    Architectures, Vol. 14, No. 1 (2019). DOI:
    10.18417/emisa.14.1
    Barriers
    • Insufficient Ops skills
    • Resistance by Ops
    • Insufficient Dev skills
    • Deployment complexity
    • Compatibility issues
    • Maturity of technologies
    • Resistance by Devs
    Drivers
    • Better scalability and elasticity
    • Better maintainability
    • Shorter time to market
    • Enabler for CD and DevOps
    • Cloud-native by implication
    • Organizational improvement
    • Polyglot programming
    • Polyglot persistence
    • Attractiveness as employer


  9. Outline
    A Sidecar Roadtrip through
    Microservica
    • some facts about microservices and their
    pains and gains
    • GOOD OLD TIMES OF THE ENTERPRISE
    SERVICE BUS AND THE RISE OF THE
    CONTAINER
    • Review of Popular Service Meshes
    • Definition of a Service Mesh
    • Comparison and Selection of a Service Mesh


  10. In good old times of SOA ...
    ... we used enterprise service buses (ESB)
    to handle the pains.
    So, the concept of a
    "service mesh" is not
    entirely new.


  11. Maybe you heard recently about the invention
    of the container ...
    If a container is shipped like that, it
    is closely related to microservices.


  12. Clever idea, it makes a lot more possible ...


  13. It can even boost
    your creativity ...


  14. The problem is – you have plenty of them!


  15. So, expect some aspects ... to get slightly harder!


  16. Outline
    A Sidecar Roadtrip through
    Microservica
    • some facts about microservices and their
    pains and gains
    • good old times of the enterprise service bus
    and the rise of the container
    • REVIEW OF POPULAR SERVICE MESHES
    • Definition of a Service Mesh
    • Comparison and Selection of a Service Mesh


  17. A SERVICE MESH is something like
    the NERVOUS SYSTEM
    of a modern cloud-native
    microservice architecture to reduce pains and barriers.


  18. What is a Service
    Mesh?
    The term service mesh is used to describe the network of microservices
    that make up such applications and the interactions between them. As a
    service mesh grows, it can become harder to understand and manage. Its
    requirements can include discovery, load balancing, failure recovery,
    metrics, and monitoring. A service mesh also often has more complex
    operational requirements, like A/B testing, canary releases, rate limiting,
    access control, and end-to-end authentication.
    Istio makes it easy to create a network of deployed services with load
    balancing, service-to-service authentication, monitoring, and more,
    without any changes in service code. You can configure and manage Istio
    using its control plane functionality, which includes:
    • Automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic.
    • Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault
    injection.
    • A pluggable policy layer and configuration API supporting access controls, rate limits and
    quotas.
    • Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress
    and egress.
    • Secure service-to-service communication in a cluster with strong identity-based
    authentication and authorization.
    What is NGINX
    saying?


  19. Features of nginMesh
    • Sidecar proxy. A sidecar proxy is a proxy instance that’s dedicated to a specific
    service instance. It communicates with other sidecar proxies and is managed by
    the orchestration framework.
    • Service discovery. When an instance needs to interact with a different service, it
    needs to find – discover – a healthy, available instance of the other service. The
    container management framework keeps a list of instances that are ready to
    receive requests.
    • Load balancing. In a service mesh, load balancing works from the bottom up. The
    list of available instances maintained by the service mesh is stack-ranked to put
    the least busy instances – that’s the load balancing part – at the top.
    • Encryption. The service mesh can encrypt and decrypt requests and responses,
    removing that burden from each of the services. The service mesh can also
    improve performance by prioritizing the reuse of existing, persistent
    connections, reducing the need for the computationally expensive creation of
    new ones.
    • Authentication and authorization. The service mesh can authorize and
    authenticate requests made from both outside and within the app, sending only
    validated requests to service instances.
    • Support for the circuit breaker pattern. The service mesh can isolate unhealthy
    instances, then gradually bring them back into the healthy instance pool if
    warranted.
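
    To make the service-discovery and load-balancing bullets concrete, here is a minimal Python sketch of a health-aware registry that ranks the least busy instances first. It is purely illustrative; the Instance and Registry names are invented and this is not nginMesh code.

```python
# Sketch (not nginMesh code): a health-aware service registry with
# "least busy first" instance selection, as described on the slide.
import random
from dataclasses import dataclass, field

@dataclass
class Instance:
    address: str
    healthy: bool = True
    in_flight: int = 0          # requests currently being handled

@dataclass
class Registry:
    services: dict = field(default_factory=dict)   # name -> [Instance]

    def register(self, service: str, instance: Instance) -> None:
        self.services.setdefault(service, []).append(instance)

    def discover(self, service: str) -> Instance:
        # keep only healthy instances, then rank the least busy ones first
        candidates = [i for i in self.services.get(service, []) if i.healthy]
        if not candidates:
            raise LookupError(f"no healthy instance of {service!r}")
        candidates.sort(key=lambda i: i.in_flight)
        # pick randomly among the least busy to avoid a thundering herd
        least = candidates[0].in_flight
        return random.choice([i for i in candidates if i.in_flight == least])

registry = Registry()
registry.register("orders", Instance("10.0.0.1:8080", in_flight=3))
registry.register("orders", Instance("10.0.0.2:8080", in_flight=1))
print(registry.discover("orders").address)   # -> 10.0.0.2:8080
```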


  20. What is a Service
    Mesh?
    • A service mesh is a dedicated infrastructure layer for
    handling service-to-service communication. It’s
    responsible for the reliable delivery of requests through
    the complex topology of services.
    • In practice, the service mesh is typically implemented as
    an array of lightweight network proxies that are
    deployed alongside application code, without the
    application needing to be aware.
    • Reliably delivering requests in a cloud native application
    can be incredibly complex. A service mesh manages this
    complexity with a wide array of powerful techniques:
    circuit-breaking, latency-aware load balancing,
    eventually consistent (“advisory”) service discovery,
    retries, and deadlines.
    What is Linkerd
    saying?


  21. Features of Linkerd
    • Linkerd applies dynamic routing rules to determine which service the
    requester intended. Should the request be routed to a service in
    production or in staging? To a service in a local datacenter or one in the
    cloud? To the most recent version of a service that’s being tested or to an
    older one that’s been vetted in production?
    • Having found the correct destination, Linkerd retrieves the corresponding
    pool of instances from the relevant service discovery endpoint.
    • Linkerd chooses the instance most likely to return a fast response based
    on a variety of factors, including its observed latency for recent requests.
    • If the instance is down, unresponsive, or fails to process the request,
    Linkerd retries the request on another instance.
    • If an instance is consistently returning errors, Linkerd evicts it from the
    load balancing pool, to be periodically retried later.
    • If the deadline for the request has elapsed, Linkerd proactively fails the
    request (circuit-breaking) rather than adding load with further retries.
    • Linkerd captures every aspect of the above behavior in the form of metrics
    and distributed tracing, which are emitted to a centralized metrics system.
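
    The per-request behaviour described above can be sketched in a few lines of Python: pick the instance with the best observed latency, retry failures on another instance, evict instances that keep failing, and fail fast once the deadline has elapsed. This is an illustrative approximation under invented names, not Linkerd source code.

```python
# Sketch of latency-aware load balancing with retries, eviction and a
# request deadline, as attributed to Linkerd on the slide above.
import time

class MeshClient:
    def __init__(self, instances, deadline_s=1.0, max_errors=3):
        self.latency = {i: 0.0 for i in instances}   # EWMA of observed latency
        self.errors = {i: 0 for i in instances}
        self.deadline_s = deadline_s
        self.max_errors = max_errors

    def call(self, send):
        start = time.monotonic()
        pool = [i for i in self.latency if self.errors[i] < self.max_errors]
        for instance in sorted(pool, key=lambda i: self.latency[i]):
            if time.monotonic() - start > self.deadline_s:
                raise TimeoutError("deadline elapsed, failing fast instead of retrying")
            began = time.monotonic()
            try:
                response = send(instance)                # the actual network call
            except Exception:
                self.errors[instance] += 1               # candidate for eviction
                continue                                 # retry on the next instance
            observed = time.monotonic() - began
            self.latency[instance] = 0.8 * self.latency[instance] + 0.2 * observed
            self.errors[instance] = 0
            return response
        raise RuntimeError("all instances failed or were evicted")

client = MeshClient(["10.0.0.1", "10.0.0.2"])
print(client.call(lambda addr: f"200 OK from {addr}"))
```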


  22. What is a Service Mesh?
    The term service mesh is used to describe the network of microservices that make up such
    applications and the interactions between them. As a service mesh grows, it can become harder to
    understand and manage. Its requirements can include discovery, load balancing, failure recovery,
    metrics, and monitoring. A service mesh also often has more complex operational requirements, like
    A/B testing, canary releases, rate limiting, access control, and end-to-end authentication.
    Istio makes it easy to create a network of deployed services with load balancing, service-to-service
    authentication, monitoring, and more, without any changes in service code. You can configure and
    manage Istio using its control plane functionality, which includes:
    • Automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic.
    • Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault
    injection.
    • A pluggable policy layer and configuration API supporting access controls, rate limits and
    quotas.
    • Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and
    egress.
    • Secure service-to-service communication in a cluster with strong identity-based
    authentication and authorization.
    What is istio saying?


  23. Features of Istio
    Traffic management:
    • Istio’s easy rules configuration and traffic routing lets you control the flow of traffic
    and API calls between services.
    • Istio simplifies configuration of service-level properties like circuit breakers,
    timeouts, and retries.
    • It is possible to set up important tasks like A/B testing, canary rollouts, and staged
    rollouts with percentage-based traffic splits.
    Security:
    • Istio provides the underlying secure communication channel, and manages
    authentication, authorization, and encryption of service communication at scale.
    • Service communications are secured by default, letting you enforce policies
    consistently across diverse protocols and runtimes.
    Observability:
    • Istio’s robust tracing, monitoring, and logging give you deep insights into your
    service mesh deployment.
    • Istio’s Mixer component is responsible for policy controls and telemetry collection.
    • It provides backend abstraction and intermediation, insulating the rest of Istio from
    the implementation details of individual infrastructure backends, and giving
    operators fine-grained control over all interactions between the mesh and
    infrastructure backends.
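
    As an illustration of a percentage-based traffic split (e.g. 90% to v1 and 10% to a canary v2), here is a small Python sketch. The weights and version names are invented; in Istio such rules are declared in routing configuration (VirtualService resources), not in application code.

```python
# Sketch of a weighted (percentage-based) traffic split for canary rollouts.
import random

def split_traffic(weights):
    """weights: mapping of version -> percentage (should sum to 100)."""
    roll = random.uniform(0, 100)
    upper = 0.0
    for version, percent in weights.items():
        upper += percent
        if roll < upper:
            return version
    return version  # numerical edge case: fall back to the last version

routing_rule = {"reviews-v1": 90, "reviews-v2": 10}
sample = [split_traffic(routing_rule) for _ in range(10_000)]
print(sample.count("reviews-v2") / len(sample))   # roughly 0.10
```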


  24. Outline
    A Sidecar Roadtrip through
    Microservica
    • some facts about microservices and their
    pains and gains
    • good old times of the enterprise service bus
    and the rise of the container
    • review of popular service meshes
    • DEFINITION OF A SERVICE MESH
    • Comparison and Selection of a Service Mesh


  25. Service Mesh
    A Definition
    • Building microservices is easy.
    • Operating a microservice architecture is hard.
    Definition proposal: A Service Mesh creates a
    network abstraction to simplify the management of
    containerized applications and makes it easier to
    dynamically route, monitor and secure
    microservice-based applications.
    OK, we found noteworthy similarities. Let
    us summarize them ...


  26. Common Features of a Service Mesh
    Traffic management: dynamic rule-based traffic routing, rule- and percentage-based traffic splits, rate limits, quotas
    Resiliency: circuit breaker, timeouts, retries, recovery, latency-aware load balancing, health-aware service discovery
    Security: authentication, authorization, encryption of service communication, TLS endpoint termination, certificate handling
    Observability: tracing, monitoring, logging, telemetry collection


  27. Components of a Service Mesh
    Data plane: service-to-service communication, handled transparently via the sidecar proxy; services need not be aware of the service mesh
    Control plane: managing routes and rules, configuration of service-level properties (timeouts, retries, …), service discovery, access control, policy control, telemetry collection
    Proxy (sidecar pattern): each service has a dedicated proxy that communicates with the other sidecar proxies, collects tracing and monitoring data, and is managed by the orchestration framework
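
    A tiny Python sketch of the data-plane idea: the service only talks to its local sidecar, which forwards the call and records request-level telemetry on its behalf, so the service itself stays unaware of the mesh. All names here are illustrative.

```python
# Sketch of a sidecar proxy that forwards a request and reports telemetry.
import time

class Sidecar:
    def __init__(self, upstream, telemetry):
        self.upstream = upstream      # callable standing in for the real network hop
        self.telemetry = telemetry    # list standing in for the control-plane collector

    def handle(self, request):
        start = time.monotonic()
        response = self.upstream(request)            # service-to-service hop via the proxy
        self.telemetry.append({                      # request-level attributes for the mesh
            "path": request["path"],
            "status": response["status"],
            "latency_ms": 1000 * (time.monotonic() - start),
        })
        return response

telemetry = []
proxy = Sidecar(lambda req: {"status": 200, "body": "ok"}, telemetry)
proxy.handle({"path": "/orders/42"})
print(telemetry[0]["status"])   # the control plane would aggregate these records
```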


  28. Architecture of a
    Service Mesh
    • Control plane
    • Service Discovery
    • Telemetry
    • Security
    • Data plane
    • Proxy (Sidecar pattern)
    • Automatic proxy configuration
    • Service health checking
    • Auto recovery
    Taking it all together ...


  29. Service Discovery
    • Service discovery for the
    proxy sidecars
    • Traffic management
    capabilities for intelligent
    routing
    • A/B tests, canary
    deployments, etc.
    • Resiliency (timeouts, retries,
    circuit breakers, etc.)


  30. Telemetry
    (Observability)
    • Enforces access control and
    usage policies
    • Collects telemetry data from
    the sidecar proxies
    • The sidecar proxies extract
    request-level attributes and
    send them to the telemetry
    component for evaluation


  31. Security
    • Secure by default
    • Service-to-service and end-
    user authentication
    • Identity and credential
    management
    • Upgrade unencrypted traffic
    in the service mesh
    • Enforce policies based on
    service identity rather than
    on network controls
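
    A minimal sketch of "policies based on service identity rather than network controls": the sidecar checks the caller's verified identity (for example, taken from its mTLS certificate) against an allow-list. The policy table, header names, and identities are invented for illustration.

```python
# Sketch of identity-based authorization at the sidecar.
ALLOW = {
    "orders": {"frontend", "billing"},   # services allowed to call "orders"
}

def authorize(caller_identity: str, target_service: str) -> bool:
    # Decide based on who is calling, not on the caller's IP address.
    return caller_identity in ALLOW.get(target_service, set())

print(authorize("frontend", "orders"))     # True
print(authorize("unknown-job", "orders"))  # False
```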


  32. Resiliency
    • Circuit Breaker
    • Timeouts
    • Retries
    All done transparently by the
    proxies of the service mesh.
    Latency and fault data is even
    used for fault-tolerant load
    balancing.
    This enables ...
    State model
    https://martinfowler.com/bliki/CircuitBreaker.html
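
    A minimal sketch of the circuit-breaker state model referenced above: closed, open after repeated failures, half-open after a cool-down, and closed again on a successful trial call. The thresholds are arbitrary and the class is illustrative only.

```python
# Sketch of a circuit breaker (closed -> open -> half-open -> closed).
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout_s=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.failures = 0
        self.opened_at = None     # None means the circuit is closed (or half-open)

    def call(self, operation):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout_s:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None               # half-open: allow one trial call
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()   # trip the breaker
            raise
        self.failures = 0                           # success closes the circuit again
        return result

breaker = CircuitBreaker()
print(breaker.call(lambda: "reply from a healthy instance"))
```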


  33. Traffic
    management
    • Canary Releases
    • Fault-tolerant load balancing
    (circuit-breaker data)
    • Traffic splitting
    • Content based traffic
    steering
    This enables ...
    https://jaxenter.de/istio-einfuehrung-microservices-cloud-teil-1-71261
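
    Content-based traffic steering boils down to inspecting request attributes before choosing a destination. A hypothetical Python example, with an invented header name and destinations:

```python
# Sketch of content-based routing: send flagged users to the canary,
# everyone else to the stable version.
def route(request, stable="reviews-v1", canary="reviews-v2"):
    if request.get("headers", {}).get("x-canary-user") == "true":
        return canary
    return stable

print(route({"headers": {"x-canary-user": "true"}}))   # -> reviews-v2
print(route({"headers": {}}))                          # -> reviews-v1
```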


  34. Outline
    A Sidecar Roadtrip through
    Microservica
    • some facts about microservices and their
    pains and gains
    • good old times of the enterprise service bus
    and the rise of the container
    • review of popular service meshes
    • definition of a service mesh
    • COMPARISON AND SELECTION OF A SERVICE
    MESH


  35. OK, I got it …
    But which one to choose?


  36. An incomplete list of Service Meshes


  37. https://layer5.io
    Service Mesh Landscape
    Might be helpful for comparisons


  38. Never underestimate a SIDECAR in action!


  39. Acknowledgement
    Picture reference
    • Neurons (CC0, pixabay.com)
    • Cross Sidecar (orangehat.nl, CC BY 2.5, Wikipedia)
    • Train in station (CC0, pixabay.com)
    • Golden Gate (CC0, pixabay.com)
    • Container train (CC0, pixabay.com)
    • Container art (CC0, pixabay.com)
    • Container terminal (CC0, pixabay.com)
    • Container slug (CC0, pixabay.com)
    • Light bulbs (CC0, pixabay.com)
    • Sidecar (www.sidecarsliderbar.com)


  40. About
    Nane Kratzke
    CoSA: http://cosa.fh-luebeck.de/en/contact/people/n-kratzke
    Blog: http://www.nkode.io
    Twitter: @NaneKratzke
    LinkedIn: https://de.linkedin.com/in/nanekratzke
    GitHub: https://github.com/nkratzke
    ResearchGate: https://www.researchgate.net/profile/Nane_Kratzke
    SlideShare: http://de.slideshare.net/i21aneka
