Service Mesh - a good deal for Microservices?

Let’s be honest: Sometimes we wish we could go back to the good old monolith. A single application which can be easily operated, secured and monitored and that does not have to deal with all the challenges a network introduces.
But instead, many companies have decided to go with Microservices, for many good reasons such as faster delivery and more independence for developer teams.
Yet, the cross-cutting concerns developers implement around the business logic seem to have gotten a bit out of hand. Think about monitoring, circuit breaking, canary releasing, TLS termination. This is exactly what a Service Mesh promises to change. It lifts monitoring, resilience, routing, and security into the infrastructure. Sounds too good to be true? Indeed, a Service Mesh does not come without a price: cognitive complexity, increased resource consumption and latency.
We need to talk about meaningful use cases for a Service Mesh, about its drawbacks, and about implementations such as Istio and Linkerd.

Hanna Prinz

July 15, 2020

Transcript

  1. Service Mesh: Metrics, Config, Retry, Timeout, Circuit Breaker, Routing, Encrypt, Decrypt, Authorization, Metrics, ... @INNOQ @HannaPrinz
  2. Service Mesh Evolution: Monolith, Microservices in Theory, Microservices in Practice, Microservices with Service Mesh
  3. Service Mesh Architecture: Microservice 1, Microservice 2, Proxy, Proxy, Control Plane App, Infrastructure Service X, Infrastructure Service Y. Layers: Application, Data Plane, Control Plane, Infrastructure
  4. Monitoring: A Service Mesh can automatically deliver all 4 "Golden Signals": Latency, Traffic Volume, Errors (Status Codes), Saturation ... but it cannot look into the Microservices' Business Logic. https://landing.google.com/sre/sre-book/chapters/monitoring-distributed-systems/#xref_monitoring_golden-signals
  5. Monitoring with a Service Mesh: Record Network Traffic Metrics (Latency / Response Time, HTTP Status Codes, Requests per Second) ... make them available to a Monitoring System ... and visualize them with dashboards
  6. Demo Application: Order, Shipping, Invoicing, Postgres. The services use neither code nor libraries for monitoring! https://github.com/ewolff/microservice-istio
  7. Routing: Typically implemented in the Edge Router / API Gateway, e.g. NGINX, Envoy, Ambassador, Traefik. Between Instance A and Instance B: Load Balancing; Path-based Routing (/a, /b); Blue/Green Deployment; A/B-Testing (50% / 50%); Canary Releasing (London / World)
  8. Routing with a Service Mesh: Microservice 1, Microservice 2, Proxy, Proxy, Control Plane App, Routing Rules. Layers: Application, Data Plane, Control Plane
  9. Routing with a Service Mesh: Complex Routing Rules for A/B Testing and Canary Releasing (Service 1, Service 2A, Service 2B, each with a Proxy; GET /new, GET /; 90%, 10%; locality=London, locality=*). Traffic Mirroring (Service 1, Service 2 in PRODUCTION, Service 2 in STAGING, each with a Proxy)
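
The weighted split and the traffic mirroring named on this slide could be expressed in Istio roughly as follows. This is a minimal sketch and not taken from the talk: the host names `service-2` and `service-2-staging`, the subset names `a` and `b`, the `version` labels, and the assignment of 90% and 10% to the subsets are all assumptions for illustration. Field names should be checked against the Istio version in use.

```yaml
# Hypothetical names throughout; not from the original slides.
# The DestinationRule defines the two variants of Service 2 as subsets.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: service-2
spec:
  host: service-2
  subsets:
  - name: a
    labels:
      version: a
  - name: b
    labels:
      version: b
---
# The VirtualService splits traffic 90/10 between the subsets
# and sends a copy of each request to a staging host.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service-2
spec:
  hosts:
  - service-2
  http:
  - route:
    - destination:
        host: service-2
        subset: a
      weight: 90
    - destination:
        host: service-2
        subset: b
      weight: 10
    mirror:
      host: service-2-staging   # assumed staging service
```

Responses from the mirrored host are discarded by the proxy, so the staging copy sees production traffic without affecting callers.
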
  10. Resilience: What if a service is not available as expected (status 500 in the diagram)? Goal: Overall system continues to function ... with restrictions where necessary. Methods: Retry, Timeout, Circuit Breaking
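
Of the three methods listed, circuit breaking is the least self-explanatory. In Istio it is typically configured on a DestinationRule. The following is a sketch under assumptions: the host name `service-2` and all thresholds are illustrative and do not come from the talk.

```yaml
# Hypothetical host and thresholds; not from the original slides.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: service-2-circuit-breaker
spec:
  host: service-2
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 10   # queue limit before requests are rejected
    outlierDetection:
      consecutive5xxErrors: 5         # eject an instance after 5 consecutive 5xx responses
      interval: 10s                   # how often instances are evaluated
      baseEjectionTime: 30s           # minimum time an ejected instance stays out
      maxEjectionPercent: 100
```
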
  11. Resilience with a Service Mesh: Microservice 1, Microservice 2, Proxy, Proxy, Control Plane App, Resilience Rules. Layers: Application, Data Plane, Control Plane
  12. Resilience with a Service Mesh: Fault Injection and Delay Injection (Service 1, Proxy, Proxy, Service 2); Timeout and Retry (Service 1, Proxy, Proxy, Service 2). Diagram labels: 4s, 502
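
As an illustration of the rules on this slide, the two Istio manifests below sketch a timeout with retries and, as an alternative, an injected delay and fault. The host name `service-2` and every number are assumptions; the slide shows 4s and 502 but the transcript does not say which rule they belong to. Istio's documentation notes that fault injection and retry or timeout policies are not supported on the same VirtualService, so the two manifests are meant to be applied one at a time, not together.

```yaml
# Variant 1 (hypothetical values): timeout and retry for calls to service-2.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service-2
spec:
  hosts:
  - service-2
  http:
  - route:
    - destination:
        host: service-2
    timeout: 4s            # overall deadline per request
    retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: 5xx
---
# Variant 2 (hypothetical values): delay and fault injection for testing.
# Replaces variant 1; do not apply both for the same host.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service-2
spec:
  hosts:
  - service-2
  http:
  - fault:
      delay:
        percentage:
          value: 100       # delay every request
        fixedDelay: 4s
      abort:
        percentage:
          value: 10        # fail 10% of requests
        httpStatus: 502
    route:
    - destination:
        host: service-2
```
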
  13. Security with a Service Mesh: Microservice 1, Microservice 2, Proxy, Proxy, Control Plane App, Authorization Rules, TLS-Certificate. Layers: Application, Data Plane, Control Plane
  14. Security with a Service Mesh: Authentication with mTLS (Service 1, Proxy, Proxy, Service 2; TLS-Certificate "Service 1"); Authorization (Service 1, Proxy, Proxy, Service 2; GET /api, GET /; Authorization Rule)
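
The combination of mTLS authentication and an authorization rule could look like this in Istio (version 1.5 or later). It is a sketch under assumptions: the namespace `default`, the label `app: service-2`, and the service account `service-1` are hypothetical, and allowing only `GET /api` is one possible reading of the slide.

```yaml
# Hypothetical namespace, labels and service account; not from the original slides.
# Require mutual TLS for all workloads in the namespace.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: default
spec:
  mtls:
    mode: STRICT
---
# Allow only Service 1 (identified by its certificate) to call GET /api on Service 2.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: service-2-allow-get-api
  namespace: default
spec:
  selector:
    matchLabels:
      app: service-2
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/service-1"]
    to:
    - operation:
        methods: ["GET"]
        paths: ["/api"]
```

Because an ALLOW policy exists for the workload, requests that match no rule (for example `GET /` in this sketch) are denied.
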
  15. Service Mesh Features. Observability: Network metrics and access logs; Emit tracing data to backend; Business metrics or logs; Passing on tracing headers; Alerting. Resilience: Timeouts & Retries; Circuit Breaking; Use cache or standard responses in Circuit Breaker. Routing: Complex routing rules; Canary Releasing & A/B-Testing; Automatic Canary Releasing. Security: Authentication with mTLS; Authorization
  16. Istio vs. Linkerd 2. Istio: *2017 by Google & IBM; optimized for feature-richness and configurability; optimized for Kubernetes, but not exclusive. Linkerd 2: *2017 by Buoyant; optimized for usability and performance; Kubernetes only
  17. Features: Network metrics and access logs, Emit tracing data to backend, Timeouts & Retries, Circuit Breaking, Authentication with mTLS, Authorization, Complex routing rules, Canary Releasing & A/B-Testing. Categories: Observability, Resilience, Routing, Security. Istio ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✗ ✓ ✓ ✓ ✗
  18. Service Mesh Magic is based on a lot of YAML (background: two truncated copies of the Istio EnvoyFilter manifest shown in full on slide 21)
  19. Monitoring Precision. By Service: Service 11ms. By Endpoint: Service with GET /users 4ms, POST /order 17ms, GET /invoice/42 2ms
  20. Monitoring Precision by Endpoint with Linkerd 2. Service A calls Service B: GET /users 4ms, POST /order 17ms, GET /invoice/42 2ms

        apiVersion: linkerd.io/v1alpha1
        kind: ServiceProfile
        metadata:
          name: service-b.default.svc.cluster.local
          namespace: default
        spec:
          routes:
          - name: GET /users
            condition:
              method: GET
              pathRegex: /users
          - name: POST /order
            condition:
              method: POST
              pathRegex: /order
          - name: GET /invoice/{id}
            condition:
              method: GET
              pathRegex: /invoice/[^/]*

  21. Monitoring Precision by Endpoint with Istio (Experimental in Version 1.6). Service A calls Service B: GET /users 4ms, POST /order 17ms, GET /invoice/42 2ms

        apiVersion: networking.istio.io/v1alpha3
        kind: EnvoyFilter
        metadata:
          name: istio-attributegen-filter
        spec:
          workloadSelector:
            labels:
              app: reviews
          configPatches:
          - applyTo: HTTP_FILTER
            match:
              context: SIDECAR_INBOUND
              proxy:
                proxyVersion: '1\.6.*'
              listener:
                filterChain:
                  filter:
                    name: "envoy.http_connection_manager"
                    subFilter:
                      name: "istio.stats"
            patch:
              operation: INSERT_BEFORE
              value:
                name: istio.attributegen
                typed_config:
                  "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                  type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                  value:
                    config:
                      configuration: |
                        {
                          "attributes": [
                            {
                              "output_attribute": "istio_operationId",
                              "match": [
                                {
                                  "value": "GET /users",
                                  "condition": "request.url_path == '/users' && request.method == 'GET'"
                                },
                                {
                                  "value": "POST /order",
                                  "condition": "request.url_path == '/order' && request.method == 'POST'"
                                },
                                {
                                  "value": "GET /invoice/{id}",
                                  "condition": "request.url_path.matches('^/invoice/[[:alnum:]]*$') && request.method == 'GET'"
                                }
                              ]
                            }
                          ]
                        }
                      vm_config:
                        runtime: envoy.wasm.runtime.null
                        code:
                          local: { inline_string: "envoy.wasm.attributegen" }

  22. Performance & Resources • Latency: highly dependent on traffic • Istio: approx. 3 ms additional latency per call between services! • Linkerd 2: no current numbers; similar to Istio in earlier versions • Resources • Additional containers for the Control Plane and each sidecar • → Increased CPU and memory consumption. But: this depends on the concrete project → run your own benchmark!
  23. Service Mesh. + Solves many essential problems of microservices ... without changing the code! - Another complex piece of technology; increased latency and resource consumption
  24. Decision support. Service Mesh indicators: • Many microservices, many synchronous calls • Many unsolved problems in monitoring, routing, resilience and/or security • Most services run in Kubernetes. Selection criteria: • Which features are really missing? • Existing infrastructure: Kubernetes, Consul, AWS, ... • Time and cognitive capacity in the team • Activity of the community. Objective: as much complexity as necessary, but as little as possible
  25. More Service Mesh • Service Mesh Comparison at servicemesh.es https://servicemesh.es/ • Blog Post: Happy without a Service Mesh https://innoq.com/en/blog/happy-without-a-service-mesh/ • Example Application on GitHub https://github.com/ewolff/microservice-istio • Linkerd Tutorial https://linkerd.io/2/tasks/ • Istio Tutorial https://istio.io/docs/setup/getting-started/
  26. Thank you! Questions? Hanna Prinz [email protected] @HannaPrinz. Service Mesh Primer - 2nd Edition, free at leanpub.com/service-mesh-primer. innoQ Deutschland GmbH: Krischerstr. 100, 40789 Monheim am Rhein, Germany, +49 2173 3366-0; Ohlauer Str. 43, 10999 Berlin, Germany, +49 2173 3366-0; Ludwigstr. 180E, 63067 Offenbach, Germany, +49 2173 3366-0; Kreuzstr. 16, 80331 München, Germany, +49 2173 3366-0; Hermannstrasse 13, 20095 Hamburg, Germany, +49 2173 3366-0. innoQ Schweiz GmbH: Gewerbestr. 11, CH-6330 Cham, Switzerland, +41 41 743 0116. www.innoq.com. Icons made by srip, Smashicons, Nikita Golubev, Freepik, surang and Darius Dan from www.flaticon.com and licensed by CC 3.0 BY