Seeking Observability, Getting Started with Service Mesh

sakajunquality

November 16, 2019

Transcript

  1. Seeking Observability: Getting Started with Service Mesh on GCP

    16 November 2019 #DevFestLondon Jun Sakata @sakajunquality Google Developers Expert, Cloud
  2. The Speaker - Jun Sakata - Google Developers Expert, Cloud

    - SRE at Ubie, Inc. - Social Media: @sakajunquality - First time
  3. “Observability”

  4. What is Observability?

  5. Observ/ab/ility

  6. Wikipedia says... In control theory, observability is a measure of

    how well internal states of a system can be inferred from knowledge of its external outputs. https://en.wikipedia.org/wiki/Observability
  8. In Software Engineering ... Observability: collecting diagnostics data all across

    the stack to identify and debug production problems and also to provide critical signals about usage to our highly adaptive and scalable environment. Jaana B. Dogan, Google https://medium.com/observability/googles-approach-to-observability-frameworks-c89fc1f0e058
  10. Think about what SREs / PEs do! SRE Books https://landing.google.com/sre/books/

  11. What do we need?

  12. We need metrics that matter!

  13. We need logs that matter!

  14. We need to trace what happened!

  15. “Microservices”

  16. Microservices (Generally speaking) Several, possibly thousands of, services might

    be - written in different languages / frameworks / libraries - using many protocols - making distributed system calls
  17. Microservices Observability Think about what happens - when starting a new

    service in a new language - when communicating with a new protocol - when making a breaking change to network and infrastructure
  18. Microservices Observability We want to implement something that - is

    decoupled from languages, frameworks and libraries - supports many protocols or other procedures - decouples applications and the whole infrastructure
  20. “Service Mesh”

  21. Service Mesh - is a transparent network between services -

    decoupled from applications - language independent - provides automated application networking functions - Observability - Service Discovery - Policy Enforcement - etc...
  22. Here’s what’s happening Let’s say we have two services written

    in different languages Service A (Java w/ Spring Boot) Service B (Python w/ Flask)
  23. Here’s what’s happening Without Service Mesh, one calls the other

    directly Service A (Java w/ Spring Boot) Service B (Python w/ Flask)
  24. Here’s what’s happening For observability, each service must implement

    things Service A (Java w/ Spring Boot) Service B (Python w/ Flask) Metrics / Logs Service Metrics / Tracing Code Metrics / Tracing Code
  25. Here’s what’s happening What if another service is deployed...? and

    with a new runtime or a new protocol...? Service A (Java w/ Spring Boot) Service B (Python w/ Flask) Service C (Go w/o Framework) Metrics / Logs Service Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code
  26. Here’s what’s happening Next thing you see might be ...

    Service A (Java w/ Spring Boot) Service B (Python w/ Flask) Service C (Go w/o Framework) Service D (Scala w/ Play Framework) Service E (Python w/ Django) Service F (Python w/ own Framework) Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code
  27. Here’s what’s happening Next thing you see might be ...

    Service A (Java w/ Spring Boot) Service B (Python w/ Flask) Service C (Go w/o Framework) Service D (Scala w/ Play Framework) Service E (Python w/ Django) Service F (Python w/ own Framework) Service (Go Service (C++ Service (Go Service (Kotlin Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code
  28. Here’s what’s happening Can you update all of them? Hopefully

    in a short period of time? Service A (Java w/ Spring Boot) Service B (Python w/ Flask) Service C (Go w/o Framework) Service D (Scala w/ Play Framework) Service E (Python w/ Django) Service F (Python w/ own Framework) Service (Go Service (C++ Service (Go Service (Kotlin Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code Metrics / Tracing Code
  29. Here’s what’s happening Instead of implementing those networking features in

    service applications, sidecar proxies are deployed along with them Service A (Java w/ Spring Boot) Service B (Python w/ Flask) Metrics / Tracing Code Metrics / Tracing Code Sidecar Proxy Sidecar Proxy
  30. Here’s what’s happening Services, both internal and external, call

    each other through sidecars Service A (Java w/ Spring Boot) Service B (Python w/ Flask) Sidecar Proxy Sidecar Proxy
  31. Here’s what’s happening That way, we can let sidecar proxies,

    instead of applications, do what we need for observability! Service A (Java w/ Spring Boot) Service B (Python w/ Flask) Sidecar Proxy Sidecar Proxy Metrics / Logs Service
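
The sidecar idea above can be sketched in a few lines of Python. This is only a toy, in-process model, not how Envoy works: the names `SidecarProxy`, `service_b` and `proxy_b` are hypothetical, and a real sidecar is a separate process that intercepts network traffic. The point it illustrates is that the application handler contains no metrics code, while the wrapper records request counts and latency for any handler.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List


class SidecarProxy:
    """Toy stand-in for a sidecar: forwards calls and records telemetry."""

    def __init__(self, service_name: str, handler: Callable[[str], str]) -> None:
        self.service_name = service_name
        self._handler = handler  # application code, unaware of the proxy
        self.request_count: Dict[str, int] = defaultdict(int)
        self.latencies_ms: List[float] = []

    def call(self, path: str) -> str:
        start = time.perf_counter()
        status = "ok"
        try:
            return self._handler(path)
        except Exception:
            status = "error"
            raise
        finally:
            # Telemetry is recorded here, outside the application code.
            self.latencies_ms.append((time.perf_counter() - start) * 1000.0)
            self.request_count[status] += 1


def service_b(path: str) -> str:
    """Hypothetical application handler with no observability code in it."""
    if path == "/fail":
        raise ValueError("boom")
    return f"B handled {path}"


proxy_b = SidecarProxy("service-b", service_b)
```

Calling `proxy_b.call("/hello")` returns the handler's response and updates `proxy_b.request_count` and `proxy_b.latencies_ms`; the same wrapper would work unchanged for a handler written by another team.
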
  32. Envoy - L7 proxy - Originally from Lyft - Configurable

    via API w/o restart - 100% OSS! No Premium Version - High Performance / High Reliability - Widely used as a service-to-service proxy
  33. Envoy! Service A (Java w/ Spring Boot) Service B (Python

    w/ Flask) * Envoy is not only a proxy for service mesh: it is a universal data plane proxy!
  34. Envoy “The network should be transparent to applications. When network

    and application problems do occur it should be easy to determine the source of the problem.” Matt Klein, Lyft https://www.envoyproxy.io/docs/envoy/latest/intro/what_is_envoy
  35. https://www.youtube.com/watch?v=55yi4MMVBi4&t=444s

  36. Okay, sidecars are nice. But we still need something that orchestrates

    those distributed proxies.
  37. Istio

  38. Istio - Open source Service Mesh - Originally from Google,

    Lyft, etc. - (Lyft is not using Istio) - Envoy is used as the sidecar
  39. Istio in functionality 1. Connect - Service Discovery 2. Secure - Authentication

    - Authorization - Encryption 3. Control - Policy like circuit breaker - A/B testing, Canary Release 4. Observe - Monitor traffic via telemetry https://istio.io/
  40. Istio architecture https://istio.io/docs/concepts/what-is-istio/

  41. Istio architecture https://istio.io/docs/concepts/what-is-istio/ Complicated …?

  42. Istio architecture https://istio.io/docs/concepts/what-is-istio/ Dataplane

  43. Istio architecture https://istio.io/docs/concepts/what-is-istio/ Control Plane

  44. How it Works

  45. How it works... https://istio.io/docs/concepts/what-is-istio/

  46. How it works... https://istio.io/docs/concepts/what-is-istio/ Remember the sidecar!

  47. How it works... https://istio.io/docs/concepts/what-is-istio/

  48. How it works... https://istio.io/docs/concepts/what-is-istio/ Send telemetry to the control plane!

  49. Metrics, Logs, Telemetry, whatever ... What kind of insights can

    we get from them?
  50. List services and their status

  51. How services are connected

  52. Detailed metrics of each service

  53. How services are connected

  54. How services are connected

  55. Status of Istio itself
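
As a rough illustration of how insights like "how services are connected" can be derived from telemetry, the sketch below builds a service graph and per-edge error rates from request records. The record shape `(source, destination, HTTP status)` and the names `RECORDS`, `build_graph` and `error_rate` are assumptions made for this example; they are not Istio's actual telemetry schema or API.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Record = Tuple[str, str, int]  # (source service, destination service, HTTP status)

# Hypothetical telemetry records, as a sidecar might report them.
RECORDS = [
    ("service-a", "service-b", 200),
    ("service-a", "service-b", 503),
    ("service-b", "service-c", 200),
    ("service-a", "service-c", 200),
]


def build_graph(records: Iterable[Record]) -> Dict[str, Set[str]]:
    """Map each source service to the set of services it calls."""
    graph: Dict[str, Set[str]] = {}
    for src, dst, _status in records:
        graph.setdefault(src, set()).add(dst)
    return graph


def error_rate(records: Iterable[Record]) -> Dict[Tuple[str, str], float]:
    """Fraction of 5xx responses for each (source, destination) edge."""
    total: Counter = Counter()
    errors: Counter = Counter()
    for src, dst, status in records:
        total[(src, dst)] += 1
        if status >= 500:
            errors[(src, dst)] += 1
    return {edge: errors[edge] / count for edge, count in total.items()}
```

`build_graph(RECORDS)` gives the edges a graph view would draw, and `error_rate(RECORDS)` gives a per-edge health signal that could be used to color them.
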

  56. Visualizing tools - Prometheus - Grafana - Kiali - Stackdriver

    - Datadog - and more...!
  57. Looks Great How can we start?

  58. How to start - Do the official “Getting Started” -

    https://istio.io/docs/setup/getting-started/ - Install Istio - Install demo app: guest book - Do some “Tasks” - https://istio.io/docs/tasks/
  59. Read “ISTIO BY EXAMPLE” istiobyexample.dev

  60. Try on GKE? - Try “Istio on GKE” w/ mTLS

    permissive - https://cloud.google.com/istio/docs/istio-on-gke/overview - Just one click - Not recommended for production yet!
  61. What’s the latest update?

  62. Istio 1.3 was released last month!

  63. Istio 1.3 https://istio.io/news/2019/announcing-1.3/ - Performance Improvements - CLI Improvements -

    Dashboard Improvements - Intelligent Protocol Detection - Mixer-less HTTP Telemetry - Deployment Models Docs
  64. Istio 1.4 was released last Thursday!

  65. Istio 1.4 https://istio.io/news/2019/announcing-1.4 - Mixer-less telemetry - Authorization policy model

    in beta - Improved troubleshooting - Better sidecar
  66. Performance and User Experience are improving!

  67. Any tips for GCP users?

  68. Save access logs to BigQuery - Enable the Envoy access log

    in JSON, written to /dev/stdout --set global.proxy.accessLogFile="/dev/stdout" --set global.proxy.accessLogEncoding="JSON" - That way logs are collected by Stackdriver Logging - and you can export the logs to BigQuery
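
One benefit of JSON-encoded access logs is that each line can be parsed without a custom format parser. The sketch below is a minimal example under stated assumptions: the sample lines are made up, and the field names `response_code` and `duration` are assumed here for illustration, so check them against the log format your mesh actually emits.

```python
import json
from typing import Any, Dict, Iterable, List

# Made-up sample lines; real access log entries carry more fields.
SAMPLE_LINES = [
    '{"method": "GET", "path": "/api/items", "response_code": 200, "duration": 12}',
    '{"method": "POST", "path": "/api/items", "response_code": 503, "duration": 950}',
    "not json",
]


def parse_access_logs(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSON log lines, skipping anything that is not a JSON object."""
    entries: List[Dict[str, Any]] = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. plain-text lines mixed into stdout
        if isinstance(entry, dict):
            entries.append(entry)
    return entries


def slow_or_failed(entries: Iterable[Dict[str, Any]], max_ms: int = 500) -> List[Dict[str, Any]]:
    """Keep entries with a 5xx response code or a duration above max_ms."""
    return [
        e for e in entries
        if e.get("response_code", 0) >= 500 or e.get("duration", 0) > max_ms
    ]
```

`slow_or_failed(parse_access_logs(SAMPLE_LINES))` keeps only the failing POST request; the same kind of filter could be written as a query once the logs land in BigQuery.
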
  69. Try in-proxy telemetry to Stackdriver - In 1.4, Mixer-less telemetry

    is implemented for Stackdriver - https://github.com/istio/proxy/blob/release-1.4/extensions/stackdriver/README.md
  70. Wait for Anthos Service Mesh - Formerly called “Cloud Service

    Mesh” - To be a fully managed, Istio-based service mesh platform
  71. Anthos Service Mesh Increase Observability With the Stackdriver Query Notation

    (Cloud Next '19) https://www.youtube.com/watch?v=NGFpGW8aQS8&t=2034s
  73. Do I need a Service Mesh?

  74. Maybe Not

  75. Kelsey Hightower, Google https://twitter.com/kelseyhightower/status/1150158904900431873

  76. Think Carefully ... - If you’re running a single monolithic

    application, you clearly don’t - If you’re running services with a single technology stack, maybe you don’t - e.g. Java ecosystem - If the public cloud provides complete availability and observability of the network, we don’t! - (Speaking of Istio) If you don’t plan to use most of its functions, consider creating a control plane on your own!
  77. Takeaways

  78. Takeaways - With Service Mesh you can get a consistent

    function for observability, along with other functions, across languages and frameworks - Service Mesh decouples network and infrastructure functionality from applications - Service Mesh uses sidecar proxies for this - Istio is an all-in-one solution for Service Mesh
  79. Takeaways - Think about whether Service Mesh is the right solution for

    you - There are many ways to do this - Istio is not the only option for Service Mesh
  80. Thank You
