Learnings from Implementing Microservices Architecture with Kubernetes

Ray Tsang
October 29, 2019

Ray spent a 6-month rotation with an internal Google team, helping bring a project to the public cloud using a cloud-native technology stack and Kubernetes. Ray will share the architecture, development environment details, DevOps tools, and some tough decisions that had to be made to move the project along while staying prepared for future changes.

Join this session to learn about the journey: from development environment tool choices (Docker Compose, Skaffold, Kustomize, Jib), to the stack (Gradle, Spring Boot, Kafka, PostgreSQL, gRPC, gRPC-Web), to mono-repo vs. multi-repo, to the runtime infrastructure (Kubernetes, Istio, Prometheus, Grafana). With 20/20 hindsight, we'll visit some best practices, lessons learned, and how decisions and compromises were made.

Transcript

  1. 8 @saturnism @gcpcloud 2 teams - working together
    Discuss business requirements, architecture, operations needs
    Application implementation team takes over operation
  2. 10 Architecture diagram (labels only)
    Kubernetes cluster / Private Cluster; Istio Namespace; Networking; Istio Ingress (Istio); Project Namespace; Virtual Services (Istio); Frontend Deployment; Backends Deployment
    Cloud Load Balancing; Identity-Aware Proxy; Istio Egress (Istio); Cloud NAT; Third-Party Services
    PostgreSQL (Cloud SQL); Images (Container Registry); Prometheus (Monitoring); Grafana (Monitoring); Jaeger (Distributed Trace)
  3. 11 #1 ClickOp → GitOp
    In the cloud, it's very easy to click
    It's hard to reproduce!
  4. 12 Infrastructure as Code
    Terraform for all cloud infrastructure - GKE, Cloud SQL, Static IPs, VPC...
    Lay down other Kubernetes infrastructure - Istio, OPA Gatekeeper...
  5. 13 Architecture diagram from slide 10, repeated with the added label: Terraformed
  6. 14 Architecture diagram from slide 10, repeated with the added label: Ready for deployment
  7. 15 There is a lot of YAML
    But at least the process is repeatable
  8. 16 Check in the final configuration
    helm template \
      istio-${ISTIO_VERSION}/install/kubernetes/helm/istio \
      --name istio \
      --namespace istio-system \
      -f dev-values.yaml >> istio.yaml
    istio.yaml change → PR (review, see diffs) → merge → CI/CD (apply)
    (Depends on how much you trust the templating engine)
  9. 17 CI Pipeline (triggered by code commit)
    1. Build, test application
    2. Create container image
    3. Commit deployment manifests with new container image tag/sha
    CD Pipeline (triggered by manifest commit)
    1. Render full manifest if necessary
    2. Apply the full manifest
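Step 3 of the CI pipeline on slide 17 (committing manifests that carry the new image tag/sha) could be sketched roughly as below. This is only an illustration, not the pipeline used in the talk: the class `ManifestImageUpdater`, the method `pinImage`, and the image name `gcr.io/example/auth` are hypothetical, and a real pipeline might well use a tool such as Kustomize instead of a regex rewrite.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites the image reference of one repository in a manifest.
class ManifestImageUpdater {

    // Returns the manifest with every unquoted "image: <repository>[:tag|@digest]" line
    // pointing at repository + newReference (e.g. ":1.1" or "@sha256:...").
    static String pinImage(String manifest, String repository, String newReference) {
        Pattern imageLine = Pattern.compile(
            "(?m)^([ \\t]*(?:-[ \\t]*)?image:[ \\t]*)"
                + Pattern.quote(repository)
                + "(?:[:@]\\S+)?[ \\t]*$");
        Matcher m = imageLine.matcher(manifest);
        // Keep the original indentation and "image:" prefix, swap only the reference.
        return m.replaceAll(r ->
            Matcher.quoteReplacement(r.group(1) + repository + newReference));
    }

    public static void main(String[] args) {
        String manifest = String.join("\n",
            "spec:",
            "  containers:",
            "  - name: auth",
            "    image: gcr.io/example/auth:1.0",
            "");
        System.out.print(pinImage(manifest, "gcr.io/example/auth", "@sha256:abc123"));
    }
}
```

Pinning a digest rather than a mutable tag keeps the committed manifest an exact record of what the CD pipeline will apply.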
  10. 18 #2 Adopt incrementally
    Understand the requirements for production
    Have a roadmap to know what technology to adopt, when
  11. 19 Initial Learning: Click-op to GKE
    Repeatable Infrastructure: Infrastructure as code (Terraform)
    Networking / Ingress: POC with L4 LB (Kubernetes); moved to L7 LB with Ingress (Kubernetes); SSL (Let's Encrypt? GCP Managed Certificates?); URL mapping/routing (Envoy → Istio)
    Security: Enable mTLS (Istio); deny/allow egress (Istio); add security policies (Kubernetes); deny/allow container images (OPA Gatekeeper)
    Monitoring: Collect metrics (Prometheus); scrape the metrics; dashboards (Grafana); alerts; SLOs; infrastructure-level alerts; uptime checks
  12. 20 Some things are hard to change
    Be careful of one-way doors
    Istio sidecar requires privileges → Reevaluate/reinstall Istio w/ CNI
    Public cluster to private cluster → Delete and recreate
  13. 21 #3 Adopt carefully
    Don't go back and say "I need everything on that slide!"
    Consider what you really want to achieve
    Explore and make sure everything works as advertised
  14. 22 Problem Statement
    Goal / Scope
    Solutions / Alternatives
    Pros / Cons
    Recommendation / Decision
  15. 23 #4 Consider non-technical factors
    Write a doc!
    Consider reality
    Weigh the risks
    Maintenance / Operations
  16. 27 Mono Repo vs. Multi Repo
    Project Structure - Mono: multi-module/multi-project. Multi: single module/project.
    Dependency Management - Mono: parent / includes; all dependencies are up to date. Multi: common parent/BOM; automate dependency version updates.
    Artifact Management - Mono: can avoid initially. Multi: need to publish artifacts; where to publish?
    Testing - Mono: easy. Multi: against snapshots, flaky.
    CI - Mono: just one pipeline; builds everything. Multi: copy of pipeline per repo; build only the service that changed.
    CD - Mono: which service to deploy? Multi: deploy the service that changed.
    Initial Velocity - Mono: fast. Multi: slow...
    Long Term Velocity - Mono: slows down over time; long builds. Multi: faster.
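The CD question in the comparison above ("Which service to deploy?") can be approached in a mono repo by mapping changed paths to services. A minimal sketch, assuming a `services/<name>/` layout like the project structure shown later in the deck; the class `ChangedServices` and the method `affected` are hypothetical, and treating every other path as shared is a deliberately conservative assumption rather than something the talk prescribes.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: decides which services a set of changed files affects.
class ChangedServices {

    // changedPaths are repo-relative paths, e.g. from a VCS diff;
    // allServices are the directory names under services/.
    static Set<String> affected(List<String> changedPaths, Set<String> allServices) {
        Set<String> result = new TreeSet<>();
        for (String path : changedPaths) {
            String[] parts = path.split("/");
            boolean insideOneService = parts.length >= 3
                && parts[0].equals("services")
                && allServices.contains(parts[1]);
            if (insideOneService) {
                result.add(parts[1]);
            } else {
                // Assumption: anything else (root build.gradle, services/common.gradle, ...)
                // is shared build logic, so every service is rebuilt and redeployed.
                return new TreeSet<>(allServices);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> all = Set.of("auth", "user", "email");
        System.out.println(affected(List.of("services/auth/src/main/proto/auth.proto"), all));
        System.out.println(affected(List.of("services/common.gradle"), all));
    }
}
```

Note that this ignores dependencies between services (for example, one service consuming another's proto files), which a real build would need to account for.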
  17. 28 We still chose Mono Repo...
    Team is already familiar with Mono Repo
    Fast ramp up and velocity
    Lack of existing infrastructure for dependency and artifact management
    Setting up one repo and pipeline was difficult enough...
  18. 29 We do this analysis for everything
    Does every service have its own database?
    gRPC or REST?
    Kafka?
    Knative?
  19. 30 #5 Anticipate changes
    Choices made today are made.
    Design to expect changes tomorrow.
    Avoid one-way doors.
  20. 31 Anticipate Multi Repo
    We anticipate outgrowing the Mono Repo
    Make sure the Mono Repo is still splittable!
  21. 33 Project Structure
    project/
    +-- build.gradle
    +-- services/
        +-- common.gradle
        +-- auth/
        |   +-- src/main/proto/auth.proto
        +-- user/
        +-- email/
  22. 34 apply from: '../common.gradle'
    group = 'com.example.services'
    mainClassName = 'com.example.services.auth'
    dependencies {
      implementation project(':common:user')
      protobuf project(path: ':services:user', configuration: 'proto')
    }
  23. 35 Anticipate Multi Repo
    As the team grows, and new teams come to take over services...
    Successfully split out 3 services from the Mono Repo
  24. 36 #6 Focus on your application
    Architecture and design - it has nothing to do with Kubernetes
    If you design well, you can almost always deploy into Kubernetes
    12factor.net
  25. 37 Adopt Carefully
    Anticipate Changes
    Microservices architecture is not the answer to everything
    Monolith works too, as long as it is designed well!
  26. 38 #7 Local != Production
    Do not bring Slide 10 to local development
    Focus on velocity of well-designed application
    Rely on self-encapsulating unit/integration tests
  27. 39 Why not Istio Locally?
    A lot to learn and troubleshoot
    Use fewer compute resources
  28. 40 Test Locally - without Kubernetes
    Unit tests
    Integration tests
    WireMock
    Testcontainers
  29. 41 If you need to test something...
    Simple Envoy Proxy
    Local Kubernetes (k3s, minikube, ...)
  30. 42 Cloud Code / Skaffold
    After you test everything… want to see end-to-end result
    Continuous development loop
  31. 46 kubectl create deployment myservice --image=... --dry-run -oyaml > k8s/deployment.yaml
    kubectl create svc clusterip myservice --tcp=8080:8080 --dry-run -oyaml > k8s/service.yaml
  32. 49 #9 Contracts with the runtime environment
    When is your application ready to serve traffic?
    When is it in trouble?
    How do you shut down gracefully?
  33. 51 Liveness Probe
    When to use: If application is alive.
    Failure means: Application will be restarted, and that a restart will help recover.
    Practices: Runs on serving port of the application, e.g., 8080. Don't check dependency. E.g., don't check dependent database connection, etc.
    Example: A simple /alive URL that returns 200.
    Readiness Probe
    When to use: Ready to serve requests.
    Failure means: Take the pod instance out of load balancer.
    Practices: Flip to ready when application has done all the initializations (cache preloaded). Upon SIGTERM, flip readiness to false. See Graceful Shutdown.
    Example: /actuator/health on the management port.
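The probe practices on slide 51 can be illustrated with a small sketch. It uses the JDK's built-in HTTP server rather than Spring Boot Actuator, so it is not the setup from the talk: the class `ProbeEndpoints`, its method names, and the `/ready` path are hypothetical (the slide's own examples are `/alive` and `/actuator/health`).

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical probe endpoints; class name and /ready path are illustrative only.
class ProbeEndpoints {
    // Starts false: the application is not ready until initialization is done.
    private final AtomicBoolean ready = new AtomicBoolean(false);

    // Liveness: the process can answer at all; deliberately no dependency checks.
    int livenessStatus() { return 200; }

    // Readiness: 200 only while the application wants traffic, 503 otherwise.
    int readinessStatus() { return ready.get() ? 200 : 503; }

    void markReady() { ready.set(true); }       // e.g. after caches are preloaded
    void markNotReady() { ready.set(false); }   // e.g. upon SIGTERM

    // Serves both probes on the given port.
    HttpServer start(int port) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/alive", ex -> respond(ex, livenessStatus()));
            server.createContext("/ready", ex -> respond(ex, readinessStatus()));
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void respond(HttpExchange ex, int status) throws IOException {
        ex.sendResponseHeaders(status, -1); // -1 means no response body
        ex.close();
    }
}
```

Keeping the liveness check free of dependency calls means a database outage takes pods out of rotation via readiness, if the application chooses to report it there, instead of triggering restarts that cannot fix the outage.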
  34. 52 Anatomy of a Graceful Shutdown
    1. Receive SIGTERM or PreStop Lifecycle Hook
    2. Fail Readiness Probe
    3. Receive requests until Kubernetes detects readiness probe failure
    4. Kubernetes removes pod endpoint from Service
    5. Finish serving in-flight requests
    6. Shut down
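The shutdown sequence on slide 52 could be sketched as follows. This is a simplified illustration under stated assumptions: in-flight requests are modeled as tasks on an `ExecutorService`, the class `GracefulShutdown` and its parameters `drainMillis` and `finishMillis` are hypothetical, and only the SIGTERM path (via a JVM shutdown hook) is covered, not a PreStop hook.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical coordinator for the shutdown steps; names are illustrative only.
class GracefulShutdown {
    private final AtomicBoolean ready = new AtomicBoolean(true);
    private final ExecutorService workers;  // stands in for request-handling threads
    private final long drainMillis;         // time allowed for the failed probe to be noticed
    private final long finishMillis;        // upper bound for finishing in-flight work

    GracefulShutdown(ExecutorService workers, long drainMillis, long finishMillis) {
        this.workers = workers;
        this.drainMillis = drainMillis;
        this.finishMillis = finishMillis;
    }

    // The readiness probe handler would report this value.
    boolean isReady() { return ready.get(); }

    // Returns true if all in-flight work finished within finishMillis.
    boolean shutdown() {
        ready.set(false);                   // step 2: fail the readiness probe
        try {
            Thread.sleep(drainMillis);      // steps 3-4: keep serving until the endpoint is removed
            workers.shutdown();             // stop accepting new work
            return workers.awaitTermination(finishMillis, TimeUnit.MILLISECONDS); // step 5
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            workers.shutdown();
            return false;
        }
    }

    // Step 1: the JVM runs shutdown hooks when it receives SIGTERM.
    void installHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::shutdown));
    }
}
```

The drain delay would need to exceed the readiness probe's period times its failure threshold, and the total has to fit inside the pod's termination grace period; both values depend on the cluster configuration.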
  35. 54 All the cross-cutting concerns are the same
    Monolith, microservices, Kubernetes, not Kubernetes...
    But the 2 teams now speak with the same nouns: Deployment, Service, Ingress, Virtual Service, ...