Difficult to understand the effect of a change. Difficult to test. Difficult to onboard. Difficult to isolate failures. Difficult to scale independently. Difficult to try new technologies.
Microservices is a software development technique that structures an application as a collection of loosely coupled services with the smallest autonomous boundary.
Architecture overview (labels on the diagram: Container, Over HTTP).
Components: Google Cloud Load Balancing, API Gateway, Authority, Service A, Service B, Service X, Mercari API.
Platforms shown: Sakura, GCP (Kubernetes Engine, Cloud Resources, Managed Services, Managed DB).
Google Cloud Load Balancing: SSL termination, DDoS protection (Cloud Armor?).
API Gateway: routing to microservices, protocol transformation (HTTP to gRPC), common logging and tracing, request buffering.
Authority: common AuthZ/AuthN.
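As a rough illustration of the "routing to microservices" role of the API Gateway, here is a minimal longest-prefix router. The path prefixes, the upstream names, and the fallback to the Mercari API are all hypothetical. The slides do not describe the gateway's actual routing rules.

```python
from typing import Dict, Optional

# Hypothetical routing table: path prefix -> upstream service name.
# None of these prefixes come from the slides.
ROUTES: Dict[str, str] = {
    "/authority/": "authority",
    "/service-a/": "service-a",
    "/service-b/": "service-b",
    "/": "mercari-api",  # assumed fallback for everything else
}


def route(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the upstream whose prefix is the longest match for path."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return None
    return routes[max(matches, key=len)]
```

Longest-prefix matching lets a catch-all entry coexist with more specific service prefixes, so the order of entries in the table does not matter.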
"Another important takeaway is that even though all of these listed items are important, ultimately the most critical thing is observability. As I like to say: observability, observability, observability." - Matt Klein, Seeking SRE (Chapter 6)
Service A calls Service B over the network. Concerns on the calling path: AuthN and AuthZ? API limit? Load balancing? Request timeout? Request retry with backoff? Circuit breaking? Logging? Tracing? (Observability) Different protocols...
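Two of the concerns listed above, request retry with backoff and circuit breaking, can be sketched in a few lines. This is a minimal illustration and not what the slides' services use. The class and function names, the thresholds, and the delays are all assumptions. Timeouts, rate limiting, and tracing are left out.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; allows a trial call
    once `reset_after` seconds have passed (half-open)."""

    def __init__(self, threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold = threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open")
            # Reset window elapsed: let one trial call through.
            self.opened_at = None
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit
        return result


def retry_with_backoff(fn: Callable[[], T], attempts: int = 3,
                       base_delay: float = 0.1,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn up to `attempts` times, sleeping base_delay * 2**i
    after the i-th failed attempt (exponential backoff)."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for i in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # do not keep retrying a dependency the breaker gave up on
        except Exception:
            if i == attempts - 1:
                raise
            sleep(base_delay * 2 ** i)
    raise AssertionError("unreachable")
```

A caller would typically combine the two, for example `retry_with_backoff(lambda: breaker.call(do_request))`, where `do_request` is whatever function performs the call to Service B. Writing this in every service and every language is the duplication the slide is pointing at.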
GCP project: GKE Production contains the Production Cluster (IAM: SRE). GCP project: GKE Development contains the Development Cluster (IAM: SRE + α). One cluster per GCP project. Only SRE can access cluster nodes.
GCP project: GKE Production, Production Cluster. Node pools: n1-standard-16 and n1-highmem-16. Workloads: machine learning workloads and normal applications. Auto scaling: enabled. Automatic node repair: enabled. Preemptible: enabled (only in US).
GCP project: GKE Production (IAM: SRE), Production Cluster. Namespace: Service A (RBAC: Team X) with Pod A, Pod A, Pod A. Namespace: Service B (RBAC: Team Y) with Pod B, Pod B. Each service has its own Kubernetes namespace. Each team can only access its own Kubernetes namespace.
GCP project: Service A (IAM: Team X + SRE) with Cloud SQL. GCP project: Service B (IAM: Team Y + SRE) with Spanner. Each service has its own GCP project. Service resources live in the service's own GCP project. Each namespace has its own service account for its own GCP project.
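The "RBAC: Team X" label on a namespace could be expressed as a Kubernetes RoleBinding. The sketch below builds such a manifest as a plain dictionary and prints it as JSON, which `kubectl apply -f` accepts. The group name `team-x`, the namespace name `service-a`, and the choice of the built-in `edit` ClusterRole are assumptions. The slides do not say which role or subject type is used.

```python
import json
from typing import Any, Dict


def namespace_role_binding(namespace: str, team_group: str,
                           cluster_role: str = "edit") -> Dict[str, Any]:
    """Build a RoleBinding that grants `cluster_role` to `team_group`,
    scoped to a single namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {
            "name": f"{team_group}-{cluster_role}",
            "namespace": namespace,  # the binding only applies here
        },
        "subjects": [{
            "kind": "Group",
            "name": team_group,  # hypothetical group name
            "apiGroup": "rbac.authorization.k8s.io",
        }],
        "roleRef": {
            "kind": "ClusterRole",
            "name": cluster_role,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


if __name__ == "__main__":
    print(json.dumps(namespace_role_binding("service-a", "team-x"), indent=2))
```

Because a RoleBinding is namespaced, binding Team X only in the Service A namespace gives that team no access to the Service B namespace, which matches the isolation shown on the slide.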
Same architecture as the previous slide, with open questions for each new service: GCP project creation…? Setup Spanner or Cloud SQL ..?
Same architecture, now with Stackdriver added. Logging…?
Same architecture with Stackdriver, plus two sinks and two Big Query instances (one per service). Create BQ for each service. Create a BQ sink for each service.
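The per-service sink setup could be described as data before it is applied with whatever tooling is in use. The sketch below builds one sink definition per service with the `name`, `destination`, and `filter` fields that a Cloud Logging (Stackdriver) sink takes. Several details are assumptions and not taken from the slides: the dataset lives in the service's own GCP project, logs are selected by Kubernetes namespace through the `k8s_container` resource type, and the project, dataset, and sink names are hypothetical.

```python
from typing import Dict


def log_sink_for_service(service: str, service_project: str) -> Dict[str, str]:
    """Describe a log sink that exports one namespace's container logs
    to a BigQuery dataset in the service's own project (assumed layout)."""
    # BigQuery dataset IDs allow letters, digits, and underscores only.
    dataset = service.replace("-", "_") + "_logs"
    return {
        "name": f"{service}-to-bq",
        "destination": (
            "bigquery.googleapis.com/projects/"
            f"{service_project}/datasets/{dataset}"
        ),
        # Assumed filter: select logs by namespace name.
        "filter": (
            'resource.type="k8s_container" AND '
            f'resource.labels.namespace_name="{service}"'
        ),
    }


if __name__ == "__main__":
    for svc, project in [("service-a", "service-a-project"),
                         ("service-b", "service-b-project")]:
        print(log_sink_for_service(svc, project))
```

Generating the definitions from a list of services keeps the per-service setup uniform, which matters once project creation, databases, and log export all have to be repeated for every new service.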