Slide 1

Slide 1 text

Service orchestration patterns
Mete Atamel
Developer Advocate at Google
@meteatamel
atamel.dev
speakerdeck.com/meteatamel

Slide 2

Slide 2 text

Service orchestration

Slide 3

Slide 3 text

How do you organize a group of microservices to cooperate towards a common goal?

Slide 4

Slide 4 text

Option 1: Direct service-to-service calls
Services calling each other directly
Frontend (App Engine): Order request
Payment Processor (Cloud Run): Authorize & charge CC
Shipper (Cloud Functions): Prepare & ship items
Notifier (Cloud Run): Notify user

Slide 5

Slide 5 text

Direct service-to-service calls
Pros
➕ Easy to implement: Services simply call each other
Cons
➖ Too much coupling
➖ Each service can be a single point of failure
➖ Each service needs its own error / retry / timeout logic
➖ Who ensures the whole transaction is successful? (hint: saga pattern)
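
To make the "own error / retry / timeout logic" point concrete, here is a minimal Python sketch of the kind of retry helper every caller ends up carrying when services call each other directly. The names `call_with_retries` and `FlakyPaymentProcessor` are hypothetical and not from the talk, and the sketch covers retries with exponential backoff only, not timeouts.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


class FlakyPaymentProcessor:
    """Hypothetical downstream service that fails a fixed number of times."""

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success
        self.calls = 0

    def charge(self, order_id: str) -> str:
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("payment processor unavailable")
        return f"charged:{order_id}"
```

A frontend would wrap each outgoing call, for example `call_with_retries(lambda: processor.charge("o1"))`, and every other service in the chain would need a similar wrapper of its own.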

Slide 6

Slide 6 text

Option 2: Indirect via events (choreography)
Event-driven services
Frontend (App Engine): Order request
Payment Processor (Cloud Run): Authorize & charge CC
Shipper (Cloud Functions): Prepare & ship items
Notifier (Cloud Run): Notify user
Message Broker
Google Cloud: Pub/Sub, Eventarc
AWS: SQS, SNS, EventBridge
Azure: Event Grid, Event Hubs, Service Bus
Other: Kafka, Pulsar, Solace PubSub+, RabbitMQ, NATS...

Slide 7

Slide 7 text

Indirect via events
Pros
➕ Services are loosely coupled
➕ Services can be changed/scaled independently
➕ No single point of failure
➕ Events are useful to extend the system
Cons
➖ Difficult to monitor
➖ Errors / retries / timeouts are hard
➖ The business flow is not captured explicitly
➖ Who ensures the whole transaction is successful?
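
The choreography idea can be sketched with a tiny in-memory stand-in for a message broker. This is an illustration only: `EventBus`, `wire_services`, and the topic names are assumptions, and a real broker such as Pub/Sub delivers events asynchronously rather than through nested function calls.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict], None]


class EventBus:
    """Hypothetical in-memory broker: topics map to subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.log: List[str] = []  # topics in the order they were published

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Dict) -> None:
        self.log.append(topic)
        for handler in list(self._subscribers[topic]):
            handler(event)


def wire_services(bus: EventBus) -> None:
    """Each service only knows the topic it listens to and the one it emits."""

    def payment_processor(event: Dict) -> None:
        bus.publish("payment.charged", {**event, "charged": True})

    def shipper(event: Dict) -> None:
        bus.publish("order.shipped", {**event, "shipped": True})

    def notifier(event: Dict) -> None:
        bus.publish("user.notified", {**event, "notified": True})

    bus.subscribe("order.requested", payment_processor)
    bus.subscribe("payment.charged", shipper)
    bus.subscribe("order.shipped", notifier)
```

Publishing `order.requested` sets off the whole chain, yet no single place in the code states the business flow. It can only be inferred from the subscriptions, which is the third con listed above.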

Slide 8

Slide 8 text

Imagine a more complex scenario

Slide 9

Slide 9 text

Option 3: A central orchestrator
Orchestrated services
Frontend (App Engine): Order request
Payment Processor (Cloud Run): Authorize & charge CC
Shipper (Cloud Functions): Prepare & ship items
Notifier (Cloud Run): Notify user
Orchestrator
Google Cloud: Workflows, Cloud Composer
AWS: Step Functions
Azure: Logic Apps
Other: CNCF Serverless Workflow, Apache Airflow, Camel, Camunda…

Slide 10

Slide 10 text

A central orchestrator
Pros
➕ Business flow captured centrally, source controlled, versioned etc.
➕ Each step can be monitored
➕ Errors / retries / timeouts can be centralized
➕ Services are still independent
Cons
➖ A new orchestrator service to learn and maintain
➖ Orchestrator could be a single point of failure
➖ Loss of eventing flexibility
➖ How do you compensate for failed steps? (hint: saga pattern)
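
As a rough illustration of "business flow captured centrally" and "each step can be monitored", the sketch below declares the order flow as data and lets one function run it. `run_flow`, `ORDER_FLOW`, and the service functions are hypothetical stand-ins, not the API of Workflows or any other orchestrator.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict], Dict]]


def run_flow(steps: List[Step], order: Dict) -> Tuple[Dict, List[Tuple[str, str]]]:
    """Run steps in order; return the final state and a per-step status log."""
    state = dict(order)
    status: List[Tuple[str, str]] = []
    for name, service in steps:
        try:
            state = service(state)
            status.append((name, "ok"))
        except Exception as exc:
            # Error handling lives in one place instead of in every service.
            status.append((name, f"failed: {exc}"))
            break
    return state, status


def charge(state: Dict) -> Dict:
    return {**state, "charged": True}


def ship(state: Dict) -> Dict:
    if not state.get("in_stock", True):
        raise RuntimeError("out of stock")
    return {**state, "shipped": True}


def notify(state: Dict) -> Dict:
    return {**state, "notified": True}


# The whole business flow is visible, and versionable, in one declaration.
ORDER_FLOW: List[Step] = [
    ("payment", charge),
    ("shipping", ship),
    ("notification", notify),
]
```

Calling `run_flow(ORDER_FLOW, {"order_id": "o1"})` returns the final state together with a status entry per step. The services themselves stay unaware of each other.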

Slide 11

Slide 11 text

Patterns and best practices

Slide 12

Slide 12 text

Make a conscious choice
Indirect Events
➔ Services are not closely related
➔ Services can run in parallel or in no particular order
➔ Services can exist in different bounded contexts
Central Orchestration
➔ Services are closely related
➔ Services are usually deployed and executed in the same order
➔ Can you describe the architecture in a flow chart?
Direct Calls
➔ A simple architecture with a handful of services

Slide 13

Slide 13 text

Event-driven orchestration
github.com/GoogleCloudPlatform/eventarc-samples/tree/main/processing-pipelines/image-v3

Slide 14

Slide 14 text

Handle errors with retries and saga pattern
github.com/GoogleCloudPlatform/workflows-demos/tree/master/retries-and-saga
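
The compensation half of the saga pattern can be sketched as follows. This is not the code from the linked demo: `run_saga` and `build_order_saga` are hypothetical names, and in practice each action would typically be retried a few times before compensation kicks in.

```python
from typing import Callable, List, Tuple

Action = Callable[[], None]


def run_saga(steps: List[Tuple[Action, Action]]) -> bool:
    """Run (action, compensation) pairs; on failure, undo completed steps in reverse."""
    completed: List[Action] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


def build_order_saga(log: List[str], shipping_fails: bool) -> List[Tuple[Action, Action]]:
    """Hypothetical order flow that records what happened in `log`."""

    def charge() -> None:
        log.append("charge")

    def refund() -> None:
        log.append("refund")

    def ship() -> None:
        if shipping_fails:
            raise RuntimeError("shipper unavailable")
        log.append("ship")

    def cancel_shipment() -> None:
        log.append("cancel shipment")

    return [(charge, refund), (ship, cancel_shipment)]
```

When shipping fails after the card was charged, the saga issues the refund, so the overall transaction ends in a consistent state even though it did not succeed.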

Slide 15

Slide 15 text

Wait for HTTP/event callbacks instead of polling
github.com/GoogleCloudPlatform/workflows-demos/tree/master/callback-translation
github.com/GoogleCloudPlatform/workflows-demos/tree/master/callback-event
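
The difference from polling can be illustrated in-process with `asyncio`: the workflow suspends on a future and resumes only when the callback arrives, instead of repeatedly asking whether the work is done. `CallbackRegistry`, the callback id, and the timer that simulates the external caller are all assumptions made for this sketch.

```python
import asyncio
from typing import Dict


class CallbackRegistry:
    """Hypothetical in-process stand-in for a callback endpoint."""

    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}

    def create(self, callback_id: str) -> asyncio.Future:
        # Must be called from inside a running event loop.
        future = asyncio.get_running_loop().create_future()
        self._pending[callback_id] = future
        return future

    def deliver(self, callback_id: str, payload: str) -> None:
        # Called when the external party invokes the callback.
        self._pending.pop(callback_id).set_result(payload)


async def workflow(registry: CallbackRegistry) -> str:
    waiting = registry.create("translation-1")
    # An external service would normally call back over HTTP or with an event;
    # here a timer simulates that call arriving a little later.
    asyncio.get_running_loop().call_later(
        0.01, registry.deliver, "translation-1", "Hola"
    )
    # No polling loop: execution is suspended until the callback fires
    # or the timeout expires.
    return await asyncio.wait_for(waiting, timeout=1.0)
```

Running `asyncio.run(workflow(CallbackRegistry()))` returns the delivered payload once the simulated callback fires.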

Slide 16

Slide 16 text

Parallelize when you can
github.com/GoogleCloudPlatform/workflows-demos/tree/master/bigquery-parallel
Orchestration usually involves steps that run sequentially, one after another. Try to parallelize those steps when you can.
Example: running BigQuery jobs against the Wikipedia dataset with Workflows:
● Serial: 5 queries run sequentially each 20 seconds: Total 1 min
● Parallel: 5 queries run in parallel: Total 20 seconds
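
The serial versus parallel comparison can be reproduced in miniature with a thread pool. The sketch below is an assumption-laden stand-in, not the linked demo: `run_query` just sleeps to imitate an I/O-bound query, and no BigQuery client is involved.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def run_query(name: str, seconds: float) -> str:
    """Hypothetical stand-in for a remote query: wait, then report completion."""
    time.sleep(seconds)
    return f"{name}: done"


def run_serial(names: List[str], seconds: float) -> List[str]:
    return [run_query(name, seconds) for name in names]


def run_parallel(names: List[str], seconds: float) -> List[str]:
    # One worker per query, so all of them wait at the same time.
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return list(pool.map(lambda name: run_query(name, seconds), names))


def timed(fn: Callable[..., List[str]], *args) -> Tuple[List[str], float]:
    """Return the function result together with its wall-clock duration."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

With five queries, `timed(run_serial, names, 0.05)` takes roughly five times as long as `timed(run_parallel, names, 0.05)` while both return the same results in the same order.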

Slide 17

Slide 17 text

Combine serverful workloads with serverless orchestration
Sometimes you can’t use serverless due to some limitation (time, memory, CPU).
Instead, you use a Virtual Machine (VM) with the configuration you need.
Automate the VM lifecycle with an orchestrator to have a serverless experience.
github.com/GoogleCloudPlatform/workflows-demos/tree/master/long-running-container

Slide 18

Slide 18 text

Manage long-running batch jobs with serverless orchestration
github.com/GoogleCloudPlatform/workflows-demos/tree/master/screenshot-jobs
github.com/GoogleCloudPlatform/batch-samples/tree/main/primegen

Slide 19

Slide 19 text

Use GitOps to manage orchestration lifecycle
github.com/GoogleCloudPlatform/workflows-demos/tree/master/gitops

Slide 20

Slide 20 text

Plan for multi-environment deployments
github.com/GoogleCloudPlatform/workflows-demos/tree/master/multi-env-deployment

Slide 21

Slide 21 text

Thank you!
Mete Atamel
Developer Advocate at Google
@meteatamel
atamel.dev
speakerdeck.com/meteatamel
Feedback? bit.ly/atamel