Google Cloud Workflows: API automation, patterns and best practices

Google Cloud Workflows: orchestrate & automate API services with serverless workflows

- Workflows at a glance, benefits, key features, use cases
- User interface in the Google Cloud console
- Deep dive into the Workflows syntax
- Workflows connectors
- Demos
- Patterns and best practices

Guillaume Laforge

February 01, 2023

Transcript

  1. Proprietary + Confidential. Workflows: orchestrate & automate API services with serverless workflows

    Guillaume Laforge (@glaforge), Developer Advocate, Google Cloud. February 2023.
  2. Workflows: at a glance

    Workflows orchestrates and integrates serverless compute, Google APIs, external APIs, SaaS APIs, private APIs, other clouds, etc.
  3. Workflows: benefits

    • Orchestrate work across any services & APIs you use: easy-to-use workflow orchestration managing the work across Google Cloud products or any HTTP-based APIs, including SaaS or private APIs.
    • Serverless scalability and managed infrastructure: focus on modeling your workflow logic and let Workflows completely manage the infrastructure with rapid scaling.
    • Pay-per-use pricing model: pay only when your workflows run, and scale your costs down to zero during times of inactivity.
  4. Workflows: features

    • Workflow definition and visualisation: define workflows with a YAML or JSON syntax. Visual representation of your workflows.
    • Built-in decisions and conditional step executions: expression formulas supporting decision points, conditional step executions, and operations on variables.
    • Passing variable values between workflow steps: passing information between steps with built-in JSON parsing and expression-based variable manipulations.
  5. Workflows: features

    • Reliable workflow execution: execute workflows with the reliability required for enterprise and line-of-business applications.
    • Low latency of execution: fast scheduling of workflow executions and transitions between tasks. Predictable performance with no cold starts.
  6. Workflows: features

    • Built-in authentication for Google Cloud products: orchestrate work of any Google Cloud product without worrying about authentication. Use a proper service account and let Workflows do the rest.
    • Support for external API calls: out-of-the-box support for calls to API endpoints outside of Google Cloud.
    • Built-in error handling: out-of-the-box error handling for your workflow steps with configurable retry policies.
  7. Workflows: use cases

    • Reliable transactions
    • Low-latency, conditional processes with 3rd party integration
    • IT infrastructure automation
  8. gcloud commands

    Deploy and execute a workflow. Inspect the result of the execution of a workflow.

    # Deploy a workflow
    gcloud workflows deploy my-workflow \
        --source=workflow.yaml

    # Execute a workflow
    gcloud workflows execute my-workflow

    # See the result
    gcloud workflows executions \
        describe <your-execution-id> \
        --workflow my-workflow
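
As a rough illustration of what the `workflow.yaml` file passed to `--source` might contain, here is a minimal sketch. The step name `sayHello` and the returned message are placeholders, not taken from the talk.

```yaml
# workflow.yaml: hypothetical minimal definition for the deploy command above
main:
  steps:
    - sayHello:
        # The value returned here ends the workflow and becomes its result
        return: "Hello from Workflows"
```

Once deployed and executed, the returned string should show up in the output of `gcloud workflows executions describe`.
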
  9. Sequences of steps

    Services involved: Payment Processor (Cloud Run): authorize & charge CC. Shipper (Cloud Functions): prepare & ship items. Notifier (Cloud Run): notify user.

    - processPayment:
        call: http.post
        args:
          url: https://payment-processor.run.app/...
          body:
            input: ${paymentDetails}
        result: processResult
    - shipItems:
        call: http.post
        args:
          url: https://.../cloudfunctions.net/ship
          body:
            address: ${processResult.body.address}
        result: shipResult
    - notifyUser:
        call: http.post
        ...
  10. Variable passing & JSON parsing

    Same workflow definition as slide 9. The result of a step is stored in a variable (result: processResult) and its parsed JSON body is read in a later step (${processResult.body.address}).
  11. Calling HTTP APIs

    Same workflow definition as slide 9. Each step calls an HTTP endpoint with call: http.post and a url argument.
  12. Authentication (OAuth2 | OIDC)

    - processPayment:
        call: http.post
        args:
          url: https://payment-processor.run.app/...
          body:
            input: ${paymentDetails}
          auth:
            type: OIDC
        result: processResult
    ...
  13. Pause

    - pause:
        call: sys.sleep
        args:
          seconds: 60
  14. Logging

    - log-processed:
        call: sys.log
        args:
          text: "Payment processed"
          severity: INFO
  15. Built-in functions

    • base64: encode, decode
    • text: encode, decode, find_all, find_all_regex, match_regex, replace_all, replace_all_regex, split, substring, to_lower, to_upper, url_encode, url_encode_plus
    • http: get, post, put, patch, delete, request, default_retry_*
    • sys: get_env, sleep, sleep_until, now, log
    • time: format, parse
    • retry: always, default_backoff, never
    • errors: type_error, value_error, index_error, key_error, not_implemented_error, recursion_error, zero_division_error, system_error, timeout_error, resource_limit_error
    • json: encode, encode_to_string, decode
    • events: await_callback, create_callback_endpoint
    • math: abs, max, min
    • list: concat
    • map: get
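
To make the list more concrete, the sketch below combines a few of these functions in a single assign step. The step and variable names (`useBuiltins`, `shout`, `timestamp`, `record`, `payload`) are invented for illustration.

```yaml
- useBuiltins:
    assign:
      # text.to_upper returns an upper-cased copy of the string
      - shout: ${text.to_upper("payment processed")}
      # sys.now gives the current Unix time; time.format turns it into a timestamp string
      - timestamp: ${time.format(sys.now())}
      # Build a small map, then serialize it to a JSON string
      - record:
          status: ${shout}
          at: ${timestamp}
      - payload: ${json.encode_to_string(record)}
```
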
  16. Error handling, conditionals, jumps

    - processPayment:
        try:
          call: http.post
          args:
            url: https://payment-processor.run.app/...
            body:
              input: ${paymentDetails}
          result: processResult
        except:
          as: e
          steps:
            - known_errors:
                switch:
                  - condition: ${not("HttpError" in e.tags)}
                    next: connectionError
                  - condition: ${e.code == 404}
                    return: "Sorry, URL wasn't found."
            - unhandled_exception:
                raise: ${e}
  17. Retry & backoff

    Retry at most 5 times, with an exponential backoff between attempts.

    - processPayment:
        try:
          call: http.post
          args:
            url: https://payment-processor.run.app/...
            body:
              input: ${paymentDetails}
          result: processResult
        retry:
          max_retries: 5
          backoff:
            initial_delay: 1
            max_delay: 60
            multiplier: 2
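
The retry block on this slide does not name a predicate. A possible variant, assuming you only want to retry the errors that the standard library treats as transient, sets it explicitly with `http.default_retry_predicate` (one of the `default_retry_*` helpers from the built-in functions list). The URL below is a placeholder.

```yaml
- processPayment:
    try:
      call: http.post
      args:
        # Hypothetical endpoint, for illustration only
        url: https://payment.example.com/charge
        body:
          input: ${paymentDetails}
      result: processResult
    retry:
      # Only errors accepted by this predicate are retried
      predicate: ${http.default_retry_predicate}
      max_retries: 5
      backoff:
        initial_delay: 1   # seconds before the first retry
        max_delay: 60      # upper bound on any single wait
        multiplier: 2      # each wait is twice the previous one
```

With these values the waits between attempts would be expected to grow roughly as 1, 2, 4, 8 and 16 seconds, staying below the `max_delay` cap.
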
  18. Subworkflows

    Like programming language subroutines or functions.

    main:
      steps:
        - call_fullname:
            call: get_fullname
            args:
              first_name: "Sherlock"
              last_name: "Holmes"
            result: output
        - return_message:
            return: ${output}

    get_fullname:
      params: [first_name, last_name]
      steps:
        - prepMessage:
            return: ${first_name + " " + last_name}
  19. Iterations

    Loop over lists, maps, and over a range of values. Optional index. Use break and continue to short circuit the loop, with a next jump.

    - init:
        assign:
          - results: {}
          - tables:
              - 202202h
              - 202203h
              - 202204h
              - 202205h
    - runQueries:
        for:
          value: table
          index: table_idx
          in: ${tables}
          steps:
            - runQuery:
                call: googleapis.bigquery.v2.jobs.query
                args: …
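
The slide's code only covers iterating over a list. A small sketch of the other two points, looping over a range and jumping with `next: continue` and `next: break`, could look like this. All step and variable names here are made up for the example.

```yaml
- init:
    assign:
      - total: 0
- sumRange:
    for:
      value: i
      range: [1, 10]   # both bounds are inclusive
      steps:
        - skipOdd:
            switch:
              # Odd values: jump straight to the next iteration
              - condition: ${i % 2 == 1}
                next: continue
        - stopEarly:
            switch:
              # Leave the loop as soon as an even value exceeds 6
              - condition: ${i > 6}
                next: break
        - add:
            assign:
              - total: ${total + i}
- done:
    return: ${total}
```

Only 2, 4 and 6 are added before the loop breaks at 8, so this sketch should return 12.
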
  20. Parallel iterations

    Add a parallel block to parallelize iterations. Specify shared to list the variables that can be accessed with write-access in parallel.

    - init:
        assign:
          - results: {}
          - tables:
              - 202202h
              - 202203h
              - 202204h
              - 202205h
    - runQueries:
        parallel:
          shared: [results]
          for:
            value: table
            index: table_idx
            in: ${tables}
            steps:
              - runQuery:
                  call: googleapis.bigquery.v2.jobs.query
                  args: …
  21. Parallel steps

    Branches of steps can be run in parallel, with the parallel keyword, and variables can be shared with write access with the shared keyword.

    - parallelSteps:
        parallel:
          shared: [metadataResp, indexResp, collageResp]
          branches:
            - storeMetadataBranch:
                steps:
                  - storeMetadata:
                      call: http.request
                      args: …
                      result: metadataResp
            - indexPictureMetadataBranch:
                steps:
                  - indexPictureMetadata:
                      call: http.post
                      args: …
                      result: indexResp
            - collageCallBranch:
                steps:
                  - collageCall:
                      call: http.get
                      args: …
                      result: collageResp
  22. Connectors

    Simplifies access to Google Cloud products within a workflow:
    • No need to tweak the URLs to call, or specify authentication
    • Transparent handling of errors and retries (improves reliability and service SLA through retries)
    • Handles long-running operations (transparent polling till result is ready, using a backoff)
  23. Connectors

    Make it easier to work with certain Google Cloud services and APIs, like transparently waiting for long-running operations.

    - runQuery:
        call: googleapis.bigquery.v2.jobs.query
        args:
          projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          body:
            useLegacySql: false
            useQueryCache: false
            timeoutMs: 30000
            # Find top 100 titles with most views on Wikipedia
            query: ${ "SELECT TITLE, SUM(views) FROM `bigquery-samples.wikipedia_pageviews."+table+"` WHERE LENGTH(TITLE) > 10 GROUP BY TITLE ORDER BY SUM(VIEWS) DESC LIMIT 100" }
        result: queryResult
  24. Connectors: currently available, but more to come…

    BigQuery, BigQuery Data Transfer (beta), Cloud Build, Cloud Functions, Cloud Resource Manager (beta), Cloud Scheduler, Cloud Tasks, Compute, Container, Dataflow, Document AI, Firestore, Google Forms (beta), Integrations (beta), Natural Language, Machine Learning (beta), Pub/Sub, Cloud Run, Secret Manager, Google Sheets (beta), Spanner, SQL Admin, Cloud Storage, Cloud Storage Transfer, Translate, Workflows Executions, Workflows
  25. Stopping a Compute Engine VM without connectors

    Without connector: need to poll till the VM is really stopped.

    - stop_machine:
        try:
          call: http.post
          args:
            url: ${"https://compute.googleapis.com/compute/v1/projects/"+project+"/zones/"+zone+"/instances/"+instanceName+"/stop"}
            auth:
              type: OAuth2
          result: stop_resp
        retry: ${http.default_retry}
    - check_status:
        try:
          steps:
            - sleep:
                call: sys.sleep
                args:
                  seconds: ${polling_delay}
            - adjust_delay:
                assign:
                  - polling_delay: ${polling_delay * multiplier}
            - poll_status:
                call: http.get
                args:
                  url: ${stop_resp.body.selfLink}
                  auth:
                    type: OAuth2
                result: status_resp
            - compare:
                switch:
                  - condition: ${status_resp.body.status == "DONE"}
                    next: successfully_stopped
                  - condition: ${status_resp.body.status == "RUNNING" or status_resp.body.status == "PENDING"}
                    next: poll_status
                  - condition: ${"error" in status_resp.body}
                    next: failed
        retry: ${http.default_retry}
    - successfully_stopped:
        return: "VM instance successfully stopped!"
    - failed:
        return: ${status_resp.body.error}
  26. Stopping a Compute Engine VM with the connector

    With connector: no need to poll, the connector waits for the end of a “long running operation”.

    - stop_machine:
        call: googleapis.compute.v1.instances.stop
        args:
          instance: ${instanceName}
          project: ${project}
          zone: ${zone}
          # Optional connector parameters
          connector_params:
            timeout: 100
            polling_policy:
              initial_delay: 1
              multiplier: 1.25
  27. Workflows in action!

    • How to invoke a workflow execution programmatically
      ◦ Using the multi-language client libraries
    • How to schedule a workflow execution
      ◦ Thanks to Cloud Scheduler
    • How to use functions to compensate for the limited expressiveness of the syntax
      ◦ Taking advantage of Cloud Functions for logic impossible with Workflows
    • How to access secrets from Secret Manager
      ◦ Avoid hard-coding secrets
    • How to send a Pub/Sub message
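
The last two demo topics can be sketched with connectors. The following is not the code shown in the talk, only a plausible outline: the secret name `my-secret` and the topic name `my-topic` are hypothetical and would need to exist in your project.

```yaml
main:
  steps:
    - init:
        assign:
          - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
    - getSecret:
        # Reads the secret payload as a string instead of hard-coding it
        call: googleapis.secretmanager.v1.projects.secrets.versions.accessString
        args:
          secret_id: my-secret        # hypothetical secret name
          project_id: ${project}
        result: secretValue
    - publishMessage:
        call: googleapis.pubsub.v1.projects.topics.publish
        args:
          topic: ${"projects/" + project + "/topics/my-topic"}   # hypothetical topic
          body:
            messages:
              # Pub/Sub expects the message data to be base64-encoded
              - data: ${base64.encode(text.encode("Hello from Workflows"))}
        result: publishResult
```

The workflow's service account would need permission to access the secret and to publish to the topic.
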
  28. More examples of Workflows

    • Plenty of demos & samples! https://github.com/GoogleCloudPlatform/workflows-demos
    • Syntax cheat sheet: https://cloud.google.com/workflows/docs/reference/syntax/syntax-cheat-sheet
    • A “smart” expense report workflow: https://github.com/GoogleCloudPlatform/smart-expenses
  29. Concrete examples from the community

    Márton Kodok (Google Developer Expert) shared concrete use cases with Workflows: https://martonkodok.medium.com/
    • Automate the execution of BigQuery queries with Cloud Workflows
    • Firestore backups the easy way with Cloud Workflows
    • Run shell commands and orchestrate Compute Engine VMs with Cloud Workflows
    • Using Cloud Workflows to load Cloud Storage files into BigQuery
  30. Make a conscious choice

    Direct calls
    ➔ A simple architecture with a handful of services that do not change that often

    Event-driven architecture
    ➔ Services are not closely related
    ➔ Services are not executed in parallel or in no certain order
    ➔ Services can exist in different bounded contexts

    Central orchestrator
    ➔ Services are closely related
    ➔ Services are usually deployed and executed in the same order
    ➔ Can you describe the architecture in a flow chart?
  31. Handle errors with retries and saga pattern

    github.com/GoogleCloudPlatform/workflows-demos/tree/master/retries-and-saga
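
As a hedged illustration of the idea (not the code from the linked demo), a saga can be approximated with try/retry/except: transient failures are retried first, and if the step still fails, a compensating step undoes the earlier work. The URLs and the `chargeResult.body.id` field are assumptions made for the example.

```yaml
- chargeCustomer:
    call: http.post
    args:
      url: https://payment.example.com/charge     # hypothetical service
    result: chargeResult
- shipItems:
    try:
      call: http.post
      args:
        url: https://shipping.example.com/ship    # hypothetical service
      result: shipResult
    # First line of defense: retry transient HTTP failures
    retry: ${http.default_retry}
    except:
      as: e
      steps:
        # Compensation: undo the charge made in the earlier step
        - refundCustomer:
            call: http.post
            args:
              url: https://payment.example.com/refund
              body:
                chargeId: ${chargeResult.body.id}
        # Surface the original error once the rollback is done
        - reRaise:
            raise: ${e}
```
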
  32. Wait for HTTP / event callbacks instead of polling

    github.com/GoogleCloudPlatform/workflows-demos/tree/master/callback-translation
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/callback-event
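
A minimal sketch of the callback approach, using the `events` functions from the built-in functions list, might look like the following. The step names and the one-hour timeout are arbitrary choices, and the linked demos are more complete.

```yaml
- createCallback:
    # Creates an endpoint that an external caller can hit later
    call: events.create_callback_endpoint
    args:
      http_callback_method: "POST"
    result: callbackDetails
- shareCallbackUrl:
    # Here the URL is only logged; in practice it is handed to whoever must call back
    call: sys.log
    args:
      text: ${"Waiting for a POST on " + callbackDetails.url}
- awaitCallback:
    # The execution pauses here instead of polling
    call: events.await_callback
    args:
      callback: ${callbackDetails}
      timeout: 3600   # seconds
    result: callbackRequest
- done:
    return: ${callbackRequest.http_request.body}
```
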
  33. Parallelize when you can

    Orchestration usually involves steps run sequentially, one after another. Try to parallelize those steps when you can.

    Example: running BigQuery jobs against the Wikipedia dataset with Workflows:
    • Serial: 5 queries of about 20 seconds each, run one after another: 1 min 40 s in total
    • Parallel: the same 5 queries run in parallel: 20 seconds in total

    github.com/GoogleCloudPlatform/workflows-demos/tree/master/bigquery-parallel
  34. Combine serverful workloads with serverless orchestration

    Sometimes you can’t use serverless due to some limitation (time, memory, CPU). Instead, you use a virtual machine (VM) with the configuration you need. Automate the VM lifecycle with an orchestrator to get a serverless experience.

    github.com/GoogleCloudPlatform/workflows-demos/tree/master/long-running-container
  35. Manage long running batch jobs with serverless orchestration

    github.com/GoogleCloudPlatform/workflows-demos/tree/master/screenshot-jobs
    github.com/GoogleCloudPlatform/batch-samples/tree/main/primegen