Google Cloud Workflows: API automation, patterns and best practices

Google Cloud Workflows: orchestrate & automate API services with serverless workflows

- Workflows at a glance, benefits, key features, use cases
- User interface in the Google Cloud console
- Deep dive into the Workflows syntax
- Workflows connectors
- Demos
- Patterns and best practices

Guillaume Laforge

February 01, 2023

Transcript

  1. Proprietary + Confidential
    Workflows
    Orchestrate & automate API services with serverless workflows
    Guillaume Laforge - @glaforge
    Developer Advocate, Google Cloud
    February 2023

  2. Workflows at a glance, benefits, and key features

  3. Workflows - At a glance
    Workflows - orchestrate & integrate
    [Diagram: Workflows connects serverless compute, Google APIs, external APIs, SaaS APIs, private APIs, other clouds, etc.]

  4. Workflows - Benefits
    Orchestrate work across any services & APIs you use
    Easy-to-use workflow orchestration managing the work across Google Cloud products or any HTTP-based APIs, including SaaS or private APIs.
    Serverless scalability and managed infrastructure
    Focus on modeling your workflow logic and let Workflows completely manage the infrastructure with rapid scaling.
    Pay-per-use pricing model
    Pay only if your workflows run: scale your costs down to zero during times of inactivity.

  5. Workflows - Features
    Workflow definition and visualisation
    Define workflows with a YAML or JSON syntax. Visual representation of your workflows.
    Built-in decisions and conditional step executions
    Expression formulas supporting decision points, conditional step executions, and operations on variables.
    Passing variable values between workflow steps
    Passing information between steps with built-in JSON parsing and expression-based variable manipulations.

  6. Workflows - Features
    Reliable workflow execution
    Execute workflows with the reliability required for enterprise and line-of-business applications.
    Low latency of execution
    Fast scheduling of workflow executions and transitions between tasks. Predictable performance with no cold starts.

  7. Workflows - Features
    Built-in authentication for Google Cloud products
    Orchestrate work of any Google Cloud product without worrying about authentication. Use a proper service account and let Workflows do the rest.
    Support for external API calls
    Out-of-the-box support for calls to API endpoints outside of Google Cloud.
    Built-in error handling
    Out-of-the-box error handling for your workflow steps with configurable retry policies.

  8. Workflows - Use cases
    Reliable transactions
    Low-latency, conditional processes with 3rd party integration
    IT infrastructure automation

  9. Workflows user interface

  16. gcloud commands
    Deploy and execute a workflow.
    Inspect the result of the execution of a workflow.
    # Deploy a workflow
    gcloud workflows deploy my-workflow \
      --source=workflow.yaml
    # Execute a workflow (the command prints the ID of the new execution)
    gcloud workflows execute my-workflow
    # See the result: EXECUTION_ID is the ID printed by the previous command
    gcloud workflows executions describe EXECUTION_ID \
      --workflow=my-workflow

  17. Workflows syntax deep dive

  18. Sequences of steps
    [Diagram: Payment Processor (Cloud Run), authorize & charge CC; Shipper (Cloud Functions), prepare & ship items; Notifier (Cloud Run), notify user]
    - processPayment:
        call: http.post
        args:
          url: https://payment-processor.run.app/...
          body:
            input: ${paymentDetails}
        result: processResult
    - shipItems:
        call: http.post
        args:
          url: https://.../cloudfunctions.net/ship
          body:
            address: ${processResult.body.address}
        result: shipResult
    - notifyUser:
        call: http.post
        ...

  19. Variable passing & JSON parsing
    [Diagram: Payment Processor (Cloud Run), authorize & charge CC; Shipper (Cloud Functions), prepare & ship items; Notifier (Cloud Run), notify user]
    - processPayment:
        call: http.post
        args:
          url: https://payment-processor.run.app/...
          body:
            input: ${paymentDetails}
        result: processResult
    - shipItems:
        call: http.post
        args:
          url: https://.../cloudfunctions.net/ship
          body:
            address: ${processResult.body.address}
        result: shipResult
    - notifyUser:
        call: http.post
        ...

  20. Calling HTTP APIs
    [Diagram: Payment Processor (Cloud Run), authorize & charge CC; Shipper (Cloud Functions), prepare & ship items; Notifier (Cloud Run), notify user]
    - processPayment:
        call: http.post
        args:
          url: https://payment-processor.run.app/...
          body:
            input: ${paymentDetails}
        result: processResult
    - shipItems:
        call: http.post
        args:
          url: https://.../cloudfunctions.net/ship
          body:
            address: ${processResult.body.address}
        result: shipResult
    - notifyUser:
        call: http.post
        ...

  21. Authentication (OAuth2 | OIDC)
    [Diagram: the payment, shipping and notification services, with an "AUTHENTICATION" marker]
    - processPayment:
        call: http.post
        args:
          url: https://payment-processor.run.app/...
          body:
            input: ${paymentDetails}
          auth:
            type: OIDC
        result: processResult
    ...

  22. Pause
    [Diagram: the payment, shipping and notification services, with a "WAIT" marker]
    - pause:
        call: sys.sleep
        args:
          seconds: 60

  23. Logging
    [Diagram: the payment, shipping and notification services, with a "LOG" marker]
    - log-processed:
        call: sys.log
        args:
          text: "Payment processed"
          severity: INFO

  24. Built-in functions
    base64: encode, decode
    text: encode, decode, find_all, find_all_regex, match_regex, replace_all, replace_all_regex, split, substring, to_lower, to_upper, url_encode, url_encode_plus
    http: get, post, put, patch, delete, request, default_retry_*
    sys: get_env, sleep, sleep_until, now, log
    time: format, parse
    retry: always, default_backoff, never
    errors: type_error, value_error, index_error, key_error, not_implemented_error, recursion_error, zero_division_error, system_error, timeout_error, resource_limit_error
    json: encode, encode_to_string, decode
    events: await_callback, create_callback_endpoint
    math: abs, max, min
    list: concat
    map: get

  25. Error handling, conditionals, jumps
    [Diagram: the payment, shipping and notification services, with an "ERROR CHECKING" marker]
    - processPayment:
        try:
          call: http.post
          args:
            url: https://payment-processor.run.app/...
            body:
              input: ${paymentDetails}
          result: processResult
        except:
          as: e
          steps:
            - known_errors:
                switch:
                  - condition: ${not("HttpError" in e.tags)}
                    next: connectionError
                  - condition: ${e.code == 404}
                    return: "Sorry, URL wasn't found."
            - unhandled_exception:
                raise: ${e}

  26. Retry & backoff
    [Diagram: the payment, shipping and notification services, with "MAX: 5 times" and "BACKOFF" markers]
    - processPayment:
        try:
          call: http.post
          args:
            url: https://payment-processor.run.app/...
            body:
              input: ${paymentDetails}
          result: processResult
        retry:
          max_retries: 5
          backoff:
            initial_delay: 1
            max_delay: 60
            multiplier: 2
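
With the policy on this slide (initial_delay 1, multiplier 2, max_delay 60), the waits before the 5 retries would be 1, 2, 4, 8 and 16 seconds. A retry block can also take a predicate that decides which errors are worth retrying. The following is a minimal sketch, not the author's code: the subworkflow name retry_on_503 and the choice of status code are assumptions, and the payment URL stays elided as on the slide.

```yaml
main:
  params: [paymentDetails]
  steps:
    - processPayment:
        try:
          call: http.post
          args:
            url: https://payment-processor.run.app/...
            body:
              input: ${paymentDetails}
          result: processResult
        retry:
          # Hypothetical predicate: only retry on HTTP 503 responses
          predicate: ${retry_on_503}
          max_retries: 5
          backoff:
            initial_delay: 1   # seconds before the first retry
            max_delay: 60      # upper bound for any single delay
            multiplier: 2      # each delay is twice the previous one
    - done:
        return: ${processResult.body}

# Predicate subworkflow: receives the error map, returns true to retry
retry_on_503:
  params: [e]
  steps:
    - check:
        switch:
          - condition: ${not("HttpError" in e.tags)}
            return: false
          - condition: ${e.code == 503}
            return: true
    - otherwise:
        return: false
```

The predicate reuses the e.tags and e.code checks from the error handling slide, so errors that are not HTTP errors are never retried in this sketch.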

  27. Subworkflows
    Like programming language subroutines or functions
    main:
      steps:
        - call_fullname:
            call: get_fullname
            args:
              first_name: "Sherlock"
              last_name: "Holmes"
            result: output
        - return_message:
            return: ${output}
    get_fullname:
      params: [first_name, last_name]
      steps:
        - prepMessage:
            return: ${first_name + " " + last_name}

  28. Iterations
    Loop over lists, maps, and over a range of values.
    Optional index.
    Use break and continue to short-circuit the loop, with a next jump.
    - init:
        assign:
          - results: {}
          - tables:
              - 202202h
              - 202203h
              - 202204h
              - 202205h
    - runQueries:
        for:
          value: table
          index: table_idx
          in: ${tables}
          steps:
            - runQuery:
                call: googleapis.bigquery.v2.jobs.query
                args: …
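
The loop on this slide iterates over a list. Here is a minimal sketch of the other two forms the slide mentions, a numeric range and a map (through its keys), together with the break and continue jumps. All variable names and values are made up for illustration.

```yaml
main:
  steps:
    - init:
        assign:
          - total: 0
          - stock: {"apples": 3, "pears": 0, "plums": 7}
          - in_stock: []
    # Loop over a range of values: both bounds are inclusive
    - sumRange:
        for:
          value: n
          range: [1, 10]
          steps:
            - stopAtSix:
                switch:
                  - condition: ${n == 6}
                    next: break        # leave the loop early
            - add:
                assign:
                  - total: ${total + n}
    # Loop over a map by iterating on its keys
    - listInStock:
        for:
          value: fruit
          in: ${keys(stock)}
          steps:
            - skipEmpty:
                switch:
                  - condition: ${stock[fruit] == 0}
                    next: continue     # skip to the next iteration
            - keep:
                assign:
                  - in_stock: ${list.concat(in_stock, fruit)}
    - done:
        return:
          total: ${total}
          in_stock: ${in_stock}
```

The first loop adds 1 through 5 and then breaks, so total ends at 15. The second loop skips the key whose value is 0.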

  29. Parallel iterations
    Add a parallel block to parallelize iterations.
    Specify shared to list the variables that can be accessed with write access in parallel.
    - init:
        assign:
          - results: {}
          - tables:
              - 202202h
              - 202203h
              - 202204h
              - 202205h
    - runQueries:
        parallel:
          shared: [results]
          for:
            value: table
            index: table_idx
            in: ${tables}
            steps:
              - runQuery:
                  call: googleapis.bigquery.v2.jobs.query
                  args: …
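
The slide elides the loop body. As a sketch of how each parallel iteration could write into the shared results map, the example below swaps the BigQuery call for a plain HTTP call to a hypothetical endpoint (https://example.com/stats); the step names are also assumptions.

```yaml
main:
  steps:
    - init:
        assign:
          - results: {}
          - tables: ["202202h", "202203h", "202204h", "202205h"]
    - runQueries:
        parallel:
          shared: [results]   # variables written inside the loop must be listed here
          for:
            value: table
            in: ${tables}
            steps:
              - fetch:
                  call: http.get
                  args:
                    url: https://example.com/stats   # hypothetical endpoint
                    query:
                      table: ${table}
                  result: resp
              - store:
                  assign:
                    # One entry per table, keyed by the table name
                    - results[table]: ${resp.body}
    - done:
        return: ${results}
```

Each iteration writes under its own key, so the iterations do not overwrite each other's entries.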

  30. Parallel steps
    Branches of steps can be run in parallel, with the parallel keyword, and variables can be shared with write access with the shared keyword.
    - parallelSteps:
        parallel:
          shared: [metadataResp, indexResp, collageResp]
          branches:
            - storeMetadataBranch:
                steps:
                  - storeMetadata:
                      call: http.request
                      args: …
                      result: metadataResp
            - indexPictureMetadataBranch:
                steps:
                  - indexPictureMetadata:
                      call: http.post
                      args: …
                      result: indexResp
            - collageCallBranch:
                steps:
                  - collageCall:
                      call: http.get
                      args: …
                      result: collageResp

  31. Connectors

  32. Connectors
    Connectors simplify access to Google Cloud products within a workflow:
    ● No need to tweak the URLs to call, or specify authentication
    ● Transparent handling of errors and retries (improves reliability and service SLA through retries)
    ● Handles long-running operations (transparent polling until the result is ready, using a backoff)

  33. Connectors
    Make it easier to work with certain Google Cloud services and APIs, like transparently waiting for long-running operations.
    - runQuery:
        call: googleapis.bigquery.v2.jobs.query
        args:
          projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          body:
            useLegacySql: false
            useQueryCache: false
            timeoutMs: 30000
            # Find top 100 titles with most views on Wikipedia
            query: ${
              "SELECT TITLE, SUM(views)
              FROM `bigquery-samples.wikipedia_pageviews."+table+"`
              WHERE LENGTH(TITLE) > 10
              GROUP BY TITLE
              ORDER BY SUM(VIEWS) DESC
              LIMIT 100"
              }
        result: queryResult

  34. Connectors: currently available, but more to come…
    BigQuery
    BigQuery Data Transfer (beta)
    Cloud Build
    Cloud Functions
    Cloud Resource Manager (beta)
    Cloud Scheduler
    Cloud Tasks
    Compute
    Container
    Dataflow
    Document AI
    Firestore
    Google Forms (beta)
    Integrations (beta)
    Natural Language
    Machine Learning (beta)
    Pub/Sub
    Cloud Run
    Secret Manager
    Google Sheets (beta)
    Spanner
    SQL Admin
    Cloud Storage
    Cloud Storage Transfer
    Translate
    Workflows Executions
    Workflows

  35. Without connector
    Stopping a Compute Engine VM without connectors
    Need to poll till the VM is really stopped
    - stop_machine:
        try:
          call: http.post
          args:
            url: ${"https://compute.googleapis.com/compute/v1/projects/"+project+"/zones/"+zone+"/instances/"+instanceName+"/stop"}
            auth:
              type: OAuth2
          result: stop_resp
        retry: ${http.default_retry}
    - check_status:
        try:
          steps:
            - sleep:
                call: sys.sleep
                args:
                  seconds: ${polling_delay}
            - adjust_delay:
                assign:
                  - polling_delay: ${polling_delay * multiplier}
            - poll_status:
                call: http.get
                args:
                  url: ${stop_resp.body.selfLink}
                  auth:
                    type: OAuth2
                result: status_resp
            - compare:
                switch:
                  - condition: ${status_resp.body.status == "DONE"}
                    next: successfully_stopped
                  - condition: ${status_resp.body.status == "RUNNING" or status_resp.body.status == "PENDING"}
                    next: poll_status
                  - condition: ${"error" in status_resp.body}
                    next: failed
        retry: ${http.default_retry}
    - successfully_stopped:
        return: "VM instance successfully stopped!"
    - failed:
        return: ${status_resp.body.error}

  36. With connector
    Stopping a Compute Engine VM with the connector
    No need to poll: the connector waits for the end of a “long running operation”
    - stop_machine:
        call: googleapis.compute.v1.instances.stop
        args:
          instance: ${instanceName}
          project: ${project}
          zone: ${zone}
        # Optional connector parameters
        connector_params:
          timeout: 100
          polling_policy:
            initial_delay: 1
            multiplier: 1.25

  37. Workflows in action!

  38. Workflows in action!
    ● How to invoke a workflow execution programmatically
      ○ Using the multi-language client libraries
    ● How to schedule a workflow execution
      ○ Thanks to Cloud Scheduler
    ● How to use functions to make up for the limited expressiveness of the workflow syntax
      ○ Taking advantage of Cloud Functions for logic impossible with Workflows
    ● How to access secrets from Secret Manager
      ○ Avoid hard-coding secrets
    ● How to send a Pub/Sub message
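
The demos themselves are not part of the transcript. As a rough sketch of the last two items on this slide, the workflow below reads a secret and publishes a Pub/Sub message through the connectors. The secret name my-api-key, the topic name my-topic and the message content are assumptions for illustration.

```yaml
main:
  steps:
    - init:
        assign:
          - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - message:
              status: done             # hypothetical payload
    # Read the latest version of a secret as a string
    - getSecret:
        call: googleapis.secretmanager.v1.projects.secrets.versions.accessString
        args:
          secret_id: my-api-key        # hypothetical secret name
          project_id: ${project}
        result: apiKey
    # Publish a message: Pub/Sub expects base64-encoded data
    - publish:
        call: googleapis.pubsub.v1.projects.topics.publish
        args:
          topic: ${"projects/" + project + "/topics/my-topic"}   # hypothetical topic
          body:
            messages:
              - data: ${base64.encode(json.encode(message))}
        result: publishResult
    - done:
        return: ${publishResult}
```

The secret lands in the apiKey variable and could then be passed to a later call, for example in a request header, rather than being written into the workflow source.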

  39. More examples of Workflows
    Plenty of demos & samples!
    https://github.com/GoogleCloudPlatform/workflows-demos
    Syntax cheat sheet
    https://cloud.google.com/workflows/docs/reference/syntax/syntax-cheat-sheet
    A “smart” expense report workflow
    https://github.com/GoogleCloudPlatform/smart-expenses

  40. Concrete examples from the Community
    Márton Kodok (Google Developer Expert) shared concrete use cases with Workflows
    https://martonkodok.medium.com/
    ● Automate the execution of BigQuery queries with Cloud Workflows
    ● Firestore backups the easy way with Cloud Workflows
    ● Run shell commands and orchestrate Compute Engine VMs with Cloud Workflows
    ● Using Cloud Workflows to load Cloud Storage files into BigQuery

  41. Patterns & best practices

  42. Make a conscious choice
    Direct calls
    ➔ A simple architecture with a handful of services that do not change that often
    Event-driven architecture
    ➔ Services are not closely related
    ➔ Services are not executed in parallel or in no certain order
    ➔ Services can exist in different bounded contexts
    Central orchestrator
    ➔ Services are closely related
    ➔ Services are usually deployed and executed in the same order
    ➔ Can you describe the architecture in a flow chart?

  43. Event-driven orchestration
    github.com/GoogleCloudPlatform/eventarc-samples/tree/main/processing-pipelines/image-v3

  44. Handle errors with retries and saga pattern
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/retries-and-saga
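
The linked sample is not reproduced in the transcript. As a rough sketch of the idea only: retry the step that may fail transiently and, if it still fails, run a compensating call that undoes the earlier step before re-raising the error. The order and payment endpoints on example.com, the step names, and the assumption that the order ID is a string are all hypothetical.

```yaml
main:
  params: [order]
  steps:
    - createOrder:
        call: http.post
        args:
          url: https://example.com/orders           # hypothetical service
          body: ${order}
        result: orderResp
    - chargePayment:
        try:
          call: http.post
          args:
            url: https://example.com/payments       # hypothetical service
            body:
              orderId: ${orderResp.body.id}
          result: paymentResp
        # First line of defense: retry transient failures
        retry: ${http.default_retry}
        except:
          as: e
          steps:
            # Compensation: undo the order created in the first step
            - cancelOrder:
                call: http.delete
                args:
                  url: ${"https://example.com/orders/" + orderResp.body.id}
            - rethrow:
                raise: ${e}
    - done:
        return: ${paymentResp.body}
```

With more steps, each one would get its own compensating call, run in reverse order of the steps that already succeeded.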

  45. Wait for HTTP / event callbacks instead of polling
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/callback-translation
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/callback-event
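
A minimal sketch of the callback pattern, built on the events.create_callback_endpoint and events.await_callback functions from the built-in functions slide. The step names and the one-hour timeout are assumptions; in a real workflow the callback URL would be handed to the external system instead of only being logged.

```yaml
main:
  steps:
    # Create an endpoint that an external system can call back
    - createCallback:
        call: events.create_callback_endpoint
        args:
          http_callback_method: "POST"
        result: callbackDetails
    - logUrl:
        call: sys.log
        args:
          text: ${"Waiting for a callback on " + callbackDetails.url}
          severity: INFO
    # Pause the execution until the endpoint receives a request
    - awaitCallback:
        call: events.await_callback
        args:
          callback: ${callbackDetails}
          timeout: 3600          # seconds to wait before a timeout error
        result: callbackRequest
    - done:
        return: ${callbackRequest.http_request}
```

The execution stays suspended while it waits, so no polling loop with sleep steps is needed.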

  46. Parallelize when you can
    Orchestration usually involves steps run sequentially one after another.
    Try to parallelize those steps when you can.
    Example: running BigQuery jobs against the Wikipedia dataset with Workflows:
    ● Serial: 5 queries run sequentially, 20 seconds each: 5 × 20 s = 1 min 40 s in total
    ● Parallel: 5 queries run in parallel: 20 seconds in total
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/bigquery-parallel

  47. Combine serverful workloads with serverless orchestration
    Sometimes you can’t use serverless due to some limitation (time, memory, CPU)
    Instead you use a Virtual Machine (VM) with the configuration you need
    Automate the VM lifecycle with an orchestrator to have a serverless experience
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/long-running-container

  48. Manage long running batch jobs with serverless orchestration
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/screenshot-jobs
    github.com/GoogleCloudPlatform/batch-samples/tree/main/primegen

  49. Use GitOps to manage orchestration lifecycle
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/gitops

  50. Plan for multi-environment deployments
    github.com/GoogleCloudPlatform/workflows-demos/tree/master/multi-env-deployment

  51. Thanks for your attention!
    Q & A
    Guillaume Laforge - @glaforge
    Developer Advocate, Google Cloud
    February 2023