Dagster & Geomagical

Noah Kantrowitz

February 09, 2021
Transcript

  1. Noah Kantrowitz > @kantrn - coderanger.net > Principal Ops @ Geomagical > Part of the IKEA family > Augmented reality with furniture
  2. Starting Point > Celery & RabbitMQ > Each operation as its own daemon > celery.canvas > Custom DAG compiler
  3. Design Goals > Keeping most of the solid structure > Improved DAG expressiveness > Low fixed overhead, compatible with autoscaling > More detailed tracking and metrics
  4. Dagster > Met all our requirements for structural simplicity > DAG compiler was a bit limited but growing fast > Highly responsive Dagster team > No execution setup that met our needs
  5. But dagster_celery? > Solid and pipeline code commingled > Single runtime environment > Hard to build a workflow around at scale
  6. But dagster_k8s? > Fine for infrequent or non-customer facing tasks > Do not put kube-apiserver in your hot path > No really, I mean it
  7. Autoscaling > KEDA watching RabbitMQ > Zero-scale: only Dagit and gRPC daemons > task_acks_late = True > worker_prefetch_multiplier = 1
  8. Remote Solids > Independent release cycles for each Solid > Can run multiple versions in parallel > Testing in isolation
  9. Proxy Solids

         @celery_solid(queue='repo-something')
         def something(context, item):
             output = yield {
                 'foo': item['bar'],
             }
             item['something'] = output
             yield Output(item)
  10. Workflow > One git repo per Dagster repo > main.py which holds "default" Pipeline > solids.py which defines proxy Solids > Misc other pipelines for testing and development
  11. CI/CD > Briefly, since this is its own rabbit hole > Buildkite > kustomize edit set image > ArgoCD
  12. How It's Going > Happy with overall progress > Still dropping some tasks under load > Plan to move forward looks good
  13. Future Plans > Async execution support > Events from solid workers > Pipeline-level webhooks > Predictive auto-scaling? > K8s Operator?
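
Slide 7 names two Celery settings without showing where they live. A minimal sketch of a configuration module that sets them follows. The module name `celeryconfig.py` is an assumption, and the comments on why each setting suits queue-driven autoscaling are an interpretation rather than something the deck states.

```python
# celeryconfig.py (hypothetical module name; the deck does not show the file)

# Acknowledge a message only after the task has finished running, so work
# held by a worker that disappears mid-task (for example when it is scaled
# down) can be redelivered by the broker instead of being lost.
task_acks_late = True

# Let each worker process reserve at most one message at a time, so pending
# work stays in the RabbitMQ queue where a queue-length-based autoscaler
# can observe it.
worker_prefetch_multiplier = 1
```

With Celery, a module like this is typically loaded with `app.config_from_object("celeryconfig")`.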
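
The proxy solid on slide 9 is written as a generator: it yields a payload, receives a result back at the `yield`, and finally yields an `Output`. The deck does not show how `celery_solid` is implemented, so the following is only a sketch of how a decorator could drive such a generator. `Output`, `fake_remote_call`, and this `celery_solid` are stand-ins defined here for illustration, not Dagster or Celery APIs. A real version would presumably send the payload to the named queue and wait for the worker's result.

```python
import functools
from dataclasses import dataclass
from typing import Any


@dataclass
class Output:
    """Stand-in for dagster.Output so the sketch runs without Dagster."""
    value: Any


def fake_remote_call(queue: str, payload: dict) -> dict:
    """Hypothetical placeholder for dispatching work to a worker on `queue`."""
    return {"queue": queue, "echo": payload}


def celery_solid(queue: str):
    """Hypothetical decorator that drives a generator-style proxy solid."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(context, item):
            gen = fn(context, item)
            yielded = next(gen)  # run the body up to its first yield
            # Treat every non-Output yield as a request for remote work.
            while not isinstance(yielded, Output):
                result = fake_remote_call(queue, yielded)
                yielded = gen.send(result)  # resume the body with the result
            gen.close()
            return yielded
        return wrapper
    return decorator


@celery_solid(queue="repo-something")
def something(context, item):
    output = yield {"foo": item["bar"]}
    item["something"] = output
    yield Output(item)


result = something(None, {"bar": 1})
print(result.value)
```

Writing the proxy as a generator keeps the code that builds the request and the code that merges the response in one function, while the decorator owns the transport.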