Dagster & Geomagical
Noah Kantrowitz
February 09, 2021
Transcript
Geomagical & Dagster
> Dagster Community Meeting
Noah Kantrowitz
> @kantrn - coderanger.net
> Principal Ops @ Geomagical
> Part of the IKEA family
> Augmented reality with furniture
Our Product
Starting Point
> Celery & RabbitMQ
> Each operation as its own daemon
> celery.canvas
> Custom DAG compiler
Design Goals
> Keeping most of the solid structure
> Improved DAG expressiveness
> Low fixed overhead, compatible with autoscaling
> More detailed tracking and metrics
Dagster
> Met all our requirements for structural simplicity
> DAG compiler was a bit limited but growing fast
> Highly responsive team
Dagster
> No execution setup that met our needs
But dagster_celery?
> Solid and pipeline code commingled
> Single runtime environment
> Hard to build a workflow around at scale
But dagster_k8s?
> Fine for infrequent or non-customer facing tasks
> Do not put kube-apiserver in your hot path
> No really, I mean it
Autoscaling
> KEDA watching RabbitMQ
> Zero-scale: only Dagit and gRPC daemons
> task_acks_late = True
> worker_prefetch_multiplier = 1
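The two Celery settings named on the Autoscaling slide can be sketched as a settings module. Only `task_acks_late` and `worker_prefetch_multiplier` come from the talk; the broker URL is a hypothetical placeholder, and the comments give a likely rationale rather than one stated on the slide.

```python
# celeryconfig.py-style settings module (a sketch, not the speaker's actual config).

# Acknowledge a message only after the task has finished. If a worker is
# removed during scale-down while a task is running, the broker still holds
# the unacknowledged message and can redeliver it.
task_acks_late = True

# Each worker process reserves one message at a time, so waiting work stays
# in RabbitMQ (where a queue-length-based autoscaler can see it) instead of
# sitting in a busy worker's local prefetch buffer.
worker_prefetch_multiplier = 1

# Hypothetical placeholder, not from the talk.
broker_url = 'amqp://guest:guest@rabbitmq:5672//'
```

A Celery app would typically load such a module with `app.config_from_object('celeryconfig')`.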
Remote Solids
> Independent release cycles for each Solid
> Can run multiple versions in parallel
> Testing in isolation
Writing A Remote Solid
    app = SolidCelery('repo-something')

    @app.task(bind=True)
    def something(self, foo: str) -> str:
        return f'Hello {foo}'
Proxy Solids
    @celery_solid(queue='repo-something')
    def something(context, item):
        output = yield {
            'foo': item['bar'],
        }
        item['something'] = output
        yield Output(item)
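The proxy solid above is a generator that yields twice: first the keyword arguments for the remote task, then the final Output. A minimal sketch of how a wrapper might drive such a generator is shown here. This is not the dagster_geomagical implementation: `Output` is a stand-in for Dagster's class, `remote_something` stands in for the Celery task, and `drive_proxy` is a hypothetical helper that calls the task in-process instead of going through a queue.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Output:
    """Stand-in for dagster's Output so the sketch runs without dagster."""
    value: Any


def remote_something(foo: str) -> str:
    """Stand-in for the remote Celery task; runs locally here."""
    return f'Hello {foo}'


def something(context, item):
    """Proxy solid body from the slide, without the decorator."""
    output = yield {
        'foo': item['bar'],
    }
    item['something'] = output
    yield Output(item)


def drive_proxy(proxy, remote, context, item):
    """Hypothetical driver: run the proxy generator around one remote call."""
    gen = proxy(context, item)
    kwargs = next(gen)         # first yield: arguments for the remote task
    result = remote(**kwargs)  # a real system would enqueue this and wait
    return gen.send(result)    # resume the proxy; the next yield is the Output


final = drive_proxy(something, remote_something, None, {'bar': 'world'})
print(final.value)
```

The generator form lets the proxy solid reshape its inputs before the remote call and merge the result afterwards, while the wrapper owns the queueing details.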
Workflow
> One git repo per Dagster repo
> main.py which holds "default" Pipeline
> solids.py which defines proxy Solids
> Misc other pipelines for testing and development
CI/CD
Briefly, since this is its own rabbit hole
> Buildkite
> kustomize edit set image
> ArgoCD
Downsides
> Slow cold start
> No feedback during long tasks
> New and exciting bugs
How It's Going
> Happy with overall progress
> Still dropping some tasks at load
> Plan to move forward looks good
Future Plans
> Async execution support
> Events from solid workers
> Pipeline-level webhooks
> Predictive auto-scaling? K8s Operator?
Can I Use This?
> Kinda sorta
> geomagical/dagster_geomagical
Thank You
> Questions?