Dagster & Geomagical
Noah Kantrowitz
February 09, 2021
Programming
Transcript
Geomagical & Dagster
Dagster Community Meeting

Noah Kantrowitz
> @kantrn - coderanger.net
> Principal Ops @ Geomagical
> Part of the IKEA family
> Augmented reality with furniture
Our Product
Starting Point
> Celery & RabbitMQ
> Each operation as its own daemon
> celery.canvas
> Custom DAG compiler
Design Goals
> Keeping most of the solid structure
> Improved DAG expressiveness
> Low fixed overhead, compatible with autoscaling
> More detailed tracking and metrics
Dagster
> Met all our requirements for structural simplicity
> DAG compiler was a bit limited but growing fast
> Highly responsive team
> No execution setup that met our needs
But dagster_celery?
> Solid and pipeline code commingled
> Single runtime environment
> Hard to build a workflow around at scale
But dagster_k8s?
> Fine for infrequent or non-customer facing tasks
> Do not put kube-apiserver in your hot path
> No really, I mean it
Autoscaling
> KEDA watching RabbitMQ
> Zero-scale: only Dagit and gRPC daemons
> task_acks_late = True
> worker_prefetch_multiplier = 1
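The two Celery settings named on the Autoscaling slide can be collected in a plain mapping. This is a minimal sketch, not the configuration used in the talk: the setting names are Celery's, but the dict name `WORKER_SETTINGS` and the helper `apply_settings` are hypothetical, and the comments describe general Celery behavior rather than anything stated on the slide.

```python
# Sketch only: setting names are Celery's; WORKER_SETTINGS and apply_settings
# are hypothetical names introduced for illustration.
WORKER_SETTINGS = {
    # Acknowledge a message after the task has run rather than when it is
    # received, so unfinished work is not acknowledged early.
    "task_acks_late": True,
    # Reserve one message per worker process at a time, so waiting work stays
    # on the queue instead of in a worker's prefetch buffer.
    "worker_prefetch_multiplier": 1,
}


def apply_settings(conf, settings=WORKER_SETTINGS):
    """Copy settings onto any object with an update() method."""
    conf.update(settings)
    return conf


# A plain dict stands in for a Celery app's `conf` object here.
conf = apply_settings({})
print(conf)
```

With a real Celery app, the same mapping could presumably be passed to `app.conf.update(...)`. Keeping unreserved messages on the queue is likely what lets a queue-length autoscaler such as KEDA see the true backlog, though the slide does not spell out that reasoning.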
Remote Solids
> Independent release cycles for each Solid
> Can run multiple versions in parallel
> Testing in isolation
Writing A Remote Solid

    app = SolidCelery('repo-something')

    @app.task(bind=True)
    def something(self, foo: str) -> str:
        return f'Hello {foo}'
Proxy Solids

    @celery_solid(queue='repo-something')
    def something(context, item):
        output = yield {
            'foo': item['bar'],
        }
        item['something'] = output
        yield Output(item)
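The Proxy Solids slide uses a generator: the first `yield` hands the remote task its arguments, the remote result comes back as the value of that `yield`, and the second `yield` emits the output. One way such a generator could be driven is sketched below using only the standard library. Everything except the body of `something` is an assumption: this `celery_solid`, the `Output` stand-in, and the `FAKE_QUEUES` registry are hypothetical and are not the dagster_geomagical or Dagster implementations.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Output:
    """Hypothetical stand-in for Dagster's Output; it only holds a value."""
    value: Any


# Hypothetical registry that replaces real Celery queues with local functions.
FAKE_QUEUES: Dict[str, Callable[..., Any]] = {
    "repo-something": lambda foo: f"Hello {foo}",
}


def celery_solid(queue: str):
    """Hypothetical decorator that drives a generator-based proxy solid."""
    def decorator(fn):
        def run(context, item):
            gen = fn(context, item)
            # First yield: keyword arguments for the remote task.
            kwargs = next(gen)
            # A real version would enqueue on `queue` and wait for the result.
            remote_result = FAKE_QUEUES[queue](**kwargs)
            # Resume the solid with the result; it then yields its Output.
            final = gen.send(remote_result)
            return final.value
        return run
    return decorator


@celery_solid(queue="repo-something")
def something(context, item):
    output = yield {"foo": item["bar"]}
    item["something"] = output
    yield Output(item)


result = something(None, {"bar": "world"})
print(result)
```

The generator form keeps the proxy solid's pre- and post-processing in one function while the decorator owns the dispatch, which is one plausible reason for the design shown on the slide.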
Workflow
> One git repo per Dagster repo
> main.py which holds "default" Pipeline
> solids.py which defines proxy Solids
> Misc other pipelines for testing and development
CI/CD
Briefly, since this is its own rabbit hole
> Buildkite
> kustomize edit set image
> ArgoCD
Downsides
> Slow cold start
> No feedback during long tasks
> New and exciting bugs
How It's Going
> Happy with overall progress
> Still dropping some tasks at load
> Plan to move forward looks good
Future Plans
> Async execution support
> Events from solid workers
> Pipeline-level webhooks
> Predictive auto-scaling?
> K8s Operator?
Can I Use This?
Kinda sorta
geomagical/dagster_geomagical
Thank You
Questions?