Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Tech x Marketing #4 Airflowでもサブワークフロー単位で分割開発したい!
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Naoki Matsuda
August 27, 2020
Programming
0
200
Tech x Marketing #4 Airflowでもサブワークフロー単位で分割開発したい!
Naoki Matsuda
August 27, 2020
Tweet
Share
More Decks by Naoki Matsuda
See All by Naoki Matsuda
[PyCon JP 2019] 新米Pythonistaが贈るAirflow入門&活用事例紹介
matsudan
2
6.9k
Other Decks in Programming
See All in Programming
AHC061解説
shun_pi
0
350
AWS×クラウドネイティブソフトウェア設計 / AWS x Cloud-Native Software Design
nrslib
13
2.7k
エンジニアの「手元の自動化」を加速するn8n 2026.02.27
symy2co
0
110
RAGでハマりがちな"Excelの罠"を、データの構造化で突破する
harumiweb
9
2.6k
猫の手も借りたい!ので AIエージェント猫を作って社内に放した話 Claude Code × Container Lambda の Slack Bot "DevNeko"
naramomi7
0
260
DSPy入門 Pythonで実現する自動プロンプト最適化 〜人手によるプロンプト調整からの卒業〜
seaturt1e
1
610
15年目のiOSアプリを1から作り直す技術
teakun
1
610
AI Assistants for Your Angular Solutions
manfredsteyer
PRO
0
110
AIコーディングの理想と現実 2026 | AI Coding: Expectations vs. Reality 2026
tomohisa
0
1.2k
あなたはユーザーではない #PdENight
kajitack
4
340
S3ストレージクラスの「見える」「ある」「使える」は全部違う ─ 体験から見た、仕様の深淵を覗く
ya_ma23
0
120
The Past, Present, and Future of Enterprise Java
ivargrimstad
0
340
Featured
See All Featured
Context Engineering - Making Every Token Count
addyosmani
9
740
Product Roadmaps are Hard
iamctodd
PRO
55
12k
Agile Leadership in an Agile Organization
kimpetersen
PRO
0
100
Navigating Algorithm Shifts & AI Overviews - #SMXNext
aleyda
1
1.2k
The SEO Collaboration Effect
kristinabergwall1
0
380
Into the Great Unknown - MozCon
thekraken
40
2.3k
Conquering PDFs: document understanding beyond plain text
inesmontani
PRO
4
2.4k
How to Talk to Developers About Accessibility
jct
2
150
Bash Introduction
62gerente
615
210k
Being A Developer After 40
akosma
91
590k
Impact Scores and Hybrid Strategies: The future of link building
tamaranovitovic
0
220
The Director’s Chair: Orchestrating AI for Truly Effective Learning
tmiket
1
120
Transcript
AirflowͰαϒϫʔΫϑϩʔ ୯ҐͰׂ։ൃ͍ͨ͠ʂ Tech x Marketing meetup #4 גࣜձࣾ ి௨σδλϧ দా
थ 2020/08/27
ࣗݾհ - ໊લɿদా थ (·ͭͩ ͳ͓͖) - ॴଐɿגࣜձࣾ ి௨σδλϧ (2018ೖࣾ)
- ۀɿόοΫΤϯυαʔϏεAPI࣮ ETLߏஙͳͲ (༻ݴޠɿGo / Python) - ֶੜ࣌ɿےܹ࣌ͷΤωϧΪʔফඅ/࢈ੜͷݚڀ AirflowͷTalk & Blogࣥචɿ - ৽ถPythonista͕ଃΔAirflowೖ&׆༻ࣄྫհ(PyConJP2019) - AirflowͷλεΫ࣮ߦڥΛ͢Δ(DentsuDigital Tech Blog)
༰ Γ͍ͨ͜ͱ AirflowͷDAG (ϫʔΫϑϩʔ)ΛෳਓͰ։ൃ͍ͨ͠ʂ ݕ౼ͨ͠ํ๏ X SubDagOperatorʢฒྻ࣮ߦͷ੍ޚෆՄɺϚωʔδυαʔϏεͰ༻͕ਪ͞Εͯͳ͍ʣ X TriggerDagRunOperator /
ExternalTaskSensorʢґଘ͕ؔΘ͔ΓͮΒ͘ͳΔʣ ̋ λεΫΛ࡞ΔؔΛ࡞ׂͬͯʢఏҊख๏ʣ
Apache Airflowͱ - PythonͰهड़͞ΕͨϫʔΫϑϩʔ(DAG)ͷ࣮ߦɾࢹπʔϧ - ApacheτοϓϨϕϧϓϩδΣΫτͷͻͱͭ ίϯτϦϏϡʔλɿ1000 + ελʔɿ17000 +
Apache Airflowͱ - DAGɿTaskͷ࣮ߦॱংΛܾఆ͢Δάϥϑ - OperatorɿςϯϓϨʔτԽ͞Ε࣮ͨߦ୯Ґ - Taskɿύϥϝʔλ͕༩͑ΒΕͨOperator Operators (Python,
HTTP, MySQL, KubernetesPod…) A B C D https://www.slideshare.net/potix2_jp/airflow-224004058 DAG - Taskґଘؔɿ >> Ͱఆٛ ྫ) A >> [B, C] >> D Task
Γ͍ͨ͜ͱ AirflowͷDAG (ϫʔΫϑϩʔ)ΛෳਓͰ։ൃ͍ͨ͠ʂ ݕ౼ͨ͠ํ๏ X SubDagOperatorʢฒྻ࣮ߦͷ੍ޚෆՄɺϚωʔδυαʔϏεͰ༻͕ਪ͞Εͯͳ͍ʣ X TriggerDagRunOperator / ExternalTaskSensorʢґଘ͕ؔΘ͔ΓͮΒ͘ͳΔʣ
̋ λεΫΛ࡞ΔؔΛ࡞ׂͬͯʢఏҊख๏ʣ ͷഎܠ ༰
DWHͷσʔλΛՃ͢ Δϑϩʔ εϓϨουγʔτ͔Β σʔλΛऔಘͯ͠Ճ͢ Δϑϩʔ ՃσʔλΛಥ߹ͯ͠ σʔλϚʔτΛߋ৽͢Δ ϑϩʔ σʔλϚʔτΛߋ৽͢ΔDAG DAGΛෳਓͰ։ൃ͍ͨ͠ʂ
DAGׂ(ػೳ)͕ҟͳΔϑϩʔʹΑΓߏ͞Ε͏Δ
DAGΛෳਓͰ։ൃ͍ͨ͠ʂ ɾɾɾ DWHͷσʔλΛՃ͢ Δϑϩʔ εϓϨουγʔτ͔Β σʔλΛऔಘͯ͠Ճ͢ Δϑϩʔ ՃσʔλΛಥ߹ͯ͠ σʔλϚʔτΛߋ৽͢Δ ϑϩʔ
σʔλϚʔτΛߋ৽͢ΔDAG ׂ(ػೳ)͕ҟͳΔϑϩʔ֤ʑҟͳΔγεςϜͱ࿈ܞ͢Δ ࿈ܞઌγεςϜ
DAGΛෳਓͰ։ൃ͍ͨ͠ʂ DWHͷσʔλΛՃ͢ Δϑϩʔ εϓϨουγʔτ͔Β σʔλΛऔಘͯ͠Ճ͢ Δϑϩʔ ՃσʔλΛಥ߹ͯ͠ σʔλϚʔτΛߋ৽͢Δ ϑϩʔ σʔλϚʔτΛߋ৽͢ΔDAG
1ਓͰΔͱେมʂʂ ɾɾɾ ׂ(ػೳ)͕ҟͳΔϑϩʔ֤ʑҟͳΔγεςϜͱ࿈ܞ͢Δ ࿈ܞઌγεςϜ
DWHͷσʔλΛՃ͢ Δϑϩʔ εϓϨουγʔτ͔Β σʔλΛऔಘͯ͠Ճ͢ Δϑϩʔ ՃσʔλΛಥ߹ͯ͠ σʔλϚʔτΛߋ৽͢Δ ϑϩʔ σʔλϚʔτΛߋ৽͢ΔDAG DAGΛෳਓͰ։ൃ͍ͨ͠ʂ
࿈ܞઌγεςϜ༷ʹৄ͍͠ਓʹ͓ئ͍͍ͨ͠ DWHৄ͍͠ਓ εϓϨουγʔ τपΓৄ͍͠ਓ
DWHͷσʔλΛՃ͢ Δϑϩʔ εϓϨουγʔτ͔Β σʔλΛऔಘͯ͠Ճ͢ Δϑϩʔ ՃσʔλΛಥ߹ͯ͠ σʔλϚʔτΛߋ৽͢Δ ϑϩʔ σʔλϚʔτΛߋ৽͢ΔDAG DAGΛෳਓͰ։ൃ͍ͨ͠ʂ
DWHৄ͍͠ਓ εϓϨουγʔ τपΓৄ͍͠ਓ αϒϫʔΫϑϩʔ ࿈ܞઌγεςϜ༷ʹৄ͍͠ਓʹ͓ئ͍͍ͨ͠
DWHͷσʔλΛՃ͢ Δϑϩʔ εϓϨουγʔτ͔Β σʔλΛऔಘͯ͠Ճ͢ Δϑϩʔ ՃσʔλΛಥ߹ͯ͠ σʔλϚʔτΛߋ৽͢Δ ϑϩʔ σʔλϚʔτΛߋ৽͢ΔDAG DAGΛෳਓͰ։ൃ͍ͨ͠ʂ
DAGجຊతʹ1ͭͷϑΝΠϧ DWHৄ͍͠ਓ εϓϨουγʔ τपΓৄ͍͠ਓ
ϫʔΫϑϩʔΛׂ։ൃ͢ΔͨΊͷ݅
ϫʔΫϑϩʔΛׂ։ൃ͢ΔͨΊͷ݅ • ϫʔΫϑϩʔ(DAG)Λߏ͢ΔαϒϫʔΫϑϩʔ͕ϑΝΠϧͰ͞ΕΔ • ϑΝΠϧؒͰґଘ͕ؔఆٛͰ͖Δ • ϑΝΠϧؒͰఆٛ͞Εͨґଘ͕ؔUI্ͰՄࢹԽ͞ΕΔ • ฒྻ࣮ߦ੍͕ޚͰ͖Δ αϒϫʔΫϑϩʔ1
αϒϫʔΫϑϩʔ2 αϒϫʔΫϑϩʔ3 ґଘؔ
ఏҊํ๏ λεΫΛ࡞ΔؔΛ࡞ׂͬͯ
λεΫΛ࡞ΔؔΛ࡞ׂͬͯ • ֤αϒϫʔΫϑϩʔͷఆٛϑΝΠϧͰɺͦͷαϒϫʔΫϑϩʔͷ࠷ॳͱ࠷ޙͷλεΫΛ ฦ͢ → αϒϫʔΫϑϩʔͷ࠷ॳͱ࠷ޙͷλεΫ͕͔ΕɺαϒϫʔΫϑϩʔؒͷґଘؔఆٛ Ͱ͖Δ ˍ ͦͷґଘؔఆٛʹ͓͍ͯαϒϫʔΫϑϩʔͷதϒϥοΫϘοΫεʹͰ͖Δ ίϯηϓτ
αϒϫʔΫϑϩʔ1 αϒϫʔΫϑϩʔ2 αϒϫʔΫϑϩʔ3
λεΫΛ࡞ΔؔΛ࡞ׂͬͯ - αϒϫʔΫϑϩʔΛఆٛ͢ΔͨΊͷϑΝΠϧΛ࡞ - αϒϫʔΫϑϩʔؒͷґଘؔΛఆٛ αϒϫʔΫϑϩʔ1ɿsw1.py αϒϫʔΫϑϩʔ2ɿsw2.py αϒϫʔΫϑϩʔ3ɿsw3.py main.py
λεΫΛ࡞ΔؔΛ࡞ׂͬͯ αϒϫʔΫϑϩʔ1ɿsw1.py αϒϫʔΫϑϩʔ2ɿsw2.py αϒϫʔΫϑϩʔ3ɿsw3.py main.py - αϒϫʔΫϑϩʔΛఆٛ͢ΔͨΊͷϑΝΠϧΛ࡞ - αϒϫʔΫϑϩʔؒͷґଘؔΛఆٛ
λεΫΛ࡞ΔؔΛ࡞ׂͬͯɿαϒϫʔΫϑϩʔͷఆٛ - DAGΦϒδΣΫτΛड͚औΓαϒϫʔΫϑ ϩʔͷ࠷ॳͱ࠷ޙͷλεΫΛฦؔ͢Λ࡞ - ͜ͷؔͰαϒϫʔΫϑϩʔͷλεΫͱͦ ͷґଘؔΛఆٛ sw1.py αϒϫʔΫϑϩʔ1ɿsw1.py
λεΫΛ࡞ΔؔΛ࡞ׂͬͯɿαϒϫʔΫϑϩʔͷఆٛ sw1.py ؔͰฦ͢༻ʹ࠷ॳͱ࠷ޙͷλεΫఆٛ αϒϫʔΫϑϩʔͷػೳ෦ͷλεΫఆٛ ґଘؔఆٛ αϒϫʔΫϑϩʔ1ɿsw1.py
λεΫΛ࡞ΔؔΛ࡞ׂͬͯɿαϒϫʔΫϑϩʔؒͷґଘؔఆٛ αϒϫʔΫϑϩʔ1ɿsw1.py αϒϫʔΫϑϩʔ2ɿsw2.py αϒϫʔΫϑϩʔ3ɿsw3.py main.py - αϒϫʔΫϑϩʔΛఆٛ͢ΔͨΊͷϑΝΠϧΛ࡞ - αϒϫʔΫϑϩʔؒͷґଘؔΛఆٛ
λεΫΛ࡞ΔؔΛ࡞ׂͬͯɿαϒϫʔΫϑϩʔؒͷґଘؔఆٛ ֤αϒϫʔΫϑϩʔͷbuild_tasks͔Β࠷ॳͱ࠷ޙͷλ εΫ͕ฦΔ main.py αϒϫʔΫϑϩʔ1ɿsw1.py αϒϫʔΫϑϩʔ2ɿsw2.py αϒϫʔΫϑϩʔ3ɿsw3.py
λεΫΛ࡞ΔؔΛ࡞ׂͬͯɿαϒϫʔΫϑϩʔؒͷґଘؔఆٛ ֤αϒϫʔΫϑϩʔͷbuild_tasks͔Β࠷ॳͱ࠷ޙͷλ εΫ͕ฦΔ ্هͰฦͬͨλεΫΛͬͯґଘؔΛఆٛ main.py αϒϫʔΫϑϩʔ1ɿsw1.py αϒϫʔΫϑϩʔ2ɿsw2.py αϒϫʔΫϑϩʔ3ɿsw3.py
·ͱΊ - ػೳͷҟͳΔϑϩʔͰߏ͞ΕΔDAGଓઌ༷ʑͳͷͰ1ਓͰେมͳ ߹͕͋ͬͨ - λεΫΛ࡞ΔؔΛ༻͍ͯαϒϫʔΫϑϩʔ୯ҐͰϑΝΠϧׂͰ͖։ൃ͕ ָʹͳΓ·ͨ͠ - αϒϫʔΫϑϩʔͷ࠷ॳͱ࠷ޙͷλεΫΛ͔ͭͬͯґଘؔఆٛ -
αϒϫʔΫϑϩʔؒґଘؔఆٛͰαϒϫʔΫϑϩʔͷதϒϥοΫϘοΫεʹͰ͖Δ - UIͷGraph viewͰDAGશମΛݟΔͱ͖ʹগ͠ݟ௨͕͠ѱ͍͔ʁ - SubDagͰ͋Ε·ͱΊͯදࣔͯ͘͠ΕΔ
Appendix
SubDagOperator - ओʹ܁Γฦ͠ύλʔϯͰར༻͞ΕΔɻ - ฒྻλεΫΛSubDagΛ͑·ͱΊΒΕΔ
SubDagOperator - ओʹ܁Γฦ͠ύλʔϯͰར༻͞ΕΔɻ - ฒྻλεΫΛSubDagΛ͑·ͱΊΒΕΔ Cloud Composer Astronomer
SubDagOperator - ओʹ܁Γฦ͠ύλʔϯͰར༻͞ΕΔɻ - ฒྻλεΫΛSubDagΛ͑·ͱΊΒΕΔ Α͍ - ϑΝΠϧΛՄೳ - αϒϫʔΫϑϩʔΛ·ͱΊͯදࣔͰ͖Δ
- ґଘ͕ؔՄࢹԽ͞ΕΔ Α͘ͳ͍ - ݱঢ়SubDagΛ͏ʹҙ͕ଟ͘ɺ ༻͕ਪ͞Εͯͳ͍
TriggerDagRunOperator / ExternalTaskSensor Α͍ - TriggerSensorͰґଘؔΛఆٛՄೳ - (ผͷDAGͱͯ͠)ϑΝΠϧΛՄೳ Α͘ͳ͍ -
DAGؒͷґଘ͕ؔUI্ͰՄࢹԽ͞Εͳ͍ - αϒϫʔΫϑϩʔ͝ͱʹDAG࡞ʹͳΔ task 1-1 sensor task 2-1 task 2-2 DAG 1 DAG 2 DAG 1 DAG 2 trigger
ݕ౼ͨ͠ํ๏ͷൺֱ