Tech x Marketing #4: I want to split development by sub-workflow in Airflow too!
Naoki Matsuda
August 27, 2020
Programming
Transcript
I want to split development by sub-workflow in Airflow too!
Tech x Marketing meetup #4
Dentsu Digital Inc., Naoki Matsuda
2020/08/27
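Self-introduction
- Name: Naoki Matsuda
- Affiliation: Dentsu Digital Inc. (joined in 2018)
- Work: implementing backend service APIs, building ETL, and so on (languages used: Go / Python)
- As a student: research on energy consumption and production during muscle stimulation
Airflow talks and blog posts:
- An Introduction to Airflow and Use Cases from a Novice Pythonista (PyConJP2019)
- Isolating the Airflow task execution environment (DentsuDigital Tech Blog)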
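Contents
What I want to do
- Develop an Airflow DAG (workflow) with multiple people!
Approaches considered
- X SubDagOperator (the number of parallel executions cannot be controlled; use is not recommended on managed services)
- X TriggerDagRunOperator / ExternalTaskSensor (dependencies become hard to follow)
- O Split by writing functions that build tasks (proposed approach)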
What is Apache Airflow?
- A tool for running and monitoring workflows (DAGs) written in Python
- One of the Apache top-level projects
- Contributors: 1000+, stars: 17000+
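What is Apache Airflow?
- DAG: a graph that determines the execution order of Tasks
- Operator: a templated unit of execution
- Task: an Operator that has been given parameters
- Operators (Python, HTTP, MySQL, KubernetesPod…)
- Task dependencies in a DAG are defined with >>. Example: A >> [B, C] >> D
(Diagram: a DAG of tasks A, B, C, D. Source: https://www.slideshare.net/potix2_jp/airflow-224004058)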
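To make the `A >> [B, C] >> D` notation concrete, here is a minimal, dependency-free sketch. The `Task` class is a hypothetical stand-in written for this note, not Airflow's own class; it only mimics how `>>` records "runs before" edges, including the list form. In real Airflow code the tasks would be operator instances attached to a DAG.

```python
class Task:
    """Hypothetical stand-in for an Airflow task; models only the `>>` operator."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # tasks that run after this one

    def __rshift__(self, other):
        # Handles `task >> task` and `task >> [task, ...]`.
        targets = other if isinstance(other, list) else [other]
        self.downstream.extend(targets)
        return other  # returning the right-hand side allows chaining

    def __rrshift__(self, other):
        # Handles `[task, ...] >> task`: a list has no `>>`, so Python asks
        # the right-hand operand instead.
        for upstream in other:
            upstream.downstream.append(self)
        return self


a, b, c, d = (Task(name) for name in "ABCD")
a >> [b, c] >> d  # A runs first, then B and C, then D

edges = sorted((t.task_id, n.task_id) for t in (a, b, c, d) for n in t.downstream)
print(edges)
```

The expression is evaluated left to right: `a >> [b, c]` records two edges and returns the list, and the list is then placed upstream of `d`.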
Background of what I want to do
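I want to develop a DAG with multiple people!
Example: a DAG that updates a data mart, made up of three flows:
- a flow that processes data in the DWH
- a flow that fetches data from a spreadsheet and processes it
- a flow that reconciles the processed data and updates the data mart
A DAG can be composed of flows that have different roles (functions).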
Flows with different roles (functions) each integrate with a different system (the linked systems).
Doing all of this alone is hard!!
For each linked system, I want to hand the work to someone who knows its specifications well: someone who knows the DWH, and someone who knows spreadsheets.
Each flow handed to such a person is a sub-workflow.
However, a DAG is basically a single file.
Requirements for developing a workflow in separate parts
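- The sub-workflows that make up a workflow (DAG) are split into separate files
- Dependencies can be defined between files
- Dependencies defined between files are visualized in the UI
- The number of parallel executions can be controlled
(Diagram: sub-workflow 1, sub-workflow 2, and sub-workflow 3 connected by dependencies)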
Proposed approach: split by writing functions that build tasks
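Concept
- Each sub-workflow's definition file returns the first and last tasks of that sub-workflow
- If the first and last tasks of each sub-workflow are known, the dependencies between sub-workflows can be defined, and in that dependency definition the inside of each sub-workflow can be treated as a black box
(Diagram: sub-workflow 1, sub-workflow 2, sub-workflow 3)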
Split by writing functions that build tasks
- Create a file to define each sub-workflow (sub-workflow 1: sw1.py, sub-workflow 2: sw2.py, sub-workflow 3: sw3.py)
- Define the dependencies between sub-workflows (main.py)
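Split by writing functions that build tasks: defining a sub-workflow (sub-workflow 1: sw1.py)
- Write a function that receives the DAG object and returns the first and last tasks of the sub-workflow
- Define the sub-workflow's tasks and their dependencies inside this function
The code in sw1.py has three parts:
- definitions of the first and last tasks, so the function can return them
- definitions of the tasks for the functional part of the sub-workflow
- the dependency definition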
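The slide's code did not survive in this transcript, so the following is only a sketch of what a file such as sw1.py could look like under the description above. `Dag` and `Task` are hypothetical stand-ins so the example runs without Airflow, and the task ids are invented for illustration; with Airflow installed, each `Task(...)` would be an operator such as `DummyOperator(task_id=..., dag=dag)` or a real work operator.

```python
class Dag:
    """Hypothetical stand-in for an Airflow DAG: records the tasks attached to it."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.tasks = []


class Task:
    """Hypothetical stand-in for an Airflow operator; supports `a >> b` only."""

    def __init__(self, task_id, dag):
        self.task_id = task_id
        self.downstream = []
        dag.tasks.append(self)

    def __rshift__(self, other):
        self.downstream.append(other)
        return other  # lets `a >> b >> c` chain


# --- sw1.py (assumed contents) ---
def build_tasks(dag):
    """Create sub-workflow 1's tasks on `dag` and return its (first, last) tasks."""
    # First and last tasks, defined so the function has something to return.
    first = Task("sw1_start", dag)
    last = Task("sw1_end", dag)

    # Tasks for the functional part of the sub-workflow (invented names).
    extract = Task("sw1_extract", dag)
    transform = Task("sw1_transform", dag)

    # Dependencies inside the sub-workflow.
    first >> extract >> transform >> last
    return first, last


dag = Dag("update_data_mart")
first, last = build_tasks(dag)

# Walk the chain from the first task to confirm it ends at the last one.
path = [first.task_id]
node = first
while node.downstream:
    node = node.downstream[0]
    path.append(node.task_id)
print(path)
```

Callers only see the returned pair, so the author of the sub-workflow can change the tasks in the middle without touching any other file.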
Split by writing functions that build tasks: defining dependencies between sub-workflows
In main.py:
- build_tasks of each sub-workflow (sw1.py, sw2.py, sw3.py) returns its first and last tasks
- The returned tasks are used to define the dependencies between sub-workflows
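A sketch of how main.py might wire the sub-workflows together follows. It is an illustration under assumptions, not the code from the slide: `Dag` and `Task` are hypothetical stand-ins so the block runs without Airflow, `make_build_tasks` stands in for the three separate files, and the choice that sub-workflows 1 and 2 both feed sub-workflow 3 is assumed from the data mart example earlier in the talk.

```python
class Dag:
    """Hypothetical stand-in for an Airflow DAG: records the tasks attached to it."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.tasks = []


class Task:
    """Hypothetical stand-in for an Airflow operator; models only `>>`."""

    def __init__(self, task_id, dag):
        self.task_id = task_id
        self.downstream = []
        dag.tasks.append(self)

    def __rshift__(self, other):
        targets = other if isinstance(other, list) else [other]
        self.downstream.extend(targets)
        return other

    def __rrshift__(self, other):
        # Called for `[task, ...] >> task`.
        for upstream in other:
            upstream.downstream.append(self)
        return self


def make_build_tasks(prefix):
    """Stand-in for one swN.py file: returns that file's build_tasks function."""

    def build_tasks(dag):
        first = Task(f"{prefix}_start", dag)
        work = Task(f"{prefix}_work", dag)
        last = Task(f"{prefix}_end", dag)
        first >> work >> last
        return first, last

    return build_tasks


# In a real layout these would be `from sw1 import build_tasks` and so on.
sw1_build_tasks = make_build_tasks("sw1")
sw2_build_tasks = make_build_tasks("sw2")
sw3_build_tasks = make_build_tasks("sw3")

# --- main.py (assumed contents) ---
dag = Dag("update_data_mart")
sw1_first, sw1_last = sw1_build_tasks(dag)
sw2_first, sw2_last = sw2_build_tasks(dag)
sw3_first, sw3_last = sw3_build_tasks(dag)

# Only first/last tasks appear here; each sub-workflow's inside is a black box.
[sw1_last, sw2_last] >> sw3_first

upstream_of_sw3 = sorted(t.task_id for t in dag.tasks if sw3_first in t.downstream)
print(upstream_of_sw3)
```

Because every task is attached to the same DAG object, the whole workflow is still one DAG in the UI, which is what keeps the cross-file dependencies visible.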
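Summary
- A DAG composed of flows with different functions also connects to a variety of systems, so handling it alone was sometimes hard
- Using functions that build tasks, the files could be split by sub-workflow, and development became easier
- Dependencies are defined using the first and last tasks of each sub-workflow
- When defining dependencies between sub-workflows, the inside of each sub-workflow can be treated as a black box
- When viewing the whole DAG in the UI's Graph view, it may be a little harder to get an overview
- With SubDag, the tasks would be displayed as a group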
Appendix
SubDagOperator
- Mainly used for repeating patterns
- Parallel tasks can be grouped by using a SubDag
Pros
- Files can be split
- A sub-workflow can be displayed as a group
- Dependencies are visualized
Cons
- At present there are many caveats to using SubDag, and its use is not recommended (the slide references Cloud Composer and Astronomer)
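TriggerDagRunOperator / ExternalTaskSensor
Pros
- Dependencies can be defined with a trigger or a sensor
- Files can be split (as separate DAGs)
Cons
- Dependencies between DAGs are not visualized in the UI
- A DAG has to be created for each sub-workflow
(Diagram: with a sensor, a sensor in DAG 2 waits on task 1-1 in DAG 1 before task 2-1 and task 2-2 run; with a trigger, DAG 1 triggers DAG 2)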
Comparison of the approaches considered