Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
trinity で Cloud Composer に ワークフローを簡単デプロイ / Easy...
Search
Hiroka Zaitsu
October 25, 2019
Technology
0
890
trinity で Cloud Composer に ワークフローを簡単デプロイ / Easy workflow deployment to Cloud Composer with trinity
2019.10.25 Fukuoka.go#14+Umeda.go
https://fukuokago.connpass.com/event/146447/
Hiroka Zaitsu
October 25, 2019
Tweet
Share
More Decks by Hiroka Zaitsu
See All by Hiroka Zaitsu
GMOペパボのデータ基盤とデータ活用の現在地 / Current State of GMO Pepabo's Data Infrastructure and Data Utilization
zaimy
3
290
ビジネス職が分析も担う事業部制組織でのデータ活用の仕組みづくり / Enabling Data Analytics in Business-Led Divisional Organizations
zaimy
1
610
Vertex AI Matching Engine と CLIP を使って EC サービスの類似画像検索機能を作る / Development of similar image search function for EC services using Vertex AI Matching Engine and CLIP
zaimy
0
760
BigQuery の日本語データを Dataflow と Vertex AI でトピックモデリング / Topic modeling of Japanese data in BigQuery with Dataflow and Vertex AI
zaimy
1
6k
データサイエンティストの仕事紹介 / Data Scientist Job Introduction
zaimy
1
630
GMOペパボのサービスと研究開発を支えるデータ基盤の裏側 / Inside Story of Data Infrastructure Supporting GMO Pepabo's Services and R&D
zaimy
1
1.8k
正則化とロジスティック回帰/machine-learning-lecture-regularization-and-logistic-regression
zaimy
0
8.8k
ECサイトにおける閲覧履歴を用いた購買に繋がる行動の変化検出 / Change Detection in Behavior Followed by Possible Purchase Using Electronic Commerce Site Browsing History
zaimy
1
950
ハンドメイド作品を対象としたECサイトにおける大量生産品の検出 / Detection of Mass-produced Goods at EC Site to Trade Handmade Goods
zaimy
3
4.8k
Other Decks in Technology
See All in Technology
AIとともに歩んでいくデザイナーの役割の変化
lycorptech_jp
PRO
0
800
Railsの話をしよう
yahonda
0
170
Findy Team+ QAチーム これからのチャレンジ!
findy_eventslides
0
500
クラウドとリアルの融合により、製造業はどう変わるのか?〜クラスメソッドの製造業への取組と共に〜
hamadakoji
0
340
「REALITY」3Dアバターシステムの7年分の拡張の歴史について
gree_tech
PRO
0
120
Databricks AI/BI Genie の「値ディクショナリー」をAmazonの奥地(S3)まで見に行く
kameitomohiro
1
380
組織改革から開発効率向上まで! - 成功事例から見えたAI活用のポイント - / 20251016 Tetsuharu Kokaki
shift_evolve
PRO
2
230
「魔法少女まどか☆マギカ Magia Exedra」のIPのキャラクターを描くための3Dルック開発
gree_tech
PRO
0
140
serverless team topology
_kensh
2
130
Data Hubグループ 紹介資料
sansan33
PRO
0
2.2k
[2025年10月版] Databricks Data + AI Boot Camp
databricksjapan
1
240
Linux カーネルが支えるコンテナの仕組み / LF Japan Community Days 2025 Osaka
tenforward
1
110
Featured
See All Featured
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
Context Engineering - Making Every Token Count
addyosmani
7
290
Measuring & Analyzing Core Web Vitals
bluesmoon
9
630
GraphQLとの向き合い方2022年版
quramy
49
14k
A better future with KSS
kneath
239
18k
A Tale of Four Properties
chriscoyier
161
23k
How to Ace a Technical Interview
jacobian
280
24k
Six Lessons from altMBA
skipperchong
29
4k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
10
880
4 Signs Your Business is Dying
shpigford
185
22k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
14k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
9
920
Transcript
ࡒେՆ / Pepabo R&D Institute, GMO Pepabo, Inc. 2019.10.25 Fukuoka.go#14+Umeda.go
trinity Ͱ Cloud Composer ʹ ϫʔΫϑϩʔΛ؆୯σϓϩΠ
σʔλαΠΤϯςΟετ ࡒ େՆ / @zaimy 2 Hiroka Zaitsu ϖύϘݚڀॴ ݚڀһ
1. Cloud Composer ͱ 2. Cloud Composer ͷσϓϩΠ࣌ͷࠔΓ͝ͱ 3. trinity
ʹΑΔղܾͷࢼΈ 4. ࠓޙΔ͜ͱ 3 ࣍
1. Cloud Composer ͱ
• GCP ͷ "ϑϧϚωʔδυͷϫʔΫϑϩʔ ΦʔέετϨʔγϣϯ αʔϏε" • Apache Airflow Λ
GCP ্ʹߏங͢Δ • ϖύϘͷϩάج൫ʢDWHʣΛ Treasure Data ͔Β GCP Ҡߦத • ϫʔΫϑϩʔαʔϏε Treasure Workflow (Ϛωʔδυ Digdag) ͔Β Cloud Composer Ҡߦத 5 Cloud Composer ͷ֓ཁ
ϫʔΫϑϩʔͷίʔυϕʔε repository └ dags ɹ ├ workflowA ɹ │ ├
main.py ɹ │ └ hoge.sql ɹ └ workflowB ɹ ɹ ├ main.py ɹ ɹ └ piyo.sql 6 • dags σΟϨΫτϦԼʹϫʔΫϑϩʔ୯ҐͰ αϒσΟϨΫτϦΛΔ • ϫʔΫϑϩʔຊମʢDAGʣͷ python ίʔυ • ϫʔΫϑϩʔͰར༻͢ΔΫΤϦ • ઃఆϑΝΠϧɹͳͲ ※σΟϨΫτϦߏΛ Cloud Storage ͱ߹ΘͤΔ߹
ϫʔΫϑϩʔͷσϓϩΠʢՃͱߋ৽ʣ $ gcloud composer environments storage dags import \ --environment
ENVIRONMENT_NAME \ --location LOCATION \ --source LOCAL_FILE_TO_UPLOAD 7 ίʔυϕʔε $MPVE4UPSBHF "JSqPX HDMPVEDPNQPTFSJNQPSU
ϫʔΫϑϩʔͷআ ͦͷ1 - Cloud Storage ͔Βআ $ gcloud composer environments
storage dags delete \ --environment ENVIRONMENT_NAME \ --location LOCATION \ DAG_NAME.py 8 ίʔυϕʔε $MPVE4UPSBHF "JSqPX HDMPVEDPNQPTFSEFMFUF
ϫʔΫϑϩʔͷআ ͦͷ2 - Airflow ͔Βআ $ gcloud composer environments run
--location LOCATION \ ENVIRONMENT_NAME delete_dag -- DAG_NAME 9 ίʔυϕʔε $MPVE4UPSBHF "JSqPX HDMPVEDPNQPTFSEFMFUF@EBH
2. Cloud Composer ͷ σϓϩΠ࣌ͷࠔΓ͝ͱ
• ϫʔΫϑϩʔͷՃͱߋ৽ • import ϫʔΫϑϩʔ୯ҐͰͷ࣮ߦ • ࠩͷ͋ΔϫʔΫϑϩʔʹରͯ͠ݸผʹ࣮ߦ͢Δඞཁ͕͋Δ • import
Cloud Storage ͷϑΝΠϧΛ্ॻ͖͢Δ • ίʔυϕʔεͰআͨ͠ϑΝΠϧ ݸผʹআ͠ͳ͍ݶΓ Cloud Storage ʹΔ 11 gcloud ίϚϯυΛͦͷ··ӡ༻ʹ͏ͱେม
• ϫʔΫϑϩʔͷআ • delete ͱ Airflow ͷ dag_delete ͷ2ճίϚϯυΛ࣮ߦ͢Δඞཁ͕͋Δ •
delete ϑΝΠϧ୯Ґ, dag_delete ϫʔΫϑϩʔ୯ҐͰͷ࣮ߦ • ࠩͷ͋ΔϑΝΠϧ/ϫʔΫϑϩʔʹରͯ͠ݸผʹ࣮ߦ͢Δඞཁ͕͋Δ • ։ൃʹΑΓेݸͷϫʔΫϑϩʔʹʑ͕ࠩੜ·Ε͍ͯ͘ • ࠩΛػցతʹݕग़ͯ͠ Cloud Composer ʹಉظ͍ͨ͠ 12 gcloud ίϚϯυΛͦͷ··ӡ༻ʹ͏ͱେม
• όέοτ/σΟϨΫτϦؒͰϑΝΠϧΛಉظ͢Δ Cloud Storage ͷίϚϯυ • ϑΝΠϧͷߋ৽࣌ࠁʹࠩҟ͕͋Εಉظରͱఆ͞ΕΔ • ༰͕มߋ͞Ε͍ͯͳͯ͘ॲཧରʹͳͬͯ͠·͏ •
Cloud Storage ʹґଘ͢Δ • Airflow GCP Ҏ֎ͰߏஙͰ͖ΔͷͰଞͷετϨʔδʹରԠ͍ͨ͠ 13 gsutil rsync Ͳ͏͔ͳ
• ಛఆͷ git ϦϙδτϦͱಉظ͢Δ Airflow ͷػೳ • ୯ҰͷϒϥϯνͷΈࢦఆՄೳ • ຊ൪ڥʹ
master ͷίʔυΛಉظ͢Δʹྑͦ͞͏ • ςετڥ CI Ͱ feature branch ͷίʔυΛσϓϩΠ͍ͨ͠ 14 Airflow sync Ͳ͏͔ͳ
3. trinity ʹΑΔղܾͷࢼΈ
• ίʔυϕʔεͱ Cloud Storage ͱ Airflow ͷ3ͭΛಉظ͢Δ • ϫʔΫϑϩʔ୯ҐͰɺσΟϨΫτϦߏͱϑΝΠϧ༰͔ΒϋογϡΛܭࢉ •
͋Δ࣌ͷϫʔΫϑϩʔఆٛΛද͢ϋογϡ • ίʔυϕʔε͔Βܭࢉͨ͠ϋογϡͱ Cloud Storage ʹอଘ͞Ε͍ͯΔ ϋογϡ͕ҟͳΔϫʔΫϑϩʔΛಉظૢ࡞ͷରʹ͢Δ 16 trinity ͷํ
• https://github.com/zaimy/trinity • A tool to synchronize workflows between Codebase,
Cloud Storage and Airflow metadata. • ͳͥ Goʁ • ΫϩείϯύΠϧͰ Mac, Linux, Windows ʹରԠͰ͖Δ • ϫʔΫϑϩʔ୯ҐͰॲཧ͕ՄೳͳͷͰฒྻԽ͍ͨ͠ 17 trinity ͷ࣮ $ trinity --bucket=BUCKET_NAME \ --composer-env=COMPOSER_ENV_NAME
1. ίʔυϕʔεͰϋογϡΛܭࢉͯ͠ϫʔΫϑϩʔ͝ͱʹอଘ 2. ίʔυϕʔεͱ Cloud Storage ͷϫʔΫϑϩʔΛϦετͯ͠ൺֱ i. ίʔυϕʔεʹ͔͠ͳ͚Ε Cloud
Storage ʹΞοϓϩʔυʢՃʣ ii. Cloud Storage ʹ͔͠ͳ͚Ε Cloud Storage ͱ Airflow ͔Βআ iii. ྆ํʹ͋Είʔυϕʔεͱ Cloud Storage ͷϋογϡΛൺֱ a. ࠩҟ͕͋Ε Cloud Storage ͷϫʔΫϑϩʔΛஔʢߋ৽ʣ 18 ॲཧͷྲྀΕ
؆୯ʹಉظతͳσϓϩΠ͕ Ͱ͖ΔΑ͏ʹͳͬͨ !
• ςετՃͱϦϑΝΫλϦϯά • Go ͷ࡞๏ߟ͑ํʹԊ͍͖͍ͬͯͨ • ػೳՃ • Airflow ʹ
dags Ҏ֎ʹ plugins ͋ΔͷͰରԠ͢Δ • dry-run 20 ࠓޙΔ͜ͱ
None