Hiroka Zaitsu
October 25, 2019
Easy workflow deployment to Cloud Composer with trinity
2019.10.25 Fukuoka.go#14+Umeda.go
https://fukuokago.connpass.com/event/146447/
Transcript
Easy workflow deployment to Cloud Composer with trinity
Hiroka Zaitsu / Pepabo R&D Institute, GMO Pepabo, Inc.
2019.10.25 Fukuoka.go#14+Umeda.go

Hiroka Zaitsu / @zaimy
Data scientist; researcher at Pepabo R&D Institute

Agenda
1. What is Cloud Composer?
2. Pain points when deploying to Cloud Composer
3. An attempt at a solution with trinity
4. Future work

1. What is Cloud Composer?

Overview of Cloud Composer
• GCP's "fully managed workflow orchestration service"
• Builds Apache Airflow on GCP
• Pepabo is migrating its log infrastructure (DWH) from Treasure Data to GCP
• The workflow service is also being migrated, from Treasure Workflow (managed Digdag) to Cloud Composer
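
Workflow codebase
repository
└ dags
  ├ workflowA
  │ ├ main.py
  │ └ hoge.sql
  └ workflowB
    ├ main.py
    └ piyo.sql
• Create one subdirectory per workflow under the dags directory
  • Python code for the workflow itself (the DAG)
  • Queries used by the workflow
  • Configuration files, etc.
Note: this layout applies when the directory structure is kept identical to the one in Cloud Storage.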

Deploying a workflow (adding and updating)
$ gcloud composer environments storage dags import \
    --environment ENVIRONMENT_NAME \
    --location LOCATION \
    --source LOCAL_FILE_TO_UPLOAD
[Diagram: codebase, Cloud Storage, Airflow; labeled "gcloud composer import"]

Deleting a workflow, part 1: delete from Cloud Storage
$ gcloud composer environments storage dags delete \
    --environment ENVIRONMENT_NAME \
    --location LOCATION \
    DAG_NAME.py
[Diagram: codebase, Cloud Storage, Airflow; labeled "gcloud composer delete"]

Deleting a workflow, part 2: delete from Airflow
$ gcloud composer environments run --location LOCATION \
    ENVIRONMENT_NAME delete_dag -- DAG_NAME
[Diagram: codebase, Cloud Storage, Airflow; labeled "gcloud composer delete_dag"]
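
Fully removing one workflow therefore takes one storage delete per file plus one `delete_dag` call. The following sketch, under the assumption that the file paths are given relative to the dags folder, lists those invocations without running them; `deleteArgs` is a hypothetical helper and the names in `main` are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// deleteArgs returns the gcloud argument lists needed to remove one workflow:
// a storage delete for each file, then a single delete_dag for the DAG's
// Airflow metadata. Hypothetical helper; nothing is executed here.
func deleteArgs(env, location, dagID string, files []string) [][]string {
	var cmds [][]string
	for _, f := range files {
		cmds = append(cmds, []string{
			"composer", "environments", "storage", "dags", "delete",
			"--environment", env, "--location", location, f,
		})
	}
	cmds = append(cmds, []string{
		"composer", "environments", "run",
		"--location", location, env, "delete_dag", "--", dagID,
	})
	return cmds
}

func main() {
	files := []string{"workflowA/main.py", "workflowA/hoge.sql"}
	for _, args := range deleteArgs("ENVIRONMENT_NAME", "LOCATION", "DAG_NAME", files) {
		fmt.Println("gcloud " + strings.Join(args, " "))
	}
}
```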

2. Pain points when deploying to Cloud Composer
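
Using the gcloud commands as-is for operations is painful
• Adding and updating workflows
  • import runs per workflow
    • It has to be run individually for each workflow that has changes
  • import overwrites files in Cloud Storage
    • Files deleted from the codebase remain in Cloud Storage unless they are deleted individually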
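
Using the gcloud commands as-is for operations is painful (continued)
• Deleting workflows
  • Two commands must be run: delete, and Airflow's delete_dag
  • delete runs per file; delete_dag runs per workflow
  • Each has to be run individually for every file or workflow that has changes
• Ongoing development produces changes in ten or more workflows every day
• We want to detect the differences mechanically and sync them to Cloud Composer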
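
What about gsutil rsync?
• A Cloud Storage command that synchronizes files between buckets and directories
• A file is treated as a sync target if its modification time differs
  • Files are processed even when their contents have not changed
• It depends on Cloud Storage
  • Airflow can also be built outside GCP, so we want to support other storage as well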
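
What about Airflow sync?
• An Airflow feature that synchronizes with a specific git repository
• Only a single branch can be specified
  • Looks good for syncing master's code to the production environment
  • We want to deploy feature-branch code in test environments and CI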

3. An attempt at a solution with trinity
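
trinity's approach
• Synchronize three things: the codebase, Cloud Storage, and Airflow
• For each workflow, compute a hash value from its directory structure and file contents
  • The hash value represents the workflow definition at a point in time
• Workflows whose hash computed from the codebase differs from the hash stored in Cloud Storage become targets of the sync operation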

trinity's implementation
• https://github.com/zaimy/trinity
  • A tool to synchronize workflows between Codebase, Cloud Storage and Airflow metadata.
• Why Go?
  • Cross-compilation makes it possible to support Mac, Linux, and Windows
  • Processing can be done per workflow, so we want to parallelize it
$ trinity --bucket=BUCKET_NAME \
    --composer-env=COMPOSER_ENV_NAME
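
Processing flow
1. Compute hash values in the codebase and save them per workflow
2. List the workflows in the codebase and in Cloud Storage and compare them
   i. If a workflow exists only in the codebase, upload it to Cloud Storage (add)
   ii. If it exists only in Cloud Storage, delete it from Cloud Storage and Airflow
   iii. If it exists in both, compare the hash values from the codebase and Cloud Storage
        a. If they differ, replace the workflow in Cloud Storage (update)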
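
The comparison in step 2 can be sketched as a pure function over two maps from workflow name to hash. The types and names here (`action`, `plan`) are hypothetical and chosen for this example; the real tool's data structures may differ.

```go
package main

import (
	"fmt"
	"sort"
)

type action string

const (
	add    action = "add"
	remove action = "remove"
	update action = "update"
)

// plan compares workflow-name -> hash maps for the codebase (local) and
// Cloud Storage (remote) and returns the action needed for each workflow.
// Workflows whose hashes match are omitted from the result.
func plan(local, remote map[string]string) map[string]action {
	actions := make(map[string]action)
	for name, lh := range local {
		rh, ok := remote[name]
		switch {
		case !ok:
			actions[name] = add // only in the codebase: upload
		case lh != rh:
			actions[name] = update // in both but hashes differ: replace
		}
	}
	for name := range remote {
		if _, ok := local[name]; !ok {
			// Only in Cloud Storage: delete there and from Airflow.
			actions[name] = remove
		}
	}
	return actions
}

func main() {
	local := map[string]string{"workflowA": "h1", "workflowB": "h2-new", "workflowC": "h3"}
	remote := map[string]string{"workflowA": "h1", "workflowB": "h2-old", "workflowD": "h4"}

	actions := plan(local, remote)
	names := make([]string, 0, len(actions))
	for name := range actions {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output
	for _, name := range names {
		fmt.Println(name, actions[name])
	}
}
```

Keeping the planning step free of side effects also makes it straightforward to print the plan without applying it.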

Synchronized deployments can now be done easily!

Future work
• Add tests and refactor
  • We want to follow Go's conventions and way of thinking
• Add features
  • Airflow has plugins in addition to dags, so support them as well
  • dry-run