Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
digdag-Introduction
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Masatoshi Shimada
August 19, 2016
Programming
1.4k
1
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
digdag-Introduction
Digdagを本番導入したので社内勉強会で発表した資料です。
Masatoshi Shimada
August 19, 2016
More Decks by Masatoshi Shimada
See All by Masatoshi Shimada
データプラットフォーム技術におけるメダリオンアーキテクチャという考え方/DataPlatformWithMedallionArchitecture
smdmts
12
4.1k
Delta Lakeを用いた LLM処理基盤 / Delta Lake with LLM on Dataplatform
smdmts
3
8.9k
Lakehouseプラットフォームを 採用するまでの話/Lakehouse Platform Adoption
smdmts
1
1.1k
Sparkから利用するAirframe/Spark-With-Airframe
smdmts
0
1.9k
Redashで何をみるのか/What Do You Wanna See Redash?
smdmts
1
1.8k
DatabricksとSparkではじめる [ビッグデータETL処理/データ可視化] 実践入門 / Databricks and Spark with ETL and Visualization
smdmts
1
1.8k
DatabricksとSparkではじめる [データ分析/機械学習] 実践入門 / Databrick and Spark with Data Analyze and ML for newbie.
smdmts
6
2.5k
作らない分析基板のススメ/DWH For Startup With YAGNI
smdmts
1
820
エンジニアのためのドメイン駆動設計実践入門 / DDD for Engineer newbie
smdmts
18
4k
Other Decks in Programming
See All in Programming
ECSアプリログをFireLensでコスト削減しようとしたけど諦めた話 in Fargate×Node.js
akihisaikeda
2
4k
Semantic Version 単位で戦略を柔軟に変えて、パッケージアップデートを自動化する
daitasu
0
210
Dataformのリポジトリを立ち上げるときにまずやること / dataform-day0-2026
snhryt
0
150
ふつうのFeature Flag実践入門
irof
7
3.7k
Copilot CLI の継戦能力を高める コンテキスト管理
nozomutu
1
1.2k
Oxcを導入して開発体験が向上した話
yug1224
4
310
技術記事、AIに書かせるか、自分で書くか? 〜それでも私が自分の手で書く理由〜 / #QiitaConference
jnchito
2
1.4k
AIで効率化できた業務・日常
ochtum
0
120
AIとASP.NET Coreで雑Webアプリを作った話
mayuki
0
500
Javaの型とAI時代に型が大事な理由 / java types and type in AI era
kishida
2
120
[2026年度第1回ORセミナー] 計画最適化ベンチャーと競技プログラミング人材
terryu16
0
260
AIチームを指揮するOSS「TAKT」活用術 / How to Use “TAKT,” an OSS Tool for Orchestrating AI Teams
nrslib
6
880
Featured
See All Featured
Design in an AI World
tapps
1
240
世界の人気アプリ100個を分析して見えたペイウォール設計の心得
akihiro_kokubo
PRO
71
40k
HTML-Aware ERB: The Path to Reactive Rendering @ RubyCon 2026, Rimini, Italy
marcoroth
1
180
What the history of the web can teach us about the future of AI
inesmontani
PRO
1
610
Test your architecture with Archunit
thirion
1
2.3k
The Mindset for Success: Future Career Progression
greggifford
PRO
0
360
How to build an LLM SEO readiness audit: a practical framework
nmsamuel
1
770
Raft: Consensus for Rubyists
vanstee
141
7.5k
Thoughts on Productivity
jonyablonski
76
5.2k
The Straight Up "How To Draw Better" Workshop
denniskardys
239
140k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.4k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
659
62k
Transcript
Introduction of Digdag.
Who am I. • Twitter/GitHub Account • @smdmts • Main
Fields • Scala & Java8 & React.js & Python • DDD CleanArchitecture @ akka-http • Workflow • Hive/Presto
Agenda • Digdag = Workflow automation system.
ϫʔΫϑϩʔΤϯδϯͷओͳཁ݅ • ఆظతͳλεΫͷ࣮ߦ • λεΫͷॱ࣮࣍ߦ • γεςϜؒͷσʔλࣗಈ࿈ܞ • όονʹΑΔσʔλूܭͷࣗಈԽ •
όονδϣϒྃޙͷϝʔϧ/SlackͳͲ௨ • ϦτϥΠ࣌ʹ͓͚Δႈੑ
Digdagͱ • DAGʢDirected acyclic graph)Λ࣮ݱ͢ΔϫʔΫϑ ϩʔΤϯδϯ • YAMLͰDAGΛදݱ͢ΔͨΊఆٛମGitཧՄೳ ʢWorkflow as
Codeʣ • LocalϞʔυͰ։ൃ͠ɺClient/ServerϞʔυͰຊ൪ ͰՔಇͤ͞Δ • Python/Ruby/Bash/DockerͳͲͰαϒλεΫ͕࣮ ߦՄೳ
DigdagͱʢClient/ServerϞʔυʣ • PostgreSQLͰQueueΛ࣮ݱ͍ͯ͠Δ • αϒλεΫຖͰQueueԽ͞Ε͓ͯΓαʔόෳ Ͱ࣮ߦڥ͕εέʔϧՄೳ • Workflowͷ࣮ମPostgreSQLʹӬଓԽ͞ΕΔ • Client͕ίϚϯυͰWorkflowΛpush͢Δ
• Workflowੈཧ͞ΕΔ • ࠶ىಈෆཁͰδϣϒొʗ࠶࣮ߦՄೳ
DAG (Directed acyclic graph)ͱ • DAGʢ༗ඇ८ճάϥϑʣͱ ʢwikipedia) άϥϑཧʹ͓͚Δด࿏ͷͳ͍༗άϥϑͷࣄ ༗άϥϑͱ༗ลʢํΛࣔ͢ҹ͖ ͷลʣ͔ΒͳΓɺลಉ࢜Λͭͳ͙͕ɺ͋Δ
v ͔Βग़ൃ͠ɺลΛͨͲΓɺ v ʹͬͯ ͜ͳ͍ͷ͕༗ඇ८ճάϥϑͰ͋Δɻ
DAG (Directed acyclic graph)ͱ • DAGʢ༗ඇ८ճάϥϑʣͱ • తʹݴ͏ͱऴ͕ଘࡏ͠։࢝ʹͬͯ͜ͳ ͍άϥϑ
DigdagͰͷදݱํ๏ • YAMLͰΦϖϨʔλΛఆٛ timezone: UTC _export: mail: ..... # Definition
of mail +step1_input: py>: tasks.load _error: mail>: body.txt subject: input error! to: [
[email protected]
] +step2_process: sh>: echo process. +step2_report: sh>: echo report.
δϣϒϑϩʔߏུ֓ਤ
δϣϒϑϩʔߏུ֓ਤʹ͓͚Δఆٛ timezone: UTC +prepare_load_aws_env: py>: tasks.load_aws_env +step1_produce_tasks: # Generate SQL
Queries for Redshift. !include : 'child_tasks/produce_tasks/bootstrap.dig' +step2_create_redshift_buffer: # Internal S3 or TreasureData to Redshift temporary buffer. !include : 'child_tasks/create_redshift_buffer/bootstrap.dig' +step3_create_publisher_s3: # Create Redshift buffer to publisher s3 bucket. !include : 'child_tasks/create_publisher_s3/bootstrap.dig'
։ൃ/ӡ༻ͯ͠Έͨײ • Workflow͕ίʔυͰදݱ͞Ε σόοά༰қ ͳ ͷͰ ී௨ͷ։ൃͷϊϦ Ͱॱ൪ͱΤϥʔϋϯυϦ ϯάΛҙࣝͨ͠δϣϒΛΧδϡΞϧʹ࡞Εͨ •
Πϯετʔϧͷ؆қੑɺ࠶ىಈෆཁͷδϣϒ࠶ ొʗ࣮ߦՄೳͳͲɺಋೖ/։ൃ/ӡ༻ָ͕ʹͳΔ ͜ͱ͕ҙࣝͯ͠ઃܭ͞Ε͍ͯΔҹ • ࣮ߦॱংɺฒྻԽɺΤϥʔϋϯυϦϯάͷ੍ޚ͕ ඇৗʹ༰қͳҝɺશόονܥΛDigdagʹҠ͢Δ ࣄΛܾఆ
։ൃ࣌ʹൃੜͨ͠/՝ • py operatorར༻࣌ʹগ͠ϋϚͬͨ • ςετίʔυΛॻͨ͘Ίʹimport digdagͷ ϞοΫίʔυ͕ඞཁ • !includeͰผσΟϨΫτϦʹdigΛஔ͘ͱ
PythonεΫϦϓτؒͷґଘղܾͷ࣮͕ඞཁ • λεΫؒͷม࿈ܞdigdag.env.storeͰɺ શλεΫԣஅͰΩʔͷ໊લΛҰҙʹ͢Δඞཁ͋ ΓʢಉҰ໊শͰ্ॻ͖͕ൃੜ͢Δ߹༗Γʣ
։ൃ࣌ʹൃੜͨ͠/՝ʢิʣ • rb operatorར༻ͯ͠·ͤΜ
ӡ༻࣌ʹൃੜͨ͠/՝ • ӡ༻Ͱཉ͍͠ػೳ͕͋Δঢ়گʢ։ൃதʁʣ • ֬ೝը໘ (ίϚϯυͰճආத) • ਐߦঢ়گɾ࣮ߦ݁ՌɾΤϥʔͳͲ • ϩάͷS3ӬଓԽ
(S3FSͰճආத)
·ͱΊ • ࣗಈԽਖ਼ٛʂ