Rebuilding Batch ETL with Cloud Composer & Dataflow #data_ml_engineering / 20190719
Slides presented at the second meetup of データとML周辺エンジニアリングを考える会 (a study group on engineering around data and ML).
https://data-engineering.connpass.com/event/136756/
yuzutas0
PRO
July 19, 2019
Transcript
Rebuilding Batch ETL with Cloud Composer & Dataflow. 2019-07-19 #data_ml_engineering, presented by @yuzutas0.
Photo: https://www.pexels.com/photo/architecture-blur-building-colourful-392031/
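Notes and disclaimer
These slides are already published on the web (#data_ml_engineering). There is no need to take photos or notes; please relax and listen.
The deck has 70+ slides. I will focus on agenda item 4 and go through the rest at lightning-talk pace.
The content assumes Q&A follow-up during the social hour and on SNS.
I think of technology as a tool. Choose it according to your goals and constraints. This talk does not promote any particular technology.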
Agenda
1. Introduction
2. Context
3. Measurement, evaluation, consensus building
4. Rebuild & release
5. Closing
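@yuzutas0
Past talks: I share know-how and insights on data platforms. PyCon JP Best Talk Award (excellence prize); Developers Summit (Summer), No. 1 in attendee-survey satisfaction.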
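Purpose of today's talk: to present a case study of a "rebuild". It is only one case, so compare it with your own technology stack and organizational situation, and take away your own lessons.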
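Intended audience: software engineers who do the hands-on work of building and operating systems for log collection and ETL, and their clients and managers (including people about to take on those roles).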
Agenda
1. Introduction
2. Context
3. Measurement, evaluation, consensus building
4. Rebuild & release
5. Closing
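Mercari (C2C flea-market app)
Business growth (= growing data volume). Source: FY2019.6 3Q earnings briefing materials, https://pdf.irpocket.com/C4385/eHSm/vwwn/oECA.pdf
Global and new businesses
Active use of data. Source: https://speakerdeck.com/hik0107/mercari-bi-team-data-analytics-summit-2018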
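Summary of characteristics
- The product is growing
- Data volume is increasing rapidly
- The organization is being set up to grow global and new businesses
- Data is actively used for analytics, ML, and more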
The problem on the ground: "The data in BQ isn't being updated!" (Some tables had been stalled for months.)
Growing pains (diagram): product ↑, data ↑, load ↑, users ↑ (Good, Good, Bad, Bad). ❌ "The system can't keep up!" "It isn't being updated!"
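Historical background (diagram)
- ETL System: ETL for US, ETL for JP, ETL for UK
- US Team: "We built it! We maintain it!" Supports JP out of goodwill alongside its main job (which honestly has its limits). Priority: "Make the US app better!"
- JP SRE: "The JP app's production environment is the top priority!"
- JP BI: "We want it for JP too! Let us piggyback!" (request to the US team). Priority: "Focus on analysis work!"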
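Scope of this project (1) (diagram): product and users → DB and logs → BigQuery → initiatives and operations. Stages: collect → deliver → use → value. Value is the objective variable to maximize in DataOps.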
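Scope of this project (2) (diagram)
- Ishikari DC: Monolith (APP BE) with its DB and a Read Only Replica; sensitive information is masked on the way to BigQuery
- GCP: microservices, each with its own DB (migration planned step by step)
- Other products and other sources also feed BigQuery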
Today's Issue
- Data that has not been updated for months
- What would you do?
Agenda
1. Introduction
2. Context
3. Measurement, evaluation, consensus building
4. Rebuild & release
5. Closing
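Stakeholder interviews (problem / solution)
Analysts: "It has stopped!" / "We need it now! Give us a stopgap!"
Developers: "Is it really that bad?" / "We should rebuild it!"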
Measure (problem / solution)
Analysts: "It has stopped!" / "We need it now! Give us a stopgap!"
Developers: "Is it really that bad?" / "We should rebuild it!"
Built a BQ update-delay bot. Everyone involved: "This is worse than we expected."
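BQ update checker / implementation
- Runs every hour
- Reads the settings with pandas.read_csv(): check time, target table, notification channel
- SELECTs dataset.__TABLES__ with pandas.read_gbq() to get each table name and its last update time
- Decides whether each table has been updated
- Notifies the specified channel with slackweb.Slack().notify()
- Saves snapshots of the metadata so that it accumulates as panel data that can be analyzed
Icon: https://www.flaticon.com/free-icon/csv_…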
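The checker described above could be sketched roughly as follows. This is a minimal sketch under assumptions, not the code from the talk: the function names, the CSV columns (`table_id`, `max_age_hours`) and the message format are hypothetical. Only `pandas.read_csv()`, `pandas.read_gbq()`, `dataset.__TABLES__` and `slackweb.Slack().notify()` come from the slide. The staleness rule is kept in a pure function so it can be checked without BigQuery or Slack.

```python
from datetime import datetime, timedelta, timezone


def find_stale_tables(meta_rows, max_age_hours, now):
    """Return (table_id, last_modified) pairs older than their allowed age.

    meta_rows: dicts with "table_id" and "last_modified_time" in epoch
               milliseconds, the unit used by BigQuery's __TABLES__ view.
    max_age_hours: {table_id: hours}; tables not listed are skipped
                   (hypothetical config shape).
    """
    stale = []
    for row in meta_rows:
        limit = max_age_hours.get(row["table_id"])
        if limit is None:
            continue
        modified = datetime.fromtimestamp(
            float(row["last_modified_time"]) / 1000, tz=timezone.utc
        )
        if now - modified > timedelta(hours=float(limit)):
            stale.append((row["table_id"], modified))
    return stale


def run_check(dataset, config_csv, webhook_url):
    """Hourly entry point; needs pandas (with pandas-gbq) and slackweb."""
    import pandas as pd
    import slackweb

    # Assumed config columns: table_id, max_age_hours.
    config = pd.read_csv(config_csv)
    # Depending on the environment, read_gbq may also need project_id.
    meta = pd.read_gbq(
        f"SELECT table_id, last_modified_time FROM `{dataset}.__TABLES__`"
    )
    stale = find_stale_tables(
        meta.to_dict("records"),
        dict(zip(config["table_id"], config["max_age_hours"])),
        datetime.now(timezone.utc),
    )
    if stale:
        lines = [f"{t}: last updated {m:%Y-%m-%d %H:%M} UTC" for t, m in stale]
        slackweb.Slack(url=webhook_url).notify(
            text="Stale BQ tables\n" + "\n".join(lines)
        )
```

Keeping the rule separate from the I/O also makes it easy to replay saved metadata snapshots through the same function later.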
BQ update checker / design: http://yuzutas0.hatenablog.com/entry/2017/05/23/073000 (BigQuery)
BQ update checker / docs for user (1)
BQ update checker / docs for user (2)
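Visualize → build consensus (problem / solution)
Analysts: "It has stopped!" / "We need it now! Give us a stopgap!"
Developers: "Is it really that bad?" / "We should rebuild it!"
Outcome: raise the priority and deal with it!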
Keep it alive (problem / solution)
Analysts: "It has stopped!" / "We need it now! Give us a stopgap!"
Developers: "Is it really that bad?" / "We should rebuild it!"
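Stopgap
Together with the analysts, we tried "just retry for now". The retry dragged down the syncs of tables that had not been delayed, and everything went down (a secondary disaster). It made visible that the situation was not as simple as users had assumed.
Photo: https://www.pexels.com/photo/brown-and-white-tabby-kitten…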
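Groping in the dark
We had temporary US permissions issued and started investigating. The admin UI was too heavy to open, so the first step was to be taught the trick for opening it:
http://{ip_or_domain}/admin/airflow/tree?dag_id={id}&num_runs=1
Photo: https://www.pexels.com/photo/grey-concrete-road…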
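Findings
- Timeouts occurred frequently as the data grew
- All jobs ran serially, so a stalled job dragged down the downstream processing (a design meant to limit the JDBC → DB access load)
- The US team used the same mechanism but had worked out how to split its jobs
- JP had not gotten that far (piggybacking plus spare-time goodwill support has its limits)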
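Tune it?
The textbook approach is to tune it the same way the US team did. (Don't run away to an easy rebuild!)
However:
- We would have to start by catching up on how the system works
- We would have to work while considering the impact on an existing system that was already erroring under load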
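Rebuild it?
A proposal from the Merpay Dataplatform Team: "We built something like this. If you like it, would you roll it out on your side too?"
Reference: メルペイにおける大規模バッチ処理 (Large-scale batch processing at Merpay), Mercari Engineering Blog
https://tech.mercari.com/entry/2019/06/05/120000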
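Comparison (system / support)
US ETL System (Airflow on GKE, Spark early)
- System: should meet the functional requirements if tuned
- Support: there is geographic distance and a time difference; asynchronous consultation is possible
Merpay Batch Pipeline (Cloud Composer, Dataflow lately)
- System: meets the functional requirements; being fully managed, it should be relatively easy to handle
- Support: the office is geographically close; easy to consult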
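Key points of the decision
Clearly, the issue that truly needs solving is not "ETL system design" but "the long-term absence of a dedicated JP maintainer" and "the organizational dynamics that led to it".
"Data delivery has stopped" is only the tip of the iceberg.
So that the problem takes up as little mindshare as possible, "how to deal with the technical side with the least effort" became the axis of the decision.
Illustration: https://www.irasutoya.com/ (blog post)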
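Rebuild!
We judged that rebuilding and then switching users over would finish sooner. (We ran away to the easy rebuild!)
Incidentally, the punchline:
(1) Merpay's pipeline assumes a fully GCP-based setup, so it could not be rolled out as-is.
(2) The US team, for its part, saw the visualized delays and kindly fixed the JP jobs.
Photo: https://www.pexels.com/photo/architecture-blur-building-colourful-392031/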
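Visualize → build consensus (problem / solution)
Analysts: "It has stopped!" / "We need it now! Give us a stopgap!"
Developers: "Is it really that bad?" / "We should rebuild it!"
Outcome: focus. Do not spend time or effort on stopgaps.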
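Only possible thanks to the analysts' cooperation
- Making do with estimates from alternative tables
- Using scripts to reach data that is not in BQ
- Actively sharing insights and tools with each other
It is impressive how they get their work done without over-relying on an unstable system! (There is more than one means or route to reach a goal.)
Photo: https://www.pexels.com/photo/group-hand-fist-bump-1068523/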
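Consensus-building summary (problem / solution)
Analysts: "It has stopped!" / "We need it now! Give us a stopgap!"
Developers: "Is it really that bad?" / "We should rebuild it!"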
Agenda
1. Introduction
2. Context
3. Measurement, evaluation, consensus building
4. Rebuild & release
5. Closing
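System architecture (diagram): Replica DB
System architecture (diagram): Replica DB. This part was nicely taken care of by @siroken3.
System architecture (diagram): Replica DB. This is the part covered in this talk.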
Cloud Composer: DAG Runs
(1) Validation
(2) Run Dataflow
(3) Fetch the GCS files
(4) BQ Load (differential or full)
Composer → Dataflow: specify a Template (more precisely, one deployed on GCS) and send a run command to Cloud Dataflow.
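Cloud Dataflow: ETL
(1) Read the dump files from GCS
(2) Modify the data with a heavily customized transformation
(3) Write BQ-loadable files to GCS
The transformation was built up by squashing errors found during verification runs.
Note: because of later enhancements, this differs from the current state.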
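Why Dataflow?
- mysqldump's TSV format cannot be loaded into BigQuery as-is, so it needs reformatting:
  - double-quotation-marks escaped by double-quotation-marks in double-quotation-marks
  - new-line escaped by double backslashes
- The data volume is large, so for DB load and performance reasons the processing was moved to Dataflow, which scales well
- Dataflow's role is deliberately limited to that of a converter, so it does not load into BigQuery directly (Dataflow → BigQuery); it places the converted files on GCS
- The runtime is Python 3.5 (supported at Apache Beam 2.11.0 / Mar 5, 2019)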
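The two escaping rules above can be illustrated on the output side with Python's standard `csv` module. This is a minimal sketch under assumptions rather than the transformation used in the talk: the function name and the `newline_token` parameter are hypothetical, "double backslashes" is read here as replacing a line break with two backslashes followed by `n`, and parsing the mysqldump file into field values is taken as already done.

```python
import csv
import io


def to_bq_csv_line(fields, newline_token=r"\\n"):
    """Encode one record as a CSV line with every field in double quotes.

    Embedded double quotes are doubled by the csv module; embedded line
    breaks are replaced by newline_token (assumed reading of the slide).
    """
    cleaned = [
        ""
        if value is None
        else str(value).replace("\r\n", "\n").replace("\n", newline_token)
        for value in fields
    ]
    buf = io.StringIO()
    writer = csv.writer(
        buf, quoting=csv.QUOTE_ALL, doublequote=True, lineterminator="\n"
    )
    writer.writerow(cleaned)
    return buf.getvalue()
```

Because no field contains a raw line break after this step, each output line is exactly one record, which keeps the files easy to split.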
Dataflow Onboard by @rilmayer_jp
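Test Code for Transform
- Data patterns that caused errors during debugging are used as test cases
- Data from tables that caused errors during debugging is used as test data
- The beam module is replaced with MagicMock so that only the logic is tested in code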
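One way to apply the MagicMock approach is sketched below, assuming the transform logic lives in a plain function. The names `clean_record` and `build_transform` and the sample record are hypothetical. The idea is to register a stand-in for `apache_beam` in `sys.modules` before importing it, so the test exercises only the logic and does not need Beam installed.

```python
import sys
from unittest.mock import MagicMock

# Register a stand-in before anything imports Beam, so the import below
# never loads the real package during unit tests.
sys.modules["apache_beam"] = MagicMock()
import apache_beam as beam  # resolves to the MagicMock registered above


def clean_record(record):
    """Pure transform logic (hypothetical): trim strings, keep other values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def build_transform():
    """Pipeline wiring; with the mock in place this only records the call."""
    return beam.Map(clean_record)


def test_clean_record_handles_pattern_seen_while_debugging():
    # A data pattern that (hypothetically) broke a debugging run.
    record = {"name": " foo\t", "price": 100}
    assert clean_record(record) == {"name": "foo", "price": 100}
```

Each data pattern that fails in a debugging run can then be added as one more small test like the last function.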
Composer → BQ: full refresh. GCS → BQ Load.
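Composer → BQ: differential update
- Load the differential data into a tmp table
- Original table + tmp table → UNION ALL → deduplicate → overwrite
- Delete the tmp table
For details, see the following article: 数百GBのデータをMySQLからBigQueryへ同期する (Syncing hundreds of GB of data from MySQL to BigQuery)
https://tech.mercari.com/entry/2018/06/28/100000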
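The UNION ALL → deduplicate step could be expressed as a query like the one built below. This is a sketch under assumptions, not the query from the talk or the linked article: the helper name, the key column `id` and the version column `updated_at` are hypothetical, and the SQL is BigQuery standard SQL.

```python
def build_merge_query(main_table, tmp_table, key="id", version="updated_at"):
    """Return SQL that unions both tables and keeps the newest row per key."""
    return f"""
SELECT * EXCEPT(row_num)
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY {key} ORDER BY {version} DESC) AS row_num
  FROM (
    SELECT * FROM `{main_table}`
    UNION ALL
    SELECT * FROM `{tmp_table}`
  )
)
WHERE row_num = 1
""".strip()
```

Running the query with the original table as the destination and the `WRITE_TRUNCATE` write disposition would correspond to the "overwrite" step; the tmp table can be dropped afterwards.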
Rebuilt BQ / docs for user (1)
Rebuilt BQ / docs for user (2)
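Rebuilt BQ / docs for user (3)
We agreed with data users on "better than the current state (left untouched for months)".
- Do not aim for excessive quality
- State measurement (delay monitoring) and support explicitly
- State explicitly that ambiguous things are ambiguous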
Canary Release: provide it to a few teams first → environment-dependent failures surface → set up detection, firefighting, and response flows.
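Staged releases every two weeks
Sprint + Increment: establish a rhythm of continuous improvement, with an improvement at every step (v… → v… → v…).
- Ops: provide to a few teams first → provide to the next teams → …; usage guide v…, Q&A and feedback at each step
- Data: tables where a full refresh is enough → tables that are painful without differential updates → …
- Data: tables whose CSV files mysqldump splits into …GB or less → tables where BQ Load fails unless Dataflow splits the CSV → …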
Result: the purchase-data sync that used to time out at 7h now succeeds in 2.5h!
(Timeline chart, 01:00 to 09:00: Before ❌, After ✅, with the upstream processing shown.)
Agenda
1. Introduction
2. Context
3. Measurement, evaluation, consensus building
4. Rebuild & release
5. Closing
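What I kept in mind
The organization has many talented players with strong personalities, so rather than pushing through my own style and design philosophy, I
- changed shape flexibly, like a cloud (Cloud)
- kept the whole in view, like a conductor (Composer)
- moved forward while organizing the flow of information (Dataflow)
Truly "Rebuilding Batch ETL with Cloud Composer & Dataflow".
Photo: https://www.pexels.com/photo/hd-457881/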
Special Thanks (account names in team Slack)
[BI / PM] @mattsun, @shoei, @hase-ryo, @hikaru, @nakatomo, @natsume, @igachan-san, @tsudar, @anboo, @hiza
[JP Dev] @siroken3, @shoe116, @ichirin2501, @bokko, @catatsuy, @shinpei
[Merpay Dev] @laughingman7743, @syucream, @cocoiti, @kazegusuri, @sfujjiwara
[US Dev/ML] @hatone, @yu
[JP ML / Search] @furusawa, @tairosan
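Future challenges of Batch ETL in Mercari JP
- Short term: polish the platform into one that actually gets used. Product management and system development, with BI, SRE, and Data Platform.
- Mid term: shift from "destroy and create" to "measure and improve". Service management (ITIL) and data management (DMBOK), with hase-ryo-san.
- Long term: break away from "local optimization". Company-wide data strategy (DataOps), with tairo-san.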
Summary
Sound analysis is built on sound data.
Sound data is built on sound processes and systems.
Let's start with the small step right in front of us and get our data in order!
Promotion: together with plenty of data-use case studies, an introduction to how to hack projects, processes, systems, teams, and culture into a healthy state.
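For the networking time
To everyone on the front lines who is fighting on alone, and to managers who are uneasy about the current state: please come and talk to @yuzutas0.
AsIs → ToBe: I can help you sort out how to make the climb.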
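Caution: build the right thing, and build it right
For example, Cloud Dataflow scales easily, but it costs money. Depending on the size of the business and how the data is used, it may not pay off in ROI terms.
- Aren't there plenty of things to do before building a scalable system?
- Has adopting surface-level technology become the goal in itself?
- Will that data delivery really solve a business problem?
Before jumping into easy system development, please stop and think it over once.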
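Purpose of today's talk (repeated): to present a case study of a "rebuild". It is only one case, so compare it with your own technology stack and organizational situation, and take away your own lessons.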
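This is what I did. What would you do?
Compared with the business, processes, systems, team, and culture you are responsible for: what was the same? What was different? Why do those similarities and differences arise?
Is the current state of your workplace the best it can be? Or does there seem to be room for improvement? Is there anything, however small, that you can change?
If you were to take one action right now, what could you do?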
Thank you for listening.
Photo: https://www.pexels.com/photo/architecture-blur-building-colourful-392031/