Slide 1

Slide 1 text

This is how we build data pipelines
 Spring [year], @yuku_t, Data Pipeline Casual Talk

Slide 2

Slide 2 text

Yuku Takahashi (髙橋 侑久)
 twitter.com/yuku_t
 github.com/yuku
 qiita.com/yuku_t
 Software Engineer @ FLYWHEEL
 At FLYWHEEL I develop a recommendation engine and the pipelines that feed it.
 In my current role since [year/month].

Slide 3

Slide 3 text

Today's talk
• What is a data pipeline, anyway?
• Impressions after [N] months of building pipelines
• Appendix: Cloud Composer tips that aren't in the documentation

Slide 4

Slide 4 text

[Diagram labels: DATA, DATA PIPELINE, SOLUTION]

Slide 5

Slide 5 text

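• Complex, fragmented data that has built up through all kinds of historical circumstances
• Logs that may or may not exist
• Unstable storage
[Diagram labels: DATA, DATA PIPELINE]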

Slide 6

Slide 6 text

• Some amazing AI that solves all sorts of problems in just the right way
[Diagram labels: SOLUTION, DATA PIPELINE]

Slide 7

Slide 7 text

THE DATA SCIENCE HIERARCHY OF NEEDS
COLLECT → MOVE/STORE → EXPLORE/TRANSFORM → AGGREGATE/LABEL → LEARN/OPTIMIZE → AI, DEEP LEARNING
Painful (COLLECT end) ... Fun (AI end)
https://hackernoon.com/the-ai-hierarchy-of-needs-...

Slide 8

Slide 8 text

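CLIENT
• Wants to make use of the large volume of data lying dormant in-house
• Has no one who can turn ideas into reality
SOLUTION
• Recommender systems
• Logistics optimization
• Search engines
• Ad delivery systems
• etc.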

Slide 9

Slide 9 text

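FLYWHEEL ≈ data analytics SaaS
• Covers everything from collecting the data to building the solution
• We deploy software directly into the client's AWS account or GCP project.
• Some companies and industries cannot move their data outside, even when they know the functionality they want is available there (mostly very large companies).
• How to build up software assets in a multi-cloud environment and connect them with pipelines is one of the big technical challenges ahead.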

Slide 10

Slide 10 text

Multi-cloud pipelines!!!

Slide 11

Slide 11 text

Multi-cloud pipelines!!!
Breaking away from shell scripts

Slide 12

Slide 12 text

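Pipelines built from shell scripts
• Everyone used shell scripts in the old days
• A collection of Python scripts launched from shell scripts that cron runs on a schedule
• Pain points
  • Can't keep up with pipelines as they grow more complex
  • No way to tell which task failed, or when
  • Failed tasks are re-run by SSHing in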

Slide 13

Slide 13 text

Recommendation pipeline (shell-script version)
[Diagram labels: CLIENT DATA, JOINED DATA, RECOMMENDER SYSTEM]

Slide 14

Slide 14 text

Recommendation pipeline (Cloud Composer version)
[Diagram labels: CLIENT DATA, DATA LAKE, DATA WAREHOUSE, DATA MART, DATA SCIENTISTS, RECOMMENDER SYSTEM]

Slide 15

Slide 15 text

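Basic policy
• "Classification of data infrastructure and evolutionary data modeling"
  • DATA LAKE → DATA WAREHOUSE → DATA MART
• Data is stored in BigQuery, and SQL is used wherever possible. Only the parts that are hard to do in SQL are implemented with Cloud Dataproc (Spark).
• Workflows are managed with Cloud Composer (Airflow).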

Slide 16

Slide 16 text

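The three layers in BigQuery
• DATA LAKE
  • Stores the raw data provided by the client
• DATA WAREHOUSE
  • Reassembles and denormalizes fragmented data, makes naming conventions consistent, removes NULLs so the data is easier to use, and so on
  • Sometimes also provided to the client's data analysis team
• DATA MART
  • Referenced from BI tools such as Cloud Datalab
  • For example, measuring the effect of the recommender systems we provide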

Slide 17

Slide 17 text

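Impressions from replacing the shell scripts
• However painful Airflow gets, remembering the old days keeps me going
• Happy that a web UI comes with it
• Now that tasks are easy to re-run, I have started paying attention to task idempotency
• (Maybe we need to start consulting from the point where the data gets stored...?)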
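
One way to read the idempotency point: a task that overwrites the output for its execution date can be re-run safely, while a task that appends cannot. The following is a minimal sketch under that reading, not the pipeline from the talk. The in-memory dict stands in for a date-partitioned table, and `write_partition`, `store`, and the sample row are hypothetical names introduced only for illustration.

```python
from typing import Dict, List


def write_partition(store: Dict[str, List[dict]], ds: str, rows: List[dict]) -> None:
    """Replace the whole partition for execution date `ds` (overwrite, never append)."""
    store[ds] = list(rows)


# Hypothetical in-memory stand-in for a date-partitioned table.
store: Dict[str, List[dict]] = {}
rows = [{"user_id": 1, "item_id": 10}]

write_partition(store, "2019-01-01", rows)
write_partition(store, "2019-01-01", rows)  # re-run after a failure: same final state
```

Because the second call replaces the partition instead of adding to it, re-running the task from the web UI leaves the stored data unchanged.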

Slide 18

Slide 18 text

Appendix: Cloud Composer know-how
 that isn't in the documentation

Slide 19

Slide 19 text

You need a fair amount of GKE/k8s skill
• Cloud Composer is a fully managed Airflow service deployed on top of GKE.
  When something goes wrong you have to connect to Airflow through GKE to debug it,
  so it is tough going without at least basic knowledge of GKE and k8s.
• For a start, set up kubectl.

    GKE_CLUSTER="$(gcloud composer environments describe $COMPOSER_NAME \
      --format='get(config.gkeCluster)')"
    GKE_LOCATION="$(gcloud composer environments describe $COMPOSER_NAME \
      --format='get(config.nodeConfig.location)')"
    gcloud container clusters get-credentials $GKE_CLUSTER \
      --zone $GKE_LOCATION

Slide 20

Slide 20 text

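It dies silently when it runs out of memory
• When a task fails without emitting any logs, the whole airflow-worker may have been killed by k8s for lack of memory.
• You can check the cause of death by selecting an Evicted Pod under Kubernetes Engine > Workloads > airflow-worker.
• On n1-standard-[N] nodes, even a PythonOperator that only calls an API is occasionally killed.
• Periodically garbage-collect with a command that sweeps away the Evicted Pods.

    kubectl get pods -l run=airflow-worker \
      | grep Evicted \
      | awk '{print $1}' \
      | xargs kubectl delete pod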
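
The grep in the command above matches the word "Evicted" anywhere on a line. As a hedged alternative, the selection step could check the STATUS column only. This is a minimal sketch that assumes the default plain-text table printed by `kubectl get pods` (columns NAME, READY, STATUS, RESTARTS, AGE); `evicted_pod_names` and the sample output are hypothetical and exist only for illustration.

```python
from typing import List


def evicted_pod_names(kubectl_output: str) -> List[str]:
    """Return names of pods whose STATUS column is exactly "Evicted".

    Expects the plain-text table printed by `kubectl get pods`.
    """
    names = []
    for line in kubectl_output.splitlines()[1:]:  # skip the header row
        columns = line.split()
        # Assumed column order: NAME, READY, STATUS, RESTARTS, AGE.
        if len(columns) >= 3 and columns[2] == "Evicted":
            names.append(columns[0])
    return names


# Hypothetical sample output, for illustration only.
sample = """\
NAME                    READY   STATUS    RESTARTS   AGE
airflow-worker-aaa      2/2     Running   0          3d
airflow-worker-bbb      0/2     Evicted   0          1d
airflow-worker-ccc      0/2     Evicted   0          5h
"""
print(evicted_pod_names(sample))
```

The returned names can then be passed to `kubectl delete pod`, as in the shell pipeline.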

Slide 21

Slide 21 text

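The Python packages are slightly out of date
• Cloud Composer's airflow-worker comes with a number of Python packages installed, but the documentation does not list their versions. On top of that, they are slightly old.
• The quickest way is to start Python inside the Pod and check directly.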
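
A minimal sketch of that check, meant to be run in a Python session inside the worker Pod. It assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library; on older interpreters `pip freeze` gives similar information. `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata  # standard library from Python 3.8
from typing import Dict


def installed_versions() -> Dict[str, str]:
    """Map each installed distribution name to its version string."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        version = dist.version
        if name and version:  # skip entries with incomplete metadata
            versions[name] = version
    return versions


versions = installed_versions()
for name in sorted(versions, key=str.lower):
    print(f"{name}=={versions[name]}")
```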

Slide 22

Slide 22 text

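You can deploy with gsutil
• Airflow detects DAG definition files by polling local files. Cloud Composer makes this work by mounting a GCS bucket with GCSFUSE.
• All you have to do is put files on GCS, so you don't need to use the gcloud SDK's storage dags import command.
• Easy automated deployment from CircleCI with gsutil rsync.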
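
The gsutil rsync deployment could be wrapped roughly as follows. This is a sketch, not the CircleCI configuration from the talk: the local directory name and bucket path are hypothetical, and `build_rsync_command` and `deploy` are helper names introduced here. It deliberately leaves out gsutil's `-d` flag, which would also delete remote files that are missing locally.

```python
import subprocess
from typing import List


def build_rsync_command(local_dags_dir: str, dag_gcs_prefix: str) -> List[str]:
    """Build a gsutil command that copies a local DAG directory to GCS.

    -m runs transfers in parallel and -r recurses into subdirectories.
    """
    return ["gsutil", "-m", "rsync", "-r", local_dags_dir, dag_gcs_prefix]


def deploy(local_dags_dir: str, dag_gcs_prefix: str) -> None:
    """Run the sync; raises CalledProcessError if gsutil exits non-zero."""
    subprocess.run(build_rsync_command(local_dags_dir, dag_gcs_prefix), check=True)


# Hypothetical paths, for illustration only.
command = build_rsync_command("dags", "gs://example-composer-bucket/dags")
print(" ".join(command))
```

Keeping the command construction separate from `deploy` makes it possible to check the command in CI without touching the bucket.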

Slide 23

Slide 23 text

"JSqPX8FCϖʔδ΁ͷඈͼํ w ϒοΫϚʔΫͰ΋͍͍͕$PNQPTFS&OWJSPONFOUΛ࡞Γ௚͢ͱ63*͕มΘͬͯ͠·͏ɻ ͔ͱ͍ͬͯ$MPVE$POTPMFʹ౎౓ΞΫηε͢Δͷ͸໘౗ɻ w HDMPVE4%,Ͱ63*ΛऔಘͰ͖ΔͷͰɺͦΕΛPQFOίϚϯυʹ౉ͤ͹͍͍ɻ gcloud composer environments describe $COMPOSER_NAME \ --format='get(config.airflowUri)’

Slide 24

Slide 24 text

"JSqPXͷόʔδϣϯΞοϓ͕ਏ͍ w ʹϕʔλػೳͱͯ͠ఏڙ։࢝͞Ε͕ͨɺυΩϡϝϯτͷ௨Γಈ͔ͳ͍ɻ w Ҏલ͸HJUIVCDPN(PPHMF$MPVE1MBUGPSNQZUIPOEPDTTBNQMFTʹೖ͍ͬͯΔ DPQZ@FOWJSPONFOUQZͱ͍͏εΫϦϓτΛ࢖͏Α͏ʹͳ͍ͬͯͨɻ w DPQZ@FOWJSPONFOUQZΛಡΊ͹ҰԠԿΛͲ͜ʹίϐʔ͢Ε͹͍͍ͷ͔෼͔Δ͕ɺ
 "JSqPX͚ͩͰͳ͘LTྗ΋ͳ͍ͱ͔ͳΓݫ͍͠ɻ

Slide 25

Slide 25 text

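Can't connect to Cloud Memorystore
• To connect to Memorystore (Redis) from GKE, IP aliases must be enabled when the cluster is created, but Cloud Composer creates the cluster with them disabled.
  When they are disabled, iptables rules have to be added to the cluster.
• I didn't want to touch the k8s cluster managed by Cloud Composer more than necessary, so I solved it by setting up a GCE instance as a bastion.

    from redis import StrictRedis
    from sshtunnel import SSHTunnelForwarder

    # Forward a local port to Redis through the bastion instance.
    with SSHTunnelForwarder((bastion_host, bastion_port),
                            ssh_username="airflow",
                            remote_bind_address=(redis_host, redis_port),
                            local_bind_address=("127.0.0.1", local_port),
                            allow_agent=False):
        # Connect to the local end of the tunnel.
        client = StrictRedis(host="127.0.0.1", port=local_port)
        client.ping()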

Slide 26

Slide 26 text

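template_ext
• An outrageously convenient feature: a string that ends with one of the specified values is interpreted as a file path, and is replaced with the result of rendering the contents of that file with Jinja2.
• It is undocumented, but quietly implemented in BaseOperator.

    # Without template_ext: read the file yourself.
    with open("foo/bar.sql") as f:
        sql = f.read()
    PythonOperator(
        templates_dict={"sql": sql},
        # ...
    )

    # With template_ext: pass the path and let Airflow load and render it.
    class SQLTemplateOperator(PythonOperator):
        template_ext = (".sql",)

    SQLTemplateOperator(
        templates_dict={"sql": "foo/bar.sql"},
        # ...
    )

https://stackoverflow.com/a/...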
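
The mechanism described above can be illustrated without Airflow. This is a simplified sketch, not Airflow's implementation: `render` is a tiny stand-in for Jinja2 that only substitutes `{{ name }}` placeholders, `resolve` opens the path directly instead of going through a template search path, and the SQL file and context values are hypothetical.

```python
import os
import re
import tempfile
from typing import Dict, Tuple


def render(template: str, context: Dict[str, str]) -> str:
    """Tiny stand-in for Jinja2: substitutes only `{{ name }}` placeholders."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(context[m.group(1)]), template)


def resolve(value: str, template_ext: Tuple[str, ...], context: Dict[str, str]) -> str:
    """If `value` ends with a listed extension, load that file first, then render."""
    if value.endswith(template_ext):
        with open(value) as f:
            value = f.read()
    return render(value, context)


# Hypothetical SQL file and context, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bar.sql")
    with open(path, "w") as f:
        f.write("SELECT * FROM events WHERE dt = '{{ ds }}'")
    sql = resolve(path, (".sql",), {"ds": "2019-01-01"})

print(sql)
```

A string that does not end with a listed extension is rendered as-is, which is why the same field can hold either inline SQL or a path.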

Slide 27

Slide 27 text

https://www.flywheel.jp/careers
We are hiring
It's fun!