Slide 1

Slide 1 text

©2018 Wantedly, Inc.
Trying Out BigQuery ML
Examples of BigQuery ML in Wantedly People
Machine Learning Casual Talks #6, 25 Sep. 2018
Yuya Matsumura (@yu-ya4)

Slide 2

Slide 2 text

Self Introduction
• Yuya Matsumura (松村 優也)
• Wantedly, Inc. (sixth month as a working professional)
• Recommendation Engineer
• From Kyoto
@yu-ya4 @yu__ya4 https://www.wantedly.com/users/2390451

Slide 3

Slide 3 text

What is BigQuery ML??

Slide 4

Slide 4 text

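What is BigQuery ML??
• Announced at Google Cloud Next 2018
• BigQuery + ML (machine learning)
• Two model types are currently supported:
  • Linear regression (estimates a numeric value from data)
  • Binary logistic regression (classifies data as true/false)
• Everything is done inside BigQuery, so it is easy!
  • No need to move data to a local environment or elsewhere
  • All you need is to be able to write SQL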

Slide 5

Slide 5 text

How to use BigQuery ML?

Slide 6

Slide 6 text

Wantedly People
• News feature (Timeline)
• Recommended news is announced by push notification twice a day (morning and noon)

Slide 7

Slide 7 text

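Motivations & Backgrounds
• For users who never open push notifications, receiving them twice a day may be a nuisance
• We want to find the users for whom two push notifications a day is probably fine
• Other tasks have higher priority, so we cannot spend much time on this
• It looks like it can be done quickly, so we try BigQuery ML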

Slide 8

Slide 8 text

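Problem Definition
"Among users active in the past month, predict which users opened a push notification"
• Objective variable: whether the user opened the push notification sent 1 day ago (1 or 0)
• Explanatory variables: counts of various user actions in the period from 28 days ago to 2 days ago
  • Number of push notifications opened
  • Number of news articles viewed
  • Number of business cards scanned
  • …etc.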
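The problem definition above can be sketched as a query that builds one training row per user. This is only an illustration: the table `my_dataset.user_actions`, its columns (`user_id`, `action`, `action_date`), the action names, and the feature column names are all hypothetical, since the slides do not show the real schema. "Active in the past month" is approximated here as "any action in the last 28 days", which is also an assumption.

```ruby
# Hypothetical source table; assumed columns: user_id, action (STRING), action_date (DATE).
ACTIONS_TABLE = "my_dataset.user_actions"

# Feature window from the slide: 28 days ago through 2 days ago.
FEATURE_WINDOW = "(action_date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 28 DAY) " \
                 "AND DATE_SUB(CURRENT_DATE(), INTERVAL 2 DAY))"
# Label day from the slide: 1 day ago.
YESTERDAY = "action_date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)"

# Returns BigQuery Standard SQL text producing a label and action counts per user.
def training_data_sql(table: ACTIONS_TABLE)
  <<~SQL
    SELECT
      user_id,
      -- Objective variable: 1 if a push notification was opened 1 day ago, else 0.
      IF(COUNTIF(action = 'push_open' AND #{YESTERDAY}) > 0, 1, 0) AS label,
      -- Explanatory variables: action counts inside the feature window.
      COUNTIF(action = 'push_open' AND #{FEATURE_WINDOW}) AS push_open_count,
      COUNTIF(action = 'news_view' AND #{FEATURE_WINDOW}) AS news_view_count,
      COUNTIF(action = 'card_scan' AND #{FEATURE_WINDOW}) AS card_scan_count
    FROM `#{table}`
    -- Assumption: any action in the last 28 days marks the user as active.
    WHERE action_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 28 DAY)
    GROUP BY user_id
  SQL
end

puts training_data_sql
```

Keeping the label day (1 day ago) outside the feature window (2 to 28 days ago) means the features never include the outcome being predicted.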

Slide 9

Slide 9 text

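Overview of architecture
(Diagram; labels:)
• BigQuery
• Write a Job that runs the PREDICT query (almost entirely SQL)
• merge
• Register it with the Scheduler
• Run the CREATE MODEL query; the MODEL is stored in BQ
• Run the PREDICT query; the Predict results are stored in BQ
• Read the ids of the target users: SELECT user_id FROM `push_predict_results` WHERE prob > #{PROB_THRESHOLD}
• Send push notifications to the target users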

Slide 10

Slide 10 text

Creating the MODEL
(architecture diagram repeated from Slide 9)

Slide 11

Slide 11 text

CREATE
• model_type: specifies the algorithm (linear regression or logistic regression)
• label: the objective variable (specified as a binary value)
• weekly_open, timeline_show_count, …: explanatory variables
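The query shown on this slide is not included in the transcript, so the following is a minimal sketch of what a `CREATE MODEL` statement with these options can look like, built as a Ruby string. The model and table names are hypothetical; `weekly_open` and `timeline_show_count` are the two feature names visible on the slide, and the elided features are left out.

```ruby
# The two model types BigQuery ML supported at the time of the talk.
SUPPORTED_MODEL_TYPES = %w[linear_reg logistic_reg].freeze

# Returns the SQL text of a CREATE MODEL statement.
# `model` and `training_table` are hypothetical names supplied by the caller.
def create_model_sql(model:, training_table:, model_type: "logistic_reg")
  unless SUPPORTED_MODEL_TYPES.include?(model_type)
    raise ArgumentError, "unsupported model_type: #{model_type}"
  end

  <<~SQL
    CREATE OR REPLACE MODEL `#{model}`
    OPTIONS (model_type = '#{model_type}', input_label_cols = ['label']) AS
    SELECT
      label,               -- objective variable (0 or 1)
      weekly_open,         -- explanatory variable
      timeline_show_count  -- explanatory variable
    FROM `#{training_table}`
  SQL
end

puts create_model_sql(model: "my_dataset.push_open_model",
                      training_table: "my_dataset.push_training_data")
```

Every column in the `SELECT` other than the label column is treated as a feature, so the training query itself is the feature list.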

Slide 12

Slide 12 text

EVALUATE
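The evaluation query on this slide is likewise missing from the transcript. As a hedged sketch, a trained model can be evaluated with `ML.EVALUATE`, which for a logistic regression model returns metrics such as precision, recall, accuracy, f1_score, log_loss, and roc_auc. The model and table names below are hypothetical.

```ruby
# Returns the SQL text of an ML.EVALUATE query.
# `model` and `eval_table` are hypothetical names; the evaluation table is
# assumed to hold the same label and feature columns as the training data.
def evaluate_model_sql(model:, eval_table:)
  <<~SQL
    SELECT *
    FROM ML.EVALUATE(
      MODEL `#{model}`,
      (SELECT label, weekly_open, timeline_show_count FROM `#{eval_table}`)
    )
  SQL
end

puts evaluate_model_sql(model: "my_dataset.push_open_model",
                        eval_table: "my_dataset.push_eval_data")
```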

Slide 13

Slide 13 text

Registering the query with the scheduler
(architecture diagram repeated from Slide 9)

Slide 14

Slide 14 text

BigQuery by Ruby

Slide 15

Slide 15 text

Define Job
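The job definition shown on this slide did not survive in the transcript. The sketch below is therefore not Wantedly's code; it only illustrates how the threshold query from the architecture slide could be wrapped in a small Ruby class. `PushTargetJob`, its `client:` parameter, and the threshold value are hypothetical. The client is any object with a `query(sql)` method that returns rows as hashes keyed by symbol.

```ruby
# Hypothetical job: selects the users whose predicted probability of opening
# a push notification exceeds a threshold.
class PushTargetJob
  def initialize(client:, threshold:)
    @client = client
    # Float() raises on non-numeric input, so only a number reaches the SQL text.
    @threshold = Float(threshold)
  end

  # Query text taken from the architecture slide, with the threshold interpolated.
  def sql
    "SELECT user_id FROM `push_predict_results` WHERE prob > #{@threshold}"
  end

  # Runs the query and returns the matching user ids.
  def run
    @client.query(sql).map { |row| row[:user_id] }
  end
end

# Stand-in client for local experimentation; it returns canned rows.
class FakeClient
  def query(_sql)
    [{ user_id: 101 }, { user_id: 202 }]
  end
end

job = PushTargetJob.new(client: FakeClient.new, threshold: 0.5)
puts job.sql
p job.run
```

Injecting the client keeps the job testable without network access. A real BigQuery client object that exposes the same `query` interface could be passed in instead, though that pairing is an assumption here.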

Slide 16

Slide 16 text

Running PREDICT
(architecture diagram repeated from Slide 9)

Slide 17

Slide 17 text

PREDICT
• predicted_label_probs.prob: the predicted probability for each label
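Since the prediction query itself is not in the transcript, here is a minimal sketch of an `ML.PREDICT` query that pulls the probability of label 1 out of `predicted_label_probs`, an array of (label, prob) structs. The model and table names are hypothetical, and casting the label to STRING before comparing is a defensive choice of this sketch, not something taken from the slides.

```ruby
# Returns the SQL text of an ML.PREDICT query yielding (user_id, prob) rows,
# where prob is the predicted probability that the user opens the push.
# `model` and `feature_table` are hypothetical names supplied by the caller.
def predict_sql(model:, feature_table:)
  <<~SQL
    SELECT
      user_id,
      -- predicted_label_probs holds one (label, prob) struct per label;
      -- keep the probability of label 1 ("opened").
      (SELECT p.prob
       FROM UNNEST(predicted_label_probs) AS p
       WHERE CAST(p.label AS STRING) = '1') AS prob
    FROM ML.PREDICT(
      MODEL `#{model}`,
      (SELECT user_id, weekly_open, timeline_show_count FROM `#{feature_table}`)
    )
  SQL
end

puts predict_sql(model: "my_dataset.push_open_model",
                 feature_table: "my_dataset.push_features_today")
```

Storing this result as `push_predict_results` would provide the `user_id` and `prob` columns that the threshold query on the architecture slide reads.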

Slide 18

Slide 18 text

Overview of architecture
(same diagram as Slide 9)

Slide 19

Slide 19 text

Results
With (roughly) 1 day of implementation, the push notification open rate improved significantly

Slide 20

Slide 20 text

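pros.
• No knowledge of ML frameworks, Python, etc. is required
  • You only have to write SQL, so the implementation cost is low.
  • Even members with little of that knowledge (data analysts, business-side members) can use it
• No need to move data, so models can be developed faster
• Unlimited resources
  • Just pay Google!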

Slide 21

Slide 21 text

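cons.
• Managing models is difficult
  • Saving the query used when the model was created
  • What do you do if a model gets deleted!? (everyone can touch the dataset)
• Updating models and running Predict periodically is awkward
  • Wantedly has a nice scheduler mechanism, but…
• Complex models cannot be developed yet
  • Only two kinds of algorithm are available
  • Model size is limited (… MB)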

Slide 22

Slide 22 text

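Summary
• BigQuery ML has arrived, making ML possible with BigQuery alone
• Handy when you just want to try a simple regression
• Handy for the investigation before starting a full-scale ML project (checking whether the data is usable)
• Cannot yet be used to build complex models
• Managing queries and models takes some ingenuity
• Looking forward to what comes next ✨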