A Story of Trying Out BigQuery ML

Yuya Matsumura
September 25, 2018

Slides from my talk at Machine Learning Casual Talks #6
(https://mlct.connpass.com/event/94911/).
I explained BigQuery ML and presented a case where we introduced it into a production service.

Transcript

 1. ©2018 Wantedly, Inc. A Story of Trying Out BigQuery ML: Examples of BigQuery ML in Wantedly People
  Machine Learning Casual Talks #6, 25.Sep.2018, Yuya Matsumura, @yu-ya4
 2. Self Introduction
  • Yuya Matsumura (松村 優也)
  • Wantedly, Inc. (sixth month as a working professional)
  • Recommendation Engineer
  • From Kyoto
  @yu-ya4 @yu__ya4 https://www.wantedly.com/users/2390451
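 3. What is BigQuery ML??
  • Announced at Google Cloud Next 2018
  • BigQuery + ML (machine learning)
  • Two model types are currently supported:
    • Linear regression (estimates a numeric value from data)
    • Binary logistic regression (decides true/false from data)
  • Easy, because everything is done inside BigQuery!
    • No need to move data to a local environment or elsewhere
    • All you need to be able to write is SQL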
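 4. Motivations & Backgrounds
  • For users who do not open push notifications, receiving two a day may be a nuisance
  • We want to find the users who seem fine with receiving push notifications twice a day
  • There are other high-priority tasks, so we cannot spend much time on this
  • It looks quick to do, so we try BigQuery ML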
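 5. Problem Definition
  "From the users active in the past month, predict the users who opened a push notification"
  • Objective variable: whether the user opened the push notification of 1 day ago (1 or 0)
  • Explanatory variables: counts of various user actions in the period from 2 days ago to 28 days ago
    • Number of push notifications opened
    • Number of news articles viewed
    • Number of business cards scanned
    • …etc.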
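The time windows in this problem definition can be sketched as below. This is only an illustration: the function name `training_windows` and the choice to treat both ends of the feature window as inclusive are assumptions, not something stated on the slide.

```python
from datetime import date, timedelta


def training_windows(today):
    """Return the date windows implied by the problem definition.

    Assumption: both ends of the feature window are inclusive.
    """
    return {
        # Explanatory variables: actions from 28 days ago ...
        "feature_start": today - timedelta(days=28),
        # ... up to 2 days ago.
        "feature_end": today - timedelta(days=2),
        # Objective variable: did the user open yesterday's push (1 or 0)?
        "label_day": today - timedelta(days=1),
    }


windows = training_windows(date(2018, 9, 25))
print(windows["feature_start"], windows["feature_end"], windows["label_day"])
```

Keeping the label day strictly after the feature window means the model never sees the behaviour it is asked to predict.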
 6. Overview of architecture
  • BigQuery: run the CREATE MODEL query; the MODEL is stored in BQ
  • Write a Job that runs the PREDICT query (almost all SQL)
  • merge: the Job is registered with the Scheduler
  • The PREDICT query is executed; the Predict results are stored in BQ
  • Read the ids of the matching users:
    SELECT user_id FROM `push_predict_results` WHERE prob > #{PROB_THRESHOLD}
  • Send push notifications to the matching users
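The `#{PROB_THRESHOLD}` placeholder in the query above is filled in before the query runs. A minimal sketch of that step in Python follows; the function name `target_users_sql` and the range check are assumptions added for illustration, while the table and column names come from the slide.

```python
def target_users_sql(prob_threshold, table="push_predict_results"):
    """Render the query that selects users to notify.

    Hypothetical helper: only the table name, the `prob` column, and the
    shape of the query are taken from the slide.
    """
    threshold = float(prob_threshold)
    # A probability threshold outside [0, 1] is almost certainly a mistake.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("prob_threshold must be between 0 and 1")
    return f"SELECT user_id FROM `{table}` WHERE prob > {threshold}"


print(target_users_sql(0.7))
```

Raising the threshold sends the second daily notification to fewer, more confident candidates; lowering it reaches more users at the cost of more unopened notifications.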
 7. Creating the MODEL
  (same architecture diagram as slide 6)
 8. CREATE
  • model_type: specifies the algorithm (linear regression or logistic regression)
  • label: the objective variable (given as a binary value)
  • weekly_open, timeline_show_count…: the explanatory variables
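The query on this slide is not preserved in the transcript, so the following is a sketch of what a CREATE MODEL statement with these options might look like, rendered from Python. The model name `my_dataset.push_open_model` and the table name `my_dataset.push_training` are hypothetical; `weekly_open` and `timeline_show_count` are the column names mentioned on the slide.

```python
SUPPORTED_MODEL_TYPES = ("linear_reg", "logistic_reg")


def create_model_sql(model_name, training_table, feature_columns,
                     model_type="logistic_reg"):
    """Render a BigQuery ML CREATE MODEL statement.

    Assumes the training table has a column named `label` holding the
    objective variable (1 or 0) next to the feature columns.
    """
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"unsupported model_type: {model_type}")
    if not feature_columns:
        raise ValueError("at least one feature column is required")
    columns = ",\n  ".join(["label"] + list(feature_columns))
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        f"OPTIONS(model_type='{model_type}', input_label_cols=['label']) AS\n"
        f"SELECT\n  {columns}\nFROM `{training_table}`"
    )


print(create_model_sql(
    "my_dataset.push_open_model",      # hypothetical model name
    "my_dataset.push_training",        # hypothetical training table
    ["weekly_open", "timeline_show_count"],
))
```

Everything after `AS` is an ordinary SELECT, which is why the training data never has to leave BigQuery.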
 9. Registering the query with the scheduler
  (same architecture diagram as slide 6)
 10. Running PREDICT
  (same architecture diagram as slide 6)
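The PREDICT query itself is not preserved in the transcript either. As a rough sketch, a query using BigQuery ML's `ML.PREDICT` could be rendered as below. It assumes a logistic regression model whose label column is named `label` with values 1 and 0, so that the per-class probabilities appear in `predicted_label_probs`; the model and table names are hypothetical. The job's output would then be written to a table such as `push_predict_results`.

```python
def predict_sql(model_name, input_table, feature_columns):
    """Render an ML.PREDICT query returning, per user, the probability
    that the label is 1 (the push notification is opened).

    Assumes the label column of the model is named `label`.
    """
    if not feature_columns:
        raise ValueError("at least one feature column is required")
    columns = ", ".join(["user_id"] + list(feature_columns))
    return (
        "SELECT user_id, p.prob AS prob\n"
        f"FROM ML.PREDICT(MODEL `{model_name}`,\n"
        f"  (SELECT {columns} FROM `{input_table}`)),\n"
        "  UNNEST(predicted_label_probs) AS p\n"
        "WHERE p.label = 1"
    )


print(predict_sql(
    "my_dataset.push_open_model",      # hypothetical model name
    "my_dataset.push_features_today",  # hypothetical feature table
    ["weekly_open", "timeline_show_count"],
))
```

`user_id` is not a feature here; it is passed through so that each probability can be tied back to a user.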
 11. Overview of architecture
  (same architecture diagram as slide 6)
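 12. pros.
  • No knowledge of ML frameworks, Python, etc. is required
  • You only write SQL, so the implementation cost is low
  • Data analysts and business-side members with little of that knowledge can use it too
  • No need to move data, so models are developed faster
  • Unlimited resources
  • Just pay Google!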
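 13. cons.
  • Managing models is hard
  • How to keep the query used when the model was created
  • What if the model gets deleted!? (Everyone can touch the dataset)
  • Updating models and running Predict periodically is awkward
  • At Wantedly we have a nice scheduler mechanism, but…
  • Complex models cannot be developed yet
  • Only 2 kinds of algorithm
  • Model size is limited (… MB)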
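 14. Summary
  • BigQuery ML has arrived: ML with nothing but BigQuery
  • Well suited to trying out a simple regression for a start
  • Well suited to investigation before starting a full-scale ML project (checking whether the data is usable)
  • Cannot yet be used to build complex models
  • Managing queries and models takes some ingenuity
  • Looking forward to what comes next ✨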