Trying Out BigQuery ML

Yuya Matsumura
September 25, 2018

Slides from my talk at Machine Learning Casual Talks #6
(https://mlct.connpass.com/event/94911/).
I gave an overview of BigQuery ML and introduced a case where we deployed it in a production service.


Transcript

  1. ©2018 Wantedly, Inc. Trying Out BigQuery ML: Examples of BigQuery ML in Wantedly People
    Machine Learning Casual Talks #6, 25.Sep.2018 - Yuya Matsumura - @yu-ya4
  2. Self Introduction
    • Yuya Matsumura (松村 優也)
    • Wantedly, Inc. (sixth month as a working professional)
    • Recommendation Engineer
    • From Kyoto
    @yu-ya4 @yu__ya4 https://www.wantedly.com/users/2390451
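  3. What is BigQuery ML??
    • Announced at Google Cloud Next 2018
    • BigQuery + ML (machine learning)
    • Two model types are currently supported:
      • Linear regression (estimates a numeric value from data)
      • Binary logistic regression (decides true/false from data)
    • Easy to use, because everything is done within BigQuery!
      • No need to move data to a local environment or elsewhere
      • All you need to write is SQL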
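  4. Motivations & Backgrounds
    • For users who do not open push notifications, receiving them twice a day may be a nuisance
    • We want to find the users who would be fine with two push notifications a day
    • Other higher-priority tasks mean we cannot spend much time on this
    • It looked quick to do, so we tried BigQuery ML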
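  5. Problem Definition
    "From the active users of the past month, predict the users who opened a push notification"
    Target variable: whether the user opened the push notification sent 1 day ago (1 or 0)
    Explanatory variables: counts of various user actions over the period from 28 days ago to 2 days ago
    • Number of push notifications opened
    • Number of news articles viewed
    • Number of business cards scanned
    …etc.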
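As a rough illustration of the windows in this problem definition, the sketch below derives one training example from an event log. The event tuple layout, the action names (`push_open`, `news_view`, `card_scan`) and the function `build_example` are hypothetical; the deck does not show how the features were actually computed (most likely in SQL).

```python
from collections import Counter
from datetime import date, timedelta


def build_example(events, user_id, today):
    """Return (features, label) for one user.

    events: iterable of (user_id, action, day) tuples (hypothetical layout).
    Features count actions from 28 days ago through 2 days ago;
    the label is 1 if the user opened a push notification 1 day ago.
    """
    label_day = today - timedelta(days=1)
    window_start = today - timedelta(days=28)
    window_end = today - timedelta(days=2)

    features = Counter()
    label = 0
    for uid, action, day in events:
        if uid != user_id:
            continue
        if window_start <= day <= window_end:
            features[action] += 1
        elif day == label_day and action == "push_open":
            label = 1
    return dict(features), label


today = date(2018, 9, 25)
events = [
    ("u1", "push_open", today - timedelta(days=1)),   # label day
    ("u1", "push_open", today - timedelta(days=3)),   # inside feature window
    ("u1", "news_view", today - timedelta(days=10)),  # inside feature window
    ("u1", "news_view", today - timedelta(days=40)),  # too old, ignored
    ("u2", "card_scan", today - timedelta(days=5)),
]
features, label = build_example(events, "u1", today)
```

Keeping the label day outside the feature window means the model is never trained on information from the day it is asked to predict.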
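  6. Overview of architecture
    [Diagram; components: BigQuery, merge, Scheduler. Labels:]
    • Write a Job that runs the PREDICT query (almost entirely SQL)
    • Register it with the Scheduler
    • Run the CREATE MODEL query
    • Store the MODEL in BQ
    • Run the PREDICT query
    • Store the Predict results in BQ
    • Load the ids of the matching users:
      SELECT user_id FROM `push_predict_results` WHERE prob > #{PROB_THRESHOLD}
    • Send push notifications to the matching users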
  7. Creating the MODEL
    [Architecture diagram repeated from slide 6]
  8. CREATE
    • model_type: specifies the algorithm (linear regression or logistic regression)
    • label: the target variable (specified as a binary value)
    • weekly_open, timeline_show_count…: explanatory variables
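The query on this slide did not survive in the transcript, so the following is only a sketch of what a `CREATE MODEL` statement with these parts might look like, built as a string in Python. The model and table names and the helper `build_create_model_sql` are assumptions; `model_type`, `label`, `weekly_open` and `timeline_show_count` come from the slide.

```python
def build_create_model_sql(model, training_table, feature_columns,
                           model_type="logistic_reg"):
    """Build a BigQuery ML CREATE MODEL statement (illustrative sketch).

    The training table is assumed to have a column named `label` holding
    the target variable, plus one column per explanatory variable.
    """
    if model_type not in ("linear_reg", "logistic_reg"):
        raise ValueError(f"unsupported model_type: {model_type}")
    columns = ",\n  ".join(["label"] + list(feature_columns))
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['label']) AS\n"
        f"SELECT\n  {columns}\n"
        f"FROM `{training_table}`"
    )


# Hypothetical dataset, model and table names.
create_model_sql = build_create_model_sql(
    "my_dataset.push_open_model",
    "my_dataset.push_training_data",
    ["weekly_open", "timeline_show_count"],
)
print(create_model_sql)
```

Every column selected besides `label` is treated as an explanatory variable, so identifiers such as `user_id` are deliberately left out of the training query here.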
  9. Registering the query with the scheduler
    [Architecture diagram repeated from slide 6]
  10. Running PREDICT
    [Architecture diagram repeated from slide 6]
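The PREDICT step and the user-selection query from the architecture diagram could be sketched roughly as below. The table `push_predict_results`, the column `prob` and the threshold come from the slides; the model and feature-table names, both helper functions, and the assumption that the label column is named `label` with values 1/0 are illustrative and not taken from the deck.

```python
def build_predict_sql(model, features_table):
    """Build an ML.PREDICT query that yields (user_id, prob) rows.

    Assumes a logistic regression model trained on a column named `label`
    with values 1/0, so ML.PREDICT returns `predicted_label_probs`, an
    array of (label, prob) structs. `prob` here is the probability of label 1.
    """
    return (
        "SELECT\n"
        "  user_id,\n"
        "  (SELECT p.prob FROM UNNEST(predicted_label_probs) AS p"
        " WHERE p.label = 1) AS prob\n"
        f"FROM ML.PREDICT(MODEL `{model}`, (SELECT * FROM `{features_table}`))"
    )


def build_send_target_sql(prob_threshold):
    """Build the selection query from the slide for a given threshold."""
    threshold = float(prob_threshold)  # rejects non-numeric input
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("prob_threshold must be between 0 and 1")
    return f"SELECT user_id FROM `push_predict_results` WHERE prob > {threshold}"


# Hypothetical model and table names.
predict_sql = build_predict_sql(
    "my_dataset.push_open_model", "my_dataset.push_predict_features"
)
target_sql = build_send_target_sql(0.5)
```

In this sketch the output of the first query would be stored as `push_predict_results`, and the second query then reads the ids of the users to notify.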
  11. Overview of architecture
    [Architecture diagram repeated from slide 6]
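  12. pros.
    • No knowledge of ML frameworks, Python, etc. is required
      • You only need to write SQL, so the implementation cost is low
      • Data analysts and business-side members with little of that knowledge can use it too
    • No need to move data, which speeds up model development
    • Unlimited resources
      • Just pay Google!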
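  13. cons.
    • Models are hard to manage
      • Saving the query used to create the model
      • What if a model gets deleted!? (everyone can touch the dataset)
    • Updating models and running Predict periodically is awkward
      • Wantedly happens to have a nice scheduler mechanism, though…
    • Complex models cannot be developed yet
      • Only two kinds of algorithms
      • Model size is limited (… MB)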
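  14. Summary
    • BigQuery ML has arrived: ML using BigQuery alone
      • A good fit when you just want to try a simple regression
      • A good fit for the investigation before starting a full-scale ML project (checking whether the data is usable)
    • Not yet usable for building complex models
    • Managing queries and models takes some ingenuity
    • Looking forward to what comes next ✨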