Machine Learning Infrastructure Using Feature Store and Vertex AI

This talk covers how we built a machine learning infrastructure with Feature Store and Vertex AI, what we learned from operating it for one year, and our future work.
------
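Merpay Tech Fest 2022 is a three-day online technical conference.
It runs from Tuesday, August 23 to Thursday, August 25, 2022, and is aimed at software engineers working at IT companies and anyone interested in Merpay's technology stack. Merpay Tech Fest is a festival where you can deepen your interest in technology through its connection to the business and learn about the engineering that supports our products and services. The sessions will introduce the trial and error and the approaches behind the organization, technology, and challenges that support the business. We hope you enjoy it!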

■ Event information
- Official website: https://events.merpay.com/techfest-2022/
- Registration page: https://mercari.connpass.com/event/249428/
- Twitter hashtag: #MerpayTechFest
■ Links
- Mercari and Merpay event list: https://mercari.connpass.com/
- Mercari careers site: https://careers.mercari.com/
- Mercari Engineering Blog: https://engineering.mercari.com/blog/
- Mercari Twitter account for engineers: https://twitter.com/mercaridevjp
- Merpay, Inc.: https://jp.merpay.com/

mercari

August 25, 2022

Transcript

  1. Machine Learning Infrastructure with Feature Store and Vertex AI

    Li, Software Engineer (Machine Learning)
  2. Li / @Li, Software Engineer (Machine Learning)

    • 2017/4: joined Yahoo! JAPAN
      ◦ FrontEnd Engineer
      ◦ 2019/1~: Machine Learning Engineer
    • 2021/9: joined Merpay
      ◦ Machine Learning Engineer
      ◦ Currently in charge of fraud prevention-related development
  3. Agenda

    01 Background
    02 Machine Learning Infrastructure
    03 What we think went well after 1 year of operation
    04 Future work
  4. Agenda

    01 Background
    02 Machine Learning Infrastructure
    03 What we think went well after 1 year of operation
    04 Future work
  5. Fraud Prevention Models

    • Alert Filtering (multiple ML models)
      ◦ ref: https://engineering.mercari.com/blog/entry/alertfiltering-ml/
    • ChargeBack Detection (ML model)
      ◦ ref: https://engineering.mercari.com/en/blog/entry/chargeback-ml/
    • Sub Account Detection
    • Suspicious Account Detection (rule-based logic)
    • Suspicious Action Detection (complex network)
      ◦ ref: https://engineering.mercari.com/blog/entry/complex-network-ml/
    • (new) Account-based Chargeback Detection
    • (new) Account Takeover Detection
    • (new) Continuous Transaction Detection
    • (new) Overdue Payment Detection
  6. Features (timeline from 2020 to 2021)

    • 2020: Alert Filtering Model, ChargeBack Detection, Sub Account Detection; feature dimension ~40
    • 2021: Alert Filtering Model × 4 sub-models, ChargeBack Detection (latest version), Sub Account Detection, Suspicious Account Detection × ~4 detections, Suspicious Action Detection × 2 mass fraud detections; feature dimension ~170
    • Feature dimension grew about 4 times within 1 year
    • Features are used by multiple models
  7. Agenda

    01 Background
    02 Machine Learning Infrastructure
    03 What we think went well after 1 year of operation
    04 Future work
  8. machine learning infrastructure

  9. Vertex AI ref: https://cloud.google.com/vertex-ai/docs/start/introduction-unified-platform?hl=ja

  10. Feature Store (Feast) ref: https://docs.feast.dev/

  11. Before Feature Store

    Diagram labels: BigQuery, GCS; Model A Training, Model B Training, Model C Training, …
  12. After Feature Store

    Diagram labels: BigQuery, GCS; FEAST; FeatureView1, FeatureView2, FeatureView3, FeatureView4, …; Feature Service1, Feature Service2, Feature Service3, …; Model A Training, Model B Training, Model C Training, …
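The idea on the slide above, feature views defined once and grouped into feature services that feed different model trainings, can be sketched in plain Python. This is only an illustration under stated assumptions: the classes below are hypothetical stand-ins and not the Feast API, and every view name, table name, and feature name is invented for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FeatureView:
    """Hypothetical stand-in: a named group of features from one source table."""
    name: str
    source: str  # assumed source identifier, e.g. a BigQuery table
    features: Tuple[str, ...]


@dataclass(frozen=True)
class FeatureService:
    """Hypothetical stand-in: the set of feature views one model consumes."""
    name: str
    views: Tuple[FeatureView, ...]

    def feature_refs(self) -> List[str]:
        # Build "view:feature" references in declaration order.
        return [f"{v.name}:{f}" for v in self.views for f in v.features]


# Each feature view is defined once (all names are invented) ...
account_stats = FeatureView(
    "account_stats", "example_dataset.account_stats", ("age_days", "txn_count_30d")
)
payment_stats = FeatureView(
    "payment_stats", "example_dataset.payment_stats", ("amount_avg_7d",)
)

# ... and reused by the services that feed different model trainings.
service_a = FeatureService("model_a", (account_stats, payment_stats))
service_b = FeatureService("model_b", (account_stats,))

print(service_a.feature_refs())
print(service_b.feature_refs())
```

Because both services point at the same `account_stats` definition, a change to that view reaches every model that uses it, which is the sharing the diagram describes.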
  13. Agenda

    01 Background
    02 Machine Learning Infrastructure
    03 What we think went well after 1 year of operation
    04 Future work
  14. Elements Definition

    • components
      ◦ each performs one step in the ML workflow
    • common components
      ◦ commonly used components such as execute_bigquery_query
    • pipelines
      ◦ the ML workflow, including all of the components in the workflow
    Diagram labels: component, pipeline
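The three definitions above can be illustrated with a small plain-Python sketch: a component is one step, a common component is a step reused across workflows, and a pipeline is the ordered set of steps. This is an assumption-laden stand-in and not the pipeline SDK used in the talk. Only the name `execute_bigquery_query` comes from the slide; its body here fakes a query result, and `train` and `run_pipeline` are hypothetical.

```python
from typing import Callable, Dict, List

# A component takes the workflow state and returns the updated state.
Component = Callable[[Dict], Dict]


def execute_bigquery_query(state: Dict) -> Dict:
    """Common component (name from the slide); this stub fakes the query result."""
    rows = [{"user_id": 1, "amount": 120}, {"user_id": 2, "amount": 80}]
    return {**state, "rows": rows}


def train(state: Dict) -> Dict:
    """Hypothetical model-specific component: 'trains' by averaging a column."""
    amounts = [row["amount"] for row in state["rows"]]
    return {**state, "model": {"mean_amount": sum(amounts) / len(amounts)}}


def run_pipeline(components: List[Component], state: Dict) -> Dict:
    """A pipeline is the whole workflow: every component, applied in order."""
    for component in components:
        state = component(state)
    return state


result = run_pipeline([execute_bigquery_query, train], {"query": "SELECT 1"})
print(result["model"])
```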
  15. Common Pipelines

    • the supervised learning process was similar across multiple models
    • we use the same training pipeline with different config files to train multiple models, which reduces the cost of modeling
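A minimal sketch of the "same pipeline, different config files" idea follows. It assumes each model's config is a small JSON document. The config keys, model names, and step names are all invented for illustration and are not taken from the talk.

```python
import json

# Hypothetical config "files", inlined as JSON text to keep the example self-contained.
CONFIGS = {
    "model_a": '{"label": "is_chargeback", "features": ["amount", "account_age_days"], "max_depth": 6}',
    "model_b": '{"label": "is_fraud_alert", "features": ["amount"], "max_depth": 4}',
}


def build_training_job(config_text: str) -> dict:
    """Shared pipeline definition: identical steps, parameterized only by config."""
    config = json.loads(config_text)
    missing = {"label", "features"} - config.keys()
    if missing:
        raise ValueError(f"config is missing keys: {sorted(missing)}")
    return {
        "steps": ["load_features", "train", "evaluate"],  # assumed step names
        "label": config["label"],
        "features": list(config["features"]),
        # Everything else in the config is passed through as training parameters.
        "params": {k: v for k, v in config.items() if k not in ("label", "features")},
    }


jobs = {name: build_training_job(text) for name, text in CONFIGS.items()}
for name, job in jobs.items():
    print(name, job["label"], job["params"])
```

Adding another supervised model in this sketch means adding one more config entry, with no new pipeline code.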
  16. Common Pipelines

  17. Version Control

    Diagram labels: def train(): …; pipeline.py; component.py; com1.yaml, com2.yaml, …; compile
  18. Without Version Control

  19. With Version Control

  20. Version Control

    • pipeline uses different versions of common components

  21. Version Control

    • we use one-for-all versioning for component and pipeline files
    • component and pipeline files are compiled and saved to GCS
    • we use a different GCS path for each version
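One way to picture "a different GCS path for each version" is a helper that puts a single shared version into the path of every compiled file. The slides do not show the actual layout, so the `gs://<bucket>/<version>/<file>` scheme, the bucket name, and the version string below are assumptions. The file names `com1.yaml` and `com2.yaml` come from the earlier Version Control slide.

```python
def artifact_uri(bucket: str, version: str, filename: str) -> str:
    """Return gs://<bucket>/<version>/<filename> (assumed layout, not the talk's)."""
    if not version or "/" in version:
        raise ValueError(f"invalid version: {version!r}")
    return f"gs://{bucket}/{version}/{filename}"


# One version covers every compiled component and pipeline file ("one-for-all").
VERSION = "v1.2.0"  # hypothetical version string
uris = [
    artifact_uri("example-ml-artifacts", VERSION, name)
    for name in ("com1.yaml", "com2.yaml")
]
for uri in uris:
    print(uri)
```

Under this assumed layout, a pipeline that loads everything from one version prefix cannot mix component versions by accident.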
  22. Scheduled/On-demand Execution

    Requirements to satisfy:
    • train pipeline
      ◦ on-demand execution
    • predict pipeline
      ◦ scheduled execution
    • common pipeline
      ◦ execute with different config files
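The three requirements above can be captured in one small run description: which pipeline to run, which config file to use, and an optional schedule. The sketch below is hypothetical. The `RunSpec` class, the config paths, and the cron expression are invented for illustration and do not reflect how the talk's system launches runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunSpec:
    """Hypothetical description of one pipeline run."""
    pipeline: str
    config_file: str
    schedule: Optional[str] = None  # cron expression; None means on-demand

    @property
    def mode(self) -> str:
        return "scheduled" if self.schedule else "on-demand"


runs = [
    RunSpec("train", "configs/model_a.yaml"),                          # on-demand training
    RunSpec("predict", "configs/model_a.yaml", schedule="0 3 * * *"),  # daily prediction
    RunSpec("train", "configs/model_b.yaml"),                          # same pipeline, other config
]
for run in runs:
    print(run.pipeline, run.config_file, run.mode)
```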
  23. Scheduled/On-demand Execution

  24. Continuous A/B Test

  25. ML Monitoring with Slack Notifier

  26. Agenda

    01 Background
    02 Machine Learning Infrastructure
    03 What we think went well after 1 year of operation
    04 Future work
  27. Feature Online Store with Stream Ingest (WIP)

  28. Data Drift Detection and Re-Train (WIP)
