Sticker Recommendation Using Federated Learning

Haruka Kikuchi (LINE / Machine Learning Platform Department / Product Manager)

https://tech-verse.me/ja/sessions/46
https://tech-verse.me/en/sessions/46
https://tech-verse.me/ko/sessions/46

Tech-Verse2022

November 17, 2022

Transcript

  1. Who I Am - Haruka Kikuchi - Product Manager (ML)

    - Past Experience - Security Research (Prog. Lang.) - Human-Computer Interaction Research - Geo-spatial Data Analysis, etc.
  2. Agenda

    - What is Federated Learning (FL)?
    - 1st Target Application
    - ML Model Overview
    - LFL: LINE’s FL Platform
    - Privacy Preservation
    - Summary
  3. Server-side Machine Learning (ML): centralized server(s) collect data and process ML

    [Diagram labels: ML, Training, Inference, Output (one per client device)]
  4. Server-side Machine Learning (ML): centralized server(s) collect data and process ML

    [Diagram labels: ML, Training, Inference, Log (one per client device)]
  5. On-Device ML Inferencing: client devices receive the global ML model and run inference

    [Diagram labels: ML, Training, Global Model (one per client device)]
  6. On-Device ML Inferencing: client devices receive the global ML model and run inference

    [Diagram labels: ML, Training, Global Model (one per client device), Inference (on the client devices)]
  7. Federated Learning (FL): client on-device ML training + server aggregation

    [Diagram labels: ML, Training (one per client device)]
  9. Federated Learning (FL): the global model is sent to individual devices

    [Diagram labels: ML, Model Aggregation, Global Model (one per client device)]
  10. Federated Learning (FL): the global model is sent to individual devices

    [Diagram labels: ML, Model Aggregation, Global Model (one per client device), Inference (on the client devices)]
  11. Section Summary

    - Server-side ML: resourceful (lots of data / computation resources)
    - On-Device ML Inferencing: responsive (no network latency)
    - Use-case labels on the slide: Recommendation, User Interface
  12. Section Summary

    - Server-side ML: resourceful (lots of data / computation resources)
    - On-Device ML Inferencing: responsive (no network latency)
    - Federated Learning: privacy preservation (users don’t have to send raw data to the server)
    - Use-case labels on the slide: Recommendation, User Interface, Sensitive Data Treatment
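The federated learning loop summarized in this section (on-device training followed by server aggregation) can be sketched as follows. This is a minimal illustration, not LINE's implementation: the linear model, the learning rate, and the function names `local_train`, `aggregate`, and `federated_round` are all assumptions made for the example, and the aggregation shown is a plain unweighted average.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (features, target); hypothetical data shape


def local_train(global_w: Sequence[float], data: List[Example], lr: float = 0.1) -> List[float]:
    """On-device step: start from the global model and run one SGD pass on local data."""
    w = list(global_w)
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y  # gradient of 0.5 * (pred - y)^2 with respect to pred
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def aggregate(client_weights: List[List[float]]) -> List[float]:
    """Server step: unweighted average of the uploaded local models."""
    n = len(client_weights)
    return [sum(column) / n for column in zip(*client_weights)]


def federated_round(global_w: Sequence[float], clients: List[List[Example]]) -> List[float]:
    """One round: every client trains locally, only the models reach the server."""
    return aggregate([local_train(global_w, data) for data in clients])


if __name__ == "__main__":
    # Two toy clients whose raw data (y = 2x) never leaves the "device".
    clients = [[([1.0], 2.0)], [([2.0], 4.0)]]
    w = [0.0]
    for _ in range(50):
        w = federated_round(w, clients)
    print(w)  # approaches [2.0]
```

Only the locally trained weights cross the client/server boundary here, which is the property the slides summarize as "users don’t have to send raw data to server".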
  14. Sticker Auto Suggest

    - Sticker suggestions based on semantic labels
    - Incremental suggestions during text input, using pre-defined keywords associated with each label
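As a rough illustration of keyword-driven incremental suggestion, the sketch below maps typed text to semantic labels. The label names, the keyword table, and the matching rule (a keyword starts with the typed text, or the typed text ends with a keyword) are hypothetical; the slides do not describe how LINE actually matches text to labels.

```python
from typing import Dict, List

# Hypothetical label -> keyword table; the real labels and keywords are not given in the talk.
LABEL_KEYWORDS: Dict[str, List[str]] = {
    "greeting": ["hello", "hi", "good morning"],
    "thanks": ["thanks", "thank you"],
    "happy": ["happy", "hooray"],
}


def suggest_labels(text: str, table: Dict[str, List[str]] = LABEL_KEYWORDS) -> List[str]:
    """Return the labels whose keywords match what has been typed so far.

    A label matches when one of its keywords starts with the typed text
    (the user is still typing it) or the typed text ends with the keyword
    (the user has just finished typing it).
    """
    typed = text.lower().strip()
    if not typed:
        return []
    return [
        label
        for label, keywords in table.items()
        if any(kw.startswith(typed) or typed.endswith(kw) for kw in keywords)
    ]


if __name__ == "__main__":
    for partial in ["h", "tha", "hello", "oh thank you"]:
        print(repr(partial), "->", suggest_labels(partial))
```

Calling `suggest_labels` after every keystroke gives the incremental behavior: the label set narrows as the text becomes more specific, and stickers tagged with the surviving labels would then be shown.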
  16. Hybrid Approach

    - Server-side ML: resourceful (lots of data / computation resources)
    - On-Device ML Inferencing: responsive (no network latency)
    - Federated Learning: privacy preservation (users don’t have to send raw data to the server)
    - Other labels on the slide: Recommendation, User Interface, Communication Info.
  17. Hybrid Approach

    - Server-side ML: resourceful (lots of data / computation resources)
    - On-Device ML Inferencing: responsive (no network latency)
    - Federated Learning: privacy preservation (users don’t have to send raw data to the server)
    - Stages: Candidate Generation (1st Stage), Reranking (2nd Stage)
    - Other labels on the slide: Recommendation, User Interface, Communication Info.
  18. 1st Stage: Candidate Generation

    - Input:
      - Sticker purchase/download log
      - User features (estimated demographics, etc.)
    - Output (intermediate):
      - Item embeddings
      - User embeddings
    - Final output:
      - Item candidates (per user cluster)
  19. 2nd Stage: Reranking

    - Input:
      - User embedding (fully personalized)
      - Item embedding (for each candidate); makes use of the intermediate output of the 1st stage (global)
    - Output:
      - Score for each item
    - The client app performs:
      - inference, triggered by text input
      - training, triggered when the device is idle
  20. Comparison: 1st stage vs. 2nd stage

    - Candidate Generation (1st stage)
      - Personalization: sticker candidates (1,000,000 → 100)
      - Signal: sticker download (e.g. purchases)
      - Inference: server
      - Training: server
    - Reranking (2nd stage)
      - Personalization: reorder stickers (100)
      - Signal: sticker click/impression
      - Inference: client device
      - Training: client device (mostly)
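The two-stage design described above can be sketched as a server-side lookup followed by on-device scoring. This is only an illustration under stated assumptions: the talk does not say how scores are computed, so the dot product between the user embedding and each item embedding is a stand-in, and the names `generate_candidates` and `rerank` as well as the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(cluster_candidates: Dict[str, List[str]], user_cluster: str) -> List[str]:
    """1st stage (server): look up the candidate list precomputed for the user's cluster."""
    return cluster_candidates.get(user_cluster, [])


def rerank(
    user_emb: Sequence[float],
    candidates: List[str],
    item_embs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """2nd stage (client): score every candidate and sort by descending score.

    The dot-product score is an assumption made for this sketch.
    """
    scored = [(item, dot(user_emb, item_embs[item])) for item in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy data standing in for the server's intermediate and final outputs.
    item_embs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    cluster_candidates = {"cluster-1": ["a", "b", "c"]}

    candidates = generate_candidates(cluster_candidates, "cluster-1")
    user_emb = [0.2, 0.9]  # personalized embedding that stays on the device
    for item, score in rerank(user_emb, candidates, item_embs):
        print(item, round(score, 2))
```

The split mirrors the comparison on the slide: the expensive narrowing from the full catalog to a short list happens on the server, while the final ordering uses a personalized embedding that is trained and kept on the client device.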
  22. LINE Federated Learning (LFL): Requirements as a Platform

    - Privacy Preservation
      - Model upload without user ID
      - Differential Privacy (DP) mechanism
    - Support multiple on-device ML instances
      - Separation of app-specific implementations from common FL functions
  23. Other Platform Features

    - ONNX for OS-agnostic ML Runtime
    - FL Model/Param A/B Test
    - Local Model Training in Background
    - Model Training Scheduler
    - Model/Feature Ver. Management
    - Other labels on the slide: Candidate Generation For Sticker, Sticker keyboard
  26. Privacy Preservation

    - Support Differential Privacy (DP) mechanism
      - Noise addition to the model on local devices (Local DP)
    - Minimization of data collection
      - Federated Learning (collect ML models instead of raw data)
      - Model upload by randomly sampled users without user ID
  27. FL with Local Differential Privacy: noise injection to the local model

    [Diagram labels: Local Devices, Server, Noisy Outputs; steps: Aggregation, Transformation, Noise Injection]
  28. FL with Local Differential Privacy: noise injection to the local model

    [Diagram labels: Local Devices, Server, Noisy Outputs; steps: Aggregation, Transformation, Noise Injection; annotation: indistinguishable, represented by ε]
  29. FL with Local Differential Privacy: noise injection to the local model

    [Diagram labels: Local Devices, Server, Noisy Outputs; steps: Aggregation, Transformation, Noise Injection; annotation: indistinguishable, represented by ε]
  30. Current Status

    - Implementation of Local DP mechanisms with FL
      - Gaussian mechanism for local gradient (Local DP)
      - Averaging the aggregated local models (FL)
      - Local model upload without user ID
    - Seeking a privacy parameter ε
      - As-is: set a weak value to evaluate the feasibility of FL
      - To-be: set a mature value that balances utility of FL and users’ privacy
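A Gaussian mechanism on the local gradient followed by server-side averaging might look like the sketch below. All specifics are assumptions for illustration, since the talk gives no parameter values: the clipping norm, ε, δ, and the function names are hypothetical. The noise scale uses the classic Gaussian-mechanism calibration σ = sqrt(2 ln(1.25/δ)) · Δ / ε, which holds for 0 < ε < 1, with sensitivity Δ taken as twice the clipping norm because any two clipped gradients differ by at most that much in L2 norm.

```python
import math
import random
from typing import List, Sequence


def clip_l2(vec: Sequence[float], max_norm: float) -> List[float]:
    """Scale the vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def gaussian_sigma(epsilon: float, delta: float, sensitivity: float) -> float:
    """Classic Gaussian-mechanism noise scale; valid for 0 < epsilon < 1."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def privatize(grad: Sequence[float], max_norm: float, sigma: float, rng: random.Random) -> List[float]:
    """On-device step: clip the local gradient, then add Gaussian noise to each coordinate."""
    return [v + rng.gauss(0.0, sigma) for v in clip_l2(grad, max_norm)]


def average(updates: List[List[float]]) -> List[float]:
    """Server step: average the noisy updates; no user ID is needed for this."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


if __name__ == "__main__":
    rng = random.Random(0)
    max_norm = 1.0  # hypothetical clipping norm
    sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=2.0 * max_norm)

    # Every simulated client holds the same true gradient, to make the effect of noise visible.
    true_grad = [0.6, -0.8]
    noisy = [privatize(true_grad, max_norm, sigma, rng) for _ in range(10_000)]
    print("sigma:", sigma)
    print("one noisy upload:", noisy[0])
    print("server average:", average(noisy))
```

Each individual upload is dominated by noise, while the average over many clients is much closer to the true gradient. That trade-off between per-user indistinguishability and aggregate utility is what the choice of ε on this slide governs.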
  32. Future Work

    - Make LFL a true privacy preservation platform for LINE
      - Seeking an LDP configuration
      - Shuffling mechanism
    - Expand FL-based reranking to all stickers
      - Currently, sticker premium is the only service