Sticker Recommendation Using Federated Learning

Haruka Kikuchi (LINE / Machine Learning Platform Department / Product Manager)

https://tech-verse.me/ja/sessions/46
https://tech-verse.me/en/sessions/46
https://tech-verse.me/ko/sessions/46

Tech-Verse2022

November 17, 2022

Transcript

  1. Who I Am
    - Haruka Kikuchi
    - Product Manager (ML)
    - Past Experience
      - Security Research (Prog. Lang.)
      - Human-Computer Interaction Research
      - Geo-spatial Data Analysis, etc.

  2. Agenda
    - What is Federated Learning (FL)?
    - 1st Target Application
    - ML Model Overview
    - LFL - LINE’s FL Platform
    - Privacy Preservation
    - Summary

  3. Server-side Machine Learning (ML)
    Centralized server(s) collect data and process ML
    [Figure: ML server (Training, Inference) returning Output to each client device]

  4. Server-side Machine Learning (ML)
    Centralized server(s) collect data and process ML
    [Figure: ML server (Training, Inference) collecting Log from each client device]

  5. On-Device ML Inferencing
    Client devices receive global ML model and run inference
    [Figure: ML server (Training) distributing the Global Model to each client device]

  6. On-Device ML Inferencing
    Client devices receive global ML model and run inference
    [Figure: ML server (Training) distributing the Global Model to each client device, which runs Inference]

  7. Federated Learning (FL)
    On-device ML training on clients + server aggregation
    [Figure: ML server; Training on each client device]

  9. Federated Learning (FL)
    The global model is sent to individual devices
    [Figure: ML server (Model Aggregation)]

  10. Federated Learning (FL)
    The global model is sent to individual devices
    [Figure: ML server (Model Aggregation); Global Model on each client device]

  11. Federated Learning (FL)
    The global model is sent to individual devices
    [Figure: ML server (Model Aggregation); Global Model and Inference on each client device]

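The round described in the slides above (global model sent to devices, on-device training, server-side model aggregation) can be sketched roughly as follows. This is a minimal illustration of federated averaging, not LFL's implementation: the toy linear model, the flat list-of-floats weight format, and the names `local_update`, `federated_average`, and `run_round` are all assumptions made for the example.

```python
from typing import List, Tuple

Weights = List[float]
Example = Tuple[List[float], float]  # (features, target)


def local_update(global_w: Weights, data: List[Example], lr: float = 0.1) -> Weights:
    """Client side: one SGD pass on a toy linear model, using only on-device data."""
    w = list(global_w)
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Server side: average client weights, weighted by local example count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def run_round(global_w: Weights, client_datasets: List[List[Example]]) -> Weights:
    """One FL round: only trained weights leave each client, never raw data."""
    updates = [(local_update(global_w, d), len(d)) for d in client_datasets if d]
    return federated_average(updates)
```

Weighting by example count is one common choice; the slides only say that the server aggregates the models, so an unweighted mean would fit the description equally well.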

  12. Section Summary
    - Server-side ML: resourceful (lots of data / computation resources). Application: Recommendation

  13. Section Summary
    - Server-side ML: resourceful (lots of data / computation resources). Application: Recommendation
    - On-Device ML Inferencing: responsive (no network latency). Application: User Interface

  14. Section Summary
    - Server-side ML: resourceful (lots of data / computation resources). Application: Recommendation
    - On-Device ML Inferencing: responsive (no network latency). Application: User Interface
    - Federated Learning: privacy preservation (users don’t have to send raw data to the server). Application: Sensitive Data Treatment

  15. Agenda
    - What is Federated Learning (FL)?
    - 1st Target Application
    - ML Model Overview
    - LFL - LINE’s FL Platform
    - Privacy Preservation
    - Summary

  16. Sticker Auto Suggest
    - Sticker suggestions based on semantic labels
    - Incremental suggestions during text input, using pre-defined keywords associated with each label

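As a rough illustration of keyword-driven suggestion, the sketch below maps the text typed so far to semantic labels, and then to stickers carrying those labels. The lookup tables, the sticker IDs, and the prefix-matching rule are hypothetical; the slide does not specify how keywords are matched against the input.

```python
from typing import Dict, List

# Hypothetical data: semantic label -> pre-defined keywords.
LABEL_KEYWORDS: Dict[str, List[str]] = {
    "thanks": ["thanks", "thank you", "thx"],
    "sorry": ["sorry", "my bad"],
}

# Hypothetical data: sticker ID -> semantic labels attached to it.
STICKER_LABELS: Dict[str, List[str]] = {
    "sticker_001": ["thanks"],
    "sticker_002": ["sorry"],
    "sticker_003": ["thanks", "sorry"],
}


def matching_labels(text: str) -> List[str]:
    """Labels with at least one keyword starting with the text typed so far."""
    typed = text.strip().lower()
    if not typed:
        return []
    return sorted(
        label
        for label, keywords in LABEL_KEYWORDS.items()
        if any(kw.startswith(typed) for kw in keywords)
    )


def suggest(text: str) -> List[str]:
    """Sticker IDs whose labels match the current input, re-evaluated per keystroke."""
    labels = set(matching_labels(text))
    return sorted(sid for sid, ls in STICKER_LABELS.items() if labels.intersection(ls))
```

Calling `suggest` after every keystroke gives the incremental behavior; this label lookup only decides which stickers are eligible, not their order.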

  17. Sticker Semantic Tags
    https://creator.line.me

  19. Stickers Premium
    Subscription Service
    https://store.line.me/stickers-premium/landing/en

  21. Available Stickers in Auto-Suggest Area
    [Figure labels: Purchased, etc.; User-downloaded; Auto-downloaded]

  23. Agenda
    - What is Federated Learning (FL)?
    - 1st Target Application
    - ML Model Overview
    - LFL - LINE’s FL Platform
    - Privacy Preservation
    - Summary

  24. Hybrid Approach
    - Server-side ML: resourceful (lots of data / computation resources). Application: Recommendation
    - On-Device ML Inferencing: responsive (no network latency). Application: User Interface
    - Federated Learning: privacy preservation (users don’t have to send raw data to the server). Application: Communication Info.
    [Figure: the hybrid approach combines Server-side ML and Federated Learning]

  25. Hybrid Approach
    - Server-side ML: resourceful (lots of data / computation resources). Application: Recommendation
    - On-Device ML Inferencing: responsive (no network latency). Application: User Interface
    - Federated Learning: privacy preservation (users don’t have to send raw data to the server). Application: Communication Info.
    - Candidate Generation (1st Stage): Server-side ML
    - Reranking (2nd Stage): Federated Learning

  26. 1st Stage: Candidate Generation
    - Input:
      - Sticker purchase/download log
      - User features (estimated demographics, etc.)
    - Output (intermediate):
      - Item embeddings
      - User embeddings
    - Final output:
      - Item candidates (per user cluster)
    [Figure labels: Input; Intermediate Output; Final Output]

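One way to turn user and item embeddings into per-cluster candidates is sketched below. The slide does not say how clusters are formed or how candidates are picked, so the pre-assigned cluster IDs, the centroid-based dot-product ranking, and the function names are assumptions for illustration only.

```python
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def centroid(vectors: List[Vector]) -> Vector:
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def candidates_per_cluster(
    user_emb: Dict[str, Vector],
    user_cluster: Dict[str, int],
    item_emb: Dict[str, Vector],
    k: int,
) -> Dict[int, List[str]]:
    """For each user cluster, keep the k items closest to the cluster centroid."""
    members: Dict[int, List[Vector]] = {}
    for user, cluster in user_cluster.items():
        members.setdefault(cluster, []).append(user_emb[user])

    result: Dict[int, List[str]] = {}
    for cluster, vecs in members.items():
        c = centroid(vecs)
        # Highest dot product first; ties broken by item ID for determinism.
        ranked = sorted(item_emb, key=lambda item: (-dot(c, item_emb[item]), item))
        result[cluster] = ranked[:k]
    return result
```

Because the output is keyed by cluster rather than by user, the server can prepare a small candidate list for each cluster and leave the fully personalized ordering to the second stage.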

  27. 2nd Stage: Reranking
    - Input:
      - User embedding (fully personalized)
      - Item embedding (for each candidate; global)
    - Output:
      - Score for each item
    - The client app performs:
      - Inference, triggered by text input
      - Training, triggered when the device is idle
    Makes use of the intermediate output of the 1st stage
    [Figure labels: Input; Output]

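The on-device reranking loop can be sketched as below, with inference on text input and a training step when the device is idle. The scoring function (sigmoid of a dot product), the logistic-loss update, and the choice to update only the user embedding are assumptions; the slides state only the inputs, the output, and the triggers.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def score(user: Vector, item: Vector) -> float:
    """Assumed scoring function: sigmoid of the user-item dot product."""
    z = sum(u * i for u, i in zip(user, item))
    return 1.0 / (1.0 + math.exp(-z))


def rerank(user: Vector, candidates: Dict[str, Vector]) -> List[str]:
    """Inference: order candidate sticker IDs by descending score."""
    return sorted(candidates, key=lambda sid: (-score(user, candidates[sid]), sid))


def train_step(user: Vector, events: List[Tuple[Vector, int]], lr: float = 0.5) -> Vector:
    """Training: SGD on logistic loss over (item embedding, clicked) events.

    clicked is 1 for a click and 0 for an impression without a click.
    Item embeddings stay fixed; only the user embedding is updated.
    """
    u = list(user)
    for item, clicked in events:
        grad = score(u, item) - clicked  # derivative of the loss w.r.t. the logit
        u = [ui - lr * grad * xi for ui, xi in zip(u, item)]
    return u
```

In this sketch `rerank` would run whenever suggestions are shown, while `train_step` would be deferred to a background job.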

  28. Comparison: 1st stage vs. 2nd stage
    - Candidate Generation (1st stage)
      - Personalization: sticker candidates (1,000,000 → 100)
      - Signal: sticker download (e.g. purchases)
      - Inference: server
      - Training: server
    - Reranking (2nd stage)
      - Personalization: reorder stickers (100)
      - Signal: sticker click/impression
      - Inference: client device
      - Training: client device (mostly)

  29. Agenda
    - What is Federated Learning (FL)?
    - 1st Target Application
    - ML Model Overview
    - LFL - LINE’s FL Platform
    - Privacy Preservation
    - Summary

  30. Requirements as a Platform
    LINE Federated Learning (LFL)
    Privacy Preservation
    - Model upload without user ID
    - Differential Privacy (DP) mechanism
    Support multiple on-device ML instances
    - Separation of app-specific implementations from common FL functions

  31. System Architecture
    Separation of service-specific ML functions from the FL platform

  32. System Architecture
    Quadrants

  33. System Architecture
    2-Staged ML

  34. System Architecture
    On-device ML (2nd Stage)

  35. System Architecture
    On-device ML Inferencing (2nd Stage)

  36. System Architecture
    Federated Learning

  37. Other Platform Features
    - ONNX for OS-agnostic ML Runtime
    - FL Model/Param A/B Test
    - Local Model Training in Background
    - Candidate Generation for Sticker
    - Sticker keyboard
    - Model Training Scheduler
    - Model/Feature Ver. Management

  39. Please Check Tomorrow’s Session!
    Nov. 18th (Fri) 15:00-16:00 JST

  40. Agenda
    - What is Federated Learning (FL)?
    - 1st Target Application
    - ML Model Overview
    - LFL - LINE’s FL Platform
    - Privacy Preservation
    - Summary

  41. Privacy Preservation
    Support Differential Privacy (DP) mechanism
    - Noise addition to the model on local devices (Local DP)
    Minimization of data collection
    - Federated Learning (collect ML models instead of raw data)
    - Model upload by randomly sampled users without user ID

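The "randomly sampled users without user ID" point can be illustrated with the sketch below, in which each device decides locally whether to participate and builds a payload that carries no identifier. Whether sampling happens on the device or on the server is not stated in the slides, and the payload fields and function names are hypothetical.

```python
import json
import random
from typing import List, Optional


def should_upload(sample_rate: float, rng: random.Random) -> bool:
    """Bernoulli sampling: participate with probability sample_rate."""
    return rng.random() < sample_rate


def build_upload(
    model_version: str,
    local_weights: List[float],
    sample_rate: float,
    rng: random.Random,
) -> Optional[str]:
    """Return a JSON payload for the server, or None if this device is not sampled.

    The payload deliberately has no user ID, device ID, or session token field.
    """
    if not should_upload(sample_rate, rng):
        return None
    payload = {"model_version": model_version, "weights": local_weights}
    return json.dumps(payload, sort_keys=True)
```

Keeping the payload to a model version and weights is what lets the server aggregate uploads without linking them to accounts.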

  42. System Architecture
    FL + Local DP

  43. FL with Local Differential Privacy
    Noise injection to local model
    - Aggregation
    - Transformation
    - Noise Injection
    [Figure labels: Local Devices; Noisy Outputs; Server]

  44. FL with Local Differential Privacy
    Noise injection to local model
    - Aggregation
    - Transformation
    - Noise Injection
    [Figure labels: Local Devices; Noisy Outputs; Server; Indistinguishable (represented by ε)]

  45. FL with Local Differential Privacy
    Noise injection to local model
    - Aggregation
    - Transformation
    - Noise Injection
    [Figure labels: Local Devices; Noisy Outputs; Server; Indistinguishable (represented by ε)]

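For reference, the "indistinguishable, represented by ε" label presumably refers to the standard definition of ε-local differential privacy, which is not spelled out on the slides. A randomized mechanism M running on the device satisfies it when, for any two possible local inputs x and x' and any set S of outputs:

```latex
\[
  \Pr\bigl[\mathcal{M}(x) \in S\bigr]
  \;\le\;
  e^{\varepsilon} \,\Pr\bigl[\mathcal{M}(x') \in S\bigr]
\]
```

A smaller ε forces the two output distributions closer together, so the server learns less about which input produced a given noisy output, at the cost of noisier models.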

  46. Current Status
    Seeking a privacy parameter ε
    - As-is: set a weak value to evaluate the feasibility of FL
    - To-be: set a mature value that balances the utility of FL and users’ privacy
    Implementation of Local DP mechanisms with FL
    - Gaussian mechanism for local gradients (Local DP)
    - Averaging the aggregated local models (FL)
    - Local model upload without user ID

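A Gaussian mechanism applied to a local gradient, followed by server-side averaging, might look like the sketch below. The L2 clipping step is a common way to bound sensitivity but is an assumption here, as are the function names. The noise scale `sigma` is left as a free parameter, since choosing it (and hence ε) is exactly what the slide describes as ongoing work.

```python
import math
import random
from typing import List


def clip_l2(grad: List[float], max_norm: float) -> List[float]:
    """Scale the gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def gaussian_mechanism(
    grad: List[float], max_norm: float, sigma: float, rng: random.Random
) -> List[float]:
    """On device: clip, then add independent N(0, sigma^2) noise per coordinate."""
    clipped = clip_l2(grad, max_norm)
    return [g + rng.gauss(0.0, sigma) for g in clipped]


def average(models: List[List[float]]) -> List[float]:
    """On server: plain mean of the noisy uploads; noise shrinks as uploads grow."""
    n = len(models)
    return [sum(m[i] for m in models) / n for i in range(len(models[0]))]
```

Each individual upload is heavily perturbed, but the per-coordinate noise in the mean of n uploads has standard deviation sigma / sqrt(n), which is why averaging across many devices can still produce a usable model.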

  47. Agenda
    - What is Federated Learning (FL)?
    - 1st Target Application
    - ML Model Overview
    - LFL - LINE’s FL Platform
    - Privacy Preservation
    - Summary

  48. A/B Test Result
    5.6% uplift: personalized sticker suggestions prompt explicit premium sticker package downloads

  49. Collaborations
    Multiple Locations w/ 30+ Engineers

  50. Collaborations
    Multiple Locations w/ 30+ Engineers
    [Figure labels: KOREA, TOKYO, FUKUOKA]

  51. Future Work
    Make LFL a true privacy preservation platform for LINE
    - Seeking an LDP configuration
    - Shuffling mechanism
    Expand FL-based reranking to all stickers
    - Currently, Stickers Premium is the only service
