Sticker Recommendation Using Federated Learning

Slide 2

Slide 2 text

Who I Am
- Haruka Kikuchi
- Product Manager (ML)
- Past Experience
  - Security Research (Prog. Lang.)
  - Human-Computer Interaction Research
  - Geo-spatial Data Analysis, etc.

Slide 3

Slide 3 text

Agenda
- What is Federated Learning (FL)?
- 1st Target Application
- ML Model Overview
- LFL - LINE’s FL Platform
- Privacy Preservation
- Summary

Slide 4

Slide 4 text

Server-side Machine Learning (ML)
Centralized server(s) collect data and run ML
[Diagram labels: ML, Training, Inference, Output (repeated)]

Slide 5

Slide 5 text

Server-side Machine Learning (ML)
Centralized server(s) collect data and run ML
[Diagram labels: ML, Training, Inference, Log (repeated)]

Slide 7

Slide 7 text

On-Device ML Inferencing
Client devices receive the global ML model and run inference locally
[Diagram labels: ML, Training, Global Model (repeated), Inference (repeated)]

Slide 8

Slide 8 text

Federated Learning (FL)
On-device ML training + server aggregation
[Diagram labels: ML, Client, Training (repeated)]

Slide 12

Slide 12 text

Federated Learning (FL)
The global model is sent to individual devices
[Diagram labels: ML, Model Aggregation, Global Model (repeated), Inference (repeated)]
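The round shown on the FL slides (on-device training, server aggregation, redistribution of the global model) can be sketched as a simple federated averaging step. This is only an illustration under stated assumptions, not LINE's implementation: the linear model, the learning rate, and the names `local_update`, `federated_average`, and `run_round` are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Example = Tuple[Sequence[float], float]  # (features, label); hypothetical data shape


def local_update(global_w: Vector, data: Sequence[Example], lr: float = 0.1) -> Vector:
    """Client side: one SGD pass over the device's private data (linear model, squared loss)."""
    w = list(global_w)
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def federated_average(client_weights: Sequence[Vector]) -> Vector:
    """Server side: coordinate-wise mean of the uploaded client models."""
    n = len(client_weights)
    return [sum(col) / n for col in zip(*client_weights)]


def run_round(global_w: Vector, clients: Sequence[Sequence[Example]]) -> Vector:
    """One FL round: every client trains locally, the server averages the results."""
    local_models = [local_update(global_w, data) for data in clients]
    return federated_average(local_models)
```

Only the trained weights leave each client in this sketch; the raw `data` lists never reach `federated_average`.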

Slide 15

Slide 15 text

Section Summary
- Server-side ML: resourceful (lots of data and computation resources). Use case: Recommendation
- On-Device ML Inferencing: responsive (no network latency). Use case: User Interface
- Federated Learning: privacy preservation (users don’t have to send raw data to the server). Use case: Sensitive Data Treatment

Slide 17

Slide 17 text

Sticker Auto Suggest
- Sticker suggestions based on semantic labels
- Incremental suggestions during text input, using pre-defined keywords associated with each label
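The incremental, keyword-based suggestion described here could look roughly like the prefix match below. The slides do not give the actual labels, keywords, or matching rule, so the `LABEL_KEYWORDS` table and the `suggest_labels` function are hypothetical, and prefix matching is an assumption.

```python
from typing import Dict, List

# Hypothetical label -> keywords table; the real labels and keywords are not in the slides.
LABEL_KEYWORDS: Dict[str, List[str]] = {
    "greeting": ["hello", "hi", "hey"],
    "thanks": ["thanks", "thank you", "thx"],
    "happy": ["happy", "yay"],
}


def suggest_labels(typed: str, table: Dict[str, List[str]] = LABEL_KEYWORDS) -> List[str]:
    """Return the labels that have at least one keyword starting with the typed text."""
    prefix = typed.strip().lower()
    if not prefix:
        return []
    return [
        label
        for label, keywords in table.items()
        if any(keyword.startswith(prefix) for keyword in keywords)
    ]
```

Calling `suggest_labels` on every keystroke narrows the label set as the user types; stickers tagged with the matching labels would then be shown in the suggestion area.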

Slide 18

Slide 18 text

Sticker Semantic Tags https://creator.line.me

Slide 20

Slide 20 text

Stickers Premium Subscription Service https://store.line.me/stickers-premium/landing/en

Slide 22

Slide 22 text

Available Stickers in Auto-Suggest Area
- Purchased, etc.
- User-downloaded
- Auto-downloaded

Slide 26

Slide 26 text

Hybrid Approach
- Server-side ML: resourceful (lots of data and computation resources). Use case: Recommendation
- On-Device ML Inferencing: responsive (no network latency). Use case: User Interface
- Federated Learning: privacy preservation (users don’t have to send raw data to the server). Use case: Communication Info.
- Candidate Generation (1st Stage): Server-side ML
- Reranking (2nd Stage): On-Device ML Inferencing + Federated Learning

Slide 27

Slide 27 text

1st Stage: Candidate Generation
- Input:
  - Sticker purchase/download log
  - User features (estimated demographics, etc.)
- Output (intermediate):
  - Item embeddings
  - User embeddings
- Final output:
  - Item candidates (per user cluster)
[Diagram labels: Input, Intermediate Output, Final Output]
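One way to go from embeddings to per-cluster item candidates is to score every item against a cluster centroid and keep the top k. The slides do not say how candidates are actually selected, so the dot-product similarity, the centroid, and the names `centroid` and `candidates_for_cluster` are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Mean embedding of the users assigned to one cluster."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def candidates_for_cluster(
    user_embs: Sequence[Sequence[float]],
    item_embs: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Return the k item ids whose embeddings score highest against the cluster centroid."""
    c = centroid(user_embs)
    ranked = sorted(item_embs, key=lambda item: dot(c, item_embs[item]), reverse=True)
    return ranked[:k]
```

In the deck's terms, this step would run on the server and shrink the catalogue to a short candidate list that is then handed to the on-device reranker.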

Slide 28

Slide 28 text

2nd Stage: Reranking
- Input:
  - User embedding (fully personalized)
  - Item embedding (for each candidate)
- Output:
  - Score for each item
- Client app performs:
  - Inference, triggered by text input
  - Training, triggered when the device is idle
Notes: the user embedding is personalized; the item embeddings are global and make use of the intermediate output of the 1st stage.
[Diagram labels: Input, Output]
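The reranking step (score each candidate from the user and item embeddings, then train on device) might be sketched as follows. The slides do not specify the model, so the sigmoid-of-dot-product score, the single logistic SGD step, and the names `score`, `rerank`, and `train_step` are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def score(user_emb: Sequence[float], item_emb: Sequence[float]) -> float:
    """Assumed scoring function: sigmoid of the user/item dot product."""
    z = sum(u * v for u, v in zip(user_emb, item_emb))
    return 1.0 / (1.0 + math.exp(-z))


def rerank(
    user_emb: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Inference: score every candidate and sort from highest to lowest score."""
    scored = [(item, score(user_emb, emb)) for item, emb in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def train_step(
    user_emb: Sequence[float], item_emb: Sequence[float], clicked: bool, lr: float = 0.5
) -> List[float]:
    """Training: one logistic-loss SGD step on the user embedding for one impression."""
    err = score(user_emb, item_emb) - (1.0 if clicked else 0.0)
    return [u - lr * err * v for u, v in zip(user_emb, item_emb)]
```

Here `rerank` corresponds to the inference triggered by text input, and `train_step` to the background training run while the device is idle, using click/impression signals that stay on the device.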

Slide 29

Slide 29 text

Comparison: 1st stage vs. 2nd stage
                 Candidate Generation (1st stage)      Reranking (2nd stage)
Personalization  Sticker candidates (1,000,000 → 100)  Reorder stickers (100)
Signal           Sticker download (e.g. purchases)     Sticker click/impression
Inference        Server                                Client device
Training         Server                                Client device (mostly)

Slide 31

Slide 31 text

LINE Federated Learning (LFL): Requirements as a Platform
- Privacy Preservation
  - Model upload without user ID
  - Differential Privacy (DP) mechanism
- Support multiple on-device ML instances
  - Separation of app-specific implementations from common FL functions

Slide 32

Slide 32 text

System Architecture Separation of service-specific ML functions from the FL platform

Slide 33

Slide 33 text

System Architecture Quadrants

Slide 34

Slide 34 text

System Architecture 2-Staged ML

Slide 35

Slide 35 text

System Architecture On-device ML (2nd Stage)

Slide 36

Slide 36 text

System Architecture On-device ML Inferencing (2nd Stage)

Slide 37

Slide 37 text

System Architecture Federated Learning

Slide 38

Slide 38 text

Other Platform Features
- ONNX for OS-agnostic ML Runtime
- FL Model/Param A/B Test
- Local Model Training in Background
- Candidate Generation for Sticker
- Sticker keyboard
- Model Training Scheduler
- Model/Feature Ver. Management

Slide 40

Slide 40 text

Please Check Tomorrow’s Session! Nov. 18th (Fri) 15:00-16:00 JST

Slide 42

Slide 42 text

Privacy Preservation
- Support Differential Privacy (DP) mechanism
  - Noise addition to the model on local devices (Local DP)
- Minimization of data collection
  - Federated Learning (collect ML models instead of raw data)
  - Model upload by randomly sampled users without user ID

Slide 43

Slide 43 text

System Architecture FL + Local DP

Slide 46

Slide 46 text

FL with Local Differential Privacy
Noise injection to the local model
[Diagram labels: Local Devices, Server, Noisy Outputs; steps: Aggregation, Transformation, Noise Injection; annotation: "Indistinguishable", represented by ε]
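The "indistinguishable, represented by ε" annotation matches the standard definition of local differential privacy, stated here for reference rather than taken from the slides. A randomized mechanism $\mathcal{M}$ run on the device satisfies $(\varepsilon, \delta)$-local DP if, for any two possible inputs $x$ and $x'$ and any set of outputs $S$:

```latex
\Pr[\mathcal{M}(x) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(x') \in S] + \delta
```

A smaller $\varepsilon$ means the server can tell less about which input produced a given noisy output; the $\delta$ term is the small slack that mechanisms such as Gaussian noise require.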

Slide 47

Slide 47 text

Current Status
- Seeking a privacy parameter ε
  - As-is: set a weak value to evaluate the feasibility of FL
  - To-be: set a mature value that balances the utility of FL and users’ privacy
- Implementation of Local DP mechanisms with FL
  - Gaussian mechanism for local gradient (Local DP)
  - Averaging the aggregated local models (FL)
  - Local model upload without user ID
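The combination listed above (Gaussian noise on the local gradient, then server-side averaging) can be sketched as below. The clipping step, the noise-multiplier convention, and the names `clip_l2`, `gaussian_mechanism`, and `average` are assumptions; the slides do not give the actual parameters, and the mapping from `sigma` to a concrete ε is deliberately left out.

```python
import math
import random
from typing import List, Sequence


def clip_l2(vec: Sequence[float], max_norm: float) -> List[float]:
    """Scale the vector down so its L2 norm is at most max_norm (bounds the sensitivity)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def gaussian_mechanism(
    grad: Sequence[float], max_norm: float, sigma: float, rng: random.Random
) -> List[float]:
    """On device: clip the gradient, then add Gaussian noise with std sigma * max_norm."""
    clipped = clip_l2(grad, max_norm)
    return [g + rng.gauss(0.0, sigma * max_norm) for g in clipped]


def average(updates: Sequence[Sequence[float]]) -> List[float]:
    """On server: coordinate-wise mean of the noisy updates, uploaded without user IDs."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]
```

Because each device adds independent zero-mean noise, the noise largely cancels in the average when many devices participate, which is the utility versus privacy trade-off that the choice of ε controls.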

Slide 49

Slide 49 text

A/B Test Result: 5.6% uplift
- Personalized sticker suggestions evoke explicit premium sticker package downloads

Slide 51

Slide 51 text

Collaborations
Multiple locations with 30+ engineers
[Diagram labels: KOREA, KOREA, TOKYO, TOKYO, FUKUOKA, TOKYO]

Slide 52

Slide 52 text

Future Work
- Make LFL a true privacy preservation platform for LINE
  - Seeking an LDP configuration
  - Shuffling mechanism
- Expand FL-based reranking to all stickers
  - Currently, Stickers Premium is the only service covered