Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Anonymize Large-scale Sparse User Features at L...
Search
LINE Developers
March 07, 2019
Technology
2
3.7k
Anonymize Large-scale Sparse User Features at LINE Corp
2019/3/7 Machine Learning Production Pitch #1
Yeo Chaerim
LINE Developers
March 07, 2019
Tweet
Share
More Decks by LINE Developers
See All by LINE Developers
LINEスタンプのSREing事例集:大きなスパイクアクセスを捌くためのSREing
line_developers
3
2.4k
Java 21 Overview
line_developers
6
1.3k
Code Review Challenge: An example of a solution
line_developers
1
1.5k
KARTEのAPIサーバ化
line_developers
1
600
著作権とは何か?〜初歩的概念から権利利用法、侵害要件まで
line_developers
5
2.3k
生成AIと著作権 〜生成AIによって生じる著作権関連の課題と対処
line_developers
3
2.4k
マイクロサービスにおけるBFFアーキテクチャでのモジュラモノリスの導入
line_developers
9
3.8k
A/B Testing at LINE NEWS
line_developers
3
1.1k
LINEのサポートバージョンの考え方
line_developers
2
1.4k
Other Decks in Technology
See All in Technology
茨城の思い出を振り返る ~CDKのセキュリティを添えて~ / 20260201 Mitsutoshi Matsuo
shift_evolve
PRO
1
300
GitLab Duo Agent Platform × AGENTS.md で実現するSpec-Driven Development / GitLab Duo Agent Platform × AGENTS.md
n11sh1
0
140
20260204_Midosuji_Tech
takuyay0ne
1
160
クレジットカード決済基盤を支えるSRE - 厳格な監査とSRE運用の両立 (SRE Kaigi 2026)
capytan
6
2.8k
We Built for Predictability; The Workloads Didn’t Care
stahnma
0
140
Embedded SREの終わりを設計する 「なんとなく」から計画的な自立支援へ
sansantech
PRO
3
2.5k
外部キー制約の知っておいて欲しいこと - RDBMSを正しく使うために必要なこと / FOREIGN KEY Night
soudai
PRO
12
5.5k
コスト削減から「セキュリティと利便性」を担うプラットフォームへ
sansantech
PRO
3
1.5k
【Oracle Cloud ウェビナー】[Oracle AI Database + AWS] Oracle Database@AWSで広がるクラウドの新たな選択肢とAI時代のデータ戦略
oracle4engineer
PRO
2
150
ブロックテーマ、WordPress でウェブサイトをつくるということ / 2026.02.07 Gifu WordPress Meetup
torounit
0
190
Introduction to Bill One Development Engineer
sansan33
PRO
0
360
SREのプラクティスを用いた3領域同時 マネジメントへの挑戦 〜SRE・情シス・セキュリティを統合した チーム運営術〜
coconala_engineer
2
660
Featured
See All Featured
AI: The stuff that nobody shows you
jnunemaker
PRO
2
260
The State of eCommerce SEO: How to Win in Today's Products SERPs - #SEOweek
aleyda
2
9.5k
Unsuck your backbone
ammeep
671
58k
The Spectacular Lies of Maps
axbom
PRO
1
520
Believing is Seeing
oripsolob
1
55
GraphQLとの向き合い方2022年版
quramy
50
14k
How to optimise 3,500 product descriptions for ecommerce in one day using ChatGPT
katarinadahlin
PRO
0
3.4k
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
2
190
Producing Creativity
orderedlist
PRO
348
40k
How to Think Like a Performance Engineer
csswizardry
28
2.4k
Collaborative Software Design: How to facilitate domain modelling decisions
baasie
0
140
Navigating Team Friction
lara
192
16k
Transcript
ANONYMIZE LARGE-SCALE SPARSE USER FEATURES AT LINE CORP CHAERIM YEO,
LINE CORPORATION MACHINE LEARNING PRODUCTION PITCH #1, 2019/03/07
ABOUT ME l Chaerim Yeo(呂 彩林) l 2018.12 ~ LINE
Corporation l Account Platform Development Dept. l Ad performance optimization
Agenda • Z-Features • Y-Features • Evaluation • Conclusion
Z-FEATURES
WHAT ARE Z-FEATURES
WHAT ARE Z-FEATURES
WHAT ARE Z-FEATURES
WHAT ARE Z-FEATURES
WHAT ARE Z-FEATURES
BENEFIT OF Z-FEATURES Reusable Flexible
LIMITATION OF Z-FEATURES Human Interpretable Extremely Sparse
Y-FEATURES
BEYOND Z-FEATURES Obfuscation Dimensionality Reduction
BEYOND Z-FEATURES Obfuscation Dimensionality Reduction With keeping information as far
as possible
BEYOND Z-FEATURES Obfuscation Dimensionality Reduction SCDV https://arxiv.org/abs/1612.06778
OVERVIEW OF SCDV
INTEGRATE Z-FEATURES WITH SCDV
SYSTEM OVERVIEW
EVALUATION
DATA DIMENSION RELATIVE TO Z-FEATURES (LOG-SCALE) 0.0001 0.0010 0.0100 0.1000
1.0000 10.0000 100.0000 type1 type2 type3 type4 type5 type6 type7 type8 type9
DATA DENSITY LOG-SCALE 0.0000001 0.0000010 0.0000100 0.0001000 0.0010000 0.0100000 0.1000000
1.0000000 type1 type2 type3 type4 type5 type6 type7 type8 type9 z-features y-features
DATA SIZE RELATIVE TO Z-FEATURES 0.00 5.00 10.00 15.00 20.00
25.00 30.00 35.00 40.00 45.00 50.00 type1 type2 type3 type4 type5 type6 type7 type8 type9
USER DEMOGRAPHICS ESTIMATION MATRICS (RELATIVE TO Z-FEATURES) 0.95 0.96 0.97
0.98 0.99 1.00 1.01 1.02 gender age-group region precision recall f1-score
USER DEMOGRAPHICS ESTIMATION RUNNING TIME (RELATIVE TO Z-FEATURES) 0.00 0.05
0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 gender age-group region training prediction
CONCLUSION
CONCLUSION l Anonymize user features based on SCDV l Enough
to use in ML l Future works l Add workflow to production l Apply further dimensionality reduction l Auto encoders, PCA, …
THANK YOU