Johnson Wu,
Data Scientist/Machine Learning Engineer
LINE Taiwan
Slide 2
Slide 2 text
Slide 3
Slide 3 text
Data Engineer
ML Engineer/Scientist?
Data Scientist?
Data Analyst
Slide 4
Slide 4 text
ML Engineer/Scientist?
Data Scientist?
Slide 5
Slide 5 text
A day in the life of a Data Scientist
Slide 6
Slide 6 text
Among the data I processed
Figures matter
A volume of data that is much harder to handle than what you face at school
Slide 7
Slide 7 text
Skills that fit
Slide 8
Slide 8 text
Projects and activities I have participated in at LINE
• LINE Fact Checker
• LINE Music TW
• TW User Tagging
• SPOT POI recommendation
….
• Internal research, such as automatic speech recognition, reinforcement learning, and recommendation tasks.
• Technical sharing, such as LINE Dev day, LINE Techpulse, and R-ladies.
Slide 9
Slide 9 text
LINE Digital Accountability Project (LINE數位當責計畫)
Slide 10
Slide 10 text
How does ML help LINE Fact Checker?
Diagram labels:
• Verified Messages
• Similar Messages
• Total Messages
• ML Near-Duplication
• ML Classification
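The "ML Near-Duplication" step can be illustrated with a minimal sketch. The slides do not say which method is used, so this example swaps in a simple, well-known stand-in: character shingles compared with Jaccard similarity. The function names, the shingle size, and the 0.6 threshold are all assumptions made for illustration.

```python
from __future__ import annotations


def shingles(text: str, k: int = 3) -> set[str]:
    """Return the set of k-character shingles of a lower-cased, whitespace-normalised string."""
    normalised = " ".join(text.lower().split())
    if len(normalised) < k:
        return {normalised}
    return {normalised[i:i + k] for i in range(len(normalised) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_near_duplicate(message: str, verified: list[str], threshold: float = 0.6) -> str | None:
    """Return the most similar verified message, or None if nothing reaches the threshold."""
    query = shingles(message)
    best, best_score = None, 0.0
    for candidate in verified:
        score = jaccard(query, shingles(candidate))
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None
```

Under these assumptions, an incoming message that matches a verified one can reuse the existing verdict, and only messages with no match need further handling.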
Slide 11
Slide 11 text
ML models
• Pre-trained BERT classifier: verify topics
• Fine-tuned with TODAY articles
• An Approximate Nearest Neighbor (ANN) model: find similar articles
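To make the ANN idea concrete, here is a minimal sketch of one classic approximate technique, random-hyperplane locality-sensitive hashing. The slides do not name the ANN method or library actually used, so the `HyperplaneLSH` class, its parameters, and the toy vectors are assumptions for illustration only. Vectors that fall on the same side of every random hyperplane share a bucket, and only that bucket is scanned at query time.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class HyperplaneLSH:
    """Toy approximate nearest-neighbour index using random-hyperplane hashing."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane contributes one bit of the bucket key.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = defaultdict(list)
        self.vectors = []

    def _key(self, vec):
        # One boolean per hyperplane: which side of the plane the vector lies on.
        return tuple(sum(p * x for p, x in zip(plane, vec)) > 0 for plane in self.planes)

    def add(self, vec):
        """Store a vector and return its integer id."""
        idx = len(self.vectors)
        self.vectors.append(list(vec))
        self.buckets[self._key(vec)].append(idx)
        return idx

    def query(self, vec, top_k=1):
        """Return up to top_k (id, cosine similarity) pairs from the query's bucket only."""
        candidates = self.buckets.get(self._key(vec), [])
        scored = sorted(((cosine(self.vectors[i], vec), i) for i in candidates), reverse=True)
        return [(i, score) for score, i in scored[:top_k]]
```

Because only one bucket is inspected, a true neighbor that hashes elsewhere can be missed. That is the accuracy-for-speed trade-off that makes the search approximate; production systems usually reduce misses with several hash tables or a graph-based index.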
Slide 12
Slide 12 text
ML system under the hood
Diagram labels:
• Scheduling, Orchestration
• Serving
• Training Data Ingestion
• Async Execution
• Model Management
• Model Deployment
• Index-based Vector Search (ANN)
Slide 13
Slide 13 text
LINE Music TW
Toward intelligent music service
Slide 14
Slide 14 text
Simple data mining: autocomplete ranking
• Rank autocomplete suggestions from 18M user query logs using SQL
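As a rough illustration of ranking autocomplete suggestions with plain SQL, the sketch below counts how often each logged query starts with the typed prefix and returns the most frequent ones. The `query_log` table, its columns, and the sample rows are hypothetical; the slides do not describe the real schema or SQL engine, and SQLite is used here only because it ships with Python.

```python
import sqlite3


def build_query_log(rows):
    """Create an in-memory table of (user_id, query) rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE query_log (user_id TEXT, query TEXT)")
    conn.executemany("INSERT INTO query_log VALUES (?, ?)", rows)
    return conn


def autocomplete(conn, prefix, limit=5):
    """Return logged queries starting with prefix, most frequent first."""
    # Escape LIKE wildcards so the prefix is matched literally.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    cursor = conn.execute(
        """
        SELECT query, COUNT(*) AS hits
        FROM query_log
        WHERE query LIKE ? ESCAPE '\\'
        GROUP BY query
        ORDER BY hits DESC, query ASC
        LIMIT ?
        """,
        (escaped + "%", limit),
    )
    return [row[0] for row in cursor.fetchall()]
```

Ties are broken alphabetically so that the ranking is deterministic.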
Slide 15
Slide 15 text
Music Knowledge graph on auto-suggest
Relations: composed / compose, lyrics_written / write_lyrics, sing_song, same_album
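One way to picture auto-suggest on a knowledge graph is as a small store of (subject, relation, object) triples that is walked outward from the entity the user typed. The sketch below reuses relation names from the slide, but the edge directions, the `MusicGraph` class, the two-hop limit, and the placeholder entities ("Artist A", "Song X") are assumptions for illustration, not the actual LINE Music design.

```python
from collections import defaultdict


class MusicGraph:
    """Tiny in-memory triple store: subject -> [(relation, object), ...]."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add(self, subject, relation, obj):
        """Record one directed (subject, relation, object) triple."""
        self.edges[subject].append((relation, obj))

    def neighbours(self, entity, relation=None):
        """Objects linked from entity, optionally filtered by relation name."""
        return [o for r, o in self.edges.get(entity, []) if relation is None or r == relation]

    def suggest(self, entity, limit=5):
        """Suggest entities reachable within two hops, nearest first."""
        seen = {entity}
        suggestions = []
        frontier = [entity]
        for _hop in range(2):
            next_frontier = []
            for node in frontier:
                for _relation, other in self.edges.get(node, []):
                    if other not in seen:
                        seen.add(other)
                        suggestions.append(other)
                        next_frontier.append(other)
            frontier = next_frontier
        return suggestions[:limit]
```

A real service would also weight suggestions, for example by popularity, rather than returning them in insertion order.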
Slide 16
Slide 16 text
Analysis on Music Knowledge graph
Slide 17
Slide 17 text
Research: time for learning
• Inside: RL study group, hackathon, and paper reading.
• Collaboration with the NLP and AD teams in KR and JP.
• External: e.g. Google/AWS/Microsoft workshops and conferences.
Slide 18
Slide 18 text
Activities
• Let people (not your close colleagues) know what you are doing.
• Dev day, TechPulse, R-ladies, AI workshop, internal tech sharing x2.
Slide 19
Slide 19 text
Takeaway
• What I’ve been doing: finding data-driven solutions for project needs
• Knowledge graph, Speech recognition, POI recommendation, Embedding, ML.
• Data x ML x optimization x speech
• Plain but fun :)