Slide 1

Slide 1 text

Johnson Wu, Data Scientist/Machine Learning Engineer, LINE Taiwan

Slide 2

Slide 2 text

(Slide text not recoverable from the extraction.)

Slide 3

Slide 3 text

Data Engineer ML Engineer/Scientist? Data Scientist? Data Analyst

Slide 4

Slide 4 text

ML Engineer/Scientist? Data Scientist?

Slide 5

Slide 5 text

A day in the life of a Data Scientist

Slide 6

Slide 6 text

Figures matter: the amount of data I processed is not something you can handle as easily as what you face at school.

Slide 7

Slide 7 text

Skills that fit

Slide 8

Slide 8 text

Projects and activities I have participated in at LINE • LINE Fact Checker • LINE Music TW • TW User Tagging • SPOT POI recommendation … • Internal research such as automatic speech recognition, reinforcement learning, and recommendation tasks. • Technical sharing at events such as LINE Dev Day, LINE TechPulse, and R-Ladies.

Slide 9

Slide 9 text

- LINE Digital Accountability Project

Slide 10

Slide 10 text

How does ML help LINE Fact Checker? Diagram labels: Total Messages, Similar Messages, Verified Messages; ML Near-Duplication, ML Classification.

Slide 11

Slide 11 text

ML models • Pre-trained BERT classifier: verify topics • Fine-tuned with TODAY articles • An Approximate Nearest Neighbor (ANN) model: find similar articles
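The "find similar articles" step can be illustrated with a toy approximate nearest neighbor index. This is only a sketch under stated assumptions: the slides do not say which ANN method or library is used, so random hyperplane hashing is swapped in here as one simple, well-known technique, and the class name `HyperplaneLSH`, its parameters, and the example vectors are all hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneLSH:
    """Toy ANN index based on random hyperplane hashing (hypothetical example)."""

    def __init__(self, dim, num_planes=8, seed=0):
        rng = random.Random(seed)
        # Each hyperplane is a random Gaussian vector; the pattern of signs
        # of the dot products with a vector forms that vector's hash.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)
        ]
        self.buckets = {}

    def _hash(self, vec):
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes
        )

    def add(self, key, vec):
        self.buckets.setdefault(self._hash(vec), []).append((key, vec))

    def query(self, vec, top_k=3):
        # Only vectors in the same bucket are scored. That is what makes the
        # search approximate: a true neighbor in another bucket is missed.
        candidates = self.buckets.get(self._hash(vec), [])
        scored = sorted(((cosine(vec, v), key) for key, v in candidates), reverse=True)
        return [(key, score) for score, key in scored[:top_k]]


index = HyperplaneLSH(dim=3)
index.add("article_a", [1.0, 0.0, 0.0])  # made-up embedding
index.add("article_b", [0.0, 1.0, 0.0])  # made-up embedding
nearest = index.query([2.0, 0.0, 0.0], top_k=1)
```

In a real system the vectors would come from a text embedding model, and a production index would typically use several hash tables or a dedicated vector search library to reduce missed neighbors.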

Slide 12

Slide 12 text

ML system under the hood • Scheduling, Orchestration • Serving • Training • Data Ingestion • Async Execution • Model Management • Model Deployment • Index-based Vector Search (ANN)

Slide 13

Slide 13 text

LINE Music TW: Toward an intelligent music service

Slide 14

Slide 14 text

Simple data mining: autocomplete ranking • Rank autocomplete suggestions from 18M user query logs using SQL
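One plausible way to rank autocomplete suggestions with SQL is to count how often each logged query starts with the typed prefix. The sketch below uses Python's built-in `sqlite3` module; the table name `query_log`, its single `query` column, the helper names, and the sample queries are assumptions for illustration, not the schema or data from the talk.

```python
import sqlite3


def build_log(queries):
    """Load raw query strings into an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE query_log (query TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO query_log (query) VALUES (?)", [(q,) for q in queries]
    )
    return conn


def autocomplete(conn, prefix, limit=5):
    """Return (query, hits) rows matching the prefix, most frequent first."""
    # Escape LIKE wildcards so user input is treated as a literal prefix.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return conn.execute(
        """
        SELECT query, COUNT(*) AS hits
        FROM query_log
        WHERE query LIKE ? ESCAPE '\\'
        GROUP BY query
        ORDER BY hits DESC, query ASC
        LIMIT ?
        """,
        (escaped + "%", limit),
    ).fetchall()


# Made-up sample log entries.
conn = build_log(["love story", "love story", "love song", "lofi beats", "love story"])
suggestions = autocomplete(conn, "love")
```

Ties are broken alphabetically here so the ranking is deterministic; a real ranking would likely also weigh recency or click-through.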

Slide 15

Slide 15 text

Music knowledge graph for auto-suggest. Relation labels shown in the graph: composed, compose, lyrics_written, write_lyircs, sing_song, same_album
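A knowledge graph like this can be sketched as a set of labeled edges between entities, with auto-suggest following selected relations from the entity the user typed. The relation labels `sing_song`, `compose`, and `same_album` come from the slide; the class `MusicGraph`, the edge directions, and the node names are hypothetical.

```python
from collections import defaultdict


class MusicGraph:
    """Minimal labeled directed graph (hypothetical structure)."""

    def __init__(self):
        # node -> list of (relation, neighbor) pairs, in insertion order
        self.edges = defaultdict(list)

    def add_edge(self, source, relation, target):
        self.edges[source].append((relation, target))

    def related(self, node, relations=None):
        """Return neighbors of node, optionally filtered by relation label.

        Results keep insertion order and contain no duplicates.
        """
        seen, result = set(), []
        for relation, target in self.edges.get(node, []):
            if relations is not None and relation not in relations:
                continue
            if target not in seen:
                seen.add(target)
                result.append(target)
        return result


graph = MusicGraph()
graph.add_edge("artist_a", "sing_song", "song_x")  # made-up entities
graph.add_edge("artist_a", "compose", "song_y")
graph.add_edge("song_x", "same_album", "song_y")

all_related = graph.related("artist_a")
composed_only = graph.related("artist_a", relations={"compose"})
```

Filtering by relation lets the same graph serve different suggestion types, for example songs an artist sings versus songs on the same album.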

Slide 16

Slide 16 text

Analysis on Music Knowledge graph

Slide 17

Slide 17 text

Research: time for learning • Internal: RL study group, hackathons, and paper reading. • Collaboration with the NLP and AD teams in KR and JP. • External: e.g. Google/AWS/Microsoft workshops and conferences.

Slide 18

Slide 18 text

Activities • Let people beyond your close colleagues know what you are doing. • Dev Day, TechPulse, R-Ladies, AI workshop, internal tech sharing x2.

Slide 19

Slide 19 text

Takeaway • What I've been doing: finding data-driven solutions for project needs • Knowledge graph, speech recognition, POI recommendation, embedding, ML. • Data x ML x optimization x speech • Plain but fun :)

Slide 20

Slide 20 text

About our team: Data Dev Team TW

Slide 21

Slide 21 text

We are hiring!