Into the life of LINE Data Scientist

"The joy of a junior data scientist is often just this plain and unadorned, yet never dull" by LINE Data Dev - Johnson Wu at LINE Developer Meetup #11 https://linegroup.kktix.cc/events/20200410

LINE Developers Taiwan

April 10, 2020

Transcript

  2. A day in the life of a Data Scientist
  3. Among the data I processed
    • Figures matter
    • A volume that will not feel as easy as what you faced at school
  4. Projects and activities I have taken part in at LINE
    • LINE Fact Checker
    • LINE Music TW
    • TW User Tagging
    • SPOT POI recommendation ….
    • Internal research, such as automatic speech recognition, reinforcement learning, and recommendation tasks.
    • Technical sharing, such as LINE Dev Day, LINE TechPulse, and R-Ladies.
  5. How does ML help LINE Fact Checker?
    [Diagram labels: Verified Messages, Similar Messages, Total Messages, ML Near-Duplication, ML Classification]
  6. ML models
    • Pre-trained BERT classifier: verify topics
    • Fine-tuned with TODAY articles
    • An Approximate Nearest Neighbor model: find similar articles
  7. ML system under the hood
    [Diagram labels: Scheduling, Orchestration, Serving, Training, Data Ingestion, Async Execution, Model Management, Model Deployment, Index-based Vector Search (ANN)]
  8. Research: time for learning
    • Internal: RL study group, hackathons, and paper reading.
    • Collaboration with the NLP and AD teams in KR and JP.
    • External: e.g. Google/AWS/Microsoft workshops and conferences.
  9. Activities
    • Let people (not your close colleagues) know what you are doing.
    • Dev Day, TechPulse, R-Ladies, AI workshop, internal tech sharing x2.
  10. Takeaway
    • What I've been doing: finding data-driven solutions for project needs
    • Knowledge graph, speech recognition, POI recommendation, embedding, ML.
    • Data x ML x optimization x speech
    • Plain but fun :)