Into the life of LINE Data Scientist

"The joys of a junior data scientist are often just this plain and unadorned, yet never dull" by Johnson Wu, LINE Data Dev, at LINE Developer Meetup #11 https://linegroup.kktix.cc/events/20200410

line_developers_tw

April 10, 2020

Transcript

  1. The joys of a junior data scientist are often just this plain and unadorned, yet never dull. Johnson Wu, Data Scientist/Machine Learning Engineer, LINE Taiwan

  3. Data Engineer • ML Engineer/Scientist? • Data Scientist? • Data Analyst

  4. ML Engineer/Scientist? Data Scientist?

  5. A day in the life of a Data Scientist
  6. Among the data I processed, figures matter: an amount of data that will not feel as easy as what you face at school
  7. Skills that fit

  8. Projects and activities I have taken part in at LINE
    • LINE Fact Checker
    • LINE Music TW
    • TW User Tagging
    • SPOT POI recommendation ….
    • Internal research such as automatic speech recognition, reinforcement learning, and recommendation tasks.
    • Technical sharing at events such as LINE Dev Day, LINE TechPulse, and R-Ladies.
  9. LINE Digital Accountability Program (LINE數位當責計畫)

  10. How does ML help LINE Fact Checker?
    • Total Messages, Similar Messages, Verified Messages
    • ML Near-Duplication, ML Classification
  11. ML models
    • Pre-trained BERT classifier: verify topics
    • Fine-tuned with TODAY articles
    • An Approximate Nearest Neighbor (ANN) model: find similar articles
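
The similar-article lookup named on this slide can be illustrated with a small sketch. It is not the Fact Checker implementation: the character-bigram `embed` function stands in for a learned embedding, the search is exact brute-force cosine similarity rather than a true approximate index, and the sample messages and all function names are made up for illustration.

```python
import math
from collections import Counter

def embed(text, n=2):
    """Toy embedding: character n-gram counts (a stand-in for a learned vector)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def most_similar(query, verified, top_k=1):
    """Exact nearest-neighbor search; an ANN index would replace this linear scan."""
    q = embed(query)
    scored = [(cosine(q, embed(message)), message) for message in verified]
    return sorted(scored, reverse=True)[:top_k]

# Hypothetical, already-verified messages.
VERIFIED = [
    "Drinking hot water cures the flu",
    "The city will cut water supply tomorrow",
    "New tax rules start next month",
]

if __name__ == "__main__":
    for score, message in most_similar("drinking hot water can cure flu", VERIFIED, top_k=2):
        print(f"{score:.2f}  {message}")
```

The linear scan costs one comparison per stored message, which is the part an index-based vector search is meant to avoid at larger scale.
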
  12. ML system under the hood
    • Scheduling, Orchestration
    • Serving
    • Training
    • Data Ingestion
    • Async Execution
    • Model Management
    • Model Deployment
    • Index-based Vector Search (ANN)
  13. LINE Music TW: Toward an intelligent music service
  14. Simple data mining: autocomplete ranking
    • 18M user query logs, using SQL for autocomplete
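
A minimal sketch of ranking autocomplete suggestions with SQL over query logs, assuming a single hypothetical `query_log` table with one row per search. The table name, the sample queries, and ranking by raw frequency are assumptions for illustration, not the schema or method used for LINE Music.

```python
import sqlite3

# Hypothetical search log: one entry per user query.
SAMPLE_LOG = [
    "love song", "love song", "love song",
    "love story", "love story",
    "lofi beats",
    "rock classics",
]

def build_log(queries):
    """Load the queries into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE query_log (query TEXT NOT NULL)")
    conn.executemany("INSERT INTO query_log (query) VALUES (?)", [(q,) for q in queries])
    return conn

def autocomplete(conn, prefix, limit=3):
    """Return the most frequent logged queries that start with the prefix."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    cursor = conn.execute(
        "SELECT query, COUNT(*) AS hits FROM query_log "
        "WHERE query LIKE ? ESCAPE '\\' "
        "GROUP BY query ORDER BY hits DESC, query ASC LIMIT ?",
        (escaped + "%", limit),
    )
    return [row[0] for row in cursor.fetchall()]

if __name__ == "__main__":
    connection = build_log(SAMPLE_LOG)
    print(autocomplete(connection, "lo"))
```

Ties are broken alphabetically here so the output is deterministic; a production ranking would likely weigh recency or clicks as well.
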
  15. Music knowledge graph for auto-suggest
    • Relations: composed, compose, lyrics_written, write_lyrics, sing_song, same_album
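
One way to picture auto-suggest over such a graph is a breadth-first walk from the entity the user typed. The sketch below is an illustration under assumptions: the entity names are placeholders, the relation labels are borrowed from the slide, treating `composed` and `lyrics_written` as the inverses of `compose` and `write_lyrics` is a guess, and the `_inverse` suffix is hypothetical.

```python
from collections import defaultdict

# Assumed inverse labels; only the label names themselves come from the slide.
INVERSE = {
    "compose": "composed",
    "write_lyrics": "lyrics_written",
    "same_album": "same_album",  # treated as symmetric
}

# Placeholder (head, relation, tail) triples.
TRIPLES = [
    ("artist_a", "sing_song", "song_x"),
    ("artist_a", "sing_song", "song_y"),
    ("artist_b", "compose", "song_x"),
    ("artist_c", "write_lyrics", "song_y"),
    ("song_x", "same_album", "song_y"),
]

def build_graph(triples):
    """Adjacency list that stores each edge in both directions."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
        graph[tail].append((INVERSE.get(relation, relation + "_inverse"), head))
    return graph

def suggest(graph, entity, max_hops=2):
    """Breadth-first search: entities within max_hops, nearest first."""
    seen = {entity}
    frontier = [entity]
    suggestions = []
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for _relation, neighbor in graph.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    suggestions.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return suggestions

if __name__ == "__main__":
    music_graph = build_graph(TRIPLES)
    print(suggest(music_graph, "artist_b"))
```

Ordering by hop distance is the simplest choice; relation-specific weights would be a natural refinement.
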
  16. Analysis on Music Knowledge graph

  17. Research: time for learning
    • Internal: RL study group, hackathons, and paper reading.
    • Collaboration with the NLP and AD teams on the KR and JP side.
    • External: e.g. Google/AWS/Microsoft workshops and conferences.
  18. Activities
    • Let people beyond your close colleagues know what you are doing.
    • Dev Day, TechPulse, R-Ladies, AI workshop, internal tech sharing x2.
  19. Takeaway
    • What I've been doing: finding data-driven solutions for project needs
    • Knowledge graph, speech recognition, POI recommendation, embedding, ML.
    • Data x ML x optimization x speech
    • Plain but fun :)
  20. About our team: Data Dev team TW

  21. We are hiring!