LINE Dev Meetup 16 - Data Dev Team

LINE Taiwan Data Dev Team Introduction by Alice Lin @ LINE Developers Meetup 16

Event: https://linegroup.kktix.cc/events/20220324

LINE Developers Taiwan

March 24, 2022

Transcript

  1. 林昱辰 (Alice Lin), Data Dev, TECH FRESH, 2021.09 - Now

    • B07, Department of Information Management, National Taiwan University • ML/DL, NLP, Web Service • Travel, badminton, photography
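  2. What are data engineering, data science, and data analysis?

    • Data Engineering: data collection, data cleaning, data warehousing, data pipelines
    • Data Science: machine learning, deep learning, model development and optimization
    • Data Analysis: data operations, A/B testing, business insights, report building
    © LINE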
  3. Roles, Skills and Responsibility

    • Data Engineer (Pipeline)
      Responsibility: build and optimize data pipeline architecture; assemble large, complex data sets that meet requirements
      Skill: big data infra, SQL, ETL, message queuing
    • Data Analyst (Biz)
      Responsibility: interpret data and analyze results using statistical techniques; identify, analyze, and interpret trends or patterns in complex data sets
      Skill: statistics, data visualization, business knowledge
    • Data Scientist (Model)
      Responsibility: select appropriate datasets and data representation methods; research and implement appropriate ML algorithms
      Skill: machine learning, deep learning, CV, NLP, speech
    • ML Svc Engineer (Service)
      Responsibility: build and scale machine learning infrastructure; monitor model performance
      Skill: system infrastructure design, DevOps
  4. Data Governance / Common Platforms

    • IU (with Data Governance): Datalake/Datachain migration to IU; IU Portal and IU Web Renewal; MID download reduction plan
    • MLU/Jutopia: standard machine learning platform; MLOps
    • CLOVA AI: CLOVA AI solutions
    • Deeppocket/PicCell/…: standard model serving platform; LINE AI Platform
    • And more …
  5. DS/ML Applications - Concept: NLPaaS

    • As-is: Prepare labeled data, Develop model, Test model, Training, Deploy, Integrate with services
    • To-be: Prepare labeled data, Train model with NLPaaS, Integrate with services
  6. Explainable AI - Let's open the black box!

    Data Validation Performance Model Explainable Model Interpretation
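  7. SHAP (SHapley Additive exPlanations)

    • Shapley values were originally proposed by Lloyd Shapley, a leading game theorist and professor at the University of California, Los Angeles (UCLA), as a way to compute how much a given player contributes.
    • SHAP is the method proposed in this paper for explaining machine learning models. Its core idea is to compute how strongly each feature affects the output (the Shapley value).
    Download the paper: https://www.researchgate.net/publication/317062430_A_Unified_Approach_to_Interpreting_Model_Predictions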
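  8. Explaining the LINE Travel TW article classifier

    • The classifier decides which of the following categories an article belongs to (multiclass classifier):
      ['遊記', '住宿', '旅遊知識', '購物', '景點', '美食']
      (travelogue, accommodation, travel knowledge, shopping, attractions, food)
    • Pick one sample and look at its visualization for the "住宿" (accommodation) label. Plot legend: positive effect, negative effect.

        import shap

        # define our explainer (classifier and data are defined elsewhere in the talk)
        explainer = shap.Explainer(classifier)

        # input one sample
        shap_values = explainer(data[:1])

        # text plot of the shap values
        shap.plots.text(shap_values[0])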
  9. What did I learn…

    • Coworking • Time Management • Data Study Group • Internal Hackathon • Keep Learning • Scrum • Coding Style • Work-School-Life-Balance • Communication Skill
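
To make the Shapley value idea from item 7 (SHAP) concrete, here is a minimal sketch that computes exact Shapley values for a tiny cooperative game. It is not taken from the talk and is not how the `shap` library works internally: the function `shapley_values`, the players `a`, `b`, `c`, and the payoff function `toy_value` are all hypothetical names invented for illustration. In the SHAP setting the "players" would be input features and the "payoff" would be the model output.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of every player.

    `value` maps a frozenset of players (a coalition) to its payoff.
    The cost grows exponentially with the number of players, so this
    is only practical for very small games.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        # Visit every coalition S that does not contain p.
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Shapley weight: |S|! * (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Marginal contribution of p when joining coalition S.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


def toy_value(coalition):
    """Hypothetical payoff: individual worth plus a bonus when a and b cooperate."""
    base = {"a": 1.0, "b": 2.0, "c": 3.0}
    payoff = sum(base[p] for p in coalition)
    if "a" in coalition and "b" in coalition:
        payoff += 2.0  # interaction bonus, split evenly by the Shapley value
    return payoff


phi = shapley_values(["a", "b", "c"], toy_value)
for player, contribution in phi.items():
    print(player, round(contribution, 6))
```

In this toy game the contributions add up to the payoff of the full coalition, which is the additivity property that SHAP relies on when it attributes a model output to individual features.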