LINE Dev Meetup 16 - Data Dev Team

LINE Taiwan Data Dev Team Introduction by Alice Lin @ LINE Developers Meetup 16

Event: https://linegroup.kktix.cc/events/20220324

LINE Developers Taiwan

March 24, 2022

Transcript

  1. Alice Lin (林昱辰), Data Dev, TECH FRESH, 2021.09 - Now

    • B07, Department of Information Management, National Taiwan University
    • ML/DL, NLP, Web Service
    • Travel, badminton, photography
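  2. What are Data Engineering, Data Science, and Data Analysis?

    Data collection, data cleaning, data warehousing, data pipelines, machine learning, deep learning, model development and optimization, data operations, A/B testing, business insights, report building © LINE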
  3. Roles, Skills and Responsibility

    Data Engineer (Pipeline)
    • Responsibility: build and optimize data pipeline architecture; assemble large, complex data sets that meet requirements
    • Skill: big data infra, SQL, ETL, message queuing
    Data Analyst (Biz)
    • Responsibility: interpret data and analyze results using statistical techniques; identify, analyze, and interpret trends or patterns in complex data sets
    • Skill: statistics, data visualization, business knowledge
    Data Scientist (Model)
    • Responsibility: select appropriate datasets and data representation methods; research and implement appropriate ML algorithms
    • Skill: machine learning, deep learning, CV, NLP, speech
    ML Svc Engineer (Service)
    • Responsibility: build and scale machine learning infrastructure; monitor model performance
    • Skill: system infrastructure design, DevOps
  4. Data Governance / Common Platforms

    IU (with Data Governance)
    • Datalake/Datachain migration to IU
    • IU Portal and IU Web Renewal
    • MID download reduction plan
    MLU/Jutopia
    • Standard machine learning platform
    • MLOps
    CLOVA AI
    • CLOVA AI solutions
    Deeppocket/PicCell/…
    • Standard model serving platform
    • LINE AI Platform
    • And more …
  5. Concept: NLPaaS (DS/ML Applications)

    As-is: Prepare labeled data, Develop model, Test model, Training, Deploy, Integrate with services
    To-be: Prepare labeled data, Train model with NLPaaS, Integrate with services
  6. Explainable AI - Let's open the black box!

    Data Validation Performance Model Explainable Model Interpretation
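  7. SHAP (SHapley Additive exPlanations)

    • Shapley values were originally proposed by Lloyd Shapley, a leading game theorist and professor at the University of California, Los Angeles (UCLA), to compute how much an individual player contributes.
    • SHAP is the method proposed in this paper for explaining machine learning models. Its core idea is to compute how much each feature influences the output (its Shapley value).
    Download the paper: https://www.researchgate.net/publication/317062430_A_Unified_Approach_to_Interpreting_Model_Predictions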
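The game-theory idea behind slide 7 can be sketched in a few lines. This is a minimal illustration, not code from the talk and not how the shap library computes its values: it enumerates every ordering of the players and averages each player's marginal contribution, which is only practical for a handful of players. The three-player game and its payoffs are hypothetical.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical toy game: payoff earned by each possible coalition.
PAYOFF = {
    frozenset(): 0.0,
    frozenset("A"): 10.0,
    frozenset("B"): 20.0,
    frozenset("C"): 30.0,
    frozenset("AB"): 40.0,
    frozenset("AC"): 50.0,
    frozenset("BC"): 60.0,
    frozenset("ABC"): 90.0,
}

phi = shapley_values(["A", "B", "C"], PAYOFF.__getitem__)
print(phi)
```

The contributions always add up to the payoff of the full coalition. SHAP carries the same property over to models, with features as players and the model output as the payoff.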
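  8. Explaining the LINE Travel TW article classifier

    • Decide which of the following categories an article belongs to (multiclass classifier):
      ['遊記', '住宿', '旅遊知識', '購物', '景點', '美食']
      (travelogue, accommodation, travel knowledge, shopping, attractions, food)
    • Pick one sample and look at its visualization for the '住宿' (accommodation) label. Legend: positive effect, negative effect.

    import shap
    # define our explainer (classifier is the trained article classifier)
    explainer = shap.Explainer(classifier)
    # input one sample
    shap_values = explainer(data[:1])
    # text plot of the shap values
    shap.plots.text(shap_values[0])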
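To make the "positive effect" and "negative effect" legend concrete, here is a hedged sketch that uses a plain linear scoring model rather than the talk's text classifier. For a linear model, assuming independent features, the SHAP value of a feature has a known closed form: the weight times the distance of the feature from its background mean. The feature meanings, weights, and data below are all hypothetical.

```python
def linear_shap(weights, bias, background, x):
    """Closed-form SHAP values for a linear model, assuming independent features."""
    n = len(background)
    means = [sum(row[i] for row in background) / n for i in range(len(weights))]
    # Base value: the average model output over the background data.
    base_value = bias + sum(w * m for w, m in zip(weights, means))
    # Contribution of each feature relative to its background mean.
    contributions = [w * (xi - m) for w, xi, m in zip(weights, x, means)]
    return base_value, contributions


def predict(weights, bias, x):
    return bias + sum(w * xi for w, xi in zip(weights, x))


# Hypothetical "accommodation" score from three word counts.
weights = [2.0, 0.5, -1.5]   # e.g. counts of "hotel", "price", "museum"
bias = 0.1
background = [[0, 1, 1], [2, 1, 0], [1, 1, 2]]
sample = [3, 1, 2]

base_value, contributions = linear_shap(weights, bias, background, sample)
for name, c in zip(["hotel", "price", "museum"], contributions):
    effect = "positive" if c > 0 else "negative" if c < 0 else "neutral"
    print(f"{name}: {c:+.2f} ({effect} effect)")
```

The base value plus the contributions equals the model's score for the sample, so each signed contribution reads as a push toward or away from the label.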
  9. What did I learn…

    Coworking, Time Management, Data Study Group, Internal Hackathon, Keep Learning, Scrum, Coding Style, Work-School-Life-Balance, Communication Skill