Let's Play in DataPark - Nina Cheng

20210319 LINE Developers Meetup #14 @ 台大集思會館

Event: https://linegroup.kktix.cc/events/20210319

line_developers_tw2

March 19, 2021

Transcript

  2. Let’s Play in DataPark
    Nina Cheng, LINE Taiwan Data Dev
  3. Nina Cheng
    LINE Taiwan Data Engineer
    #Hiking #CatLover

  4. Agenda
    • ML-enhanced LINE Services
    • Roles and Responsibilities in Data Team
    • ML Applications We Built
    • Differences Between School and Company
    • Build AI Products with MLOps
  5. ML-enhanced LINE Services

  6. Data in LINE We Are Facing
    • USERS: > 19M/d
    • TODAY: > 1M articles/y
    • SHOPPING: > 10M queries/m
    • OA: > 1B interactions/m
  7. ML-enhanced LINE Services

  8. Roles and Responsibilities in Data Team

  9. Skills and Responsibility
    • Data Engineer (Pipeline)
      Responsibility: Build and optimize data pipeline architecture; assemble large, complex data sets that meet requirements
      Skill: Big data infra, SQL, ETL, message queuing
    • Data Scientist (Model)
      Responsibility: Select appropriate datasets and data representation methods; research and implement appropriate ML algorithms
      Skill: Machine learning, deep learning, CV, NLP, Speech
    • ML Svc Engineer (Service)
      Responsibility: Build and scale machine learning infrastructure; monitor model performance
      Skill: System infrastructure design, DevOps
    • Data Analyst (Biz)
      Responsibility: Interpret data and analyze results using statistical techniques; identify, analyze, and interpret trends or patterns in complex data sets
      Skill: Statistics, Data Visualization, Business Knowledge
  10. ML Workflow Analyze

  11. Key Roles and Activities in ML Workflow
    Roles: DS, DE, MSE, DA, PM, Biz
    [Diagram labels: Data Labeling, Feature, Model, Experiment, Version control, Model develop, Hyperparameter tuning, Computing resources, Validation, Analysis, Resources, Scaling, Performance, Reliability, Model decay, Biz analysis, Analyze]
  12. ML Applications We Built

  13. MarTech NLP User Content

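  14. Uplift Modeling: Customer Segmentation by Conversion and Treatment (MarTech)
    Axes: "Buy if treated" (Low to High) and "Buy if not treated" (Low to High)
    • Persuadables: buy if treated High, buy if not treated Low
    • Sure Thing: buy if treated High, buy if not treated High
    • Lost Cause: buy if treated Low, buy if not treated Low
    • Sleeping Dog: buy if treated Low, buy if not treated High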
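
The four segments named on this slide can be sketched as a small classifier over two purchase probabilities. This is only an illustration: the function name `segment`, its parameters, and the 0.5 cut-off between "Low" and "High" are assumptions, not values from the talk.

```python
def segment(p_buy_if_treated, p_buy_if_not_treated, threshold=0.5):
    """Map two purchase probabilities to one of the four uplift segments.

    The threshold that separates "Low" from "High" is a hypothetical choice.
    """
    treated_high = p_buy_if_treated >= threshold
    untreated_high = p_buy_if_not_treated >= threshold
    if treated_high and untreated_high:
        return "Sure Thing"      # buys either way
    if treated_high:
        return "Persuadables"    # buys only when treated
    if untreated_high:
        return "Sleeping Dog"    # treatment makes buying less likely
    return "Lost Cause"          # buys in neither case


if __name__ == "__main__":
    for probs in [(0.8, 0.1), (0.9, 0.9), (0.1, 0.8), (0.1, 0.1)]:
        print(probs, segment(*probs))
```

Only the "Persuadables" segment gains from the campaign, which is why the following slides target it.
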
  15. Uplift Modeling: Target Customers Who Are Persuadable (MarTech)
    Axes: "Buy if treated" (Low to High) and "Buy if not treated" (Low to High)
    Segments: Sleeping Dog, Lost Cause, Sure Thing, Persuadables
  18. Uplift Modeling: Target Customers Who Are Persuadable (MarTech)
    Axes: "Buy if treated" (Low to High) and "Buy if not treated" (Low to High)
    Segments: Sleeping Dog, Lost Cause, Sure Thing, Persuadables
    [Diagram label: Uplift Model]
  19. Leverage Lookalike Groups to Estimate Customer Uplift (MarTech)
    To overcome this counterfactual nature (a customer's response can be observed with treatment or without it, never both), uplift modeling crucially relies on randomized experiments.
    • Control Group (Y_i(0)): Shiba Inu (柴犬)
    • Treatment Group (Y_i(1)): Akita (秋田犬)
    Image source: Google
    Reference: Causal Inference and Uplift Modeling: A Review of the Literature
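
The slide's Y_i(0) and Y_i(1) can be spelled out in standard potential-outcomes notation to show why randomization helps. The symbols \tau_i (individual treatment effect), X_i (customer features), and W_i (1 if customer i was treated, 0 otherwise) do not appear on the slide and are introduced here only for illustration.

```latex
\begin{align}
\tau_i  &= Y_i(1) - Y_i(0) \\
\tau(x) &= \mathbb{E}\left[\, Y_i(1) - Y_i(0) \mid X_i = x \,\right] \\
        &= \mathbb{E}\left[\, Y_i \mid X_i = x,\ W_i = 1 \,\right]
         - \mathbb{E}\left[\, Y_i \mid X_i = x,\ W_i = 0 \,\right]
\end{align}
```

The first line can never be computed for a single customer, because only one of the two outcomes is observed. The last equality holds when treatment is assigned at random, so the uplift for customers with features x can be estimated by comparing treated and control customers who look alike.
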
  20. Estimate the Individual Treatment Effect - Training (MarTech)
    • Features: user behavior one month before the campaign started (Browsing History, Interaction, Deposit, Profile, Interest, Wealth)
    • Campaign period: Treatment group (Akita, 秋田) and Control group (Shiba Inu, 柴犬)
    • Labels: Response w/ treatment, Response w/o treatment
    Image source: Google
  21. Estimate the Individual Treatment Effect - Predict (MarTech)
    • Features: user behavior one month before the campaign started (Browsing History, Interaction, Deposit, Profile, Interest, Wealth)
    • Campaign period: New user
    • Uplift Model outputs: Response w/ treatment, Response w/o treatment
    Image source: Google
  22. Estimate the Individual Treatment Effect - Predict (MarTech)
    • Features: user behavior one month before the campaign started (Browsing History, Interaction, Deposit, Profile, Interest, Wealth)
    • Campaign period: New user
    • Uplift Model output: Response Uplift
    Image source: Google
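
The training and prediction slides can be illustrated with the simplest possible estimator: for each group of look-alike customers, subtract the control group's response rate from the treatment group's. This is a minimal sketch and not the model used in the talk, which is not described in the slides. The function name, the record layout, and the sample data (grouping by a single "interest" value) are all hypothetical.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment from a randomized campaign.

    records: iterable of (segment, treated, converted) tuples.
    Returns {segment: treated response rate - control response rate}.
    Segments missing either group are skipped.
    """
    # counts[segment][treated] = [conversions, customers]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for seg, treated, converted in records:
        cell = counts[seg][bool(treated)]
        cell[0] += int(converted)
        cell[1] += 1

    uplift = {}
    for seg, groups in counts.items():
        (t_conv, t_n), (c_conv, c_n) = groups[True], groups[False]
        if t_n and c_n:
            uplift[seg] = t_conv / t_n - c_conv / c_n
    return uplift


# Hypothetical campaign log: (interest, treated, converted)
campaign = (
    [("cats", True, True)] * 3 + [("cats", True, False)] * 1
    + [("cats", False, True)] * 1 + [("cats", False, False)] * 3
    + [("hiking", True, True)] * 1 + [("hiking", True, False)] * 1
    + [("hiking", False, True)] * 1 + [("hiking", False, False)] * 1
)

if __name__ == "__main__":
    model = uplift_by_segment(campaign)   # "training"
    print(model.get("cats"))              # "prediction" for a new cat-loving user
```

A real uplift model would replace the segment lookup with a learned function of all the features listed on the slide, but the predicted quantity is the same difference between the two responses.
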
  23. Uplift by Decile (MarTech)
    Customers ranked by predicted uplift, from highest to lowest:
    • Persuadables
    • Lost causes / sure things
    • Sleeping dog
    Reference: Causal Inference and Uplift Modeling: A Review of the Literature
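
An uplift-by-decile chart of this kind can be computed by ranking customers by predicted uplift, cutting the ranking into equal bins, and measuring the observed uplift inside each bin. The sketch below assumes a hypothetical row layout of `(predicted_uplift, treated, converted)` taken from a randomized campaign; the function name and the toy data are illustrative only.

```python
def uplift_by_decile(rows, n_bins=10):
    """Observed uplift per bin, with the highest predicted uplift first.

    rows: list of (predicted_uplift, treated, converted).
    Returns a list of n_bins values; a bin without both treated and
    control customers yields None.
    """
    ranked = sorted(rows, key=lambda r: r[0], reverse=True)
    n = len(ranked)
    result = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        treated = [int(conv) for _, is_treated, conv in chunk if is_treated]
        control = [int(conv) for _, is_treated, conv in chunk if not is_treated]
        if treated and control:
            result.append(sum(treated) / len(treated) - sum(control) / len(control))
        else:
            result.append(None)
    return result


# Toy data, two bins instead of ten to keep the example readable.
toy_rows = [
    (0.9, True, 1), (0.8, True, 1), (0.7, False, 0), (0.6, False, 0),
    (0.4, True, 0), (0.3, True, 1), (0.2, False, 1), (0.1, False, 0),
]

if __name__ == "__main__":
    print(uplift_by_decile(toy_rows, n_bins=2))
```

If the model ranks customers well, the first bins show high observed uplift (persuadables) and the last bins show uplift near zero or below it (sleeping dogs).
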
  24. Related Search for LINE SHOPPING NLP

  25. Related Search for LINE SHOPPING (NLP)
    A service that shows a list of recommended keywords (extended words) when users search in LINE SHOPPING.
    1. Product spec: helps users find the product they want quickly
    2. Similar products: encourages users to buy more
    When a user searches “吸塵器” (vacuum cleaner)… e.g. 吹風機 (hair dryer), 掃地機器人 (robot vacuum)
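  26. Related Search for LINE SHOPPING (NLP)
    Model
    • Learns to represent objects of different types into a common vectorial embedding space (not necessarily the same type as the items in the set)
    Data
    • User search log
    Example queries: 無線吸塵器 (cordless vacuum cleaner), 直立式吸塵器 (upright vacuum cleaner), Dyson吸塵器 (Dyson vacuum cleaner), Dyson吹風機 (Dyson hair dryer)
    Tokens and labels: 吸塵器 (vacuum cleaner), dyson, #吹風機 (#hair dryer), 吹風機 (hair dryer), #吸塵器 (#vacuum cleaner), 直立式 (upright), 無線 (cordless)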
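  27. Related Search for LINE SHOPPING (NLP)
    Model
    • Learns to represent objects of different types into a common vectorial embedding space (not necessarily the same type as the items in the set)
    Data
    • User search log
    Similar Products: 吸塵器 (vacuum cleaner), dyson, #吹風機 (#hair dryer), 吹風機 (hair dryer), #吸塵器 (#vacuum cleaner), 直立式 (upright), 無線 (cordless)
    Product Spec:
    - 無線 吸塵器 #吸塵器 (cordless, vacuum cleaner, #vacuum cleaner)
    - 直立式 吸塵器 #吸塵器 (upright, vacuum cleaner, #vacuum cleaner)
    - dyson 吸塵器 #吸塵器 (dyson, vacuum cleaner, #vacuum cleaner)
    - dyson 吹風機 #吹風機 (dyson, hair dryer, #hair dryer)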
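
The idea of placing keywords in a shared vector space and recommending their nearest neighbours can be sketched without a trained model. The example below is a deliberately crude stand-in, not the embedding model from the talk: each keyword is represented by counts of the keywords it co-occurs with in a search session, and related keywords are ranked by cosine similarity. The session data and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_vectors(sessions):
    """Represent each keyword by the counts of keywords it co-occurs with."""
    vectors = defaultdict(Counter)
    for session in sessions:
        unique = set(session)
        for word in unique:
            for other in unique:
                if other != word:
                    vectors[word][other] += 1
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def related(keyword, vectors, top_k=3):
    """Return up to top_k keywords whose vectors are closest to the keyword's."""
    target = vectors.get(keyword)
    if target is None:
        return []
    scored = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != keyword]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [word for word, score in scored[:top_k] if score > 0]


# Hypothetical search sessions built from the keywords on the slide:
# 吸塵器 = vacuum cleaner, 吹風機 = hair dryer, 無線 = cordless, 直立式 = upright
sessions = [
    ["吸塵器", "dyson"],
    ["吸塵器", "無線"],
    ["吸塵器", "直立式"],
    ["吹風機", "dyson"],
]

if __name__ == "__main__":
    vectors = build_vectors(sessions)
    print(related("吸塵器", vectors))
```

In this toy data the vacuum cleaner and the hair dryer never appear in the same session, yet they end up close together because both co-occur with "dyson", which is the "similar products" behaviour described on the slide.
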
  28. NLP Module
    • NER
    • Tokenizer
    • Keyword Extraction
    • Related Search
    • Auto Complete
    • Duplication Detector
    • Multiword Term
    • Article Classifier
  29. Differences Between School and Company

  30. Differences Between School and Company
    School
    • Defined problem
    • Given dataset
    • Given metric
    • Trying to know methods
    • Beating the baseline
    Company
    • Discovering problem
    • Preparing data
    • Defining metric
    • Seeking possible methods
    • Applying to market
    Example: ad campaigns perform poorly (廣告投不好), User data, ROI, Uplift modeling
  31. Modeling is Only a Small Fraction of a Real-world ML System
    [Diagram label: ML Code]
  32. Build AI Products with MLOps

  33. Bringing Machine Learning to Production
    [Diagram labels: Model, Data?, Deploy?, Evaluation, Serving API, Monitor?, Retrain?, Pipeline, Preprocessing, Renew, Decay]
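
The "Monitor?", "Retrain?" and "Decay" questions on this slide can be made concrete with a small check that flags a model for retraining once its recent performance falls below a baseline. This is a hedged sketch, not the platform described in the talk: the class name `DecayMonitor`, the rolling-average rule, and every number in it are assumptions chosen for illustration.

```python
from collections import deque


class DecayMonitor:
    """Flag model decay when the rolling average of a metric drops too far.

    baseline:  metric value measured at deployment time (hypothetical)
    tolerance: allowed drop below the baseline before retraining is suggested
    window:    number of most recent evaluations to average
    """

    def __init__(self, baseline, tolerance=0.05, window=3):
        self.baseline = baseline
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)

    def observe(self, metric):
        """Record one evaluation; return True if retraining is suggested."""
        self.recent.append(metric)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough evidence yet
        average = sum(self.recent) / len(self.recent)
        return average < self.baseline - self.tolerance


if __name__ == "__main__":
    monitor = DecayMonitor(baseline=0.80, tolerance=0.05, window=3)
    for daily_score in [0.79, 0.78, 0.70, 0.68]:
        print(daily_score, monitor.observe(daily_score))
```

Averaging over a window keeps a single bad day from triggering a retrain; the trade-off is a slower reaction to a sudden drop.
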
  34. MLOps ML Dev Ops

  35. System Design for MLOps
    Reference: https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

  36. Our ML Platform

  37. Benefits of MLOps
    • Productize and Scale ML Faster
    • Seamless Collaboration Between Data Team Members
    • Lower Operational Costs and Easy to Maintain
  38. Summary
    [Diagram labels: ML/Data Platform, Role, DS, DE, DA, PM, Biz, MSE]

  39. Welcome to join us! Data Scientist Data Engineer

  40. Thank you.