Let's Play in DataPark - Nina Cheng

20210319 LINE Developers Meetup #14 @ 台大集思會館

Event: https://linegroup.kktix.cc/events/20210319

line_developers_tw2

March 19, 2021

Transcript

  1. Agenda
    • ML-enhanced LINE Services
    • Roles and Responsibilities in Data Team
    • ML Applications We Built
    • Differences Between School and Company
    • Build AI Products with MLOps
  2. Data in LINE We Are Facing
    • USERS: > 19M/d
    • TODAY: > 1M articles/y
    • SHOPPING: > 10M queries/m
    • OA: > 1B interactions/m
  3. Skills and Responsibility
    Data Engineer (Pipeline)
    • Responsibility: Build and optimize data pipeline architecture; assemble large, complex data sets that meet requirements
    • Skill: Big data infra, SQL, ETL, message queuing
    Data Scientist (Model)
    • Responsibility: Select appropriate datasets and data representation methods; research and implement appropriate ML algorithms
    • Skill: Machine learning, deep learning, CV, NLP, Speech
    ML Svc Engineer (Service)
    • Responsibility: Build and scale machine learning infrastructure; monitor model performance
    • Skill: System infrastructure design, DevOps
    Data Analyst (Biz)
    • Responsibility: Interpret data, analyze results using statistical techniques; identify, analyze, and interpret trends or patterns in complex data sets
    • Skill: Statistics, Data Visualization, Business Knowledge
  4. Key Roles and Activities in ML Workflow
    Roles: DS, DE, MSE, DA, PM, Biz
    Diagram labels: Data, Labeling, Feature, Model, Experiment, Version control, Model develop, Hyperparameter tuning, Computing resources, Validation, Analysis, Resources, Scaling, Performance, Reliability, Model decay, Biz analysis, Analyze
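  5. Uplift Modeling: Customer Segmentation by Conversion and Treatment (MarTech)
    Axes: "Buy if treated" and "Buy if not treated", each from Low to High
    • Sure Thing: buy if treated High, buy if not treated High
    • Persuadables: buy if treated High, buy if not treated Low
    • Sleeping Dog: buy if treated Low, buy if not treated High
    • Lost Cause: buy if treated Low, buy if not treated Low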
  6. Uplift Modeling: Target Customers Who Are Persuadable (MarTech)
    Quadrant chart of "Buy if treated" against "Buy if not treated" (each Low to High): Sleeping Dog, Lost Cause, Sure Thing, Persuadables
  9. Uplift Modeling: Target Customers Who Are Persuadable (MarTech)
    Quadrant chart of "Buy if treated" against "Buy if not treated" (each Low to High): Sleeping Dog, Lost Cause, Sure Thing, Persuadables
    Added label: Uplift Model
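
The quadrant on the uplift modeling slides can be read as a simple rule over two estimated probabilities. The sketch below is only an illustration: the function name `segment`, the 0.5 threshold, and the example probabilities are assumptions, not values from the talk.

```python
def segment(p_treated: float, p_control: float, threshold: float = 0.5) -> str:
    """Name the quadrant for one customer.

    p_treated: estimated probability of buying if treated.
    p_control: estimated probability of buying if not treated.
    threshold: hypothetical cut-off separating "High" from "Low".
    """
    high_if_treated = p_treated >= threshold
    high_if_not_treated = p_control >= threshold
    if high_if_treated and high_if_not_treated:
        return "Sure Thing"
    if high_if_treated:
        return "Persuadable"
    if high_if_not_treated:
        return "Sleeping Dog"
    return "Lost Cause"


# Made-up probabilities: (buy if treated, buy if not treated).
customers = {"a": (0.9, 0.8), "b": (0.7, 0.1), "c": (0.2, 0.6), "d": (0.1, 0.1)}
segments = {name: segment(pt, pc) for name, (pt, pc) in customers.items()}

# Only the persuadable customers are worth targeting.
targets = [name for name, seg in segments.items() if seg == "Persuadable"]
```

In practice neither probability is observed directly for the same customer, which is why the later slides turn to randomized treatment and control groups.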
  10. Leverage Lookalike Groups to Estimate Customer Uplift (MarTech)
    To overcome this counter-factual nature, uplift modeling crucially relies on randomized experiments
    • Control Group: Y_i(0)
    • Treatment Group: Y_i(1)
    Reference: Causal Inference and Uplift Modeling: A review of the literature
    Pictures: Akita (秋田犬), Shiba Inu (柴犬). Image source: Google
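
Because Y_i(1) and Y_i(0) are never both observed for the same customer, a randomized experiment lets the difference in group averages stand in for the average uplift. The following is a minimal sketch of that calculation; the function name and the outcome lists are made up for illustration.

```python
from statistics import mean


def average_uplift(treated_outcomes, control_outcomes):
    """Difference in mean response between randomized groups.

    Each list holds 1 (bought) or 0 (did not buy) per customer.
    With random assignment this estimates the average of Y_i(1) - Y_i(0).
    """
    return mean(treated_outcomes) - mean(control_outcomes)


# Toy outcomes, not data from the talk.
treated = [1, 0, 1, 1, 0, 1, 0, 1]  # 5 of 8 bought
control = [0, 0, 1, 0, 0, 1, 0, 0]  # 2 of 8 bought
uplift = average_uplift(treated, control)
```

This gives one number for the whole population; the individual treatment effect slides go further and estimate uplift per customer from their features.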
  11. Estimate the Individual Treatment Effect - Training (MarTech)
    • Features: user behavior one month before the campaign started (Browsing History, Interaction, Deposit, Profile, Interest, Wealth)
    • Groups: Treatment (Akita), Control (Shiba Inu)
    • Campaign period: response w/o treatment, response w/ treatment
    Image source: Google
  12. Estimate the Individual Treatment Effect - Predict (MarTech)
    • Features: user behavior one month before the campaign started (Browsing History, Interaction, Deposit, Profile, Interest, Wealth)
    • Diagram labels: Campaign period, New, Uplift Model, Response w/o treatment, Response w/ treatment
    Image source: Google
  13. Estimate the Individual Treatment Effect - Predict (MarTech)
    • Features: user behavior one month before the campaign started (Browsing History, Interaction, Deposit, Profile, Interest, Wealth)
    • Diagram labels: Campaign period, New, Uplift Model, Response Uplift
    Image source: Google
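
One common way to get a per-customer uplift from treatment and control data is the two-model approach: fit one response model on each group and subtract their predictions. The slides do not say which approach the speaker used, so the sketch below is an assumption. It uses a deliberately tiny "model" (purchase rate per feature value), a single hypothetical feature (`interest`), and made-up rows.

```python
from collections import defaultdict


def fit_rate_model(rows):
    """Fit a toy response model: purchase rate per feature value.

    rows: iterable of (feature_value, bought) with bought in {0, 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # feature -> [buyers, total]
    for feature, bought in rows:
        counts[feature][0] += bought
        counts[feature][1] += 1
    return {f: buyers / total for f, (buyers, total) in counts.items()}


# Made-up training rows: (interest level before the campaign, bought during it).
treatment_rows = [("high", 1), ("high", 1), ("high", 0), ("high", 1),
                  ("low", 0), ("low", 1), ("low", 0), ("low", 0)]
control_rows = [("high", 1), ("high", 0), ("high", 0), ("high", 0),
                ("low", 0), ("low", 1), ("low", 0), ("low", 0)]

model_treated = fit_rate_model(treatment_rows)  # response w/ treatment
model_control = fit_rate_model(control_rows)    # response w/o treatment


def predict_uplift(feature):
    """Predicted response with treatment minus predicted response without."""
    return model_treated[feature] - model_control[feature]
```

A real system would replace `fit_rate_model` with a classifier trained on the full feature set (browsing history, interaction, deposit, profile, interest, wealth); the subtraction step stays the same.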
  14. Uplift by Deciles (MarTech)
    Chart regions: Persuadables; Lost causes / sure things; Sleeping dog
    Reference: Causal Inference and Uplift Modeling: A review of the literature
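
A chart like the one on slide 14 is commonly built by ranking customers by predicted uplift, cutting the ranking into equal-sized bins, and comparing the treated and control purchase rates inside each bin. The sketch below assumes that construction; the record layout, function name, and sample records are hypothetical.

```python
def uplift_by_bin(records, n_bins=10):
    """Observed uplift per bin of predicted uplift, highest bin first.

    records: list of (predicted_uplift, treated, bought), where treated
    and bought are 0 or 1. Returns one value per bin, or None for a bin
    that lacks either treated or control customers.
    """
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    size = len(ranked) / n_bins
    result = []
    for b in range(n_bins):
        chunk = ranked[round(b * size):round((b + 1) * size)]
        treated = [bought for _, t, bought in chunk if t]
        control = [bought for _, t, bought in chunk if not t]
        if not treated or not control:
            result.append(None)
            continue
        result.append(sum(treated) / len(treated) - sum(control) / len(control))
    return result


# Made-up records, split into two bins to keep the example small.
records = [
    (0.9, 1, 1), (0.8, 0, 0), (0.7, 1, 1), (0.6, 0, 0),
    (0.2, 1, 0), (0.1, 0, 0), (0.0, 1, 0), (-0.1, 0, 1),
]
bins = uplift_by_bin(records, n_bins=2)
```

With a useful model, the top bins show positive uplift (persuadables), the middle bins sit near zero (lost causes and sure things), and the bottom bins can go negative (sleeping dogs).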
  15. Related Search for LINE SHOPPING (NLP)
    A service that shows a list of recommended keywords (extended words) when users search in LINE SHOPPING.
    1. Product spec - Helping users find the product they want quickly
    2. Similar products - Pushing users to buy more
    When a user searches "vacuum cleaner" (吸塵器)… e.g. hair dryer (吹風機), robot vacuum (掃地機器人)
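  16. Related Search for LINE SHOPPING (NLP)
    Model
    • Learns to represent objects of different types in a common vectorial embedding space (not necessarily the same type as the items in the set)
    Data
    • User search log
    Embedded items: vacuum cleaner (吸塵器), dyson, #hair dryer (#吹風機), hair dryer (吹風機), #vacuum cleaner (#吸塵器), upright (直立式), cordless (無線)
    Search examples: cordless vacuum cleaner (無線吸塵器), upright vacuum cleaner (直立式吸塵器), Dyson vacuum cleaner (Dyson吸塵器), Dyson hair dryer (Dyson吹風機)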
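  17. Related Search for LINE SHOPPING (NLP)
    Model
    • Learns to represent objects of different types in a common vectorial embedding space (not necessarily the same type as the items in the set)
    Data
    • User search log
    - cordless (無線) vacuum cleaner (吸塵器) #vacuum cleaner (#吸塵器)
    - upright (直立式) vacuum cleaner (吸塵器) #vacuum cleaner (#吸塵器)
    - dyson vacuum cleaner (吸塵器) #vacuum cleaner (#吸塵器)
    - dyson hair dryer (吹風機) #hair dryer (#吹風機)
    Embedded items: vacuum cleaner (吸塵器), dyson, #hair dryer (#吹風機), hair dryer (吹風機), #vacuum cleaner (#吸塵器), upright (直立式), cordless (無線)
    Diagram labels: Similar Products, Product Spec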
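
Once words and hashtags share one embedding space, related keywords can be looked up by nearest-neighbor search. The sketch below assumes cosine similarity as the ranking score and uses hand-written two-dimensional vectors; neither the vectors nor the function names come from the talk, and a trained model would supply the real embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Hypothetical toy embeddings, written by hand for illustration only.
embeddings = {
    "vacuum cleaner": [1.0, 0.1],
    "cordless": [0.9, 0.2],
    "upright": [0.8, 0.3],
    "hair dryer": [0.1, 1.0],
}


def related(query, k=2):
    """Return the k items closest to the query, excluding the query itself."""
    scores = {
        item: cosine(embeddings[query], vec)
        for item, vec in embeddings.items()
        if item != query
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


suggestions = related("vacuum cleaner", k=2)
```

Because product-spec words and similar-product hashtags live in the same space, one lookup of this kind can serve both kinds of suggestion described on slide 15.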
  18. NLP Module
    NER, Tokenizer, Keyword Extraction, Related Search, Auto Complete, Duplication Detector, Multiword Term, Article Classifier
  19. Differences Between School and Company
    School: Defined problem, Given dataset, Given metric, Trying to know methods, Beating the baseline
    Company: Discovering problem, Preparing data, Defining metric, Seeking possible methods, Applying to market
    Example: Ads are not performing well, User data, ROI, Uplift modeling
  20. Bringing Machine Learning to Production
    Diagram labels: Model, Data?, Deploy?, Evaluation, Serving API, Monitor?, Retrain?, Pipeline, Preprocessing, Renew, Decay
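
The "Monitor?", "Decay", and "Retrain?" labels suggest a loop in which a served model is watched and renewed once its quality drops. A minimal sketch of such a check follows; the function name, the 0.05 tolerance, and the idea of comparing a recent average against a baseline score are all assumptions rather than the speaker's design.

```python
def needs_retrain(baseline_score, recent_scores, tolerance=0.05):
    """Flag model decay when recent quality falls below the baseline.

    baseline_score: metric measured when the model was deployed.
    recent_scores: the same metric measured on recent traffic.
    tolerance: hypothetical allowed drop before retraining is triggered.
    """
    if not recent_scores:
        raise ValueError("recent_scores must not be empty")
    recent_avg = sum(recent_scores) / len(recent_scores)
    return baseline_score - recent_avg > tolerance


stable = needs_retrain(0.80, [0.79, 0.78, 0.80])   # small drop, keep serving
decayed = needs_retrain(0.80, [0.72, 0.70, 0.71])  # large drop, retrain
```

A check like this would typically run on a schedule inside the pipeline and kick off preprocessing, training, and evaluation again when it returns `True`.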
  21. Benefits of MLOps
    • Productize and Scale ML Faster
    • Seamless Collaboration Between Data Team Members
    • Lower Operational Costs and Easy to Maintain