A beginner's guide to Data Science - developHER remote edition 2020

Dânia Meira
November 07, 2020

AI, data science, machine learning, deep learning… what does it all actually mean? What does a data science project look like, and what can machine learning really achieve? What skills do data experts have? Join this session to find out the answers.

Language: 🇬🇧
Level: ★☆☆
Duration: 60 min.
Maximum participants: 30
With: Dânia

Transcript

  1. developHER Remote Edition 2020 - A beginner’s guide to Data Science - Dânia Meira

    A beginner’s guide to Data Science
    • Introduction
    • Data science, AI, Machine learning?
    • Data Roles and Skills
    • #datacareer Orientation
  3. Introduction 1/2

    DÂNIA MEIRA
    • ML models for predictive analytics
    • #datacareer since 2012
    • Former bootcamp teacher
    • Data scientist at myToys from 2018 to 2020
    • Founding member, AI Guild
    linkedin.com/in/daniameira/
  4. Introduction 2/2

    #datacommunity #datacareer #datalift
    • linkedin.com/company/ai-guild
    • twitter.com/ai_guild
    • medium.com/ai-guild
    • eventbrite.de/o/ai-guild-27115216103
    • bit.ly/youtube-ai-guild
    • theguild.ai
  6. Data Science, AI, Machine Learning? 1/8

    • Artificial Intelligence (AI): A set of techniques that enable computers to perform specific tasks that mimic human intelligence, using logic, if-else rules, and machine learning.
    • Machine Learning (ML): Statistical techniques and algorithms that computer systems use to perform a specific task without explicit programming instructions, instead processing data to detect patterns and make inferences.
    • Deep Learning (DL): A type of ML method based on artificial neural networks, algorithms inspired by the human brain that learn from processing vast amounts of data.
    • Data Science (DS): A multidisciplinary field that uses scientific, computational and statistical methods to draw insights and build predictive models from data.
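The difference between hand-written if-else rules and machine learning in the definitions above can be sketched in a few lines of Python. This is only an illustration: the toy data, the function names and the threshold approach are assumptions made for this example and do not come from the talk.

```python
# Hypothetical toy data: (number of exclamation marks in a message, is_spam)
examples = [(0, False), (1, False), (2, False), (4, True), (5, True), (7, True)]


def rule_based(count):
    """Classic if-else rule: a person picked the threshold 3 by hand."""
    if count > 3:
        return True
    return False


def learn_threshold(data):
    """Pick the threshold that makes the fewest mistakes on the labelled data."""
    candidates = sorted({x for x, _ in data})

    def errors(t):
        # Count examples where the prediction "x > t" disagrees with the label.
        return sum((x > t) != label for x, label in data)

    return min(candidates, key=errors)


# "Learning": the threshold comes from the data, not from the programmer.
threshold = learn_threshold(examples)


def learned(count):
    return count > threshold


print("learned threshold:", threshold)
print("rule says:", rule_based(3), "| learned model says:", learned(3))
```

Both functions do the same kind of task; the only difference is where the decision boundary comes from. With different labelled examples, `learn_threshold` would return a different value without any change to the code.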
  7. Data Science, AI, Machine Learning? 2/8

    AI Use Cases
    AI is good at focused tasks with a clear outcome:
    • It works best when there is a very large amount of training data.
    • It works well for specific cases where other methods fail, such as outlier detection and sparse matrix work.
    https://pair.withgoogle.com/worksheet/user-needs.pdf
  8. Data Science, AI, Machine Learning? 3/8

    Data Science = Analytics + ML
    Data Science: A multidisciplinary field that uses scientific, computational and statistical methods to draw insights and build predictive models from data.
    Analytics
    • Task: understanding the business and using data to make better decisions
    • Result: slide deck
    ML
    • Task: learning an A to B mapping, where A is the input and B is the output
    • Result: software
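To make the Analytics versus ML split above concrete, here is a minimal sketch on the same toy data. The numbers, the variable names and the choice of a least-squares line are assumptions for illustration, not something taken from the slides.

```python
from statistics import mean

# Hypothetical toy data: (ad spend in EUR, units sold)
sales = [(10, 25), (20, 45), (30, 65), (40, 85)]

# Analytics: summarise what happened so that people can make a decision.
average_units = mean(units for _, units in sales)


# ML: learn the mapping A (ad spend) -> B (units sold) from the examples.
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line(sales)


def predict(ad_spend):
    """The 'software' result: a function that maps a new input A to an output B."""
    return slope * ad_spend + intercept


print("Analytics result (goes on a slide): average units sold =", average_units)
print("ML result (runs as software): predicted units at 50 EUR =", predict(50))
```

The analytics result is a number a person reads and acts on; the ML result is a function that can be called for inputs that were never observed.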
  9. Data Science, AI, Machine Learning? 4/8

    Examples of ML

    | Input | Output | Application |
    |---|---|---|
    | Picture | Are there human faces? | Photo tagging |
    | Loan application | Will they repay the loan? | Loan approvals |
    | Ad + user information | Will the user click on the ad? | Targeted online ads |
    | English sentence | German sentence | Language translation |
    | Recipe ingredients + customer reviews | Will the customer like the food? | Food recommendation |

    Understand what the pain points are → automate tasks, not jobs
  10. Data Science, AI, Machine Learning? 5/8

    Where to start?
    • Identify the opportunity: Data Science Knowledge + Domain Knowledge
    • Define clear KPIs to establish what your model should predict and how.
    • Make sure everyone is on the same page from the very beginning about how the results can (and cannot) be used to influence operations in your business.
    Questions to answer:
    • What can DS do?
    • What is valuable for your business?
    • How will we know if it helped or not?
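One way to approach the question "How will we know if it helped or not?" is to agree on a KPI and a target before the project starts, then measure it with and without the model. The sketch below assumes a conversion-rate KPI, made-up outcomes and a hypothetical target uplift; none of these values come from the talk.

```python
# Hypothetical outcomes: 1 = customer converted, 0 = customer did not convert
control = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]    # existing process, no model
treatment = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]  # process supported by the model

# Assumed target agreed with the business before the project started
MIN_UPLIFT = 0.05


def conversion_rate(outcomes):
    """KPI: share of customers who converted."""
    return sum(outcomes) / len(outcomes)


uplift = conversion_rate(treatment) - conversion_rate(control)
helped = uplift >= MIN_UPLIFT

print(f"control: {conversion_rate(control):.0%}, treatment: {conversion_rate(treatment):.0%}")
print(f"uplift: {uplift:.0%}, target met: {helped}")
```

Ten customers per group are far too few for a real decision; a real comparison would need much more data and a statistical significance check.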
  11. Data Science, AI, Machine Learning? 6/8

    Data Science Project Workflow
    • Cyclic workflow: iteration
    • For Machine Learning: Prepare Data, Train + Evaluate Model, Deploy Model.
    • For Analytics: Prepare Data, Analyze Data, Share Insights + Suggest Changes.
    Diagram labels:
    • Business & Data Understanding
    • Data Pipeline exists?
    • Build Data Pipeline
    • Prepare Data
    • Train Model
    • Evaluate Model
    • Automate Model Serving
    • Analyze Data
    • Share Insights
    • Gather Results: Business Metrics and Process Performance
    • Testing, Deploying and Maintaining
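The machine learning branch of the workflow above (prepare data, train and evaluate a model, deploy it) can be sketched end to end in plain Python. The loan data, the 1-nearest-neighbour model and all function names are assumptions chosen to keep the example short; a real project would typically use a library such as scikit-learn and far more data.

```python
# Hypothetical raw data: (income in kEUR, debt in kEUR, loan repaid?)
raw_rows = [
    (20, 15, False), (25, 20, False), (60, 5, True), (70, 10, True),
    (None, 8, True),  # incomplete record
    (30, 25, False), (65, 8, True), (22, 18, False), (75, 12, True),
]


def prepare(rows):
    """Prepare Data: drop records with missing values."""
    return [row for row in rows if None not in row]


def train(rows):
    """Train Model: a 1-nearest-neighbour model simply stores its examples."""
    return list(rows)


def predict(model, income, debt):
    """Return the label of the closest stored example (squared distance)."""
    nearest = min(model, key=lambda r: (r[0] - income) ** 2 + (r[1] - debt) ** 2)
    return nearest[2]


def evaluate(model, rows):
    """Evaluate Model: accuracy on data the model has not seen."""
    hits = sum(predict(model, inc, debt) == label for inc, debt, label in rows)
    return hits / len(rows)


clean = prepare(raw_rows)
train_rows, test_rows = clean[:6], clean[6:]  # simple fixed split
model = train(train_rows)
accuracy = evaluate(model, test_rows)


def serve(income, debt):
    """Deploy Model: the function other systems would call."""
    return predict(model, income, debt)


print("held-out accuracy:", accuracy)
print("prediction for income=68, debt=9:", serve(68, 9))
```

In a cyclic workflow, a disappointing accuracy would send the team back to data preparation or model choice before anything is deployed.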
  12. Data Science, AI, Machine Learning? 7/8

    Examples with code
    • KevinLiao159/MyDataSciencePortfolio: Applying Data Science and Machine Learning to Solve Real World Business Problems
    • 10 Data Science Projects | Data Science and Machine Learning
  13. Data Science, AI, Machine Learning? 8/8

    Learning and practicing - for free!
    • Intro to Machine Learning Tutorial in Kaggle
    • A free online introduction to artificial intelligence for non-experts
    • Machine Learning for Everyone :: In simple words. With real-world examples. Yes, again
  15. Data Roles and Skills 1/5

    Data Science is a team sport!
    Instead of relying on one person with a broad range of skill sets, combine experts and people with those skills at different levels to work together in a team.
    Data Science Project Workflow:
    • Business & Data Understanding
    • Build Data Pipeline
    • Prepare Data
    • Train Model
    • Evaluate Model
    • Automate Model Serving
    • Analyze Data
    • Share Insights
    • Gather Results: Business Metrics and Process Performance
  16. Data Roles and Skills 2/5

    Data Roles and Skills
    Data Scientist
    • Tasks: Understand the business case and build features to train predictive models that address it
    • Skills: Statistics, SQL, programming (e.g. Python, R), ML & DL techniques
    Data Analyst
    • Tasks: Business and data understanding to report on what happens
    • Skills: Descriptive analytics, SQL, statistics, dashboarding and visualization tools
    Data Engineer
    • Tasks: Build and maintain infrastructure and pipelines to collect, clean and pre-process data
    • Skills: Distributed systems, databases, software engineering
    Machine Learning Engineer
    • Tasks: Optimize, deploy and maintain machine learning models in production
    • Skills: Software engineering, DevOps and systems architecture
  17. Data Roles and Skills 3/5

    Data Roles: Tasks
    Components in the diagram: ML Models, Data Collection, Data Quality, Infrastructure, Process Management Tools, Machine Resource Management, Monitoring, Configuration, Feature Extraction, Analysis, Data Preprocessing, Parameter Configuration, Offline Validation, Business Logic, A/B Testing
    Data roles: Data Engineer, Data Scientist, Data Analyst, ML Engineer
    See also: “Hidden Technical Debt in Machine Learning Systems” by Sculley et al., Google, Inc., 2015
  18. Data Roles and Skills 4/5

    'Cooking' data
    Components in the diagram: ML Models, Data Collection, Data Quality, Infrastructure, Process Management Tools, Machine Resource Management, Monitoring, Configuration, Feature Extraction, Analysis, Data Preprocessing, Parameter Configuration, Offline Validation, Business Logic, A/B Testing
    Cooking analogies: Sowing, Harvesting, Choose recipe, Prepare ingredients, Customers tasting, Kitchen tasting, Use utensils, Try combinations of appliances and recipes, Kitchen space and available appliances
    See also: Understanding a Machine Learning Workflow Through Food by Daniel Godoy
  19. Data Roles and Skills 5/5

    Understanding data roles
    Descriptions on the slide:
    • Create and use recipes to cook
    • Check quality of ingredients and recipes
    • Process ingredients at scale
    • Turn a recipe into many dishes served efficiently
    Roles on the slide: Data Engineer, Data Scientist, Data Analyst, ML Engineer
  21. #datacareer Orientation 1/5

    Skills gap in corporate Europe
  22. #datacareer Orientation 2/5

    Skills gap in German industry
  23. #datacareer Orientation 3/5

    Skills gap among AI players in Germany
  24. #datacareer Orientation 4/5

    Promising use cases among AI players in Germany
  25. #datacareer Orientation 5/5

    Prospective use cases in Germany
  26. Your predictable path to senior level

    Friday, 27 November at 12:00 CET