GW Data Club

October 16, 2020

A workshop showing an end-to-end data science solution

Ese

Transcript

  1. Eseoghene Emuraye

    Graduate Student, Data Science, The George Washington University
    • Statistical Consultant - Provide consultation in programming and machine learning
    • Cognitive Engineer Consultant Intern - Worked with the New York State Government
    • Graduate Teaching Assistant (Natural Language Processing) - Assist students and grade assignments/exams
    Favorite way to pass time
    • Hiking/exploring nature
    • Playing the piano
    @eemuraye
    linkedin.com/in/eserichard
  2. Outline

    • Learning objectives
    • Understanding how data scientists work in the industry
    • Scope of work for Data Scientists/ML Engineers
      • Model Experimentation/Development phase
      • MLOps phase
    • The Data Science Lifecycle
    • Case study: Used Cars from cardekho.com
    • Deploying at scale
    • Key Takeaways
  3. Learning Objectives

    • Understand the roles played by data scientists/machine learning engineers in a team
    • Gain a basic understanding of how data science is applied in the real world
  4. A typical team where a data scientist works

    Diagram labels:
    • Core team
    • UI/UX
    • Data Engineer
    • Data Scientist/ML Engineer
    • Software Engineer (Frontend/Backend/DevOps)
    • Project Manager
    • Team Lead
    • Business Analysts
    • Solution Architect
    • Working Product
  5. Scope of work for Data Scientists/ML Engineers

    Phase I: Model Development and Prototyping
    • Analyzing data
    • Deriving business insights
    • Prototyping models
    • Testing and Evaluation
    Phase II: Deployment and MLOps
    • Deploying model prototype at scale
    • Data Feedback loop
    • Model Monitoring
    • Continuous Delivery
  6. Lifecycle of a data science project

    • Business Understanding
      • Discussion with stakeholders and subject matter experts
    • Data collection
      • Extracting the required data from its source
      • Creating a pipeline from the data source
    • Data Preparation
      • Cleaning the data
      • Handling missing and null values
      • Removing irrelevant data
    • Exploratory Data Analysis
      • Understanding the data and features
      • Making visualizations
      • Statistical Analyses
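The data preparation steps on this slide could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the workshop's own code: the records, the column names (`year`, `km_driven`, `selling_price`, `listing_id`) and the choice of median imputation are all assumptions made for the example.

```python
from statistics import median

# Hypothetical raw records; the values and column names are illustrative only.
raw_rows = [
    {"name": "car_a", "year": 2014, "km_driven": 27000, "selling_price": 3.35, "listing_id": "x1"},
    {"name": "car_b", "year": 2013, "km_driven": None, "selling_price": 4.75, "listing_id": "x2"},
    {"name": "car_c", "year": None, "km_driven": 6900, "selling_price": 7.25, "listing_id": "x3"},
    {"name": "car_d", "year": 2011, "km_driven": 5200, "selling_price": None, "listing_id": "x4"},
]

IRRELEVANT = {"listing_id"}      # assumed to carry no predictive signal
TARGET = "selling_price"         # the value a model would later predict
NUMERIC = ["year", "km_driven"]  # numeric features that may contain gaps


def prepare(rows):
    """Drop irrelevant columns, drop rows without a target, impute numeric gaps."""
    # Removing irrelevant data (builds new dicts, so the input is not mutated).
    rows = [{k: v for k, v in r.items() if k not in IRRELEVANT} for r in rows]
    # A row with no target value cannot be used for supervised training.
    rows = [r for r in rows if r[TARGET] is not None]
    # Handling missing values: fill each numeric gap with the column median.
    for col in NUMERIC:
        fill = median(r[col] for r in rows if r[col] is not None)
        for r in rows:
            if r[col] is None:
                r[col] = fill
    return rows


def summarize(rows, col):
    """A tiny stand-in for exploratory analysis: min, median and max of a column."""
    values = [r[col] for r in rows]
    return {"min": min(values), "median": median(values), "max": max(values)}


clean_rows = prepare(raw_rows)
print(summarize(clean_rows, "km_driven"))
```

In practice a dataframe library would usually do this work; the plain-Python version only makes each step explicit.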
  7. Lifecycle of a data science project

    • Modeling
      • Feature engineering
      • Model training
    • Model Evaluation
      • Using the right metrics for the right problem
    • Model Deployment
      • Integrating with a web service
      • Monitoring
      • Communicating to management
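As a rough illustration of the modeling and evaluation steps, the sketch below engineers one feature (car age), fits a one-variable least-squares line, and scores it with two regression metrics. Everything here is an assumption made for the example: the data points are invented, `REFERENCE_YEAR` is arbitrary, and a single-feature linear model is far simpler than what a real project would use.

```python
from math import sqrt

REFERENCE_YEAR = 2020  # assumed "current" year used to derive car age

# Hypothetical (model year, price) pairs, split into train and test sets.
train = [(2018, 8.0), (2016, 6.0), (2014, 4.0), (2012, 2.0)]
test = [(2017, 7.5), (2013, 2.5)]


def car_age(year):
    """Feature engineering: turn a model year into an age in years."""
    return REFERENCE_YEAR - year


def fit_line(xs, ys):
    """Model training: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mae(actual, predicted):
    """Mean absolute error: average size of the miss, in price units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more than MAE does."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


slope, intercept = fit_line([car_age(y) for y, _ in train], [p for _, p in train])
predictions = [slope * car_age(y) + intercept for y, _ in test]
actuals = [p for _, p in test]

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"MAE={mae(actuals, predictions):.2f} RMSE={rmse(actuals, predictions):.2f}")
```

MAE and RMSE suit a regression problem such as price prediction; a classification problem would call for different metrics (for example precision and recall), which is the point of matching the metric to the problem.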
  8. Case Study: Car Price Prediction

    About CarDekho
    CarDekho is an Indian car search company that helps users buy new and used cars.
    Scope
    This project uses features from the CarDekho vehicle dataset, available on Kaggle.com, for car price prediction.
    Watch the live project implementation on Krish Naik's YouTube channel here
    GitHub link here
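To make the deployment side of such a case study concrete, here is a hedged sketch of the request-handling logic a price-prediction web service might contain. It is not the implementation from the linked video or repository. The coefficients in `MODEL`, the field names `year` and `km_driven`, and the function `handle_predict` are all hypothetical; a real service would load a trained model and sit behind a web framework.

```python
import json

# Hypothetical coefficients standing in for a trained model artifact.
MODEL = {"intercept": 10.0, "per_year_of_age": -1.0, "per_10k_km": -0.2}
REFERENCE_YEAR = 2020  # assumed year used to compute car age


def predict_price(year, km_driven):
    """Apply the stand-in linear model; never return a negative price."""
    age = REFERENCE_YEAR - year
    price = (
        MODEL["intercept"]
        + MODEL["per_year_of_age"] * age
        + MODEL["per_10k_km"] * (km_driven / 10_000)
    )
    return max(price, 0.0)


def handle_predict(body):
    """Turn a JSON request body into a (status_code, JSON response body) pair."""
    try:
        payload = json.loads(body)
        year = int(payload["year"])
        km_driven = float(payload["km_driven"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing fields, and non-numeric values all land here.
        error = {"error": "expected JSON with numeric 'year' and 'km_driven'"}
        return 400, json.dumps(error)
    result = {"predicted_price": round(predict_price(year, km_driven), 2)}
    return 200, json.dumps(result)


status, response = handle_predict('{"year": 2015, "km_driven": 50000}')
print(status, response)
```

Keeping validation and prediction in plain functions like these means the same logic can be unit-tested without starting a server and then wired into whichever web framework the team uses.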
  9. Key Takeaways

    • Data Scientists/Machine Learning Engineers work in a team with other engineers and managers to deliver a complete solution
    • An end-to-end data science solution often requires an understanding of basic software engineering best practices