PyLadies - Machine learning for the curious but scared

ellenkoenig

January 10, 2017

Transcript

  1. ONE WAY TO DEFINE LEARNING: LEARNING FROM EXPERIENCE
    BEING ABLE TO DEAL WITH NEW SITUATIONS BASED ON THE PAST
  2. OF HUMANS AND MACHINES: WHAT HAPPENS DURING LEARNING?
    ▸ TRAINING DATA: Input data about the world
    ▸ MACHINE LEARNING ALGORITHM: Processing by internal resources
    ▸ MODEL FUNCTION (HYPOTHESIS): Learned representation
  3. WHAT DOES THAT LOOK LIKE IN PRACTICE: EXAMPLES
    ▸ Example: Self-driving cars
      ▸ Input data: Terrain data (slope, roughness, etc.)
      ▸ Learned model: Function mapping terrain to speed
    ▸ Example: Price prediction engine
      ▸ Input data: Customer & market attributes and past prices
      ▸ Learned model: Function mapping customer and market attributes to prices
    ▸ Example: Gene sequence identification
      ▸ Input data: Lots and lots of genome data
      ▸ Learned model: Clusters of reoccurring gene sequence patterns
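To make "training data goes in, a model function comes out" concrete, here is a minimal sketch in plain Python. It assumes the simplest possible learning algorithm (a least-squares line fit) and uses made-up numbers for terrain slope and speed; neither the algorithm nor the numbers come from the slides.

```python
from statistics import mean


def fit_line(xs, ys):
    """Learning algorithm: ordinary least squares for a straight line.

    Returns the learned model, a function mapping a new x to a predicted y.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar

    def model(x):
        return slope * x + intercept

    return model


# Hypothetical training data: terrain slope in degrees -> safe speed in km/h.
terrain_slopes = [0.0, 5.0, 10.0, 15.0]
safe_speeds = [50.0, 40.0, 30.0, 20.0]

# Training produces the model function (the "hypothesis").
speed_model = fit_line(terrain_slopes, safe_speeds)

# The model can now deal with a situation it has not seen before.
print(speed_model(7.5))
```

The point of the sketch is the shape of the process: `fit_line` plays the role of the ML algorithm, and the function it returns is the learned model.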
  4. COMPONENTS OF A COMPLETE MACHINE LEARNING SYSTEM: WHAT DOES A MACHINE NEED TO LEARN?
    ▸ TRAINING DATA
    ▸ INTERESTING DATA
    ▸ ML ALGORITHM
    ▸ MODEL (HYPOTHESIS)
    ▸ RESULT
    ▸ FEEDBACK
  5. TWO BASIC KINDS OF MACHINE LEARNING: SUPERVISED VS UNSUPERVISED LEARNING
    Supervised (every row comes with a known answer, here the "Umbrella?" column):
      Rain    Wind    Umbrella?
      heavy   light   yes
      none    light   no
      light   strong  no
      light   light   yes
      none    strong  no
    Unsupervised (no answer column, only observations): User tastes
      ▸ User 1 likes The Clash
      ▸ User 23 likes Die Ärzte
      ▸ User 42 likes Helene Fischer
      ▸ User 1 likes The Sex Pistols
      ▸ User 42 likes Heino
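The two data sets on this slide can be turned into a small sketch of each kind of learning. The data is taken from the slide; the algorithms are my own assumptions chosen for brevity: a 3-nearest-neighbour vote for the supervised rain/wind table, and grouping artists that share a fan for the unsupervised user tastes.

```python
from collections import Counter, defaultdict

# Supervised: every training row carries a known answer ("Umbrella?").
weather = [
    (("heavy", "light"), "yes"),
    (("none", "light"), "no"),
    (("light", "strong"), "no"),
    (("light", "light"), "yes"),
    (("none", "strong"), "no"),
]


def predict_umbrella(rain, wind, k=3):
    """Vote among the k training rows that differ in the fewest attributes."""
    def mismatches(row):
        features, _label = row
        return sum(a != b for a, b in zip(features, (rain, wind)))

    nearest = sorted(weather, key=mismatches)[:k]
    return Counter(label for _features, label in nearest).most_common(1)[0][0]


# Unsupervised: no answer column, only (user, artist) observations.
likes = [
    (1, "The Clash"),
    (23, "Die Ärzte"),
    (42, "Helene Fischer"),
    (1, "The Sex Pistols"),
    (42, "Heino"),
]


def cluster_artists(observations):
    """Group artists that are connected through at least one shared fan."""
    by_user = defaultdict(set)
    for user, artist in observations:
        by_user[user].add(artist)

    clusters = []
    for artists in by_user.values():
        merged = set(artists)
        untouched = []
        for cluster in clusters:
            if cluster & merged:
                merged |= cluster  # overlapping taste: fold into one group
            else:
                untouched.append(cluster)
        clusters = untouched + [merged]
    return clusters


print(predict_umbrella("heavy", "strong"))
print(cluster_artists(likes))
```

In the supervised half the algorithm is told the right answers and learns to reproduce them; in the unsupervised half nobody says which artists belong together, and the structure is found in the observations themselves.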
  6. THE STARTING POINT: A BASIC WORKFLOW FOR WORKING ON MACHINE LEARNING PROBLEMS
    1. Understand the problem and context
    2. Understand & clean the data, create some features
    3. For supervised learning: Split into training and test data
    4. Evaluate different algorithms with default parameters
    5. Optimize the parameters and compute the results
    6. Interpret the results
    7. Repeat with different features until you get useful results
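Steps 3 and 4 of this workflow can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the synthetic data set, the helper names (`split_train_test`, `fit_majority`, `fit_nearest_neighbour`, `accuracy`) and the two very simple algorithms being compared. In practice scikit-learn provides `train_test_split` and ready-made estimators for the same job.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Step 3: shuffle, then hold back a fraction of the rows for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def fit_majority(train):
    """Baseline algorithm: always predict the most common training label."""
    labels = [label for _features, label in train]
    majority = max(set(labels), key=labels.count)
    return lambda features: majority


def fit_nearest_neighbour(train):
    """1-nearest neighbour: copy the label of the closest training row."""
    def predict(features):
        def squared_distance(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], features))
        return min(train, key=squared_distance)[1]
    return predict


def accuracy(model, rows):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(features) == label for features, label in rows) / len(rows)


# Hypothetical toy data: points in the unit square, label 1 if x + y > 1.
rng = random.Random(42)
data = []
for _ in range(200):
    x, y = rng.random(), rng.random()
    data.append(((x, y), int(x + y > 1)))

train, test = split_train_test(data)

# Step 4: evaluate different algorithms on data they have not seen.
algorithms = {
    "majority baseline": fit_majority,
    "1-nearest neighbour": fit_nearest_neighbour,
}
scores = {name: accuracy(fit(train), test) for name, fit in algorithms.items()}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Including a trivial baseline is a deliberate choice: an algorithm is only worth optimizing further if it clearly beats "always guess the most common answer" on the test data.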
  7. LEARN BY REPEATING THE WORKFLOW: RINSE AND REPEAT
    ▸ PICK ONE TOOL
    ▸ TRY THE WORKFLOW
    ▸ PICK A (“TOY”) PROBLEM
    ▸ PICK A TYPE OF ALGORITHM
  8. WHY “TOY” PROBLEMS? “DIVIDE AND CONQUER”!
    1. Understand the problem and context
    2. Understand & clean the data, create some features
    3. For supervised learning: Split into training and test data
    4. Evaluate different algorithms with default parameters
    5. Optimize the hyperparameters and compute the results
    6. Interpret the results
    7. Repeat with different features until you get useful results
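Step 5, optimizing the hyperparameters, can be illustrated with a small plain-Python sketch. It assumes a k-nearest-neighbour classifier whose only hyperparameter is `k`, a noisy synthetic data set, and 5-fold cross-validation on the training rows; all of these are my choices for illustration, not something the slides prescribe. scikit-learn's `GridSearchCV` automates the same idea.

```python
import random
from collections import Counter


def knn_predict(train, features, k):
    """Majority vote among the k training rows closest to `features`."""
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))

    nearest = sorted(train, key=squared_distance)[:k]
    return Counter(label for _features, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(rows, k, folds=5):
    """Average accuracy over `folds` rounds, each holding out a different slice."""
    scores = []
    for i in range(folds):
        held_out = rows[i::folds]
        rest = [row for j, row in enumerate(rows) if j % folds != i]
        hits = sum(knn_predict(rest, features, k) == label
                   for features, label in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / folds


# Hypothetical training rows: label 1 if x + y > 1, with 10% of labels flipped.
rng = random.Random(7)
training_rows = []
for _ in range(120):
    x, y = rng.random(), rng.random()
    label = int(x + y > 1)
    if rng.random() < 0.1:
        label = 1 - label  # simulated label noise
    training_rows.append(((x, y), label))

# Try each candidate value and keep the one with the best cross-validated score.
candidates = [1, 3, 5, 9, 15]
results = {k: cross_val_accuracy(training_rows, k) for k in candidates}
best_k = max(results, key=results.get)

for k, score in results.items():
    print(f"k={k}: {score:.3f}")
print("best k:", best_k)
```

The search uses only training rows. The test data from step 3 stays untouched until the very end, so the final score is still an honest estimate.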
  9. THE IMPORTANCE OF TOY PROBLEMS
    TUTORIAL: WOULD YOU SURVIVE THE TITANIC? https://blog.socialcops.com/engineering/machine-learning-python/
  10. DEPENDENCIES: IF YOU DON’T HAVE THE NEEDED TOOLS INSTALLED YET…
    1. Download Anaconda with Python 3.5: https://www.continuum.io/downloads
    2. Go to your terminal: “conda create -n titanic python==3.4 numpy ipython jupyter pandas scipy matplotlib scikit-learn”
    3. Terminal: “source activate titanic”
    4. Terminal: “conda install -c conda-forge tensorflow” (Linux / OS X, skip for Windows)
    5. Ready to start the tutorial! :)
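After the environment is activated, a quick way to confirm that the packages are importable is a short check like the one below. It is only a sketch: the list of import names is my assumption about how the conda packages map to Python modules (for example, scikit-learn is imported as `sklearn`), and `jupyter_core` stands in for the `jupyter` package.

```python
from importlib.util import find_spec

# Assumed import names for the conda packages listed above.
REQUIRED = ["numpy", "IPython", "jupyter_core", "pandas", "scipy",
            "matplotlib", "sklearn"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All set, ready to start the tutorial!")
```

Run it with the Python interpreter of the `titanic` environment; any name it lists can be installed with `conda install` inside that environment.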
  11. WHERE TO CONTINUE: RECOMMENDED RESOURCES FOR BEGINNERS (IN ORDER OF RECOMMENDATION)
    ▸ Tutorial for the “Kaggle Titanic Competition” (using R): http://trevorstephens.com/post/72916401642/titanic-getting-started-with-r
    ▸ Online courses (MOOCs):
      ▸ Udacity: Intro to Machine Learning: https://www.udacity.com/course/intro-to-machine-learning--ud120 (Excellent intro to applied ML using scikit-learn and Python)
      ▸ Coursera: Machine Learning: https://www.coursera.org/learn/machine-learning (Friendly intro to the theory behind common ML algorithms)
    ▸ Machine Learning Mastery: Lots of self-study guides for ML learners: http://machinelearningmastery.com/
    ▸ UCI ML Repository: Collection of “toy problems” for ML: http://archive.ics.uci.edu/ml/datasets.html
    ▸ Toolkits:
      ▸ Scikit-Learn (Python, great online documentation): http://scikit-learn.org/stable/
      ▸ stats package (R, pre-installed, many simple ML algorithms). Examples: http://www.statmethods.net/stats/regression.html
    ▸ Book: Abu-Mostafa, Magdon-Ismail, Lin: Learning From Data - A Short Course (AMLbook.com) (Good intro to more academic perspectives, notation and vocabulary on ML)
  12. LICENCE: CREATIVE COMMONS “ATTRIBUTION - SHARE ALIKE” 4.0: https://creativecommons.org/licenses/by-sa/4.0/
    IMAGE CREDITS
    ▸ Slide 1: http://work.caltech.edu/edx1.html , at 5:20 of the video
    ▸ Slide 3: http://www.thebluediamondgallery.com/highlighted/l/learning.html
    ▸ Slide 4: All https://pixabay.com/
    ▸ Slide 5: https://en.wikipedia.org/wiki/Consciousness#/media/File:Neural_Correlates_Of_Consciousness.jpg
    ▸ Slide 6: Based on https://commons.wikimedia.org/wiki/File:Machine_Learning_Technique..JPG
    ▸ Slide 10 & 12: https://pixabay.com/
    ▸ Slide 12: http://blog.kaggle.com/2016/07/21/approaching-almost-any-machine-learning-problem-abhishek-thakur/ (Abhishek Thakur)
    ▸ Slide 13: