Introduction to Data Mining!

Kevin Hale
December 05, 2011

Transcript

  1. Data Mining!
    An Introduction

  3. Wufoo.com

  8. What is data mining?

  10. Collection? No!
    Extraction. Yup.

  12. Eyes: 324 - 576 megapixels
    Ears: Stereo audio, 20 - 20,000 Hz
    Nose: 10,000 chemical compounds
    Mouth: 5 - 6 flavors
    Skin: Temperature / pressure / texture
    Memory: 2.5 petabytes

  15. The process of extracting patterns
    from large data sets.

  16. What are some examples of
    large data sets?

  17. Astronomy
    Biology
    Business
    Internet
    Government
    Religion

  24. Online Surveys

  25. Individuals, Developers, Designers,
    Non-Profits, Teachers, Students,
    Universities, Research, Real Estate,
    Marketing, Healthcare, Banks, SMBs

  27. What do they do with all that data?

  29. Positive / Negative
    Likert Scale
    Ratings
    Multiple Choice
    Open Feedback

  34. What are some potential problems
    with data collected by asking?

  37. Data collection is just the first part.

  38. Association Rule Learning
    Clustering
    Classification
    Regression
    Visualization
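
As a rough illustration of the first technique on this slide, association rule learning looks for "if X then Y" patterns by measuring support (how often items appear together) and confidence (how often Y appears when X does). The sketch below is not from the talk: the transactions, thresholds, and function names (`support`, `confidence`, `pair_rules`) are all hypothetical, and it only considers single-item rules.

```python
from itertools import combinations

# Hypothetical toy transactions (made up for illustration, not from the talk).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (x, y, support, confidence) for single-item rules x -> y."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue  # the pair is too rare to be interesting
        for x, y in ((a, b), (b, a)):
            c = confidence({x}, {y}, transactions)
            if c >= min_confidence:
                rules.append((x, y, s, c))
    return rules


if __name__ == "__main__":
    for x, y, s, c in pair_rules(transactions):
        print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

Production miners such as Apriori or FP-Growth prune the search so they can handle larger itemsets; this brute-force version only shows what the two measures mean.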

  39. Statistics
    Artificial Intelligence
    Database Management

  40. Bayes' Theorem (1700s)
    Regression Analysis (1800s)
    Neural Networks (1940s)
    Genetic Algorithms (1950s)
    Decision Tree Learning (1960s)
    Support Vector Machines (1990s)
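
To make the oldest entry on this timeline concrete, here is a minimal sketch of Bayes' rule, P(H | E) = P(E | H) P(H) / P(E), applied to a made-up question: how likely is a form submission to be spam, given that it contains a link? The function name `posterior` and every number below are assumptions for illustration, not figures from the talk.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Bayes' rule for a binary hypothesis H and observed evidence E.

    prior             -- P(H)
    likelihood        -- P(E | H)
    likelihood_if_not -- P(E | not H)
    """
    # Total probability of seeing the evidence at all, P(E).
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence


if __name__ == "__main__":
    # Hypothetical inputs: 10% of submissions are spam, 80% of spam
    # contains a link, and 5% of legitimate submissions contain a link.
    p = posterior(prior=0.10, likelihood=0.80, likelihood_if_not=0.05)
    print(f"P(spam | link) = {p:.2f}")
```

With these assumed inputs the evidence term is 0.08 + 0.045 = 0.125, so the posterior works out to 0.08 / 0.125 = 0.64.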

  43. Google Flu Trends

  47. Hans Rosling

  49. Recommendation Engines

  52. Relationships!

  56. Will my date have sex on the first date?
    Do you like the taste of beer?

  58. Assuming you were in the position to
    do so, would you launch nuclear
    weapons under any circumstances?
    82%

  59. In a certain light, wouldn't
    nuclear war be exciting?
    83%

  62. The Social Graph

  63. Privacy & Confidentiality Issues

  66. So that’s data mining!

  67. Thanks!
